Azure Cosmos DB – Update existing documents with an additional field

[

Existing Cosmos DB documents need to be updated with a new property, and existing documents in other collections need to be updated with the same new property and its value.

Is there a recommended way or tool to update existing documents in Cosmos DB, or is writing a custom C# application or PowerShell script with the Cosmos DB SDK the only option?

Example:

Existing user document

{
   id:[email protected],
   name: "abc",
   country: "xyz"
}  

Updated user document

{
   id:[email protected],
   name: "abc",
   country: "xyz",
   guid:"4334fdfsfewr"  //new field
} 

Existing order document of the user

{
   id:[email protected],
   user: "[email protected]",
   date: "09/28/2020",
   amt: "$45"
}  

Updated order document of the user

{
   id:[email protected],
   user: "[email protected]",
   userid: "4334fdfsfewr",  // new field but with same value as in user model
   date: "09/28/2020",
   amt: "$45"
}  

,

I’d probably go with:

  1. Update user documents through a script
  2. Have an Azure Function with a Cosmos DB trigger that listens for changes to user documents and updates the orders accordingly
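For step 1, here is a minimal C# sketch of such a script. It assumes the .NET SDK v3 (`Microsoft.Azure.Cosmos`, in a version that supports partial document update), a database named `appdb`, a `users` container partitioned on `/id`, and a connection string in the `COSMOS_CONNECTION_STRING` environment variable. All of those names are placeholders of mine, not something taken from the question.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Minimal projection: only the id is needed to address each document.
public class UserRef
{
    public string id { get; set; }
}

public static class Program
{
    public static async Task Main()
    {
        // Placeholder configuration (assumed, adjust to your account).
        using var client = new CosmosClient(
            Environment.GetEnvironmentVariable("COSMOS_CONNECTION_STRING"));
        Container users = client.GetContainer("appdb", "users");

        // Select only the users that do not have the new property yet.
        var query = new QueryDefinition(
            "SELECT c.id FROM c WHERE NOT IS_DEFINED(c.guid)");

        using FeedIterator<UserRef> iterator =
            users.GetItemQueryIterator<UserRef>(query);

        // Walk every result page, not just the first one.
        while (iterator.HasMoreResults)
        {
            foreach (UserRef user in await iterator.ReadNextAsync())
            {
                // Add the new property without rewriting the whole document.
                // Assumes the partition key value equals the document id.
                await users.PatchItemAsync<object>(
                    user.id,
                    new PartitionKey(user.id),
                    new[] { PatchOperation.Add("/guid", Guid.NewGuid().ToString()) });
            }
        }
    }
}
```

Because the query only selects documents that still lack `guid`, the script can be rerun if it gets interrupted without touching users that were already updated.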

[UPDATE]
Use whatever type of script you are most comfortable with: PowerShell, C#, Azure Functions…
Now, what do you mean by saying they need to be altered with the new property “on the same time”? I’m not sure that’s possible in any way. If you want that effect, then I guess your best bet is:

  1. Create a new collection/container for users
  2. Have an Azure Function that listens to the change feed of your existing users container (so, with the StartFromBeginning option)
  3. Update your documents to have the new field and store them in the newly created container
  4. Once done, switch your application to use the new container
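Steps 2 and 3 can be roughly sketched without the Function wiring by reading the change feed directly (the SDK's pull model) instead of through a trigger. This is a swapped-in technique for illustration, not the trigger-based setup itself. It assumes the .NET SDK v3 with the default Newtonsoft.Json serializer and that both containers are partitioned on `/id`; `UserMigration` and `CopyWithGuidAsync` are hypothetical names.

```csharp
using System;
using System.Net;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;
using Newtonsoft.Json.Linq;

public static class UserMigration
{
    // Copies every document from the old users container into the new one,
    // adding "guid" where it is missing.
    // Assumption: both containers use "/id" as the partition key path.
    public static async Task CopyWithGuidAsync(Container oldUsers, Container newUsers)
    {
        // Read the change feed from the very beginning of the container.
        using FeedIterator<JObject> changes = oldUsers.GetChangeFeedIterator<JObject>(
            ChangeFeedStartFrom.Beginning(), ChangeFeedMode.Incremental);

        while (changes.HasMoreResults)
        {
            FeedResponse<JObject> page = await changes.ReadNextAsync();

            // NotModified means the feed is drained for now. A long-running
            // process would wait and poll again instead of stopping.
            if (page.StatusCode == HttpStatusCode.NotModified)
            {
                break;
            }

            foreach (JObject user in page)
            {
                // Keep an existing value, otherwise generate the new field.
                if (user["guid"] == null)
                {
                    user["guid"] = Guid.NewGuid().ToString();
                }

                await newUsers.UpsertItemAsync(
                    user, new PartitionKey((string)user["id"]));
            }
        }
    }
}
```

Note that a second run would generate fresh values for users whose source document still has no `guid`, so either run the copy once or derive the value deterministically.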

It’s your choice how to change the other collections (orders): using the change feed and Azure Functions from either the old or the new users container.
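Whichever users container feeds the Function, the work done for each changed user could look something like the sketch below. The trigger binding is deliberately left out because its attribute names differ between extension versions. The sketch assumes the orders container is partitioned on `/user` and that the SDK version supports partial document update; `OrderBackfill` and `PropagateAsync` are hypothetical names.

```csharp
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Minimal projection: only the id is needed to address each order.
public class OrderRef
{
    public string id { get; set; }
}

public static class OrderBackfill
{
    // Copies a user's guid onto each of that user's orders lacking "userid".
    // Assumption: orders are partitioned by the "user" property.
    public static async Task PropagateAsync(
        Container orders, string userId, string userGuid)
    {
        // Bracket syntax avoids any clash between "user" and SQL keywords.
        QueryDefinition query = new QueryDefinition(
                "SELECT c.id FROM c WHERE c[\"user\"] = @user AND NOT IS_DEFINED(c.userid)")
            .WithParameter("@user", userId);

        // Scope the query to the user's partition to avoid a fan-out.
        using FeedIterator<OrderRef> iterator = orders.GetItemQueryIterator<OrderRef>(
            query,
            requestOptions: new QueryRequestOptions
            {
                PartitionKey = new PartitionKey(userId)
            });

        while (iterator.HasMoreResults)
        {
            foreach (OrderRef order in await iterator.ReadNextAsync())
            {
                // Add the new property with the same value as on the user.
                await orders.PatchItemAsync<object>(
                    order.id,
                    new PartitionKey(userId),
                    new[] { PatchOperation.Add("/userid", userGuid) });
            }
        }
    }
}
```

Inside the Function, this would be called once per changed user document, passing that document's `id` and `guid`.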

PS.
Yes, whichever flow I went with, it would still be Azure Functions with a Cosmos DB trigger.

,

Here is a solution for a .NET Core 3.0 (or later) API.

 // Requires: using Microsoft.Azure.Cosmos; using Microsoft.Azure.Cosmos.Linq;
 // This must run inside an async method.
 // Any filter can be used here to select the documents to update.
 var result = _containers.GetItemLinqQueryable<MessageNoteModel>()
     .Where(d => d.id == Id && d.work_id.ToLower() == workId.ToLower())
     .ToFeedIterator();

 // Loop over every result page, not just the first one.
 while (result.HasMoreResults)
 {
     foreach (var document in await result.ReadNextAsync())
     {
         // Create the partition key of the document.
         var partitionKey = new PartitionKey(document.work_id);

         // Set the new property.
         document.IsConversation = true;

         // Replace the document in the Cosmos DB collection and wait for the write to finish.
         await _containers.Twistle.ReplaceItemAsync(document, document.document_id, partitionKey);
     }
 }

,

We had the same issue of updating the Cosmos DB schema for existing documents. We were able to achieve this through a custom JsonSerializer.

We created a CosmosJsonDotNetSerializer inspired by the one in the Cosmos DB SDK. CosmosJsonDotNetSerializer exposes the FromStream method, which lets us work with the raw JSON. You can change the FromStream method so that it upgrades the document schema to your latest version. Here is the pseudo-code:

 public override T FromStream<T>(Stream stream)
 {
     using (stream)
     {
         // A caller asking for a Stream gets the stream itself, with no JSON parsing.
         if (typeof(Stream).IsAssignableFrom(typeof(T)))
         {
             return (T)(object)stream;
         }

         using (var sr = new StreamReader(stream))
         using (var jsonTextReader = new JsonTextReader(sr))
         {
             var jsonSerializer = GetSerializer();
             return UpdateSchemaVersionToCurrent<T>(jsonSerializer.Deserialize<JObject>(jsonTextReader));
         }
     }
 }

 private T UpdateSchemaVersionToCurrent<T>(JObject jObject)
 {
     // Add logic to upgrade the JObject to the latest schema version, e.g.
     // add the new property only when it is missing so an existing value is kept.
     if (jObject["guid"] == null)
     {
         jObject["guid"] = Guid.NewGuid().ToString();
     }
     return jObject.ToObject<T>();
 }

You can set Serializer to CosmosJsonDotNetSerializer in CosmosClientOptions while creating CosmosClient.

var cosmosClient = new CosmosClient("<cosmosDBConnectionString>",
            new CosmosClientOptions
            {
                Serializer = new CosmosJsonDotNetSerializer()
            });

This way, you always deal with the latest Cosmos document throughout the code, and when you save the entity back to Cosmos, it is persisted with the latest schema version.

You can take this further by running the schema migration as a separate process, for example inside an Azure Function, where you load old documents, convert them to the latest version, and then save them back to Cosmos.

I also wrote a post on Cosmos document schema update that explains this in detail.

]