Technical Notes – CosmosDB Change Feed and Azure Function
Some notes from looking deeper into the integration between the Azure CosmosDB Change Feed and Azure Functions. Most of the time, we simply use the built-in trigger, and it just works. That is the beauty of Azure.
// Azure Function code with the CosmosDB trigger. Taken from the MS example.
using System.Collections.Generic;
using Microsoft.Azure.Documents;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class CosmosTrigger
{
    [FunctionName("CosmosTrigger")]
    public static void Run([CosmosDBTrigger(
        databaseName: "ToDoItems",
        collectionName: "Items",
        ConnectionStringSetting = "CosmosDBConnection",
        LeaseCollectionName = "leases",
        CreateLeaseCollectionIfNotExists = true)]IReadOnlyList<Document> documents,
        ILogger log)
    {
        // The trigger hands over a batch of changed documents.
        if (documents != null && documents.Count > 0)
        {
            log.LogInformation($"Documents modified: {documents.Count}");
            log.LogInformation($"First document Id: {documents[0].Id}");
        }
    }
}
This example is everywhere. There is nothing fancy about it.
In one project, we took advantage of this feature to migrate data from CosmosDB to Azure SQL Database for later processing. I wanted to make sure we were well prepared for production, so I did some reading, and here are the notes. None of this is mine or new; it is simply written down the way I want to remember it, in the areas I am interested in.
Container, Logical Partition, and Physical Partition
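Change Feed is per container. If a database has 3 containers, each has its own Change Feed. The Change Feed delivers changes in the order they were written within a logical partition (the same partition key value); order across different partition key values is not guaranteed.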
Change Feed is as reliable as the database itself.
Under the hood, data is stored in many physical partitions. At that level, Change Feed is actually per physical partition. Data might be moved from one physical partition to another (for example, when a partition splits). When that happens, the Change Feed moves with it. So how is document order preserved across physical partitions, especially after a move? The Change Feed Processor (CFP) manages all that complexity for us.
Change Feed Processor (CFP)
In theory, developers can write code that interacts directly with the Change Feed. It is possible but not practical, and in practice not many (I cannot say none) want to. Instead, most depend on the Change Feed Processor (CFP). The MS Docs have sample code that you can use as a starting point for your own consumer.
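To make the CFP a little more concrete, here is a minimal sketch of a hand-written consumer using the Microsoft.Azure.Cosmos (v3) SDK. The database and container names are reused from the trigger example above; the ToDoItem class, the processor name "todoMigration", and the use of the machine name as the instance name are assumptions made for illustration. It also assumes the lease container already exists with /id as its partition key.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Hypothetical document shape, for illustration only.
public class ToDoItem
{
    public string id { get; set; }
}

public static class ChangeFeedConsumer
{
    // Builds and starts a processor; the caller should call StopAsync() on shutdown.
    public static async Task<ChangeFeedProcessor> StartAsync(CosmosClient client)
    {
        Container monitored = client.GetContainer("ToDoItems", "Items");
        Container leases = client.GetContainer("ToDoItems", "leases");

        ChangeFeedProcessor processor = monitored
            .GetChangeFeedProcessorBuilder<ToDoItem>("todoMigration", HandleChangesAsync)
            .WithInstanceName(Environment.MachineName)   // one name per host instance
            .WithLeaseContainer(leases)                  // where checkpoints are stored
            .WithPollInterval(TimeSpan.FromSeconds(5))   // how often to look for changes
            .Build();

        await processor.StartAsync();
        return processor;
    }

    // Called by the processor with each batch of changed documents.
    private static Task HandleChangesAsync(
        IReadOnlyCollection<ToDoItem> changes, CancellationToken cancellationToken)
    {
        foreach (ToDoItem item in changes)
        {
            Console.WriteLine($"Changed document: {item.id}");
        }
        return Task.CompletedTask;
    }
}
```

Even in this small form, the code still has to be hosted, started, stopped, and scaled somewhere, which is exactly the part most of us would rather not own.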
Azure Function with CosmosDb trigger
Azure CosmosDb trigger configuration
By default, the poll interval is 5 seconds (see the FeedPollDelay property of the trigger attribute, which is specified in milliseconds and defaults to 5000).
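As a small sketch of how the poll interval can be changed, the trigger below sets FeedPollDelay to one second and caps the batch size with MaxItemsPerInvocation. The values 1000 and 100 are arbitrary, and the function name is made up for illustration. The property names match the extension version used in the example at the top of these notes; newer versions of the extension rename several of the surrounding properties (for example containerName, Connection, LeaseContainerName), so check the version you are on.

```csharp
using System.Collections.Generic;
using Microsoft.Azure.Documents;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

public static class CosmosTriggerWithPolling
{
    [FunctionName("CosmosTriggerWithPolling")]
    public static void Run([CosmosDBTrigger(
        databaseName: "ToDoItems",
        collectionName: "Items",
        ConnectionStringSetting = "CosmosDBConnection",
        LeaseCollectionName = "leases",
        FeedPollDelay = 1000,          // milliseconds; the default is 5000
        MaxItemsPerInvocation = 100)]  // illustrative cap on the batch size
        IReadOnlyList<Document> documents,
        ILogger log)
    {
        if (documents != null && documents.Count > 0)
        {
            log.LogInformation($"Received {documents.Count} changed documents");
        }
    }
}
```

A shorter delay lowers latency at the cost of more frequent reads against the container, so it is a trade-off rather than a free win.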
An Azure Function with the CosmosDB trigger is a layer on top of the CFP. It saves us from dealing with hosting, deployment, scaling, and so on, by leaning on the power of Azure Functions.
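Failure handling deserves attention. At the CFP level, when the handler throws an unhandled exception, the checkpoint does not advance and the same batch of changed documents is sent again. That is what gives the at-least-once guarantee: your code will receive the changed documents at least once. It also means the flow can get stuck on the same batch if the failure is not designed to be handled properly. Note that, according to the MS Docs, the Functions trigger by default does not retry a batch after an unhandled exception unless a retry policy is configured, so it is worth verifying the behavior for the extension version in use. Either way, handle failures inside the function and make the processing idempotent.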
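One way to keep a single bad document from failing the whole batch is to process documents one by one and divert failures to a side channel. The helper below is a minimal sketch in plain C#, with no Azure dependencies; the SafeBatch class and its handle and deadLetter callbacks are hypothetical names for illustration, not part of any SDK.

```csharp
using System;
using System.Collections.Generic;

public static class SafeBatch
{
    // Runs handle on every item. A failing item is passed to deadLetter
    // instead of letting the exception escape and fail the whole batch.
    // Returns the number of items that failed.
    public static int Process<T>(
        IReadOnlyList<T> items,
        Action<T> handle,
        Action<T, Exception> deadLetter)
    {
        int failed = 0;
        foreach (T item in items)
        {
            try
            {
                handle(item);
            }
            catch (Exception ex)
            {
                failed++;
                deadLetter(item, ex);
            }
        }
        return failed;
    }
}
```

Inside the function, handle would write the document to the target store (ideally as an upsert keyed by the document id, so a repeated delivery is harmless) and deadLetter would log the error and park the document somewhere for later inspection. If deadLetter itself throws, the exception still escapes, so that callback should be kept as simple as possible.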