Interesting things happen when a system is under load with many concurrent requests. I ran into one such problem while working on a big project: during development, the system was handling many concurrent requests, and many of them were inserting data.
Our system employed an Event Sourcing (ES) architecture. It had been chosen before I joined the project, and it had worked great until that moment.
We started getting many exceptions in the system. We soon tracked down the root cause: some events were not projected, resulting in inconsistent data in the query model (the read side). But how could that happen? What was the root cause of that root cause?
Our approach depends on the order of events persisted in the Event Store, which in our case is a SQL Server implementation. And some of the events were not in the order they were supposed to be.
Let's say there are two requests, R1 and R2:
- R1 generates 3 events, in order: A1, A2, A3
- R2 generates 3 events, in order: B1, B2, B3
The events in a request are sent to the Event Store in a single transaction, and we expect them to get consecutive order values. With the two requests above, we expected the order to be
A1, A2, A3, B1, B2, B3
Or
B1, B2, B3, A1, A2, A3
And it had worked like that for years.
We designed the Events table with the Order column as the primary key with Identity(1,1), so SQL handles the ordering for us.
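For context, here is a minimal sketch of what such a table might look like, written as a string constant in the same style as the queries later in this post. Only the Order column comes from our actual design; the other columns are illustrative.
// A minimal sketch of an Events table that relies on Identity for ordering.
// Only the Order column is taken from our design; the other columns are illustrative.
const string createEventsTableSql =
    "CREATE TABLE Events (" +
    "  [Order] BIGINT IDENTITY(1,1) NOT NULL PRIMARY KEY, " +   // SQL Server assigns 1, 2, 3, ...
    "  AggregateId UNIQUEIDENTIFIER NOT NULL, " +
    "  Payload NVARCHAR(MAX) NOT NULL" +
    ")";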
One day!
When the system was under load with many concurrent requests, the event ordering was no longer consecutive.
What does that mean? Instead of the orders described earlier, we got this in the database:
A1, A2, B1, A3, B2, B3
Notice the positions of B1 and A3.
Because of the way our projection was designed and implemented, when the ordering within a transaction was not consecutive, some events were not projected.
But how could that happen? According to the SQL Server documentation, there is no guarantee that the values generated by an Identity column are consecutive within a transaction. If you want to ensure they are consecutive, you have to use the proper isolation level or locking. Unfortunately, we could not do that: it would hurt write performance, and we want to write events as fast as possible.
Consecutive values within a transaction – A transaction inserting multiple rows is not guaranteed to get consecutive values for the rows because other concurrent inserts might occur on the table. If values must be consecutive then the transaction should use an exclusive lock on the table or use the SERIALIZABLE isolation level.
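For completeness, this is roughly what "an exclusive lock on the table" would look like if we had been willing to pay the price. It is only a sketch with illustrative column names; we did not go this way.
// Illustrative only: forcing consecutive Identity values by taking an exclusive table lock
// for the duration of the insert transaction. Every concurrent writer has to wait,
// which is exactly the write-performance cost we could not afford.
const string insertWithTableLockSql =
    "INSERT INTO Events WITH (TABLOCKX) (AggregateId, Payload) " +
    "VALUES (@aggregateId, @payload)";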
So what could we do? There is always a solution.
If you need ordering, you have to be in charge
The first thing is that we have to manage the order values manually instead of delegating to the SQL Identity feature.
We know the number of events in a request/transaction. So if we know the current order number, we can reserve a range of values from Current + 1 to Current + NumberOfEvents, and then set the current order to the maximum of that range. If reserving a range and updating the current order are performed atomically, other requests will get correct, non-overlapping ranges. It is impossible for two requests to get the same range. And even if that somehow happened, SQL would throw a unique key constraint exception.
We need an EventOrdering table to store the current ordering.
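Here is a sketch of that table: a single row holding the latest order value handed out. The column types are assumptions on my part, but the table and column names match the query below.
// A minimal sketch of the EventOrdering table: one row holding the latest order value.
// Column types are assumptions; the names match the UPDATE statement below.
const string createEventOrderingSql =
    "CREATE TABLE EventOrdering (" +
    "  EventOrderingId UNIQUEIDENTIFIER NOT NULL PRIMARY KEY, " +
    "  CurrentOrdering BIGINT NOT NULL" +
    ")";
// Seed it with a single row so the UPDATE ... OUTPUT query always has something to update.
const string seedEventOrderingSql =
    "INSERT INTO EventOrdering (EventOrderingId, CurrentOrdering) " +
    "VALUES (@eventOrderingId, 0)";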
We use the power of SQL to reserve the range. SQL Server allows returning values from an UPDATE statement. This query does all the work: it sets the new value in a single atomic operation and returns both the old and new values.
const string sql = "UPDATE EventOrdering " +
                   "SET CurrentOrdering = CurrentOrdering + @numberOfEvents " +
                   "OUTPUT Deleted.CurrentOrdering + 1 AS Lowest, Inserted.CurrentOrdering AS Highest " +
                   "WHERE EventOrderingId = @eventOrderingId";
To access the old value, use the Deleted pseudo-table; to access the new value, use the Inserted pseudo-table.
/// <summary>
/// Get the current (latest) ordering and reserve a range of order values from [Current + 1] to [Current + <paramref name="numberOfEvents"/>].
///
/// Example: Current: 11
///          numberOfEvents: 5
///          reserved range: 12, 13, 14, 15, 16
///          Updated Current: 16
///
/// The returned value: [12 - 16] where
///          12: Lowest
///          16: Highest
/// </summary>
/// <param name="numberOfEvents">The number of events in the current request/transaction</param>
/// <returns>The reserved range of order values</returns>
private OrderingRange GetOrderingRange(int numberOfEvents)
{
    const string sql = "UPDATE EventOrdering " +
                       "SET CurrentOrdering = CurrentOrdering + @numberOfEvents " +
                       "OUTPUT Deleted.CurrentOrdering + 1 AS Lowest, Inserted.CurrentOrdering AS Highest " +
                       "WHERE EventOrderingId = @eventOrderingId";

    using (var conn = new SqlConnection(_connectionString).AsOpen())
    {
        var param = new
        {
            numberOfEvents = numberOfEvents,
            eventOrderingId = EventOrderingId
        };
        var ordering = conn.QueryFirstOrDefault<OrderingRange>(sql, param, commandTimeout: CommandTimeout);
        if (ordering == null)
        {
            throw new InvalidOperationException(
                "Cannot read the global ordering record from the table EventOrdering. " +
                "Ensure that the table exists in the database with exactly one row");
        }
        return ordering;
    }
}
/// <summary>
/// We might only need the Lowest value and increase it by 1 for each event.
/// However, having both values makes the range explicit and helps with debugging/logging.
/// </summary>
public class OrderingRange
{
    public long Lowest { get; set; }
    public long Highest { get; set; }
}
Once we got the range, we simply assigned the order values to the events to be saved, as sketched below.
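In code, that assignment is just a counter walking from Lowest to Highest. The EventRecord type and its Order property below are made-up names, not our actual classes.
// Illustrative sketch: assign the reserved, consecutive order values to the events
// before saving them. EventRecord and its Order property are made-up names.
public class EventRecord
{
    public long Order { get; set; }
    // ... payload properties omitted
}

private void AssignOrdering(IReadOnlyList<EventRecord> events)
{
    var range = GetOrderingRange(events.Count);
    var next = range.Lowest;
    foreach (var @event in events)
    {
        @event.Order = next++;   // the last event ends up with range.Highest
    }
    // The events are then inserted with these explicit Order values in one transaction.
}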
Problem solved! And I learned new things about SQL.