Hidden Cost of an Architecture

On a normal day as a developer, I was hunting a performance issue and a memory leak. It sounds mysterious, but it was just another bug, another issue to solve, after all.

When it comes to performance and memory issues, PerfView is a good place to start. The tool gives a very detailed picture of what is going on in memory, at a level a developer can understand.

The system is a WCF service that works based on DataContract. From the profiler, I found that if a returned value is 10MB in size, it costs about 50MB of memory, several times the size of the payload itself. That does not count the memory consumed by the WCF framework to serialize the contract.

Note that I am not judging whether the architecture is good or bad. There were good reasons why it was designed that way.

A very simplified version looks like this

With that simple code setup, here is a simple console app that consumes the service

Here is the result of downloading a 74MB file. The total memory consumed on the heap is 146MB.

There are 2784 objects with a total of 146MB of memory

Where does the extra cost come from? It comes from the BinaryDataContractSerializer.Serialize method.

  1. The memory consumed by the DataContractSerializer.
  2. The memory consumed by the MemoryStream to return an array of bytes.
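To make the two costs concrete, here is a minimal sketch of what such a Serialize method might look like. It is not the actual BinaryDataContractSerializer from the system; the FileResponse contract and the SerializationCostSketch class are hypothetical names I made up for illustration. Only DataContractSerializer, XmlDictionaryWriter, and MemoryStream are real framework types.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

// Hypothetical contract, assumed for illustration only.
[DataContract]
public class FileResponse
{
    [DataMember]
    public byte[] Content { get; set; }
}

// Hypothetical stand-in for a Serialize helper that returns a byte array.
public static class SerializationCostSketch
{
    public static byte[] Serialize<T>(T value)
    {
        var serializer = new DataContractSerializer(typeof(T));
        using (var stream = new MemoryStream())
        using (var writer = XmlDictionaryWriter.CreateBinaryWriter(stream))
        {
            // Cost 1: the serializer writes the whole object graph into the
            // stream. The MemoryStream grows its internal buffer as needed,
            // leaving the smaller, abandoned buffers behind for the GC.
            serializer.WriteObject(writer, value);
            writer.Flush();

            // Cost 2: ToArray allocates a new array and copies the buffer.
            return stream.ToArray();
        }
    }
}
```

With a large Content array, the original bytes, the stream's internal buffer, and the returned array can all be alive at the same time, and that is before WCF serializes the response itself.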

In many cases, with modern hardware, this is not a big problem. The Garbage Collector takes care of reclaiming the memory. And if both request and response are small, you will not even notice. Unless, of course, one day in production there are many requests.

There are a couple of potential issues with consuming more memory

  1. If an object is 85,000 bytes or larger, it is allocated on the Large Object Heap (LOH), which is only collected together with Gen 2. I would suggest you read more about memory allocation, especially the LOH. I am too much of an amateur to explain it well.
  2. It causes memory fragmentation. Memory keeps increasing, and the GC has a very hard time reclaiming it.
  3. And of course, the system is not in good shape.
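A quick way to see the large object threshold in action is to ask the GC which generation a freshly allocated array belongs to. This is only a small sketch; LohSketch and GenerationOfNewArray are names I made up, while GC.GetGeneration is the real framework call.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class LohSketch
{
    // Allocates a byte array of the given length and reports the
    // generation the GC assigns to it right after allocation.
    public static int GenerationOfNewArray(int length)
    {
        var buffer = new byte[length];
        return GC.GetGeneration(buffer);
    }
}
```

A small array, such as LohSketch.GenerationOfNewArray(1000), normally starts in Gen 0. An array of 85,000 bytes or more goes to the Large Object Heap, which GC.GetGeneration reports as Gen 2.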

How could we solve the problem without changing the design, and with little impact?

We know that some operations consume lots of memory, such as downloading a file or returning a data set. Instead of returning the byte array, we extend the response to carry the object. We could do that for all operations and get rid of the byte array. However, there are hundreds of operations, and we want to keep the contract simple and change as little as possible.

So an improved version looks like this

Design a LargeObject contract to reduce the serialization cost
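As a rough illustration of the idea, here is one way such a contract could be shaped. This is a sketch under my own assumptions, not the contract from the real system: the ServiceResponse type, its Payload and LargeObject members, and the ResponseFactory helper are all hypothetical.

```csharp
using System.Runtime.Serialization;

// Hypothetical contract that carries big data as-is.
[DataContract]
public class LargeObject
{
    [DataMember]
    public byte[] Content { get; set; }
}

// Hypothetical response: the existing byte array stays for the hundreds
// of ordinary operations, and the new member is used only by the few
// operations known to return a lot of data.
[DataContract]
public class ServiceResponse
{
    [DataMember]
    public byte[] Payload { get; set; }

    [DataMember]
    public LargeObject LargeObject { get; set; }
}

public static class ResponseFactory
{
    // Wraps the file bytes directly, so there is no intermediate
    // serialize-to-MemoryStream step and no extra copy.
    public static ServiceResponse ForDownload(byte[] fileBytes)
    {
        return new ServiceResponse
        {
            LargeObject = new LargeObject { Content = fileBytes }
        };
    }
}
```

The point of the design is that the response holds a reference to the same array that was read from storage, and WCF serializes it once on the way out.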

Run the application and look at the memory again

There are 366 objects with a total of 73MB of memory

Comparing the two, there is a big win: 2784 objects vs 366 objects; 146MB vs 73MB.

With the increasing power of hardware, RAM and disk are rarely a problem anymore. With the support of managed languages (such as C#/.NET), developers write code without caring much about memory and memory allocation. I am not saying all developers. However, I believe many do not care much about it.

It is about time to care about every single line of code we write, isn't it? We do not have to learn and understand every detail of these topics. These are good enough to start

  1. Memory allocation in Heap, Gen 0, Gen 1, and Gen 2.
  2. Memory fragmentation. Just like disk fragmentation.
  3. Memory profiling at a high level, using tools such as dotMemory or PerfView.
  4. The Garbage Collector. Just having a feel for it is a good start.

I am sure you will be surprised by how fun it is and how far it takes you.

Support serialization with Expando object

Recently I had a chance to work on a project that needs to store and work with dynamic data. The data is stored in RavenDB. After some consideration, I decided to use this implementation from Rick Strahl. Everything works fine, except saving to the database: the object is stored as an empty object in RavenDB.

Adding these lines of code solves the problem


// Expose the dynamically added property names so that a serializer
// can enumerate them.
public override IEnumerable<string> GetDynamicMemberNames()
{
    return Properties.Keys;
}
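To show why this override matters, here is a stripped-down dynamic type. It is not Rick Strahl's Expando class; SimpleExpando is a hypothetical, much smaller version that I assume is close enough to make the point. The base DynamicObject.GetDynamicMemberNames returns an empty sequence, so any code that relies on it to discover members sees no properties at all. My understanding is that the JSON serializer used by RavenDB does exactly that, which would explain the empty document.

```csharp
using System.Collections.Generic;
using System.Dynamic;

// Hypothetical, simplified dynamic type backed by a dictionary.
public class SimpleExpando : DynamicObject
{
    public Dictionary<string, object> Properties { get; } =
        new Dictionary<string, object>();

    // Called for reads such as: item.Name
    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        return Properties.TryGetValue(binder.Name, out result);
    }

    // Called for writes such as: item.Name = "value"
    public override bool TrySetMember(SetMemberBinder binder, object value)
    {
        Properties[binder.Name] = value;
        return true;
    }

    // Without this override the base class reports no member names.
    public override IEnumerable<string> GetDynamicMemberNames()
    {
        return Properties.Keys;
    }
}
```

After setting a couple of properties on a SimpleExpando, GetDynamicMemberNames returns their names, so a serializer that asks for them has something to write.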

The end result is nice
