Avoid Potential Problems with Explicit API Design
A year ago, I wrote Leaky Abstraction – Linq Usage. I have used that design whenever I met the same challenge. The system worked as expected, and I was happy with the design.
That was true until recently, when a need arose to order the collection. Say that we need to order a TeacherCollection by years of experience, because there is a requirement to print the result in that order.
It is a pretty simple requirement. In the TeacherCollection constructor, this code will do the job.
public TeacherCollection(IEnumerable<Teacher> teachers)
{
    if (teachers != null)
    {
        // Keep only active teachers, then sort by start date (earliest start first).
        _teachers = teachers.Where(x => x.IsStillAtWork).OrderBy(x => x.StartedOn).ToList();
    }
}
Everything works as expected.
Boom! The system becomes very slow as the data grows. The profiler shows that a large share of the time is spent on the ordering. In that system, the ordering logic is much more complicated; it is not a plain Linq sort as in the example. Still, a single sort should not be a problem. That is for sure.
The problem is that the collection is accessed too many times. Because the ordering lives in the constructor, its cost is paid again every time a collection is built. That is also an expected result, because the collection is designed to filter data and to work in a pipeline in a safe way.
The ordering logic should not be placed in the constructor, even though the collection itself has all the information needed to do both the sorting and the filtering.
What should we change in terms of the design to solve the problem and still support the sorting?
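To make the cost visible, here is a minimal sketch, not the real system. It assumes that each pipeline step returns a new TeacherCollection. The Subject property, the TeachingSubject and StartedBefore methods, and the SortCount counter are hypothetical names added only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Teacher
{
    public string Name { get; set; } = "";
    public string Subject { get; set; } = "";   // hypothetical, for the demo only
    public DateTime StartedOn { get; set; }
    public bool IsStillAtWork { get; set; }
}

public class TeacherCollection
{
    // Demo-only counter: how many times the constructor has run the sort.
    public static int SortCount;

    private readonly List<Teacher> _teachers = new List<Teacher>();

    public TeacherCollection(IEnumerable<Teacher> teachers)
    {
        if (teachers != null)
        {
            SortCount++;
            _teachers = teachers.Where(x => x.IsStillAtWork).OrderBy(x => x.StartedOn).ToList();
        }
    }

    public int Count => _teachers.Count;

    // Hypothetical pipeline steps: each one builds a new collection,
    // so each one goes through the constructor and sorts again.
    public TeacherCollection TeachingSubject(string subject) =>
        new TeacherCollection(_teachers.Where(x => x.Subject == subject));

    public TeacherCollection StartedBefore(DateTime date) =>
        new TeacherCollection(_teachers.Where(x => x.StartedOn < date));
}

public static class Program
{
    public static void Main()
    {
        var teachers = new[]
        {
            new Teacher { Name = "Ann", Subject = "Math", StartedOn = new DateTime(2010, 9, 1), IsStillAtWork = true },
            new Teacher { Name = "Bob", Subject = "Math", StartedOn = new DateTime(2018, 9, 1), IsStillAtWork = true },
            new Teacher { Name = "Cy", Subject = "Physics", StartedOn = new DateTime(2005, 9, 1), IsStillAtWork = true },
        };

        var result = new TeacherCollection(teachers)
            .TeachingSubject("Math")
            .StartedBefore(new DateTime(2015, 1, 1));

        // Prints "Teachers: 1, sorts: 3": one sort per collection in the pipeline.
        Console.WriteLine($"Teachers: {result.Count}, sorts: {TeacherCollection.SortCount}");
    }
}
```

Only the last collection is ever shown to anyone, yet the sort runs three times. With a more expensive ordering and a longer pipeline, that waste is what shows up in the profiler.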
Identify Responsibilities
In my opinion, this is the most difficult part of writing code. I have not found any exact formula to get it right. Identifying responsibilities is a heuristic. Experience matters here.
Filtering and ordering should be treated as two separate responsibilities. It is very easy to mix them in one implementation, and that makes the code error-prone. When defining a responsibility, one should consider at least two factors:
- The purpose of each: one is for filtering, the other is for sorting. They are two different operations.
- When each is used, and how often: filtering is used a lot to extract a sub-collection from the original collection. Ordering, on the other hand, is only needed when a final result is displayed to the end user or rendered in some other form of presentation, such as a console screen or a Word document.
It is kind of tricky to see them as separate responsibilities. In many cases I did not even bother to think about it. Well, this bug proved that I was wrong. Sometimes it just sounds cool and simple to order the list right away.
Extract Explicit Interfaces
Before moving on, let’s take a look at the TeacherCollection. The additional feature we need is the ability to get the exact index of a teacher in the ordered result.
ITeacherCollection – The default interface, extracted from the current TeacherCollection. The School now holds an instance of ITeacherCollection instead of the TeacherCollection implementation. This refactoring step will not break anything.
IIndexedTeacherCollection – A simple interface that supplies only the GetIndex API. A key point is that a consumer cannot instantiate it. The only way to get this API is through a transition from ITeacherCollection, by calling BuildIndex.
The sorting cost is paid only when the need arises.
The actual implementation of the improved design is almost identical to the original version. All the major implementation logic is still in the TeacherCollection class. The refactoring is safe because the compiler will tell us what goes wrong.
The client (the Program class) only deals with the interfaces and the transition between them.
Why didn’t I have the ITeacherCollection the first time? Well, we did not need it at that time. We should not make things complicated if there is no demand. Design evolves.
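The shape of the design can be sketched as follows. This is a minimal illustration under assumptions, not the exact code from the system: the Count member, the private nested IndexedTeacherCollection class, the dictionary-based lookup, and the -1 result for an unknown teacher are all choices made for this example. Only the names ITeacherCollection, IIndexedTeacherCollection, BuildIndex, and GetIndex come from the description above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Teacher
{
    public string Name { get; set; } = "";
    public DateTime StartedOn { get; set; }
    public bool IsStillAtWork { get; set; }
}

public interface IIndexedTeacherCollection
{
    // Position of the teacher in experience order, or -1 if unknown (assumed convention).
    int GetIndex(Teacher teacher);
}

public interface ITeacherCollection
{
    int Count { get; }   // hypothetical member standing in for the existing API

    // The only way to obtain an IIndexedTeacherCollection.
    IIndexedTeacherCollection BuildIndex();
}

public class TeacherCollection : ITeacherCollection
{
    private readonly List<Teacher> _teachers = new List<Teacher>();

    public TeacherCollection(IEnumerable<Teacher> teachers)
    {
        // Filtering only: the constructor no longer sorts.
        if (teachers != null)
            _teachers = teachers.Where(x => x.IsStillAtWork).ToList();
    }

    public int Count => _teachers.Count;

    // The sort happens here, once, and only when a caller asks for it.
    public IIndexedTeacherCollection BuildIndex() => new IndexedTeacherCollection(_teachers);

    // Private nested class: consumers cannot instantiate it themselves.
    private sealed class IndexedTeacherCollection : IIndexedTeacherCollection
    {
        private readonly Dictionary<Teacher, int> _indexes;

        public IndexedTeacherCollection(IEnumerable<Teacher> teachers)
        {
            // Assumes each Teacher instance appears at most once in the collection.
            _indexes = teachers
                .OrderBy(x => x.StartedOn)
                .Select((teacher, index) => new { teacher, index })
                .ToDictionary(x => x.teacher, x => x.index);
        }

        public int GetIndex(Teacher teacher) =>
            _indexes.TryGetValue(teacher, out var index) ? index : -1;
    }
}

public static class Program
{
    public static void Main()
    {
        var ann = new Teacher { Name = "Ann", StartedOn = new DateTime(2010, 9, 1), IsStillAtWork = true };
        var bob = new Teacher { Name = "Bob", StartedOn = new DateTime(2018, 9, 1), IsStillAtWork = true };
        var cy = new Teacher { Name = "Cy", StartedOn = new DateTime(2005, 9, 1), IsStillAtWork = true };

        ITeacherCollection teachers = new TeacherCollection(new[] { ann, bob, cy });
        IIndexedTeacherCollection indexed = teachers.BuildIndex();

        Console.WriteLine(indexed.GetIndex(cy));   // 0: earliest start date
        Console.WriteLine(indexed.GetIndex(ann));  // 1
        Console.WriteLine(indexed.GetIndex(bob));  // 2
    }
}
```

Making the transition an explicit method call is the point of the design: a caller who only filters never pays for ordering, and a caller who needs positions has to say so by calling BuildIndex.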
I just solved a bug beautifully.