Usually, you see posts with the opposite header: 10 reasons why you should not create your own DAL / ORM. There are certainly lots of legitimate reasons why you should not, but in some cases there are very strong reasons for writing your own Data Access Layer / Object Relational Mapper. We just did, at Timpex, and we are very proud of the result. Most of the time, I was the only developer on the project. Pair programming was not possible at the time because of physical obstacles – I was isolated at a new department office in another city. In the later phase another coworker, Adlan Elmurzajev, joined the project, and we placed the final bricks. The DAL was initiated by me and the leader of the development team, and management gave me the time and resources to implement it.
I needed to change the coding strategy after three weeks. My idea was to start out with some basic SQL stuff with joins, then write some integration tests. For the pure composition of SQL I wanted to go test-first. However, as things went on, the code got complex, and the iterations between new integration tests took longer and longer. I decided to change strategy – I threw all existing code “out of the window” and started over again – but this time entirely with TDD. I realized that integration tests did not give much value at this point, because they didn’t drive the design. I felt a lot more comfortable writing tests first, and the design evolved quite nicely. I did have to get used to the idea that I wasn’t going to touch the database for a long, long time… actually not for two months. However, this felt more like a relief than a frustration. But enough intro talk for now, here are my 10 reasons:
1. Because you are dealing with a legacy database. In our case we had to work against a legacy database with some structures from 1987. The database technology itself was not legacy, but the actual database schema was. It had no physical relations, only logical (programmatic) relations. Some relations were a bit odd, and impossible to represent (without hacks) with a third-party ORM like NHibernate or Entity Framework. DB nulls were represented by zeros in integer columns and by String.Empty in string columns. Datetime columns were not present; instead, dates and times were represented by integers in different formats: YYYYMMDD, HHmmSS or just HHmm. Values that carried both the day and the time of day were composed of two integer columns: YYYYMMDD + HHmmSS. In some cases the same table was used for different concepts, but with no discriminator column. E.g. we had an Order Table that was used for both ‘Order’ and ‘Order Template’ – and they were separated by two OrderNo ranges: regular orders in range xxxx..yyyy and template orders in range yyyy..zzzz – we called this a Formula Discriminator. The ranges themselves were kept in a Range Table. So why didn’t we just upgrade the customer databases with migration scripts? Well, first of all we have a lot of customers running – at least 200. Secondly, the major part of the program is not written in C# but in a 4GL language – and unfortunately with no single point of entity creation: DB inserts and updates are scattered and duplicated throughout the 4GL code. We have a long-term strategy to port / rewrite all of our code in C#, but this is not going to happen overnight. Until then, C#, 4GL and the legacy database must happily live together.
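To make the legacy conventions above concrete, here is a minimal sketch of the kind of conversions such a DAL has to perform. It is written in Python for brevity (our DAL is in C#), and every name in it is hypothetical. The order number ranges are made-up placeholders for the xxxx..yyyy and yyyy..zzzz values, and treating them as half-open ranges is an assumption.

```python
from datetime import date, datetime, time
from typing import Optional


def decode_date(yyyymmdd: int) -> Optional[date]:
    """Decode a YYYYMMDD integer; 0 is the legacy stand-in for DB null."""
    if yyyymmdd == 0:
        return None
    return date(yyyymmdd // 10000, yyyymmdd // 100 % 100, yyyymmdd % 100)


def decode_time(hhmmss: int) -> time:
    """Decode an HHmmSS integer. 0 decodes to midnight, because a zero
    time cannot be told apart from a missing one in this encoding."""
    return time(hhmmss // 10000, hhmmss // 100 % 100, hhmmss % 100)


def decode_datetime(yyyymmdd: int, hhmmss: int) -> Optional[datetime]:
    """Compose a datetime from the two integer columns."""
    day = decode_date(yyyymmdd)
    if day is None:
        return None
    return datetime.combine(day, decode_time(hhmmss))


def decode_text(value: str) -> Optional[str]:
    """An empty string is the legacy stand-in for DB null in string columns."""
    return value if value != "" else None


# Hypothetical ranges; in the real system they would be read from the Range Table.
ORDER_RANGES = {
    "Order": range(1, 500000),
    "OrderTemplate": range(500000, 1000000),
}


def classify_order(order_no: int, ranges=ORDER_RANGES) -> str:
    """'Formula discriminator': the concept is derived from the OrderNo range."""
    for concept, order_range in ranges.items():
        if order_no in order_range:
            return concept
    raise ValueError(f"OrderNo {order_no} is outside all known ranges")
```

With conversions like these kept in one place, the rest of the code can work with real dates, real nulls and distinct Order / Order Template types, while the 4GL code keeps writing the old integer formats.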
2. Because you don’t want to spend days searching for answers on forums (like hibernate.org or adodotnetentityframework).
3. Because you don’t wanna post topics (at nhibernate.org) that no-one’s gonna answer anyway.
4. Because you don’t want to buy or spend time with a profiler. Lately, profilers have gained a lot of popularity. Developers need profilers to find out why their ORM is so slow / chatty against the database. This is usually caused by the lack of a fetching strategy and by relying on lazy loading instead. In our custom DAL, we have an option to eagerly load entities to any level. Let’s say you have an order with order lines, an order line with a package and a package with items. If your service knows it’s gonna need all levels down to Item to do its computation, it should tell the FetchingService to do so: FetchingService.IncludeOrderLines().IncludePackage().IncludeItems(). This feature is often poorly implemented in third-party ORMs – e.g. NHibernate offers a poor fetching strategy. Entity Framework does have a quite good fetching strategy, by the way.
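The idea behind such a fluent fetching call can be sketched roughly as follows. This is not our actual FetchingService; it is a small Python illustration (the real DAL is C#) with an in-memory fake database, and the names FetchPlan, FakeDb, load_orders and the id / parent_id columns are all assumptions made for the example. The point is that the caller declares the depth up front, so the loader can issue one query per level instead of one lazy query per row.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class FetchPlan:
    """Immutable description of how deep to load, starting at Order."""
    levels: Tuple[str, ...] = ("Order",)

    def _extend(self, parent: str, child: str) -> "FetchPlan":
        # Each level can only be included directly below its parent level.
        if self.levels[-1] != parent:
            raise ValueError(f"{child} can only be included below {parent}")
        return FetchPlan(self.levels + (child,))

    def include_order_lines(self) -> "FetchPlan":
        return self._extend("Order", "OrderLine")

    def include_package(self) -> "FetchPlan":
        return self._extend("OrderLine", "Package")

    def include_items(self) -> "FetchPlan":
        return self._extend("Package", "Item")


class FakeDb:
    """In-memory stand-in for the database that counts round trips."""

    def __init__(self, tables: Dict[str, List[dict]]):
        self.tables = tables
        self.query_count = 0

    def select_where_in(self, table: str, column: str, values) -> List[dict]:
        self.query_count += 1
        wanted = set(values)
        return [row for row in self.tables[table] if row[column] in wanted]


def load_orders(db: FakeDb, order_ids, plan: FetchPlan) -> Dict[str, List[dict]]:
    """Load orders and every level named in the plan, one query per level."""
    rows = db.select_where_in("Order", "id", order_ids)
    result = {"Order": rows}
    for level in plan.levels[1:]:
        parent_ids = [row["id"] for row in rows]
        rows = db.select_where_in(level, "parent_id", parent_ids)
        result[level] = rows
    return result
```

A call such as `load_orders(db, [1], FetchPlan().include_order_lines().include_package().include_items())` touches the database once per level, regardless of how many order lines, packages or items there are. With lazy loading the number of queries would instead grow with the number of rows.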
5. Because you want to give better estimates. When I give estimates to the project team leader, I find the riskiest part to be the ORM part. This has always been a pain in the ass, and the ORM seems to be what breaks a tight schedule every time.
6. Because you don’t want to do mapping with a Designer. Using a designer (e.g. Entity Framework) is code generation – which is one of the seven blind alleys in software design. It works very badly with code migrations and with a legacy database that has no physical relations. It also works badly with a code-first (TDD) approach.
7. Because you don’t want to map with XML. We are programmers and we can code. I am sure lots of you have been messing around with NHibernate mapping files. And, yes, I know NHibernate supports fluent syntax. Problem is, the fluent syntax is poorly documented and there is not a one-to-one correspondence between XML mapping and fluent mapping.
8. Because you want to apply Dependency Injection to Entities. DI in entities is generally not recommended when working with an ORM. E.g. if you want to apply DI with NHibernate, you would need to do some AOP (aspect-oriented programming) to “interrupt” the ORM code and run some build-up mechanism with the IoC container – both when retrieving entities from the database and when creating new ones. In my eyes, AOP is just a hack to avoid a bigger hack. The recommended approach is to let the caller pass the dependent classes to the entity. In my opinion, this is just another ugly workaround that breaks encapsulation / the facade principle and the single responsibility principle.
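When you own the DAL, the place where entities get built is your own code, so the container can be consulted there without any AOP. A minimal sketch of that idea, in Python rather than C# and with purely hypothetical names (Container, PriceCalculator, OrderFactory, the OrderNo / NetAmount columns), could look like this. Amounts are integers in the smallest currency unit to keep the arithmetic exact.

```python
class PriceCalculator:
    """Hypothetical service that an entity depends on."""

    def __init__(self, vat_percent: int):
        self._vat_percent = vat_percent

    def total(self, net_amount: int) -> int:
        return net_amount + net_amount * self._vat_percent // 100


class Order:
    """Entity that receives its dependency through the constructor."""

    def __init__(self, order_no: int, net_amount: int, price_calculator: PriceCalculator):
        self.order_no = order_no
        self.net_amount = net_amount
        self._price_calculator = price_calculator

    def total(self) -> int:
        # The caller does not need to know which services the entity uses.
        return self._price_calculator.total(self.net_amount)


class Container:
    """Tiny stand-in for an IoC container."""

    def __init__(self):
        self._providers = {}

    def register(self, key, provider):
        self._providers[key] = provider

    def resolve(self, key):
        return self._providers[key]()


class OrderFactory:
    """Owned by the DAL; the single place where Order entities are built."""

    def __init__(self, container: Container):
        self._container = container

    def _build(self, order_no: int, net_amount: int) -> Order:
        return Order(order_no, net_amount, self._container.resolve(PriceCalculator))

    def create_new(self, order_no: int, net_amount: int) -> Order:
        """Used by application code when creating a brand new order."""
        return self._build(order_no, net_amount)

    def from_row(self, row: dict) -> Order:
        """Used by the DAL when materializing an order from a database row."""
        return self._build(row["OrderNo"], row["NetAmount"])
```

Both paths, new entities and entities read from the database, go through the same factory, so the entity always arrives with its dependencies in place and callers never have to pass them in.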
9. Because you write an understandable API – not a framework. Since you write your DAL with TDD, you write no more code than you need. And you know exactly how your code behaves. You don’t write a general-purpose framework with lots of overloads and options. This means code changes and mapping new concepts will be easy. And you don’t need to mess with the DLL version conflicts you often get when relying on a SomeOrm.dll that depends on someotherFramework.dll.
10. Because you will learn a lot. When writing a custom DAL you will get to know transactions, locking strategies, the unit of work concept, hash codes, equality and caching. These are concepts that you ought to know anyhow. And most important of all: you will become better at TDD – because writing a DAL is indeed complex, and TDD is the only way to go when implementing a complex system.
Happy Easter, everybody!