Whither IMDB?

Whither IMDB?

Last week a customer asked me what happened to the in-memory database (IMDB) that was originally suppose to be part of Windows 2000. While I had almost no involvement with IMDB, one of the people who was involved worked for me at the time so I was around when the decision to not release it was made – even though it didn’t go the way I thought it would. I thought a few people might be interested in why it wasn’t shipped and what can be learned about in-memory databases from the decision process.

The bottom line was that while it was possible to write tests that showed IMDB to be significantly faster than SQL Server, when we tried complete three tier applications, using IMDB as a middle tier cache was not a lot faster than using SQL Server as a middle-tier cache. While this may not be true for all in-memory databases, an analysis of why IMDB didn’t blow SQL Server away might give you something to think about when considering middle-tier caching solutions.

As I said, this may not apply to all middle-tier solutions but in my opinion this is why IMDB didn’t have the performance advantage we expected:

· Pretty much by definition if you have enough memory for IMDB to store the database in memory, there’s enough memory so that all the data will fit into SQL Server’s buffer memory. This means that once the data has been read once, most of the IMDB advantage is gone. In addition, if you want to cache a little more data than you have memory available, SQL Server will handle it but IMDB won’t.

· Even though a cache is generally a read-mostly database, unless you are willing to live with a badly out-of-date cache, the IMDB is going to have to implement locks to allow updates of the cached data. SQL Server transactional replication has been tuned for many years for maximum performance and minimum impact to the publishing system so it’s difficult for IMDB to beat the update performance of SQL Server.

· SQL Server has one of the best query optimizers available so complex queries are likely to perform better in SQL Server than in IMDB. For simple queries against small data sets, using the brute-force approach might actually give IMDB an advantage.

· When the system starts up, SQL Server has to recover databases while IMDB has to reload the whole database over the network from the master database. This includes rebuilding any indexes maintained by IMDB. If the cache runs in the same process as the web application, it will have to be rebuilt every time the web app is recycled.

· IMDB exposed an OLEDB interface. While this was the right ting to do to minimize the number of application changes required to use the cached data, it means that all the processing required to serialize and deserialize data from the database connection has to be done with both IMDB and SQL Server. This is why caching the data in datasets or data objects might give better performance than a database-style cache. The dependency objects for SQL Server 2005 make this even more attractive because the middle-tier objects can be informed when a database change that would invalidate the cached objects has occurred.

The obvious disadvantage of SQL Server as a middle-tier cache is that unless you can get by with the 1GB of cache memory that SQL Server Express supports, you are going to have to pay for a SQL Server license. If I remember correctly, IMDB only supported 2GB of memory but it was supposed to be free so the price was right.

As I said before, this analysis may not apply to the caching solution you are considering but I hope it convinces you to do some testing before you assume an in-memory database will be the answer to your caching needs. The Microsoft IMDB team invested a great deal of effort making it as compatible as possible with SQL Server – even including some SQL Server code. It’s possible that a more simple-minded implementation that only supported a subset of relational DB functionality could provide significantly better performance.

If you are considering a caching solution, I strongly recommend spending some time understanding the Patterns and Practices Caching Architecture guide: https://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnbda/html/CachingArch.asp