Data Access API of the Day, Part I

Why does it seem like Microsoft is coming up with a new Data Access API every time you turn around? First there was DbLib, then ODBC, then DAO, RDO, ODBCDirect, OLEDB, ADO, ADO.NET -- When is Microsoft going to run out of letters? And now what's this LINQ thing? And what about these "Entities" someone mentioned the other day? And who is the person writing this BLOG?

I'll answer the last question first. My name is Mike Pizzo, and I'm an Architect in the Data Programmability team at Microsoft – yes, the same team which brought you that “alphabet soup” of Data Access interfaces. In fact, I've been in the Data Programmability Team (under various names) for better than 15 years, and personally contributed to each of those APIs (with the exception of DbLib).

Following is a short history of the evolution of Microsoft Data Access APIs in four parts (so far…)

My goal is to describe the industry trends, platform shifts, and usage scenarios that shaped our API landscape over the past 15 years (and, in the retelling, perhaps receive some absolution for my sins…)

So, without further “ado”…

Part I – The Early Years…

First, let me apologize. I know – Microsoft knows – that changing something as fundamental as the way your program accesses and works with data is a tremendous cost in terms of existing codebase, tools, components and expertise. Every new option increases the decision matrix for writing new programs. I hate the very thought of mentioning a new data access API to customers. Some days I wish we could have stayed just with ODBC all of these years.

But what if we had? What if we had simply extended ODBC to support building a federated database out of individually reusable components? What if we had extended ODBC to support a disconnected programming model? What if we had extended ODBC to work over XML? What if we made ODBC work with CLR Objects?

The answer is; it wouldn't be ODBC. ODBC (which stands for Open DataBase Connectivity) was designed for a specific purpose; to be a common Call Level Interface (CLI) to a relational database. Through several years, many Standards meetings, and a good deal of excellent Indian food (a favorite of Jim Melton’s, the editor of the ANSI SQL Specification) we even made ODBC into a SQL/CLI addendum to the SQL 92 specification.

In our ODBC 2.0 design preview back in the early 90s I presented a set of proposed ODBC ISAM (Indexed Sequential Access Method) extensions for working with sets of indexed data that didn't support Query Processing. The idea was that we could provide a common query processor over non-relational sources like text, excel, dBase and Paradox files. The feedback we got was positive – that we should support ISAMs, but unequivocal that it NOT be part of ODBC. ODBC's strength, we were told, was its direct tie to relational stores. This feedback totally surprised us, but in retrospect they were absolutely right, and I believe that's why ODBC is still popular today.

And then something happened. Visual Basic became popular as a scriptable "automation language". ODBC, being a C-style interface, was not directly consumable from VB. However, some of you clever folks figured out that Microsoft Access supported executing queries against ODBC Datasources, and that Access did support scriptable automation through its Data Access Object (DAO) API. Voila! Now you could write applications against ODBC sources using VB.

However, DAO went through Access's internal "Jet" (Joint Engine Technology) database engine, which defaulted to building local keysets for each result in order to do advanced query processing and cursoring against the remote data. This was fine if you needed that functionality, but significant performance overhead and additional round trips when you didn't.

Enter the Visual Basic team who, responding to customer demand for better performance against ODBC sources, came up with something called Remote Data Objects (RDO). RDO implemented the same DAO programming patterns directly against ODBC, rather than going through Jet. RDO was extremely popular among VB developers, but the fact that we had two different sets of automation objects for accessing ODBC sources caused confusion.

But apparently not enough confusion, because our solution was to introduce "ODBCDirect". Despite its name, ODBCDirect was not a new API; it was just a mode we added to DAO that set defaults in such a way as to avoid the overhead of building keysets and such.

And then something happened. The industry made a big push toward object-oriented, distributed, componentized software. With the Object Management Group (OMG) pushing their Common Object Request Broker Architecture (CORBA), Microsoft needed first-class support for data access for its Component Object Model (COM)…

Next: Part II – Componentizing Data Access (OLE DB)

Mike Pizzo
Architect, Data Programmability