ObjectSpaces: The Power of the Dot


When the ObjectSpaces project was first getting underway, back when the team consisted of just myself and Luca, there was a strange sort of awakening that occurred.  One that can only be described as the sudden realization and belief in the power of the dot.


Now, I’m not intending to steal the thunder from any company that wants to be known as the dot in dot-net or some such inanity.  I’m talking about the true, honest to God, DOT.  The dot in your code.


You see, the ObjectSpaces project was born as an infant from its data-access (ADO) and XML parents.  We were trying to complete the trinity of data: relational, XML and objects.  Today Erik Meijer likens these to rectangles, triangles and circles respectively.


We started out thinking that we ought to leverage as much as possible from the current framework, so the first prototype of ObjectSpaces was built over the dataset.  What was shown at the PDC a couple of years ago was pretty much this, except there was not much of my original code left in it after turning it over to Jason and his team.  In fact, the most current codebase doesn’t even recognize me anymore.  When I go to see it, it shies away behind Jason and Luca.


Anyhow, we also knew that query would be very important for this new thing.  Too many attempts in the past tried treating persistent objects as things you could only retrieve given a reference or key.  We knew we had to do better, and we knew we had the perfect language already.


You see, XML and objects were pretty similar.  After talking to a lot of customers that worked with objects in data-access layers or with other persistence technologies we found that stored objects were very much graph-like in nature, meaning there was a lot of hierarchy that could be accessed from a variety of directions.  Sometimes, though, all you had was a starting point, and so the only way to ask any meaningful question of the data was to query into it from that one point.  This was very different from relational data, which is always broken down into distinct tables.  However, XML was very much like this object data.  XML was a pure tree structure, but a query language like XPath was ideally suited for drilling down through structure.


We had XPath.  Heck, we had just written probably the fastest XPath processor on the planet.  It was no great leap of logic to see that XPath would be the ideal language for querying graphs of objects.  So I set out to purloin the XPath parser and execution engine into my project folder.  When I was done I could use the XPath engine to query objects in memory and a translator I wrote to query objects stored in SQL Server.  Everything seemed peachy.


Then we showed our wondrous invention to a variety of customers under NDA.  Most of them already had experience using object persistence technologies, so they pretty much knew what they wanted to build.  However, when it came to the point of writing the queries they were perplexed.  “What’s this XPath thing?”  Luca showed them some queries, but they all but scratched their heads.  You see, they didn’t much know about XML or XPath at this point. Even the ones that did found it a very difficult leap to make, transitioning the understanding of their object model into an XML schema and back.  It just wasn’t natural.


That’s where we had made our mistake.  We innocently assumed that since we were trying to unify all of relational, XML and objects into one tidy unit, our customers would too.  Except they weren’t.  As it turns out for many of them, the whole point of writing their own abstraction layers over ADO or using a persistence technology was to be able to remove themselves from the nitty-gritty of data storage, so they could focus on their applications, all of which were expressed in terms of their own objects.


They only wanted their ‘application programmers’ messing with their objects, and not to be bothered with details about how the information is stored or expressed in other forms.  They wanted their programmers concerned with the business logic and the code they were writing.


That’s where the inspiration came from.  I deduced that the problem that was hanging them up was the awkward slash notation of XPath and the angle brackets of XML.  They were unnatural because they didn’t fit into their world view of what they did.  They wrote code in languages like VB or now C#.  If the query language looked like their code, and the data representation in the query always looked like the objects as defined in their programming language, it would be a much more natural fit.


So I went to work.  I went into the code and started to make some changes.  For one, all tag names changed to property names.  That’s what you do in code, you access properties.  And the big change for the parser: I changed the slash to a dot.  That’s it.  Now to be truthful, the language has changed considerably since that day.  It has become much more formalized and is now very different from XPath, but back then that’s all it was.
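
To give a feel for how small that change was, here is a minimal sketch in Python. It is not the ObjectSpaces code and not the real OPath grammar; `resolve_path`, `Customer` and `Order` are hypothetical names made up for illustration. The walker reads one property per step and is the same either way. Only the separator the programmer types is different.

```python
from dataclasses import dataclass, field


@dataclass
class Order:
    # Hypothetical persistent object, for illustration only.
    id: int
    total: float


@dataclass
class Customer:
    # Hypothetical root object that owns a collection of orders.
    name: str
    orders: list = field(default_factory=list)


def resolve_path(root, path, separator="."):
    """Walk an object graph one property name at a time.

    Each step reads the named property from every object found so far.
    A list-valued property fans out into its elements, loosely the way
    an XPath step fans out over child elements.
    """
    current = [root]
    for name in path.split(separator):
        next_items = []
        for obj in current:
            value = getattr(obj, name)
            if isinstance(value, (list, tuple)):
                next_items.extend(value)
            else:
                next_items.append(value)
        current = next_items
    return current


customer = Customer("Contoso", [Order(1, 25.0), Order(2, 40.0)])

# The same walk, written the way the code reads versus the XPath way.
dotted = resolve_path(customer, "orders.total")
slashed = resolve_path(customer, "orders/total", separator="/")
```

Both calls return the same list of totals. The dotted form simply reads like the property access the programmer would have written in code anyway.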


We took the prototype back to the same customers.  Lo and behold, they ‘got it’.  Without any documentation at all, after only a brief example they were off writing their own queries.  It was like magic.  It was the dot, stupid.


Now that’s power.


Matt

Comments (26)

  1. Jeff says:

    There is a lot of neat stuff in Whidbey, but honestly I can’t say I’ve been more excited about anything else. ObjectSpaces solve so many problems I’ve had over the last two years or so, and the RTM can’t come soon enough!

  2. RebelGeekz says:

    Changing one char for another can make such a great difference.

    Welcome to software ergonomics.

    Ergonomics: an applied science concerned with designing and arranging things people use so that the people and things interact most efficiently and safely — called also human engineering.

  3. Jason Mauss says:

    Matt (or anyone) – can you point me towards a primer on ObjectSpaces and what they’re all about? I’d like to learn more.

  4. From "ObjectSpaces: The Power of the Dot" (http://weblogs.asp.net/mattwar/archive/2004/02/18/75770.aspx):

    "They only wanted their ‘application programmers’ messing with their objects, and not to be bothered with details about how the information is stored or expressed in other forms. They wanted their programmers concerned with the business logic and the code they were writing."

    I suggest reading the entire post – but this really sums up why I think that datareaders and datasets shouldn’t be let out of captivity.

  5. Paul Wilson says:

    Hi Matt:

    I agree the dot is better than the slash, but I don’t buy that it’s good enough. Why can’t we have some type of strongly typed query API that’s fully supported by intellisense? Instead of GetObject("MyObject.MyProperty = 1"), why can’t I have GetObject(MyObject.MyProperty, Operator.Equals, 1), where intellisense helps me with MyProperty? I’m sure it’s not trivial, but surely it’s doable since the ASP.NET team does some on the fly strongly typed stuff that works with intellisense for MasterPages. And surely the actual customers you speak of would prefer working with objects and intellisense instead of learning even the not-so-hard OPath syntax that is still prone to errors.

    Thanks, Paul Wilson

  6. Matt says:

    You are correct. Strongly typed queries embedded in a host language beat queries in a string any day of the week. This is exactly why I spent the last 2 years working on the X#/Xen research project. One day you may just get your wish. 😉

  7. Matt says:

    If you want to find out about ObjectSpaces you can check out the newsgroup: microsoft.public.objectspaces

    You can also check out:

    http://longhorn.msdn.microsoft.com/lhsdk/ref/system.data.objectspaces.aspx

    Some independent sites too:

    http://my.execpc.com/~gopalan/dotnet/object_spaces/object_spaces.html

    http://www.langreiter.com/space/ObjectSpaces

    Also, check out Andrew’s blog. He covers a lot about ObjectSpaces.

    http://weblogs.asp.net/aconrad/category/2231.aspx

  8. Jason Mauss says:

    Thanks Matt & Paul – I’ll check those out.

  9. David Goldstein says:

    Quoth Paul:

    "Instead of GetObject("MyObject.MyProperty = 1"), why can’t I have GetObject(MyObject.MyProperty, Operator.Equals, 1)"

    That’s why I keep wishing for C# (sure VB.NET too) to allow syntax to reference a property or field, sorta like this:

    typeof(MyClass.MyProperty)

    or even a java like syntax (eek!)

    MyClass.class

    MyClass.MyProperty.member

    Then you can directly reference your member (boy that sounds, er, obscene….) while getting compile-time checking.

    HEY have you considered allowing precompiled OPath queries, kind of like regex?

    private static readonly OPathQuery QUERY_CustomersEtc =

    new OPathQuery("MyClass.MyProperty = 1");

    but OOPS oh yeah where’s the context… *sigh*

    Maybe the key then is to be able to declare queries in your mapping file.

    <query id="CustomersEtc">

    <where syntax="opath">MyCustomer.MyProperty = 1</where>

    </query>

    dunno.. sorry I haven’t studied OPath enough yet to give a good example.

    but here the very cool (IMHO) thing is that you _administratively_ declare a query… and what relationships to traverse. the uncool part is that the input parameters and output columns still end up having loosely coupled dependencies with the code that uses it…

  10. Users of the Wilson O/R Mapper sometimes ask for some documentation about OPath in their user forum, since the mapper supports a query language similar (or identical?) to OPath.

    Here you are—ObjectSpaces articles that might be useful when working with Wilson O/R Mapper (a design goal of which seems to be to mimic the ObjectSpaces API design), an incomplete list in no particular order.

  12. Microsoft’s Luca Bolognese has a nice overview of PDC sessions related to the .NET Language Integrated…