"Linq to Memory": another old idea

Article
01/02/2015

One of my readers asked if I had any particular plans around memory density for Visual Studio to go with my last posting and indeed I had one thought that I considered a very powerful way to create data structures that were more likely to be scalable. Like most of my ideas it’s a very simple thought that I think is still effective, and I still give the same advice because I’m boring like that.

Basically, it’s this: rather than design your data structures the way they taught you in your CS2xx class, design them like you were going to store them in a SQL database. In fact, you are likely to be well served by using a schema designer. Then store them exactly like that in memory. RAM is the new disk. Cache is the new RAM.

In fact, I suggested at that time that we write a thing I called “Linq To Memory” – a stark contrast from “Linq to Objects” to help facilitate this. Basically Linq to Memory was a hypotheticial thing that would be more like an in-memory-database with “tables” that were based on dense structures like b-tree and such but no query language other than Linq needed.

Why?

Because these choices give you an “automatic” way to think about isolation, locking, multiple readers, and even multiple writers if needed. And they do so comparatively economically. And if you include the compiled query type options that are present in Linq to SQL or Linq to Entities then you can get something very efficient for read access.

Note that this design actively discourages the forest of pointers that you normally get in OOP and yet you get a nice object model to work with (like Linq to SQL gives you) with the slices you are actually working with hydrated into objects and the other parts densely stored.

This metaphor easily supports multiple ACID choices with as much or as little durability and isolation as you want.

It also facilitates things like 64 bit systems because you do not have to pay a 64 bit pointer or five when a smaller or variable sized index would do the job. Yet you can have huge amounts of data if you need them.

Natural keys facilitate locality as do typical database style indices.

One time overhead for creating your compiled queries to access the data is amortized over the life of the program and saves due to compactness continue to yield benefits forever. Lack of pointers in the primary storage also reduces garbage collection costs massively.

MANY systems are well suited for this choice. And as a primary storage it provides both clarity and economy.

I haven’t seen anything like this but it’s totally buildable, I knew where to find the back-end for it in-house. It's good not only for VS but for many applications with non-trivial data storage requirements.

"Linq to Memory": another old idea

Additional resources