Cache-oblivious data structures

In most data structure and algorithms classes, the model used for basic analysis is the traditional RAM model: we assume that we have a large, random-access array of memory, and count the number of simple reads/writes needed to perform the algorithm. For example, selection sort takes about n(n-1)/2 reads and 2n writes to sort an…

10

Succinct data structures

Sorry for the long hiatus, everyone. Today I’m going to talk about succinct data structures, which are informally data structures that use very close to the absolute minimum possible space. This material is largely based on lectures by MIT professor Erik Demaine, as transcribed by 6.897 students Wayland Ni and Gautam Jayaraman (link). Also see…

4

Persistent data structures

When learning to program in a functional language such as Lisp, Ocaml, or Haskell, one of the most difficult aspects of the paradigmic shift is that data in these languages is almost entirely immutable, or read-only. In purely functional programs, there is no assignment, and data structures cannot be modified. But how are we to…

3

Unrolled linked lists

Today I’ll be discussing unrolled linked lists, a simple variant of the linked list which has many of its desirable properties but exploits the cache to yield considerably better performance on modern PCs. In elementary data structures classes, students are often taught about arrays and linked lists and the algorithmic tradeoffs between the two. In…

9

Bloom filters

Imagine you’re writing a simple spell checker in C. You’ve already collected a dictionary of 100,000 words, and you want to process the document a word at a time and find any words that aren’t in the dictionary. You don’t care about providing suggestions for wrong words. How would you go about this? Clearly, comparing…

11