Why Pay for Sorting if You Don’t Need it? (3 of 3)

By Chen Fu, Tanj Bennett, Ashwini Khade In our previous posts, we discussed the high cost of data sorting. Here we introduce Venger Index, an alternative option when you need memory efficiency but not data sorting. 1. Venger Index – Balancing Cost and Performance Hash tables do not require data sorting, and are widely used in…


Why Pay for Sorting If You Don’t Need It? (2 of 3)

In our last blog post, we briefly discussed high cost of LSM Trees. Here we introduce how data are organized in LSM trees and the cost induced. LSM Tree Compaction In a LSM tree, data are divided into multiple levels.  New records are inserted into a memory buffer where all records are stored in order….


Why Pay for Sorting If You Don’t Need It? (1 of 3)

Log-structured Merge-trees (or LSM trees) are used by many famous data stores, such as BigTable, HBase, Rocks DB, Cassandra, etc. LSM trees provide desirable features such as memory efficiency, quick updating, and range query, i.e. access to all the records in a key range, according to one predefined total order of the keys. Unfortunately, LSM…


Threads, Events, and Mala (in C++)

By Chen Fu and Tanj Bennett When it comes to constructing high performance server applications, there is a decade long debate between threads vs events. And it is not getting old.  Here we examine the story from both sides, and explore how back end server programmers can get the advantages from both sides, using an open source library…