WASB back stories: Masquerading a key-value store

There are already a few excellent articles out there that introduce Azure Blob Storage and how it’s accessed from HDInsight. In this post, though, I wanted to start giving some backstage looks at some of the decisions we made while exposing blobs to HDInsight, hopefully answering a tiny part of the oft-asked…


On configurable code

I have a confession to make, but before I make it I want to clarify where I come from. I’ve been in software for many years now, and over those years I’ve configured systems, seen many ways software can be configured, and even written a few. I’ve come across systems…


Merging small files on HDInsight

The situation. The following is a dramatically enhanced story inspired by true events. It’s weird how we can see our signal in the daytime. I guess that’s why Microsoft chose the Seattle area: all these clouds provide a great surface to project on. But here it was: the ominously cute elephant on the faint blue…


Working on Hadoop code on Windows

As a member of the HDInsight team, I worked a bit on Hadoop code on Windows and contributed a couple of JIRAs there (JIRA is a bug tracking system Apache uses – contributing code usually involves filing JIRAs and posting patches there). Even if you don’t want to contribute to Hadoop code (though it’s fun),…


Analyzing Azure Table Storage data with HDInsight

HDInsight was optimized from the start to quickly analyze data on Azure’s blob storage service with Hadoop, using the WASB file system to expose that data as a native Hadoop file system. But the spirit of Hadoop has always been to be able to analyze data wherever it is, so…
