Azure HDInsight - Hadoop/Big Data on Azure

I'm back on my blog after a long gap, and I promise to devote more time to it going forward.

The inspiration behind this one is my friend Karan's blog post about resources on HDInsight. It is a great compilation, and I thought, why shouldn't I post one of my own as well?

Well, here it is!

Azure HDInsight Documentation - Important Links:

Here are the Hadoop technologies currently in HDInsight:

  • Avro (Microsoft .NET Library for Avro): Data serialization for the Microsoft .NET environment
  • HBase: Non-relational database for very large tables
  • HDFS: Hadoop Distributed File System
  • Hive: SQL-like querying
  • Mahout: Machine learning
  • MapReduce and YARN: Distributed processing and resource management
  • Oozie: Workflow management
  • Pig: Simpler scripting for MapReduce transformations
  • Sqoop: Data import and export
  • Storm: Real-time processing of fast, large data streams

Other popular Videos

Books

Introducing Microsoft Azure HDInsight

In Introducing Microsoft Azure HDInsight, we cover what big data really means, how you can use it to your advantage in your company or organization, and one of the services you can use to do that quickly: Microsoft's HDInsight service. We start with an overview of big data and Hadoop, but we don't emphasize only concepts in this book; we want you to jump in and get your hands dirty working with HDInsight in a practical way.

  Download the PDF (6.37 MB)

For the Hadoop experts, this may seem a little elementary, but I'd love to hear what else we should provide to make your life easier when working with HDInsight. Please share any more links that you feel add value to this compilation, and I will append them to the list. Also feel free to comment on any projects that you feel could use my help.

Till next time...Cheers!