Ooooh I’m Telling: Doing Swear Word Analysis with Storm on HDInsight

As promised, this is the first of three (maybe more) posts that will present an end-to-end example to showcase the distributed streaming capabilities of the Apache Storm project. This first post will provide an introduction to the project and an overview of all the moving pieces. Please note that I will not be getting into…

Introduction to Apache Storm

The Apache Storm project delivers a platform for real-time, distributed (complex event) processing across extremely large-volume, high-velocity data sets. By providing a simple, easy-to-use abstraction, Storm enables real-time analytics, online machine learning, and operational/ETL scenarios that were previously non-trivial to implement. In this post we will familiarize ourselves with the Storm platform, its…

Something’s Brewing with Azure Data Factory – Part 3

In the first two parts of this blog series (HERE and HERE), we used Azure Data Factory to load beer review data from an Azure SQL Database into an Azure Blob Storage account. We then processed that data using HDInsight and the Mahout machine learning library to generate user-based recommendations. In this final post, we…

Something’s Brewing with Azure Data Factory Part 2

In my last post (HERE), I started hacking my way through the new Azure Data Factory service to automate my beer recommendation demo. The first post was all about setting up the necessary scaffolding and then building that first pipeline to move data from the Azure SQL Database into Azure Blob Storage. In this post,…
