Hadoop for .NET Developers: Manually Loading Data to Hadoop

NOTE This post is one in a series on Hadoop for .NET Developers. To manually load a file to Hadoop, the file should first be loaded to the name node server.  With the file now on the name node server, either of two commands can be used at the Hadoop command prompt to load…

Hadoop for .NET Developers: Understanding HDFS

NOTE This post is one in a series on Hadoop for .NET Developers. From a data storage perspective, you can think of Hadoop as simply a big file server.  Through the name node, the Hadoop cluster presents itself as a single file system accepting basic Linux file system commands such as ls, rmr, mkdir, and…

Hadoop for .NET Developers: Obtaining the Sample Data Sets

NOTE This post is one in a series on Hadoop for .NET Developers. In the exercises that follow, we will work with two sample data files.  These files are available as part of a ZIP file associated with this blog post. The first sample data file, integers.txt, contains a simple list of integers from 1 to…

Hadoop for .NET Developers: Setting Up an Azure Cluster

NOTE This post is one in a series on Hadoop for .NET Developers. For rapid provisioning and lack of long-term commitment, the cloud is an excellent place to try your hand with a multi-node Hadoop cluster.  If you are an MSDN subscriber, Microsoft provides you access to cloud services as part of your benefits as described…

Hadoop for .NET Developers: Setting Up a Desktop Development Environment

NOTE This post is one in a series on Hadoop for .NET Developers. If you are a .NET developer, you will want to set up a desktop development environment with the following components: Visual Studio 2010 or 2012; the NuGet Package Installer for Visual Studio; and a local, single-node Hadoop “cluster”. Having these components installed on your…

Hadoop for .NET Developers: Basic Architecture

NOTE This post is one in a series on Hadoop for .NET Developers. Hadoop is implemented as a set of interrelated project components. The core components are MapReduce, which handles job execution, and a storage layer, typically implemented as the Hadoop Distributed File System (HDFS). For the purpose of this post, we will assume HDFS…

Hadoop for .NET Developers: Understanding Hadoop

NOTE This post is one in a series on Hadoop for .NET Developers. Big Data has been a source of excitement in the analytics community for a few years now. For the purpose of this blog series, I’ll loosely define the term to mean an expansion of focus from data originating from core operational systems – the domain…

Hadoop for .NET Developers

Well, it’s summer again and time for some new blog entries.  This summer, I’ve had some time to dig into Hadoop and want to share some of the basics of storage and job processing from a .NET developer’s perspective.  Hadoop is an open-source platform written in Java.  However, thanks to work by Hortonworks and Microsoft,…

Presenting Actuals and Forecast Concurrently in a Write-Enabled Cube

I have written a series of entries on writeback applications and wanted to add this last entry highlighting a common cube design challenge associated with them.  Quite often with write-enabled cubes, we enter forecasted data for future periods.  When those periods come to pass, we bring actual (historical) data into the data model…
