Hadoop .Net HDFS File Access

Provided with the Microsoft Distribution of Hadoop, HDInsight, is a C library for HDFS file access. This code extends this library through a Managed C++ solution. This solution enables one to consume HDFS files from within a .Net environment. The purpose of this post is first to ensure folks know about the new Windows HDFS…

Implementing a MapReduce Join with Hadoop and the .Net Framework

I have often been asked how one implements a Join when writing MapReduce code. As such, I thought it would be useful to add a sample demonstrating how this is achieved. There are multiple mechanisms one can employ to perform a Join operation, and the one to be discussed will be a Reduce…

Execution Time Based Heuristic Custom Task Scheduler

If you follow the samples for Parallel Programming with the .Net Framework, you may have come across the ParallelExtensionsExtras and the Additional TaskSchedulers. Although these samples cover a broad set of requirements, I recently came across another that could be satisfied by creating a new custom task scheduler. In the current samples there…

C# MapReduce Based Co-occurrence Item Based Recommender

As promised, to conclude the Co-occurrence Approach to an Item Based Recommender posts, I wanted to port the MapReduce code to C#, just for kicks and to prove the code is also easy to write in C#. For an explanation of the MapReduce post, review the previous article: http://blogs.msdn.com/b/carlnol/archive/2012/07/07/mapreduce-based-co-occurrence-approach-to-an-item-based-recommender.aspx The latest version of the code…

Framework for .Net Hadoop MapReduce Job Submission Json Serialization

A while back, one of the changes made to the “Generics based Framework for .Net Hadoop MapReduce Job Submission” code was to support Binary Serialization from the Mapper, in and out of Combiners, and out from the Reducer. Although this change was needed to support the Generic interfaces, there were two downsides to this approach. Firstly…

Hadoop .Net HDFS File Access (Revisited Archived)

Updated post can be found here: http://blogs.msdn.com/b/carlnol/archive/2013/02/08/hdinsight-net-hdfs-file-access.aspx Provided with the Microsoft Distribution of Hadoop, in addition to the C library, a Managed C++ solution for HDFS file access is provided. This solution enables one to consume HDFS files from within a .Net environment. The purpose of this post is first to ensure folks know about…

Generics based Framework for .Net Hadoop MapReduce Job Submission

Over the past month I have been working on a framework to allow composition and submission of MapReduce jobs using .Net. I have written two previous blog posts on this, so rather than write a third on the latest change I thought I would create a final composite post. To understand why let's…

.Net Hadoop MapReduce Job Framework – Revisited (Archived)

An updated version of this post can be found at: http://blogs.msdn.com/b/carlnol/archive/2012/04/29/generic-based-framework-for-net-hadoop-mapreduce-job-submission.aspx If you have been using the Framework for Composing and Submitting .Net Hadoop MapReduce Jobs you may want to download an updated version of the code: http://code.msdn.microsoft.com/Framework-for-Composing-af656ef7 The biggest change in the latest code is the modification of the serialization mechanism. Formerly data was…

Framework for Composing and Submitting .Net Hadoop MapReduce Jobs (Archived)

An updated version of this post can be found at: http://blogs.msdn.com/b/carlnol/archive/2012/04/29/generic-based-framework-for-net-hadoop-mapreduce-job-submission.aspx If you have been following my blog you will see that I have been putting together samples for writing .Net Hadoop MapReduce jobs using Hadoop Streaming. However, one thing that became apparent is that the samples could be reconstructed in a composable framework to…

Hadoop .Net HDFS File Access (Archived)

Updated post can be found here: http://blogs.msdn.com/b/carlnol/archive/2013/02/08/hdinsight-net-hdfs-file-access.aspx If you grab the latest installment of the Microsoft Distribution of Hadoop you will notice, in addition to the C library, a Managed C++ solution for HDFS file access. This solution now enables one to consume HDFS files from within a .Net environment. The purpose of this post is…
