Implementing a MapReduce Join with Hadoop and the .Net Framework

I have often been asked how one implements a Join whilst writing MapReduce code. As such, I thought it would be useful to add a further sample demonstrating how this is achieved. There are multiple mechanisms one can employ to perform a Join operation, and the one to be discussed will be a Reduce…

Framework for .Net Hadoop MapReduce Job Submission V1.0 Release

It has been a few months since I made a change to the “Generics based Framework for .Net Hadoop MapReduce Job Submission” code. However, I was going to put together a sample for a Reduce side join and came across an issue around the usage of partitioners. As such, I decided to add support…

Framework for .Net Hadoop MapReduce Job Submission TextOutput Type

Some recent changes made to the “Generics based Framework for .Net Hadoop MapReduce Job Submission” code were to support Json and Binary Serialization from Mapper, in and out of Combiners, and out from the Reducer. However, this precluded one from controlling the format of the Text output. Say one wanted to create a tab delimited…

C# MapReduce Based Co-occurrence Item Based Recommender

As promised, to conclude the Co-occurrence Approach to an Item Based Recommender posts, I wanted to port the MapReduce code to C#, just for kicks and to prove the code is also easy to write in C#. For an explanation of the MapReduce code, review the previous article: http://blogs.msdn.com/b/carlnol/archive/2012/07/07/mapreduce-based-co-occurrence-approach-to-an-item-based-recommender.aspx The latest version of the code…

MapReduce Based Co-occurrence Approach to an Item Based Recommender

In a previous post I covered the basics for a Co-occurrence Approach to an Item Based Recommender. As promised, here is the continuation of this work, an implementation of the same algorithm using MapReduce. Before reading this post, it is worth reading the Local version, as it covers the sample data and general co-occurrence…

Framework for .Net Hadoop MapReduce Job Submission Json Serialization

A while back, one of the changes made to the “Generics based Framework for .Net Hadoop MapReduce Job Submission” code was to support Binary Serialization from Mapper, in and out of Combiners, and out from the Reducer. Whereas this change was needed to support the Generic interfaces, there were two downsides to this approach. Firstly…

Framework for .Net Hadoop MapReduce Job Submission configuration update

To better support configuring the Stream environment whilst running .Net Streaming jobs, I have made a change to the “Generics based Framework for .Net Hadoop MapReduce Job Submission” code. I have fixed a few bugs around setting job configuration options which were being controlled by the submission code. However, more importantly, I have added support…

Framework for .Net Hadoop MapReduce Job Submission Binary Output

To end the week, I decided to make a minor change to the “Generics based Framework for .Net Hadoop MapReduce Job Submission”. I have been doing some work on creating a co-occurrence matrix for item recommendations. I was going to map the process to one or more MapReduce jobs, but then came across the issue of how I…

Framework for .Net Hadoop MapReduce Job Submission libjars update

If you have been using the “Generics based Framework for .Net Hadoop MapReduce Job Submission” you may want to download the latest version of the code. The previous version of the code, when processing XML and Binary files, was dependent on a custom streaming JAR that contained the necessary reader classes. This was not an…

Generics based Framework for .Net Hadoop MapReduce Job Submission

Over the past month I have been working on a framework to allow composition and submission of MapReduce jobs using .Net. I have put together two previous blog posts on this, so rather than put together a third on the latest change, I thought I would create a final composite post. To understand why, let's…
