Apache Hadoop on Windows Azure Part 7 – Writing your very own WordCount Hadoop Job in Java and deploying to Windows Azure Cluster

In this article, I will help you write your own WordCount Hadoop job and then deploy it to a Windows Azure cluster for further processing. Let’s create a Java code file named “AvkashWordCount.java” as below:

package org.myorg;
import java.io.IOException;
import java.util.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.util.*;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.conf.Configuration;…
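The listing above is cut off after its imports. As a rough illustration of what a word-count job computes, here is a minimal plain-Java sketch of the map and reduce steps. It deliberately does not use the Hadoop API, and the class and method names (WordCountSketch, map, reduce, countWords) are hypothetical, chosen only for this example; it is not the AvkashWordCount.java code from the article.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Plain-Java illustration of the word-count idea; no Hadoop classes are used.
class WordCountSketch {

    // "Map" step: emit one (word, 1) pair per whitespace-separated token.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<>(tokens.nextToken(), 1));
        }
        return pairs;
    }

    // "Reduce" step: sum the counts emitted for each distinct word.
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    // Runs the map step over every line, then reduces all emitted pairs.
    static Map<String, Integer> countWords(List<String> lines) {
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        for (String line : lines) {
            all.addAll(map(line));
        }
        return reduce(all);
    }

    public static void main(String[] args) {
        // Prints {azure=1, hadoop=1, hello=2}
        System.out.println(countWords(Arrays.asList("hello hadoop", "hello azure")));
    }
}
```

In a real Hadoop job the framework groups the emitted pairs by key between the two steps and runs them across the cluster; here that grouping is folded into the reduce method to keep the sketch small.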

Apache Hadoop on Windows Azure Part 6 – Running 10GB Sort Hadoop Job with TeraSort Option and understanding MapReduce Job administration

In this section we will run the same 10GB sorting Hadoop job with the TeraSort option. With the TeraSort option, the parameters are changed as below: With the above parameters, there are a couple of things to remember: You must have the /example/data/10GB-sort-input folder along with its data (this is created when you use the teragen option first, as explained in Exercise…

Apache Hadoop on Windows Azure Part 5 – Running 10GB Sort Hadoop Job with Teragen, TeraSort and TeraValidate Options

This example consists of the three map/reduce applications that Owen O’Malley and Arun Murthy used to win the annual general purpose (Daytona) terabyte sort benchmark at sortbenchmark.org. This sample is part of the prebuilt package in your Hadoop on Azure portal, so just like any other prebuilt sample you can deploy it to the cluster as below:…

Apache Hadoop on Windows Azure Part 4- Remote Login to Hadoop node for MapReduce Job and HDFS administration

When you are running an Apache Hadoop job in Windows Azure, you have the ability to remote into the main node (it is a virtual machine) and then perform all the regular tasks, i.e. Hadoop Map/Reduce job administration, HDFS management, and regular Name Node management tasks. To log in you just need to select the “Remote Desktop” button…

Apache Hadoop on Windows Azure Part 2 – Creating a Pi Estimator Hadoop Job

Once you have created a cluster in Windows Azure, you will have a few prebuilt samples provided in your account, so let’s select “Samples” as below: In the Hadoop Samples gallery, let’s select the “Pi Estimator” sample below: You will see the “Pi Estimator” sample details as below. After reading the details and…

Apache Hadoop on Windows Azure Part 1- Creating a new Windows Azure Cluster for Hadoop Job

Once you have applied for an Apache Hadoop on Windows Azure CTP account, you can create a new cluster using this information. If you want to learn more about Hadoop on Azure CTP, visit my previous blog here. After you have got Hadoop on Azure CTP access, use your Windows Live account to log in at http://www.hadooponazure.com. Now you…

Top 12 Articles on Cloud Services and Big Data on Windows Azure in December

Windows Azure Cloud Services: Newly Designed Windows Azure Developer Center; Tutorial: Running a Python Web Application in Windows Azure; Tutorial: Running the Mongoose Web Server in Windows Azure; Node.js in Windows Azure, To the Cloud and Beyond!; Windows Azure Toolkit for Social Games Version 1.2.0 (beta) Released; Windows Azure: Hands on…

How to Modify Registry keys in Windows Azure Virtual Machine from a web or worker role?

If you have a requirement to modify VM registry keys, you have two options: Do it from a standalone startup task. This modification will be completed even before your role starts. Be sure to run the startup task in elevated mode. You just need to use the standard Windows API to access the…

Windows Azure: Hands on Lab for Moving Applications to the Cloud

The Windows Azure team created a detailed hands-on lab to help everyone who wants to move their application to the Windows Azure cloud. Each of the Hands-On Labs is separate and stand-alone, so you can choose which ones you want to use, and you can work through them in any order. However, it is recommended that…
