Pushing Hadoop Cluster Configuration Changes using PowerShell

In my previous post I talked about Implementing and Deploying Rack Awareness using PowerShell. However, PowerShell is a great tool not only for managing things like Rack Awareness but also for installing and managing the Hadoop cluster, especially for managing configuration changes, the focus of this post. All the files relating to this post can be found…

Deploying Hadoop Rack Awareness with PowerShell

In a previous post I talked about Implementing Hadoop Rack Awareness with PowerShell. One thing I skimmed over in that post was how to deploy the necessary files to the cluster and make the configuration file changes. Once again PowerShell is your friend. Deploying this solution involves two processes. Firstly, copying the necessary files to…

Implementing Hadoop Rack Awareness with PowerShell

This post walks through building a PowerShell script for enabling Rack Awareness in Hadoop. While several example scripts can be found online for Linux, samples for building a script for Windows are less common. Hadoop divides the data into multiple file blocks and stores them on different machines. By default all machines are deemed to be on…

Managing Your HDInsight Cluster using PowerShell – Update

Since writing my last post, Managing Your HDInsight Cluster and .Net Job Submissions using PowerShell, there have been some useful modifications to the Azure PowerShell Tools. The HDInsight cmdlets no longer exist, as these have now been integrated into the latest release of the Windows Azure PowerShell Tools. This integration means: You don’t need to…

Managing Your HDInsight Cluster and .Net Job Submissions using PowerShell

This post explains how best to manage an HDInsight cluster using a management console and Windows PowerShell. The goal is to outline how to create a simple cluster, provide a mechanism for managing an elastic service, and demonstrate how to customize the cluster creation. Before provisioning a cluster one needs to ensure the Azure subscription…

Implementing LOB Storage in Memory Optimized Tables

Memory optimized tables do not have off-row or large object (LOB) storage, and the row size is limited to 8060 bytes. Thus, storing large binary or character string values can be done in one of two ways:
• Split the LOB values into multiple rows
• Store the LOB values in a regular non-memory optimized…

Managing Hive Job Submissions With PowerShell

In my previous post, I talked about “Managing Your HDInsight Cluster with PowerShell”. In that post I made no mention of using Hive. I hope to redress this balance by specifically talking about how you can submit Hive jobs from the same local management console. As before, all the scripts mentioned in this and the…

Managing Your HDInsight Cluster with PowerShell

An updated version of this post can be found here. This blog post provides a mechanism for managing an HDInsight cluster using a local management console through the use of Windows PowerShell. The goal is to outline how to configure the local management console, create a simple cluster, submit jobs using MRRunner, and finally provide…

Hadoop .Net HDFS File Access

Provided with the Microsoft Distribution of Hadoop, HDInsight, is a C library for HDFS file access. This code extends this library through a Managed C++ solution. This solution enables one to consume HDFS files from within a .Net environment. The purpose of this post is first to ensure folks know about the new Windows HDFS…

Submitting Hadoop MapReduce Jobs using PowerShell

As always, here is a link to the “Generics based Framework for .Net Hadoop MapReduce Job Submission” code. In all the samples I have shown so far I have always used the command-line consoles. However, this does not need to be the case; PowerShell can be used. The Console application which is used to submit…