How to use parameter substitution with Pig Latin and PowerShell

When running Pig in a production environment, you’ll likely have one or more Pig Latin scripts that run on a recurring basis (daily, weekly, monthly, etc.) that need to locate their input data based on when or where they are run. For example, you may have a Pig job that performs daily log ingestion by…

Customizing HDInsight Cluster provisioning

In my last blog post, I discussed how we can specify Hadoop configurations for a job on an HDInsight cluster. At the end of that post, I also discussed the alternative approach where you may want to change certain Hadoop configurations from default values and would like to preserve the changes throughout the lifetime of the…

How to pass Hadoop configuration values for a job on HDInsight

I have come across this question a few times recently from several customers: “How do we pass Hadoop configurations at runtime for a MapReduce job or Hive query via HDInsight PowerShell or the .NET SDK?” I thought I would share the answer here with others who may run into the same question. It is pretty common in Hadoop…

How to add custom Hive UDFs to HDInsight

I recently needed to add a UDF to Hive on HDInsight, and I thought it would be good to share that experience in a blog post. Hive provides a library of built-in functions that cover the most common needs. The cool thing is that it also provides the framework to create your own…


Getting started with the HDInsight PowerShell tools and SDK

Hi, my name is Azim and I work on the Big Data Support Team at Microsoft. If you have had a chance to read an earlier post by Dharshana, you may have seen how we can submit a Hive query using the HDInsight PowerShell tools. In this post, we will cover some basics of the HDInsight…
