Understanding Spark’s SparkConf, SparkContext, SQLContext and HiveContext

  The first step of any Spark driver application is to create a SparkContext. The SparkContext allows your Spark driver application to access the cluster through a resource manager, which can be YARN or Spark’s cluster manager. To create a SparkContext, you should first create a SparkConf. The SparkConf stores configuration…

Spark or Hadoop

  Spark is one of the most active Apache projects and gets a lot of press coverage in the big data world. So how do you know if Spark is right for your project, and what is the difference between Spark and Hadoop when run on HDInsight? I’ll cover some of the differences between Spark and Hadoop…

How to install Splunk on HDInsight with a custom action script

  Recently I worked with a customer who wanted to use Splunk Enterprise and Splunk Forwarder to monitor and manage their HDInsight Storm cluster. You can learn more about Splunk at http://www.splunk.com/. Splunk has a version called Splunk Light that you can download for free. There are some restrictions, so read the documentation and…

Oozie Sqoop action hits primary key violation

Multiple customers have contacted us about an Oozie job that appears to hang. The Oozie job involves a Sqoop action that exports data from a file in HDInsight to a table in a SQL Azure database. For background on Sqoop, see Getting Started with Sqoop. We will use this blog to help…

Structured vs Semi-structured Data

My name is Bill Carroll and I am a member of the Microsoft HDInsight support team. I have spent the majority of my working career on SQL Server, a relational database. I gave it little thought all these years, but relational databases hold structured data. When we create a table we define the structure…

How to manually compile and create your own jar file to execute on HDInsight

Hi, my name is Bill Carroll and I am a member of the Microsoft HDInsight support team. At the heart of Hadoop is the MapReduce paradigm. Knowing how to compile your Java code and create your own jar file is a useful skill, especially for those coming from the C++ or .NET programming world. So…
