Sqoop Job Performance Tuning in HDinsight (Hadoop)

Overview Apache Sqoop is designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases. HDInsight is Hadoop cluster deployed in Microsoft Azure and it includes Sqoop. When transferring small amount of data Sqoop performance is not an issue. However, when transferring huge amount of data it is important to…

2

Oozie sqoop action hits primary key violation

We have seen multiple customers contact us where an oozie job appears to hang. The oozie job involves a sqoop action which is exporting data from a file in HDInsight to a table in a SQL Azure database. For background on Sqoop see Getting Started with Sqoop . We will use this blog to help…

1

HDInsight News - New Articles to read

Hi Folks, I’m Jason from the Microsoft Big Data Support team. Thanks for reading our blog, and for trying out HDInsight in your own business. I want to share some new articles Microsoft just published that will be helpful for getting started with HDInsight in your business. To help folks who are not so familiar…

0

Getting started with Sqoop in HDInsight

My name is Farooq and I am with HDinsight support team here at Microsoft. In this blog I will try to give some brief overview of Sqoop in HDinsight and then use an example of importing data from a Windows Azure SQL Database table to HDInsight cluster to demonstrate how you can get stated with…

5