Multi-Stream support in SCP.NET Storm Topology

Streams are at the core of Apache Storm. In most cases topologies are based on a single input stream; however, there are situations when one may need to start the topology with two or more input streams. SCP supports user code that emits to or receives from distinct streams at the same time. To…

Collecting logs from Apache Storm cluster in HDInsight

While running an Apache Storm topology in a multi-node Storm cluster, different components of the topology log to different files, which are saved on different nodes in the cluster depending on where each component is running. In this blog I will discuss the log files that are available in a Storm cluster and…

Troubleshooting Hive query performance in HDInsight Hadoop cluster

One of the common support requests we get from customers using Apache Hive is: "My Hive query is running slow and I would like the job/query to complete much faster." Or, in more quantifiable terms: "My Hive query is taking 8 hours to complete and my SLA is 2 hours." Improving or tuning Hive…

Sqoop Job Performance Tuning in HDInsight (Hadoop)

Overview: Apache Sqoop is designed for efficiently transferring bulk data between Apache Hadoop and structured datastores such as relational databases. HDInsight is a Hadoop cluster deployed in Microsoft Azure, and it includes Sqoop. When transferring a small amount of data, Sqoop performance is not an issue. However, when transferring a huge amount of data it is important to…

Loading data in HBase Tables on HDInsight using the built-in ImportTsv utility

Apache HBase can give random access to very large tables: billions of rows by millions of columns. But the question is, how do you upload that kind of data into the HBase tables in the first place? HBase includes several methods of loading data into tables. The most straightforward method is to either use the…

Getting started with Sqoop in HDInsight

My name is Farooq and I am with the HDInsight support team here at Microsoft. In this blog I will try to give a brief overview of Sqoop in HDInsight and then use an example of importing data from a Windows Azure SQL Database table to an HDInsight cluster to demonstrate how you can get started with…
