HDFS gets full in Azure HDInsight with many Hive temporary files

When a VM in a Microsoft Azure HDInsight cluster is restarted while Hive is using temporary files, those files can become orphaned and consume space. In Azure HDInsight, those temp files live in the HDFS file system, which is distributed across the local disks of the worker nodes. This is a different…


How to Lock a Resource Group to prevent accidental deletion of resources like HDInsight

Did you know it is possible to prevent accidental deletion of resources in Azure? This could apply to any number of resources: HDInsight, Stream Analytics jobs, Data Factories, DocumentDB accounts, etc. We can add a lock to the resource group to prevent resources from being removed inadvertently. I found out the hard way when someone…


HDInsight Name Node can stay in Safe mode after a Scale Down

This week we worked on an HDInsight cluster where the Name Node had gone into Safe mode and didn’t leave that mode on its own. It’s not very common, but I wanted to share why it happened, and how to get out of the situation, in case it prevents a headache for someone else. HDInsight…


HDInsight Hive Metastore fails when the database name has dashes or hyphens

Working in Azure HDInsight support today, we see a failure when trying to run a Hive query on a freshly created HDInsight cluster. It’s brand new and fails on the first try, so what could be wrong? Our Hive client app fails with this kind of error. Exception in thread “main” java.lang.RuntimeException: java.lang.RuntimeException: Unable to…


Encoding 101 – Exporting from SQL Server into flat files, to create a Hive external table

Today in Microsoft Big Data Support we faced the issue of how to correctly move Unicode data from SQL Server into Hive via flat text files. The main issue faced was encoding special Unicode characters from the source database, such as the degree sign (Unicode 00B0) and other complex Unicode characters outside of A-Z 0-9….


Encoding the Hive query file in Azure HDInsight

Today at Microsoft we were using Azure Data Factory to run Hive Activities in Azure HDInsight on a schedule. Things were working fine for a while, but then we got an error that was hard to understand. I’ve simplified the scenario to illustrate the key points. The key is that Hive did not like the…


Azure Data Factory JSON Changes in July 2015

Azure Data Factory factories are designed with a series of fairly simple JSON documents and uploaded to Azure using the web interface, PowerShell, .NET, or Visual Studio. If you were using the pre-release public preview of Azure Data Factory, you should be aware of a recent change in the SDK, in order to make…


HDInsight News – New Videos to watch – HDInsight Provisioning demonstrations

Check out these two recent video demos regarding HDInsight provisioning. These videos complement the product documentation outlined at http://azure.microsoft.com/en-us/documentation/articles/hdinsight-get-started/#provision. HDInsight is the name given to the Microsoft Azure service (in the Microsoft cloud data centers) running the Hortonworks Data Platform distribution of Apache Hadoop on Microsoft Windows. Provisioning is the word we use to describe the…


HDInsight News – New Articles to read

Hi Folks, I’m Jason from the Microsoft Big Data Support team. Thanks for reading our blog, and for trying out HDInsight in your own business. I want to share some new articles Microsoft just published that will be helpful for getting started with HDInsight in your business. To help folks who are not so familiar…
