How to use DBeaver with Azure #HDInsight

DBeaver is a SQL client and database administration tool. It is free and open source (ASL). DBeaver uses the JDBC API to connect to SQL-based databases. The following is a simple walkthrough of how to connect an Azure HDInsight cluster [Hadoop or Interactive Query] to DBeaver. This article is based on HDInsight version 3.6. Step 1: Install…


HDInsight HBase: Migrating to new HDInsight version

The following are short steps to upgrade your HDInsight HBase cluster with little downtime. Before you migrate, please note that there may be incompatibilities between HBase major/minor versions, and the steps below only work if there are no version compatibility issues between the source and destination clusters. We recommend that you review the HBase book before undertaking an upgrade….


What is Azure HDInsight?

Fully managed big data open source analytics service with popular open source frameworks such as Kafka, Storm, R, Spark, Hive, HBase, Phoenix, LLAP, Sqoop, Oozie & Hadoop. 100% Apache open source with no lock-in. Customers can freely move between on-premises, Azure and other clouds, as Microsoft does not use any proprietary code with…


HDInsight: Hive Frequently Asked Questions

In this blog post we have captured some common questions and answers related to Hive in Azure HDInsight. How do I check DAG counters for a query? Go to Tez View. Find the DAG and click on the DAG Counters tab. How do I check the amount of data being read/written/shuffled by a query? Go to…


HDInsight: How to enable Oozie UI in Ambari

The Oozie UI can be accessed by creating an Oozie view in Ambari:
1. Go to the cluster Ambari dashboard
2. Click on Admin -> Manage Ambari
3. On the left, click on Views
4. Expand WORKFLOW_MANAGER and press Create Instance
5. Fill in the details…


XBox: Analytics on petabytes of gaming data with Azure HDInsight

Cross post from https://azure.microsoft.com/en-us/blog/how-xbox-uses-hdinsight-to-drive-analytics-on-petabytes-of-telemetry-data/ Microsoft Studios produces some of the world’s most popular game titles including the Halo, Minecraft, and Forza Motorsport series. The Xbox product services team manages thousands of datasets and hundreds of active pipelines consuming hundreds of gigabytes of data each hour for first-party studios. Game developers need to know the health…


Azure HDInsight Performance Insights: Interactive Query, Spark and Presto

Cross post from https://azure.microsoft.com/en-us/blog/hdinsight-interactive-query-performance-benchmarks-and-integration-with-power-bi-direct-query/ Fast SQL query processing at scale is often a key consideration for our customers. In this blog post, we compare HDInsight Interactive Query, Spark and Presto using an industry standard benchmark derived from the TPC-DS Benchmark. These benchmarks are run using out of the box default HDInsight configurations, with no special optimizations….


Azure HDInsight Integration with Azure Log Analytics is now generally available

Cross post from https://azure.microsoft.com/en-us/blog/azure-hdinsight-integration-with-azure-log-analytics-is-now-generally-available/ I am excited to announce the general availability of HDInsight Integration with Azure Log Analytics. Azure HDInsight is a fully managed cloud service for customers to do analytics at scale using the most popular open-source engines such as Hadoop, Hive/LLAP, Presto, Spark, Kafka, Storm, HBase, etc. Thousands of our customers…