Azure HDInsight Performance Insights: Interactive Query, Spark and Presto

Cross post from https://azure.microsoft.com/en-us/blog/hdinsight-interactive-query-performance-benchmarks-and-integration-with-power-bi-direct-query/
Fast SQL query processing at scale is often a key consideration for our customers. In this blog post, we compare HDInsight Interactive Query, Spark and Presto using an industry-standard benchmark derived from the TPC-DS Benchmark. These benchmarks are run using out-of-the-box default HDInsight configurations, with no special optimizations…


General availability of HDInsight Interactive Query – blazing fast queries on hyper-scale data

Cross post from https://azure.microsoft.com/en-gb/blog/general-availability-of-hdinsight-interactive-query-blazing-fast-data-warehouse-style-queries-on-hyper-scale-data-2/
It’s 2017, and big data challenges are as real as they get. Our customers have petabytes of data living in elastic and scalable commodity storage systems such as Azure Data Lake Store and Azure Blob storage. One of the central questions today is finding insights from data in these storage systems…


Hive Metastore in HDInsight – Tips, Tricks & Best Practices

When you create a Hive table, the table definition (column names, data types, comments, etc.) is stored in the Hive Metastore. The Hive Metastore is a critical part of the Hadoop architecture, as it acts as a central schema repository that can be used by other access tools like Spark, Interactive Hive (LLAP), Presto, Pig and many other…