Integrating Kyligence Analytics Platform with Microsoft Azure HDInsight

This is a guest blog from Shaofeng Shi, Senior Architect at Kyligence Inc. Introducing Kyligence Analytics Platform: Kyligence Analytics Platform (KAP) is an enterprise-ready big data warehouse on Apache Hadoop. Created by the same team that develops Apache Kylin, an open-source distributed OLAP engine for big data, KAP inherits all of Kylin’s advantages and has more innovations,…


How WebHCat Works and How to Debug (Part 2)

Link to Part 1. 2. How to debug WebHCat. 2.1. BadGateway (HTTP status code 502): This is a very generic message from the gateway nodes. We will cover some common cases and possible mitigations. This is the most common Templeton problem customers are seeing right now. 2.1.1. WebHCat service down: This happens in case the WebHCat server on…


Ingest data into Azure Data Lake Store with StreamSets Data Collector

Today, I want to give a shout-out to one of our partners who has a great offering for Azure Data Lake Store customers. When ingesting large-scale data into a data lake, the data often requires transformations such as cleaning and filtering. StreamSets Data Collector™ offers open source software for building and deploying…


Building advanced analytical solutions faster using Dataiku DSS on HDInsight

The Azure HDInsight Application Platform allows users to use applications that span a variety of use cases like data ingestion, data preparation, data processing, building analytical solutions and data visualization. In this post we will see how DSS (Data Science Studio) from Dataiku can help a user build a predictive machine learning model to analyze…


Spark Job Submission on HDInsight 101

This article is part two of the Spark Debugging 101 series we initiated a few weeks ago. Here we discuss ways in which Spark jobs can be submitted on HDInsight clusters and some common troubleshooting guidelines. So here goes. Livy Batch Job Submission: Livy is an open source REST interface for interacting with Apache Spark remotely from…


OozieBot: Automated Oozie Workflow and Coordinator Generation

Introducing OozieBot – a tool to help customers automate Oozie job creation. Learn how to use OozieBot to generate Apache Oozie coordinators and workflows for Hive, Spark and Shell actions and run them on a Linux-based HDInsight cluster. Introduction: Apache Oozie is a workflow/coordination system that manages Hadoop jobs. It is integrated with the…


Using Cask Data Application Platform on Azure HDInsight

Recently, CDAP (Cask Data Application Platform) by Cask was added to the set of applications that are available to be installed on an HDInsight cluster. This blog post aims to illustrate the scenarios enabled by using CDAP on HDInsight. HDInsight application platform: Azure HDInsight recently announced an easy way to distribute, discover and install applications that…