Start a Pig + Jython job in HDInsight through WebHCat

You can also use HDInsight with Hive + Python. The drawback of the latter is that it relies on streaming between Hive and Python. In Hadoop, streaming is simply inter-process communication over stdin/stdout. So even simple operations in Python, such as concatenating two string fields, may be slow…
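To make the stdin/stdout mechanism concrete, here is a minimal sketch of the kind of Python script that streaming would run. It assumes rows arrive as tab-separated lines, one row per line, and that the result goes back as one line per row; the function names `concat_fields` and `stream` and the sample data are hypothetical, not taken from the post. Every row crosses a process boundary as text, which is where the overhead for trivial operations comes from.

```python
import io


def concat_fields(line, sep="\t"):
    """Concatenate the first two fields of one tab-separated input row."""
    fields = line.rstrip("\n").split(sep)
    if len(fields) < 2:
        return None  # Not enough fields: the row is skipped.
    return fields[0] + fields[1]


def stream(reader, writer):
    """Read rows from reader and write one concatenated value per row."""
    for line in reader:
        result = concat_fields(line)
        if result is not None:
            writer.write(result + "\n")


# Demo with in-memory streams. In a real streaming script (assumption),
# the call would be stream(sys.stdin, sys.stdout) instead.
demo_in = io.StringIO("foo\tbar\nonly_one_field\nhello\tworld\n")
demo_out = io.StringIO()
stream(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

Keeping the per-row logic in its own function makes it possible to test the script locally with in-memory streams before submitting it to a cluster.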


HDInsight + Power BI: a simple example

Last October, I had the opportunity to show how to analyze data from Web logs and Twitter with Pig and Hive in Hadoop, then combine the results in Excel, which makes it possible to publish the result in Power BI. Here are the slides and the videos (the videos are the videos of…


Why can’t I remove my storage account?

You may want to remove a storage account you’ve created and get a message like this one: “Storage account <mystorage> has container(s) which have an active image and/or disk artifacts. Ensure those artifacts are removed from the image repository before deleting this storage account.” Here is what you may want to check. In the management…


How to deploy a Python module to Windows Azure HDInsight

Introduction: In a previous post, I explained how to run Hive + Python in HDInsight (Hadoop as a service in Windows Azure). The sample showed a Python script using standard modules such as hashlib. In real life, modules need to be installed on the machine before they can be used. Recently, I had to use…


A simple example: how to call Python from Hive in HDInsight

Introduction: The Hadoop framework automatically distributes code execution across a multi-node cluster. The code is also distributed against the dataset. Code for Hadoop can be written in Java, in which case one has to implement a map function and a reduce function; both take keys and values as inputs and produce keys and values as outputs. At a higher level, there…
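As an illustration of the map and reduce functions mentioned above, here is a minimal single-process sketch in plain Python rather than Java. It does not use any Hadoop API: `map_fn`, `reduce_fn`, and `run_job` are hypothetical names, the word-count task is an assumed example, and the grouping step only imitates what the framework would do between the two phases.

```python
from collections import defaultdict


def map_fn(key, value):
    """Emit (word, 1) for each word in a line; the input key is ignored."""
    for word in value.split():
        yield word.lower(), 1


def reduce_fn(key, values):
    """Sum all the counts emitted for one word."""
    yield key, sum(values)


def run_job(records):
    """Run map, group the intermediate values by key, then run reduce."""
    groups = defaultdict(list)
    for key, value in records:
        for out_key, out_value in map_fn(key, value):
            groups[out_key].append(out_value)
    results = {}
    for key in sorted(groups):
        for out_key, out_value in reduce_fn(key, groups[key]):
            results[out_key] = out_value
    return results


# Input records are (key, value) pairs, here (line number, line text).
counts = run_job([(0, "Hive and Pig"), (1, "pig and Python")])
print(counts)
```

Both functions consume and produce key/value pairs, which is what lets a framework run many map calls in parallel and route all values for a given key to a single reduce call.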
