Enabling U-SQL Advanced Analytics for Local Execution

Since we announced the ability for U-SQL to massively distribute Python code in the Azure Data Lake Analytics service, a lot of developers have been asking us when the Python support will work with U-SQL Local Execution. In this post, we’ll describe how to make that happen. To set expectations, what we describe below is not officially supported – we do…


Connecting your own Hadoop or Spark to Azure Data Lake Store

A frequent question we get is how to connect a Hadoop or Spark cluster to Azure Data Lake Store. It turns out to be really easy. Here is a step-by-step article that will help you get this configured. Enjoy! Connecting your own Hadoop or Spark to Azure Data Lake Store

Building advanced analytical solutions faster using Dataiku DSS on HDInsight

The Azure HDInsight Application Platform allows users to use applications that span a variety of use cases like data ingestion, data preparation, data processing, building analytical solutions and data visualization. In this post we will see how DSS (Data Science Studio) from Dataiku can help a user build a predictive machine learning model to analyze…

HDInsight – How to perform Bulk Load with Phoenix?

Apache HBase is an open-source NoSQL Hadoop database: a distributed, scalable big data store. It provides real-time read/write access to large datasets. HDInsight HBase is offered as a managed cluster that is integrated into the Azure environment. HBase provides many features as a big data store. But in order to use HBase, the…


Uncover insights rapidly from petabytes of data in Azure Data Lake Store with SQL Data Warehouse PolyBase support

The most common patterns using Azure Data Lake Store (ADLS) involve customers ingesting and storing raw data in ADLS. This data is then cooked and prepared by analytic workloads like Azure Data Lake Analytics and HDInsight. Once cooked, this data is explored using engines like Azure SQL Data Warehouse. One key pain point for customers…


Distributed Deep Learning on HDInsight with Caffe on Spark

Introduction: Deep learning is impacting everything from healthcare to transportation to manufacturing, and more. Companies are turning to deep learning to solve hard problems, like image classification, speech recognition, object recognition, and machine translation. There are many popular frameworks, including Microsoft Cognitive Toolkit, TensorFlow, MXNet, Theano, etc. Caffe is one of the most famous non-symbolic (imperative)…

U-SQL Deprecation Update: Migration of Data Source Credentials and Removal of CREATE CREDENTIAL, ALTER CREDENTIAL and DROP CREDENTIAL

Back in October, we announced that we had simplified U-SQL credentials by merging the password secrets created in PowerShell and the other parts of the credential object into credentials that are created entirely with a PowerShell command. This removes one statement from the creation process. During the initial phase, we did…


U-SQL Deprecation notice: PARTITION BY BUCKET will be removed

Hi all, in the upcoming refresh, we are removing the deprecated syntax PARTITION BY BUCKET, and using it will raise an error. Thus, if you have not yet updated your table definitions with the previously announced new syntax, please do so now, or your scripts will start failing sometime in February! For more information, please see:…


Introducing: Microsoft Azure Data Lake Tools for Visual Studio Code

Welcome to the Microsoft Azure Data Lake Tools preview for Visual Studio Code, an extension for developing U-SQL projects against Microsoft Azure Data Lake! This extension gives you a cross-platform, lightweight, keyboard-focused authoring experience for U-SQL while maintaining a full set of development functions. Want to make this extension even more awesome? Share your feedback…