Create U-SQL EXTRACT Script Automatically

In this blog post, you will learn how to create a U-SQL EXTRACT script automatically using the latest version of Azure Data Lake Tools for Visual Studio.

Watch this 3-minute video to learn more.

 

One of U-SQL's core capabilities is schematizing unstructured data on the fly, without having to create a metadata object for it. This capability is provided by the EXTRACT expression, which invokes either a user-defined extractor or a built-in extractor to process the input file or set of files specified in the FROM clause, and produces a rowset whose schema is specified in the EXTRACT clause.
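As a rough illustration of the shape of such an expression, here is a minimal sketch that uses the built-in CSV extractor. The file paths and column names are hypothetical and chosen only for this example; substitute the ones that match your own data.

```usql
// Minimal sketch: schematize a CSV file on read with the built-in extractor.
// The input path, output path, and column names below are hypothetical.
@orders =
    EXTRACT OrderId   int,
            Customer  string,
            OrderDate DateTime,
            Amount    double
    FROM "/Samples/Data/orders.csv"
    USING Extractors.Csv();

// Write the rowset back out so the script has an output to produce.
OUTPUT @orders
TO "/output/orders_copy.csv"
USING Outputters.Csv();
```

Each column listed in the EXTRACT clause must line up, in order and in type, with a column in the file. That is why writing the list by hand becomes tedious as the file grows wider.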

When you use a built-in extractor to schematize semi-structured data, such as the data in a .csv file, writing the schema definition in U-SQL by hand is slow and error-prone, especially when the .csv file contains hundreds of columns.

Recently, we released a new feature in the latest version of Azure Data Lake Tools for Visual Studio to help you generate this U-SQL EXTRACT statement automatically.

How to use this feature?

Step 1:

Double-click your ADLS account in Server Explorer to open Azure Data Lake Explorer in Visual Studio.

 

Step 2:

Find the file, right-click it, and choose Create EXTRACT Script.

If you have an ADLS URI for the file you want to query, go to Tools > Data Lake > Open ADLS Path to open the file preview, and then click Create EXTRACT Script in the file preview window.

 

Step 3:

Adjust the automatically generated script as needed. Click a column header to change the column name, and click the icon beside the column name to change the associated data type. If the first row of your file is a header row, check File Has Header Row to use the first row as the column names.

 

Step 4:

Click Copy Script to copy the script to the clipboard, or click New Query from Script to open a new temporary query with the script.
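Once the script is in a query window, you will typically add your own processing and an OUTPUT statement around it. The sketch below shows what a script for a file with a header row might look like after that step. It is not the tool's literal output: the paths and column names are hypothetical, and the use of skipFirstNRows to skip the header row is an assumption about how the header is handled.

```usql
// Hypothetical script for a CSV file whose first row holds the column names.
// Paths and column names are placeholders, not output captured from the tool.
@input =
    EXTRACT UserId   int,
            Region   string,
            Query    string,
            Duration int?,      // nullable, in case some rows leave this field empty
            Visited  DateTime
    FROM "/Samples/Data/searchlog.csv"
    USING Extractors.Csv(skipFirstNRows: 1);   // skip the header row

// Example follow-up step: keep only rows that have a duration value.
@withDuration =
    SELECT UserId, Region, Query, Duration, Visited
    FROM @input
    WHERE Duration != null;

OUTPUT @withDuration
TO "/output/searchlog_with_duration.csv"
USING Outputters.Csv(outputHeader: true);
```

Before submitting the job, check the column types against a sample of the data. A column that looks numeric in the previewed rows may still contain empty or non-numeric values further down the file, in which case a nullable type or string is the safer choice.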

 

 

Get the latest Azure Data Lake Tools for Visual Studio from https://aka.ms/adltoolsvs.

Contact us at adldevtool@microsoft.com if you have questions or feedback.