Continuous integration made easy with MSBuild support for U-SQL (preview)

Azure Data Lake provides enterprise-ready big data as a service on Azure. A key requirement of enterprises is the ability to automate code integration processes to ensure fast, consistent promotion of tested code. Today, we are pleased to announce a highly requested feature from Azure Data Lake customers: the ability to use the Microsoft Build Engine (MSBuild) for scripted build and deployment of U-SQL applications. By using MSBuild, customers can orchestrate, automate, and build U-SQL projects in many environments, including those where Visual Studio isn't installed. This accelerates the process of promoting code from development to production environments.

Project migration

During this preview, you have to migrate your U-SQL projects manually in order to use MSBuild. In the next update of the Azure Data Lake Tools for Visual Studio package, the tool will apply the new template to existing projects when they are opened.

To migrate an existing U-SQL project, you need to modify the project file. After unloading the project in Visual Studio, right-click the project and choose Edit.

In the project file, find the line <Import Project="$(AppData)\Microsoft\DataLake\MsBuild\1.0\Usql.targets" />, which is usually the second-to-last line, and replace it with:

  <!-- check for SDK Build target in current path then in USQLSDKPath in the case of command line build -->
  <Import Project="UsqlSDKBuild.targets" Condition="'$(BuildingInsideVisualStudio)' != 'true' And Exists('UsqlSDKBuild.targets')" />
  <Import Project="$(USQLSDKPath)\UsqlSDKBuild.targets" Condition="'$(BuildingInsideVisualStudio)' != 'true' And !Exists('UsqlSDKBuild.targets') And '$(USQLSDKPath)' != '' And Exists('$(USQLSDKPath)\UsqlSDKBuild.targets')" />
  <!-- backward compatible with IDE build -->
  <Import Project="$(AppData)\Microsoft\DataLake\MsBuild\1.0\Usql.targets" Condition="'$(BuildingInsideVisualStudio)' == 'true'" />
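
If you have many projects to migrate, the replacement above can be scripted. The following is a minimal Python sketch, not part of the Azure Data Lake tooling: the function name migrate_project_text is hypothetical, and it assumes the old import appears exactly once in the form shown above.

```python
import re

# Matches the single unconditional import found in projects that have not been migrated.
OLD_IMPORT = re.compile(
    r'[ \t]*<Import\s+Project="\$\(AppData\)\\Microsoft\\DataLake\\MsBuild\\1\.0\\Usql\.targets"\s*/>'
)

# The conditional imports from the migration step (raw string keeps backslashes literal).
NEW_IMPORTS = r"""  <!-- check for SDK Build target in current path then in USQLSDKPath in the case of command line build -->
  <Import Project="UsqlSDKBuild.targets" Condition="'$(BuildingInsideVisualStudio)' != 'true' And Exists('UsqlSDKBuild.targets')" />
  <Import Project="$(USQLSDKPath)\UsqlSDKBuild.targets" Condition="'$(BuildingInsideVisualStudio)' != 'true' And !Exists('UsqlSDKBuild.targets') And '$(USQLSDKPath)' != '' And Exists('$(USQLSDKPath)\UsqlSDKBuild.targets')" />
  <!-- backward compatible with IDE build -->
  <Import Project="$(AppData)\Microsoft\DataLake\MsBuild\1.0\Usql.targets" Condition="'$(BuildingInsideVisualStudio)' == 'true'" />"""


def migrate_project_text(text):
    """Return (new_text, changed) for the contents of a .usqlproj file.

    changed is False when the old import is not present, for example
    because the project has already been migrated.
    """
    # A function is used as the replacement so that backslashes in
    # NEW_IMPORTS are not interpreted as regex escape sequences.
    new_text, count = OLD_IMPORT.subn(lambda _match: NEW_IMPORTS, text, count=1)
    return new_text, count == 1
```

Read the project file into a string, pass it to the function, and write the result back only when changed is True. Running it a second time leaves an already migrated file untouched.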

Get the NuGet package

MSBuild doesn't provide built-in support for the U-SQL project type. To add this support, add a reference from your solution to the Microsoft.Azure.DataLake.USQL.SDK NuGet package, which provides the required language service.

To add the NuGet package reference, right-click the solution in Solution Explorer, choose Manage NuGet Packages for Solution, and then search for the package. Alternatively, add a file called "packages.config" to the solution with the following contents:

<?xml version="1.0" encoding="utf-8"?>
<packages>
  <package id="Microsoft.Azure.DataLake.USQL.SDK" version="1.3.1019-preview" targetFramework="net452" />
</packages>

MSBuild command line

After migrating the project and getting the NuGet package, you can call the standard MSBuild command line with the additional arguments below to build your U-SQL project:

 msbuild USQLBuild.usqlproj /p:USQLSDKPath=packages\Microsoft.Azure.DataLake.USQL.SDK.1.3.1019-preview\build\runtime;USQLTargetType=SyntaxCheck;DataRoot=datarootfolder

The arguments and their values are:

USQLSDKPath=<U-SQL NuGet package>\build\runtime: The install path of the NuGet package for the U-SQL language service mentioned above.

USQLTargetType=Merge or SyntaxCheck:

  • Merge: Merge mode compiles code-behind files, such as .cs, .py and .r files, and inlines the resulting user-defined code library (such as a DLL binary, Python or R code) into the U-SQL script.
  • SyntaxCheck: SyntaxCheck mode first merges code-behind files into the U-SQL script, and then compiles the U-SQL script to validate your code.

DataRoot=<DataRoot path>: DataRoot is needed only for SyntaxCheck mode. When building the script in SyntaxCheck mode, MSBuild checks the script's references to database objects. Before building, make sure the DataRoot folder on the build machine contains a matching local environment with the referenced objects from the U-SQL database. Note that MSBuild checks only references to database objects, not references to files.
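
The argument rules above can be captured in a small helper for build scripts. This is a minimal Python sketch under stated assumptions, not an official tool: the function names are hypothetical, the package folder is assumed to be named <package id>.<version> under a packages directory (as in the example command above), and each property is passed with its own /p: switch.

```python
import xml.etree.ElementTree as ET
from pathlib import PureWindowsPath

SDK_PACKAGE_ID = "Microsoft.Azure.DataLake.USQL.SDK"
TARGET_TYPES = ("Merge", "SyntaxCheck")


def sdk_path_from_packages_config(xml_text, packages_dir="packages"):
    """Derive the USQLSDKPath value from the SDK entry in packages.config."""
    root = ET.fromstring(xml_text)
    for package in root.iter("package"):
        if package.get("id") == SDK_PACKAGE_ID:
            # Assumption: NuGet restores the package to <id>.<version>.
            folder = f"{SDK_PACKAGE_ID}.{package.get('version')}"
            return str(PureWindowsPath(packages_dir) / folder / "build" / "runtime")
    raise LookupError(f"{SDK_PACKAGE_ID} is not listed in packages.config")


def msbuild_args(project, sdk_path, target_type, data_root=None):
    """Build the MSBuild argument list for a U-SQL project."""
    if target_type not in TARGET_TYPES:
        raise ValueError(f"USQLTargetType must be one of {TARGET_TYPES}")
    if target_type == "SyntaxCheck" and not data_root:
        # DataRoot is only required when the script is compiled and checked.
        raise ValueError("SyntaxCheck mode needs a DataRoot folder")
    args = [
        "msbuild",
        project,
        f"/p:USQLSDKPath={sdk_path}",
        f"/p:USQLTargetType={target_type}",
    ]
    if target_type == "SyntaxCheck":
        args.append(f"/p:DataRoot={data_root}")
    return args
```

The returned list can be passed to subprocess.run on a machine where msbuild is on the PATH. Reading the version from packages.config keeps the SDK path in step with the package reference.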

Continuous integration with VSTS

Besides the command line, customers can also use the Visual Studio Build or MSBuild task to build U-SQL projects in VSTS. To do this, make sure to:

  1. Add a NuGet restore task to fetch the NuGet packages referenced by the solution, including Microsoft.Azure.DataLake.USQL.SDK, so that MSBuild can find the U-SQL language targets.
  2. Set the MSBuild arguments in the Visual Studio Build or MSBuild task, for example:
 /p:USQLSDKPath=$(Build.SourcesDirectory)/USQLMSBuild/packages/Microsoft.Azure.DataLake.USQL.SDK.1.3.1019-preview/build/runtime /p:USQLTargetType=SyntaxCheck /p:DataRoot=$(Build.SourcesDirectory)

Alternatively, you can define variables for these arguments in the VSTS build definition.

Build output

After running MSBuild from the command line or as a VSTS task, all scripts in the U-SQL project are built and output to a single file at "<Build output path>/<script name>/<script name>.usql". You can copy this composite U-SQL script to the release folder for further deployment.
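
The copy step can be scripted as part of a release pipeline. Here is a minimal Python sketch, assuming the output layout described above; the function names and the flat release folder are illustrative choices, not part of the U-SQL SDK.

```python
import shutil
from pathlib import Path


def find_built_scripts(build_output):
    """Return files matching <build output>/<script name>/<script name>.usql."""
    found = []
    for folder in sorted(Path(build_output).iterdir()):
        candidate = folder / f"{folder.name}.usql"
        if folder.is_dir() and candidate.is_file():
            found.append(candidate)
    return found


def copy_to_release(build_output, release_dir):
    """Copy every built U-SQL script into release_dir and return the new paths."""
    release = Path(release_dir)
    release.mkdir(parents=True, exist_ok=True)
    copied = []
    for script in find_built_scripts(build_output):
        # copy2 also preserves file timestamps.
        copied.append(Path(shutil.copy2(script, release / script.name)))
    return copied
```

Folders that do not contain a script named after the folder are skipped, so other build artifacts in the output path are left alone.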

Known limitation and roadmap

For this preview, when building in SyntaxCheck mode, you need to set up the referenced database environment on the build agent. For example, if a U-SQL script queries a table and references an assembly, you need to create that table and register that assembly in the local database stored in the DataRoot folder on the build agent before the build task starts; otherwise, the build will fail.

To remove this inconvenience and to speed up U-SQL database development and deployment, we are developing a new project type called U-SQL Database Project, which makes it easier to develop, manage and deploy U-SQL databases.

If you are interested in giving us feedback on U-SQL Database Project or trying the preview version soon, please take this survey (1 minute to complete) or contact us at adldevtool@microsoft.com. Your input will be extremely helpful for us to improve the experience.