Some highlights of the August 1st U-SQL refresh: Skipping header rows, Database Level ACLs, improved Extractor framework

We just released some big updates to U-SQL and Azure Data Lake.

Check out the U-SQL release notes and the blog post summarizing the new file and folder-level access control in Azure Data Lake Storage.

In this blog post I want to call out some of the highlights.

First, we released access control at the database level. You can now control who can create databases, who can read from them, and who can create objects in them. Note that the master database in U-SQL remains open for everyone to use by default.

Second, we fixed the so-called record boundary/extent boundary alignment issue that caused large files to fail during extraction unless they had been uploaded in specific ways. The fix also gives us correct information about the segments (extents) that are passed to an extractor. This in turn allows us to finally enable the skipFirstNRows parameter on the built-in extractors, and it lets you write parallelizable custom extractors that process the first or last file segment specially. I will provide some more detailed blog posts on such custom extractors soon.
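
To illustrate the skipFirstNRows parameter mentioned above, here is a minimal sketch of a script that skips a single header row with the built-in CSV extractor. The file paths and the column schema are hypothetical and chosen only for illustration; adjust them to your own data.

```usql
// Hypothetical input file with one header row followed by data rows.
// The column names and types below are assumptions for this sketch.
@rows =
    EXTRACT id int,
            name string,
            amount decimal
    FROM "/input/sales.csv"
    USING Extractors.Csv(skipFirstNRows: 1); // skip the header row

// Write the data rows back out without the header (hypothetical output path).
OUTPUT @rows
TO "/output/sales_no_header.csv"
USING Outputters.Csv();
```

If a file carries several leading rows that are not data, a larger value for skipFirstNRows should work the same way.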

Finally, I strongly suggest that you sign up to test the new sampling capabilities.