The U-SQL/Python extensions for Azure Data Lake Analytics ship with the standard Python libraries and include pandas and numpy. We've been getting a lot of questions about how to use custom libraries. It is simple: package your modules in a ZIP file and let Python import them from there.
PEP 273 (zipimport) gave Python's import statement the ability to import modules from ZIP files. Take a moment to review the zipimport documentation before we proceed.
Here are the basics:
Before you try this with U-SQL, first master the mechanics of zipimport on your own box.
Create a file called mymodule.py with the following contents:
# demo module
hello_world = "Hello World! This is code from a custom module"
This module defines a single variable called hello_world.
Create a zip file called mycustommodules.zip that contains mymodule.py at the root of the archive.
Create a test.py Python file in the same folder as mycustommodules.zip with the following contents:
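If you prefer not to use a separate zip tool, the archive can also be built with Python's standard zipfile module. The following is a minimal sketch, not part of the original walkthrough: the helper name build_module_zip is hypothetical, and the demo writes its files into a scratch folder rather than your working folder. The arcname argument is what keeps mymodule.py at the root of the archive, which is where zipimport looks for top-level modules.

```python
import os
import tempfile
import zipfile


def build_module_zip(module_path, zip_path):
    """Write one .py file into a ZIP archive at the archive root (hypothetical helper)."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # arcname drops any folder prefix so the module sits at the archive root
        archive.write(module_path, arcname=os.path.basename(module_path))
    # Re-open the archive and report what it contains
    with zipfile.ZipFile(zip_path) as archive:
        return archive.namelist()


# Demo in a scratch folder (an assumption, so nothing in your own folder is overwritten)
work_dir = tempfile.mkdtemp()
module_path = os.path.join(work_dir, "mymodule.py")
with open(module_path, "w") as source:
    source.write("# demo module\n")
    source.write('hello_world = "Hello World! This is code from a custom module"\n')

zip_path = os.path.join(work_dir, "mycustommodules.zip")
entries = build_module_zip(module_path, zip_path)
print(entries)  # ['mymodule.py']
```

If the listing shows a folder prefix in front of mymodule.py, the module is not at the root and a plain "import mymodule" will not find it.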
import sys
sys.path.insert(0, 'mycustommodules.zip')
import mymodule
print(mymodule.hello_world)
Your folder should contain mymodule.py, mycustommodules.zip, and test.py.
Now run the test.py program:
python test.py
The output should look like this:
Hello World! This is code from a custom module
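Seeing the message does not by itself prove that the module came from the archive, because a loose mymodule.py in the same folder would produce the same output. The sketch below shows one way to check where the import was resolved. It rebuilds a small archive in a scratch folder (an assumption made so the example is self-contained) and then inspects the loader recorded on the imported module.

```python
import os
import sys
import tempfile
import zipfile
import zipimport

# Build a throwaway archive in a scratch folder, with no loose mymodule.py next to it
work_dir = tempfile.mkdtemp()
zip_path = os.path.join(work_dir, "mycustommodules.zip")
with zipfile.ZipFile(zip_path, "w") as archive:
    archive.writestr(
        "mymodule.py",
        'hello_world = "Hello World! This is code from a custom module"\n',
    )

# Same mechanics as test.py: put the archive first on the import path
sys.path.insert(0, zip_path)
import mymodule

print(mymodule.hello_world)

# The module's file path points inside the archive ...
print(mymodule.__file__)

# ... and its loader is the zipimport machinery rather than the regular file loader
loaded_from_zip = isinstance(mymodule.__spec__.loader, zipimport.zipimporter)
print(loaded_from_zip)
```

On your own box you can get the same confirmation by adding a print(mymodule.__file__) line to test.py: the path should run through mycustommodules.zip.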
First, upload the mycustommodules.zip file to your ADLS store. In this case we upload it to the root of the default ADLS account for the ADLA account we are using, so its path is "/mycustommodules.zip".
Now, run this U-SQL script:
REFERENCE ASSEMBLY [ExtPython];
DEPLOY RESOURCE @"/mycustommodules.zip";
// mymodule.py is inside the mycustommodules.zip file
DECLARE @myScript = @"
import sys
sys.path.insert(0, 'mycustommodules.zip')
import mymodule
def usqlml_main(df):
    del df['number']
    df['hello_world'] = str(mymodule.hello_world)
    return df
";
@rows =
SELECT * FROM (VALUES (1)) AS D(number);
@rows =
REDUCE @rows ON number
PRODUCE hello_world string
USING new Extension.Python.Reducer(pyScript:@myScript);
OUTPUT @rows
TO "/demo_python_custom_module.csv"
USING Outputters.Csv(outputHeader: true);
The script produces a simple CSV file, /demo_python_custom_module.csv, with a hello_world header and "Hello World! This is code from a custom module" as its single row.