Jupyter Notebooks in Azure Machine Learning Studio: the perfect tool for Academics and Students

Azure ML Studio is a powerful canvas for the composition of machine learning experiments and their subsequent operationalization and consumption. 

Azure ML Studio provides an easy-to-use, yet powerful, drag-and-drop style of creating experiments.

A really cool feature now within Azure ML is the integration of Jupyter Notebooks. I really love Jupyter Notebooks because they:

1. Provide a really nice interface for quickly running code, visualizing data, exploring insights, and trying out ideas

2. Are superb for academic documentation and for student submissions of projects, tasks, or assignments

image

 

image

What's really nice about Jupyter Notebooks is that they run on any OS and in any modern browser.

At a high level, a Notebook consists of two main types of “cells”, which are clearly identified in the Notebook and are perfect for submission as part of academic exercises:

  • Markdown cells for documentation

  • Executable code cells

After editing a cell, press Shift+Enter to run it.

Here is an example of a Jupyter Notebook and its Markdown:

image

 

Jupyter Notebooks also provide special commands (“magics”) that act as macros, as well as an escape character (“!”) to access the shell.
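
As a rough illustration (not taken from the original screenshots), the comments below show the `%timeit` magic and the “!” escape as they would be typed in a notebook cell. That syntax only works inside IPython/Jupyter, so the runnable part shows approximate plain-Python equivalents from the standard library. The timed statement and the `echo` command are arbitrary examples.

```python
import subprocess
import timeit

# In a notebook cell you would type (IPython syntax, not plain Python):
#   %timeit sum(range(1000))
#   !echo hello from the shell

# Rough plain-Python equivalent of %timeit: time 1000 runs of the statement.
elapsed = timeit.timeit("sum(range(1000))", number=1000)
print("1000 runs took %.4f seconds" % elapsed)

# Rough plain-Python equivalent of the "!" escape: run a shell command
# and capture what it prints.
shell_output = subprocess.check_output("echo hello from the shell", shell=True)
print(shell_output.decode().strip())
```

Inside a notebook the magic and the escape are shorter to type and display their results inline, which is why they are handy for quick experiments.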

 

Integration with Azure Machine Learning Studio

Simply click +New, get a blank notebook, enter some Python, perhaps import some data from Azure blob storage and compute away: 

image

 

Finding out more about your dataset

Simply select it, then choose to open it in a Notebook and explore away. Your dataset is automatically available as a Pandas dataframe:
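
A minimal sketch of a first look at such a dataframe. It assumes the generated notebook code exposes the data under the name `frame`; the stand-in dataframe, its column names, and its values are made up so that the snippet runs on its own.

```python
import pandas as pd

# Stand-in for the dataframe the generated notebook code provides.
# The column names and values are made up for illustration.
frame = pd.DataFrame({
    "UserId": [1, 1, 2, 3],
    "MovieId": [10, 20, 10, 30],
    "Rating": [4, 5, 3, None],
})

print(frame.shape)    # (number of rows, number of columns)
print(frame.dtypes)   # data type of each column
print(frame.head(3))  # first three rows

# Count missing values per column before doing any analysis.
missing = frame.isnull().sum()
print(missing)
```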

 

image

Sometimes you need to check out a dataset in between phases. 

There is now an easy way to do this: first, add a Convert to CSV node, then right-click it and open it in a Notebook.

Your data will be available as a Pandas dataframe as in the above case:

image

Currently you can add R and Python code modules in your experiments by editing them directly in the embedded editor.

While convenient for short snippets, it does not provide an execution environment. 

You can use Notebooks to author and debug your modules and then paste them back into the experiment nodes instead.
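
A minimal sketch of that workflow for a Python module. It assumes the Execute Python Script convention of an `azureml_main` entry point that receives up to two dataframes and returns a sequence containing one dataframe; check the module documentation for the exact contract. The `Rating` column, the `RatingCentered` column, and the sample values are hypothetical.

```python
import pandas as pd


def azureml_main(dataframe1=None, dataframe2=None):
    """Add a column holding each rating's distance from the mean rating."""
    result = dataframe1.copy()
    result["RatingCentered"] = result["Rating"] - result["Rating"].mean()
    # The module is assumed to expect a sequence of dataframes.
    return result,


# Debug in the notebook with a small hand-made dataframe before pasting
# the function into the experiment node.
sample = pd.DataFrame({"Rating": [1.0, 3.0, 5.0]})
(output,) = azureml_main(sample)
print(output)
```

Working on a copy keeps the input dataframe unchanged, which makes it easier to rerun the cell while debugging.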

image

Your notebooks are persisted in your workspace and can be used in subsequent sessions. 

You can see a list of your notebooks by clicking on the Notebook tab. Notebooks can be renamed, deleted, copied, etc. from either the Studio or from Jupyter directly and both environments will sync up.

Azure ML Client SDK

Enumerating and exploring your datasets and experiments from within the notebook (or any IDE for that matter) is pretty easy: 

image

You can actually slice, dice, and store the modified dataset back into Azure ML. This and similar functionality is available via the recently enhanced Azure ML Client SDK.
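
The sketch below shows roughly what this looks like in code. The two helper functions are hypothetical wrappers written for this post; the calls in the comments (`Workspace`, `datasets`, `to_dataframe`, `add_from_dataframe`, `DataTypeIds`) are recalled from the `azureml` client library and should be checked against the version installed in your notebook. The ids, tokens, and dataset names are placeholders.

```python
def dataset_names(workspace):
    """Return the names of all datasets visible in the workspace."""
    return [dataset.name for dataset in workspace.datasets]


def load_frame(workspace, name):
    """Fetch one dataset by name as a pandas dataframe."""
    return workspace.datasets[name].to_dataframe()


# In a Studio notebook the workspace would be created roughly like this
# (assumed API, placeholder values):
#   from azureml import Workspace, DataTypeIds
#   ws = Workspace(workspace_id="<workspace id>",
#                  authorization_token="<authorization token>")
#   print(dataset_names(ws))
#   frame = load_frame(ws, "<dataset name>")
#
# Storing a modified dataframe back is assumed to look like:
#   ws.datasets.add_from_dataframe(dataframe=frame,
#                                  data_type_id=DataTypeIds.GenericCSV,
#                                  name="<new dataset name>",
#                                  description="<description>")
```

Passing the workspace in as an argument keeps the helpers easy to try out with a stand-in object before pointing them at a real workspace.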

Additionally, you can use the Python Azure SDK to access a wide variety of services in Azure, including Storage, Service Management, etc.:

image

Note: Both the Azure SDK and the Azure ML Client SDK are preinstalled for you.

Execution Environment

The Notebook environment currently supports Python 2, Python 3, and R.

When you start up a Notebook, you have the full Anaconda 64-bit distribution available to you. The full list of packages is linked under Resources below.

The most relevant ones are: numpy/scipy, pandas, matplotlib, scikit-learn. 
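
To confirm which of these packages are importable in your own session, a small check like the following can help. It is a sketch for a Python 3 kernel (it relies on `importlib.util.find_spec`), and the helper name `available_packages` is made up for this example.

```python
import importlib.util

# Import names of the packages mentioned above
# (scikit-learn is imported as sklearn).
PACKAGES = ["numpy", "scipy", "pandas", "matplotlib", "sklearn"]


def available_packages(names):
    """Map each top-level package name to True if it can be imported."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


for name, found in sorted(available_packages(PACKAGES).items()):
    print("%-12s %s" % (name, "available" if found else "missing"))
```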

For the curious, the Notebook service runs on Ubuntu 14.04.02 under Docker. Shell commands are available via the “!” escape character.

If you are inactive for more than one hour, your Notebook Server will be reclaimed. 

Notebooks are checkpointed regularly and the latest saved version will appear in your Studio workspace.

You can also manually click Save on the menu bar as well as download the Notebook to your local machine.
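
A downloaded notebook is a JSON document, so it can be inspected with ordinary tools, for example when reviewing student submissions. The sketch below counts Markdown and code cells. It assumes the version 4 notebook format, where cells sit in a top-level `cells` list; the tiny notebook is hand-written for illustration, and `count_cells` is a made-up helper name.

```python
import json


def count_cells(notebook):
    """Count cells by type in a notebook loaded from .ipynb JSON (format v4)."""
    counts = {}
    for cell in notebook.get("cells", []):
        cell_type = cell.get("cell_type", "unknown")
        counts[cell_type] = counts.get(cell_type, 0) + 1
    return counts


# A tiny hand-written notebook standing in for a downloaded file.
# For a real file, load it with json.load() from the .ipynb path instead.
notebook = json.loads("""
{
  "nbformat": 4,
  "nbformat_minor": 0,
  "metadata": {},
  "cells": [
    {"cell_type": "markdown", "metadata": {}, "source": ["# Assignment 1"]},
    {"cell_type": "code", "metadata": {}, "source": ["print(1 + 1)"],
     "outputs": [], "execution_count": null}
  ]
}
""")
print(count_cells(notebook))
```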

Jupyter and Data Science

Jupyter is one of the most important innovations in the data science and technical computing space in recent years. You now have full access to its power from any OS, from any modern browser directly from inside Azure ML Studio. You can choose whichever canvas makes the most sense at that particular moment. The two work together hand in hand to ensure a productive and delightful experience for you.

Notebooks are easy and fun to use – give it a try right now:

  • Go to https://studio.azureml.net and select “Get started”

  • Select Guest or, better yet, create an account and log in so your Notebooks persist

  • Click the Datasets tab on the left, then “Samples” on top, then “Movie Ratings”

  • At the bottom of the page, select “Open in Notebook”, Python 2

  • Note that your auth and conversion to dataframe code is set up for you

  • Paste this code into a new cell:

 # Print the summary explicitly: a cell only auto-displays its last expression
 print(frame.describe())
 frame['Rating'].head(200).plot(figsize=(12,4))

  • From the top menu select “Run All”

You should see a summary of your data, plus a plot of the first 200 ratings:

image

 

Resources

The following links provide further information on Jupyter and Azure ML:

Jupyter

Azure ML

Python Tools for Visual Studio

Anaconda and key pkgs