Learn the fundamentals of Microsoft’s Machine Learning and Artificial Intelligence services

03/02/2018

Complete learning paths: 15h 10m + practice labs, see https://azure.microsoft.com/en-us/training/

Learn the fundamentals of Microsoft’s Machine Learning and Artificial Intelligence services and how to build applications using them.

Develop practical job skills that you can start using straight away.

Free registration. Courses provided free in partnership with Pluralsight.

Beginner

Developing AI applications on Microsoft Azure – Getting started

Self-paced course 4h 25m

Overview
Using Microsoft Cognitive Services in your applications
Let's begin with this short video that explains the Microsoft Cognitive Services and shows a sample application.
Explain to a friend what Cognitive Services are and give an example of how you can use them in an application.

Introduction and Setup
Install and configure Visual Studio
To get started, you'll need Visual Studio to create and run your Cognitive Services application. The free Community Edition works fine.
Open Visual Studio and create a simple application.
Copy the sample application
Next, you'll need to download the sample application. You'll work through other example solutions, but this one has the complete application used for this learning path.
Open the ZIP file and expand it to a directory on your computer. Open the Visual Studio solution (the ".sln" file) in Visual Studio.

Working with the Universal Windows Platform SDK
The Universal Windows Platform (UWP) is the app platform for Windows 10. We'll be using this SDK throughout the Learning Path to work with vision, text and speech. This overview gives you an introduction to what the SDK is.
First, set up the SDK, and then complete a sample application.

The Acquire Phase
Accessing the Camera in Windows
We'll start the Acquire Phase by learning to use the Windows built-in camera UI and to create Universal Windows Platform (UWP) apps that use the camera to capture photos, video, or audio. This GitHub sample demonstrates the Windows.Media.Capture API and how to use it.
Create a basic app that has photo, video, and audio capture using MediaCapture.
Working with Handwriting
Now that we have the video and audio interfaces created, we're ready to add in the next human-interaction part of the application: handwriting. We'll learn how to take input using the InkCanvas API. This GitHub-based project (scroll down to the README file) shows how to use ink functionality (such as capturing ink from user input and performing handwriting recognition on ink strokes) in Universal Windows apps using C#.
Create an app that supports writing and drawing with Windows Ink.
Acquiring Speech with Cortana
With both Acquire and Processing capabilities in place, we'll now focus on Microsoft's intelligent assistant, Cortana, and explain how it can start the sample application and set and receive messages by using natural language. This GitHub sample demonstrates how to integrate your app with Cortana voice commands. It covers authoring and installing a Voice Command Definition file, and shows how your app can respond to being activated by Cortana.
Ensure you can start an application using Cortana.

Processing Phase
Processing speech with the Windows.Media.SpeechRecognition Namespace
One of the most complicated (and powerful) capabilities in human-like AI processing is working with speech. The Windows.Media.SpeechRecognition namespace provides functionality with which you can acquire and monitor speech input, create speech recognition grammars that produce both literal and semantic recognition results, capture information from events generated by the speech recognizer, and configure and manage speech recognition engines. This resource gives an overview of this technology in Windows, links to a set of articles at the bottom of the page where you can learn more, and includes samples you should work through.
Make a UWP app respond with speech.
Recognizing a Face Image
We'll continue the Processing Phase using the video we've captured. First, we'll zero in on the face of the user using the FaceDetectionEffect class of the Windows.Media.Core namespace. Next, we'll take that area of the video and pass it along to our first Cognitive Services call, the Face API. Using this service, we'll first upload a still shot of the user, which the Face API can then compare against previously tagged people in images to identify the user. This lets the user set or get messages. This GitHub sample shows you how to work with the APIs.
Create a simple app that detects faces in an image.
Working with the Cognitive Services API
We're ready to make our first call to the Cognitive Services API. In this step we'll learn how to take the images and send them to the API for processing, and how to interpret the results. Read this resource and the API guides it references.
Work through this tutorial to ensure you know how to make calls to the Cognitive Services API.
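The request-and-interpret cycle described in this step can be sketched outside of the UWP sample. The following Python sketch builds (but does not send) a Face API detect request and reads face rectangles out of a response. It assumes the regional endpoint pattern and the Ocp-Apim-Subscription-Key header used by Cognitive Services at the time of writing; the region, key, image bytes, and sample response are placeholders, not values from the sample application.

```python
import json
import urllib.request


def build_detect_request(region, subscription_key, image_bytes):
    """Build (but do not send) a Face API 'detect' request for raw image bytes."""
    # Assumption: regional endpoint pattern; check your own resource's endpoint.
    url = f"https://{region}.api.cognitive.microsoft.com/face/v1.0/detect"
    headers = {
        "Ocp-Apim-Subscription-Key": subscription_key,
        "Content-Type": "application/octet-stream",
    }
    return urllib.request.Request(url, data=image_bytes, headers=headers, method="POST")


def face_rectangles(response_body):
    """Return (left, top, width, height) for each face in a detect response."""
    faces = json.loads(response_body)
    return [
        (f["faceRectangle"]["left"], f["faceRectangle"]["top"],
         f["faceRectangle"]["width"], f["faceRectangle"]["height"])
        for f in faces
    ]


# Hypothetical response body, shaped like a detect result with one face.
sample = ('[{"faceId": "example-id", "faceRectangle": '
          '{"top": 54, "left": 39, "width": 120, "height": 120}}]')
request = build_detect_request("westus", "YOUR_KEY", b"\x00")
print(request.get_method(), request.full_url)
print(face_rectangles(sample))
```

To actually call the service you would pass the request to urllib.request.urlopen with a real key and real image bytes, then hand the response body to face_rectangles.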

The Response Phase
Compiling and Running the Sample Application
We're ready to try out the application. We'll compile the application, open the settings, and enter the information for the Cognitive Services API. We'll then test the application with video, voice and handwriting.
Download the application at this link, and then open the "FamilyNotes.sln" file in Visual Studio. Compile the application, set your API key, and try it out!

Next Steps
Other Cognitive Services
Now you're ready to branch out and create your own applications using more of the Cognitive Services. Take a look at the list here and follow the samples they have.
Create and deploy more AI applications

Intermediate

Developing and deploying custom AI applications using cognitive services

Self-paced course 5h 15m

Introduction and setup
Introduction to Language Understanding Intelligent Service (LUIS) - Microsoft Cognitive Services
We'll start with an overview of the service, and how you can use it in your applications with this short video. It introduces the concepts we'll use in this Learning Path.
Explain to a colleague what LUIS is using an example scenario.

Install and configure Visual Studio
In this Learning Path, we'll stay in the Microsoft Azure Portal for most of the steps. To create a user-friendly UI, you'll need Visual Studio to create and run your Cognitive Services application using Node.js, Python, C# and other programming languages. The free Community Edition works fine.
Open Visual Studio and create a simple data application.
Creating Subscription Keys Via Azure
You can create and manage your keys on the My Apps page in the Azure Portal. You can always access this page by clicking My Apps on the top navigation bar of the LUIS web page. This reference covers how to do that.
Create your LUIS subscription keys in the Azure Portal.
Concepts - Intents
As you saw in the overview video, an Intent is a task or action the user wants to perform. It is a purpose or goal expressed in a user's input, such as booking a flight, paying a bill, or finding a news article. In this reference, you'll learn more about how to use Intents.
Think of a LUIS application that you would like to create. Begin planning your application by identifying your domain and your main intents.
Concepts - Entities
Now we're ready to understand Entities. Entities are important words in utterances that describe information relevant to the intent, and sometimes they are essential to it. Entities belong to classes of similar objects. This article covers the types of Entities and how you can use them. It also covers Hierarchical types.
Identify your main entities and map out any hierarchical, composite, or list entities for the application you would like to create.
Concepts - Utterances
Utterances are the input from the user. This reference covers training LUIS to extract intents and entities from Utterances.
Come up with 3-4 utterances for each of the intents that you identified.
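Before opening the LUIS portal, it can help to write the plan down as data. The sketch below is one possible way to do that in Python: each utterance carries its intent and the entity text it contains, and two small checks flag intents with too few examples and entity values that do not appear in the utterance. The intent names, entity names, and utterances are all hypothetical, and this is a planning aid, not a LUIS import format.

```python
from dataclasses import dataclass, field


@dataclass
class LabeledUtterance:
    text: str
    intent: str
    entities: dict = field(default_factory=dict)  # entity name -> text in the utterance


# Hypothetical plan for a travel/billing app; every name here is illustrative.
plan = [
    LabeledUtterance("Book a flight to Paris", "BookFlight", {"Destination": "Paris"}),
    LabeledUtterance("I need a flight from Cairo to Seattle", "BookFlight",
                     {"Origin": "Cairo", "Destination": "Seattle"}),
    LabeledUtterance("Get me a ticket to Cairo tomorrow", "BookFlight", {"Destination": "Cairo"}),
    LabeledUtterance("Pay my electricity bill", "PayBill", {"BillType": "electricity"}),
    LabeledUtterance("Pay the phone bill today", "PayBill", {"BillType": "phone"}),
]


def intents_needing_examples(utterances, minimum=3):
    """Return intents that have fewer than `minimum` example utterances."""
    counts = {}
    for u in utterances:
        counts[u.intent] = counts.get(u.intent, 0) + 1
    return sorted(intent for intent, n in counts.items() if n < minimum)


def misplaced_entities(utterances):
    """Return (utterance, entity name) pairs whose entity text is not in the utterance."""
    return [(u.text, name) for u in utterances
            for name, value in u.entities.items() if value not in u.text]


print(intents_needing_examples(plan))
print(misplaced_entities(plan))
```

With this plan, the first check reports that PayBill still needs another example to reach three utterances.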

Concepts - Features
LUIS uses machine learning in the background. A machine learning Feature is an attribute of data that affects the outcome you want. You add Features to a language model to provide hints about how to recognize the input that you want to label or classify. This article explains how Features help LUIS recognize both intents and entities.
Determine which phrase lists and patterns might be useful to include in your app.
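To make the idea of a feature as "an attribute of data" concrete, here is a minimal Python sketch of a phrase-list feature: for each token in an utterance it records whether the token belongs to a list of related phrases. This only illustrates the general concept; how LUIS represents and uses phrase lists internally is not described in this learning path, and the city list is hypothetical.

```python
def phrase_list_feature(utterance, phrase_list):
    """Pair each token with 1 if it appears in the phrase list, else 0."""
    phrases = {p.lower() for p in phrase_list}
    tokens = utterance.lower().split()
    return [(token, 1 if token in phrases else 0) for token in tokens]


cities = ["Seattle", "Paris", "Cairo"]  # hypothetical phrase list
print(phrase_list_feature("Book a flight to Cairo", cities))
```

A model that sees the extra indicator has a hint that "cairo" belongs to the same category as the other listed cities, even if that word was rare in the labeled examples.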

Concepts - Working with multiple languages
Multiple languages are available in LUIS, and more are being added. In this reference, you'll cover how to detect rare words, and when and how you need to add tokens.
Explain any words or characters you think should be added to a non-exchangeable phrase-list feature.

The Acquire Phase
Create the application and get API keys
Building a LUIS app ends up with an API that you can call from various languages, such as Node.js, Python, C# and more. The first step is creating the application itself, and getting the endpoint key. This article explains how to do that.
Create the application you planned in the Introduction and Setup section above using the subscription key you created earlier, and obtain the API key.
Add Intents
To take input from the user, we need to set up the Intents of the application first, as you learned about in the Introduction section above. This reference explains how to set those up.
Add your intents to LUIS, checking first to see if there are any prebuilt domains you could use.
Add Utterances and Search Constructs
Your users will submit natural language queries to your application. To help the machine learning system behind LUIS interpret them, you need to give it some examples to work with. In this reference, you'll create some Utterances, and then label them with Intents and Entities. You'll also set up the search and filter constructs.
Add all the utterances you came up with in Introduction and Setup. Next, label them with your intents and entities, adding additional intents/entities as needed.
Add Entities
The next step is to add your Entities, which can be grouped into Classes to show similarity in your topics. This article shows you how to do that, and how to leverage the pre-built Entities in the system.
Identify which entities could be satisfied by the prebuilt entities and add them. Next, add the remaining custom entities for your app.

Processing Phase
Using Features to Increase Model Performance
We now move to the Processing Phase of your application. Features help LUIS recognize both intents and entities, by providing hints to LUIS that certain words and phrases are part of a category or follow a pattern. When your LUIS app has difficulty identifying an entity, adding a feature and retraining the LUIS app can often help improve the detection of related intents and entities. This reference explains how to add them to your application.
Add the phrase lists and patterns you developed from the Introduction and Setup section.

Train your Natural Language Processing Application
When you "train" a model, LUIS generalizes from the examples you have labeled, and builds a model to recognize the relevant intents and entities in the future, thus improving its classification accuracy. This article covers that topic.
Train your app and observe its performance.

Retraining - or "Active Learning"
Once a Machine Learning model is trained, you can provide it with real-world data to retrain it and make it perform more accurately. LUIS contains a system that examines all the utterances that have been sent to it, and calls to your attention the ones that it would like you to label. This process is called "Active Learning", and you can implement it using these steps.
Address any errors you notice in testing by adjusting your intents/entities/utterances/features, retraining, and testing again.
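One common way to decide which utterances deserve a human label is uncertainty sampling: surface the queries where the model's top score was lowest. The Python sketch below illustrates that general strategy only. This learning path does not say how LUIS selects its suggestions, and the logged queries, scores, threshold, and limit are all hypothetical.

```python
def suggest_for_review(predictions, threshold=0.6, limit=5):
    """Return the utterances with the lowest top-intent scores below `threshold`."""
    uncertain = [p for p in predictions if p[2] < threshold]
    uncertain.sort(key=lambda p: p[2])  # least confident first
    return [p[0] for p in uncertain[:limit]]


# Hypothetical log of (utterance, predicted intent, score) from endpoint queries.
log = [
    ("book me a flight to cairo", "BookFlight", 0.97),
    ("pay the thing from last week", "PayBill", 0.41),
    ("what's the news on flights", "FindNews", 0.55),
    ("pay my electricity bill", "PayBill", 0.93),
]
print(suggest_for_review(log))
```

Labeling the suggested utterances and retraining targets the examples the model is least sure about, which is usually where a new label helps most.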

The Response Phase
Publish and Access the Application
We're ready to use your application. You can either publish your app directly to the Production Slot where end users can access and use your model, or you can publish to a Staging Slot where you can iteratively test your app to validate changes before publishing to the production slot. This reference will help you do that.
Publish and test your app by setting the URL parameter. After some testing, revisit the previous step and use active learning to view and label any additional suggested utterances. You can also perform interactive testing on current and published models.
Create a Complete LUIS Application with Python
This quick application puts everything together for you to use in a Python application as an example. You can see how everything you've learned is put into production all the way out to a full client application.
In the previous steps you created what should be a fairly robust LUIS app. Integrate it into a simple Python app.
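As a starting point for that integration, the sketch below builds a query URL for a published app and pulls the top intent and entities out of a response. It assumes the v2.0 endpoint shape (app ID in the path, subscription-key and q as query parameters, an optional staging flag, and a topScoringIntent field in the JSON); copy the exact URL from your app's publish page rather than relying on this pattern. The app ID, key, and sample response are placeholders.

```python
import json
from urllib.parse import urlencode


def build_luis_url(region, app_id, endpoint_key, query, staging=False):
    """Build the GET URL for a published LUIS app (assumed v2.0 endpoint shape)."""
    base = f"https://{region}.api.cognitive.microsoft.com/luis/v2.0/apps/{app_id}"
    params = {"subscription-key": endpoint_key, "q": query}
    if staging:
        params["staging"] = "true"  # query the Staging Slot instead of Production
    return base + "?" + urlencode(params)


def top_intent_and_entities(response_body):
    """Return (intent, score, [(entity type, entity text), ...]) from a response."""
    result = json.loads(response_body)
    top = result["topScoringIntent"]
    entities = [(e["type"], e["entity"]) for e in result.get("entities", [])]
    return top["intent"], top["score"], entities


# Hypothetical response body for illustration only.
sample = json.dumps({
    "query": "book a flight to cairo",
    "topScoringIntent": {"intent": "BookFlight", "score": 0.97},
    "entities": [{"entity": "cairo", "type": "Destination",
                  "startIndex": 17, "endIndex": 21, "score": 0.9}],
})
print(build_luis_url("westus", "YOUR_APP_ID", "YOUR_KEY", "book a flight to cairo", staging=True))
print(top_intent_and_entities(sample))
```

In a real client you would fetch the URL with urllib.request.urlopen and pass the response body to top_intent_and_entities.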

Use a LUIS Application with Cortana
While there are advantages to using prebuilt domains to extend your LUIS application, you can also use the Cortana system in Windows 10 to implement LUIS. This article shows you how.
Detail 2-3 scenarios where the Cortana prebuilt app would be useful, and 2-3 scenarios where it would fail.

Review the LUIS Forums
Have questions? There's a good chance someone else has the same question. Or maybe you're ready to help someone else.
Review the forums and answer at least one question.

Advanced

Integrating advanced machine learning services into AI applications

Self-paced course 5h 30m

Overview
Introduction to Deep Learning for Computer Vision
This lecture, the first from a course at Stanford, provides a great introduction to computer vision. It introduces the field and describes the approach we will be taking in this Learning Path.
Explain at a high level the success of convolutional neural networks in computer vision.

Setup
Deploy an Azure DSVM
In order to do deep learning, you'll need a system equipped to handle heavy computation. The Azure DSVM will be our primary workspace for training deep learning algorithms. It is already equipped with the drivers we need to get GPU acceleration, CNTK, and a complete Python environment for writing your application. Make sure you select an NC6, NC12 or NC24 SKU in order to get GPU access.
Launch a Jupyter notebook, and do the data science walkthrough on your brand-new DSVM.

101-Course
edX - Deep Learning Explained
This is an introductory course to CNTK and deep learning, covering the core concepts in CNTK and deep learning. This course runs over a 6-week duration, but each week can be completed in a few hours. While we highly recommend this course, this learning path also provides links to tutorials you can do at your own pace.
Complete the first four modules of the course

Concepts
Test CNTK on Synthetic Data
This first tutorial will give you an introduction to CNTK's Python interface and the process of using CNTK for machine learning. While the data here is synthetic, you'll get a firm foundation of the core concepts of CNTK, including data loading, training and evaluation.
Run through the entire notebook and change the number of output classes
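The load, train, and evaluate cycle that the tutorial walks through can be previewed without any framework. The sketch below is plain Python, not CNTK code: it generates two synthetic clusters, fits a logistic regression with minibatch gradient descent, and measures accuracy on held-out data. The cluster centers, learning rate, minibatch size, and number of sweeps are arbitrary choices made for this illustration.

```python
import math
import random


def make_data(n, rng):
    """Two Gaussian clusters in 2-D: label 0 near (-2, -2), label 1 near (2, 2)."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 2.0 if label == 1 else -2.0
        data.append(((rng.gauss(center, 1.0), rng.gauss(center, 1.0)), label))
    return data


def predict(weights, bias, x):
    """Probability of label 1 under the logistic model."""
    z = weights[0] * x[0] + weights[1] * x[1] + bias
    return 1.0 / (1.0 + math.exp(-z))


def train(data, sweeps=10, minibatch_size=16, learning_rate=0.1):
    """Minibatch gradient descent on the cross-entropy loss."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(sweeps):
        for start in range(0, len(data), minibatch_size):
            batch = data[start:start + minibatch_size]
            grad_w, grad_b = [0.0, 0.0], 0.0
            for x, y in batch:
                error = predict(weights, bias, x) - y
                grad_w[0] += error * x[0]
                grad_w[1] += error * x[1]
                grad_b += error
            weights[0] -= learning_rate * grad_w[0] / len(batch)
            weights[1] -= learning_rate * grad_w[1] / len(batch)
            bias -= learning_rate * grad_b / len(batch)
    return weights, bias


def accuracy(weights, bias, data):
    """Fraction of samples whose thresholded prediction matches the label."""
    correct = sum((predict(weights, bias, x) >= 0.5) == (y == 1) for x, y in data)
    return correct / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable
train_set, test_set = make_data(400, rng), make_data(100, rng)
weights, bias = train(train_set)
print(f"test accuracy: {accuracy(weights, bias, test_set):.2f}")
```

The CNTK notebook follows the same three stages, with the framework handling the model definition and the gradient computation for you.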

OCR Recognition
Ingest and Explore Data with CNTK and Python's Scientific Packages
Now that you understand CNTK's core Python API for data processing and model training/evaluation, you are ready to try out your skills on a real dataset. The MNIST dataset is sometimes referred to as the "Hello, World!" of the machine learning world. Understanding this dataset with CNTK is a great way of learning about deep learning and CNTK.
Explore the dataset in different ways: try shuffling the data, apply transformations to see how they distort the images, and add noise to see how that impacts generalization error

Logistic Regression with MNIST
This tutorial continues our OCR problem using a simple logistic classifier. While not yet a "deep" neural network, this approach nonetheless gets an accuracy of 93%, and shows the standard workflow of CNTK: data reading, data processing, creating a model, learning, and evaluation.
Adjust the following parameters and see how they affect your test accuracy: minibatch size, number of sweeps, network architecture. Can you explain why these parameters affected the score the way they did?
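One reason minibatch size and number of sweeps matter is that together they determine how many parameter updates the model receives. The short calculation below uses the 60,000 images in the MNIST training set and assumes the final partial minibatch of each sweep is still used for an update; the tutorial's own bookkeeping may differ.

```python
import math


def minibatch_updates(num_samples, minibatch_size, sweeps):
    """Parameter updates performed when partial final minibatches are kept."""
    return sweeps * math.ceil(num_samples / minibatch_size)


# MNIST has 60,000 training images.
for size in (32, 64, 512):
    print(size, minibatch_updates(60000, size, sweeps=10))
# prints: 32 18750, then 64 9380, then 512 1180
```

Smaller minibatches give more (but noisier) updates per sweep, while larger ones give fewer, smoother updates, so the same number of sweeps can lead to quite different test accuracy.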

Fully Connected Neural Network with Two Hidden Layers
Finally, a "deep" net! Here you will implement a fully connected neural network with two hidden layers. Your test score should go up to nearly 99%!
Try adding more hidden units. At what point does "overfitting" occur?

Convolutional Neural Network for MNIST
Convolutional neural networks (CNNs) are very similar to the perceptron and multi-layer neural network of the previous sections. The main difference is the assumptions they make about the data source. They explicitly assume the data source is an image, and constrain the network architecture in a sensible way, inspired in part by our understanding of the visual cortex. Rather than connecting each previous layer's neurons to every neuron in the next layer, CNNs use "local connections", where only a small input region (the receptive field) is connected to each neuron in the next layer. Here you'll implement a CNN on the MNIST dataset, and if you're careful, you could even crack past a 99% accuracy rate.
Try different activation functions and strides, and see how they impact performance.
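To see how local connections and strides shape a layer's output, the following plain-Python sketch (not CNTK code) computes the output size of a convolution and applies a small kernel to an image. It assumes square images and kernels and no padding in conv2d, and like most deep learning frameworks it does not flip the kernel.

```python
def conv_output_size(input_size, kernel_size, stride=1, padding=0):
    """Number of output positions along one dimension."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


def conv2d(image, kernel, stride=1):
    """Slide a square kernel over a square image; each output sees one local region."""
    k = len(kernel)
    out = conv_output_size(len(image), k, stride)
    result = []
    for i in range(out):
        row = []
        for j in range(out):
            total = 0
            for di in range(k):
                for dj in range(k):
                    total += image[i * stride + di][j * stride + dj] * kernel[di][dj]
            row.append(total)
        result.append(row)
    return result


# A 28x28 MNIST image with a 5x5 kernel and stride 2 yields a 12x12 output.
print(conv_output_size(28, 5, stride=2))

image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(conv2d(image, [[1, 1], [1, 1]], stride=2))  # [[14, 22], [46, 54]]
```

Increasing the stride shrinks the output and so reduces computation, at the cost of sampling the image more coarsely.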

Transfer Learning
Image Recognition with Transfer Learning
In the previous sections, you learned about CNTK's core API and trained a neural network for OCR recognition. While this worked great for our problem, our data was not very large, or very varied. It consisted entirely of greyscale images with only 10 categories. In real-world examples, however, you'll frequently encounter images with far greater diversity and many more categories. Rather than training a hugely deep architecture for each new problem you encounter, transfer learning allows you to reuse an existing architecture trained on a large image corpus (usually ImageNet) and only retrain the last few layers to "adapt" to your new dataset.
Try this notebook with a different pretrained model, say ResNet-50, and a different dataset

Neural Style Transfer
While not directly related to transfer learning, this module will show you how to utilize the optimization procedure we have used so far to optimize a loss function that leads to a result that is close to both the content of one image and the style of the other image. Here, you'll take a famous painting by van Gogh, and synthesize its texture on any image of your choosing!
Try different images and different pre-trained network architectures.
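The loss being optimized can be sketched in a few lines. The version below follows the widely used formulation in which style is compared through Gram matrices of feature maps; the notebook's exact layers, normalization, and weights may differ, and the tiny hand-written feature lists stand in for activations that would really come from a pretrained network.

```python
def gram_matrix(features):
    """features: list of channels, each a flat list of activations."""
    n = len(features[0])
    return [[sum(a * b for a, b in zip(fi, fj)) / n for fj in features]
            for fi in features]


def mean_squared_difference(a, b):
    """Mean of squared element-wise differences between two nested lists."""
    flat_a = [v for row in a for v in row]
    flat_b = [v for row in b for v in row]
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)


def style_transfer_loss(generated, content, style,
                        content_weight=1.0, style_weight=1.0):
    """Weighted sum of content loss (features) and style loss (Gram matrices)."""
    content_loss = mean_squared_difference(generated, content)
    style_loss = mean_squared_difference(gram_matrix(generated), gram_matrix(style))
    return content_weight * content_loss + style_weight * style_loss


# Hypothetical 2-channel feature maps with 2 activations each.
content = [[1.0, 0.0], [0.0, 1.0]]
style = [[1.0, 1.0], [1.0, 1.0]]
print(style_transfer_loss(content, content, style))  # only the style term remains
```

Changing the two weights shifts the balance between preserving the content image and matching the painting's texture.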

Object Detection
Fast R-CNN Object Detection Tutorial
You've understood the core concepts of CNTK, image classification, transfer learning, and optimization of loss functions. Now you're going to apply all those to the task of object detection! In particular, you will use an approach called Fast R-CNN. The basic method of R-CNN is to take a deep Neural Network which was originally trained for image classification using millions of annotated images and modify it for the purpose of object detection.
Try this same approach on the MS COCO dataset, a far richer and more challenging dataset for object detection.

Operationalization
Embarrassingly Parallel Image Classification on pySpark with HDInsight
Now that you have trained your deep neural network for the tasks of image classification and object detection, the question is how to operationalize it. In this module, you'll take an image classification model and deploy it on HDInsight Spark, where it can score thousands of images simultaneously, using the distributed nature of Spark and the high throughput of Azure Data Lake.
Adapt the model to use the Fast R-CNN model you trained in the previous section.
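Scoring is "embarrassingly parallel" because each image can be classified independently of every other image. The sketch below shows that property with Python's standard thread pool standing in for Spark workers; it is not PySpark code, and score_image is a hypothetical stand-in for a trained model.

```python
from concurrent.futures import ThreadPoolExecutor


def score_image(pixels):
    """Hypothetical stand-in for a trained model: label by mean brightness."""
    return "bright" if sum(pixels) / len(pixels) >= 128 else "dark"


def score_all(images, workers=4):
    """Score every image independently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(score_image, images))


images = [[0, 10, 20], [250, 240, 255], [128, 128, 128], [5, 5, 200]]
print(score_all(images))  # ['dark', 'bright', 'bright', 'dark']
```

Because no image depends on another, adding workers (or Spark nodes) increases throughput without changing the results.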

Deploy CNTK on a Raspberry Pi
Now you'll take your CNTK architecture and deploy it on a simple Raspberry Pi. The Microsoft Embedded Learning Library allows you to build and deploy machine-learned pipelines onto embedded platforms, such as the Raspberry Pi, Arduino, micro:bits, and other microcontrollers. We'll take our image classification model and put it into a Raspberry Pi where it can do real-time classification using a video feed.
Decrease the size of your network architecture and see how that improves the latency of your real-time classification device.