In this post I will show you, step by step, how to create a basic two-class machine learning experiment using breast cancer data. This post is part of a series of two-class prediction examples to help you learn how to create experiments using Azure Machine Learning Studio.
For a more comprehensive introduction to data science and Azure Machine Learning Studio, check out Data Science and Machine Learning Essentials on MVA.
To use Azure Machine Learning Studio from your Azure account, you need a Machine Learning workspace. This workspace contains the tools you need to create, manage, and publish machine learning experiments.
To create a workspace, sign-in to your Microsoft Azure account.
Azure Machine Learning Studio is part of Microsoft Azure. Microsoft Azure is a paid service, but there are a number of programs or trials you can use to explore its capabilities.
1. Navigate to the Microsoft Azure portal portal.azure.com and log in using your Microsoft account credentials
2. In the Microsoft Azure portal create a new machine learning workspace. Select + New | Data + Analytics | Machine Learning
You will be redirected to the original Azure portal to enter the details for your machine learning workspace.
1. Enter a WORKSPACE NAME for your workspace
NOTE: Later, you can share the experiments you're working on by inviting others to your workspace. You can do this in Machine Learning Studio on the SETTINGS page. You just need the Microsoft account or organizational account for each user.
2. Specify the Azure LOCATION
3. Select an existing Azure STORAGE ACCOUNT or select Create a new storage account to create a new one and give your new storage account a name.
4. Select CREATE AN ML WORKSPACE.
After your Machine Learning workspace is created, you will see it listed on the portal under MACHINE LEARNING. At the time this post was written, Machine Learning workspaces are always displayed in the Azure Classic portal, even if you select the menu option from the new portal to create them. At some point the new portal will be updated so you can list them without going to the Classic view.
Once your Machine Learning workspace is created, select your workspace from the list and then select Sign-in to ML Studio to access the Machine Learning Studio so you can create your first experiment!
When prompted to take a tour select Not Now. You may want to take a tour later when you are exploring this tool on your own.
At the bottom of the screen select +NEW
Change the title at the top of the experiment to read “Breast Cancer Experiment”
The Wisconsin Breast Cancer data set is not one of the sample data sets already loaded in Azure Machine Learning Studio. The data used in this example comes from the University of Wisconsin Hospitals and was provided by Dr William H. Wolberg. You can download the dataset file breast-cancer-wisconsin.data here.
Once you have downloaded the file you will need to create a dataset in Azure Machine Learning Studio for the breast-cancer-wisconsin.data file.
Select + NEW at the bottom of the screen
Select DATASET | FROM LOCAL FILE
1. Select the DATA TO UPLOAD by browsing to select the csv file you downloaded containing the breast cancer data.
2. Enter NAME FOR THE NEW DATASET
3. Specify the TYPE FOR THE NEW DATASET as Generic CSV File with a header (.csv). This indicates we have a csv file and that the first row of the file contains the headers for the data columns
4. Enter a description of the dataset to help you remember the dataset contents
5. Select the checkmark to start uploading the data into a dataset
Expand Saved Datasets | My Datasets and drag your newly created Breast cancer dataset to the experiment
Right click on the dataset on your worksheet and select dataset | visualize from the pop-up menu, then explore the dataset by clicking on different columns. It’s essential in Machine Learning to be familiar with your data. This dataset contains information about characteristics of tumours and whether those tumours were benign or malignant.
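If you would like to peek at the data outside the Studio as well, a rough Python sketch like the one below counts how many rows fall into each Class value. The rows are made up for illustration and only three feature columns are shown; the header names are an assumption based on the public description of the dataset, so check them against your own file.

```python
import csv
import io
from collections import Counter

# Made-up rows for illustration only; header names are assumed, not taken
# from the file used in this post.
SAMPLE_CSV = """Sample code number,Clump Thickness,Uniformity of Cell Size,Class
1001,5,1,2
1002,8,10,4
1003,1,1,2
1004,7,4,4
1005,3,1,2
"""


def class_counts(csv_text, label_column="Class"):
    """Count how many rows carry each value of the label column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[label_column] for row in reader)


counts = class_counts(SAMPLE_CSV)
print(counts)
```

A heavily lopsided count would be a hint that accuracy alone may be a misleading measure later on.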
We are going to use Machine Learning to create a model that predicts whether a tumour is benign or malignant
Some of the columns in the dataset are not meaningful for predicting whether a tumour is malignant. For example, Sample Code number is just a number assigned to each sample.
Let’s select only the significant features in our dataset to use in our machine learning experiment.
Type “Select” into the search bar and drag the Select Columns in Dataset task to the workspace. Connect the output of your dataset to the input of the Select Columns in Dataset task.
The Select Columns in Dataset task allows you to specify which columns in the data set you think are significant to a prediction (i.e. your features). You need to look at the data in the dataset and decide which columns represent data that you think will affect whether or not a tumour is malignant. You also need to select the column you want to predict. In this case we are going to try to predict the value of Class. This will contain a value of 2 if the tumour is benign and a value of 4 if the tumour is malignant.
Click on the Select Columns in Dataset task. On the properties pane on the right hand side, select Launch column selector. Select the columns you think affect whether or not a tumour is malignant, as well as the column we want to predict: Class. In the following screenshot, I selected all the columns except Sample Code number.
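To make the idea of column selection concrete, here is a small Python sketch of the same step done by hand: it drops the identifier column and keeps the features plus Class. The helper name `select_columns` and the sample rows are hypothetical and exist only for this illustration; the Studio task does this for you.

```python
# Made-up rows; the column names are assumed for illustration.
rows = [
    {"Sample code number": "1001", "Clump Thickness": "5", "Class": "2"},
    {"Sample code number": "1002", "Clump Thickness": "8", "Class": "4"},
]

# Class is 2 for benign and 4 for malignant, as described in this post.
CLASS_NAMES = {"2": "benign", "4": "malignant"}


def select_columns(rows, excluded):
    """Return copies of the rows without the excluded columns."""
    return [
        {name: value for name, value in row.items() if name not in excluded}
        for row in rows
    ]


selected = select_columns(rows, excluded={"Sample code number"})
for row in selected:
    print(row, "->", CLASS_NAMES[row["Class"]])
```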
Whenever we execute machine learning experiments, we use some of our data to train the model and we put some data aside to test the model. In Azure Machine Learning Studio, we use the Split Data task to put aside data for testing.
The Split Data task allows us to divide up our data. We need some of the data to try and find patterns, and we need to save some of the data to test if the model we create successfully makes predictions. Traditionally you will split the data 80/20 or 70/30.
Type “split” into the search bar and drag the Split Data task to the workspace. Connect the output of the Select Columns in Dataset task to the input of the Split Data task.
Click on the Split Data task to bring up properties, specify .8 as the Fraction of rows in the first output
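For intuition, an 80/20 split can be sketched in a few lines of Python. This is only an illustration of the idea, assuming a randomized split with a fixed seed; `split_rows` is a hypothetical helper, not something the Studio exposes.

```python
import random


def split_rows(rows, fraction=0.8, seed=42):
    """Shuffle a copy of the rows and cut it into two parts.

    The first part holds `fraction` of the rows (training data),
    the second part holds the remainder (test data).
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    cut = int(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


train, test = split_rows(range(100), fraction=0.8)
print(len(train), len(test))
```

Shuffling before cutting matters when the file is sorted in some way, otherwise the test set could end up containing only one kind of record.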
Now we can get Azure Machine Learning Studio to train the model so we can find the patterns in the historical data to make predictions for new records.
Type “train model” into the search bar. Drag the train model task to the workspace. Connect the first output (the one on the left) of the Split Data task to the rightmost input of the Train model task. This will take 80 % of our data and use it to train/teach our model to make predictions.
We need to tell the train model task which column we are trying to predict with our model. In our case we are trying to predict the value of the column Class which indicates if a tumour is malignant or benign.
Click on the Train Model task. In the properties window select Launch Column Selector. Select the column Class.
If you are a data scientist who creates your own algorithms, you could now import your own R code to try and analyze the patterns. But, we can also use one of the existing built-in standard algorithms.
Different types of machine learning use different algorithms. Since we are trying to predict if an output has one of two values, we want to use a two-class algorithm to train our model. Two-class algorithms are used to predict outcomes that can only have two possible values. In our case that is a value of 2 or 4, which indicates whether the tumour is benign or malignant.
Type “two-class” into the search bar. You will see a number of different classification algorithms listed. Each algorithm has its advantages and disadvantages. Check out the Azure Machine Learning Studio Cheat Sheet for a quick reference guide to algorithm selection. I am going to select the Two-Class Decision Forest to train my model. Select one of the two-class algorithms and drag it to the workspace.
Connect the output of the Algorithm task to the leftmost input of the train model task.
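If you are curious what "training" means for a two-class problem, the sketch below trains the simplest possible classifier: a single threshold on one feature, sometimes called a decision stump. To be clear, this is not the Two-Class Decision Forest used in the experiment, just a much smaller stand-in to show the idea of learning a rule from labelled rows. The data and function names are made up for illustration.

```python
def train_stump(values, labels):
    """Pick the threshold on one feature that misclassifies the fewest rows.

    The rule is: predict 4 (malignant) when value > threshold, else 2 (benign).
    """
    best_threshold, best_errors = None, len(labels) + 1
    for threshold in sorted(set(values)):
        errors = sum(
            (4 if value > threshold else 2) != label
            for value, label in zip(values, labels)
        )
        if errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold


def predict(threshold, value):
    """Apply the learned rule to a single feature value."""
    return 4 if value > threshold else 2


# Made-up training data: one feature and the Class label (2 or 4).
values = [1, 2, 3, 3, 7, 8, 9, 10]
labels = [2, 2, 2, 2, 4, 4, 4, 4]

threshold = train_stump(values, labels)
print(threshold, predict(threshold, 9), predict(threshold, 2))
```

A decision forest builds many trees, each made of rules like this one over several features, and combines their votes.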
After the model is trained, we need to see how well it makes predictions, so we need to score the model by having it test against the 20% of the data we split to our second output using the Split Data task.
Type “score” into the search bar and drag the Score Model task to the workspace. Connect the output of Train Model to the left input of the Score model task. Connect the right output of the Split Data task to the right input of the Score Model task as shown in the following screenshot.
Now we need a report on our test results.
Type “evaluate” into the search bar and drag the Evaluate Model task to the bottom of the workspace. Connect the output of the Score model task to the left input of the Evaluate Model task.
You are now ready to run your experiment!
Press Run on the bottom toolbar. You will see green checkmarks appear on each task as it completes. When the entire experiment is completed you can check how well your model makes predictions.
To see your test results, right click on the evaluate model task and select “ Evaluation results | Visualize”.
The closer the graph is to a straight diagonal line the more your model is guessing randomly. You want your line to get as close to the upper left corner as possible.
If you scroll down you can see the detailed results. AUC (Area Under Curve) is a great overall indicator of your model performance. The closer AUC is to 1, the better your model is making predictions.
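One way to read AUC is as the chance that a randomly chosen malignant sample gets a higher score than a randomly chosen benign one. The sketch below computes it that way from a handful of made-up scores. Treating Class 4 as the positive class is an assumption for this example, and the Studio may compute the value differently internally.

```python
def auc(scores, labels, positive=4):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up scored probabilities and true Class values.
scores = [0.9, 0.8, 0.4, 0.3, 0.2]
labels = [4, 2, 4, 2, 2]

result = auc(scores, labels)
print(round(result, 3))
```

A model that guesses randomly lands around 0.5 on this measure, which matches the diagonal line on the chart.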
You can also see the number of false and true positive and negative predictions
You want high values for True positives and True negatives, you want low values for False Positives and False negatives.
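The four counts can be turned into the usual summary numbers with a little arithmetic. Here is a Python sketch using made-up predictions, again assuming Class 4 (malignant) is the positive class; the function names are hypothetical.

```python
def confusion_counts(actual, predicted, positive=4):
    """Count true/false positives and negatives for a two-class problem."""
    counts = {"TP": 0, "FP": 0, "TN": 0, "FN": 0}
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return counts


def summarize(counts):
    """Derive accuracy, precision and recall from the four counts."""
    tp, fp, tn, fn = counts["TP"], counts["FP"], counts["TN"], counts["FN"]
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up true and predicted Class values.
actual = [4, 4, 4, 2, 2, 2, 2, 2]
predicted = [4, 4, 2, 2, 2, 2, 4, 2]

counts = confusion_counts(actual, predicted)
metrics = summarize(counts)
print(counts, metrics)
```

For a medical example like this one, a false negative (a malignant tumour predicted as benign) is usually the more costly mistake, so recall deserves a close look.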
Once you have trained a model with a satisfactory level of accuracy, how do you use it? One of the great things about Azure Machine Learning Studio is how easy it is to take your model and deploy it as a web service. Then you can simply have a website or app call the web service, pass in a set of values for the selected columns, and the web service will return the predicted value and confidence of the result.
Once you've trained your model, you're ready to use it to make predictions for new data. To do this, you convert your training experiment into a predictive experiment. By converting to a predictive experiment, you're getting your trained model ready to be deployed as a web service. Users of the web service will send input data to your model and your model will send back the prediction results.
To convert your training experiment to a predictive experiment, click Run at the bottom of the experiment canvas, then select Set Up Web Service
Select Set Up Web Service, then select Predictive Web Service.
This will create a new predictive experiment for your web service. The predictive model doesn’t have as many components as your original experiment, you will notice a few differences:
Delete the connection from the Web input to the Select Columns in Dataset task and redraw the connection from the Web input to the Score Model task. If you leave the web input connected to Select Columns in Dataset, the web service will prompt you for values for all the data columns even though we don’t use them to make our prediction. If you have the web input connected to the Score Model task directly, the web service will only expect the data columns we selected in our Select Columns in Dataset task, which we determined are relevant for making predictions.
For more details on how to do this conversion, see Convert a Machine Learning training experiment to a predictive experiment
Now that the predictive experiment has been sufficiently prepared, you can deploy it as an Azure web service. Using the web service, users can send data to your model and the model will return its predictions.
To deploy your predictive experiment, click Run at the bottom of the experiment canvas. After it runs successfully, select Deploy Web Service. The web service is set up and you are placed in the web service dashboard.
Select the Test link in the web service dashboard. A dialog pops up to ask you for the input data for the service. These are the columns expected by the scoring experiment. Enter a set of data and then select OK. The results generated by the web service are displayed at the bottom of the dashboard.
You may have to scroll down to see all the fields you need to enter
The results of the test will appear at the bottom of the screen.
Select Details to see the full record returned
You will see the record you entered followed by the predicted output and the probability (columns scored label and scored probabilities respectively). In the screenshot below there is a .375 (37.5%) probability my imaginary tumour is benign (predicted outcome for Class is 2). The value you see returned will vary depending on the data you specified.
Once you deploy your web service from Machine Learning Studio, you can send data to the service and receive responses programmatically.
The dashboard provides all the information you need to access your web service. For example, the API key is provided to allow authorized access to the service, and API help pages are provided to help you get started writing your code. Select Request/Response if you are going to call the web service passing one record at a time. Select Batch Execution if you are going to pass multiple records to the web service at a time.
On the API help page select Sample code
You will be presented with code samples for calling the web service from C#, Python and R
Replace the apiKey of abc123 with the API key displayed in the dashboard of your web service.
Replace the values with the values you wish to pass into the web service and you can now call the web service from your code to retrieve predictions!
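As a rough idea of what such a call looks like in Python, the sketch below builds a Request/Response call using only the standard library. Treat everything specific here as an assumption: the URL is a placeholder, the column names are examples, and the JSON layout (`Inputs`, `input1`, `ColumnNames`, `Values`, `GlobalParameters`) is what I would expect from the generated samples, so copy the real values from the Sample code tab of your own API help page. The line that actually sends the request is commented out so the snippet can run without a deployed service.

```python
import json
import urllib.request

# Placeholders: replace both with the values shown on your web service dashboard.
SERVICE_URL = "https://example.invalid/execute?api-version=2.0"
API_KEY = "abc123"


def build_request(url, api_key, column_names, values):
    """Build a POST request carrying one record for scoring.

    The body layout is an assumption for illustration; the generated
    sample code on the API help page is the authoritative version.
    """
    body = {
        "Inputs": {
            "input1": {"ColumnNames": column_names, "Values": [values]},
        },
        "GlobalParameters": {},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


request = build_request(
    SERVICE_URL,
    API_KEY,
    column_names=["Clump Thickness", "Uniformity of Cell Size"],
    values=["5", "1"],
)

# Uncomment once SERVICE_URL and API_KEY point at a real service:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```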
For more information about accessing a Machine Learning web service, see How to consume a deployed Azure Machine Learning web service.
Congratulations, you have created a machine learning experiment and a web service to make predictions based on your trained model!