Guest post by Zurab Murvanidze, Microsoft Student Partner, University College London
My name is Zurab Murvanidze. I am a first-year computer science student at UCL. I love learning about technology and have a deep interest in machine learning, data science, quantum computing and artificial intelligence.
During this academic year, I won a few competitions:
Porticode 2.0 Hackathon at UCL – 1st place, ICHack18 – best mobile app, TechXLR8 London – Urban-X Startup accelerator prize. I like developing applications and games in my spare time, and in this article I would love to share my experience with ML.NET.
This article covers the basics of machine learning, introduces ML.NET and shows how to create and train machine learning models. It also demonstrates how to use a trained model in an ASP.NET Core web application.
I hope that once you are familiar with this technology you will come up with many creative ways to apply machine learning to different problems.
ML.NET is an open-source framework that allows developers to easily implement their own custom machine learning models. You do not need any background in machine learning, as this article covers the basics; however, it helps if you are familiar with C# and other .NET libraries.
Make Machine Learning model:
Choose problem and Data Set:
Before we start coding, we need to know what we are trying to achieve, what data we are using and how we can interpret it to get the desired result.
In our application, we will try to predict the age of abalones, a kind of marine snail. The traditional way to determine the age of an abalone involves cutting the shell through the cone, staining it and counting the number of rings under a microscope (the number of rings corresponds to the age).
We are not going to use the traditional method, firstly because we don’t have a microscope and secondly because we can use a machine learning approach instead.
The method is pretty straightforward: we take a data set that contains physical measurements of abalones and train a model that learns the correlations between an abalone’s physical measurements and its age. We then make predictions with the trained model.
To get the data set, go to https://archive.ics.uci.edu/ml/machine-learning-databases/abalone/ and download the abalone.data file. Also check out abalone.names for general information about the data set.
Making Console Application to set up the model:
Open Visual Studio 2017 and select New Project > .NET Core > Console Application.
Press Next, select “.NET Core 2.0” as the target framework and press Next again. In the name field write Abalone_Age and press Create.
We need to add the ML.NET NuGet package to our solution, so open Solution Explorer, right-click the project Abalone_Age > Add NuGet Packages…, type Microsoft.ML in the search field, select it and press Add Package.
To keep things organized, create a folder for the data: right-click the project “Abalone_Age”, select Add > New Folder and call it data. Now add the downloaded file to the data folder as abalone.data.txt.
Our console application is organized and ready for further development, but we need to know how to use the data we have. As you have probably read in the description of abalone.data, each row provides the sex, length, diameter, height, whole weight, shucked weight, viscera weight, shell weight and number of rings of an abalone. This article treats the ring count as the age (abalone.names notes that the age in years is the ring count plus 1.5). We need to create a class that represents all of these features and then decide which of them are needed to predict the age. We must also create a separate class for the value we are predicting, in our case Age.
In Solution Explorer, right-click the project, choose Add > New File… > Empty C# class and call it Abalone.
The class Abalone is the input data class. It includes all the features provided, and the Column attribute on each field specifies the index of the corresponding column in the data set.
AbaloneAgePrediction is the class that represents the predicted result. Its ColumnName attribute refers to the column “Score”, which is a special column in ML.NET: the model writes its predicted values into that column.
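As a minimal sketch, the two classes described above could look like this. It assumes an early preview of ML.NET in which the Column and ColumnName attributes live in the Microsoft.ML.Runtime.Api namespace. The field names are assumptions chosen for illustration, and the column order follows abalone.names.

```csharp
using Microsoft.ML.Runtime.Api;

namespace Abalone_Age
{
    // Input class: one field per column of abalone.data.txt.
    // The string passed to Column is the zero-based column index in the file.
    public class Abalone
    {
        [Column("0")] public string Sex;
        [Column("1")] public float Length;
        [Column("2")] public float Diameter;
        [Column("3")] public float Height;
        [Column("4")] public float WholeWeight;
        [Column("5")] public float ShuckedWeight;
        [Column("6")] public float VisceraWeight;
        [Column("7")] public float ShellWeight;

        // Last column of the file: the ring count, used as the age in this article.
        [Column("8")] public float Age;
    }

    // Output class: the regression result is read from the "Score" column.
    public class AbaloneAgePrediction
    {
        [ColumnName("Score")] public float Age;
    }
}
```

The numeric fields are declared as float because that is the numeric type this preview API works with.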
A machine learning model has many parts that need to be tied together in order to execute and produce the desired output. This chain of steps, from loading the data to running the learning algorithm, is called a pipeline. Since every step feeds the next one, the pipeline largely determines how accurate the resulting model is.
Now that we have created the Abalone class and know what features it has, we can start coding the pipeline for our ML model.
Go to the Program.cs file and add the additional “using” statements at the top of the file.
Now declare the Train() function.
The declaration of Train() tells the compiler that our prediction model takes the class Abalone as input and outputs the predicted value in the AbaloneAgePrediction class, more specifically in the field of that class that carries the ColumnName “Score”.
Inside the function Train() we must define the learning pipeline.
The pipeline specifies where the data comes from, which features are included in or excluded from the training process, and more. It can be hard to understand at first what each piece of code in this function does, so I will go through all of them to make sure everything is clear.
TextLoader takes a string parameter that indicates the path to abalone.data.txt, so we can define a constant string _datapath in the code and pass it to TextLoader.
I will also define a constant _modelpath, which indicates where to write the model after training. (We will use it later, so make sure you define it.)
.CreateFrom<Abalone>(separator: ',') – the separator tells the loader how the columns are separated from each other. In our data set the values are separated by commas, so we have to say so.
ColumnCopier – when the model is trained, the values in the column Label are treated as the correct values, and as we want to predict Age we copy the Age column into the Label column.
CategoricalOneHotVectorizer – the values in the column Sex are letters (M, F or I for infant), but the algorithm requires numeric values. This transform encodes each category as a numeric vector, which makes the column suitable for training the model.
ColumnConcatenator – this is where we tell the pipeline which features to use to predict the age of the abalone. We must decide how relevant each feature is and, based on that, whether to include it. Only the features whose names are listed here take part in the learning process. As you can see, I have excluded Shucked Weight and Viscera Weight: even though they are relevant and would probably make the prediction a bit more accurate, they are not easy to obtain. Our aim is to make predictions based on easily obtainable measurements, so we can quickly check how old the abalone is. To make things even easier, we can also exclude the Sex feature, as it is not easy to tell whether an abalone is male or female, and it doesn’t have a massive influence on our prediction.
FastTreeRegressor() – finally, we define the algorithm that we want to use in the learning process. I will not go into too much detail here, but if you want to find out more about tree regression algorithms, you can read this article.
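To make the CategoricalOneHotVectorizer step above a bit more concrete, here is a small stand-alone illustration of one-hot encoding in plain C#. It is not ML.NET’s implementation, only a sketch of the idea: each category becomes a vector with a single 1. The class and method names (OneHotDemo, Encode) are made up for this example.

```csharp
using System;
using System.Collections.Generic;

public static class OneHotDemo
{
    // Returns a vector that is 1 at the position of `value` in `categories`
    // and 0 everywhere else.
    public static float[] Encode(string value, IReadOnlyList<string> categories)
    {
        var vector = new float[categories.Count];
        for (int i = 0; i < categories.Count; i++)
        {
            if (categories[i] == value)
            {
                vector[i] = 1f;
                return vector;
            }
        }
        throw new ArgumentException($"Unknown category '{value}'.", nameof(value));
    }
}
```

With the categories M, F and I in that order, OneHotDemo.Encode("F", new[] { "M", "F", "I" }) gives the vector 0, 1, 0. The learner can then treat each position as a separate numeric feature.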
Once we are done with the pipeline, we can train the prediction model and export it to the data folder.
To execute our first machine learning application, we need to add using System.Threading.Tasks; at the top of the file and replace the Main function with the code provided below.
This code creates the machine learning model based on the pipeline we declared in the Train function.
Before building the solution, we need to go to project options > General > Language options and change the C# language version from Default to Version 7.1, which is required for an async Main function. Now you can press Build and, hopefully, create your machine learning model without any errors.
You should get a similar message on the console. You can now check the data folder and see that Model.zip has been added.
Now, by adding a few lines of code, we can start making predictions. In the Main function, create a new instance of Abalone with the body measurements, then declare
var prediction = model.Predict(abalone1);
The prediction variable now contains the predicted value, which can be printed to the console using Console.WriteLine(prediction.Age);
Here is the full code for Program.cs
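As a minimal sketch of how the steps above fit together, Program.cs could look roughly like this. It assumes an early ML.NET preview (around 0.2 to 0.5) that still offers LearningPipeline; later releases replaced this API. The field names, the file locations and the measurements passed to Predict are illustrative assumptions, and the data classes are repeated only so that the file compiles on its own.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.ML;
using Microsoft.ML.Data;
using Microsoft.ML.Runtime.Api;
using Microsoft.ML.Trainers;
using Microsoft.ML.Transforms;

namespace Abalone_Age
{
    // Data classes as described for Abalone.cs (repeated here for completeness).
    public class Abalone
    {
        [Column("0")] public string Sex;
        [Column("1")] public float Length;
        [Column("2")] public float Diameter;
        [Column("3")] public float Height;
        [Column("4")] public float WholeWeight;
        [Column("5")] public float ShuckedWeight;
        [Column("6")] public float VisceraWeight;
        [Column("7")] public float ShellWeight;
        [Column("8")] public float Age; // ring count in the data file
    }

    public class AbaloneAgePrediction
    {
        [ColumnName("Score")] public float Age;
    }

    class Program
    {
        // Assumed locations; adjust them if the working directory is not the project folder.
        static readonly string _datapath = Path.Combine(Environment.CurrentDirectory, "data", "abalone.data.txt");
        static readonly string _modelpath = Path.Combine(Environment.CurrentDirectory, "data", "Model.zip");

        // async Main requires C# 7.1 or later.
        static async Task Main(string[] args)
        {
            PredictionModel<Abalone, AbaloneAgePrediction> model = await Train();

            // Illustrative measurements, not taken from the article.
            var abalone1 = new Abalone
            {
                Sex = "F", Length = 0.53f, Diameter = 0.42f,
                Height = 0.135f, WholeWeight = 0.677f, ShellWeight = 0.21f
            };
            AbaloneAgePrediction prediction = model.Predict(abalone1);
            Console.WriteLine("Predicted age: {0}", prediction.Age);
        }

        public static async Task<PredictionModel<Abalone, AbaloneAgePrediction>> Train()
        {
            var pipeline = new LearningPipeline
            {
                // Load the comma-separated file into Abalone rows.
                new TextLoader(_datapath).CreateFrom<Abalone>(separator: ','),
                // The learner expects the target in a column called "Label".
                new ColumnCopier(("Age", "Label")),
                // Turn the Sex letters into numeric vectors.
                new CategoricalOneHotVectorizer("Sex"),
                // Only the columns listed here are used as features.
                new ColumnConcatenator("Features",
                    "Sex", "Length", "Diameter", "Height", "WholeWeight", "ShellWeight"),
                new FastTreeRegressor()
            };

            var model = pipeline.Train<Abalone, AbaloneAgePrediction>();
            await model.WriteAsync(_modelpath); // export the trained model as Model.zip
            return model;
        }
    }
}
```

To drop the Sex feature as suggested earlier, remove the CategoricalOneHotVectorizer line and the "Sex" entry from the ColumnConcatenator list.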
When you build and run the application, it should display the predicted value as shown in the picture.
The predicted value was supposed to be 9, but there is a small error, which is expected: firstly, because the data set contains only 4177 abalones, and secondly, because physical measurements vary in nature and are not directly determined by age. Therefore we do not expect this model to be 100% accurate; however, it is a good approximation.
Using exported model without further training:
As our data set is not growing, it is pointless to retrain the model before every prediction. Instead we can reuse the already trained model. This makes predictions faster and cheaper, as loading a saved model takes far less processing power than training one. The downside is that our model will not get any better over time.
When the model is ready we can get rid of the Train function and all other unnecessary code (comment it out or just delete it). See the example below:
This code is enough to make predictions for different input values.
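Reusing the exported model could be sketched as follows, again assuming the early preview API, where PredictionModel.ReadAsync loads a model file written by WriteAsync. The helper class SavedModel and its method PredictAsync are hypothetical names. They are written generically so that the snippet does not depend on the data classes.

```csharp
using System.Threading.Tasks;
using Microsoft.ML;

public static class SavedModel
{
    // Loads the exported model from disk and scores a single input.
    // No training happens here.
    public static async Task<TOutput> PredictAsync<TInput, TOutput>(string modelPath, TInput input)
        where TInput : class
        where TOutput : class, new()
    {
        PredictionModel<TInput, TOutput> model =
            await PredictionModel.ReadAsync<TInput, TOutput>(modelPath);
        return model.Predict(input);
    }
}
```

With the classes from this article, the call would be await SavedModel.PredictAsync<Abalone, AbaloneAgePrediction>(_modelpath, abalone1). If many predictions are made, it is probably better to keep the loaded model in a field instead of reading the file each time.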
Integrating the machine learning model into an ASP.NET Core web application
I will cover this section briefly and will not go into detail, as this blog post is not primarily about ASP.NET Core. I just want to demonstrate how easy it is to use ML.NET together with other .NET libraries.
Let’s get started.
Open Visual Studio 2017 and click New Solution > .NET Core > ASP.NET Core Web App.
The good thing about this framework is that it lets us write the back end of the web page in C# and integrates easily with other .NET libraries such as ML.NET.
Make sure you go to the project options as we did earlier and change the C# language version from Default to Version 7.1. Also add the classes Abalone.cs and ML.cs.
Copy the code for the Abalone class from the previous project and paste it here; just make sure you edit the namespace after pasting. The code for ML.cs is similar to Program.cs from the previous project, but we need to make a few adjustments. The code for ML.cs is provided below.
As you can see, we got rid of the Main function and we no longer instantiate an object of the class Abalone in the ML class. Instead we pass it to the function Run as a parameter from the Index page. This is because we need to read the values on the Index page, instantiate the Abalone object there and display the calculated value on the web page for the user.
See the HTML code of this page below.
The interesting part happens on the back end of the page, when we receive the information filled in by the user.
We create a new instance of Abalone using the inputs provided by the user. Then the function Predict() is invoked, which creates a new instance of the ML class that contains our machine learning model. The Abalone instance is passed to the function Run(), and after the prediction is made it is displayed on the Index page.
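The back end described above could be sketched as a Razor Pages page model. This is only an illustration under several assumptions: the project uses Razor Pages, IAgePredictor is a hypothetical interface standing in for the ML class and its Run function, the property names are invented, and the predictor is passed in through the constructor rather than created inside a Predict() function.

```csharp
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc;
using Microsoft.AspNetCore.Mvc.RazorPages;

// Hypothetical stand-in for the ML class, so this file compiles without ML.NET types.
public interface IAgePredictor
{
    Task<float> RunAsync(string sex, float length, float diameter,
                         float height, float wholeWeight, float shellWeight);
}

public class IndexModel : PageModel
{
    private readonly IAgePredictor _predictor;

    public IndexModel(IAgePredictor predictor)
    {
        _predictor = predictor;
    }

    // Filled from the posted form fields with matching names.
    [BindProperty] public string Sex { get; set; }
    [BindProperty] public float Length { get; set; }
    [BindProperty] public float Diameter { get; set; }
    [BindProperty] public float Height { get; set; }
    [BindProperty] public float WholeWeight { get; set; }
    [BindProperty] public float ShellWeight { get; set; }

    // Stays null until a prediction has been made; the page can render it when set.
    public float? PredictedAge { get; private set; }

    // Runs when the form is submitted with method="post".
    public async Task<IActionResult> OnPostAsync()
    {
        PredictedAge = await _predictor.RunAsync(
            Sex, Length, Diameter, Height, WholeWeight, ShellWeight);
        return Page(); // re-render the Index page with the result
    }
}
```

For the constructor parameter to be supplied, an implementation of IAgePredictor would have to be registered in Startup.ConfigureServices, for example one that wraps the saved Model.zip.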
The purpose of the last part of the article was to demonstrate how easy it is to make use of ML.NET and integrate it into your applications or websites.
Hopefully this blog post helped you get the basics of machine learning with ML.NET. If you built the app, please share it with your friends so they can find out the age of abalones without killing them.