Ask Learn
Preview
Please sign in to use this experience.
Sign inThis browser is no longer supported.
Upgrade to Microsoft Edge to take advantage of the latest features, security updates, and technical support.
Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
MicrosoftML's rxNeuralNet
model supports GPU acceleration. To enable GPU acceleration, you need to do a few things:
Install and configure CUDA on your Windows machine.
For the purpose of this setup and later performance comparison, this is the machine used in this blog.
There is an excellent old blog on how to do this. However, it's slightly out of date for the purpose of our setup, so the steps are repeated here with updated product versions.
(If you were to get a Deep Learning toolkit for the DSVM which comes with R Server, Visual Studio 2015 and CUDA Toolit 8.0 preinstalled, then all you is the four dlls, basically step 3, 5 and 6. Update: Not needed any more for Deep Learning toolkit for the DSVM. )
system.file("mxLibs/x64", package = "MicrosoftML")
.Experiment: Performance analysis of rxNeuralNet in Microsoft ML with OR without GPU through a simple example
(By performance here we simply mean the training speed. )
We make a 10,000 by 10,000 matrix composed completely of random noise and another column of 10,000 random 0s and 1s. They are combined together as our training data. Then we simply build two rxNeuralNet
models for binary classification. One with GPU acceleration and the other without. The parameter miniBatchSize
is the number of training examples used to take a step in stochastic gradient descent. The bigger it is, the faster progress is made. But large step sizes can lead to difficulty for the algorithm to converge. We used ADADELTA here as our optimizer here as it can automatically adjust the learning rate. miniBatchSize
is only used with GPU acceleration. Without GPU acceleration, it's by default set to 1.
Output fromrxNeuralNet with GPU acceleration
Output from rxNeuralNet with SSE acceleration
|
Both models achieved similar training errors by the end of 50 iterations. (In fact, the GPU accelerated model did slightly better.)
With GPU acceleration, the training time is 2'16'' compared to 19'57'' without GPU acceleration. The performance boost is almost 9X.
We can also see in Net Definition the structure of our neural nets. Here both models are using a neural net of an input layer of 10,000 nodes (the number of nodes in the input layer is always either equal to the number of input variables or plus one node for a bias term), a single hidden layer of 100 nodes and an output layer of one node for this binary classification problem.
In rxNeuralNet
the number of nodes in the hidden layer defaults to 100. For a single hidden layer, it's usually recommended to be between the number of nodes in the input layer and number of nodes in the output layer. MicrosoftML also supports user-defined network architectures like convolution neural networks using the NET# language.
Training time vs. mini batch size
Further experiments for the same data give the following relation between training time and miniBatchSize
:
Along with the learning curve for the same combinations of acceleration and miniBatchSize
above.
MicrosoftML manual
Install and Configure CUDA on Windows CUDA Installation Guide for Microsoft Windows
Please sign in to use this experience.
Sign in