Real-time scoring of streaming data using machine learning models


When we talk to customers about their IoT scenarios, we often hear a common theme: “Given millions of sensors continuously emitting data, it is very hard to manage thresholds for alerting in case of issues. Every time the environment changes, we need to go change the thresholds manually. This is a very cumbersome process.” These customers want to use a machine learning model that can automatically detect anomalies in time series data.

One company doing social media analytics for the sports domain was frustrated with generic sentiment analytics models. One example they gave: “this goal is sick” expresses positive sentiment in sports, yet most generic sentiment analytics models score it as negative. Domain- and language-specific models are essential for correct results. We have heard similar use cases for running machine learning models over continuously streaming data, such as fraud detection and recommendation.

With Azure Stream Analytics, we aspire to make it progressively easier to run analytics over your streaming data. Today we are announcing a private preview that lets customers call machine learning models through Azure Stream Analytics. Customers can use a simple SQL-like syntax to call the operationalized endpoint of their model. For example, customers can write a statement like:

SELECT Text, sentiment(Text), CreatedAt FROM input
WHERE Language = 'English'

Text is the tweet coming in from a social feed such as Twitter, and sentiment is a user-defined function bound to the endpoint of the sentiment analytics model. Each Text value is sent to the sentiment model to be scored. Please see the demo of this in the following video, starting at 1:05:00.
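To make the per-event flow concrete, here is a minimal Python sketch of what the query above does conceptually: filter to English events, then pass each Text to a scoring function. It is an illustration only, not how Azure Stream Analytics is implemented. The names score_stream and toy_sports_sentiment are hypothetical, and the toy scorer is a local stand-in for a real model endpoint.

```python
from typing import Callable, Dict, Iterable, Iterator


def score_stream(
    events: Iterable[Dict[str, str]],
    sentiment: Callable[[str], str],
) -> Iterator[Dict[str, str]]:
    """Conceptual equivalent of:
    SELECT Text, sentiment(Text), CreatedAt FROM input WHERE Language = 'English'
    """
    for event in events:
        # WHERE clause: skip anything that is not tagged as English.
        if event.get("Language") != "English":
            continue
        # Each remaining Text is handed to the scoring function, one event at a time.
        yield {
            "Text": event["Text"],
            "Sentiment": sentiment(event["Text"]),
            "CreatedAt": event["CreatedAt"],
        }


def toy_sports_sentiment(text: str) -> str:
    """Hypothetical stand-in for a domain-specific model endpoint.

    A real deployment would call the operationalized web service instead.
    """
    positive_slang = ("sick", "insane", "unreal")
    lowered = text.lower()
    return "positive" if any(word in lowered for word in positive_slang) else "neutral"


sample_events = [
    {"Text": "this goal is sick", "Language": "English", "CreatedAt": "2015-03-01T12:00:00Z"},
    {"Text": "golazo", "Language": "Spanish", "CreatedAt": "2015-03-01T12:00:05Z"},
]

results = list(score_stream(sample_events, toy_sports_sentiment))
```

Passing the scorer in as a function mirrors the idea of the user-defined function: the query stays the same while the model behind it can be swapped for a domain-specific one.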

Similarly, customers can write a query like:

SELECT AnomalyDetection(SystemTime, Temperature, 1) AS TemperatureAnomalyResult, ID FROM aggTemperature
GROUP BY (TumblingWindow(minute, 5), ID)

Here the customer sends a 5-minute history of temperature events for each ID to an anomaly detection model. Azure Machine Learning enables you to create a web service endpoint for your models: http://azure.microsoft.com/en-us/documentation/articles/machine-learning-publish-a-machine-learning-web-service/. Our goal is to simplify the integration of stream processing and machine learning by letting you call the web service endpoint of an operationalized model through a user-defined function in SQL.
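The windowing in the query above can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the service's implementation: SystemTime is assumed to be epoch seconds, window boundary handling is simplified and may differ from the service, and toy_detector is a hypothetical stand-in for a real anomaly detection endpoint.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

WINDOW_SECONDS = 5 * 60  # TumblingWindow(minute, 5)

History = List[Tuple[int, float]]  # (SystemTime, Temperature) pairs
WindowKey = Tuple[int, str]        # (window start, sensor ID)


def tumbling_window_groups(
    events: List[dict], window_seconds: int = WINDOW_SECONDS
) -> Dict[WindowKey, History]:
    """Group events into fixed, non-overlapping windows per sensor ID."""
    groups: Dict[WindowKey, History] = defaultdict(list)
    for event in events:
        # Each event belongs to exactly one window (assumed boundary rule).
        window_start = (event["SystemTime"] // window_seconds) * window_seconds
        groups[(window_start, event["ID"])].append(
            (event["SystemTime"], event["Temperature"])
        )
    return dict(groups)


def toy_detector(history: History) -> bool:
    """Hypothetical stand-in: flag a window whose temperature spread exceeds 10."""
    temperatures = [temperature for _, temperature in history]
    return (max(temperatures) - min(temperatures)) > 10.0


def score_windows(
    events: List[dict], detector: Callable[[History], bool]
) -> Dict[WindowKey, bool]:
    """Send each window's time-ordered history to the detector."""
    return {
        key: detector(sorted(history))
        for key, history in tumbling_window_groups(events).items()
    }


sample_events = [
    {"ID": "sensor-1", "SystemTime": 0, "Temperature": 20.0},
    {"ID": "sensor-1", "SystemTime": 120, "Temperature": 21.0},
    {"ID": "sensor-1", "SystemTime": 310, "Temperature": 20.5},
    {"ID": "sensor-1", "SystemTime": 400, "Temperature": 45.0},
    {"ID": "sensor-2", "SystemTime": 60, "Temperature": 19.0},
]

window_results = score_windows(sample_events, toy_detector)
```

Because the windows do not overlap, each event is scored as part of exactly one history, and each sensor ID is evaluated independently of the others.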

Private Preview

We ask customers to fill out the following questionnaire to be considered for this preview: http://www.instant.ly/s/ALWbd. We want to learn about all of your streaming and machine learning scenarios; however, we will initially prioritize customers who want to use anomaly detection or sentiment analytics. This is a very limited private preview, and the product team will work closely with customers to enable their end-to-end scenarios. Please be sure to describe any of your scenarios, as we will be expanding the scope and audience of the preview regularly.

Our expectation of customers entering the preview is that we can quickly bring their pipeline up and have regular interactions for feedback. We will also prioritize customers who want to move to production as these capabilities reach public preview or general availability.

For more information, or to give feedback on Azure Stream Analytics, please look through these resources: http://blogs.msdn.com/b/streamanalytics/archive/2014/11/17/reference-links-amp-related-materials.aspx.

 

 


Comments (4)

  1. Gerardo Saca says:

    This is awesome! We're interested in leveraging this for error and ETW event telemetry, basically use anomaly detection to catch when something goes off and we start getting an abnormal number of errors or other events.

    We experience weekly cyclical variations (slow weekends with fewer events and busy weekdays with more events).

    What's the guidance for this scenario, particularly the longer time range?

    Would we be ok with the 5 minute tumbling window?

    Can we do a 14 day tumbling window? Is that too much?

    Thanks!

  2. Hi Gerardo, could you please reach out to me at santoshb@microsoft.com? I would need to bring a few more people from the machine learning team into this discussion.

  3. Dave Poleon says:

    Any progress on this… "Our goal is to be able to simplify stream processing and machine learning integration to be able to call the operationalized models web service end point through a user defined function in SQL."

    I am trying to create an internal demonstration for my organisation using stream analytics and machine learning, the above would be perfect!
