How to deal with missing events in streaming data?


Streaming data is often imperfect: some events may be missing, and others may be generated or received late. At the same time, downstream applications may require input data at regular intervals (e.g. every 5 seconds).

Some customers have asked us how Azure Stream Analytics can be used to convert a stream of events with missing values into a stream of events at regular intervals, where the most recently received event is used to fill in the missing values.

This is easy with a Hopping Window:

    SELECT
        System.Timestamp AS windowEnd,
        TopOne() OVER (ORDER BY T DESC) AS lastEvent
    FROM
        input TIMESTAMP BY T
    GROUP BY HOPPINGWINDOW(second, 300, 5)

 

This query generates an event every 5 seconds, containing the last event received before the end of the window. Note that, as part of the window definition, you need to specify the window duration: this is how far back the query looks to find the latest event (300 seconds in our example).
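To make the window behavior more concrete, here is a small Python sketch that imitates what the query does on an in-memory list of events. It is only an illustration, not how Stream Analytics is implemented: the `fill_gaps` function, the `(timestamp, payload)` event shape, and the sample data are all hypothetical. It assumes each window includes events later than the window start and up to and including the window end, and that a window containing no events produces no output.

```python
from typing import Iterable, List, Tuple

# Hypothetical event shape: (timestamp in seconds, payload).
Event = Tuple[int, str]


def fill_gaps(events: Iterable[Event], first_end: int, last_end: int,
              hop: int = 5, window: int = 300) -> List[Tuple[int, Event]]:
    """Return (window_end, last_event) for every hop that has an event in range."""
    # Sort by timestamp so events that arrived out of order are placed correctly.
    ordered = sorted(events)
    output = []
    for window_end in range(first_end, last_end + 1, hop):
        window_start = window_end - window
        # Assumed bounds: window start is exclusive, window end is inclusive.
        in_window = [e for e in ordered if window_start < e[0] <= window_end]
        if in_window:
            # The latest event in the window plays the role of TopOne() ... DESC.
            output.append((window_end, in_window[-1]))
        # Assumption: a window with no events emits nothing.
    return output


# Events at seconds 2, 11 and 14; nothing arrives between 2 and 11.
events = [(2, "a"), (11, "b"), (14, "c")]
result = fill_gaps(events, first_end=5, last_end=20)
for window_end, last_event in result:
    print(window_end, last_event)
```

In this sample, the window ending at second 10 contains no new event, so it repeats the event from second 2. If the look-back duration were shorter than the gap between events, that window would instead be skipped, which is why the duration should cover the longest gap you expect.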

 

For more query examples, see "Query examples for common Stream Analytics usage patterns".

Comments (1)

  1. Charles Phillips says:

    Are you implying that not all events are picked up by streaming analytics? Or is this just to create a heartbeat just in-case there are not enough events?
