Update to Stream Analytics: Query Validation with Sample Data


Azure’s latest update extends the query authoring experience in Stream Analytics to enable query execution over test data. 

 

To validate your query, click the Test button on the Query page and browse to local JSON file(s) with sample data.  The JSON file must contain an array of valid JSON objects.  Only JSON format is supported at this time, so you may want to use a CSV to JSON converter like this one if your data is in a different format.

2

 

The Test dialog will prompt for a file for each input source referenced in the job query.

3

 

After closing the Test dialog, the query will be executed against the local data and the output will be shown on the Query page:

4

 

You can iterate over your query definition and test each version using the sample data by clicking the Rerun button.  To change the sample input files, click the Test button.  Please note that you will need to respecify test data files if the name of an input changes in the query definition. 

There are some small behavioral differences between the Test feature and actual Stream Analytics job queries.  Both use half-open intervals for windowing, but Test uses (start, end] and Stream Analytics jobs use [start, end).  This means in queries where temporal windowing is used and an event timestamp falls exactly on a window boundary, output from actual jobs will fall in window i and output from Test will fall in window i + 1.  There are also some differences in the mappings of datetime values between the two implementations, meaning that window boundaries may sometimes be misaligned.

Sample Files and Queries

To get you started, the Stream Analytics team has hosted three sample data files using the toll station scenario described here.  Below are some example queries over the data.

Find the Model of all vehicles manufactured by Honda

SELECT Model FROM EntryStream WHERE Make = ‘Honda’ 

 

Determine how many vehicles entered each toll station every 5 minutes

SELECT TollId, COUNT(*) FROM EntryStream

TIMESTAMP BY EntryTime

GROUP BY TollId, TumblingWindow(minute,5)

 

Calculate the time it took for each car to pass through the toll station

SELECT EN.LicensePlate, DATEDIFF(minute, EN.EntryTime, EX.ExitTime) AS TotalTime

FROM EntryStream EN TIMESTAMP BY EntryTime

JOIN ExitStream EX TIMESTAMP BY ExitTime

ON EN.TollId = EX.TollId AND EN.LicensePlate = EX.LicensePlate
AND DATEDIFF(minute, EN, EX) BETWEEN 0 AND 15

 

Report all commercial vehicles with expired registration

Note that queries using reference data will not resolve unless an input source for the reference data source is defined on the job’s Input page.  In this example you will need to specify a reference data input source named Registration.

 

SELECT EN.LicensePlate, REG.RegistrationId

FROM EntryStream EN TIMESTAMP BY EntryTime

JOIN Registration REG

ON EN.LicensePlate = REG.LicensePlate

WHERE REG.Expired = ‘1’

  

You can leverage the Stream Analytics Query Language Reference to continue experimenting with the query language.  If you have any questions or feedback, please visit our forum.  We look forward to hearing from you!

 


Comments (1)

  1. Bob says:

    Thanks for this Query Validation tutorial. I've used beautifytools.com/csv-to-json-converter.php as my CSV To JSON Converter.

Skip to main content