Using Azure Stream Analytics JavaScript UDF to lookup values in JSON array


JavaScript UDF (User-Defined Function) allows you to handle complex JSON schema and keep your query clean. In this blog you will learn how to handle nested JSON arrays with a JavaScript UDF. Below is an example event generated by an IoT gateway. There is an array of two devices – device01 and device02; each device has an array of measurements which represent three types of sensor data: temperature, humidity, and pressure.

 

[sourcecode language='javascript'  padlinenumbers='true']
[{
    "gateway": "3a40067f-9d21-4a02-ad5f-926f31648f71",
    "timestamp": "2017-03-13T14:33:02",
    "devices": [{
        "deviceId": "device01",
        "timestamp": "2017-03-13T14:33:02",
                "measurements": [{
            "type": "Temperature",
            "unit": "F",
            "value": 76.34
          }, {
            "type": "Humidity",
            "unit": "%RH",
            "value": 53.32
          }, {
            "type": "Pressure",
            "unit": "psi",
            "value": 21.2586
          }
        ]
      }, {
        "deviceId": "device02",
        "timestamp": "2013-03-20T09:33:12",
        "measurements": [{
            "type": "Temperature",
            "unit": "F",
            "value": 75.6875
          }, {
            "type": "Humidity",
            "unit": "%RH",
            "value": 52.9668
          }, {
            "type": "Pressure",
            "unit": "psi",
            "value": 22.1355
          }
        ]
      }
    ]
  }
]
[/sourcecode]

 

Below is the desired output, which converts each device as one output event, and all measurements under a device become properties.

Timestamp

DeviceId

Temperature

Humidity

Pressure

2017-03-13T14:33:02Z

device01

76.34

53.32

21.2586

2013-03-20T09:33:12Z

device02

75.6875

52.9668

22.1355

 

The Solution:

We can use two techniques to handle arrays: GetArrayElements() and CROSS APPLY generate one row for each device record in the devices array; UDF getValue() does a value lookup from the measurements array. First, let’s define a JavaScript UDF getValue(identifier, arr) as below. If you don’t know how to add JavaScript UDF, take a look at https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-javascript-user-defined-functions.

 

[sourcecode language='javascript'  padlinenumbers='true']
// Name the UDF as getValue()
// Set return type as any
// identifier == name
// arr == JSON array containing objects
// returns value or null if not found
function main(identifier, arr) {
  var result = null;

  if (Object.prototype.toString.call(arr) == "[object Array]") {
    for (i = 0; i < arr.length; i++) {
      if (arr[i].type == identifier) {
        result = arr[i].value;
      }
    }
  }
  return result;
}
[/sourcecode]

 

Your query will be like below:

 

[sourcecode language='sql' ]
SELECT
    d.arrayvalue.timestamp,
    d.arrayvalue.deviceId,
    udf.getValue('Temperature', d.arrayvalue.measurements) as Temperature,
    udf.getValue('Humidity', d.arrayvalue.measurements) as Humidity,
    udf.getValue('Pressure', d.arrayvalue.measurements) as Pressure
FROM input
CROSS APPLY GetArrayElements(input.devices) as d
[/sourcecode]

 

 

An alternative solution:

You might have noticed that in the above query we call getValue() function against the same array multiple times. If there are lots of identifier values to look up, multiple calls to JavaScript UDF may increase your job’s end to end latency. We can modify the UDF to return multiple values in a JSON object, then parse the JSON object in query language.

The below JavaScript UDF does a lookup for all type-value pairs and returns a JSON object.

 

[sourcecode language='javascript' ]
// Name the UDF as getValues()
// Set return type as any
// arr == JSON array containing type and value pairs
// returns value or null if not found
function main(arr) {
  var result = {};
  if (Object.prototype.toString.call(arr) == "[object Array]") {
    for (i = 0; i < arr.length; i++) {
      var identifier = arr[i].type;
      if (identifier != null) {
        result[identifier] = arr[i].value;
      }
    }
  }
  return result;
}
[/sourcecode]

 

 

Your query will get the JSON object first then separate them into multiple properties.

 

[sourcecode language='sql' ]
WITH flattened AS
(
    SELECT
        d.arrayvalue.timestamp,
        d.arrayvalue.deviceId,
        udf.getValues(d.arrayvalue.measurements) as measurements
    FROM input
    CROSS APPLY GetArrayElements(input.devices) as d
)

SELECT
    timestamp,
    deviceId,
    measurements.Temperature,
    measurements.Humidity,
    measurements.Pressure
FROM flattened f
[/sourcecode]

 

 


Comments (0)

Skip to main content