The Azure Stream Analytics query language extends SQL syntax to enable complex computations over streams of events. With Stream Analytics, a few concepts related to event delivery are worth discussing, notably exactly-once processing and exactly-once delivery.
In addition to the concepts detailed below, it is important to consider the start option of a job to make sure that no data is lost.
An exactly-once processing guarantee means that, given a set of inputs, the system always returns the same results. This is important for repeatability, and it applies even when the job is restarted or when multiple jobs run in parallel on the same input. Azure Stream Analytics guarantees exactly-once processing.
An exactly-once delivery guarantee means that all outputs from exactly-once processing are delivered to the output sink exactly once, so there is no duplicate output. Achieving this requires transactional capabilities in the output adapter.
Azure Stream Analytics guarantees at-least-once delivery to output sinks, which means that all results are output, but duplicate results may occur. However, exactly-once delivery may be achieved with several outputs, such as Azure Cosmos DB or Azure SQL.
Because of the at-least-once delivery guarantee, duplicate records may occasionally appear in the output data while a Stream Analytics job is running. These duplicates are expected because Azure Stream Analytics output adapters don’t write output events transactionally. Duplicate records can result if one of the following conditions occurs:
The downstream consumer of the output events needs to deduplicate them using the logical identity of each event. For example, if you are aggregating events by group in a tumbling window, the logical identity of an event is the group key plus the tumbling window’s end time. If you are running a pass-through query, you may need to carry a unique ID on each event in order to deduplicate.
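The deduplication described above could be sketched on the consumer side as follows. This is a minimal illustration, not part of Stream Analytics: the field names device_id, window_end, and avg_temp and the helper dedupe_by_identity are hypothetical, and the in-memory set of seen identities assumes the stream is small enough to track in one process.

```python
from typing import Iterable, Iterator, Tuple


def dedupe_by_identity(events: Iterable[dict], key_fields: Tuple[str, ...]) -> Iterator[dict]:
    """Yield each event once, keyed by its logical identity.

    key_fields names the fields that form the logical identity, for
    example the group key plus the tumbling window's end time.
    """
    seen = set()
    for event in events:
        identity = tuple(event[field] for field in key_fields)
        if identity in seen:
            continue  # redelivered duplicate: drop it
        seen.add(identity)
        yield event


# Hypothetical output of a tumbling-window aggregation grouped by device_id.
events = [
    {"device_id": "a", "window_end": "2024-01-01T00:05:00Z", "avg_temp": 20.5},
    {"device_id": "b", "window_end": "2024-01-01T00:05:00Z", "avg_temp": 18.0},
    {"device_id": "a", "window_end": "2024-01-01T00:05:00Z", "avg_temp": 20.5},  # duplicate delivery
    {"device_id": "a", "window_end": "2024-01-01T00:10:00Z", "avg_temp": 21.0},
]

unique_events = list(dedupe_by_identity(events, ("device_id", "window_end")))
```

For a pass-through query, the same helper would work with a single unique ID field as the key, assuming the events carry one.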
Using Azure Cosmos DB, Azure Stream Analytics guarantees exactly-once delivery. Since Azure Stream Analytics uses upsert, no action is needed by the user. See more information on Azure Stream Analytics output to Azure Cosmos DB.
When using SQL output, users can achieve exactly-once delivery if the following requirements are met:
This is sufficient to avoid duplicates because the SQL output honors any constraints placed on the table by skipping any events that cause a unique constraint violation.
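The effect of a unique constraint on redelivered events can be illustrated with a small simulation. The sketch below uses Python's built-in sqlite3 module as a stand-in for the SQL output; it is not how Stream Analytics itself writes to Azure SQL. The table output_events, its columns, and the helper write_skipping_duplicates are hypothetical.

```python
import sqlite3


def write_skipping_duplicates(conn: sqlite3.Connection, rows) -> int:
    """Insert rows one by one, skipping any that violate the unique constraint.

    Returns the number of rows actually written.
    """
    written = 0
    for row in rows:
        try:
            with conn:  # commits on success, rolls back on error
                conn.execute(
                    "INSERT INTO output_events (device_id, window_end, avg_temp) "
                    "VALUES (?, ?, ?)",
                    row,
                )
            written += 1
        except sqlite3.IntegrityError:
            pass  # duplicate natural key: skip the event
    return written


conn = sqlite3.connect(":memory:")
# The unique constraint is built from the assumed natural key of the events.
conn.execute(
    "CREATE TABLE output_events ("
    "device_id TEXT NOT NULL, "
    "window_end TEXT NOT NULL, "
    "avg_temp REAL, "
    "UNIQUE (device_id, window_end))"
)

rows = [
    ("a", "2024-01-01T00:05:00Z", 20.5),
    ("b", "2024-01-01T00:05:00Z", 18.0),
    ("a", "2024-01-01T00:05:00Z", 20.5),  # duplicate delivery
]
written = write_skipping_duplicates(conn, rows)
row_count = conn.execute("SELECT COUNT(*) FROM output_events").fetchone()[0]
```

Because the duplicate row is rejected by the constraint rather than by application logic, the table ends up with one row per natural key no matter how many times an event is delivered.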
All entities in an Azure Storage Table are uniquely identified by the concatenation of the RowKey and PartitionKey fields. Azure Stream Analytics upserts entities, so the value of a table entity will be the latest output event with the corresponding RowKey/PartitionKey combination. Therefore, to achieve exactly-once delivery, ensure that each output event has a unique RowKey/PartitionKey combination. If this is done, duplicate events will overwrite earlier versions. (The system-defined Timestamp field, which is the last modified time for the entity, will still change in this case.)
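The upsert behavior described above can be modeled with a plain dictionary keyed by the PartitionKey/RowKey pair. This is only an in-memory sketch of the semantics, not the Azure Table Storage API; the helper upsert_entity and the AvgTemp property are hypothetical.

```python
def upsert_entity(table: dict, entity: dict) -> None:
    """Insert the entity, or overwrite the existing one with the same key."""
    key = (entity["PartitionKey"], entity["RowKey"])
    table[key] = entity


table = {}
output_events = [
    {"PartitionKey": "device-a", "RowKey": "2024-01-01T00:05:00Z", "AvgTemp": 20.5},
    {"PartitionKey": "device-b", "RowKey": "2024-01-01T00:05:00Z", "AvgTemp": 18.0},
    # Duplicate delivery of the first event: same key, so it overwrites in place.
    {"PartitionKey": "device-a", "RowKey": "2024-01-01T00:05:00Z", "AvgTemp": 20.5},
]
for event in output_events:
    upsert_entity(table, event)
```

After the loop, the table holds one entity per unique key, which is why a unique RowKey/PartitionKey combination per output event makes redelivery harmless.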