Sending and consuming events in Avro format


Azure Stream Analytics currently supports three formats for input event serialization: Avro, CSV, and JSON. This blog post demonstrates how to send events in the Avro format to an input source, so that a Stream Analytics job can consume them later.
For the examples below, assume that we are sending events to an Event Hub instance.

Let’s start by defining the event type that will be sent to the input source.

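using System.Runtime.Serialization;

// Data contract attributes tell the Avro library which members to serialize.
[DataContract]
public class SampleEvent
{
    [DataMember]
    public int Id { get; set; }
}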

We will use the “Microsoft.Hadoop.Avro” library for Avro serialization. Add a NuGet reference to this library through Project -> Manage NuGet Packages.

 

Here is what the Packages.config file looks like after adding the NuGet reference. I have added the Service Bus packages as well.

<?xml version="1.0" encoding="utf-8"?>
<packages>
  <package id="Microsoft.Hadoop.Avro" version="1.5.6" targetFramework="net45" />
  <package id="Microsoft.WindowsAzure.ConfigurationManager" version="3.1.0" targetFramework="net45" />
  <package id="Newtonsoft.Json" version="6.0.4" targetFramework="net45" />
  <package id="WindowsAzure.ServiceBus" version="3.0.6" targetFramework="net45" />
</packages>

 

Now let’s look at the serialization code. We will use classes from the Microsoft.Hadoop.Avro and Microsoft.Hadoop.Avro.Container namespaces. Stream Analytics expects the events to be serialized sequentially in an Avro container.

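using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.Hadoop.Avro;
using Microsoft.Hadoop.Avro.Container;

private class AvroEventSerializer<T>
{
    private IAvroSerializer<T> avroSerializer;

    public AvroEventSerializer()
    {
        this.avroSerializer = AvroSerializer.Create<T>();
    }

    // Uses the class-level T; redeclaring <T> on the method would hide it.
    public byte[] GetSerializedEvents(IEnumerable<T> events)
    {
        // Nothing to serialize: callers must handle a null result.
        if (events == null || !events.Any())
        {
            return null;
        }

        using (var memoryStream = new MemoryStream())
        using (var avroWriter = AvroContainer.CreateWriter<T>(memoryStream, Codec.Null))
        using (var sequentialWriter = new SequentialWriter<T>(avroWriter, events.Count()))
        {
            // All events go into one container, written in order.
            foreach (var e in events)
            {
                sequentialWriter.Write(e);
            }

            return memoryStream.ToArray();
        }
    }
}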

The code above serializes the events into an Avro container and includes the schema in each payload. Azure Stream Analytics requires the schema to be specified with the payload. Note that a container holds multiple events, while the schema is specified only once per container.
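To check locally that a payload really is a self-describing container, you can read it back before sending anything. The following is a minimal sketch, not the definitive way to do it: the `AvroRoundTrip` class and its `Write` and `Read` methods are hypothetical names introduced here, and the use of `SequentialReader<T>`, its `Objects` property, and the two-argument `AvroContainer.CreateReader<T>` overload is an assumption based on the library's published samples that you should verify against the package version you install.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Runtime.Serialization;
using Microsoft.Hadoop.Avro.Container;

[DataContract]
public class SampleEvent
{
    [DataMember]
    public int Id { get; set; }
}

// Hypothetical helper for a local write/read round trip.
public static class AvroRoundTrip
{
    // Writes a non-empty list of events into one in-memory Avro container.
    public static byte[] Write(IList<SampleEvent> events)
    {
        using (var memoryStream = new MemoryStream())
        {
            using (var avroWriter = AvroContainer.CreateWriter<SampleEvent>(memoryStream, Codec.Null))
            using (var sequentialWriter = new SequentialWriter<SampleEvent>(avroWriter, events.Count))
            {
                foreach (var e in events)
                {
                    sequentialWriter.Write(e);
                }
            }

            // The writers are disposed here; ToArray still works on a closed MemoryStream.
            return memoryStream.ToArray();
        }
    }

    // Reads every event back, relying on the schema stored in the container.
    // Assumption: CreateReader<T>(Stream, bool leaveOpen) and SequentialReader<T>.Objects.
    public static List<SampleEvent> Read(byte[] payload)
    {
        using (var memoryStream = new MemoryStream(payload))
        using (var reader = new SequentialReader<SampleEvent>(
            AvroContainer.CreateReader<SampleEvent>(memoryStream, true)))
        {
            return reader.Objects.ToList();
        }
    }
}
```

Calling `AvroRoundTrip.Read(AvroRoundTrip.Write(events))` should give back events with the same `Id` values in the same order. If it does not, the problem is in serialization rather than in the Event Hub or the Stream Analytics job.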

Finally, let’s send events to the Event Hub using the code above:

// Requires: using System; using System.Linq; using System.Threading;
//           using Microsoft.ServiceBus.Messaging;
private static void Main(string[] args)
{
    var eventHubClient =
        EventHubClient.CreateFromConnectionString(
            "<ReplaceWithServiceBusConnectionString>",
            "<ReplaceWithEventHubName>");

    var avroEventSerializer = new AvroEventSerializer<SampleEvent>();

    while (true)
    {
        // Serialize a batch of five events into a single Avro container.
        var eventsPayload =
            avroEventSerializer.GetSerializedEvents(
                Enumerable.Range(0, 5).Select(i => new SampleEvent() { Id = i }));

        // Each Event Hub message carries one container.
        eventHubClient.Send(new EventData(eventsPayload));

        Thread.Sleep(TimeSpan.FromSeconds(10));
    }
}
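Because GetSerializedEvents returns null for an empty batch, it can be worth validating the payload before handing it to the Event Hub client. Here is a small sketch of such a guard. It relies on the fact that Avro object container files begin with the four bytes 'O', 'b', 'j', 1. The `AvroPayloadCheck` class and its `LooksLikeAvroContainer` method are hypothetical names introduced for this example and are not part of any library.

```csharp
public static class AvroPayloadCheck
{
    // Avro object container files start with ASCII "Obj" followed by the byte 1.
    private static readonly byte[] Magic = { (byte)'O', (byte)'b', (byte)'j', 1 };

    // Hypothetical helper: true when the payload begins with the container magic bytes.
    // This is only a cheap sanity check; it does not validate the schema or the data blocks.
    public static bool LooksLikeAvroContainer(byte[] payload)
    {
        if (payload == null || payload.Length < Magic.Length)
        {
            return false;
        }

        for (int i = 0; i < Magic.Length; i++)
        {
            if (payload[i] != Magic[i])
            {
                return false;
            }
        }

        return true;
    }
}
```

In the send loop, you could skip the Send call whenever `AvroPayloadCheck.LooksLikeAvroContainer(eventsPayload)` returns false, which also covers the null case.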

We have seen example code for sending events in the Avro format to an Event Hub. These events can now be consumed by a Stream Analytics job by configuring an Event Hub input and selecting Avro as the event serialization format.
