Data Retention in Event Hubs

This is a common question so I'm going to put it on the blog before it works its way into our more formal documentation. Event Hubs have a message retention setting that is chosen at creation time; it can also be changed later.

The setting is shown in the portal below.

The default value is one day. This leads to confusion, however, because Service Bus Messaging (Queues & Topics) has a Default Message Time to Live (TTL) on a Queue, Topic, or Subscription. The two concepts do not mean the same thing: both are defaults and both involve time, but the similarities end there.

UPDATE: This logic has changed and we now delete data at the specified retention window no matter how much or little is in the partition. We still delete the entire extent, but we no longer require it to be closed.

This leads to the common question: "I see events older than one day in my Event Hub - what is wrong?" The short answer is nothing - read on for more information.

Service Bus Messaging (again, Queues & Topics) provides per-message control and lifetime management. Event Hubs do not; they are much more coarse grained. On Service Bus Messaging entities (Queue / Topic / Subscription) you can specify a lifetime / expiration on a single message or on the entity as a whole. When a specific message has expired, it is no longer in the entity - at least from the user's viewpoint.
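To make the per-message model concrete, here is a minimal sketch in plain Python. It is not the Service Bus SDK: the Message class and both helper functions are made-up names for illustration only. It also assumes that a per-message TTL longer than the entity default gets capped to the entity default.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Message:
    """Hypothetical stand-in for a queued message (not an SDK type)."""
    body: str
    enqueued_at: datetime
    ttl: Optional[timedelta] = None  # None means "use the entity default"


def expires_at(message: Message, entity_default_ttl: timedelta) -> datetime:
    """Return the instant at which this one message stops being visible."""
    ttl = message.ttl if message.ttl is not None else entity_default_ttl
    # Assumption: a per-message TTL cannot outlive the entity default.
    ttl = min(ttl, entity_default_ttl)
    return message.enqueued_at + ttl


def visible_messages(messages: List[Message],
                     entity_default_ttl: timedelta,
                     now: datetime) -> List[Message]:
    """Each message is judged on its own expiry, independent of its neighbours."""
    return [m for m in messages if expires_at(m, entity_default_ttl) > now]
```

The point is that expiry is decided message by message, so two messages enqueued at the same moment can disappear at different times.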

The same is not true of Event Hubs - the EventData class does not expose any concept of TTL, and the only time-based value you can specify is the Message Retention interval at the Event Hub level. Here is where the confusion comes into play. Event Hubs are a massive scale service - we don't track very much on a per-event basis.

Each partition is basically a sequence of event packages or blocks. In a way you could think of this like shipping containers on a transport ship. We load events into the container and then, when it is full, load it on the ship, marking the time at which this happens (OK - it doesn't really work that way, but use your imagination a bit). Every now and then we look at the fully loaded containers, which have their time stamp on them, and unload or discard the ones that are outside the message retention window.

There are two implications here. The first is that retention happens in fairly large blocks. Again, Event Hubs are made for big scale - if you don't need scale, a Service Bus Queue may be better for you. The second is that we don't look at the containers until they're full. So if you put only a few items into an Event Hub - even a few tens of thousands - there is a high likelihood that you will see events from beyond the retention period.
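The shipping container picture can be sketched as a toy simulation. This is purely illustrative and is not how the service is implemented: the Container and Partition classes, the capacity of three events, and the sweep method are all invented for the example. It models the behavior as described in this post, where only full (sealed) containers are checked against the retention window, not the changed behavior noted in the update above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Container:
    """Hypothetical block of events; sealed_at is stamped when it fills up."""
    capacity: int
    events: List[str] = field(default_factory=list)
    sealed_at: Optional[datetime] = None


class Partition:
    """Toy partition: a sequence of containers, the last one always open."""

    def __init__(self, container_capacity: int, retention: timedelta):
        self.capacity = container_capacity
        self.retention = retention
        self.containers = [Container(container_capacity)]

    def append(self, event: str, now: datetime) -> None:
        current = self.containers[-1]
        current.events.append(event)
        if len(current.events) >= self.capacity:
            # Container is full: stamp it and start a new one.
            current.sealed_at = now
            self.containers.append(Container(self.capacity))

    def sweep(self, now: datetime) -> None:
        """Discard whole sealed containers older than the retention window.
        Open (not yet full) containers are never examined."""
        self.containers = [
            c for c in self.containers
            if c.sealed_at is None or now - c.sealed_at <= self.retention
        ]

    def readable_events(self) -> List[str]:
        return [e for c in self.containers for e in c.events]
```

With a capacity of three and a one-day retention, two events written on day zero are still readable after a sweep on day five, because their container never filled. Once a third event seals that container, a later sweep drops all three at once.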

Nothing is wrong - keep reading events. If you're very concerned about older events, keep careful track of your offsets, or use a DateTime value when connecting so that you only read events from a specific point in time.
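If you would rather enforce the window yourself, the same idea can be applied on the client side. The sketch below is plain Python, not an Event Hubs client: ReceivedEvent, starting_point, and events_since are hypothetical names, and the events are assumed to carry an enqueued time stamp. It simply works out the start of the retention window and skips anything enqueued before it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Iterator


@dataclass
class ReceivedEvent:
    """Hypothetical shape of an event as seen by a reader (not an SDK type)."""
    offset: int
    enqueued_time: datetime
    body: str


def starting_point(now: datetime, retention: timedelta) -> datetime:
    """Earliest enqueued time the reader cares about."""
    return now - retention


def events_since(events: Iterable[ReceivedEvent],
                 start: datetime) -> Iterator[ReceivedEvent]:
    """Yield only events enqueued at or after the chosen starting point."""
    for event in events:
        if event.enqueued_time >= start:
            yield event
```

Passing the starting time when you connect, as suggested above, avoids reading the older events at all; filtering like this is only a fallback for a reader that has already received them.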