The Death of DateTime?

SQL Server 2008 added a new data type named “datetimeoffset”. This is similar to the old datetime data type, with the following significant differences:

· Internally, the time is stored in unambiguous UTC format

· The local time zone offset is stored along with the UTC time, which allows the time to be displayed as a local time value (or converted to any another time zone offset)

· The data type is capable of storing more precise times than datetime

 

So, when should you use datetimeoffset, and when should you use datetime? The first answer is that you have no choice at all if you aren’t using SQL 2008, since datetimeoffset was first added in this SQL version. If you are working with SQL 2008, let’s address the question by examining some problems with the older datetime type:

 

Problem 1: DateTime is inherently ambiguous

Suppose you are a consultant looking at an existing table with a datetime column. A row in this table tells you that some critical event occurred at 4:35pm. Is this the server’s local time? Local time for the end user’s machine? GMT, or Coordinated Universal Time (UTC)? Local time for a time zone selected by convention that may or may not match the server’s local time zone? Here in Microsoft, for example, some systems were designed to store Pacific time by convention, even though in some of these cases the SQL Server may reside in a data center that isn’t on the west coast. Other databases store UTC times, again by convention adopted by whoever designed those systems. So this is the first problem: datetime is ambiguous. A datetime value by itself actually does not identify a particular moment in time; it takes on an clear meaning only when you interpret it in the context of some assumed, and usually unenforced, time zone.

 

Problem 2: It is impractical to convert historical DateTime values to/from time zones

If you’ve ever built a data warehouse that consolidated data from several data sources, you may have struggled to convert various ambiguous local time representations into a consistent form. Some data sources store their times as UTC, some store local server time, others use a local time based on some end user’s time zone. You could just cram all of those values into a datetime column in your warehouse, but it would be impossible to interpret the times in a meaningful way. No matter what time zone you selected as a lens through which to interpret the data, it would be wrong for some values.

 

So you must expend some extra effort to figure out the implicit time zone for every data source in your warehouse, and then you have to write code that converts all of these various times to some consistent time zone, likely UTC time, in the central data warehouse. If that was the end of it, it would be bad enough. But Daylight Savings Time makes it next to impossible to do this conversion in a generic way. The problem is that different locales have different Daylight Savings Time rules. Many parts of the world don’t honor DST at all, some places do honor it but use half-hour offsets instead of full hour offsets, and various places begin and end DST on different dates. Here’s an concrete example: suppose you have the local datetime value 2008-03-05 08:30:00 in the database, and you need to convert this to UTC time. By interviewing the right DBA or by examining source code, you have determined that this database stores datetime values using local server time. The local server is in the Pacific time zone, and you’ve found that you can use a T-SQL expression like this to determine the current time zone offset for the local server:

 

                DATEDIFF (minute, GETUTCDATE(), GETDATE())

 

This tells you that the local server time is 7 hours behind UTC right now, so you should be able to add 7 hours to the local time to get the equivalent UTC time, right? That would be wrong; Daylight Savings Time is in effect in the U.S. today (April), but when this datetime value was collected back in March of last year, DST was not in effect. The correct time zone offset to use is UTC minus 8 hours. So you have to have knowledge of what the local time zone offset would have been on arbitrary past or future dates, and bake this knowledge into your conversion routine. If your data comes from a variety of locales, you have to have correct information about the time zone rules in every place your data comes from. To make matters worse, the rules for DST may change from year to year in the same locale due to legislative changes, so you have to capture different sets of rules for different ranges of dates within each region. Have you taken into account the fact that (most of) the state of Arizona doesn’t use DST? Or that Indiana didn’t use DST at all prior to 2006, but you do need to adjust for DST for any data that was captured on a server from Indiana after 2006? Does your conversion routine account for the fact that Daylight Savings Time in the U.S. was lengthened by about a month starting in 2007? This problem isn’t unique to the United States, and the situation can be even more grim if your data comes from more than one country. DBAs and developers in China, India, and Japan get off a little bit easier because DST is not observed in those countries, but they still have the problem to some degree if they ever need to consume data that originated in other places or push their data to a consumer in a different country.

 

Finally, there is small time window each year that will defeat even the world’s most intelligent time zone conversion routine. In 2009, DST in the United States will end at 2:00 am on November 1st. At that time the clocks will roll back an hour, to 1:00 am. In other words, each year there is a one-hour window during which times like 2009-11-01 01:35 am will actually occur twice. That ambiguity is completely intractable.

 

Problem 3: If you do manage to convert DateTime values collected in various places to a single time zone, you lose important information

Let’s suppose that a database you are working with stores UTC times. Good for you: all of your times are unambiguous. But unless your users all live in London or Lisbon (and DST is not in effect), UTC is generally not very meaningful to a user. You could theoretically convert the times to the end user’s local time zone (if it weren't for the inconvenient fact that this is impractical, as we just discovered). But what if you wanted to present the time in local time relative to the place where the timestamp was captured? For example, suppose you wanted to show records from a consolidated server health log as local times for the server where they were captured. You can’t. The information about the current local time zone offset at the moment the timestamp was collected was lost when you converted the difficult-to-work-with local time into that nice, pure UTC time.

 

 

These three problems can combine to create a real mess. Frankly, I think I might consider a career change if I was tasked with solving all of these problems in a large-scale project that consolidated historical data from many different places. Of course, you might be thinking, My datetime values are all local server time; their meaning is perfectly clear to me. Well, one day your company may expand and your little homegrown system might need to handle data from more than one region. Or you might need to import the data into a new system when your solution is thrown out for being too provincial :). Or the data in your local database might turn out to be needed in some central data warehouse that consolidates data from a variety of sources. You can save yourself and your successors some grief by using the more robust datetimeoffset data type from the start.

 

 

When to Use DateTimeOffset?

Because the datetimeoffset data type stores a UTC date internally, it’s free of the ambiguity that causes problems #1 and #2. And because it also stores the time zone offset that was current at the time the timestamp was generated, it doesn’t suffer from the data loss problem that you face if you store times as UTC datetime values (problem #3). In other words, with the same datetimeoffset value you can represent the value as a local time or easily convert it to a UTC time. SQL will do the right thing if you compare two datetimeoffset values, even if the values were captured from systems with very different time zone offsets.

 

So, let's return to the original question: When should you use datetimeoffset instead of datetime? The answer is: you should almost always use datetimeoffset. I’ll make the claim that there is only a single case where datetime is clearly the best data type for the job, and that’s when you actually require an ambiguous time. For example, if you wanted a column to record the fact that all stores in a chain should open at 8:00am local time (whatever the local time zone may be), you should use datetime. But any time you want to store a value that represents an absolute moment in time, you would be better off using datetimeoffset. For most applications, that means that just about everywhere you currently use datetime would be a good candidate for datetimeoffset.

 

Please don’t beat me up over the fact that SQL 2008’s DMVs still use datetime :). This is a known problem, and it’s on the books to look at for future versions. We’re facing the same problems that you’ll face in your existing systems: it’s hard to make a system-wide datatype change that doesn’t break someone somewhere. For any brand new SQL development work you do, though, I encourage you to pause and consider your choice carefully before using datetime. It may be an appropriate choice in some cases, but most of the time you’d probably be better off with datetimeoffset.

 

Retrieve the current time as a datetimeoffset (comparable to the venerable GETDATE function):

                SELECT SYSDATETIMEOFFSET()

 

Retrieve the server’s current time zone offset (the number of minutes before or after UTC):

                SELECT DATENAME (TZoffset, SYSDATETIMEOFFSET())

 

Convert from datetime to datetimeoffset (note that this uses the server’s current time zone offset, which could be inappropriate for historical dates):

                SELECT TODATETIMEOFFSET (datetimevalue, DATENAME (TZoffset, SYSDATETIMEOFFSET()))

 

Convert a datetimeoffset value (in this case, local server time returned by SYSDATETIMEOFFSET) to a new time zone offset:

SELECT SWITCHOFFSET (SYSDATETIMEOFFSET(), '-05:00')

 

Finally, be aware that .Net 2.0 SP1 added support for the datetimeoffset data type, so you can round-trip this nice new data type between SQL and a client app without any fuss.