Parsing Non-Standard Date and Time Formats [Ron Petrusha]

Frequently, particularly when dealing with remote data collection devices, an application receives string data containing date and time information that must be converted to either DateTime or DateTimeOffset values. When that data arrives in a non-standard format, the most commonly used parsing methods, the overloads of DateTime.Parse and DateTimeOffset.Parse, all throw a FormatException when they attempt the conversion, and their corresponding TryParse overloads all return false and assign MinValue to the method’s output argument. However, the .NET Framework provides two alternative overloaded methods, ParseExact and TryParseExact, that can parse the string representations of date and time values in such formats.

In parsing date and time strings, both Parse and TryParse use a DateTimeFormatInfo object that corresponds either to the current culture or, if one of the overloads with an IFormatProvider parameter is called, to the culture that corresponds to the IFormatProvider argument. The properties of this DateTimeFormatInfo object determine the acceptable date and time formats that a string can have in order to parse successfully. Even though the number of supported formats is typically sizable, the Parse and TryParse methods cannot parse strings in non-standard formats successfully.
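
To make the failure mode concrete, here is a minimal sketch that feeds a separator-free string to the standard parsing methods. The class and method names (StandardParseDemo, TryStandard) are ours and purely illustrative, and the invariant culture is passed only so that the result does not depend on the machine's regional settings.

```csharp
using System;
using System.Globalization;

public static class StandardParseDemo
{
   // Hypothetical helper: asks the standard parser to handle the string.
   public static bool TryStandard(string value, out DateTime result)
   {
      return DateTime.TryParse(value, CultureInfo.InvariantCulture,
                               DateTimeStyles.None, out result);
   }

   public static void Main()
   {
      string value = "20100501061245500";   // digits only, no separators
      DateTime dt;

      // TryParse reports failure through its return value and
      // leaves MinValue in the out argument.
      bool ok = TryStandard(value, out dt);
      Console.WriteLine("TryParse succeeded: {0}", ok);
      Console.WriteLine("Result equals MinValue: {0}", dt == DateTime.MinValue);

      // Parse reports the same failure by throwing.
      try
      {
         DateTime.Parse(value, CultureInfo.InvariantCulture);
      }
      catch (FormatException)
      {
         Console.WriteLine("Parse threw a FormatException.");
      }
   }
}
```

On a typical installation we would expect both calls to reject the string, which is the situation the rest of this article addresses.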

By contrast, the ParseExact and TryParseExact methods are ideal for such strings, since they require that a string exactly conform to a particular definable pattern if the parse operation is to succeed. They are useful when either of the following conditions holds true:

  • The precise format of a string containing date and time information is known in advance, as it is when the data is gathered by some data collection object or device.
  • User input must have a particular format.

The overloads of the ParseExact and TryParseExact methods of both the DateTime and DateTimeOffset types allow you to specify either a single pattern or an array of patterns. You define a pattern by specifying a format string that consists of either a standard format specifier or one or more custom format specifiers. The input string must then conform to that single pattern or to any one of the patterns in the array. Two of the overloads also include a DateTimeStyles parameter that allows you to define style elements (such as white-space characters) that may be included in the input, or to determine how the input string is to be interpreted (for example, as local or universal time).
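
As a quick sketch of these options working together, the following code passes an array of patterns (one standard format specifier and two custom format strings) along with a DateTimeStyles value to TryParseExact. The class name, the helper name, and the particular set of patterns are our own assumptions chosen for illustration; a real application would list whatever layouts its data source actually produces.

```csharp
using System;
using System.Globalization;

public static class PatternDemo
{
   // Hypothetical set of accepted layouts.
   private static readonly string[] Formats =
   {
      "s",                      // standard sortable pattern: yyyy-MM-ddTHH:mm:ss
      "yyyy-MM-dd HH:mm:ss",    // custom pattern
      "dd.MM.yyyy HH:mm"        // custom pattern
   };

   // The input must match one of the patterns exactly, except that
   // AllowWhiteSpaces tolerates extra white space in the input.
   public static bool TryParseAny(string value, out DateTime result)
   {
      return DateTime.TryParseExact(value, Formats,
                                    CultureInfo.InvariantCulture,
                                    DateTimeStyles.AllowWhiteSpaces,
                                    out result);
   }

   public static void Main()
   {
      string[] inputs = { "2010-05-01T06:12:45",
                          "  2010-05-01 06:12:45  ",
                          "01.05.2010 06:12",
                          "05/01/2010" };
      foreach (string input in inputs)
      {
         DateTime dt;
         if (TryParseAny(input, out dt))
            Console.WriteLine("'{0}' --> {1:s}", input, dt);
         else
            Console.WriteLine("'{0}' matches none of the patterns.", input);
      }
   }
}
```

Only the last input should be rejected, since it follows none of the three patterns.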

To see how the ParseExact and TryParseExact methods can help with parsing date and time strings, let’s take three examples. First, in some cases, date and time data is provided to an application as a set of numbers without any separators (for example, “20100501061245500”). The following code uses a custom date and time format string to parse such a string.

  [Visual Basic] 
Module Example
   Public Sub Main()
      ' HH reads the hour on a 24-hour clock; the input has no AM/PM designator.
      Dim fmt As String = "yyyyMMddHHmmssFFFFFFF"
      Dim value As String = "20100501061245500"
      Dim dt As DateTime = DateTime.ParseExact(value, fmt, Nothing)
      Console.WriteLine("{0} --> {1}", value, dt)
   End Sub
End Module

 [C#] 
using System;

public class  Example
{
   public static void Main()
   {
      // HH reads the hour on a 24-hour clock; the input has no AM/PM designator.
      string fmt = "yyyyMMddHHmmssFFFFFFF";
      string value = "20100501061245500";
      DateTime dt = DateTime.ParseExact(value, fmt, null);
      Console.WriteLine("{0} --> {1}", value, dt);
   }
}

 

The example displays the following output:

 20100501061245500 --> 5/1/2010 6:12:45 AM
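
Because a device can also send malformed data, a non-throwing variant is often convenient. The following is a minimal sketch, not a definitive implementation: the DeviceTimestamp class and its TryParse method are hypothetical names, and we assume the device reports a 14-digit timestamp on a 24-hour clock. Passing CultureInfo.InvariantCulture instead of null keeps the parse independent of the current user's regional settings.

```csharp
using System;
using System.Globalization;

public static class DeviceTimestamp
{
   // Assumed layout: four-digit year, then two digits each for month,
   // day, hour (24-hour clock), minute, and second.
   private const string Format = "yyyyMMddHHmmss";

   public static bool TryParse(string value, out DateTime result)
   {
      return DateTime.TryParseExact(value, Format,
                                    CultureInfo.InvariantCulture,
                                    DateTimeStyles.None, out result);
   }

   public static void Main()
   {
      string[] readings = { "20100501181245", "2010-05-01" };
      foreach (string reading in readings)
      {
         DateTime dt;
         if (TryParse(reading, out dt))
            Console.WriteLine("{0} --> {1:s}", reading, dt);
         else
            Console.WriteLine("{0}: not a valid device timestamp", reading);
      }
   }
}
```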

 

Second, in some cases, date and time strings include time zone information expressed as the time zone’s offset from Coordinated Universal Time (UTC). The z, zz, or zzz custom format specifiers can be used in a custom format string to handle such time zone offsets. For example, the following code uses both the ParseExact and TryParseExact methods to parse the string representation of a date and time that contains a time zone offset.

  [Visual Basic] 
Imports System.Globalization

Module Example 
   Sub Main() 
      Dim value As String = "12/15/2009 6:32 PM -05:00"
      Dim fmt As String = "MM/dd/yyyy h:mm tt zzz"
      Dim date1 As DateTime 
      If DateTime.TryParseExact(value, fmt, Nothing, _
                                DateTimeStyles.AllowWhiteSpaces, date1) Then
         Console.WriteLine("Converted '{0}' to {1}, Kind {2}", _
                           value, date1, date1.Kind) 
      Else 
         Console.WriteLine("Unable to convert '{0}' to a date.", value)
      End If

      Try 
         Dim date2 As DateTime = DateTime.ParseExact(value, fmt, _
                                 Nothing, DateTimeStyles.AdjustToUniversal) 
         Console.WriteLine("Converted '{0}' to {1}, Kind {2}", _ 
                           value, date2, date2.Kind) 
      Catch e As FormatException 
         Console.WriteLine("Unable to convert '{0}' to a date.", _
                           value) 
      End Try
   End Sub
End Module

 [C#] 
using System;
using System.Globalization;

class Example
{
   static void Main()
   {
      string value = @"12/15/2009 6:32 PM -05:00";
      string fmt = @"MM/dd/yyyy h:mm tt zzz";
      DateTime date1;

      if (DateTime.TryParseExact(value, fmt, null,
                   DateTimeStyles.AllowWhiteSpaces, out date1))
         Console.WriteLine("Converted '{0}' to {1}, Kind {2}",
                           value, date1, date1.Kind);
      else
         Console.WriteLine("Unable to convert '{0}' to a date.", value);

      try {
         DateTime date2 = DateTime.ParseExact(value, fmt, null,
                          DateTimeStyles.AdjustToUniversal);
         Console.WriteLine("Converted '{0}' to {1}, Kind {2}",
                           value, date2, date2.Kind); 
      }
      catch (FormatException)
      {
         Console.WriteLine("Unable to convert '{0}' to a date.", value); 
      } 
   }
}

The example displays the following output:

 Converted '12/15/2009 6:32 PM -05:00' to 12/15/2009 3:32:00 PM, Kind Local
Converted '12/15/2009 6:32 PM -05:00' to 12/15/2009 11:32:00 PM, Kind Utc

Note that the parsing method has not only converted a string to a DateTime value, but that in the process it has also performed a time zone conversion. In the first case, the time was adjusted to the time zone of the computer system on which the application was run (which was U.S. Pacific Standard Time in the case of our example). In the second case, because the ParseExact method was called with the styles parameter set to DateTimeStyles.AdjustToUniversal, the string was converted to Coordinated Universal Time.
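
The relationship between the two results can be checked directly. In the sketch below (the class and method names are our own, for illustration only), the same string is parsed once with DateTimeStyles.None and once with DateTimeStyles.AdjustToUniversal. The local result depends on the machine's time zone, but converting it to UTC should always yield the same instant as the second call.

```csharp
using System;
using System.Globalization;

public static class OffsetConversionDemo
{
   private const string Format = "MM/dd/yyyy h:mm tt zzz";

   // With no special style, the offset in the string is used to convert
   // the value to the local time zone (Kind is Local).
   public static DateTime ParseAsLocal(string value)
   {
      return DateTime.ParseExact(value, Format,
                                 CultureInfo.InvariantCulture,
                                 DateTimeStyles.None);
   }

   // AdjustToUniversal converts the value to UTC instead (Kind is Utc).
   public static DateTime ParseAsUtc(string value)
   {
      return DateTime.ParseExact(value, Format,
                                 CultureInfo.InvariantCulture,
                                 DateTimeStyles.AdjustToUniversal);
   }

   public static void Main()
   {
      string value = "12/15/2009 6:32 PM -05:00";
      DateTime local = ParseAsLocal(value);
      DateTime utc = ParseAsUtc(value);

      Console.WriteLine("Local: {0:s}, Kind {1}", local, local.Kind);
      Console.WriteLine("UTC:   {0:s}, Kind {1}", utc, utc.Kind);
      Console.WriteLine("Same instant: {0}", local.ToUniversalTime() == utc);
   }
}
```

Either way, the original -05:00 offset itself is no longer available once the DateTime value has been produced.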

It is possible to avoid the loss of time zone information in the input string by calling the DateTimeOffset.ParseExact or DateTimeOffset.TryParseExact method rather than the DateTime.ParseExact or DateTime.TryParseExact method, as the following example code shows.

  [Visual Basic] 
Imports System.Globalization

Module Example
   Public Sub Main()
      Dim value As String = "12/15/2009 6:32 PM -05:00"
      Dim fmt As String = "MM/dd/yyyy h:mm tt zzz"
      Dim date1 As DateTimeOffset

      If DateTimeOffset.TryParseExact(value, fmt, Nothing, _ 
                               DateTimeStyles.AllowWhiteSpaces, date1) Then
         Console.WriteLine("Converted '{0}' to {1}", value, date1)
      Else
         Console.WriteLine("Unable to convert '{0}' to a date.", value)
      End If
   End Sub
End Module

 [C#] 
using System;
using System.Globalization;

public class Example
{
   public static void Main()
   {
      string value = @"12/15/2009 6:32 PM -05:00";
      string fmt = @"MM/dd/yyyy h:mm tt zzz";
      DateTimeOffset date1;

      if (DateTimeOffset.TryParseExact(value, fmt, null,
                         DateTimeStyles.AllowWhiteSpaces, out date1))
         Console.WriteLine("Converted '{0}' to {1}", value, date1);
      else
         Console.WriteLine("Unable to convert '{0}' to a date.", value);
   }
}

The example displays the following output:

 Converted '12/15/2009 6:32 PM -05:00' to 12/15/2009 6:32:00 PM -05:00
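
To see what is preserved, we can inspect the parsed DateTimeOffset value. This short sketch (the class and method names are illustrative) reads the Offset, DateTime, and UtcDateTime properties, none of which depends on the time zone of the machine running the code.

```csharp
using System;
using System.Globalization;

public static class OffsetPreservationDemo
{
   public static DateTimeOffset Parse(string value)
   {
      return DateTimeOffset.ParseExact(value, "MM/dd/yyyy h:mm tt zzz",
                                       CultureInfo.InvariantCulture,
                                       DateTimeStyles.None);
   }

   public static void Main()
   {
      DateTimeOffset dto = Parse("12/15/2009 6:32 PM -05:00");

      // The offset from the input string is kept as-is.
      Console.WriteLine("Offset:      {0}", dto.Offset);
      // The clock time exactly as it appeared in the input.
      Console.WriteLine("DateTime:    {0:yyyy-MM-dd HH:mm}", dto.DateTime);
      // The same instant expressed in UTC.
      Console.WriteLine("UtcDateTime: {0:yyyy-MM-dd HH:mm}", dto.UtcDateTime);
   }
}
```

The sketch displays the following output:

 Offset:      -05:00:00
DateTime:    2009-12-15 18:32
UtcDateTime: 2009-12-15 23:32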

As the third and final example, assume that our application must parse a string in a form such as “12/15/2009 6:32 PM PST” submitted by a control device. Here the time zone is represented by a predefined time zone abbreviation rather than by an offset. Because the abbreviation is fixed, it can be included in the custom format string as literal text, and the following code successfully parses the string.

  [Visual Basic] 
Module Example
   Public Sub Main()
      Dim values() As String = {"12/15/2009 6:32 PM PST", _
                                "1/2/2010 4:18 AM PST", _
                                "01/16/2010 11:17 PM PST"}
      Dim fmt As String = "M/d/yyyy h:mm tt PST"
      Dim dt As DateTime

      For Each value As String In values
         Try
            dt = DateTime.ParseExact(value, fmt, Nothing)
            Console.WriteLine("'{0}' --> {1}", value, dt)
         Catch e As FormatException
            Console.WriteLine("'{0}': Bad Format", value)
         End Try
      Next
   End Sub
End Module

 [C#] 
using System;

public class Example
{
    public static void Main()
    {
        string[] values = { "12/15/2009 6:32 PM PST",
                            "1/2/2010 4:18 AM PST",
                            "01/16/2010 11:17 PM PST" };
        string fmt = "M/d/yyyy h:mm tt PST";
        DateTime dt;

        foreach (string value in values)
        {
            try
            {
                dt = DateTime.ParseExact(value, fmt, null);
                Console.WriteLine("'{0}' --> {1}", value, dt);
            }

            catch (FormatException)
            {
                Console.WriteLine("'{0}': Bad Format", value);
            }
        }
    }
}

The example displays the following output:

 '12/15/2009 6:32 PM PST' --> 12/15/2009 6:32:00 PM
'1/2/2010 4:18 AM PST' --> 1/2/2010 4:18:00 AM
'01/16/2010 11:17 PM PST' --> 1/16/2010 11:17:00 PM

In this case, the ParseExact method recognizes only a single time zone abbreviation at the end of the string. By calling another overload that accepts an array of custom format strings, we can extend the example so that it recognizes multiple time zone abbreviations. The following code uses the DateTimeOffset.TryParseExact method for this purpose.

  [Visual Basic] 
Imports System.Globalization

Module Example
   Public Sub Main()
      Dim values() As String = {"12/15/2009 6:32 PM PST", _
                                 "1/2/2010 4:18 AM PDT", _
                                 "01/16/2010 11:17 PM CST"}
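      ' Quote each abbreviation so that its letters (such as the M in MST)
      ' are matched literally rather than read as format specifiers.
      Dim fmts() As String = {"M/d/yyyy h:mm tt 'PST'", _
                               "M/d/yyyy h:mm tt 'PDT'", _
                               "M/d/yyyy h:mm tt 'MST'", _
                               "M/d/yyyy h:mm tt 'MDT'", _
                               "M/d/yyyy h:mm tt 'CST'", _
                               "M/d/yyyy h:mm tt 'CDT'", _
                               "M/d/yyyy h:mm tt 'EST'", _
                               "M/d/yyyy h:mm tt 'EDT'"}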
      Dim dt As DateTimeOffset

      For Each value As String In values
         If DateTimeOffset.TryParseExact(value, fmts, Nothing, _
                           DateTimeStyles.None, dt) Then
            Console.WriteLine("'{0}' --> {1}", value, dt)
         Else
            Console.WriteLine("Cannot parse '{0}'", value)
         End If
      Next
   End Sub
End Module

 [C#] 
using System;
using System.Globalization;

public class Example
{
   public static void  Main()
   {
      string[] values = { "12/15/2009 6:32 PM PST",
                          "1/2/2010 4:18 AM PDT",
                          "01/16/2010 11:17 PM CST" };
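      // Quote each abbreviation so that its letters (such as the M in MST)
      // are matched literally rather than read as format specifiers.
      string[] fmts = { "M/d/yyyy h:mm tt 'PST'",
                        "M/d/yyyy h:mm tt 'PDT'",
                        "M/d/yyyy h:mm tt 'MST'",
                        "M/d/yyyy h:mm tt 'MDT'",
                        "M/d/yyyy h:mm tt 'CST'",
                        "M/d/yyyy h:mm tt 'CDT'",
                        "M/d/yyyy h:mm tt 'EST'",
                        "M/d/yyyy h:mm tt 'EDT'" };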
      DateTimeOffset dt;

      foreach (string value in values)
      {
         if (DateTimeOffset.TryParseExact(value, fmts, null,
                               DateTimeStyles.None, out dt))
            Console.WriteLine("'{0}' --> {1}", value, dt);
         else
            Console.WriteLine("Cannot parse '{0}'", value);
      }
   }
}

The example displays the following output:

 '12/15/2009 6:32 PM PST' --> 12/15/2009 6:32:00 PM -08:00
'1/2/2010 4:18 AM PDT' --> 1/2/2010 4:18:00 AM -08:00
'01/16/2010 11:17 PM CST' --> 1/16/2010 11:17:00 PM -08:00

But since we are now parsing times from different time zones, the output reveals a significant limitation of this example: it doesn’t interpret the time zone abbreviation in our input strings, and so it assumes that each input string represents the date and time in the local system’s time zone (which in this case is the U.S. Pacific Standard Time zone). We can make our example time zone-aware by using a generic Dictionary object to look up the time zone identifier from its abbreviation, and supplying the identifier as an argument to the TimeZoneInfo.ConvertTimeBySystemTimeZoneId method, which here converts the parsed time to the local system’s time zone. The following is the revised, time zone-aware version of our example.

  [Visual Basic] 
Imports System.Collections.Generic
Imports System.Globalization
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      ' Create dictionary of time zones.
      Dim timeZones As New Dictionary(Of String, String)
      timeZones.Add("PST", "Pacific Standard Time")
      timeZones.Add("PDT", "Pacific Standard Time")
      timeZones.Add("MST", "Mountain Standard Time")
      timeZones.Add("MDT", "Mountain Standard Time")
      timeZones.Add("CST", "Central Standard Time")
      timeZones.Add("CDT", "Central Standard Time")
      timeZones.Add("EST", "Eastern Standard Time")
      timeZones.Add("EDT", "Eastern Standard Time")

      Dim values() As String = {"12/15/2009 6:32 PM PST", _
                                "1/2/2010 4:18 AM PDT", _
                                "01/16/2010 11:17 PM CST"}

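      ' Quote each abbreviation so that its letters (such as the M in MST)
      ' are matched literally rather than read as format specifiers.
      Dim fmts() As String = {"M/d/yyyy h:mm tt 'PST'", _
                              "M/d/yyyy h:mm tt 'PDT'", _
                              "M/d/yyyy h:mm tt 'MST'", _
                              "M/d/yyyy h:mm tt 'MDT'", _
                              "M/d/yyyy h:mm tt 'CST'", _
                              "M/d/yyyy h:mm tt 'CDT'", _
                              "M/d/yyyy h:mm tt 'EST'", _
                              "M/d/yyyy h:mm tt 'EDT'"}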

      Dim rgx As New Regex("\w(D|S)T", RegexOptions.IgnoreCase)
      Dim dt As DateTime
      Dim tz As String = Nothing

      For Each value As String In values
         Try
            tz = rgx.Match(value).Value
            dt = TimeZoneInfo.ConvertTimeBySystemTimeZoneId( _
                             DateTime.ParseExact(value, fmts, Nothing, DateTimeStyles.None), _
                             timeZones.Item(tz), _
                             TimeZoneInfo.Local.Id)
            Console.WriteLine("'{0}' --> {1}", value, dt)
         Catch e As FormatException
            Console.WriteLine("Cannot parse '{0}'", value)
         Catch e As TimeZoneNotFoundException
            Console.WriteLine("Cannot identify time zone {0}", tz)
         End Try
      Next
   End Sub
End Module

 [C#] 
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

public class Example
{
   public static void Main()
   {
      // Create dictionary of time zones.
      Dictionary<string, string> timeZones = new Dictionary<string, string>();
      timeZones.Add("PST", "Pacific Standard Time");
      timeZones.Add("PDT", "Pacific Standard Time");
      timeZones.Add("MST", "Mountain Standard Time");
      timeZones.Add("MDT", "Mountain Standard Time");
      timeZones.Add("CST", "Central Standard Time");
      timeZones.Add("CDT", "Central Standard Time");
      timeZones.Add("EST", "Eastern Standard Time");
      timeZones.Add("EDT", "Eastern Standard Time");

      string[] values = { "12/15/2009 6:32 PM PST",
                          "1/2/2010 4:18 AM PDT",
                          "01/16/2010 11:17 PM CST" };

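      // Quote each abbreviation so that its letters (such as the M in MST)
      // are matched literally rather than read as format specifiers.
      string[] fmts = { "M/d/yyyy h:mm tt 'PST'",
                        "M/d/yyyy h:mm tt 'PDT'",
                        "M/d/yyyy h:mm tt 'MST'",
                        "M/d/yyyy h:mm tt 'MDT'",
                        "M/d/yyyy h:mm tt 'CST'",
                        "M/d/yyyy h:mm tt 'CDT'",
                        "M/d/yyyy h:mm tt 'EST'",
                        "M/d/yyyy h:mm tt 'EDT'" };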

      Regex rgx = new Regex(@"\w(D|S)T", RegexOptions.IgnoreCase);
      DateTime dt;
      string tz = null;

      foreach (string value in values)
      {
         try {
            tz = rgx.Match(value).Value;
            dt = TimeZoneInfo.ConvertTimeBySystemTimeZoneId(
                                 DateTime.ParseExact(value, fmts, null, DateTimeStyles.None),
                                 timeZones[tz], TimeZoneInfo.Local.Id);
            Console.WriteLine("'{0}' --> {1}", value, dt);
         }
         catch (FormatException) {
            Console.WriteLine("Cannot parse '{0}'", value);
         }
         catch (TimeZoneNotFoundException) {
            Console.WriteLine("Cannot identify time zone {0}", tz);
         }
      }
   }
}

The example displays the following output:

 '12/15/2009 6:32 PM PST' --> 12/15/2009 6:32:00 PM
'1/2/2010 4:18 AM PDT' --> 1/2/2010 4:18:00 AM
'01/16/2010 11:17 PM CST' --> 1/16/2010 9:17:00 PM
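
If the goal is to keep the sender's offset rather than convert to local time, one alternative is to map each abbreviation to a fixed UTC offset and build a DateTimeOffset from it. The following is only a sketch under stated assumptions: the AbbreviationParser class and the offset table are ours, the table covers just the eight U.S. abbreviations used above, and each abbreviation is taken at face value (so "PDT" always means UTC-07:00, even for a winter date). This differs from the example above, which maps PST and PDT to the same system time zone.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class AbbreviationParser
{
   // Assumed fixed offsets; time zone abbreviations are not globally unique,
   // so this table is valid only for the U.S. zones listed here.
   private static readonly Dictionary<string, TimeSpan> Offsets =
      new Dictionary<string, TimeSpan>(StringComparer.OrdinalIgnoreCase)
      {
         { "PST", TimeSpan.FromHours(-8) }, { "PDT", TimeSpan.FromHours(-7) },
         { "MST", TimeSpan.FromHours(-7) }, { "MDT", TimeSpan.FromHours(-6) },
         { "CST", TimeSpan.FromHours(-6) }, { "CDT", TimeSpan.FromHours(-5) },
         { "EST", TimeSpan.FromHours(-5) }, { "EDT", TimeSpan.FromHours(-4) }
      };

   public static bool TryParse(string value, out DateTimeOffset result)
   {
      result = DateTimeOffset.MinValue;
      if (value == null) return false;

      // Split the trailing abbreviation from the date and time portion.
      value = value.Trim();
      int split = value.LastIndexOf(' ');
      if (split < 0) return false;

      TimeSpan offset;
      if (!Offsets.TryGetValue(value.Substring(split + 1), out offset))
         return false;

      // Parse the rest with a single pattern; the result has Kind Unspecified,
      // so it can be paired with any offset.
      DateTime clock;
      if (!DateTime.TryParseExact(value.Substring(0, split), "M/d/yyyy h:mm tt",
                                  CultureInfo.InvariantCulture,
                                  DateTimeStyles.None, out clock))
         return false;

      result = new DateTimeOffset(clock, offset);
      return true;
   }

   public static void Main()
   {
      string[] values = { "12/15/2009 6:32 PM PST",
                          "1/2/2010 4:18 AM PDT",
                          "01/16/2010 11:17 PM CST" };
      foreach (string value in values)
      {
         DateTimeOffset dto;
         if (TryParse(value, out dto))
            Console.WriteLine("'{0}' --> {1:yyyy-MM-dd HH:mm zzz}", value, dto);
         else
            Console.WriteLine("Cannot parse '{0}'", value);
      }
   }
}
```

This approach needs only one format string and no system time zone lookup, at the cost of maintaining the offset table by hand.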