DateTime Compatibility Issue


The BCL team is having some design discussions on how to fix a bug that was recently found.  I thought you might want to chime in as some of the implications have wide applicability.  


 


Here is the thread from Anthony… comments welcome.


 


 


I would like to get some discussion of this design/compatibility issue:


 


<BugNumber>               Parse and TryParse on all numerical base types should check for negative NumberStyles


 


The Issue


The Parse and TryParse routines on our base data types that take either NumbersStyles or DateTimeStyles effectively don’t do any range checks on them. You can do the following and it will not throw:


 


            Int32.Parse(s, (NumberStyles) 0xFFFFFFFF);


            Int32.Parse(s, (NumberStyles) (-1));


 


Strangely enough we actually do a little bit of validation on the styles for both numbers and dates. We validate certain corner cases where styles or formats don’t make any sense together. However, nothing blows up if you use a style value that we have not defined yet.


 


I don’t think there is any doubt that whatever we do, it should be consistent across data types and across both Parse and TryParse.


 


The issue is: given that we already shipped 2 versions with no range checks, should we put them in now? There is a slight backward compatibility risk, and some forward compatibility concerns.


 


A broader issue is: how important are range checks for flags-style enums, anyway? I would think that we would advocate range checks for sure if we could wind the clock back to V1.0. However, if someone from the old days remembers a reason why these were left off, please speak up.


 


Considerations


This issue is practically performance neutral. We already have a site that does some checks and throws if they are invalid, so all this is doing is adding a couple of branches. We would not fix this by calling Enum.IsDefined, rather we would compare against a special range and throwing.


 


The backward compatibility issue is that someone could have written pathological code that passes in a bogus enum value, but otherwise worked fine, and adding these range checks would break them at upgrade time. They are somewhat at risk anyway because we could decide to attribute meaning to those flags in a future version anyway; it could just do something unexpected. By missing the argument checks in V1.0, we effectively made any further additions to the enums breaking changes. If we want to be really hard core we need to freeze these enums and create new ones if we want any more opt-in behavior.


 


A forward compatibility question arises: Supposing you want to use a new flag value in Whidbey, but you app also works down-level or locked back. If we follow this pattern, the code would just throw down level. If we don’t, the code would not throw, but it would not use the opt-in feature and might fail in a more subtle way. Which is worse? They are both kind of bad in a way.


 


The difficulty of this scenario is: the reason it feels safe to break this is because we think people won’t be doing this. However, if that is the case, why would we bother to fix it? The argument is that more people will use the framework in the long run and would rather get a good argument check. The counter argument is that letting people know they have used an invalid flag at upgrade time is not the idea timing. A CDP might be better.


 


Options


1.       Hard core non-breaking. We don’t add the argument checks, and we freeze the enums. We would need to add new enums for both numbers and dates for any opt-in behavior. We would need to take a DCR to change the existing flags that have been added already to DateTimeStyles and add them in a different way.


2.       Hard core argument-checking. We add the argument checks in Whidbey, and consider it to be a reasonable breaking change for the future of the platform. We then keep extending those enums as the need arises.


3.       Loosy-goosy. We could say that we don’t want to put the argument checks in, but we don’t want to freeze the enums either. If people use invalid bits, they do so at their own risk and they might get difference behavior across versions, but we won’t break them if we don’t strictly need to.


 


Kit said definitely #2. I originally said #3, but I think I have talked myself into #2 as well. However, with any potentially breaking change, I want to get some broader opinion on it.


 


Does anyone see any objection to doing #2?


 


Side Issues


The correct way to validate a flags enum is actually quite hard. You can’t use the tantalizing Enum.IsDefined. The correct way is to hack up an “inverse” of the defined bits and compare against that, which is very hard to discover and icky to code. Alternatively, you can do a range check based on the unused bits, that that is also very counter-intuitive.


 


One problem is that you can’t just create a magic API on the Enum to fix either flags or non-flags range checks, because you explicitly want to bake in the values at compile time rather than run-time. It would be good if there was a magic compile intrinsic like Enum.IsValidRange or Enum.IsValidFlags that got expanded to a range check at compile time instead of a function call. Possibly we should discuss this idea with the C# team for the future.


 


One cheat is that we could create an API, and it would be safe for us to use them because we would always bind to the same version of the Enum class. However, we would have the same problem with Enum.IsDefined and we would have to advise other library builders not to use it. The performance of these sorts of APIs is a problem also.


 

Comments (13)

  1. Chance Gillespie says:

    I think if you break it down into a more fundamental meaning then the absolute wrong choice is completely obvious.

    What you are essentially asking is “Do we put up a canopy to protect the small amount of ill written code that would be affected by this at the cost of fracturing the framework and causing future code to shift gears to compensate, or do we keep future code as a priority”.

    To me at least, the answer is clear… Sacrifice the few for the benefit of the many. Making a change like freezing the enum and introducing a new one just to protect that small amount of code (which honestly should be dealing with the known enum values now and providing error handling for unknown values to future proof against breaking changes) would be far more harmful than beneficial. I don’t think many people would argue that option #1 is out of the question (please, PLEASE don’t let that happen, I beg you!!). If we get caught up in situations like this where we don’t take to heart the changes of the platform as a usable development tool we are going to end up right back where we were with Win32/COM: a big, disgusting, irritable, incomprehensible mess (I used to show pages of DirectX 3 code to kids on Halloween to send them screaming and crying in fear =) ).

    The choice between #2 and #3 are not as clear cut, however. I, myself, am firmly on the side of #2. If you leave things open, and just not bother with doing the checks then you have succeeded in sheltering those few from the breaking changes but you have also denied future code of a protection that you would expect from .NET (and WinFX in the future.. decisions like this can have a far greater impact on future decisions than you might first imagine).

    So it all boils down to a quantifiable choice. Protect the few at the cost of the many, or sacrifice the few for the benefit of the many. You mention that we (yeah, I feel a part of a community with .NET so I say we =) ) are already at the second revision of the platform with the 3rd on it’s way. Personally I think that v1.0 and v1.1 of the framework are so closely related that v.1.1 really should have been called v1.0.1. From what I have seen of v1.2 the changes are so massive that it is unquestionably deserving of the title v2.0. Generics alone give it that right (who knows, maybe Microsoft will give it that moniker in the end). Since I see the next release of the framework as the second real version, it pains me to think that we may end up setting a precedent by introducing these sorts of splintering changes into a young platform that has so much potential to learn from past mistakes and not take road into API decay.

    Then again, maybe I just need to drink another beer and relax =)

    Chance

  2. J. Daniel Smith says:

    From reading the various MS blogs, there seems to be almost an undue emphasis on avoiding "breaking changes". I think that I pretty much agree is Chance Gillespie – do things the "right way", even if that means some pain for a handful of people.

    And that "pain" is really only going to hurt in fairly isolated cases such as trying to run an old(er) version of an application on a new(er) version of the .NET framework. Otherwise, what’s the big deal? We’re doing a new release of our app, moving to a new release of Visual Studio, and we’ve got to change some code. While that sucks, and it would be nice if we didn’t have to change anything; it’s not the end of the world.

    Dan

  3. +1 on #2. Breaking change is OK when it breaks broken code.

  4. Tom McKearney says:

    Absolutely #2. I can’t stand it when people use casting to get around things that were specifically designed to prevent them from entering the wrong value.

  5. Ben Maurer says:

    I see no real reason to test for the values. If someone wants to pass junk to a function, just let them do that (unless it is a security risk). I can’t think of any case where someone would be doing this other than just to probe the code.

    It would really sadden me to see enum checking become the de facto standard, which is basically what will happen if Microsoft goes this way. It is an issue that will never happen by mistake. You add no value to the API really by checking for the values.

    WRT versioning, I think that you should only mark the assembly as compat with a specific version of the fx if you can compile and run against it. This solves any issue with a function acting weirdly on an old version of the fx.

  6. Ian Ringrose says:

    Make the “breaking” change in 1.2, as is may find a bug in software that I am writing. At the same time, stop apps that are compiled against 1.1 from running with 1.2 ales the app.config file says they can. Ship version 1.1 (and 1.0) of the framework with all Microsoft products (including server packs that have version 1.2 of the framework. In other works make side by side running of the framework the normal case.

    What happens with the new SQL server, if there are two database in the same instance, and one database want versions 1.2 of the framework and other database wants version 1.3 of the framework?

  7. Tom McKearney says:

    It is an issue that will never happen by mistake

    I don’t agree. If someone persists a structure or class with an Enum value in it to disk as a text file or XML or a database record and someone else goes and modifies the record manually, the resulting object will have an invalid value for an enumerator. I want to know when that happens!

    The whole point to an enumeration is to constrain data to a particular set of values. If the language allows you to get around that, then every piece of code that reads and writes enums or accepts them as parameters has to manually bounds check them! That’s completely unacceptable.

    When you define an enumerated type as a parameter, that is a contract with the caller. You shouldn’t be able to pass anything you want. As far as I’m concerned, in the right circumstances, this COULD be a security issue.

    Tom

  8. Gavin Gregson says:

    One more for #2. Passing illegal values to a function is an error and should be treated as such. Definitely don’t do #1, that’s a nasty slipperly slope.