A new tool in the battle for consistency of WinFX

One of the folks on the FxCop team gave me an idea for a new tool in the battle for at least naming-level consistency across WinFX. We always find ourselves asking questions about the “right” way to case certain terms. Recall the great ID debates, Ok vs. OK, and a recent post on Callback; there is a new one every week. Given that we are getting close to being done with V2 of the .NET Framework, the most common answer to these questions is “what is the prior art?” That is, what have we already begun teaching customers to expect? Unless we have a very, very good reason, we should stick with the prior art. But it can be hard to find the prior art, especially if there is mixed history (we screwed up in V1 or V1.1 and did things two or even three different ways).

So, I hacked together (and I do mean hacked, so I can’t share the code or all the data just yet) a little tool that reflects over the framework, pulls out all the identifiers (type names, member names, etc.) and breaks them up into unique terms. For example, “CallbackEventReference” breaks into the terms “CallbackEventReference”, “CallbackEvent”, “EventReference”, “Callback”, “Event” and “Reference”. It then tracks all usage of those terms in any other identifier and flags any differences in casing.
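To make the term breakdown concrete, here is a rough Python sketch of the splitting step. It is not the tool itself (that one reflects over the framework assemblies and its code is not shared here); `split_words` and `terms` are names made up for illustration, and the word-splitting rule is an assumption: a word is a capitalized word, a run of capitals, or a run of digits.

```python
import re

# Assumed word rule (not taken from the real tool):
#   a run of capitals not followed by a lowercase letter (an acronym),
#   or an optionally capitalized lowercase word, or a run of digits.
_WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def split_words(identifier):
    """Split an identifier such as 'FromFileTimeUtc' into its words."""
    return _WORD.findall(identifier)


def terms(identifier):
    """Return every contiguous run of words in the identifier, longest first."""
    words = split_words(identifier)
    result = []
    for length in range(len(words), 0, -1):
        for start in range(len(words) - length + 1):
            result.append("".join(words[start:start + length]))
    return result


print(terms("CallbackEventReference"))
# ['CallbackEventReference', 'CallbackEvent', 'EventReference',
#  'Callback', 'Event', 'Reference']
```

Note that under this rule “CallBack” would split into two words while “Callback” stays as one, which is exactly the kind of difference the tool is meant to surface.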

For example, consider the term “Callback”. Should it be “Callback” or “CallBack”? Going by how it is used in current Whidbey builds, consistency speaks 70 to 5 in favor of “Callback”.

Now for some trivia:

  • There are 50091 unique terms in the build of Whidbey I ran this over
  • 49473 of them (just under 99%) are always used consistently in the .NET Framework. That is an ‘A’ in any book!
  • The most commonly used terms in publicly exposed identifiers, with their usage counts:

      .ctor     6387
      Get       4358
      To        1896
      Type      1534
      Is        1403
      Name      1287
      Data      1284
      Invoke    1284
      Item      1113
      value__   1049
      Add       1030

.ctor and value__ are both interesting… shall we have a quiz on where they come from? .ctor is relatively easy, but value__ is a little more obscure, I think. Comment on this post with your guesses.

It is also interesting that “Is” showed up so high, in light of the recent decision we made on IsXxx properties.

  • Where we do conflict, there are usually only two casings, with most of the usage falling on one of them. In some cases, though, we have three different casings, as in this example:

      · FileTime 4
          o System.DateTime::public static DateTime FromFileTime(Int64 fileTime);
          o System.DateTime::public static DateTime FromFileTimeUtc(Int64 fileTime);
          o System.DateTime::public Int64 ToFileTime();
          o System.DateTime::public Int64 ToFileTimeUtc();
      · FILETIME 2
          o System.Runtime.InteropServices.FILETIME
          o System.Runtime.InteropServices.ComTypes.FILETIME
      · Filetime 1
          o System.Data.OleDb.OleDbType::Filetime
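The flagging step can be sketched the same way. The Python below is only an illustration under assumptions, not the actual tool: it takes a hand-typed list of the type and member names from the example above (parameter names are left out), splits each into terms using an assumed word rule, groups the terms case-insensitively, and reports any group that shows up with more than one casing. `terms` and `casing_conflicts` are hypothetical names.

```python
import re
from collections import Counter, defaultdict

# Assumed word rule: acronym run, capitalized/lowercase word, or digit run.
_WORD = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def terms(identifier):
    """Return every contiguous run of words in the identifier."""
    words = _WORD.findall(identifier)
    return [
        "".join(words[start:end])
        for start in range(len(words))
        for end in range(start + 1, len(words) + 1)
    ]


def casing_conflicts(identifiers):
    """Map each lowercased term to its casings, keeping only the conflicts."""
    usage = defaultdict(Counter)
    for identifier in identifiers:
        # Count a term at most once per identifier.
        for term in set(terms(identifier)):
            usage[term.lower()][term] += 1
    return {key: dict(count) for key, count in usage.items() if len(count) > 1}


# Type and member names copied from the FileTime example above.
NAMES = [
    "FromFileTime", "FromFileTimeUtc", "ToFileTime", "ToFileTimeUtc",
    "FILETIME", "FILETIME",  # the InteropServices and ComTypes structs
    "Filetime",              # the OleDbType enum member
]

for key, casings in casing_conflicts(NAMES).items():
    print(key, casings)
```

On this input the only group flagged is “filetime”, with FileTime used 4 times, FILETIME 2 times and Filetime once, matching the counts listed above.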

Anyway, I hope you find this useful. If there is enough demand, I will post the full spreadsheet of all Whidbey usage when the beta comes out.