Hugarian notation - it's my turn now :)

Article
06/22/2004

Following on the heals of Eric Lippert’s posts on Hungarian and of course Rory Blyth’s classic “Die, Hungarian notation… Just *die*”, I figured I’d toss my hat into the fray (what the heck, I haven’t had a good controversial post in a while).

One thing to keep in mind about Hungarian is that there are two totally different Hungarian implementations out there.

The first one, which is the one that most people know about, is “Systems Hungarian”. System’s Hungarian is also “Hungarian-as-interpreted-by-Scott-Ludwig” (Edit: For Scott's side of this comment, see here - the truth is better than my original post). In many ways, it’s a bastardization of “real” (or Apps) Hungarian as proposed by Charles Simonyi.

Both variants of Hungarian have two things in common. The first is the concept of a type-related prefix, and the second is a suffix (although the Systems Hungarian doesn’t use the suffix much (if at all)). But that’s where the big difference lies.

In Systems Hungarian, the prefix for a type is almost always related to the underlying data type. So a parameter to a Systems Hungarian function might be “dwNumberOfBytes” – the “dw” prefix indicates that the type of the parameter is a DWORD, and the “name” of the parameter is “NumberOfBytes”. In Apps Hungarian, the prefix is related to the USE of the data. The same parameter in Apps Hungarian is “cb” – the “c” prefix indicates that the parameter is a type, the “b” suffix indicates that it’s a byte parameter.

Now consider what happens if the parameter is the number of characters in a string. In Systems Hungarian, the parameter might be “iMaxLength”. It might be “cchWideChar”. There’s no consistency between different APIs that use Systems Hungarian. But in Apps Hungarian, there is only one way of representing the parameter; the parameter would be “cch” – the “c” prefix again indicates a count, the “ch” type indicates that it’s a character.

Now please note that most developers won’t use “cch” or “cb” as parameters to their routines in Apps Hungarian. Let’s consider the Win32 lstrcpyn function:

  LPTSTR lstrcpyn(     
     LPTSTR lpString1,
    LPCTSTR lpString2,
    int iMaxLength
);

This is the version in Systems Hungarian. Now, the same function in Apps Hungarian:

  LPTSTR Szstrcpyn(     
     LPTSTR szDest,
    LPCTSTR szSrc,
    int cbLen
);

Let’s consider the differences. First off, the name of the function changed to reflect the type returned by the function – since it returns an LPTSTR, which is a variant of a string, the function name changed to “SzXxx”. Second, the first two parameters name changed. Instead of “lpString1” and “lpString2”, they changed to the more descriptive “szSrc” and “szDest”. The “sz” prefix indicates that the variable is a null terminated string. The “Src” and “Dest” are standard suffixes, which indicate the “source” and “destination” of the operation. The iMaxLength parameter which indicates the number of bytes to copy is changed to cbLen – the “cb” prefix indicates that it’s a count of bytes, the standard “Len” suffix indicates that it’s a length to be copied.

The interesting thing that happens when you convert from Systems Hungarian to Apps Hungarian is that now the usage of all the parameters of the function becomes immediately clear to the user. Instead of the parameter name indicating the type (which is almost always uninteresting), the parameter name now contains indications of the usage of the parameter.

The bottom line is that when you’re criticizing Hungarian, you need to understand which Hungarian you’re really complaining about. Hungarian as defined by Simonyi isn’t nearly as bad as some have made it out to be.

This is not to say that Apps Hungarian was without issue. The original Hungarian specification was written by Doug Klunder in 1988. One of the things that was missing from that document was a discussion about the difference between “type” and “intent” when defining prefixes. This can be a source of a great confusion when defining parameters in Hungarian. For example, if you have a routine that takes a pointer to a “foo” parameter to the routine, and internally the routine treats the parameter as single pointer to a foo, it’s clear that the parameter name should be “pfoo”. However, if the routine treats the parameter as an array of foo’s, the original document was not clear about what should happen – should the parameter be “pfoo” or “rgfoo”. Which wins, intent or type? To me, there’s no argument, it should be intent, but there have been some heated debates about this over the years. The current Apps Hungarian document is quite clear about this, intent wins.

One other issue with the original document was that it predated C++. So concepts like classes weren’t really covered and everyone had to come up with their own standard. At this point those issues have been resolved. Classes don’t have a “C” prefix, since a class is really just a type. Members have “m_” prefixes before their actual name. There are a bunch of other standard conventions but they’re relatively unimportant.

I used Hungarian exclusively when I was in the Exchange team; my boss was rather a Hungarian zealot and he insisted that we code in strict Apps Hungarian. Originally I chafed at it, having always assumed that Hungarian was stupid, but after using it for a couple of months, I started to see how it worked. It certainly made more sense than the Hungarian I saw in the Systems division. I even got to the point where I could understand what an irgch would without even flinching.

Now, having said all that, I don’t use Hungarian these days. I’m back in the systems division, and I’m using a home-brewed coding convention that’s based on the CLR standards, with some modifications I came up with myself (local variables are camel cased, parameters are Pascal cased (to allow easy differentiation between parameters and local variables), class members start with _ as a prefix, globals are g_Xxx). So far, it’s working for me.

I’ve drunk the kool-aid from both sides of the Hungarian debate though, and I’m perfectly happy working in either camp.

Hugarian notation - it's my turn now :)

Additional resources