Hungarian Notation

03/07/2005

We've been having an internal discussion recently about coding guidelines and the rules that should be in place to create the "best" code possible. "Best" is, of course, open to interpretation: readability, maintainability, perf, etc. all play into it. One of the questions that has come up is what sort of naming convention we should be using. Considering that we're all programmer geeks, we want to come up with simple and clear rules that everyone can follow. Of course, when it comes to simple rules for naming, one of the first things that springs to mind is Hungarian Notation (HN). There are wildly mixed feelings about HN here, and I wanted to hear from you whether you use it and how you feel about it.

For those who don't know, HN was created by Charles Simonyi at Microsoft. You can read more about it at https://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnvs600/html/HungaNotat.asp

Condensed, the goals of HN are as follows.

The names will be mnemonic in a very specific sense: If someone remembers the type of a quantity or how it is constructed from other types, the name will be readily apparent.
The names will be suggestive as well: We will be able to map any name into the type of the quantity, hence obtaining information about the shape and the use of the quantity.
The names will be consistent because they will have been produced by the same rules.
The decision on the name will be mechanical, thus speedy.
Expressions in the program can be subjected to consistency checks that are very similar to the "dimension" checks in physics.
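
To make that last goal a bit more concrete, here is a minimal sketch in C of the kind of "dimension" check a reader can do by eye when names carry their type. Only the row tag comes from the quoted text; the col tag, the ROW and COL typedefs, the crowMax and ccolMax limits, and the OffsetFromRowCol helper are all made up for this illustration.

```c
#include <assert.h>

/* Hypothetical tags for this sketch: "row" is a row index, "col" is a
   column index. Both are plain ints to the compiler, so only the names
   carry the distinction. */
typedef int ROW;
typedef int COL;

/* Assumed grid size; the "c" prefix marks a count. */
enum { crowMax = 25, ccolMax = 80 };

/* Map a (row, col) pair to an offset in a flat, row-major buffer. */
int OffsetFromRowCol(ROW row, COL col)
{
    assert(row >= 0 && row < crowMax);
    assert(col >= 0 && col < ccolMax);
    return row * ccolMax + col;
}

/* A call site where the prefixes line up with the parameters. */
int OffsetOfExampleCell(void)
{
    ROW rowFirst = 2;
    COL colFirst = 5;

    /* OffsetFromRowCol(colFirst, rowFirst) would compile just as well;
       the mismatched prefixes are what make the mistake visible. */
    return OffsetFromRowCol(rowFirst, colFirst);
}
```

The compiler is no help here since ROW and COL are both int, which is exactly why the check has to happen in the reader's head.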

The specific rules are as follows:

1. Quantities are named by their type possibly followed by a qualifier. A convenient (and legal) punctuation is recommended to separate the type and qualifier part of a name. (In C, we use a capital initial for the qualifier as in rowFirst: row is the type; First is the qualifier.)
2. Qualifiers distinguish quantities that are of the same type and that exist within the same naming context. Note that contexts may include the whole system, a block, a procedure, or a data structure (for fields), depending on the programming environment. If one of the "standard qualifiers" is applicable, it should be used. Otherwise, the programmer can choose the qualifier. The choice should be simple to make, because the qualifier needs to be unique only within the type and within the scope, a set that is expected to be small in most cases. In rare instances more than one qualifier may appear in a name. Standard qualifiers and their associated semantics are listed below. An example is worthwhile: rowLast is a type row value; that is, the last element in an interval. The definition of Last states that the interval is "closed"; that is, a loop through the interval should include rowLast as its last value.
3. Simple types are named by short tags that are chosen by the programmer. The recommendation that the tags be small is startling to many programmers. The essential reason for short tags is to make the implementation of rule 4 realistic. Other reasons are listed below.
4. Names of constructed types should be constructed from the names of the constituent types. A number of standard schemes for constructing pointer, array, and different types exist. Other constructions may be defined as required. For example, the prefix p is used to construct pointers. prowLast is then the name of a particular pointer to a row type value that defines the end of a closed interval. The standard type constructions are also listed below.
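
Here is a small sketch in C of how those rules play out on the names the text itself uses: rowFirst, rowLast, and prowLast. The ROW typedef and the two helper functions are hypothetical, written only to show the tag, the qualifier, the p prefix, and the "closed interval" meaning of Last.

```c
#include <assert.h>
#include <stddef.h>

typedef int ROW;   /* "row" is the type tag */

/* Count the rows in the closed interval [rowFirst, rowLast].
   The loop uses <= because Last is defined as inclusive. */
int CountRowsInInterval(ROW rowFirst, ROW rowLast)
{
    int crow = 0;  /* "c" prefix: a count, here a count of rows */
    ROW row;

    for (row = rowFirst; row <= rowLast; row++)
        crow++;
    return crow;
}

/* prowLast: "p" + "row" + "Last", a pointer to the row that ends
   a closed interval. */
void SetLastRow(ROW *prowLast, ROW rowNew)
{
    assert(prowLast != NULL);
    *prowLast = rowNew;
}
```

With these definitions, CountRowsInInterval(3, 7) visits rows 3 through 7 and returns 5, since both endpoints are included.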

It all seems well and good, but I end up finding code written this way completely unreadable. One of the reasons for this might be the following suggestion: "Conclusion: Do not use qualifiers when not needed, even if they seem valuable."

Wow... so you end up with code that looks like:

#include "sy.h"
extern int *rgwDic;
extern int bsyMac;
struct SY *PsySz(char sz[])
{
    char *pch;
    int cch;
    struct SY *psy, *PsyCreate();
    int *pbsy;
    int cwSz;
    unsigned wHash=0;
    pch=sz;
    while (*pch!=0)
        wHash=(wHash<<5)+(wHash>>11)+*pch++;
    cch=pch-sz;
    pbsy=&rgbsyHash[(wHash&077777)%cwHash];
    for (; *pbsy!=0; pbsy = &psy->bsyNext)
    {
        char *szSy;
        szSy= (psy=(struct SY*)&rgwDic[*pbsy])->sz;
        pch=sz;
        while (*pch==*szSy++)
        {
            if (*pch++==0)
                return (psy);
        }
    }
    cwSz=0;
    if (cch>=2)
        cwSz=(cch-2)/sizeof(int)+1;
    *pbsy=(int *)(psy=PsyCreate(cwSY+cwSz))-rgwDic;
    Zero((int *)psy,cwSY);
    bltbyte(sz, psy->sz, cch+1);
    return(psy);
}

I dunno, but I can't read that code at all. Let's say I did know Hungarian; would that help? I'm not so sure. Starting at the top:

rgwDic. It's an array of words called "dic". Not sure what "dic" is, but maybe it's a dictionary. Ok, so a dictionary maps keys to values somehow. But what are the keys, and what are the values? Is it a dictionary that uses hashes? I have no idea. I really don't have a single clue what rgwDic is right now. Amazingly, Simonyi recommends that the name actually be grpsy. grpsy... I would be completely lost with that. Ok, on to the next field.

bsyMac. No clue. We're doing something with a SY type... so it's like the last SY out there... Of course, I have no idea what an SY is... so I'm still clueless.

char* pch. Ok. It's a precompiled header. Just kidding :) We have some string. I would prefer std::string, but that's just me.

int cch. Some count of characters. Is it related to pch? I have no idea.

Ok, some local function def follows.

And at this point I'm completely lost. I'm not even going to go on to the rest of the code. The lack of clear names has me completely confounded. I can't tell how things are related, and I'm scared out of my mind about touching even the slightest character in this code.

Have you had experience using Hungarian in a project? Did it turn out to be a good thing, a bad thing, or something you didn't even notice? Personal experiences would be much appreciated.

---

Note: we've been discussing coding conventions in the context of writing C# code (if that helps).
