# Why do atoms start at 0xC000?

There are two types of atoms, so-called integer atoms, which are just small integers, and, um, atoms that don't have a special adjective; these plain vanilla atoms come from functions like `AddAtom`. (For the purpose of this discussion, I'll call them string atoms.) The atom zero is invalid (`INVALID_ATOM`); atoms 1 through `MAXINTATOM-1`† are integer atoms, and atoms from `MAXINTATOM` through `0xFFFF` are string atoms. Why is the value of `MAXINTATOM` `0xC000`?
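To make the ranges concrete, here is a minimal sketch in portable C. The `SKETCH_` constants simply mirror the values given above, and `classify_atom` and `check_atom_boundaries` are hypothetical helpers made up for illustration, not anything from a Windows header.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the text: zero is invalid, 0xC000 is the first string atom. */
#define SKETCH_INVALID_ATOM 0x0000u
#define SKETCH_MAXINTATOM   0xC000u

typedef enum { ATOM_KIND_INVALID, ATOM_KIND_INTEGER, ATOM_KIND_STRING } atom_kind;

/* Hypothetical helper: classify a 16-bit atom purely by its numeric range. */
atom_kind classify_atom(uint16_t atom)
{
    if (atom == SKETCH_INVALID_ATOM) return ATOM_KIND_INVALID;
    if (atom < SKETCH_MAXINTATOM)    return ATOM_KIND_INTEGER; /* 1 .. MAXINTATOM-1 */
    return ATOM_KIND_STRING;                                   /* MAXINTATOM .. 0xFFFF */
}

/* Spot-check the boundaries described in the text. */
void check_atom_boundaries(void)
{
    assert(classify_atom(0x0000) == ATOM_KIND_INVALID);
    assert(classify_atom(0x0001) == ATOM_KIND_INTEGER);
    assert(classify_atom(0xBFFF) == ATOM_KIND_INTEGER);
    assert(classify_atom(0xC000) == ATOM_KIND_STRING);
    assert(classify_atom(0xFFFF) == ATOM_KIND_STRING);
}
```

Note that the integer test is a strict less-than against `SKETCH_MAXINTATOM`, which is what you get when "max" names the first value that is out of range.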

The reason has its roots in 16-bit Windows. Atoms are kept in a, well, atom table. The details of the atom table aren't important aside from the fact that the nodes in the atom table are allocated from the heap, and each node corresponds to one string in that atom table.

Since we're working in 16-bit Windows, the pointers in the atom table are 16-bit pointers, and all memory blocks in the 16-bit heap are 4-byte aligned. Alignment on 4-byte boundaries means that the bottom two bits of the address are always zero. Therefore, there are only 14 bits of value in the node pointer. Take that pointer, shift it right two places, and set the top two bits, and there you have your atom. Conversely, to convert an atom back to a node, strip off the top two bits and shift the result left two places.
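As a rough sketch of the arithmetic just described (this is not the actual 16-bit Windows source), the 16-bit node pointer is modeled below as a plain `uint16_t` offset, and the helper names `node_to_atom` and `atom_to_node` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Pointer -> atom: drop the two always-zero low bits, then set the top two bits. */
uint16_t node_to_atom(uint16_t node_offset)
{
    assert((node_offset & 0x3u) == 0);            /* heap blocks are 4-byte aligned */
    return (uint16_t)((node_offset >> 2) | 0xC000u);
}

/* Atom -> pointer: strip the top two bits, then shift left two places. */
uint16_t atom_to_node(uint16_t atom)
{
    assert(atom >= 0xC000u);                      /* only string atoms encode a node */
    return (uint16_t)((atom & 0x3FFFu) << 2);
}
```

For instance, a node at offset `0x1234` becomes atom `0xC48D`, and a node at offset `0x0000` becomes `0xC000`, the lowest string atom.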

Why encode the pointer this way? Well, you have 14 bits of information and you want to return a 16-bit value. You have two bits to play with, so your decisions are where to put those bits and what values they should have. It'd be convenient if all the integer atoms were contiguous and all the string atoms were contiguous, to make range checking easier. Now you're down to two options. You have a 49152-value range of integer atoms and a 16384-value range of string atoms. Either you put the integer atoms at the low end (`0x0000-0xBFFF`) and the string atoms at the high end (`0xC000-0xFFFF`), or you put the string atoms at the low end (`0x0000-0x3FFF`) and the integer atoms at the high end (`0x4000-0xFFFF`). You probably don't want zero to be a valid string atom, since that's the most likely value for an uninitialized atom variable, so putting the string atoms at the top of the range wins out.
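One way to see why only two options survive is to enumerate the candidates. The sketch below assumes the two spare bits sit at the top of the value, so that each choice of tag yields one contiguous block of 16384 string atoms; `layout` and `describe_layout` are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint16_t first_string;      /* lowest string atom for this tag */
    uint16_t last_string;       /* highest string atom for this tag */
    bool integers_contiguous;   /* do the other 49152 values form one range? */
    bool zero_is_string;        /* would the value 0 be a string atom? */
} layout;

/* tag is the value placed in the top two bits (0..3). */
layout describe_layout(unsigned tag)
{
    layout l;
    assert(tag <= 3);
    l.first_string = (uint16_t)(tag << 14);
    l.last_string  = (uint16_t)(l.first_string | 0x3FFFu);
    /* The leftover values are one range only if the string block touches an end. */
    l.integers_contiguous = (l.first_string == 0x0000u) || (l.last_string == 0xFFFFu);
    l.zero_is_string = (l.first_string == 0x0000u);
    return l;
}
```

Under these assumptions, tags 1 and 2 split the integer atoms into two pieces, tag 0 makes zero a string atom, and only tag 3 (string atoms at `0xC000-0xFFFF`) avoids both problems.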

Now, with the conversion to Win32, the old implementation of atoms was thrown out. Atoms are no longer encoded pointers, but the new implementation still had to adhere to the breakdown of the 16-bit atom space into integer atoms and string atoms.

Over the next few entries, we'll take a look at other consequences of the way string atoms are assigned and surprising things you can do with atoms. (Not necessarily good things, but still surprising.)

Footnotes

†The `MAXINTATOM` symbol adheres to the classic Hungarian convention that "max" is one more than the highest legal value.

1. Mikey says:

"The MAXINTATOM symbol adheres to the classic Hungarian convention that "max" is one more than the highest legal value."

Whoa! I never heard of this. Are all the max path etc. symbols likewise defined?

2. Spike says:

@Mikey

Certainly that’s not the interpretation I get from here http://support.microsoft.com/kb/110264

I can only imagine that Raymond intends sarcasm here although that’s always hard to portray with the written word.

[Ahem, I wrote “classic Hungarian” not “Hungarian as reinterpreted later by people who misunderstood it.” -Raymond]

3. Triangle says:

I can only imagine that Raymond intends sarcasm here although that’s always hard to portray with the written word.

You will find portraying sarcasm with the written word is one of Mr. Chen’s greatest strengths.

4. Spike says:

Oops. I stand corrected (and thanks for deleting my erroneous entry too).

Now that explains why I keep on getting all these off-by-one bugs.

5. mikeb says:

I can’t help myself… from an old Simpsons episode ("Radioactive Man"):

Coach: Up and atom!

Rainier: Up and at them!

Coach: Up and atom!

Rainier: Up and at them!

Coach: [annoyed] Up and atom!

Rainier: [louder] Up and at them!

Coach: [covers his eyes] Better.

— McBain misses the point, "Radioactive Man"

(quote snippet stolen from http://www.snpp.com/episodes/2F17.html)

6. Aaargh! says:

Uuhm.. what is an Atom in the first place ?

Some kind of atomic variable ?

7. Spike says:

@Aaaargh!

If only there were some web site you could go to to find the answer to questions like what are windows integer and string atoms?

8. KJK::Hyperion says:

Mikey: MAX_PATH = "X:\" + 256 + 1 = 260

9. IP programmer says:

These days we use ‘max’ when talking about closed intervals and introduced a new hungarian prefix ‘mac’ for the upper limit of the open interval.

(Someone from Simonyi’s current team :))

great blog btw.

10. Yuhong Bao says:

“Now, with the conversion to Win32, the old implementation of atoms was thrown out.”

Win9x’s 32-bit atom functions in Kernel32 thunk to the 16-bit atom functions in USER.

[What, no mention of Win32s? If you’re going to nitpick, then at least do it right. -Raymond]

11. Yuhong Bao says:

I forgot to mention Win32s. That thunked almost every part of 32-bit User32 and Gdi32 to 16-bit USER and GDI. In some ways, Win9x is an enhanced version of Win32s. Luckily no one claimed Win32s is fully 32-bit and that fact was well-known, thus I don’t mention it. Win9x is often hyped to be a full 32-bit OS. It is not, as exposed by the Pentium Pro, made by Intel engineers who believed the hype.

[That was not an invitation to discuss Win32s. It was sarcasm. -Raymond]

12. ak says:

It might be (a little) offtopic, but I enjoy undocumented funfacts like that

13. Aaargh! says:

"If only there were some web site you could go to to find the answer to questions like what are windows integer and string atoms?"

Luckily enough ‘atom’ is not a generic word used for a gazillion other things. (the first hit is actually this article which isn’t very helpful). Wikipedia also has a lot of articles about ‘atom’s but not about this particular kind. Furthermore, I tried MSDN but that’s completely useless.

14. Spike says:

@Aaargh!

Atom is a common word.  So maybe you need to hone your search slightly.

Try going to a popular search engine, entering "What are windows integer and string atoms" and clicking "I feel lucky".

15. Aaargh! says:

Try going to a popular search engine, entering "What are windows integer and string atoms" and clicking "I feel lucky".

Did you even try that before posting ? Because you’ll end up on THIS page you’re looking at right now.

Also, the first MSDN hit isn’t too helpful, it explains what atom tables are used for, not what they are.

16. Sinan says:

re: Aaargh!

I searched for

atom windows api

using Google. The first hit is a Safari preview of Chapter 23 from the book Microsoft Windows 2000 API SuperBible. Now, I do not read SuperBibles etc, but there was enough information there to know what to look for next

http://safari.oreilly.com/0672319330/ch23lev1sec1

Sinan

17. Chris M says:

I have a classic Hungarian amplifier.

It goes to eleven.

18. paby says:

@Sinan

maybe it would have been more helpful to report here the answer instead of ping ponging around the ability of searching something over the cloud.

That’s what we’re all here for, I think: learning.

my 2 cents.

19. Spike says:

@paby

"maybe it would have been more helpful to report here the answer instead of ping ponging around the ability of searching something over the cloud. "

Well yes, but you know if you give a man a fish he’ll have a meal.  But if you teach a man to fish… well then he’ll have a hobby too.

20. Dean Harding says:

Aaargh: Do you mean this msdn page: http://msdn2.microsoft.com/en-us/library/ms649053(VS.85).aspx

? It seems pretty clear to me what atoms are just from reading that page. They allow you to share strings (or integers, obviously) via an opaque integer. Some atoms can be global (shared by all processes) and some are private (available to only one application).

So there’s your answer. I’m not sure what you’re looking for… also, until a few hours ago, that MSDN page was the first page you hit when you entered Spike’s query. It’s now the second hit (this page is the first).

21. me says:

"I can’t help myself.."

me neither…

BURNS’ GRANDFATHER: "Come on, come on! Crack those atoms! You, turn out your pockets. Atoms! One, two three, four… six of them! Take him away!"

WORKER: "You can’t treat the working man this way. One day, we’ll form a union and get the fair and equitable treatment we deserve! Then we’ll go too far, and get corrupt and shiftless, and the Japanese will eat us alive!"

BURNS’ GRANDFATHER: "The Japanese?! Those sandal-wearing goldfish-tenders? Bosh, flim-shaw!"

BURNS: "If only we’d listened to that boy, instead of walling him up in the abandoned coke oven. "

22. KenW says:

@Yuhong Bao: Who cares? Who was discussing Win32s/Win9x?

People who post nonsense that doesn’t apply to the current topic in order to impress people with their superior knowledge instead only prove that they haven’t got the intelligence to find something pertinent to say about the topic at hand.

23. Aaargh! says:

"They allow you to share strings (or integers, obviously) via an opaque integer. "

This is unclear to me, the MSDN page says "When applications pass null-terminated strings to the GlobalAddAtom, AddAtom, GlobalFindAtom, and FindAtom functions, they receive string atoms (16-bit integers) in return."

Sounds to me it’s just a 16 bit integer, but then why call them ‘atoms’ , just to make it less transparent and more annoying ? Doesn’t sound logical to me, it would seem to me that an atom is more than just a 16-bit int, why else give it a confusing name. Furthermore, why is it called an ‘atom’, because the name does not really make sense even when you know what it’s supposed to do. As far as I can see it’s just a key in a mapping table, so why not call it a key ?

So again: what is an atom ?

24. Aaargh! says:

"Well yes, but you know if you give a man a fish he’ll have a meal.  But if you teach a man to fish… well then he’ll have a hobby too."

You got the quote wrong, it’s : "Give a man fire, and he’s warm for a day. Set a man on fire and he’s warm for the rest of his life."

25. hholtmann says:

Where, when and why would a Windows application developer ever need to use a Windows ATOM in the first place?

Are Atoms the lowest overhead objects that can be shared by processes?

What gives?

Heston

26. mikeb says:

Cripes, people – if you want to know about the atoms that Raymond is talking about, Sinan did the heavy lifting and provided a *link* to a document that gives a basic description.  That’s right – a link.  All you have to do is click on it.

Surely, even if it’s too much to ask to actually use some form of Internet search, you can’t argue that it’s too much to ask that you click on a link. After all, how did you get to Raymond’s blog in the first place?

27. Jim Dodd says:

I was also confused about the use of "atom" in this context. It is not easy to find information about this because that term is used in many different ways. There are even two others (that I know about) in the field of software (1 – an "atomic" operation can’t be interrupted and its result corrupted and 2 – an API called Atom for RSS feeds). It’s interesting that a link that was proffered as the definition of "atom" for this context came from a third-party site.

http://msdn.microsoft.com/en-us/library/ms648708(VS.85).aspx

from the MSDN site itself is more helpful. It was not easy to find. Google and MSDN search are great tools but they don’t always find the answer you need as quickly as you need it. I am glad to help out with a search when I can. And I’m glad to know a bit more about the use of "atom" in the Windows 32 API.

Regards,

Jim Dodd

Onset Computer Corp.

28. Triangle says:

"So there’s your answer. I’m not sure what you’re looking for… also, until a few hours ago, that MSDN page was the first page you hit when you entered Spike’s query. It’s now the second hit (this page is the first)."

That there’s the problem with web two-point-oh.

Muddles up all ther search engines.

29. a wildcard says:

I tried splicing an ATOM but all I had left was 2 BYTEs and no bang! Aaaaaaaw :'(

Maybe Windows should get a new API, PlodeEx (to remain consistent), that works faster than the UNIX nuke(8) :)