What does the "l" in lstrcmp stand for?


If you ask Michael Kaplan, he'd probably say that it stands for lame.

In his article, Michael presents a nice chart of the various L-functions and their sort-of counterparts. There are other L-functions not on his list, not because he missed them, but because they don't have anything to do with characters or encodings. On the other hand, those other functions help shed light on the history of the L-functions. Those other functions are lopen, lcreat, lread, lwrite, lclose, and llseek. These are all L-version sort-of counterparts to open, creat, read, write, close, and lseek. Note that we've already uncovered the answer to the unasked question "Why does llseek have two L's?" The first L is a prefix (whose meaning we will soon discover) and the second L comes from the function it's sort-of acting as the counterpart to.

But what does the L stand for? Once you find those other L-functions, you'll see next door the H-functions hread and hwrite. As we learned a while back, being lucky is simply observing things you weren't planning to observe. We weren't expecting to find the H-functions, but there they were, and they blow the lid off the story.

The H prefix in hread and hwrite stands for huge. Those two functions operated on so-called huge pointers, which is 16-bit jargon for pointers to memory blocks larger than 64KB. To increment your average 16:16 pointer by one byte, you increment the bottom 16 bits. But when the bottom 16 bits contain the value 0xFFFF, the increment rolls over, and where do you put the carry? If the pointer is a huge pointer, the convention is that the byte that comes after S:0xFFFF is (S+__AHINCR):0x0000, where __AHINCR is a special value exported by the Windows kernel. If you allocate memory larger than 64KB, the GlobalAlloc function breaks your allocation into 64KB chunks and arranges them so that incrementing the selector by __AHINCR takes you from one chunk to the next.
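Here's a rough sketch of that carry rule in portable C, for readers who never had to do this by hand. It models a 16:16 pointer as a plain struct rather than using a real 16-bit compiler's huge pointers. The names Ptr1616, far_increment, huge_increment, huge_index, and AHINCR_ASSUMED are all made up for illustration, and the value 8 is an assumption; a real program got the increment from the kernel's __AHINCR export rather than hard-coding it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-in for the value exported as __AHINCR.
   The number 8 is chosen for illustration only. */
#define AHINCR_ASSUMED 8u

/* Hypothetical model of a 16:16 pointer: selector in the top half,
   offset in the bottom half. */
typedef struct {
    uint16_t selector;
    uint16_t offset;
} Ptr1616;

/* Ordinary far-pointer increment: only the offset changes, so the
   carry out of 0xFFFF is lost and the pointer wraps within its segment. */
static Ptr1616 far_increment(Ptr1616 p)
{
    p.offset = (uint16_t)(p.offset + 1u);
    return p;
}

/* Huge-pointer increment: the byte after S:0xFFFF is (S+AHINCR):0x0000. */
static Ptr1616 huge_increment(Ptr1616 p)
{
    if (p.offset == 0xFFFFu) {
        p.selector = (uint16_t)(p.selector + AHINCR_ASSUMED);
        p.offset = 0;
    } else {
        p.offset = (uint16_t)(p.offset + 1u);
    }
    return p;
}

/* Byte index of p inside an allocation whose first 64KB chunk has
   selector `base`. Each step of AHINCR_ASSUMED is one 64KB chunk. */
static uint32_t huge_index(uint16_t base, Ptr1616 p)
{
    uint16_t delta = (uint16_t)(p.selector - base);
    assert(delta % AHINCR_ASSUMED == 0); /* must be a whole number of chunks */
    return ((uint32_t)(delta / AHINCR_ASSUMED) << 16) | p.offset;
}
```

Starting from a pointer whose offset is 0xFFFF, far_increment wraps back to offset 0 in the same segment, while huge_increment moves on to offset 0 of the next chunk.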

Working backwards, then, the L prefix stands for long. These functions explicitly accept far pointers, which makes them useful for 16-bit Windows programs since they are independent of the program's memory model. Unlike the L-functions, the standard library functions like strcpy and read operate on pointers whose size matches the data model. If you write your program in the so-called medium memory model, then all data pointers default to near (i.e., they are 16-bit offsets into the default data segment), and all the C runtime functions operate on near pointers. This is a problem if you need to, say, read some data off the disk into a block of memory you allocated with GlobalAlloc: That memory is expressible only as a far pointer, but the read function accepts a near pointer.

To the rescue comes the lread function, which you can use to read from the disk into your far pointer.
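If the near/far distinction is hard to picture, here is a toy model in portable C. It is not 16-bit Windows code and it does not call the real read or lread; it only imitates the addressing problem. Everything in it (toy_memory, ToyNearPtr, ToyFarPtr, toy_read_near, toy_read_far, and the segment sizes) is hypothetical, and the "file data" is just a buffer passed in by the caller.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy address space: a handful of small "segments". Sizes are arbitrary. */
enum { TOY_SEGMENTS = 4, TOY_SEGMENT_SIZE = 32 };

static unsigned char toy_memory[TOY_SEGMENTS][TOY_SEGMENT_SIZE];
static const uint16_t toy_default_ds = 0;   /* the "default data segment" */

/* A near pointer is only an offset; the segment is implied. */
typedef uint16_t ToyNearPtr;

/* A far pointer names its segment explicitly. */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} ToyFarPtr;

/* Shared helper: copy bytes into one segment, with bounds checks. */
static void toy_store(uint16_t segment, uint16_t offset,
                      const void *src, uint16_t count)
{
    assert(segment < TOY_SEGMENTS);
    assert(offset <= TOY_SEGMENT_SIZE && count <= TOY_SEGMENT_SIZE - offset);
    memcpy(&toy_memory[segment][offset], src, count);
}

/* Stand-in for a near-pointer read: it can only ever land in the
   default data segment, whatever the caller had in mind. */
static void toy_read_near(ToyNearPtr dest, const void *src, uint16_t count)
{
    toy_store(toy_default_ds, dest, src, count);
}

/* Stand-in for a far-pointer read: the destination segment travels
   with the pointer, so any block can be the target. */
static void toy_read_far(ToyFarPtr dest, const void *src, uint16_t count)
{
    toy_store(dest.segment, dest.offset, src, count);
}
```

A block that lives in segment 2 can be filled with toy_read_far, but there is no offset you can hand to toy_read_near that reaches it. That is the position a medium-model program was in with memory from GlobalAlloc.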

How did Windows decide which C runtime functions should have corresponding L-functions? They were the functions that Windows itself used internally, and which were exported as a courtesy.

Okay, now let's go back to the Lame part. Michael Kaplan notes that the lstrcmp and lstrcmpi functions actually are sort-of counterparts to strcoll and strcolli. So why weren't these functions called lstrcoll and lstrcolli instead?

Because back when lstrcmp and lstrcmpi were being named, the strcoll and strcolli functions hadn't been invented yet! It's like asking, "Why did the parents of General Sir Michael Jackson give him the same name as the pop singer?" or "Why didn't they use the Space Shuttle to rescue the Apollo 13 astronauts?"
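For anyone who hasn't compared the two styles side by side, here is a small sketch using only the standard C library (not lstrcmp itself). The helpers sign, compare_ordinal, and compare_collated are names invented for this example. The first compares raw character values, the way strcmp does; the second asks the current locale's collation rules, the way strcoll does.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Collapse a comparison result to -1, 0, or +1 so results from
   different functions can be compared with each other. */
static int sign(int value)
{
    return (value > 0) - (value < 0);
}

/* Ordinal comparison: character codes only, no locale involved. */
static int compare_ordinal(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return sign(strcmp(a, b));
}

/* Collated comparison: ordering comes from the LC_COLLATE category
   of whatever locale is currently selected. */
static int compare_collated(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return sign(strcoll(a, b));
}
```

In the default "C" locale the two helpers agree, so "a" sorts after "B" in both because lowercase letters have larger character codes in ASCII. After a call such as setlocale(LC_COLLATE, "") selects a linguistic locale, compare_collated may put "a" before "B" while compare_ordinal does not change. The exact result depends on the locale installed on the machine.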

Comments (15)
  1. Anonymous says:

    I don't think l is for long, it is for LARGE, as in LARGE pointer or model

    [If you look at the Hungarian Notation documents, you'll see that "lp" officially stands for "long pointer", not "large pointer". Not that it really matters. -Raymond]
  2. Anonymous says:

    @ToddLa: You mean LONG_PTR? There's no such thing as "LARGE" in the Win32 API AFAIK.

    [Todd is going back to the Intel memory models which every 16-bit programmer had memorized. -Raymond]
  3. Anonymous says:

    It is mildly amusing that I'm used to llseek taking a 64 bit offset and returning a 64 bit position. Accidental name collision.

  4. JonPotter says:

    So it's a courtesy now to export functions for applications to use? Funny, I thought that was kinda the whole point of an OS.

    [Strange, the linux kernel doesn't provide strcmp. They leave that to the C runtime. Courtesy functions have a tendency to become support burdens. -Raymond]
  5. Anonymous says:

    Not to be confused with the _l suffix which appears on some functions (such as the *printf_l/*scanf_l families of functions), where the l stands for locale.

  6. Anonymous says:

    … or "Why did they build Windsor Castle so close to the motorway?"

    (my favourite overheard quote from an American tourist)

  7. Anonymous says:

    Here I was thinking that the lstrcmp() was just a locale-sensitive version of strcmp() — which it appears to be, according to the documentation.  I never looked at strcoll().

    Of course, one of the positive things about the lstrcmp() function is that it is exported from KERNEL32.DLL.  If you're making a program that, for some reason (*sheepish grin*), doesn't make use of the C Runtime Library, you can still have a strcmp() function without rolling your own. :)

  8. Anonymous says:

    Unlikely, motorways are called freeways in US.

  9. Anonymous says:

    Those functions are nice, you can avoid C runtime and have SEH too.

  10. Anonymous says:

    Time to fire up your time machine again Raymond.  (Can I borrow it after you're done with it?)

  11. Anonymous says:

    You could probably borrow it before he's done with it. You humans and your silly one-directional way of thinking about time …

  12. Anonymous says:

    [Strange, the linux kernel doesn't provide strcmp. They leave that to the C runtime. Courtesy functions have a tendency to become support burdens. -Raymond]

    Normally true. In this case, it's cheap to export stdlib library functions from kernel32.dll that kernel32.dll needs anyway. The Linux kernel doesn't do the same, because crossing a memory barrier to get to it makes it not worth it. kernel32 is userspace.

    Note that I'm saying that really only the l… and h… functions that already exist and wsprintf had any business being exported. Yes, lstrcmp and wsprintf are a little bit clunky, but wsprintf still has its uses even today. Had there been a few more that do exactly what the C standard says I'd have argued for them as well. The only ones I know of are RtlCopyMemory and RtlMoveMemory in ntdll that should have been where memcpy and memmove were redirected to but were not. (If you try to call them without doing something funky you end up in memcpy and memmove thanks to a #define in windows.h).

  13. Anonymous says:

    @Tihiy

    >>> Those functions are nice, you can avoid C runtime and have SEH too.

    What is the advantage of not using the CRT?

    I also noted that Raymond tends to use these l-functions (e.g. lstrcpyn) often in the code presented in this blog. What is the reason? Should we use these l-functions instead of ordinary CRT functions in our Windows apps written in native C/C++ code?

    Thanks.

    [My samples are written for brevity, and using the l-functions saves me a #include. -Raymond]
  14. Anonymous says:

    And there you go, the main reason other programming languages were invented outside Micro$oft backyard (like Pascal, Java, etc), to avoid the "lame" functions which only adds to the normal headache any serious application is generating.

  15. Anonymous says:

    I used to think they were "High" and "Low" as in most significant and least significant..
