What does the letter "T" in LPTSTR stand for?


The "T" in LPTSTR comes from the "T" in TCHAR. I don't know for certain, but it seems pretty likely that it stands for "text". By comparison, the "W" in WCHAR probably comes from the C language standard, where it stands for "wide".

Comments (35)
  1. Rui Craveiro says:

    Ain’t it nice that in .Net types have names instead of acronyms? ;-)

  2. I always thought the "T" stood for "typed char" because just CHAR would have been confusing; though "text" makes more sense.

  3. Chris says:

    Hum.  It’ll still always be "Long Pointer To a STRing" for me.

  4. dave says:

    Why are the Win32 calls named

    ‘W’ for ‘Wide’ and

    ‘A’ for ‘ANSI’

    ?

    Why not ‘W’ and ‘N’ (for ‘Narrow’)

    or ‘U’ (for ‘Unicode’) and ‘A’ (for ‘ANSI’)?

    (It’s especially galling, since it’s the ‘A’ functions that support MBCS …)

    [Wow, I think that’s a new record. A two-sentence
    post, and somebody doesn’t read the second sentence. As for the A, that
    was discussed two years ago. -Raymond
    ]
  5. Mark Zuber says:

    I think it’s T for Typed Char as stated above since T can change type based on the compiler settings (with _UNICODE defined, it’s a WCHAR and with _UNICODE not defined, it’s a CHAR, ansi).  Similar to the _T("") macro that will do the same thing with a constant string as opposed to L"" which always makes a wide char.

  6. dave says:

    >Wow, I think that’s a new record. A two-sentence post, and somebody doesn’t read the second sentence.

    I think you misunderstood my point. I’m obviously aware that the
    programming language contains ‘wide chars’, and it doesn’t take much to
    figure out that ‘W’ comes from ‘wide char’.

    My point is that, if you’ve got two things to distinguish by a
    naming convention, you could name them systematically based on size
    (‘wide’ or ‘narrow’) or on content (‘unicode’ versus ‘ansi’).
     Mixing the two (‘wide’ versus ‘ansi’) seems a little, well,
    arbitrary.

    >As for the A, that was discussed two years ago.

    That’s as may be, but I wasn’t here two years ago, and it didn’t occur to me to check whether this was a subject already covered.

    [Given that the type for Unicode characters is
    WCHAR, decorating functions as GetWindowTextU would lead to people saying “That’s completely moronic. It should be decorated with a W. WCHAR, GetWindowTextW.
    Duh.” What I’ve learned from writing this blog is that no matter what
    you do, half of the people will tell you that you made the obviously
    wrong decision and that even a moron would have recognized that the
    other way was clearly superior. (As for old topics: Perhaps
    I should just put the blog on infinite repeat. Every day, I just repost
    something from two years ago. Would save me a lot of work.) -Raymond
    ]
  7. foxyshadis says:

    That’s as may be, but I wasn’t here two years ago, and it didn’t occur to me to check whether this was a subject already covered.

    Why comment, if you’re unwilling to do any research? Do you honestly expect your complaints to be taken seriously when you respect someone that little?

    Now if it was Q or G, maybe you’d have a point.

  8. Robert says:

    Let us all be nice.

    First, Raymond types at a high level.  I am mid-level, in the usual WIN32/C realm he discusses, and out of date.  Slack is due. We are here at his sufferance.  Did you get what you paid for?!?

    On the other hand, the above poster reads the blog to learn, so let us not try to discourage him.

    On the Nivenesque third hand, keep up the posts, without dupes.

    Thx

  9. Wilfried says:

    As far as I remember, ‘T’ means ‘transparent’ – by using TCHAR as character type you handle character size transparently.

  10. Mihai says:

    <<half of the people will tell you that you made the obviously wrong decision>>

    Here you are wrong. At least 70% will say you made the wrong decision :-) And if you change lines, the one you just left will go faster. Murphy is always right :-)

    <<"W" in WCHAR probably comes from the C language standard>>

    I think WCHAR was used in Windows before the C standard decided to adopt the half-baked wchar_t.

  11. dave says:

    Wow, things can certainly get heated when one asks what appears (to me) to be a straightforward question.

    I take it the answer is "it’s W versus A (rather than W versus N, or W versus C) for no particular reason".

    Lest you think I am somehow telling you that you made the wrong decision, I’ll point out that the only note of complaint was me using "galling" about using MBCS with functions tagged "A".  Which isn’t really that irritating to me personally, since I tend to use Unicode.

  12. Miral says:

    The T in TCHAR definitely means "TEXT", as Raymond thought.  And I have proof :)

    In tchar.h there is a macro _T defined that is intended to be used around your TCHAR-based string constants [eg. _T("foo")], so that they will change to Unicode as appropriate as well.

    There is an alias for the same macro, which actually predates it (it was shortened because it was a hassle to type the whole thing in everywhere).  And guess what it’s called?  That’s right, _TEXT.

    So an LPCTSTR is a Long Pointer to a Constant Text STRing (or _TEXT String).

  13. Ben Bryant says:

    The T may come from _T and _TEXT historically, but it is a bit unhelpful because the difference between LPTSTR and LPSTR is not that one is ‘T’ext and the other is not, it is that one is ‘T’yped i.e. that it will switch between W and A. But oh well; that’s how these things come about.

  14. JamesW says:

    @Raymond

    ‘What I’ve learned from writing this blog is that no matter what you do, half of the people will tell you that you made the obviously wrong decision and that even a moron would have recognized that the other way was clearly superior.’

    It took years of blogging to work this out?! Did the religious wars regarding the best placement of { in code not give you an insight into typical programmer mentality? Obviously K&R bracing is best and emacs is the only text editor – only a moron would disagree. Oh, and pointers should be declared thus: ‘int* ptr’. All alternatives such as ‘int *ptr’ and ‘int * ptr’ are unholy.

  15. JamesW says:

    @Mihai

    Ah, but multiple declarations on a single line are wrong too ;)

  16. Mihai says:

    <<pointers should be decalred thus: ‘int* ptr’>>

    Nope, ‘int *ptr’ is the right thing :-)

    Only half kidding. This is my motive:

      int* a, b, c; // a, b, c are int* ???

    versus

      int *a, b, c; // only a is pointer.

  17. Amazing the arguments people can have in the comments of blog posts, isn’t it? I was reminded of this

  18. Xavi says:

    >half of the people will tell you that you made the obviously wrong decision<

    Is that a problem or is this an answer to Dave’s point?

    I read such blogs to see why and where different minds come to
    different conclusions. Sometimes they match my thinking, sometimes not,
    sometimes they change my mind.

    I’ve read Dave’s first post and agreed with him. Then I’ve read Raymond’s note on it, and got confused about where (like Dave) I missed something.

    After re-reading the blog statement and Dave’s post several times, I do not see a reason to award Dave a Guinness Book entry for “fewest sentences understood”.

    Dave’s reasoning is comprehensible and it obviously applies also to WCHAR, which could also be named UCHAR.

    Thus, to argue with “Given that the type for Unicode characters is WCHAR” is just not applicable, as it does not take Dave’s point into account.

    After reading the first two sentences of Dave’s post, this was clear to me; how about you?

    Regards Xavi

    (unrelated to any of the posters here)

    [And I gave the answer in the second sentence: It’s called WCHAR because that’s what the C language calls it (wchar_t). -Raymond]
  19. Adam says:

    "I think WCHAR was used in Windows before the C standard decided to adopt the half-baked wchar_t."

    Well, wchar_t was in the first (1990) ISO C standard, and AFAIK most of the things in the standard had been tried out in /at least/ one implementation for some time to make sure it was workable, so it’d almost certainly have been available in a couple of compilers since 1989.

    If anyone knows when WCHAR was first introduced into Windows, that’d be interesting.

  20. l33t c0d3r d00d says:

    Xavi wrote:

    > Daves reasoning is comprehensible and it obviously

    > applies also to WCHAR, which could also be named UCHAR.

    Not really.  UCHAR is already defined to mean "Unsigned char".

  21. KJK::Hyperion says:

    To clear up a common misunderstanding… the UNICODE define affects Win32, the _UNICODE define affects the Microsoft C runtime library. The difference is subtle. UNICODE controls the following definitions:

    TCHAR

    LPTSTR

    TEXT()

    Win32 A/W APIs and structures

    _UNICODE, on the other hand:

    _TCHAR

    _TINT

    _T()

    CRT string APIs and related structures

    Yes, you can define UNICODE and _UNICODE differently. I think ATL has specific support for that. IIRC it also supports the obscure corner case in which OLECHAR is defined as CHAR rather than WCHAR (16-bit Windows & MacOS)

    Closing note about LPSTR/LPWSTR/LPTSTR: they are not just typedefs for CHAR/WCHAR/TCHAR *, they mean "NUL-terminated string of CHAR/WCHAR/TCHAR". This is actually enforced in PREfast ("lint" mode of the latest Microsoft compiler)

  22. Mihai says:

    <<If anyone knows when WCHAR was first introduced into Windows, that’d be interesting.>>

    The need was definitely out there, but who was really first is tough to say.

    Windows NT 3.1 was Unicode, and was started in 1988 (http://en.wikipedia.org/wiki/Windows_NT_3.1)

    And no, I don’t have a Windows NT 3.1 SDK around to check if it used WCHAR :-)

  23. ulric says:

    This is pointless, but I actually did check the other day when ‘wchar_t’ became part of the C standard – figuring that it must have been much later than NT.  Easynews revealed discussion about gcc’s wchar_t early in the ’90s.  So maybe it wasn’t in the actual standard yet, but it existed on Unix.

    p.s. : yes, UCHAR is a common typedef for ‘unsigned char’.

  24. Norman Diamond says:

    wchar_t in Unix was not originally Unicode, though it can be (I’m not sure if anyone does it though).

    A few things I’ve read lend the impression that wchar_t in NT was not originally Unicode, though it is now.

    For comparison, char was not originally EBCDIC, but it can be.  For comparison, char was not originally either (depending on its value) 100% or 50% of a Shift-JIS character, but it can be.

  25. Ulric says:

    True, Norman: wchar_t does not mean Unicode or anything in particular except ‘wide’.

    On Unix, wchar_t is in fact 32-bit, but the spec says nothing about how it should be implemented.

  26. Norman Diamond says:

    Wednesday, October 18, 2006 11:46 PM by Ulric

    On unix, wchar_t is in fact 32-bit

    I’m pretty sure it was 16-bit when I was reading related code.  Even though most processors had 32-bit words, 16-bit unsigned shorts existed.

    Of course that was for EUC Japanese only, so other locales could vary.  Locales existed by the time I was reading that code, but they might not have existed yet when that code was written.

  27. Norman Diamond says:

    Ouch, I lied, sorry.

    Even though most processors had 32-bit words,

    16-bit unsigned shorts existed.

    Should be:

    Even though most general-purpose processors (that could run compilers, database systems, etc.) had 32-bit words, 16-bit unsigned shorts existed.

  28. C Gomez says:

    I believe Raymond should take heart that there is likely a silent majority who reads what Raymond has to say… nods… and either agrees with Raymond or appreciates the insight regardless of personal opinion.

    They do not bother reading the comments.

  29. BryanK says:

    Well, you need 32 bits to store a UCS-4 character.  (Or is that UTF-32?  I can never remember which is which.  In this case I think they’re exactly the same bit pattern for every code point, but in the case of UTF-16/UCS-2, some non-BMP stuff is different.  One of UTF-16/UCS-2 can’t represent non-BMP characters, and I can never remember which.)

    But anyway, if you want your strings to be arrays of characters, without any special escaping required at all, you’ll need a 32-bit wchar_t.  And you’ll waste incredible amounts of memory on a long wchar_t string that contains only 7-bit-ASCII characters, but there’s not much you can do about that.

  30. Gabest says:

    @Mihai

    typedef int* intptr;

    intptr a, b, c;

    ‘int*’ IS the type, I think it’s just the c syntax that makes it look stupid without the typedef.

  31. Igor says:

    >Given that the type for Unicode

    >characters is WCHAR

    No, it is actually unsigned short, so only GetWindowTextUS would be proper. :)

    >  int* a, b, c; // a, b, c are int* ???

    >versus

    >  int *a, b, c; // only a is pointer.

    Tsk, tsk, tsk, only a is pointer in both cases.

    Last time I checked you were supposed to write:

     int *a, *b, *c;

    if you wanted them all to be the pointers.

  32. GregM says:

    Igor, that’s exactly the point.   The first form makes it *look* like "a", "b", and "c" are of type "int*", instead of "a" being "int*" and "b" and "c" being "int", because the * is next to the "int" instead of the "a".

  33. 640k says:

    An int in ANSI C should be the largest native integer type of the hardware. Why isn’t int 64-bit in Visual C++ on Win64?

    [Please search the archives before asking questions. -Raymond]
  34. Igor says:

    >Igor, that’s exactly the point.

    >The first form makes it *look* like…

    I personally always use the second form, but everyone familiar with C syntax knows that the compiler doesn’t give a damn where the whitespace in the above-mentioned code is.

    >Why isn’t int 64-bit on visual c++ in win64?

    Apart from Raymond’s explanation there is one potential reason I perceive — code size.

    Because the x64 concept is just an expansion of the GPRs (exactly as happened in the 16->32 bit transition), where the underlying CPU architecture wasn’t completely widened (ALUs, multipliers, data paths, etc.), both Intel and AMD suggested that 64-bit code still use 32-bit integers wherever possible because:

    a) it is more efficient in terms of performance

    b) instructions that use 32-bit operands are shorter, leading to more compact code which fits better in the L1 instruction cache, again improving performance.

    As a matter of fact, I believe that is the only justifiable reason.

    Raymond mentioned that changes in the SDK would break code. I think there are alternatives to what they did, based on one simple fact:

    The fact is that you release a new Platform SDK to enable developers to use features present in future operating systems.

    Based on that fact, we may assume that whoever installs the latest Platform SDK is doing it because they intend to support those future operating systems.

    From that it is easy to conclude that they will bite the bullet and fix their code.

    Those who don’t want to fix it can continue using old Platform SDK.

    That is at least what I would do because this will have to be fixed sooner or later.

    Sorry for the O/T.

