Printing the contents of the clipboard as text to stdout

The clip.exe program takes its stdin and puts it on the clipboard. But how do you get it out? That's today's Little Program. (I guess we could call it clop.exe.)

#define UNICODE
#define _UNICODE
#define STRICT
#include <windows.h>
#include <stdio.h>
#include <tchar.h>
#include <strsafe.h>
#include <new> // for std::nothrow

// Write raw bytes to stdout, whether it is a console, pipe, or file.
void WriteToStdOut(const void *pvBuf, DWORD cbBuf)
{
 DWORD cbWritten;
 WriteFile(GetStdHandle(STD_OUTPUT_HANDLE), pvBuf, cbBuf,
           &cbWritten, nullptr);
}

int __cdecl _tmain(int argc, PTSTR *argv)
{
 if (OpenClipboard(nullptr)) {
  HANDLE h = GetClipboardData(CF_UNICODETEXT);
  if (h) {
   auto pwchText = static_cast<PCWSTR>(GlobalLock(h));
   if (pwchText) {
    SIZE_T cbMemory = GlobalSize(h);

    // arbitrary limit because I am lazy
    cbMemory = min(cbMemory, 0x10000000);

    // find the null terminator within the (clamped) memory block
    size_t cbActual;
    if (SUCCEEDED(StringCbLengthW(pwchText, cbMemory,
                                  &cbActual))) {
     if (argc == 2 && _tcsicmp(argv[1], TEXT("/u")) == 0) {
      WriteToStdOut(pwchText, static_cast<DWORD>(cbActual));
     } else {
      UINT cp = (argc == 2 &&
                _tcsicmp(argv[1], TEXT("/a")) == 0) ?
                     CP_ACP : CP_OEMCP;
      int cchText = static_cast<int>(cbActual / 2);
      // first call measures the output, second call converts
      int cch = WideCharToMultiByte(cp, 0, pwchText,
               cchText, nullptr, 0, nullptr, nullptr);
      if (cch > 0) {
       auto psz = new(std::nothrow) char[cch];
       if (psz) {
        WideCharToMultiByte(cp, 0, pwchText, cchText,
                               psz, cch, nullptr, nullptr);
        WriteToStdOut(psz, cch);
        delete[] psz;
       }
      }
     }
    }
    GlobalUnlock(h);
   }
  }
  CloseClipboard();
 }
 return 0;
}

Okay, what do we have here?

We open the clipboard and try to get the Unicode text on it. We then look for the null terminator within the first 0x10000000 bytes. Why do I stop at 256MB? Because I'm lazy and this lets me avoid worrying about integer overflow. This is a Little Program, remember.
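As a rough illustration of that bounded scan, here is a portable sketch. The names BoundedByteLength and ClampToLimit are made up for this example; the program itself relies on StringCbLengthW and the min macro. The sketch uses char16_t in place of WCHAR so that it also compiles off Windows.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the bounded scan: look for the 16-bit
// null terminator within the first cbMax bytes of the buffer.
// On success, *pcbActual receives the byte length of the text,
// not counting the terminator.
bool BoundedByteLength(const char16_t* text, std::size_t cbMax,
                       std::size_t* pcbActual)
{
    assert(text != nullptr && pcbActual != nullptr);
    const std::size_t cchMax = cbMax / sizeof(char16_t);
    for (std::size_t i = 0; i < cchMax; ++i) {
        if (text[i] == u'\0') {
            *pcbActual = i * sizeof(char16_t);
            return true;
        }
    }
    return false; // no terminator inside the limit: reject the buffer
}

// The clamp from the program: never scan more than 0x10000000 bytes.
std::size_t ClampToLimit(std::size_t cbMemory)
{
    const std::size_t cbLimit = 0x10000000;
    return cbMemory < cbLimit ? cbMemory : cbLimit;
}
```

With the clamp in place, the character count handed to the conversion step is at most 0x08000000, which fits comfortably in the int that WideCharToMultiByte takes.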

If you pass the /U command line switch, then the output is printed to stdout as the Unicode string itself.

If you pass the /A command line switch, then the output is converted to ANSI.

Otherwise the output is converted to the OEM code page.
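The three-way choice above can be sketched on its own, separate from the clipboard code. This is only an illustration: OutputMode, EqualsIgnoreCase and SelectOutputMode are hypothetical names, and the comparison is a plain ASCII stand-in for _tcsicmp.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

enum class OutputMode { Oem, Ansi, Unicode };

// Case-insensitive ASCII comparison, standing in for _tcsicmp.
bool EqualsIgnoreCase(const std::string& a, const std::string& b)
{
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i]))) {
            return false;
        }
    }
    return true;
}

// Mirrors the program's rule: a switch counts only when it is the
// sole argument; anything else falls through to the OEM code page.
OutputMode SelectOutputMode(int argc, const char* const* argv)
{
    if (argc == 2) {
        if (EqualsIgnoreCase(argv[1], "/u")) return OutputMode::Unicode;
        if (EqualsIgnoreCase(argv[1], "/a")) return OutputMode::Ansi;
    }
    return OutputMode::Oem;
}
```

Note that an unrecognized switch, or more than one argument, quietly selects OEM output rather than reporting an error, just as in the program above.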

Bonus chatter: You can get most of the same program above (no Unicode output) in much less code if you're willing to use C#:

class Program {
  [System.STAThread] // the clipboard needs a single-threaded apartment
  public static void Main(string[] args)
  {
    string text = System.Windows.Forms.Clipboard.GetText();
    if (args.Length == 1 && string.Compare(args[0], "/a", true) == 0) {
      System.Console.OutputEncoding = System.Text.Encoding.Default;
    }
    System.Console.Write(text);
  }
}

Or perl (ANSI output only):

use Win32::Clipboard;
print Win32::Clipboard()->GetText();
Comments (27)
  1. BOFH says:

    It has annoyed me a lot over the years that the clip.exe from the resource kits, and later bundled in Windows, could only receive a pipe of another program's stdout into the clipboard, but had no way of printing the contents of the clipboard to stdout.

    Why this half-measure? :(

    At least UnxUtils has had pclip.exe, so with a bit of foresight it has been possible to handle the omission of this (pretty obvious) feature:

    It's baby-steps, I guess…

    Maybe we'll see this feature in a future version of clip.exe? :)

    [First you need to work it out with the people who complain that Windows is bloated. (One person's half-measure is another person's tightly-focused feature.) -Raymond]
  2. John says:

    Don't call it clop:

       // arbitrary limit because I am lazy

       cbMemory = min(cbMemory, 0x10000000);

    Can you really copy 256 MB of text to the clipboard?  [Ron_Burgundy]I'm not even mad.  That's amazing.[/Ron_Burgundy]

  3. Roland says:

    I wrote some similar programs, ccopy, ccut and ppaste, that will cut/copy/paste text and files to and from the clipboard. The code is available on GitHub. You can do things like 'ccopy *.txt' or 'dir | ccopy'…/ccopyppaste

  4. Chris L says:

    Be a friendly citizen of the world and make your program output CP_UTF8 instead of anything else.

    [You're making the assumption that all programs accept CP_UTF8. -Raymond]
  5. Chris says:

    If clip.exe is for getting text to the clipboard, then the tool getting text from the clipboard should be board.exe.

  6. MItaly says:

    Since the last non-NT Windows dates back almost 14 years, current CRTs haven't supported pre-NT Windows since 2008, and you hard-define `UNICODE` and `_UNICODE` at the top of the program, is there some particular reason why you keep using TCHARs & co.? I ask because the rare times I happen to write some Win32 code I always wonder if I should really go on using those unreadable `_tcsXXX` macros, which nowadays seem completely obsolete.

    [To make it easier for people to incorporate my code fragments into production code that does not target Unicode. -Raymond]
  7. WndSks says:

    It is pretty sad that Windows still does not have a proper console write helper. To do this correctly you have to:

    1) Check if stdout is the real console (not redirected), if it is you must call WriteConsoleW to get proper Unicode output (You must select a TT font in the console properties as well to really get the full benefit, you will still get some boxes but if you copy/paste to a text editor you will get the correct output)

    2) To support clip >> maybeUTF16BOM.txt you have to call GetFileType on stdout, if it is a file you need to seek to the start of the file and check for a BOM. If there is no BOM you output ACP/OEM or whatever your default encoding is…

  8. It crashes the compiler ;-)

       Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 16.00.40219.01 for 80x86


       a.cpp(40) : error C2039: 'nothrow' : is not a member of 'std'

       a.cpp(40) : error C2065: 'nothrow' : undeclared identifier

       a.cpp(40) : fatal error C1060: compiler is out of heap space

    #include <new> is missing. But the worse problem is, it displays correct characters only in OEM mode. And yes, I've set the console window font to Consolas.

  9. Adam Rosenfield says:

    [You're making the assumption that all programs accept CP_UTF8. -Raymond]

    It's true that not all programs accept UTF-8, but we'll never get there if we don't take any steps in that direction.  I'm a strong proponent of UTF-8 everywhere (

    My version of this little program only outputs UTF-8, it doesn't have any switches for choosing between ANSI/OEM/UTF-16.  The terminal program I use is Cygwin's MinTTY, and since that supports UTF-8, the output is always Unicode, and I never need to think about code pages.

    I realize that Windows uses UTF-16 for historical reasons (UTF-8 not being invented yet yada yada yada), but going forwards, I believe that the best long-term solution for new code is to always use UTF-8 whenever possible, and then convert to/from UTF-16 only when interfacing with Windows API functions.

  10. GregM says:

    "If clip.exe it for getting text to clipboard, then the tool getting text from clipboard should be board.exe."

    My first thought was "it's clip.exe run backwards, so it should be pilc.exe".  :)

  11. Danny says:

    "Or perl (ANSI output only):

    use Win32::Clipboard;

    print Win32::Clipboard()->GetText();"

    Or Delphi

    uses clipboard;


  12. loRg says:

    "Bonus chatter: You can get most of the same program above (no Unicode output) in much less code if you're willing to use C#:"

    You mean what will appear to be less code.

    "Or perl (ANSI output only):"

    Or in my new language (full unicode support with artificial intelligence just for the kicks)

    " "

    You shouldn't measure code quality or goodness by the code length. Lazy talk. You should measure good code by the least amount of bloat binary code it generates to do the task.

    Code length fetish exasperates me.

    [So you're saying we should all be writing in assembly language in order to get the best code? -Raymond]
  13. Cheong says:

    @Danny: Your code won't compile because it lacks all kinds of "Program <progname>", "Begin", "End." struct. With Perl, these are not necessary because it's scripting.

  14. Joker_vD says:

    [You're making the assumption that all programs accept CP_UTF8. -Raymond]

    The program that doesn't accept, say, CP-1251, doesn't accept UTF-8 just as well, because it still can't fopen() that file path I copied to the clipboard. Really, the very fact that fopen() can't reliably open like, half of the files on my machine is appalling.

    And if one has to feed a CP-1258-accepting program a UTF-8 string, well, there is iconv, and if you don't have iconv, you can write a MultiByte->UTF8/UTF8-MultiByte converter in 10 minutes in C, and I bet every more or less experienced Windows programmer has done it (I personally had to write it at least thrice, on different machines in different places).

  15. Cd-MaN says:

    A quick look at the docs (…/ seems to indicate that Perl can output Unicode like this:

    use Win32::Clipboard;

    print Win32::Clipboard()->GetAs(CF_UNICODETEXT);

    Don't quote me on this though – I don't have access to a Windows machine right now to test the correctness.

  16. Joshua says:

    The problem of logic here is because of erroneous decisions by MS that were only visible in hindsight, portable+Unicode+Windows has no intersection.

  17. Random User 572095 says:


    Indeed. Michael Kaplan has written about this a few times. I suppose you could use a variation on "void WriteLineRight(string)" from his Apr 2010 post:…/9989346.aspx

    Actually, if you search his blog for "IsConsoleFontTrueType", you can find all kinds of interesting info about outputting text in consoles that may or may not be redirected.

  18. Random User 572095 says:

    Adam Rosenfield,

    I have heard it argued that favoring UTF-8 is biased in favor of languages that can be represented entirely (or mostly) in basic ASCII. Most of Latin-based scripts get one or two bytes per character. Basic Greek, Coptic, Hebrew, and some Arabic gets two bytes per character. Most scripts not already mentioned (Tagalog, more Greek, Chinese, Japanese, Korean, etc) end up with three or more bytes per character.

    Meanwhile, if you use UTF-16, just about everything in common use (theoretically) is two bytes.

  19. Joshua says:

    @Random User 572095: Just about the only ones penalized are extended Greek and Japanese. The others are still reasonably small because they're one or two symbols per word rather than letter. UTF-16 unfairly penalizes western languages by allocating thousands of codepoints to each asian language but most codepoints are the same size as western languages that get about 60 codepoints.

  20. Adam Rosenfield says:

    @Random User 572095: While it's true that UTF-8 is biased towards Latin-based scripts in terms of memory used, that's only one (small) argument out of many in favor of UTF-8; did you read the link I posted?

    The theoretical advantage of UTF-16 is that it's a fixed-width encoding, which simplifies a lot of things, except that's not even true since you have to deal with surrogate pairs for characters outside the BMP.  So you still have to deal with a variable-width encoding, or your code doesn't really support Unicode, and that entire advantage is gone.

  21. Joker_vD says:

    My first language is one of those languages that have to use 2 bytes per character, be it UTF8 or UTF16. And hey, if you used a legacy code page, you could use 1 byte per character and save half the space!

    But you know what? It actually doesn't matter. At all. If you're concerned about the disk storage and/or network bandwidth, then use a compressor, any compressor; it compresses UTF8-encoded text to the same size it compresses ACP- or UTF16-encoded text. And concerns about RAM usage? An empty DOC-file was 10 KB and an empty DOCX was 12 KB large the last time I checked; in the biggest Word documents I've seen, most of the space was spent on pictures and (I believe) markup: a 200 MB document turned into a ~200 KB text file when saved as plain text. I heavily doubt the character strings eat up most of the memory in an application.

    So please, stop digging up that "reason". In the year 2013 we have enough memory, and that memory is fast enough for fiddling around with text.

  22. UTF-8 is a popular encoding nowadays. I believe Windows should add support for UTF-8 as a MBCS encoding for native applications. There's no technical reason not to do it.

  23. Joshua says:

    Many people say that. Michael Kaplan says it's hard. I say it's not hard. Code page 65001 is already assigned to it. It almost works as an OEMCP (everything appears to work except batch files don't run). From what I can find, only AnsiPrev and AnsiNext need to be fixed.

    [There is a lot of code that assumes that the maximum number of bytes in a multibyte encoding is 2. (See, for example, any code that finds the start of the current MBCS character. Or which assumes that anything that isn't a lead byte must be a trail byte.) Everything may appear to work for you, but there's a lot of stuff that doesn't work that you simply haven't run into (or haven't noticed). -Raymond]
  24. Random User 572095 says:

    Granted. And you are correct, I accidentally overlooked your link; I apologize.

    My purpose was only to ensure due thought was given. Clearly you have, or you would not have provided your link. (I have encountered many who have put little or no thought into it. For example, some seem to think UTF-8 somehow magically maps all of Unicode into 8 bits per character.)

  25. loRg says:

    [So you're saying we should all be writing in assembly language in order to get the best code? -Raymond]

    Of course not, and you know it, or at least I hope you do.

    I'm saying we should not be fooled by gimmicks like "oh look at this new language you can do more with less code. See how stupidly easy it is to use too. Even an intellectually challenged person could use it. Yes, you can make a program." but they fail to perf test or even mention the mess that goes behind the scenes to make it work or the problems of allowing lazy or intellectually challenged people to code.

    Have you ever tried to use windows 8 desktop mode? Charms and other crap popups the moment the mouse is too close to the screen edges. Disrupts and aggravates the user.

    To shut down Windows 8 you have to first go into metro, then move the mouse to the right edge, wait 5 minutes, hope it doesn't go away when you move the mouse to the right button, and a few clicks later Windows shuts down.

    Why not have it in the same menu as the log out option?

    This is the sort of utterly bad design that happens when you let lazy and/or stupid people code.

    Other examples: Look at the javascript code sites and ads use on the web. It's horrible!

    Coding should be difficult as an intelligence test.

    Just as the internet should be. The internet was difficult to access before; it was a peaceful time. Few to no scams, trolls were unheard of, no nonsense elitist views. Ad blockers weren't even needed back then because the ads were not hostile.

    [Let's go back to your original statement. "You should measure good code by the least amount of bloat binary code it generates to do the task." Total binary code size is only one factor in determining whether code is "good". Others are maintenance cost, deployment cost, opportunity cost, and portability. -Raymond]
  26. Joshua says:

    You're right my testing is limited. I have only 3 classes of programs. Those that are compiled Unicode, those that have latent code for UTF-8, and those that will never support MBCS, and most of the third category will pass UTF-8 if no attempt to edit the sequences is made.

    [Programs in the third category will run into a lot of problems when they try to do things like word breaking or ellipsis munging. They may end up splitting a three-byte UTF-8 sequence after the second byte. Many programs in categories 1 and 3 also assume that strlen(MBCS) ≤ 2 * wcslen(Unicode). Violating this assumption will lead to buffer overflows. -Raymond]
  27. Joshua says:

    No category 3 program in my possession makes any Unicode conversion call, and if I could use AppLocale's method to set 65001 no Category 1 program would ever observe the strange codepage. I know how to fix clipboard to behave with AppLocale and don't need MS to lift a finger for that piece.

    [Category 3 programs assume that 1 byte = 1 character. This is clearly not true with UTF-8. -Raymond]
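To put rough numbers on the byte counts argued over in the comments, here is a small sketch that counts how many UTF-8 bytes a UTF-16 string needs. Utf8ByteCount is a hypothetical helper written for this illustration, and counting an unpaired surrogate as three bytes (the size of the replacement character) is an assumption of the sketch, not something from the discussion.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Count the UTF-8 bytes needed to encode a UTF-16 string.
// Assumption: an unpaired surrogate is counted as 3 bytes, as if it
// were replaced by U+FFFD.
std::size_t Utf8ByteCount(const std::u16string& s)
{
    std::size_t cb = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        const char16_t c = s[i];
        if (c < 0x80) {
            cb += 1;                    // ASCII
        } else if (c < 0x800) {
            cb += 2;                    // e.g. Latin-1, Greek, Cyrillic
        } else if (c >= 0xD800 && c <= 0xDBFF && i + 1 < s.size() &&
                   s[i + 1] >= 0xDC00 && s[i + 1] <= 0xDFFF) {
            cb += 4;                    // surrogate pair, outside the BMP
            ++i;                        // consume the low surrogate too
        } else {
            cb += 3;                    // rest of the BMP
        }
    }
    return cb;
}
```

The Euro sign U+20AC is a single UTF-16 code unit but three UTF-8 bytes, so a buffer sized on the assumption that strlen(MBCS) ≤ 2 * wcslen(Unicode) would be too small for it.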

Comments are closed.