Copying HTML on the clipboard


Setting plain text on the clipboard is easy. Call Clipboard.SetText("Hello!"), and it works great. But what if you want to set HTML? It’s tempting to think you just call Clipboard.SetText("<b>Hello!</b>", TextDataFormat.Html). But that doesn’t work, because HTML on the clipboard (in CF_HTML format) requires the string to carry a header in front of the actual HTML. (I first hit this when I wanted to make my RTF-to-HTML converter copy the HTML to the clipboard instead of writing it to a file.)


You’ll also notice this if you call Clipboard.GetText(TextDataFormat.Html) after copying HTML to the clipboard. You don’t just get the HTML string back; there’s also a giant header in front of it. This raises two problems:


1. How do you get the HTML from the clipboard? (strip the header)
2. How do you copy raw HTML to the clipboard? (generate the header)


I have some sample code to do this below, but there are a few points I want to hit first.


Example of a bad example:
<rant> The example code for Clipboard.SetText is very “clever”. It manages to call the API, but in a way that’s completely meaningless, and completely avoids mentioning this crucial header. 


// Demonstrates SetText, ContainsText, and GetText.
public String SwapClipboardHtmlText(String replacementHtmlText)
{
    String returnHtmlText = null;
    if (Clipboard.ContainsText(TextDataFormat.Html))
    {
        returnHtmlText = Clipboard.GetText(TextDataFormat.Html);
        Clipboard.SetText(replacementHtmlText, TextDataFormat.Html);
    }
    return returnHtmlText;
}

My guess is that the sample writer originally tried to do something straightforward and useful, found it didn’t work (for the exact reasons I’m writing this blog post), and then came up with this more obscure, meaningless excuse of an example. </rant>


So what’s this header?


Clipboard.SetText(…, TextDataFormat.Html) is just shorthand for Clipboard.SetData("HTML Format", …), which is just a wrapper around the raw Win32 APIs and the CF_HTML format, which require this text header. (In my experience, WinForms is usually great about not just being raw P/Invokes to Win32, but actually smoothing over the Win32 APIs and exposing a layer that’s fundamentally easy to use. I think this is a case that just fell through the cracks.)


The header is a text string that prefixes the actual string you set to the clipboard. The format is described here. As noted above, you can see it by calling Clipboard.GetText(TextDataFormat.Html) after copying HTML to the clipboard.


So you don’t just say Clipboard.SetText("<b>Hello!</b>", TextDataFormat.Html).


You end up with a text string like this that you have to pass in:


Version:1.0
StartHTML:000125
EndHTML:000260
StartFragment:000209
EndFragment:000222
SourceURL:file:///C:/temp/test.htm

<HTML>
<head>
<title>HTML clipboard</title>
</head>
<body>
<!--StartFragment--><b>Hello!</b><!--EndFragment-->
</body>
</html>

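The header is the block of lines from Version through SourceURL. The actual fragment is the <b>Hello!</b> between the StartFragment and EndFragment comments. The four numbers in the header are byte offsets counted from the start of the whole string: StartHTML and EndHTML bracket the HTML document, and StartFragment and EndFragment bracket the fragment itself.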
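To make the offset bookkeeping concrete, here is a minimal sketch of generating such a string. It is not the HtmlFragment class from this post: the class name CfHtmlSketch, the method names, and the fixed <HTML>/<head>/<body> wrapper are assumptions made for illustration. It assumes the offsets are UTF-8 byte counts padded to six digits and that lines end in CR LF, which is what the numbers in the example above work out to.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper for illustration; not the author's HtmlFragment class.
static class CfHtmlSketch
{
    const string StartMarker = "<!--StartFragment-->";
    const string EndMarker = "<!--EndFragment-->";

    // Lays out the header. Every offset is padded to six digits, so the
    // header's length does not depend on the offset values themselves.
    static string Header(int startHtml, int endHtml, int startFrag, int endFrag, string sourceUrl)
    {
        CultureInfo inv = CultureInfo.InvariantCulture;
        return "Version:1.0\r\n"
            + "StartHTML:" + startHtml.ToString("D6", inv) + "\r\n"
            + "EndHTML:" + endHtml.ToString("D6", inv) + "\r\n"
            + "StartFragment:" + startFrag.ToString("D6", inv) + "\r\n"
            + "EndFragment:" + endFrag.ToString("D6", inv) + "\r\n"
            + "SourceURL:" + sourceUrl + "\r\n";
    }

    // Wraps a fragment in a small HTML document and prefixes the header.
    // Offsets are UTF-8 byte counts from the start of the returned string.
    public static string Build(string fragment, string title, string sourceUrl)
    {
        string pre = "<HTML>\r\n<head>\r\n<title>" + title + "</title>\r\n</head>\r\n<body>\r\n" + StartMarker;
        string post = EndMarker + "\r\n</body>\r\n</html>\r\n";

        Encoding utf8 = Encoding.UTF8;
        // Measure the header with dummy offsets; its size is fixed by the padding.
        int startHtml = utf8.GetByteCount(Header(0, 0, 0, 0, sourceUrl));
        int startFrag = startHtml + utf8.GetByteCount(pre);
        int endFrag = startFrag + utf8.GetByteCount(fragment);
        int endHtml = endFrag + utf8.GetByteCount(post);

        return Header(startHtml, endHtml, startFrag, endFrag, sourceUrl) + pre + fragment + post;
    }
}
```

Calling CfHtmlSketch.Build("<b>Hello!</b>", "HTML clipboard", "file:///C:/temp/test.htm") reproduces the four offsets shown in the example above. The resulting string is the kind of thing you would hand to Clipboard.SetText(…, TextDataFormat.Html) from an STA thread.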


There’s a method to the madness. It provides benefits like:



  1. context to the fragment, such as any enclosing tags the fragment is in. For example, if the text you copied is inside a bold tag, the context can capture that.

  2. a source URL, so you can resolve relative links.
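Going the other way (stripping the header), the same fields make the job fairly mechanical. The sketch below is an illustration only, not the HtmlFragment class from this post: CfHtmlReader, ReadField, and ExtractFragment are hypothetical names. It assumes the offsets are UTF-8 byte counts and that the string you got back re-encodes to the same bytes the offsets were computed against, which holds for plain ASCII like the example above.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper for illustration; not the author's HtmlFragment class.
static class CfHtmlReader
{
    // Returns the text after "key:" on its header line, or null if absent.
    public static string ReadField(string data, string key)
    {
        string prefix = key + ":";
        foreach (string line in data.Split('\n'))
        {
            // The header is over once the markup begins.
            if (line.StartsWith("<", StringComparison.Ordinal)) break;
            if (line.StartsWith(prefix, StringComparison.Ordinal))
                return line.Substring(prefix.Length).Trim();
        }
        return null;
    }

    // Reads a numeric header field such as "StartFragment:000209".
    static int ReadOffset(string data, string key)
    {
        string value = ReadField(data, key);
        if (value == null) throw new FormatException("Missing header field: " + key);
        return int.Parse(value, CultureInfo.InvariantCulture);
    }

    // Cuts out the bytes between StartFragment and EndFragment.
    public static string ExtractFragment(string data)
    {
        byte[] bytes = Encoding.UTF8.GetBytes(data);
        int start = ReadOffset(data, "StartFragment");
        int end = ReadOffset(data, "EndFragment");
        return Encoding.UTF8.GetString(bytes, start, end - start);
    }
}
```

ReadField(data, "SourceURL") gives you the base URL for resolving relative links, and the bytes between StartHTML and StartFragment are where any enclosing-tag context lives if you need it.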

Sample code:


I wrote a class to handle the copying + pasting of HTML snippets to the clipboard. Here’s a little sample code demonstrating the class in use:



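using System;
using System.Diagnostics;

class Foo
{
    [STAThread]
    static void Main()
    {
        // HtmlFragment is the helper class from the sample code linked below.
        string html = "<b>Hello!</b>";
        HtmlFragment.CopyToClipboard(html);

        // Read it back and check that the fragment survived the round trip.
        HtmlFragment html2 = HtmlFragment.FromClipboard();
        Debug.Assert(html2.Fragment == html);
    }
}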


The sample code is here. I tested it with IE7 and FrontPage. Since the header spec wasn’t very precise, there are no general guarantees. Use at your own risk, etc., etc.


It worked well enough that I hooked it up to my RTF/HTML converter and used that to paste the code snippets here.


Before, I’d save the HTML to a file (out.html), and then load that in IE and copy from there:


TextWriter tw = new StreamWriter("out.html");
Format(tw, data);
tw.Close();
 


Now I can copy the HTML to the clipboard:


StringWriter tw = new StringWriter();
Format(tw, data);
string s = tw.ToString();

HtmlFragment.CopyToClipboard(s);


 

Comments (6)

  1. Oren Novotny says:

    You might want to also take a look at the CopySourceAsHtml add-in for VS:

    http://www.jtleigh.com/people/colin/software/CopySourceAsHtml/

    Among other features, it can let you include line-numbers and use a different font/size than the one you use for editing.

  2. Craig says:

    Tried your code exactly and it doesn’t work 🙁 Using XP, C# 2.0. Bummer. I’ve been trying to copy to HTML to the clipboard all morning. I may have to throw in the towel and get rid of that functionality. Nice code though, wished it worked.

  3. Craig, where does it break? Does it just not paste? What application are you pasting into? Does it at least paste accurately into Word or FrontPage?

  4. Jeff Winchell says:

    You are correct about "Since the header spec wasn’t very precise."

    I read a header I manually put into the clipboard (by typing Ctrl-C in a browser window). I was able to replicate this HTML using the SetClipboard WinAPI function with considerable work. Then I started removing things to see when it broke.

    Example:

    If I didn’t put a line break between the last </Body> and </HTML> end tags, it didn’t work. The same was true if I didn’t put a line break right before "StartFragment" and also right before "EndFragment" (I mean the instances of those strings in the first 50 or so characters). Also, I didn’t test how exact it wanted the line breaks to be… I used ASCII 13 and 10 for my line breaks.

    This was true for my WinXP Pro SP2 machine (using MSFT Dynamics AX as the front end to the WinAPI code… I’ll try C# shortly).

    There may be other caveats I haven’t run into.

    In general though, this task was a huge PITA.

    I also can’t believe MSFT thinks HTML is a custom format.

