XAML FlowDocument to HTML Conversion Prototype

XAML FlowDocuments and HTML have some things in common, but they also have distinct differences that make writing a conversion utility tricky. A well-written XSLT could potentially process XHTML input and generate FlowDocument content, but this presupposes well-formed HTML in the first place. I've tried to go down this road on a few occasions with limited success.

Since most HTML isn't well-formed, a more flexible solution was to build a conversion library. The attached application contains class libraries capable of converting from HTML to FlowDocument, or from FlowDocument to HTML. I can't emphasize enough that this is simply a prototype -- true fidelity of content is neither promised nor expected. However, if you're interested in playing with (and potentially improving upon) a conversion prototype, you'll find the attached project very useful. The user interface is basic -- simply a TextBox into which you can paste content for conversion. The converted content appears in the same TextBox after you press the "Convert!" button.

These classes can also be used to process the entire contents of a folder and turn all of the HTML contained therein to XAML. We're using a similar technique for our updated version of the SDKViewer demo that will ship with the RC1 SDK (so content will be up to date in future versions of the application, unlike the present circumstance). You could do something similar, using a foreach loop, like the following:

C#

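// Requires: using System.IO; using System.Text;
// filepath is the folder that contains the HTML files to convert.
Directory.CreateDirectory("test");
foreach (string htmlPath in Directory.GetFiles(filepath))
{
    // GetFiles returns each name with filepath prepended, so keep only the file name for the output path.
    string xamlPath = Path.Combine("test", Path.GetFileName(htmlPath) + ".xaml");
    string html = File.ReadAllText(htmlPath);
    // Passing true requests a complete FlowDocument rather than a fragment.
    File.WriteAllText(xamlPath, HtmlToXamlConverter.ConvertHtmlToXaml(html, true), Encoding.UTF8);
}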

This conversion library isn't perfect -- but it gives you a big head start if your enterprise is considering conversion from HTML to FlowDocuments. Converting your HTML content would allow you to take advantage of the enhanced reading capabilities of WPF, including paginated content, magnification, and annotations.
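
To actually see those reading features, the converted markup has to be loaded back into a FlowDocument and handed to a viewer control. Here is a minimal sketch of one way to do that, assuming the .xaml file was written by the loop above with a FlowDocument root element. The class and method names (ConvertedXamlLoader, LoadIntoReader) are made up for illustration and are not part of the attached project; XamlReader.Parse and FlowDocumentReader are standard WPF types.

```csharp
using System.IO;
using System.Windows.Controls;
using System.Windows.Documents;
using System.Windows.Markup;

// Hypothetical helper, not part of the attached project.
public static class ConvertedXamlLoader
{
    // Reads a converted .xaml file and wraps it in a FlowDocumentReader.
    // Must be called on a WPF UI (STA) thread, because it creates a control.
    public static FlowDocumentReader LoadIntoReader(string xamlPath)
    {
        string xaml = File.ReadAllText(xamlPath);

        // XamlReader.Parse returns object; the cast throws InvalidCastException
        // if the root element is not a FlowDocument (assumption: the file was
        // produced with the second converter argument set to true).
        FlowDocument document = (FlowDocument)XamlReader.Parse(xaml);

        FlowDocumentReader reader = new FlowDocumentReader();
        reader.Document = document;
        return reader;
    }
}
```

You could then place the returned control in a window, for example by assigning it to the window's Content property. Since the converter is a prototype, it is worth wrapping the Parse call in a try/catch for XamlParseException when processing a whole folder, so one bad file doesn't stop the batch.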

For more information on the WPF document platform, see this SDK topic:

https://windowssdk.msdn.microsoft.com/library/en-us/wpf_conceptual/html/6e8db7bc-050a-4070-aa72-bb8c46e87ff8.asp?frame=true

Good luck.

-Keith

About Us


We are the Windows Presentation Foundation SDK writers and editors.

html2flow.zip