An open-source full-fidelity XML parser


A while back I needed to understand XML at a low level, including whitespace, line breaks, and comments. XLinq is a fantastic and powerful library, but it lacks a few things. For instance, I noticed it doesn’t preserve whitespace around attributes. Nor does it expose the position/line/column information about the nodes.
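
To make the whitespace point concrete, here is a minimal sketch. It uses Python's standard `xml.dom.minidom` purely as a stand-in for a conventional tree-building parser. That is an assumption for illustration: it is not XLinq and not the library described in this post. It only shows the general effect, which is that a parse-then-serialize round trip loses the original spacing and quoting around attributes.

```python
from xml.dom import minidom

# Source text with irregular spacing and mixed quote styles around attributes.
source = "<root   a = \"1\"   b='2' />"

# Parse into a DOM tree, then serialize the root element back to text.
element = minidom.parseString(source).documentElement
round_tripped = element.toxml()

# Attribute values survive, but the original spacing and quoting do not,
# so the serialized text differs from the input text.
print(round_tripped)
print(round_tripped == source)
```

A full-fidelity parser, by contrast, keeps whitespace and similar details in its tree so that the original text can be reproduced exactly.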

So I took the part of the Roslyn VB parser that parses XML literals, ported it to C#, and made it into a standalone library that I could use for my purposes. Realizing that it might fill some gaps for certain scenarios out there, I’m putting it out as open source:

https://github.com/KirillOsenkov/XmlParser

And here’s an example of what I needed it for: notice the colored XML with hyperlinks?

http://referencesource.microsoft.com/#MSBuildFiles/C/ProgramFiles(x86)/MSBuild/14.0/bin_/amd64/Microsoft.CSharp.Core.targets,67

The parser is also available on NuGet:

Microsoft.Language.Xml

Microsoft.Language.Xml.Editor

The editor package is a simple language service for Visual Studio. It is less useful on its own, since Visual Studio already has a far more powerful language service for XML. The advantage, however, is that you can tweak this one any way you want to add further extensibility features to XML files in Visual Studio on top of what’s already there.


Comments (2)

  1. Ian Yates says:

    Looks pretty neat – thanks for sharing.

    A quick note to others, if you click the referencesource link, you might get a 404 the first time for the right-hand pane (or at least I did).  Refreshing did the trick.

  2. Wesner Moise says:

    Did you check out the LoadOptions for XDocument.Load, which include whitespace and line information (using IXmlLineInfo)?

    https://msdn.microsoft.com/en-us/library/system.xml.linq.loadoptions(v=vs.110).aspx
