Using SelectSingleNode (or SelectNodes) on XML where the default namespace has been set


I've been stumped by this one at least two times over the last couple of years, so I thought it was a good candidate to be written up here.

I was trying to select a node from some standard XHTML where the default namespace was set. In other words, the XHTML was something like:

   <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"[]>
   <html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
   <head>
   <meta http-equiv="content-type" content="text/html; charset=utf-8" />
   <title>MSN Search News: Microsoft</title> ...

Note the xmlns attribute on the root <html> node.

Without thinking too hard, I first tried to find the title of the page by going ...

   XmlDocument resultsXhtml = new XmlDocument();
   resultsXhtml.Load("http://search.msn.com/news/results.aspx?q=Microsoft");
   XmlNode metaNode = resultsXhtml.SelectSingleNode("//title");

... which left metaNode as null.

This took me a little while to figure out. Clearly I needed to indicate in the XPath query that the title element is in the default namespace, but how can I do that when that namespace has no prefix in the actual XML?

The solution (reasonably obviously!) is to register a prefix of my own choosing in an XmlNamespaceManager object, and then use that namespace manager when doing the select. Here's some code that works:

   XmlDocument resultsXhtml = new XmlDocument();
   resultsXhtml.Load("http://search.msn.com/news/results.aspx?q=Microsoft");

   // Map a prefix of our own choosing to the XHTML namespace URI
   XmlNamespaceManager namespaceManager = new XmlNamespaceManager(resultsXhtml.NameTable);
   namespaceManager.AddNamespace("myprefix", "http://www.w3.org/1999/xhtml");

   // Use that prefix in the XPath and pass the manager to the select
   XmlNode metaNode = resultsXhtml.SelectSingleNode("//myprefix:title", namespaceManager);

I think what's interesting about this problem is the way you have to think about namespaces and XPath queries. The namespace is a logical entity identified by its URI, not by the prefix used in the actual XML. You can therefore register that URI under any prefix you like for your XPath query, which isn't a completely intuitive concept, to me at least!
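
To make that last point concrete, here is a minimal sketch that runs the same query against a small inline document under two different prefixes. The sample XML string, the class name NamespacePrefixDemo and the helper FindTitle are made up for illustration; only the XmlDocument and XmlNamespaceManager calls come from the code above.

```csharp
using System;
using System.Xml;

public static class NamespacePrefixDemo
{
    // Hypothetical sample document: a default namespace and no prefix anywhere.
    private const string Xhtml =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\">" +
        "<head><title>MSN Search News: Microsoft</title></head>" +
        "</html>";

    // Registers the XHTML namespace URI under the given prefix and returns
    // the title text found with that prefix, or null if nothing matches.
    public static string FindTitle(string prefix)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xhtml);

        XmlNamespaceManager manager = new XmlNamespaceManager(doc.NameTable);
        manager.AddNamespace(prefix, "http://www.w3.org/1999/xhtml");

        XmlNode node = doc.SelectSingleNode("//" + prefix + ":title", manager);
        return node == null ? null : node.InnerText;
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xhtml);

        // An unprefixed name in XPath only matches elements in no namespace,
        // so this finds nothing even though the document has a <title>.
        Console.WriteLine(doc.SelectSingleNode("//title") == null);

        // The prefix is arbitrary; only the URI it maps to matters.
        Console.WriteLine(FindTitle("myprefix"));
        Console.WriteLine(FindTitle("x"));
    }
}
```

Both calls to FindTitle find the same node, because the match is made on the namespace URI and the local name, never on the prefix text.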

Comments (13)
  1. Paul Hammond says:

    Actually, the whole idea that it is the URI that is the logical entity, and NOT the prefix is something that took a while for me to "get" also. It was only when I was working with a lot of files that had an un-prefixed namespace that I finally figured it out!

  2. Peter says:

    Thanks — I was struggling for AGES with this. The documentation is as clear as mud…

  3. mike says:

    Thanks, this helped me out. I was wondering why my XPath was not working until I stumbled on this post.

  4. Dag says:

    Thanks A LOT!

    I was going crazy over my null results when calling SelectSingleNode to get inner nodes in my SOAP document.

  5. jkita says:

    While I understand that the prefix for the xpath query is controlled by the XmlNamespaceManager and can be different than the prefixes used in the xml itself, it disturbs me that one can set the “default namespace” for the XmlNamespaceManager and that default is ignored in the xpath query.  This to me is a bug in the implementation, which should be corrected to avoid untold hours of frustration by developers attempting to discover this workaround.

    Thanks for your post, it did indeed shorten the amount of time that I was frustrated.

  6. Shane says:

    This was wrecking my head as well, thanks for posting John

  7. Gary says:

    Great post. This had me stumped.

  8. Mike says:

    Great explanation. Thanks!

  9. Elsa says:

    What about if the XML doesn't have any namespace to refer to?

  10. John Pollard says:

    Elsa, I'm not sure I understand your question.

    If there is no namespace set, doesn't the node query just work without any prefix?

    So in my example:

       XmlNode metaNode = resultsXhtml.SelectSingleNode("//title");

    then it would actually return the title node if the XML had no namespace set.

    John

  11. Manish says:

    Here the namespace is hardcoded. Can we process multiple XML documents that have the same child node, title, but different namespaces (the namespace will be decided at runtime)?

    For Example: XML Document 1

    <Root xmlns="http://abc.com/example">

    </Root>

    XML Document 2

    <Root xmlns="http://xyz.com/example2">

    <title>  Title 2 </title>

    </Root>

  12. John Pollard says:

    Manish, it's been a long time since I wrote this article (and I don't do much XML programming nowadays!), but I think the answer is that because the namespaces of the two documents are different, the documents aren't actually the same.

    Just because they look similar with the same structure, the different namespace means they are actually different.

