Searching for Custom ID Tags With Signed XML

Last week, I blogged about using references to sign only specific parts of an XML document. The biggest limitation of this approach is that you must refer to the nodes being signed by ID, which in v1.0 and v1.1 of the framework means an attribute named "Id". The problem is that the Id attribute may already have another use in your schema, so you cannot reuse it to identify the nodes you want to sign. Another problem is that the XML being signed may be generated by a tool or program that you cannot change, so you cannot add Id attributes to it. Whidbey relaxes this limitation somewhat by also allowing "id" and "ID", but the fundamental problem still exists.

Recently, this problem cropped up for one of our customers, who was having trouble getting signed XML to work in their environment. Their application was trying to sign XML generated by a Java program. The Java application had generated ids by using "_Id" attributes, making it impossible to sign with an id-based reference. If the only problem had been creating a signature, there would have been an easy workaround. However, the C# portion of the program was trying to verify the signature that the Java application had created, and could not resolve references to the _Id elements.

So, how did we solve this problem? With a very clever solution from one of the other members of the security team. Nodes that are referred to by ID are resolved in the GetIdElement method of the SignedXml class. By subclassing SignedXml and overriding this method, it's possible to create your own id node resolver. I'll show a sample here that allows ids stored in underscore-prefixed attributes to be resolved by the XML signature engine.

Although this sample relies on a similar ID attribute mechanism for identifying nodes, there's nothing stopping you from creating a fancier system, as long as you always return exactly one XmlElement representing the specified node, or null if the node could not be found. The best use of this technique is to enable interop scenarios. If you're trying to do this simply to have more fine-grained control over which nodes your reference identifies, I'll show a better technique later this week.

Implementation

The only method that I'll need to override is the GetIdElement method. In addition, I'll provide a constructor that takes an XmlDocument, and passes it along to the SignedXml constructor. I've also defined an array of strings, which represent the attributes that I'll allow to identify nodes. Here's the code:

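using System.Security.Cryptography.Xml;
using System.Xml;

public class CustomIdSignedXml : SignedXml
{
    // attribute names that may identify a node, in order of precedence
    private static readonly string[] idAttrs = new string[]
    {
        "_id",
        "_Id",
        "_ID"
    };

    public CustomIdSignedXml(XmlDocument doc) : base(doc)
    {
    }

    public override XmlElement GetIdElement(XmlDocument doc, string id)
    {
        // nothing to search without a document
        if(doc == null)
            return null;

        // check to see if it's a standard ID reference
        XmlElement idElem = base.GetIdElement(doc, id);
        if(idElem != null)
            return idElem;

        // if not, search for custom ids; the first attribute name that
        // produces a match wins
        foreach(string idAttr in idAttrs)
        {
            idElem = doc.SelectSingleNode("//*[@" + idAttr + "=\"" + id + "\"]") as XmlElement;
            if(idElem != null)
                break;
        }

        // null if no node carries the id under any of the attribute names
        return idElem;
    }
}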

So what's going on here? At the beginning of the method, I call the default GetIdElement, and if it finds a match, I return that node. This allows my resolver to continue working with standard "Id" attributes. Since I do this at the beginning of the method, nodes with the standard id attributes also take precedence over nodes with my custom attributes. Next, I loop over my custom attributes and perform an XPath query for each one, looking for the first node that has that attribute with the correct value. Since I quit searching as soon as I find a match, the order in which the attributes appear in the idAttrs array is also their order of precedence. For instance, a reference to #idnode when run over the following XML,

<root>
  <node1 _id='idnode'/>
  <node2 Id='idnode'/>
  <node3 _ID='idnode'/>
  <node4 _ID='otherref'/>
  <node5 _id='otherref'/>
  <node6 _id='otherref'/>
</root>

will match node2, since IDs found by the SignedXml class have the highest precedence. A reference to #otherref will match node5, since _id occurs before _ID in my search and only the first match is returned.
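To check these precedence rules yourself, here is a minimal sketch that calls the resolver directly on the XML above, without computing a signature. It carries a condensed copy of the resolver so that it compiles on its own. The PrecedenceDemo class and its Resolve helper are names I made up for this illustration.

```csharp
using System;
using System.Security.Cryptography.Xml;
using System.Xml;

public class CustomIdSignedXml : SignedXml
{
    private static readonly string[] idAttrs = { "_id", "_Id", "_ID" };

    public CustomIdSignedXml(XmlDocument doc) : base(doc) { }

    public override XmlElement GetIdElement(XmlDocument doc, string id)
    {
        // standard ID attributes are tried first
        XmlElement idElem = base.GetIdElement(doc, id);
        if (idElem != null)
            return idElem;

        // then the custom attributes, in array order
        foreach (string idAttr in idAttrs)
        {
            idElem = doc.SelectSingleNode("//*[@" + idAttr + "=\"" + id + "\"]") as XmlElement;
            if (idElem != null)
                break;
        }
        return idElem;
    }
}

// Hypothetical demo class, not part of the original sample
public static class PrecedenceDemo
{
    private const string Xml =
        "<root>" +
        "<node1 _id='idnode'/>" +
        "<node2 Id='idnode'/>" +
        "<node3 _ID='idnode'/>" +
        "<node4 _ID='otherref'/>" +
        "<node5 _id='otherref'/>" +
        "<node6 _id='otherref'/>" +
        "</root>";

    // Returns the name of the element that the given id resolves to,
    // or null if nothing matches
    public static string Resolve(string id)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(Xml);
        XmlElement elem = new CustomIdSignedXml(doc).GetIdElement(doc, id);
        return elem == null ? null : elem.Name;
    }

    public static void Main()
    {
        Console.WriteLine(Resolve("idnode"));   // node2, found by the base class
        Console.WriteLine(Resolve("otherref")); // node5, first _id match
    }
}
```

Calling GetIdElement directly like this is also a handy way to debug a resolver before involving any keys or signatures.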

Searching for the id nodes is done with the SelectSingleNode method, passing an XPath query customized for the name of the id attribute and the ID I'm looking for. For instance, when searching for nodes with an _Id attribute that would match a reference to #myref, the XPath would be:

//*[@_Id="myref"]
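Because the query is built by string concatenation, an id containing a quote character would change the meaning of the XPath expression. One way to guard against that is to accept only ids that are valid XML NCNames before building the query. This is my own addition rather than part of the sample, and the IdQuery class and its Build method are hypothetical names. XmlConvert.VerifyNCName is the framework call that does the actual check.

```csharp
using System;
using System.Xml;

// Hypothetical helper, not part of the original sample
public static class IdQuery
{
    // Builds the XPath query for one attribute name, or returns null
    // when the id is empty or is not a valid NCName
    public static string Build(string idAttr, string id)
    {
        if (string.IsNullOrEmpty(id))
            return null;

        try
        {
            // throws XmlException for characters such as quotes or spaces
            XmlConvert.VerifyNCName(id);
        }
        catch (XmlException)
        {
            return null;
        }

        return "//*[@" + idAttr + "=\"" + id + "\"]";
    }

    public static void Main()
    {
        Console.WriteLine(Build("_Id", "myref"));          // //*[@_Id="myref"]
        Console.WriteLine(Build("_Id", "a\"b") == null);   // True
    }
}
```

Inside GetIdElement, a null result from a helper like this would simply mean skipping the search and returning null for that reference.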

Signing With Custom IDs

Creating a signature that uses the custom id resolver is done in exactly the same way as creating a normal signature containing id references. The only difference is that instead of creating a SignedXml object, you create an instance of the custom ID resolver. Since GetIdElement is virtual, you can even assign the instance to a variable of type SignedXml, and everything will still work as expected. This helps to enable scenarios where the code that actually computes signatures lives in a method that you don't control, but you do control which SignedXml object it uses.

Enough talk. Here's some sample code showing a signature being created with my custom _Id resolver. The XML that I'm going to sign is:

<xml>
  <tag Id='tag1'/>
  <tag _Id='tag2'/>
</xml>

And the code that computes the signature is:

// xml is a string holding the document shown above
XmlDocument doc = new XmlDocument();
doc.LoadXml(xml);

// the custom resolver is used through a plain SignedXml reference
SignedXml signer = new CustomIdSignedXml(doc);
signer.AddReference(new Reference("#tag1"));
signer.AddReference(new Reference("#tag2"));

signer.SigningKey = new RSACryptoServiceProvider();
signer.ComputeSignature();
Console.WriteLine(signer.GetXml().OuterXml);

This produces a signature similar to the following:

<Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
  <SignedInfo>
    <CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315" />
    <SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1" />
    <Reference URI="#tag1">
      <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
      <DigestValue>/dsJPkLT3QydsHQ1dpmMLPEIbRo=</DigestValue>
    </Reference>
    <Reference URI="#tag2">
      <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1" />
      <DigestValue>AWY9mgt5Z+jRSS+CevluG77gFC8=</DigestValue>
    </Reference>
  </SignedInfo>
  <SignatureValue>JVXWblnT . . . 55rZ7zc=</SignatureValue>
</Signature>

Verifying the Signature

Remember that the standard SignedXml class has no idea how to resolve references to custom ID nodes, so verification also needs an instance of the custom signed XML class. Aside from this difference, verification works exactly as you would expect:

SignedXml verifier = new CustomIdSignedXml(doc);
verifier.LoadXml(signer.GetXml());
if(verifier.CheckSignature(signer.SigningKey))
    Console.WriteLine("Signature verified");
else
    Console.WriteLine("Invalid signature");
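For reference, here is how the signing and verification fragments might fit together in one program. Treat it as a sketch under a few assumptions: the RoundTripDemo class and its SignAndVerify method are names invented for this example, the XML is inlined as a string, and I've swapped in RSA.Create() for the RSACryptoServiceProvider used above so that the key comes from whatever RSA implementation the platform provides.

```csharp
using System;
using System.Security.Cryptography;
using System.Security.Cryptography.Xml;
using System.Xml;

public class CustomIdSignedXml : SignedXml
{
    private static readonly string[] idAttrs = { "_id", "_Id", "_ID" };

    public CustomIdSignedXml(XmlDocument doc) : base(doc) { }

    public override XmlElement GetIdElement(XmlDocument doc, string id)
    {
        // standard ID attributes first, then the custom ones
        XmlElement idElem = base.GetIdElement(doc, id);
        if (idElem != null)
            return idElem;

        foreach (string idAttr in idAttrs)
        {
            idElem = doc.SelectSingleNode("//*[@" + idAttr + "=\"" + id + "\"]") as XmlElement;
            if (idElem != null)
                break;
        }
        return idElem;
    }
}

// Hypothetical demo class, not part of the original sample
public static class RoundTripDemo
{
    // Signs both tags, then verifies the detached signature with the same key
    public static bool SignAndVerify()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<xml><tag Id='tag1'/><tag _Id='tag2'/></xml>");

        using (RSA key = RSA.Create())
        {
            SignedXml signer = new CustomIdSignedXml(doc);
            signer.AddReference(new Reference("#tag1")); // resolved by the base class
            signer.AddReference(new Reference("#tag2")); // resolved by the custom search
            signer.SigningKey = key;
            signer.ComputeSignature();

            // verification needs the custom resolver too
            SignedXml verifier = new CustomIdSignedXml(doc);
            verifier.LoadXml(signer.GetXml());
            return verifier.CheckSignature(key);
        }
    }

    public static void Main()
    {
        Console.WriteLine(SignAndVerify() ? "Signature verified" : "Invalid signature");
    }
}
```

In a real interop scenario the signature would come from the other application rather than from a signer object in the same process, and you would verify against its public key.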