Walking XmlSchema and XmlSchemaSet objects


I’ve seen a number of newsgroup posts asking how to find a particular element or how to get a list of all elements from either XmlSchema or XmlSchemaSet objects.  Since we don’t provide this functionality in the framework, you need to manually traverse these objects to get what you want.  Depending on what your goal is, you might need to get either pre-compile or post-compile information.  For example, named groups are not available post-compile while PSVI information is not available pre-compile.  In this post I’ll show you how you can get pre-compile information from these objects.  To make the reading easier, I don’t include any error or exception handling code which is not relevant.


 


I’m assuming that you have an XmlSchemaSet with a few schemas added to it.  XmlSchemaSet provides collections of global elements, types, and attributes.  However, these collections are empty until you compile the set.  Note that the set does not expose a collection of named groups.  To get the pre-compile info (including a list of named groups), you will need to go through each schema in the set.


 


foreach (XmlSchema schema in ss.Schemas())
{
}
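To make the pre-compile versus post-compile distinction concrete, here is a minimal sketch. The class name CompileStateDemo and the sample schema are hypothetical and exist only for this illustration; the calls themselves (XmlSchemaSet.Add, Schemas, GlobalElements, Compile, and XmlSchema.Items) are the standard System.Xml.Schema members. It counts the global elements the set reports before and after Compile(), and counts named groups by scanning each schema's Items collection before compiling.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class CompileStateDemo
{
    // Hypothetical schema with one named group and one global element.
    public const string SampleXsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:group name='g'>
    <xs:sequence>
      <xs:element name='a' type='xs:string'/>
    </xs:sequence>
  </xs:group>
  <xs:element name='root' type='xs:string'/>
</xs:schema>";

    // Returns { global elements before Compile, named groups seen pre-compile,
    //           global elements after Compile }.
    public static int[] Count(string xsd)
    {
        XmlSchemaSet ss = new XmlSchemaSet();
        using (XmlReader reader = XmlReader.Create(new StringReader(xsd)))
        {
            // A null target namespace means "use the one declared in the schema".
            ss.Add(null, reader);
        }

        int before = ss.GlobalElements.Count;

        // Named groups are only reachable through each schema's Items collection.
        int groups = 0;
        foreach (XmlSchema schema in ss.Schemas())
        {
            foreach (XmlSchemaObject item in schema.Items)
            {
                if (item is XmlSchemaGroup)
                {
                    groups++;
                }
            }
        }

        ss.Compile();
        int after = ss.GlobalElements.Count;

        return new int[] { before, groups, after };
    }
}
```

Calling CompileStateDemo.Count(CompileStateDemo.SampleXsd) returns the three counts in that order, which makes it easy to see which information is available at which stage.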


 


Once you have an XmlSchema object, you can step through and parse each global type, element, and named group.


 


// stepping through global complex types
foreach (XmlSchemaType type in schema.SchemaTypes.Values)
{
    if (type is XmlSchemaComplexType)
    {
    }
}


 


// stepping through global elements
foreach (XmlSchemaElement el in schema.Elements.Values)
{
}


 


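// stepping through named groups
// Items can also hold top-level xs:annotation objects, which are not
// XmlSchemaAnnotated, so iterate as XmlSchemaObject to avoid a cast failure
foreach (XmlSchemaObject xso in schema.Items)
{
    if (xso is XmlSchemaGroup)
    {
    }
}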


 


Now that we have a global, whether it’s a type, an element, or a group, how do we traverse it?  I’m going to use a recursive method that takes an XmlSchemaParticle to do it.


 


void walkTheParticle(XmlSchemaParticle particle)
{
    if (particle is XmlSchemaElement)
    {
        XmlSchemaElement elem = particle as XmlSchemaElement;

        // todo: insert your processing code here

        // only locally declared elements are walked further;
        // references to global elements (non-empty RefName) are skipped
        if (elem.RefName.IsEmpty)
        {
            XmlSchemaType type = elem.ElementSchemaType;
            if (type is XmlSchemaComplexType)
            {
                XmlSchemaComplexType ct = type as XmlSchemaComplexType;
                // only anonymous (locally defined) types are walked here;
                // global types have a name and are visited by the caller
                if (ct.QualifiedName.IsEmpty)
                {
                    walkTheParticle(ct.ContentTypeParticle);
                }
            }
        }
    }
    else if (particle is XmlSchemaGroupBase)
    {
        // xs:all, xs:choice, xs:sequence
        XmlSchemaGroupBase baseParticle = particle as XmlSchemaGroupBase;
        foreach (XmlSchemaParticle subParticle in baseParticle.Items)
        {
            walkTheParticle(subParticle);
        }
    }
}


 


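If the particle passed to walkTheParticle is a base group (all, choice, or sequence), we loop through each particle within that group and walk it.  If the particle is an element, we do our processing and then, if the element is locally declared with an anonymous complex type, walk that type’s content.  Keep in mind that ElementSchemaType and ContentTypeParticle, like the schema.Elements and schema.SchemaTypes tables, are populated only by compilation, so call ss.Compile() before walking.  Finally, below are the last touches to the calling method to make it all work.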


 


void start(XmlSchemaSet ss)
{
    foreach (XmlSchema schema in ss.Schemas())
    {
        // global complex types
        foreach (XmlSchemaType type in schema.SchemaTypes.Values)
        {
            if (type is XmlSchemaComplexType)
            {
                XmlSchemaComplexType ct = type as XmlSchemaComplexType;
                walkTheParticle(ct.ContentTypeParticle);
            }
        }

        // global elements
        foreach (XmlSchemaElement el in schema.Elements.Values)
        {
            walkTheParticle(el);
        }

        // named groups; iterate as XmlSchemaObject because Items can also
        // hold top-level xs:annotation objects, which are not XmlSchemaAnnotated
        foreach (XmlSchemaObject xso in schema.Items)
        {
            if (xso is XmlSchemaGroup)
            {
                XmlSchemaGroup xsg = xso as XmlSchemaGroup;
                walkTheParticle(xsg.Particle);
            }
        }
    }
}
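For readers who want something they can paste and run, here is a compact end-to-end sketch of the same idea. It is not the exact code above: the class name SchemaWalkDemo, the sample schema, and the choice to collect element names into a list are all assumptions made for illustration, and it only starts from global elements. It loads a schema from a string, compiles the set (the walk relies on ElementSchemaType and ContentTypeParticle), and then applies the same recursion.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SchemaWalkDemo
{
    // Hypothetical sample schema, used only for this illustration.
    public const string SampleXsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='order'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='id' type='xs:int'/>
        <xs:element name='item' maxOccurs='unbounded'>
          <xs:complexType>
            <xs:sequence>
              <xs:element name='sku' type='xs:string'/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Compiles the schema and returns the local names of every element reached
    // by walking each global element.
    public static List<string> CollectElementNames(string xsd)
    {
        XmlSchemaSet ss = new XmlSchemaSet();
        using (XmlReader reader = XmlReader.Create(new StringReader(xsd)))
        {
            ss.Add(null, reader);
        }
        ss.Compile(); // required: the walk reads post-compile properties

        List<string> names = new List<string>();
        foreach (XmlSchema schema in ss.Schemas())
        {
            foreach (XmlSchemaElement el in schema.Elements.Values)
            {
                Walk(el, names);
            }
        }
        return names;
    }

    private static void Walk(XmlSchemaParticle particle, List<string> names)
    {
        XmlSchemaElement elem = particle as XmlSchemaElement;
        if (elem != null)
        {
            names.Add(elem.QualifiedName.Name);

            // Descend only into locally declared elements with anonymous complex types.
            XmlSchemaComplexType ct = elem.ElementSchemaType as XmlSchemaComplexType;
            if (elem.RefName.IsEmpty && ct != null && ct.QualifiedName.IsEmpty)
            {
                Walk(ct.ContentTypeParticle, names);
            }
            return;
        }

        // xs:all, xs:choice, xs:sequence
        XmlSchemaGroupBase group = particle as XmlSchemaGroupBase;
        if (group != null)
        {
            foreach (XmlSchemaParticle sub in group.Items)
            {
                Walk(sub, names);
            }
        }
    }
}
```

Calling SchemaWalkDemo.CollectElementNames(SchemaWalkDemo.SampleXsd) returns the element names in the order the walk visits them.  Passing the list in as a parameter is just one way to get results out of the recursion; a callback or a visitor would work equally well.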


 


That’s about it.  Let me know if you have any questions.

Comments (20)

  1. Stan writes about Walking XmlSchema and XmlSchemaSet objects

  2. Looks cool. What about elements in an xsd file that have minOccurs = 0 and maxOccurs = 0?

    I have a real situation where I need to iterate through the elements of my schema and recognize those that have maxOccurs = 0.

    My email is Moshe_Gershberg@Coutrywide.com

  3. Roy says:

    Stan,

    If I have an XSD defining

      Simple Type (like an integer)

      Complex Type

      Simple Type (like an integer)

      Complex Type

    and then I do a GetXml() on my DataSet, the resulting XML comes out with all the Simple Types first and then all the Complex Types.

    i.e.

      Simple Type (like an integer)

      Simple Type (like an integer)

      Complex Type

      Complex Type

    Is there a way I can have GetXml() keep things in the same order as the XSD?

    Thanks.

    Roy

  4. I’ve been working on an application that is essentially a data processing pipeline. Due to the nature

  5. robert says:

    ok, how do you determine which schema each element belongs to.

  6. dinhny says:

    Hi, I don’t understand why check for elem.RefName.IsEmpty and ct.QualifiedName.IsEmpty in your example code?

    if (elem.RefName.IsEmpty)

           {

               XmlSchemaType type = (XmlSchemaType)elem.ElementSchemaType;

               if (type is XmlSchemaComplexType)

               {

                   XmlSchemaComplexType ct = type as XmlSchemaComplexType;

                   if (ct.QualifiedName.IsEmpty)

                   {

                       walkTheParticle(ct.ContentTypeParticle);

                   }

               }

           }

  7. Steve Marshall says:

    I also don’t understand why the code checks for elem.RefName.IsEmpty.  What should it do if the RefName is NOT empty?

    I’ve been using some code modelled on yours to search a single schema quite successfully.  But my app now uses a very complex set of schemas, and the code no longer finds things.  My feeling is that it is because there are a lot of elements with RefNames, and the code is not handling it.  But my knowledge of XmlSchema internals is not good enough to work out what needs to be changed.  Any suggestions?  Basically I want to hand a function a string like an Xpath, and get back a schema element for the leaf node.

  8. Stan says:

    Steve,

    The reason to check for RefName is to distinguish between locally defined elements (refname is empty) and references to global elements (ref name is NOT empty).  In some cases you might want to skip references, in others not.  It sounds like in your case you don’t want to differentiate between the two.

  9. Stan says:

    Checking for ct.QualifiedName.IsEmpty is similar to checking for elem.RefName.IsEmpty – the main goal here is to distinguish between global types and locally defined types.

  10. Josh says:

    I’ve got a schema snippet that looks like this:

     <xs:element name="TimeOfDay">

       <xs:complexType>

         <xs:all>

           <xs:element minOccurs="1" maxOccurs="1" name="Time" type="ValidTime" />

           <xs:element minOccurs="0" maxOccurs="1" name="Tolerance" type="ValidTolerance" />

         </xs:all>

       </xs:complexType>

     </xs:element>

    When it goes through the code, elem.ElementSchemaType is null, so it doesn’t get any particles off of it. I’m not sure why it’s null though – it looks like a complexType to me. Is my schema wrong? It validates stuff properly…

  11. skits says:

    What are ValidTime and ValidTolerance types?  They have to be either simple types or complex types with simple content for the schema to be valid.  Also, what element are you on when elem.ElementSchemaType is null?

  12. Josh says:

    ValidTime is a simple type:

     <xs:simpleType name="ValidTime">

       <xs:restriction base="xs:int">

         <xs:minInclusive value="0" />

         <xs:maxInclusive value="2359" />

       </xs:restriction>

     </xs:simpleType>

    ValidTolerance is similar. I expected them to flip through quickly, which they do – it was on the TimeOfDay element when ElementSchemaType is null.

  13. Jennifer says:

    Hi Stan. Very helpful – thanks!

    What I’m trying to do is open an XSD file and list all its elements and their attributes. So far I’ve succeeded in listing all the elements, but I can’t seem to find a way to list all the attributes related to a particular element. Any help please? Below is part of the XSD file I’m working with.

    <xs:element name="ARTICLE">

    <xs:complexType>

    <xs:sequence>

    <xs:element ref="HEADLINE"/>

    <xs:element ref="BYLINE"/>

    <xs:element ref="LEAD"/>

    <xs:element ref="BODY"/>

    <xs:element ref="NOTES"/>

    </xs:sequence>

    <xs:attribute name="AUTHOR" type="xs:anySimpleType" use="required"/>

    <xs:attribute name="EDITOR" type="xs:anySimpleType"/>

    <xs:attribute name="DATE" type="xs:anySimpleType"/>

    <xs:attribute name="EDITION" type="xs:anySimpleType"/>

    </xs:complexType>

    </xs:element>

  14. Jonas Scalar says:

    This code is great.

    Is there a way to get the parent name, and the type definition of the element?

  15. Shaan says:

    Good Article..

    I used this "walkTheParticle" function in my application..

    I have a Question… I need to save the Additional Elements(Complex Type and Simple Type) into Existing XSD

    I am storing the Elements and its sub-elements Hierarchy in form of Generic Dictionary.

    The walkTheParticle returns me the Generic Dictionary…

    Now, how would I save the Generic Dictionary as xsd with tags as previously present in Existing XSD…

    Sorry I am a beginner in .NET..

  16. Hassan Gulzar says:

    Hi! I'm trying to use your code here in conjunction with "class FlatWsdlGenerator : BehaviorExtensionElement, IWsdlExportExtension, IEndpointBehavior" to find nillable="true" in the target WSDL and replace it with nillable="false". Certain contracts in my code have the IsRequired attribute set to True. However, the default generated WSDL always has nillable = true. This is causing pain for my customers, as there are literally hundreds of contracts. Can you help in this regard? Thanks.

  17. yashzee says:

    Hello,

    I'm working on a project that takes xsd file as an input, compiles the schema in XmlSchemaSet and then iterates over the elements in compiled schema.

    I'm using property XmlSchemaSequence.Items and XmlSchemaChoice.Items to iterate over the XmlSchemaObject inside them.

    following is a sample complex type that I'm using

    <xs:complexType name="aaa">

       <xs:sequence maxOccurs="unbounded">

         <xs:element name="hi" type="xs:string" />

         <xs:choice maxOccurs="unbounded">

           <xs:element name="v" type="xs:byte"/>

           <xs:sequence maxOccurs="unbounded">

             <xs:element name="z" type="xs:byte"/>

           </xs:sequence>

         </xs:choice>

         <xs:sequence maxOccurs="unbounded">

           <xs:element name="y" type="xs:byte"/>

         </xs:sequence>

       </xs:sequence>

     </xs:complexType>

    I get the inner sequence from the Items property when maxOccurs is set to greater than 1, and I also get a choice inside a sequence without the maxOccurs attribute. But for the other cases of nested choice and sequence, if I remove the maxOccurs attribute or set it to 1 or less, the Items property gives me the elements inside them rather than the sequence or choice itself as an XmlSchemaObject.

    Please tell me how I can get the inner/nested sequence or choice even if the maxOccurs attribute is not present.