What's new in MSXML 6.0: Schema-related changes

In this post I will outline the major schema-related changes in MSXML 6.0 and what you might need to do if you rely on the old behavior.  Questions? Comments? Let me know.

XDR Support

MSXML6 no longer supports XDR schemas.  XML Schema (XSD) 1.0 has been a W3C Recommendation for almost four years, so we decided to discontinue support for the proprietary XDR format in MSXML6.  Earlier versions of MSXML will continue to support XDR schemas.

  • If you are using MSXML6, you will need to convert your XDR to XSD.

Schema Compilation Changes

1. Support for Partial Schemas

Summary: If two XSD files for the same namespace are loaded from different locations, the Add method now unions the declarations found in both locations.  In earlier versions, adding a namespace from a second location replaced the existing definitions.

Scenarios: Any scenario where the user is calling Add more than once on the same namespace may be affected.   Also, see SchemaCache::getSchema changes below.

If you rely on the old behavior: Create a new SchemaCache and add only the appropriate schema.
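To make the difference concrete, here is a small Python sketch of the two behaviors. It is a toy model, not MSXML code: the ToyCache class, the urn:example:books namespace, and both schema fragments are made up for illustration, and the "cache" only tracks the names of top-level element declarations.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Two partial schemas for the same (made-up) target namespace.
PART_A = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
                      targetNamespace="urn:example:books">
  <xs:element name="book" type="xs:string"/>
</xs:schema>"""

PART_B = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
                      targetNamespace="urn:example:books">
  <xs:element name="author" type="xs:string"/>
</xs:schema>"""


def global_elements(xsd_text):
    """Return the names of the top-level xs:element declarations."""
    root = ET.fromstring(xsd_text)
    return {el.get("name") for el in root.findall(XS + "element")}


class ToyCache:
    """Hypothetical stand-in for a schema cache: namespace -> element names."""

    def __init__(self, union):
        self.union = union      # True models MSXML6, False the older replace behavior
        self.namespaces = {}

    def add(self, namespace, xsd_text):
        names = global_elements(xsd_text)
        if self.union and namespace in self.namespaces:
            self.namespaces[namespace] |= names   # merge with existing declarations
        else:
            self.namespaces[namespace] = names    # first add, or replace


old = ToyCache(union=False)
new = ToyCache(union=True)
for cache in (old, new):
    cache.add("urn:example:books", PART_A)
    cache.add("urn:example:books", PART_B)

print(sorted(old.namespaces["urn:example:books"]))  # ['author']
print(sorted(new.namespaces["urn:example:books"]))  # ['author', 'book']

# Workaround from the text: start from a fresh cache and add only the schema you want.
fresh = ToyCache(union=True)
fresh.add("urn:example:books", PART_B)
print(sorted(fresh.namespaces["urn:example:books"]))  # ['author']
```

The fresh cache at the end mirrors the workaround: because nothing was added for the namespace before, the union has nothing to merge with.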

2. Schema flattening

Summary: We now “flatten” schema imports (xs:import), so every namespace referenced is a first-class citizen in the schema cache.  This ensures there is exactly one definition for every schema type in the schema cache.

Scenarios: Any scenario where a common namespace was imported from more than one location by two or more namespaces – in other words, any situation where there could be ambiguous definitions for a type in the SchemaCache depending on the namespace it is used from.  Also, SchemaCache.Length might have a different value.

This is a breaking change, and there is no way to get the old behavior back.
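One way to picture flattening, and why the number of entries in the cache can change, is the following Python sketch. It is only a model under stated assumptions: the DOCS table, the file names, and the urn:example namespaces are hypothetical, and the functions do not correspond to any MSXML API.

```python
# Hypothetical schema documents, keyed by location.
# Each entry: (target namespace, list of (imported namespace, location)).
DOCS = {
    "orders.xsd":   ("urn:example:orders",   [("urn:example:common", "common.xsd")]),
    "invoices.xsd": ("urn:example:invoices", [("urn:example:common", "common.xsd")]),
    "common.xsd":   ("urn:example:common",   []),
}


def add_nested(cache, location):
    """Older picture: only the namespace the caller added is a top-level entry."""
    namespace, _imports = DOCS[location]
    cache[namespace] = {location}


def add_flattened(cache, location):
    """Flattened picture: imported namespaces become top-level entries too."""
    namespace, imports = DOCS[location]
    locations = cache.setdefault(namespace, set())
    if location in locations:
        return                      # already loaded, nothing to do
    locations.add(location)
    for _imported_ns, imported_location in imports:
        add_flattened(cache, imported_location)


nested, flat = {}, {}
for location in ("orders.xsd", "invoices.xsd"):
    add_nested(nested, location)
    add_flattened(flat, location)

print(len(nested))  # 2
print(len(flat))    # 3
print(sorted(flat)) # ['urn:example:common', 'urn:example:invoices', 'urn:example:orders']
```

In the flattened model the shared namespace appears once, no matter how many other namespaces import it, which is the "one definition per type" property described above.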

3. Improved Support for Runtime Schemas

Summary: Inline schemas and schemas referenced from an instance using xsi:schemaLocation are now added to an XML instance-specific cache which wraps the user-supplied SchemaCache.

Scenarios: This enables some more complex scenarios where cross-references between runtime schemas are handled appropriately.

This is a non-breaking change; it enables new scenarios.
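If it helps to picture what "wraps" means here, the layering can be sketched in Python with collections.ChainMap. This is an analogy rather than a description of the MSXML internals: the function name, the namespaces, and the assumption that the user-supplied cache is left untouched are all mine.

```python
from collections import ChainMap

# Hypothetical user-supplied cache: namespace -> where the schema came from.
user_cache = {"urn:example:orders": "orders.xsd"}


def instance_cache_for(user_cache, runtime_schemas):
    """Build a per-instance view layered on top of the user-supplied cache."""
    view = ChainMap({}, user_cache)      # writes go to the first (per-instance) layer
    for namespace, source in runtime_schemas.items():
        view[namespace] = source
    return view


view = instance_cache_for(
    user_cache,
    {"urn:example:inline": "(inline schema)",
     "urn:example:hints": "hints.xsd (via xsi:schemaLocation)"},
)

print("urn:example:orders" in view)        # True, found in the wrapped user cache
print("urn:example:inline" in view)        # True, found in the per-instance layer
print("urn:example:inline" in user_cache)  # False, the user cache was not modified
```

Lookups see both layers, so a runtime schema can refer to types in the user cache and to other runtime schemas for the same instance.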

4. Lax Imports

Summary: The schema cache will compile a schema that references any other type already in the schema cache, regardless of whether there is an explicit import (this is like an import with no location).  To make the order in which schemas are added insignificant, the validateOnParse flag must be set to false when the schema is loaded.

This is a non-breaking change.

SchemaCache API Changes

  1. SchemaCache::get – notImpl – This scenario was targeted primarily at XDR, so it has been removed.

  2. SchemaCache::remove – notImpl – Previously, the remove method removed a namespace and all of its imported namespaces.  Now that imported schemas are promoted to first-class entries in the cache, they may have multiple dependents, so this method is not implemented.

       o The simple workaround is to create a new schema cache and add the desired schemas.

  3. SchemaCache::getSchema – If the namespace is loaded from one location, there is no change to this API.  If the namespace is loaded from multiple locations, an “encapsulating” schema is created that unions all the declarations under a single generated schema tag.

       o There is no workaround to get the old behavior.

  4. SchemaCache::addCollection – When one schema cache is added to another, partial schemas are supported just as in a call to Add( ).  addCollection is atomic: either all schemas in the cache are added or none are.

       o If you rely on the old behavior, create a new SchemaCache and add the appropriate schemas.

  5. get_schemaLocations – Previously returned all schema locations that were included, imported, or redefined.  It now reports only the schema locations for the schema's own namespace, for the same reason that a schema now contains only the elements, types, etc. in its own namespace.

       o To get the old behavior, you need to step through the included/imported/redefined schemas manually and call get_schemaLocations for each of them.  Note that if you hit schemas loaded from multiple files, you will get multiple schema locations, which is different from the old behavior.
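The manual walk described in the last workaround has roughly the following shape. This Python sketch works directly on XSD text instead of on MSXML's schema objects, so it shows only the traversal and not the actual get_schemaLocations call; the FILES table, the file names, and the urn:example namespaces are hypothetical.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical in-memory "files", keyed by schema location.
FILES = {
    "main.xsd": """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
                             targetNamespace="urn:example:main">
      <xs:include schemaLocation="main-types.xsd"/>
      <xs:import namespace="urn:example:common" schemaLocation="common.xsd"/>
    </xs:schema>""",
    "main-types.xsd": """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
                                   targetNamespace="urn:example:main"/>""",
    "common.xsd": """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
                               targetNamespace="urn:example:common"/>""",
}


def all_schema_locations(start, files):
    """Collect every location reachable through include/import/redefine."""
    seen = []
    pending = [start]
    while pending:
        location = pending.pop(0)
        if location in seen:
            continue                       # guard against circular references
        seen.append(location)
        root = ET.fromstring(files[location])
        for tag in ("include", "import", "redefine"):
            for reference in root.findall(XS + tag):
                target = reference.get("schemaLocation")
                if target is not None:     # an import may omit schemaLocation
                    pending.append(target)
    return seen


print(all_schema_locations("main.xsd", FILES))
# ['main.xsd', 'main-types.xsd', 'common.xsd']
```

The seen list doubles as the loop guard, which matters because schemas are allowed to include or import each other circularly.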