Apache XmlBeans and .NET XML Serialization

02/11/2005

A while back, I used Apache's XmlBeans for an interop sample. The real focus there was the performance comparison for serialization and de-serialization (not transport), using XML and Java binary serialization. I found that XML was slower, surprise! But not that much slower. Sometimes the factor was 1.3, sometimes 2. Not 10. It's questionable whether that perf difference would be significant in a real app. Keep in mind that it did not measure transport time, which could be important.

Anyway, the XMLBeans stuff was sort of hidden in that piece, so I thought I would re-visit it here. XMLBeans is an open-source project from Apache, for mapping between Java objects and XML, much like .NET's built-in XML Serialization. There are various options for doing this in Java-land, including Castor, JAXB, and webMethods GLUE. XMLBeans provides another way to go. It is open-source, which is attractive to some people. It uses the Apache license, which is commercial-friendly. And it can be added to virtually any Java platform: J2EE server, servlet, client-side, custom container, whatever.

Apache's XMLBeans and .NET's XML Serialization use approximately the same philosophy and general model. A developer takes an XML Schema, runs it through a compiler, and generates source code that maps between instances of the schema (XML documents that conform to the schema) and instances of Java or .NET objects. "Maps between" means you can start with an XML document and create an object, or vice versa.
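To make "maps between" concrete, here is a minimal hand-rolled sketch of the kind of work the generated code does for you. It uses only the JDK's DOM and transformer APIs, not XMLBeans. The class name PersonMapper, the three fields I picked, and the use of a root element called Person are my own assumptions for illustration; the real generated code is far more complete.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical hand-written mapper; a schema compiler would generate this for you.
class PersonMapper {
    static final String NS = "urn:Interop.DotNet.XmlBeans-2005feb10";

    // A plain object holding a few of the Person fields.
    static class Person {
        String surname;
        boolean registered;
        int heightInCm;
    }

    private static DocumentBuilderFactory factory() {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true); // element names are namespace-qualified
        return f;
    }

    // Object -> XML text.
    static String toXml(Person p) throws Exception {
        Document doc = factory().newDocumentBuilder().newDocument();
        Element root = doc.createElementNS(NS, "Person"); // assumed root element name
        doc.appendChild(root);
        append(doc, root, "Surname", p.surname);
        append(doc, root, "Registered", String.valueOf(p.registered));
        append(doc, root, "HeightInCm", String.valueOf(p.heightInCm));

        Transformer t = TransformerFactory.newInstance().newTransformer();
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        t.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    // XML text -> object. Note: only handles "true"/"false" for booleans,
    // while XML Schema also allows "1"/"0".
    static Person fromXml(String xml) throws Exception {
        Document doc = factory().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        Element root = doc.getDocumentElement();
        Person p = new Person();
        p.surname = text(root, "Surname");
        p.registered = Boolean.parseBoolean(text(root, "Registered"));
        p.heightInCm = Integer.parseInt(text(root, "HeightInCm"));
        return p;
    }

    private static void append(Document doc, Element parent, String name, String value) {
        Element e = doc.createElementNS(NS, name);
        e.setTextContent(value);
        parent.appendChild(e);
    }

    private static String text(Element parent, String name) {
        return parent.getElementsByTagNameNS(NS, name).item(0).getTextContent();
    }
}
```

Calling PersonMapper.fromXml(PersonMapper.toXml(p)) should hand back an object with the same field values. Writing this by hand for every type gets old fast, which is the whole argument for letting a schema compiler do it.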

Since both XMLBeans and .NET XML Serialization do this mapping, apps can interchange data pretty simply, using XML and the mapping magic on either side. Save the state of a .NET object into an XML stream, then load a Java object from that same stream. Or vice versa. The sample presented here shows how.

I defined the base schema in W3C XML Schema (WXS) format. I cooked it up in the XML Schema designer in Visual Studio 2003. You could build the schema with any tool, even a text editor, if you speak XSD. Or, maybe you already have a schema, lucky you!

But I designed a schema. Following the best guidelines for XML Schema design that I found, I stayed away from all the WXS esoterica and exotica, and used only simple primitives, and structures and arrays of same. I used namespaces. I used complexTypes. I used elementFormDefault="qualified". I did not use notations, or default values. I stayed away from restriction, abstract types, lists, and other XML Schema tricks. All good advice.
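If you want to see what elementFormDefault="qualified" actually buys you, a quick validation run shows it. The following is a sketch using the JDK's javax.xml.validation API against a cut-down schema that I made up for the purpose: it keeps the Status type and the namespace from my schema, but the global Status element and the class name QualifiedFormCheck are assumptions added so there is something small to validate.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical helper: validates small documents against a trimmed-down schema.
class QualifiedFormCheck {
    static final String NS = "urn:Interop.DotNet.XmlBeans-2005feb10";

    // Cut-down schema: one complexType plus a global element (the element is an assumption).
    static final String XSD =
          "<wxs:schema xmlns:wxs='http://www.w3.org/2001/XMLSchema'"
        + " xmlns:tns='" + NS + "' targetNamespace='" + NS + "'"
        + " elementFormDefault='qualified'>"
        + "<wxs:complexType name='Status'><wxs:sequence>"
        + "<wxs:element minOccurs='1' maxOccurs='1' name='Code' type='wxs:int'/>"
        + "<wxs:element minOccurs='0' maxOccurs='1' name='Message' type='wxs:string'/>"
        + "</wxs:sequence></wxs:complexType>"
        + "<wxs:element name='Status' type='tns:Status'/>"
        + "</wxs:schema>";

    // Returns true if the document validates, false if the validator rejects it.
    static boolean isValid(String xml) throws Exception {
        SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));
        Validator v = schema.newValidator();
        try {
            v.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // validation errors surface as SAXException by default
        }
    }
}
```

With this in place, a document whose child elements live in the target namespace (for example, one that declares it as the default namespace on the root) should pass, while one that leaves Code unqualified under a prefixed root should be rejected. That is the practical meaning of "qualified": every element, not just the root, carries the namespace.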

Despite limiting myself to a mere subset of XML Schema, it's not a trivially simple Schema. As a bonus, it's interoperable.

The root element in the schema, TestMessage, includes:

a structure - Person - which contains 5 primitives
an Array of strings
2 arrays of other structures (each WXS complexType maps to a structure), each of them containing various primitives.
a base64-encoded blob

Here it is in all its glory:

<?xml version="1.0" encoding="utf-8" ?>
<wxs:schema xmlns:tns="urn:Interop.DotNet.XmlBeans-2005feb10" elementFormDefault="qualified"
    targetNamespace="urn:Interop.DotNet.XmlBeans-2005feb10" xmlns:wxs="http://www.w3.org/2001/XMLSchema">

  <wxs:complexType name="Person">
    <wxs:sequence>
      <wxs:element minOccurs="0" maxOccurs="1" name="Surname" type="wxs:string" />
      <wxs:element minOccurs="1" maxOccurs="1" name="Registered" type="wxs:boolean" />
      <wxs:element minOccurs="1" maxOccurs="1" name="HeightInCm" type="wxs:int" />
      <wxs:element minOccurs="1" maxOccurs="1" name="N" type="wxs:int" />
      <wxs:element minOccurs="0" maxOccurs="1" name="Id" type="wxs:string" />
    </wxs:sequence>
  </wxs:complexType>

  <wxs:complexType name="Status">
    <wxs:sequence>
      <wxs:element minOccurs="1" maxOccurs="1" name="Code" type="wxs:int" />
      <wxs:element minOccurs="0" maxOccurs="1" name="Message" type="wxs:string" />
    </wxs:sequence>
  </wxs:complexType>

  <wxs:complexType name="Report">
    <wxs:sequence>
      <wxs:element minOccurs="1" maxOccurs="1" name="Date" type="wxs:date" />
      <wxs:element minOccurs="1" maxOccurs="1" name="Time" type="wxs:time" />
      <wxs:element minOccurs="1" maxOccurs="1" name="Id" type="wxs:int" />
      <wxs:element minOccurs="0" maxOccurs="1" name="Notes" type="wxs:string" />
    </wxs:sequence>
  </wxs:complexType>

  <wxs:complexType name="ReportSet">
    <wxs:sequence>
      <wxs:element minOccurs="0" maxOccurs="unbounded" name="item" nillable="true" type="tns:Report" />
    </wxs:sequence>
  </wxs:complexType>

  <wxs:complexType name="ArrayOfString">
    <wxs:sequence>
      <wxs:element minOccurs="0" maxOccurs="unbounded" name="item" nillable="true" type="wxs:string" />
    </wxs:sequence>
  </wxs:complexType>

  <wxs:element name="TestMessage">
    <wxs:complexType>
      <wxs:sequence>
        <wxs:element minOccurs="0" maxOccurs="1" name="Reporter" type="tns:Person" />
        <wxs:element minOccurs="0" maxOccurs="1" name="Comments" type="tns:ArrayOfString" />
        <wxs:element minOccurs="0" maxOccurs="1" name="Reports" type="tns:ReportSet" />
        <wxs:element minOccurs="0" maxOccurs="1" name="Buf" type="wxs:base64Binary" />
        <wxs:element minOccurs="0" maxOccurs="1" name="Status" type="tns:Status" />
      </wxs:sequence>
    </wxs:complexType>
  </wxs:element>
</wxs:schema>

Run this schema through xsd.exe, and you get generated C# code, which, in part, looks like this:

public class TestMessage {
    public Person Reporter;

    [System.Xml.Serialization.XmlArrayItemAttribute("item")]
    public string[] Comments;

    [System.Xml.Serialization.XmlArrayItemAttribute("item")]
    public Report[] Reports;

    [System.Xml.Serialization.XmlElementAttribute(DataType="base64Binary")]
    public System.Byte[] Buf;

    public TestStatus TestStatus;
}

There are other generated classes too, not shown here. For the Java side, XMLBeans generates code, too, using the scomp (schema compiler) tool. By default the generated code is archived in a jar file, and you don't ever see it. There's an option to preserve the source. Here's one piece of it.

public interface TestMessage extends org.apache.xmlbeans.XmlObject
{
    public static final org.apache.xmlbeans.SchemaType type = (org.apache.xmlbeans.SchemaType)schema.system.s0EA50EB547117A1CF2E3D0F5CC8F45D3.TypeSystemHolder.typeSystem.resolveHandle("testmessageb49belemtype");

    xmlBeans2005Feb10.dotNet.interop.Person getReporter();
    boolean isSetReporter();
    void setReporter(xmlBeans2005Feb10.dotNet.interop.Person reporter);
    xmlBeans2005Feb10.dotNet.interop.Person addNewReporter();
    void unsetReporter();

    xmlBeans2005Feb10.dotNet.interop.ArrayOfString getComments();
    boolean isSetComments();
    void setComments(xmlBeans2005Feb10.dotNet.interop.ArrayOfString comments);
    xmlBeans2005Feb10.dotNet.interop.ArrayOfString addNewComments();
    void unsetComments();

    xmlBeans2005Feb10.dotNet.interop.ReportSet getReports();
    boolean isSetReports();
    void setReports(xmlBeans2005Feb10.dotNet.interop.ReportSet reports);
    xmlBeans2005Feb10.dotNet.interop.ReportSet addNewReports();
    void unsetReports();

    byte[] getBuf();
    org.apache.xmlbeans.XmlBase64Binary xgetBuf();
    boolean isSetBuf();
    void setBuf(byte[] buf);
    void xsetBuf(org.apache.xmlbeans.XmlBase64Binary buf);
    void unsetBuf();

    xmlBeans2005Feb10.dotNet.interop.TestStatus getTestStatus();
    boolean isSetTestStatus();
    void setTestStatus(xmlBeans2005Feb10.dotNet.interop.TestStatus testStatus);
    xmlBeans2005Feb10.dotNet.interop.TestStatus addNewTestStatus();
    void unsetTestStatus();

    public static final class Factory
    {
        public static xmlBeans2005Feb10.dotNet.interop.TestMessageDocument.TestMessage newInstance()
        {
            return (xmlBeans2005Feb10.dotNet.interop.TestMessageDocument.TestMessage) org.apache.xmlbeans.XmlBeans.getContextTypeLoader().newInstance( type, null );
        }

        public static xmlBeans2005Feb10.dotNet.interop.TestMessageDocument.TestMessage newInstance(org.apache.xmlbeans.XmlOptions options)
        {
            return (xmlBeans2005Feb10.dotNet.interop.TestMessageDocument.TestMessage) org.apache.xmlbeans.XmlBeans.getContextTypeLoader().newInstance( type, options );
        }

        private Factory() { } // No instance of this class allowed
    }
}

From there, you just build the apps that use these classes. The .NET code that loads and stores object instances from and to XML files looks like this:

public void Save(TestMessage tm) {
    File.Delete(_Filename);
    using (FileStream fs = new System.IO.FileStream(_Filename, System.IO.FileMode.Create)) {
        s1.Serialize(fs, tm, ns);
    }
}

public TestMessage Load() {
    TestMessage tm = null;
    StreamReader r = null;
    try {
        r = File.OpenText(_Filename);
        tm = (TestMessage) s1.Deserialize(r);
    }
    finally {
        if (r != null) r.Close();
    }
    return tm;
}

And in Java:

public void Save(TestMessage tm)
    throws IOException
{
    FileWriter fw = new FileWriter(new File(_Filename));

    TestMessageDocument tmDoc = TestMessageDocument.Factory.newInstance();
    tmDoc.setTestMessage(tm);

    String xmlStr = tmDoc.xmlText(xmlo);
    fw.write(xmlStr, 0, xmlStr.length());
    fw.flush();
    fw.close();
}

public TestMessage Load()
    throws IOException, org.apache.xmlbeans.XmlException
{
    File file = new File(_Filename);
    TestMessageDocument tmDoc = TestMessageDocument.Factory.parse(file);
    TestMessage tm = tmDoc.getTestMessage();
    return tm;
}

So you can see they are pretty similar. The beautiful thing is, this code actually works nicely to exchange data between two independent apps.

Where does SOAP fit?

This isn't using SOAP or Web services for interop, although it's similar. SOAP defines a data encapsulation layer, and is typically used along with WSDL to define request/response message pairs, and HTTP for transport. If you don't have HTTP for transport, or if you don't need the SOAP envelope for routing or other message quality-of-service things (like security, integrity), then maybe using simple XML serialization is the way to go for your interop challenge.

About the Sample

Get the source

The sample app includes source for a Winforms app and a Java console app. The Windows Forms app presents an easy way to enter and modify data, and save it into a file, and load it from a file. In memory, all of the data is stored in a single object graph, rooted at the TestMessage type.

The Java console app does the same thing, generally, except there is no GUI, and thus no interactive editing capability. The Java app adds a note and a timestamp to the comments list in the TestMessage, then re-saves the message, each time it runs.
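The load, append, re-save cycle can be sketched without XMLBeans at all, which is handy if you just want to poke at a saved file. This is not the sample's code: it is a rough equivalent using the JDK's DOM API, the class and method names (CommentAppender, appendComment) are made up, and it assumes the file already contains a Comments element.

```java
import java.io.File;
import java.io.OutputStream;
import java.nio.file.Files;
import java.time.Instant;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical stand-in for what the console app does on each run.
class CommentAppender {
    static final String NS = "urn:Interop.DotNet.XmlBeans-2005feb10";

    // Loads the file, appends one timestamped <item> to <Comments>,
    // saves the file, and returns the number of items now in the list.
    static int appendComment(File file, String note) throws Exception {
        DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
        dbf.setNamespaceAware(true);
        Document doc = dbf.newDocumentBuilder().parse(file);

        // Assumption: <Comments> is present. The schema makes it optional,
        // and inserting it in the right sequence position is left out here.
        NodeList found = doc.getDocumentElement().getElementsByTagNameNS(NS, "Comments");
        if (found.getLength() == 0) {
            throw new IllegalStateException("no Comments element in " + file);
        }
        Element comments = (Element) found.item(0);

        Element item = doc.createElementNS(NS, "item");
        item.setTextContent(note + " @ " + Instant.now());
        comments.appendChild(item);

        // Write the modified tree back over the original file.
        try (OutputStream out = Files.newOutputStream(file.toPath())) {
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
        }
        return comments.getElementsByTagNameNS(NS, "item").getLength();
    }
}
```

Run it a few times against a file saved by either app and the Comments list grows by one entry per run, which is the same behavior you see from the console app.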

The main XML Serialization logic in the .NET app is contained in the file Store.cs. The corresponding job is carried out by a private inner Store class in the Java code.

Comparing

Personally, I think the IO and serialization is simpler using .NET than it is using XMLBeans. The dynamic code generation capability of the XML Serializer in .NET makes it much simpler to use. I find the model to be cleaner and more easily digested. I like being able to view the generated class in .NET; with XMLBeans, the code is not available by default. Also the generated code in .NET is simpler. .NET's ability to go "either way", starting from a class file or starting from an XSD, is nice: I can just attribute a class with XML serialization tags, and generate a schema from that class. You can do this from XMLBeans too, it's just harder.

The one thing that makes a big difference to me is that XMLBeans defines a new activation and instantiation model. You use a factory to create the instances, and you deal in interfaces, not classes. This is a bit more complex than .NET, which uses plain-old-objects, and just new... for object instantiation.
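Here is the difference in miniature, in plain Java with no XMLBeans involved. All the names (ActivationModels, PlainPerson, PersonImpl) are invented for the sketch; the only thing borrowed from the generated code is the shape: an interface with a nested static Factory and a hidden implementation class.

```java
// Hypothetical illustration of the two instantiation styles.
class ActivationModels {

    // Plain-object style: a concrete class, created with "new".
    static class PlainPerson {
        String surname;
    }

    // Factory style: callers only ever see the interface.
    interface Person {
        String getSurname();
        void setSurname(String surname);

        // Nested factory, same shape as the Factory in the generated interface.
        final class Factory {
            private Factory() { } // no instances of the factory itself

            public static Person newInstance() {
                return new PersonImpl();
            }
        }
    }

    // The implementation is hidden; callers cannot "new" it directly from outside.
    private static class PersonImpl implements Person {
        private String surname;

        public String getSurname() { return surname; }
        public void setSurname(String surname) { this.surname = surname; }
    }

    // Builds one object each way and returns both surnames for comparison.
    static String demo() {
        PlainPerson a = new PlainPerson();        // plain-old-object style
        a.surname = "Smith";

        Person b = Person.Factory.newInstance();  // factory style
        b.setSurname("Smith");

        return a.surname + "/" + b.getSurname();
    }
}
```

The factory style gives the library room to swap implementations behind the interface, at the cost of one more concept for the caller to learn. Whether that trade is worth it is a matter of taste.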

Having said all this, both things do the job. Also I haven't looked at XMLBeans v2.0, which apparently is out, or due out soon.

Summary

XMLBeans and .NET XML Serialization present one more option for data interop between Java-based apps and .NET apps.
