libOPC version 0.0.1 released

The first release of libOPC, a new API for Open XML development, was published on Codeplex last week. This API is the first open-source cross-platform API for developers working with Open Packaging Convention (OPC) packages as used by Open XML, XPS, and other formats. Full source code is available, and it’s written in portable C99, so it can be used on all popular variants of Linux/Unix, Mac OS, Windows, Android, and many other platforms. The API uses other common cross-platform open-source APIs for some of the low-level details, including ZLIB for opening ZIP-compressed packages and libXML for parsing the XML streams from the parts in the package.

Historically, there have been two popular .NET APIs for Open XML development: System.IO.Packaging (which first appeared in .NET 3.0) and the Open XML SDK, released in early 2007. There’s also a COM-based native packaging API available for non-.NET Windows developers.

The libOPC API is roughly analogous to System.IO.Packaging, in that it’s focused on the details of OPC and MCE (parts 2 and 3 of IS 29500), but doesn’t provide higher-level abstractions for WordprocessingML, SpreadsheetML or PresentationML (as covered in parts 1 and 4 of IS 29500). I say “roughly” because libOPC doesn’t yet address some of the things that System.IO.Packaging handles (e.g., digital signatures). On the other hand, it includes some more advanced capabilities not available in System.IO.Packaging, such as the opc_generate functionality described below, which is essentially the same as the document reflector functionality of the Open XML SDK.

The key new feature in libOPC is its cross-platform support. If you’re working on a non-Microsoft platform, or working with embedded systems that have limited OS support for XML and ZIP, you now have a very fast, simple API that you can use to implement Open XML read and write capabilities in your applications. And libOPC is designed from the ground up to be wrapper-friendly, for use from programming languages other than C.

The coordinator of the libOPC project is Florian Reuter, a well-known voice in the document formats community through his years of work with the OASIS ODF TC, Ecma TC45, OpenOffice.org and LibreOffice. Florian has worked with both ODF and Open XML, as a Sun employee, a Novell employee, and (now) an independent developer, and this gives him a well-rounded perspective on what sorts of tools can help developers to be more productive when working with XML-based standardized document formats. Microsoft is sponsoring Florian’s contributions to libOPC, and I can’t think of a better person to be leading this project.

The first release of libOPC is mostly about providing abstractions that make it easy for developers to work with OPC packages. There are several basic tasks that you need to do over and over when working with OPC packages, such as finding a relationship by type, determining the target of a relationship, retrieving the content of a part in a stream, and so on. The libOPC API provides simple methods for carrying out these tasks, and the developer doesn’t need to think about details such as how a package’s relationship tree is serialized into various XML parts.
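To make two of those tasks concrete, here is a minimal sketch in plain C99 of what "find a relationship by type" and "determine the target of a relationship" involve. This is not the libOPC API: the `rel_t` struct and the `find_rel_by_type` and `resolve_target` functions are hypothetical names invented for illustration, and the target resolution is deliberately simplified (it only handles a leading `/`, `./`, or `../`, not full URI reference resolution).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory model of one OPC relationship (not a libOPC type). */
typedef struct {
    const char *id;     /* relationship ID, e.g. "rId1" */
    const char *type;   /* relationship type URI */
    const char *target; /* target URI, relative to the source part's folder */
} rel_t;

/* Return the first relationship whose type matches exactly, or NULL. */
const rel_t *find_rel_by_type(const rel_t *rels, size_t count, const char *type)
{
    assert(rels != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(rels[i].type, type) == 0) {
            return &rels[i];
        }
    }
    return NULL;
}

/* Resolve a relationship target against the part that owns the relationship.
 * source_part is an absolute part name such as "/word/document.xml", or "/"
 * for package-level relationships. Returns 0 on success, -1 if the result
 * does not fit in out or the target tries to climb above the package root.
 * Simplification: only leading "./" and "../" segments are interpreted. */
int resolve_target(const char *source_part, const char *target,
                   char *out, size_t out_size)
{
    if (target[0] == '/') { /* target is already an absolute part name */
        if (strlen(target) >= out_size) {
            return -1;
        }
        strcpy(out, target);
        return 0;
    }

    /* Base folder: the source part name up to and including its last '/'. */
    const char *slash = strrchr(source_part, '/');
    if (slash == NULL) {
        return -1; /* part names must start with '/' */
    }
    size_t base_len = (size_t)(slash - source_part) + 1;

    for (;;) {
        if (strncmp(target, "./", 2) == 0) {
            target += 2;
        } else if (strncmp(target, "../", 3) == 0) {
            if (base_len <= 1) {
                return -1; /* would escape the package root */
            }
            base_len--; /* drop the trailing '/' ... */
            while (base_len > 0 && source_part[base_len - 1] != '/') {
                base_len--; /* ... then the last folder name */
            }
            target += 3;
        } else {
            break;
        }
    }

    int n = snprintf(out, out_size, "%.*s%s", (int)base_len, source_part, target);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With this sketch, looking up the image relationship of `/word/document.xml` and resolving its target `media/image1.png` gives the part name `/word/media/image1.png`. A real packaging library also has to read the relationships out of the package's `.rels` parts first, which is the kind of detail libOPC is meant to hide.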

There’s a video on the libOPC Codeplex site in which Florian goes over the basics of what’s available in version 0.0.1. One of the things he demonstrates is opc_generate – a tool that takes an Open XML document as input and generates a cross-platform C program that will re-create that document through use of libOPC. This is a very powerful tool for building customized document assembly systems, because the code it generates can provide a big head start for certain types of projects.

In addition to enhancing developer productivity, opc_generate can enable better collaboration between graphic designers and developers. A graphic designer can create a beautiful document without giving any thought to how to generate that document programmatically. A programmer can then generate source code for that document instantly and focus on customizing that code to attach it to data sources as appropriate. My colleague John Haug did an informal test of opc_generate by taking all of the documents in our internal SharePoint document library and generating libOPC code, then running the code to re-generate the original documents. They all worked great, including some quite complex documents.

On the documentation page, you’ll find information about how to use libOPC on Linux, Mac OS X, Windows, iOS, and Android, as well as demo videos showing libOPC in action with WebKit and the iPhone. The WebKit demo uses the Open XML standard itself as an input, so it’s a great example of the raw performance that libOPC delivers.

If you’re familiar with the Open XML SDK, you may wonder whether libOPC will eventually include the sorts of higher-level functionality that the Open XML SDK provides. This is an interesting question, and I’m really looking forward to seeing how this project evolves. Just as .NET developers started with the System.IO.Packaging API and then the Open XML SDK was built on top of it, we may see other higher-level APIs and tools built on top of libOPC. If you have thoughts in this area, or are interested in contributing, please get involved and join the libOPC project on Codeplex.