Differences in vocabulary

The workshop I attended last week at Harvard was a really great learning experience for me. I spend most of my time focused on technology, and less often on the policies and practices that need to be put in place to best leverage those technologies. Over the two-day workshop I had the chance to listen to a number of technology leaders in the public sector. They are heavily focused on interoperability as a way to make communication more efficient, and on archival as well.

One interesting thing I noticed right away was that we don't always use the same terminology. A couple of terms really stood out to me because of how often I hear folks talk about them in my current job:

Interoperability

For a number of folks in the public sector, interoperability primarily meant radio communication. Police, for example, want to make sure everyone can quickly and efficiently communicate with each other, which is why communication standards are so important. When I mentioned interoperability in relation to document formats and semantics, there was oftentimes a little confusion at first because of this. The principles are the same, but I usually mean that more generally interoperability applies to creating an efficient way to share information.

Custom defined schema is the key for interoperability in terms of documents. Some of the folks at the session referred to this as "Web 3.0". I'm not sure it needs that grand of a name, as it's a pretty basic concept. It just means that documents are no longer documents in the traditional sense, but instead collections of data. Another way people talk about this basic concept is "microformats." It doesn't really matter which term you use, though. What's important is to realize that in order to get truly quick, efficient sharing of data, you need the ability to structure that data within your documents.

If the office documents your target users create go beyond simply specifying display information and also directly call out all the semantic information, that takes you to a completely different level of interoperability. When I think of interoperability, I think of documents interacting with systems and processes in ways no one is really doing right now.
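
To make the "documents as collections of data" idea a bit more concrete, here is a minimal sketch in Python of a system pulling the semantic fields out of a custom XML payload carried with a document. Everything specific in it is an assumption for illustration: the namespace, the element names, and the sample values are made up, and it is not taken from any real industry schema or from the Office formats themselves.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace for a custom schema; invented for this example.
NS = {"c": "http://example.org/schemas/case-report"}

# Hypothetical custom XML payload that might travel with a document.
CUSTOM_XML = """
<caseReport xmlns="http://example.org/schemas/case-report">
  <caseId>2007-0412</caseId>
  <agency>County Records Office</agency>
  <filedOn>2007-04-02</filedOn>
</caseReport>
"""


def extract_fields(xml_text):
    """Return the semantic fields of the payload as a plain dict."""
    root = ET.fromstring(xml_text.strip())
    return {
        "case_id": root.findtext("c:caseId", namespaces=NS),
        "agency": root.findtext("c:agency", namespaces=NS),
        "filed_on": root.findtext("c:filedOn", namespaces=NS),
    }


if __name__ == "__main__":
    # A receiving system can work with the data directly,
    # without scraping it out of the display formatting.
    for name, value in extract_fields(CUSTOM_XML).items():
        print(f"{name}: {value}")
```

The point of the sketch is only that the receiving side reads named fields defined by the schema, so it never has to guess at meaning from how the document happens to be laid out.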

Open Source

This was a really big eye-opener for me. Many people were talking about open source specifically as a content sharing model. In many folks' minds, Wikipedia and Open Source could be thought of as analogous. I think I've adopted a view of open source that is much too narrow. There were folks from the defense department, for example, who said they wanted to set up an open source model for sharing intelligence information within their organization. I had previously thought about open source more in terms of the licensing model chosen. Obviously the folks from the defense department weren't thinking of putting all their content under the GPL. Instead they wanted a system where people could easily share information within their targeted community. This is something I believe in strongly. Easy collaboration and sharing of content was one of the big scenarios we were going after with the XML formats in Office. If the server technology is able to interpret the document content, you can build some powerful solutions.

Other Observations

There are a bunch of other things I wanted to write down about the conference, and hopefully I'll get to them in the next couple of days. There were some IBM folks there as well, including Bob Sutor. Bob led a discussion around a case study where folks realized they had to bring more emotion back to certain IT systems that had become too rigid and form-like. The study focused on child welfare caseworkers who weren't being encouraged to really think about each case, and were forced to use a system that basically consisted of a series of checkboxes to go through. This moved folks away from focusing on the true need, which is what's in the child's best interest. So they made the move to change up the system and to standardize around a separate color scheme for all sites relating to children. At first there was pushback because folks would say: "We can't do that, we have standards, and everything has to be blue." Eventually, though, they were able to break away from the rigid system and build something smarter and more targeted. It was an interesting talk.

I was also really interested in the document archival case studies. I worked with the British Library and the Library of Congress on the OpenXML standardization last year, so I've already had a lot of exposure to this issue. It's a very important issue that governments are now dealing with. You have content coming in all sorts of formats (documents, video, audio, e-mail, pictures, etc.), and it's important to maintain them all for the public good. The case study that was discussed at the workshop last week was with the State of Washington digital archives. I'll write up a separate post on that as I believe it's a really important topic.


Comments (7)

  1. MSDNArchive says:

    Hey Brian,

    With regard to Open Source, I think your notions are (were) correct. The term is evolving. The first I heard this new definition of Open Source was on NPR’s Radio Open Source (http://www.radioopensource.org/). At first, I thought they were co-opting the term "Open Source", but it’s clear that the term has taken on a broader meaning. The best synonym I have for it now is "transparent".

  2. orcmid says:

    Um, I’m uncomfortable with custom content as a key interoperability vector.  

    I agree that custom content has an interoperability function, but I think that is a narrow (though also important) case.  I think that custom-content interoperability is more like a private agreement between parties, and if that agreement doesn’t move from the tacit to the explicit (something that could be carried along with the custom content, perhaps) the interoperability will perish.  

    Maybe the difference is between interoperability broadly and an interchange agreement narrowly (and beautifully facilitated with custom content).

    I’m going to have to mull on this some more, because I think there is something yet to be surfaced around the *expectation* of interoperability in the context of open standards initiatives, especially in governmental operations.

  3. jones206@hotmail.com says:

    Dennis, I should be clear that when I say custom content, I’m speaking from the Office suite productivity point of view. That custom content may be a schema that is defined by an industry group and applies to millions of people. The reason we call it custom content though is that those schemas shouldn’t be influenced by the producers of the file formats. They are at a higher level.

    Does that change your view?


  4. Keith J says:

    Brian wrote "The case study that was discussed at the workshop last week was with the State of Washington digital archives."

    Ok, I am going to go google the heck out of that, but I am very interested to learn more about the discussion of the Washington State case and how it applies to archival and interoperability.  I work for a county gov. office in Washington and we have been talking about data and document management.  Thanks for the tip.

  5. n4cer says:

    Keith, unless there’s something new, the case study Brian may be talking about is located at the following link:

    Case Study

    State of Washington Digital Archives Project


    and more info:

    Digital Archives Background and History


    Q&A: Washington State Introduces Digital Archive Solution, First of its Kind and Based on Microsoft Technology


  6. orcmid says:

    Hi Brian.  My reaction was to this: "Custom defined schema is the key for interoperability in terms of documents."

    I didn’t realize that I had sounded so critical until someone else sent me a note offering a clarification.  I see your comment now.  

    I am aligned with your "I usually mean that more generally interoperability applies to creating an efficient way to share information."

    I am especially aligned with a wonderful crisp statement of Jon Udell (http://blog.jonudell.net/2007/03/29/the-essence-of-openness/)

    Collaboration: "People working together in shared information spaces, using shared technical and social protocols, to achieve shared goals."

    I think custom content is a great way for applying office documents to interchange and cooperative activities in such settings.

    I agree with your remark about using custom content to carry agreements (e.g., an agreed schema) on the structure of the document itself, or the structure of the separated XML part.  It indeed is part of making the agreement on usage explicit.

    I do think this is very powerful and it is terrific that it is something ECMA-376 provides in an original and consistent way.

    I just wanted to temper the rather absolute part of your statement and put in that there is a spectrum of interoperability provisions.

    Custom content is an exciting provision for interoperability, but not necessarily the most fundamental in the short term, it seems to me.

  7. orcmid says:

    Keith and n4cer: I think it is better to start at the top to see how the Digital Archives are used and how they are shaping up:


    There’s also this, also under the Secretary of State:


    It is interesting how bit rot is characterized, and the common litany about how documents become orphaned by the disappearance of the software that supports their electronic formats.

    There is an older inter-governmental initiative that has been going on in WA and I would hope that all county folk are aware of it, since there is rule making around accessibility and deposit with the state, as I recall.  Oh, I’m thinking of GILS and WAGILS: http://orcmid.com/blog/2002_10_27_lair-chive.asp#83794140
