Document Format Legislation – Something To Be Avoided


Recently, a number of elected officials in U.S. state governments have been taking a closer look at document formats. Minnesota, Texas, California, and Oregon are the most active voices today while Massachusetts was the first to start examining the issue. Given the genuine intent that underlies the issues being discussed, and the effectiveness of lobbying efforts by various large commercial interests, I would expect to see more of the same from other governments.


 


It is important to note that Open XML meets the requirements of openness in these bills. While I am sorry to see the legislative process used as a mechanism for product competition, I am more concerned about the implications of the bills and the precedent they set.  


 


Real Issues Being Contemplated:


The elected officials who are putting forward the three current draft bills are seeking to address three very real issues. All three are laudable goals, and ones we share with the lawmakers.


·         Archival: As society has progressed from the exclusive use of physical paper (and related goods) for the exchange of information to the digital world, the issue of long-term archival of electronic data has become a real concern. It is incumbent upon policy makers to retain the long-view and think carefully about how future generations will be able to learn from our labors.


·         Interoperability: Governments have an interest in improving the ability of their IT infrastructure to efficiently exchange data in a government-to-government or government-to-citizen context. The data exchanged includes documents, email, images, financial transactions, medical records, and more. Interoperability is the connection of people, data, and diverse systems; be that at the technology (infrastructure, app, data…) or the organizational (org. structure, business process, or legal) levels.


·         Communications: All constituents must be able to communicate with government, and the various levels of government (local, state, federal, international) need to communicate effectively with each other as well.


Governments should be in control of their data. They should have the maximum choice of technologies available to them for solutions. Vendors will respond to market pressures and bring answers to market in various ways. Innovation will continue, resulting in more choice of solutions, competitive pricing pressure, and increased functionality. Governments should be able to take advantage of these factors.


 


 


Factors To Be Considered:


·         Most governments have existing laws in place governing the archival of electronic data. Some data is kept for the long term, while other types may be disposed of on set timelines. Archivists at the national, state, and agency levels pay close attention to the hardware, software, and services that have become available to answer archival concerns.  There are solutions out there across a full range of data types.


o   Document standards reach back to ASCII Text, through the advent of HTML, and on into XML-based documents. This progression shows how technology continues to move forward with one design feature of many being archival.


o   Established commercial players like Microsoft, Adobe, Corel, Novell, IBM, and Sun are actively thinking about archival and interop in their product offerings. Just look at the different flavors of PDF that Adobe has produced to understand the market effect on producers of software.


o   Newcomers to the document creation market, like Google, have hybrid approaches as they seek to get a foothold in the market.


o   There are entire niche markets that have come into existence to accomplish better document management and archival. There is healthy competition among those players and they are bringing ongoing innovation to that space.


·         Interoperability is more than just standards. Interop is built into products (documentation, interfaces, protocols, data formats, development kits, etc.), collaboration with other firms to build bridges between systems, providing access to core technologies (through commercial licensing, community licensing, and open specifications), AND through standards.


o   This is an IBM argument I hear all the time: interop=open standards. Not only is that statement incorrect, it doesn’t even reflect what they do with their own business. Simply compare the software industry of 1980 with that of today, and you will see that interoperability is now the cover charge for bringing a software product to market, in contrast to the completely integrated stacks sold by DEC, IBM, NCR, et al. back then.


o   Governments should want to utilize the full spectrum of interoperability offerings and not be limited to just one: standards.


·         Translation is available today, enabling communications between disparate systems and forming the basis for building bridges between them. Software is fundamentally different than anything in the physical world. It is subject to infinite manipulation, and XML has made that much easier.


o   Microsoft, Sun, Novell, Google, IBM and others are working with translators to build the capacity for document formats to be exchanged between competing products that provide valuable and needed choice in the marketplace.


o   Translation is used every day to enable newer software to reach back and work with software produced 10, 20, and even 30 years ago. IBM is a direct beneficiary of these technologies and of what they have enabled in keeping its mainframe monopoly alive.


o   There are billions of documents out there in formats from all kinds of sources – web pages, simple text docs, complex binary documents (not just from MS) – and all of these rely on translation to be brought forward and used effectively today.  Mandating a single document format today means that translation should really only run one way, from old formats into the new, mandated format. Not only is this needlessly limiting, it carries real concerns about cost, performance, and a host of other factors.


 


 


The Challenge of Complexity:


In conversations with folks involved with these pieces of legislation, it has been clear that the sheer depth of this issue presents a challenge. There are more points to be made than I’ll put in this blog entry, but here are a few additional thoughts.


·         By putting forward a preference for formats as described in the MN bill, lawmakers are being told that it will solve their archival problems. Not true.


o   As I pointed out above, most governments already have archival rules in place. The challenges around archive reach well beyond the formats of documents. For any number of reasons, agencies will have chosen specific types of software applications to meet business needs (such as processing fishing licenses), and the output of that data will not necessarily be done in an open format.


§  The language in MN, for example, specifically says “documents including text, spreadsheets, and presentations.”  Any lawyer will point out that including becomes a very important word here, and thus all government systems (financial, image libraries, mapping data, etc.) would be within scope of the legislation.  This is probably not what was intended, thus showing why legislating in this space is so risky.


§  The precedent set by any of these bills says that, in the name of archival, all electronic data should be stored in open, royalty-free formats that anyone can build in the future.


·         Does this mean all financial data currently stored by governments in SAP systems? How about data stored in Oracle databases? How about IBM data formats for mainframe, middleware, and database solutions?


·         The big guys might be able to do this with no problem, but the harder road is for the thousands of small software providers who sell their solutions to governments. They will be forced to either re-develop their products or forcibly open them up in order to retain their government customers.


·         There is no standard use of the word “standard.” There is a definite difference between a procurement “standard” established by the office of the CIO and an industrial “standard” created in a body such as OASIS or Ecma.


o   The MN bill came about because of the desire of a state employee to use a different office suite than MS Office. To be clear, I have no problem with someone wanting to use a different product. The challenge for the employee was the fact that his IT organization had a procurement “standard” in place (to simplify purchasing, gain volume purchase discounts, streamline deployment, establish uniform training practices, control long-term management costs, etc.). All large IT shops utilize this type of “standard.” Moreover, vendors work very hard to have their technologies adopted as enterprise standards. This is where product competition is often most pronounced, and where IT professionals are able to drive the best value for their organizations.


o   Industrial standards are also part of government procurement, but in varying degrees depending on the agency and the systems they would like to put in place. While governments often have procurement guidance based on standards, it is the implementation of the standards that matters most – in other words, the software that is produced and sold in the market, which may include a given standard as a subset of a larger collection of technologies.


·         The pace of change matters. Legislation that lays down specific technology guidance will end up out of sync with technology – every time. Laws are made slowly, they change slowly, and they are replaced even more slowly. Within that context, the changes in technology are exponentially faster. Thus, if the state CIO is locked by law into a decision about what technology to use, it becomes very hard for him to select the best technology to do the job from his perspective.


o   In the time that these laws are being considered, ODF will go through multiple revisions. PDF will go through multiple revisions. The same may be true for Open XML or any other standard.


o   If the government locks its systems to a legislated format, then it potentially does the same for its citizens. Citizens will inevitably move on to new technologies more rapidly than the government’s mandated solution. This reintroduces the communications problems the government is seeking to avoid.


o   Another example of specifying technology in the legislation leading to a disconnect is when “open XML-based” formats are the only allowed solution. This precludes the state from using PDF.


·         Legislation like the bills proposed in MN, TX, CA, and OR creates precedents that would be challenging on many fronts.


o   The “royalty-free” text goes to the heart of more than 100 years of industrial standards law and policies. There have been clear examples of royalty-bearing standards that were superior to royalty-free specs and that ended up cheaper for end customers because market adoption brought economies of scale to bear. In other cases, having a better technology in the specification enables a better range of solutions to be implemented, and benefits consumers more because of the quality of the technology.


§  This is not an issue for Open XML as it has been released under an Open Specification Promise, and is covered by royalty-free terms in Ecma as well.


§  The risk is in governments not recognizing the value of innovation in any standard and that rights-holders may wish to make submissions to a standard and yet retain commercial value in that submission. 


o   The language in these bills is essentially saying there has been a failing of procurement practices to be neutral and value-based. I submit that this is not the case. In fact, States have in place strong procurement rules, and most have explicit archival rules in place as well.


§  Governments have long-standing procurement practices in place designed to ensure an open proposal and selection process for contracts/goods.


§  There are many, many factors such as price, suitability for use, minority-owned businesses, accessibility for disabled citizens, etc. that are mandatory factors in a purchase decision. If a law is put in place that reduces choice and removes these other balancing factors – that concerns me.


 


Overall, this is an enormously complex discussion. Organizations with a commercial interest in selling solutions dependent upon ODF are working hard to create problems that need to be solved…by them. I am not saying that all is perfect and there is no room for improvement. In fact, I would say governments face legitimate challenges on all three issues listed at the top of this posting – but if we look at what is already taking place in the marketplace, it becomes clear that using legislation to address these concerns is not only unnecessary, but dangerous.

Comments (9)

  1. Michael P. Ridley says:

    Thank you for the great post.  I am hoping we get some posts on why Open XML should be standardized in statute.  I really look forward to a great discussion on the pros and cons of these types of bills.

    Thanks again and very helpful.

  2. Stefan Wenig says:

    jason – the real question here should be: does OpenXML still support bullet points, or have you consumed all of them already?

  3. Anthony Christopher says:

    More of the same:

    The solution must be A or B exclusively, when A and B might do very well (e.g., open formats and accessibility).

    Your stubborn insistence that in the case of the government, the consumer is better off not determining his own needs and then seeking a product that meets those needs, but letting the market dictate what solutions are available and preferable. I claim that to get good value, a government must act more like a customer selecting a software building contractor than like a consumer relying on whatever happens to be on sale at the local convenience store. Having a choice of solutions that meet the customer’s requirements is good, but not bothering to define your requirements because someone claims the available solutions will change faster than your ability to define your own requirements is asking for trouble, even if the definition of these requirements reduces the customer’s choices somewhat.

    That is precisely what happens when you decide what you need and then seek someone or some product to meet that need. The need for a government to be able to access old data has been around for centuries and that need is not going to go away just because the IT industry happens to change very quickly.

    You claim governments already have archival rules in place and suggest that these ought to be good enough. Well, with these rules in place your employer has managed to provide governments with tools that are now seen more as a large part of the archival problem than as a solution.

    It is time for governments, however slowly they move, to formulate a new expression of their requirements to avoid the kind of problem in which your employer’s products have played a part. That is precisely the role fulfilled by the open data format bills and it shows at least a lack in common sense to suggest that it is either inappropriate, ineffectual, or counter productive for governments to pass such. Suggestions as to how these bills might be improved to meet their goals is a different matter.

    The red herring that some businesses, other than Microsoft, might profit from these requirements, does not bother me a bit. Let those that are willing to build a product to meet new legal requirements reap some benefit for their efforts.

  4. jasonmatusow says:

    Hi Anthony.

    I may not have expressed myself well given your first paragraph. I am completely ok with governments expressing their requirements and vendors needing to fill those requirements. That was the point of my opening comments about archival, interop, and comms. I completely agree that governments should act like customers when choosing a technology, but they should act like legislators when defining high-level goals (like long-term access to public records, or conversely, the destruction of certain types of records etc.). The danger is when the act of being the customer is codified in a piece of legislation, thus reducing the choice of the IT team as customers. I think you and I may be in agreement on this (shock for us to agree on something – :-) )

    On the archival point, I was not advocating a dogmatic status quo – I am concerned about the implications of using archival as the blunt tool to drive adoption of royalty-free standardization (for example). The two have a tenuous connection at best. For governments to clarify their archival requirements is fine by me. To do it in a way that is purposely detrimental to a specific vendor is not the role of government in my mind. Legislatively picking winners and losers in the market is a problem. Thus, you see the legislators scrambling to find ways to include PDF because they unwittingly wrote Adobe out of contention.

    To be clear, I would be just as against legislation that stated a preference for Microsoft software. Non-neutral policies are harmful period.

    As for your last point, I too have no problem if others than Microsoft are successful in winning government contracts. If Microsoft can’t produce competitive solutions then it deserves to lose.

    Thanks for the comment.

    Jason

  5. Anthony Christopher says:

    "On the archival point, I was not advocating a dogmatic status quo – I am concerned about the implications of using archival as the blunt tool to drive adoption of royalty-free standardization (for example). The two have a tenuous connection at best. For governments to clarify their archival requirements is fine by me. To do it in a way that is purposely detrimental to a specific vendor is not the role of government in my mind. Legislatively picking winners and losers in the market is a problem. Thus, you see the legislators scrambling to find ways to include PDF because they unwittingly wrote Adobe out of contention."

    Perhaps you could point out to me how royalty-free standardization favors specifically one vendor over another. If the other guy can’t charge royalties either how can it favor him?  Perhaps it favors him in the same way I favor certain vendors when I decide that the vehicle I will drive needs to be something other than a cement truck. All those vendors of just cement trucks will not get my business because I know I do not need a cement truck. Why should the government be forced to look at "cement trucks" when it needs/requires something else? That seems like a waste of time and resources. Even more so if the government ends up buying a "cement truck" in the process.

    I think the government procurement process should be more stringent than my own, not less, because they are spending taxpayer money. With your mindset, I am certainly glad that you are not one of my government’s elected officials.

  6. jasonmatusow says:

    You’ll be my first stop for a campaign donation when I decide to run Anthony.

    I think I was being too subtle in my point. Let me put them in bullet form to see if that helps:

    – One of the political justifications for the doc format preference bills is archival of documents.

    – One of the terms in many of the bills is mandatory royalty-free standardization.

    – There is no relationship between whether or not a specification carries royalty-terms (mind you, I am careful not to say that someone is, or is not collecting royalties) and whether or not it is a helpful spec for archival.

    – So, the goal that many anti-software patent activists feel is important (namely, the removal of patents from the standards specification process) gets put into the legislative context.

    – There is over 100 years of industrial precedent about patents in standards, never mind the fact that the standards bodies themselves have this strange desire to be in control of their own patent policies.

    – My point is not about the royalty-free issue being against any one vendor. The royalty-free term is actually against all potential contributors to standards bodies.

    Jason  

  7. Michael P. Ridley says:

    I spent a lot of time in the State Archives. Do any of you know what the predominant format is?

    It’s TIFF and JPEG.  In fact NYS archives millions of documents in one of the largest Filenet systems in the country. Filenet is a document management system that uses advanced OCR to do document lookups etc.

    So the archives already use royalty-free formats and a licensed system to read them. Where do you see the problem?

    Also, government procurement of IT systems is flawed, but not in this case.  Support, reliability, ease of use, and employee training are all factored in with TCO estimates on software.  This is a very simple thing. As I said in my earlier post, I encourage people in my Agency to use OpenOffice; they choose not to for a variety of reasons.

    The point I was making is this legislation does favor vendors with integration services like IBM.  In addition the main purpose of the bill is not fulfilled; this legislation has a policy impact that is 180 degrees from what it is being sold as.  That’s my problem with it, it’s misleading.

    Rewrite archival rules to force better data management, disaster recovery, etc. Provide funding to create a statewide metadata repository or data dictionaries.  These types of things will get us to where we want to go, not what office suite and what format reports should be done in.

  8. Wesley Parish says:

    I’m trying to get my head around your two statements:

    "Software is fundamentally different than anything in the physical world. It is subject to infinite manipulation, …"

    and:

    "So, the goal that many anti-software patent activists feel is important (namely, the removal of patents from the standards specification process) gets put into the legislative context.

    ‘There is over 100 years of industrial precedent about patents in standards, …."

    How, if software is infinitely malleable, and is fundamentally different to anything in the physical world, does one get from software to software patents?  Considering that patents were originally designed for the limited monopoly-for-unlimited divulging of new techniques?  Considering that the purpose of this trade-off was the continual development of the state of the art?

    It would appear that software’s infinite malleability would conflict with the legal regime established to reward invention in some very non-malleable materials.

    I can’t see how you can have both things being right at one and the same time, without dipping into some very dubious metaphysics and doubtful logic.

    Of course, patents were also established as a means to satisfy investors that there was going to be some return on their investment, since there was a limited period of monopoly awarded to the inventor, enough time to recoup the investments in plant and machinery.  Again, how does that feature of the original patent concept or doctrine, relate to an "industry" where you can pick up second-hand computers that are still useable, and suitable software, for just a few dollars?  That’s plant and machinery for a software business startup, and it can all fit in a garage or bedroom.

    Someone’s got their wires crossed, and it ain’t me!

  9. jasonmatusow says:

    Wesley – it is not up to me to argue the ontological roots of software, and thus determine if it is patentable. The fact is, the courts have said that it is patentable material. Just look at the fact that my company was hit with a $1.5B judgment to see how real it is. So, if that is the business environment in which one exists – you need to take it seriously.

    As for the point about physical vs. virtual – a railway car that runs on a wide-gauge track will have a hard time jumping to a narrow-gauge track. In software, translating between two elements is fundamentally different than in the physical world. Just as language is not physical, it is possible for it to be translated effectively even when there is no direct analog for a given idiom or concept.

    Thus, ODF and Open XML can coexist very easily on the same machine, even in the same application and users can move between the two formats easily. The flexibility of software is a clear benefit in this case.

    Jason