Open Government starts with Open Data... (but...)

But it doesn't end there.

Aggregating and exposing information is the initial hurdle to compliance with the Open Government Directive (https://www.whitehouse.gov/open), and is the basic prerequisite for everything else. But the Open Government Directive isn't just an exercise in publishing data - the intent, and expectation, is that the data will be an enabler for a new dimension of Government service and citizen participation.

It's clear that the Administration understands this and has a vision for the long-term impact of the Directive, as highlighted in the blog post, "Why open gov matters to you" (https://www.whitehouse.gov/blog/2010/01/23/why-open-gov-matters-you). The examples of impact they mention demonstrate consumer and citizen benefit, as opposed to the potential impact on technologists.

It's a great vision – and I believe that it is necessary and achievable. However, the heavy focus in the technical community on getting the data has obscured some of the downstream requirements for getting the overall vision right. Sometimes it seems there's a perception that just exposing the data is enough - and there's an expectation that useful applications will start to "magically" appear. I've tried to avoid the cliché analogy to the "Field of Dreams" movie mantra: "If you build it, they will come..." – but it's just too tempting in this case.

Unfortunately for the Open Government Directive, it's likely that "they" (consumers and citizens) will NOT come if we limit the overall focus to data publication. The fact is, most people will not have the time or the interest to download datasets and scour through them for usable tidbits... nor do most people have the context to fully understand the data or the time or interest to attempt to gain the proper insight.

In order for the Directive to produce results that will truly effect political and cultural change, we need rich, usable, and USEFUL solutions that will add real value to citizens and agencies… and that doesn't happen by accident or merely through community enthusiasm. There are a set of moons that need to align "just right" in order to produce something that will be significantly valuable.

This set of elements that must coalesce includes the Data, the APIs, the Builders & Tools, the Domain Experts, and the Incentives. To elaborate:

The Data - As mentioned above, this is the cornerstone of it all… without good data, nothing else matters. The good news: as a result of the Open Government Directive, we should start to see rapidly increasing volumes of Federal information – and many States have also implemented their own data publishing initiatives. There are already some great data sources available – including (but not limited to, of course):

  • Data.gov – this is the primary repository for the Open Government Directive datasets, so it will be the granddaddy source into the future. There are already over 1,000 datasets available.
  • Open Government Dashboard (https://www.whitehouse.gov/open/around) - this is the dashboard site for reporting on Agencies' status on the Directive. It provides a visual for the quality and completeness of the data for each agency – as well as links to the OpenGov sites for each Agency. I've found it easier to start here – and you'll also be able to find datasets that are not yet up on Data.gov.
  • Business.gov – this is the SBA data site (some of the data is also up on Data.gov); there are some good business loan and grant datasets here.
  • GovTrack.us – using words from their site: "GovTrack.us is a tool by Civic Impulse, LLC to help the public research and track the activities in the U.S. Congress, promoting and innovating government transparency and civic education through novel uses of technology." Basically, they have a set of automated services that go out and collect a tremendous amount of information about Congressional/Legislative activity, and then make the information available for consumption. I've found it to be an amazing resource.
  • New York City Datamine (https://www.nyc.gov/html/datamine/html/home/home.shtml) - has some excellent datasets, and was the basis for their recent "BigApps" contest.
  • DC Data Catalog (https://data.dc.gov/) - DC was actually an early leader here, and has a very rich library of datasets.
  • Virginia Datapoint (https://datapoint.apa.virginia.gov/) - I had to include a link to data from my home state. Some great information there, and everything seems to be available in Excel format.
  • Edmonton, Canada (https://data.edmonton.ca) – one of the first cities to use the Open Government Data Initiative (OGDI) to publish their data – and there are several applications implemented against the Vancouver data catalogue that were likely a big help to consumers during the Olympics: https://bit.ly/dlavR6
  • Microsoft Azure "Dallas" (https://www.microsoft.com/windowsazure/dallas/) – "Dallas" is essentially a "Marketplace" for data that data publishers can use to expose their data/services very broadly, and data consumers can get access to datasets that would otherwise be inaccessible or prohibitively expensive. Several datasets are already available, including some from Data.gov, NASA, and Associated Press.
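To make the "download and dig in" part of this concrete: most of the catalogs above offer CSV exports, and a few lines of Python are enough to load one and pull out a summary. The rows below are invented purely for illustration – a real file would come from Data.gov or one of the state/city catalogs listed above.

```python
import csv
import io

# Invented sample rows, shaped like a typical government CSV export.
# A real dataset would be downloaded from Data.gov or a state/city catalog.
SAMPLE_CSV = """agency,program,year,amount
SBA,Business Loans,2009,1200000
SBA,Grants,2009,450000
EPA,Cleanup,2009,980000
"""

def summarize_by_agency(csv_text):
    """Total the 'amount' column per agency."""
    totals = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["agency"]] = totals.get(row["agency"], 0) + int(row["amount"])
    return totals

if __name__ == "__main__":
    print(summarize_by_agency(SAMPLE_CSV))
```

Even a trivial roll-up like this is more than most citizens will ever do by hand – which is exactly the point about needing builders.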

The APIs - The data needs to be hosted somewhere and published in a way that makes it easily consumable. The minimal expectation should be a structured format for download, such as XML (nothing burns my britches more than getting structured data in a PDF file). The problem with downloading datasets, of course, is that someone has to be responsible for ongoing synchronization… and if many organizations are using the data, that means there will be a tremendous amount of duplicate work. The best solution: a single source of the data, exposed via open standard APIs. Some of the items and sites that are worth noting include:

  • Open standard interfaces – there's GREAT news here: the industry has gotten very mature when it comes to standards and interoperability. Most developers are already familiar with using XML, Web Services, REST, JSON, Open Data Protocol, KML, etc. – and there are some great tools available that make this very easy. The trick is making sure that those who publish data use these standards.
  • SunlightLabs (https://sunlightlabs.com/) – Sunlight Labs is part of the Sunlight Foundation, a non-profit, nonpartisan Washington, DC based organization focused on digitization of government data and making tools and websites to make it easily accessible. They provide a community to work on open source projects for Government transparency – including providing some great APIs for data, such as the Sunlight Labs API and VoteSmart API. You can find a list of some others at https://wiki.sunlightlabs.com/Main_Page
  • Microsoft Azure "Dallas" (https://www.microsoft.com/windowsazure/dallas/) – as mentioned above, a great deal of data will be there – and it will all be exposed via APIs. My expectation is that this will be a huge opportunity for developers into the future.
  • Open Government Data Initiative (https://www.microsoft.com/industry/government/opengovdata/) - OGDI (pronounced og-dee) is Microsoft's initiative to help agencies (and others) publish data and expose it via APIs. The idea is simple: provide a toolkit that allows agencies to fairly easily publish data and host it in the cloud, in this case Windows Azure, and then automatically expose that data via open standard API interfaces. OGDI takes most of the hard work of hosting and writing the API services off the table, so agencies can focus on getting the data right.
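To make the "open standard interfaces" point concrete, here's a minimal sketch of consuming an XML payload with nothing but Python's standard library. The response body below is made up – the element names are illustrative assumptions, not any particular API's schema – but the pattern is the same for any of the services above.

```python
import xml.etree.ElementTree as ET

# A made-up XML payload standing in for a REST API response;
# element names are illustrative, not any real API's schema.
SAMPLE_XML = """<datasets>
  <dataset id="101"><title>Business Loans</title><agency>SBA</agency></dataset>
  <dataset id="102"><title>Toxics Release Inventory</title><agency>EPA</agency></dataset>
</datasets>"""

def list_titles(xml_text):
    """Return (id, title) pairs from a <datasets> document."""
    root = ET.fromstring(xml_text)
    return [(d.get("id"), d.findtext("title")) for d in root.findall("dataset")]

if __name__ == "__main__":
    for dataset_id, title in list_titles(SAMPLE_XML):
        print(dataset_id, title)
```

The whole parse is a handful of lines – which is why structured formats beat PDFs, and why a single API-hosted source beats everyone keeping their own downloaded copies in sync.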

The Builders & Tools – Okay, you have the data and you have the APIs… the next things needed are developers and designers who know how to consume the data and how to create interfaces that meet the needs of Citizens and Agencies. You also need good tools and technologies as a platform for the solutions. The challenge here is getting the time and focus from developers (which I'll talk about in my last point). There is good news here, though – as there are some great tools available. Some of the Microsoft technologies definitely worth considering include:

  • Visual Studio 2008/2010 – in my opinion (biased, of course, but I don't think overly so), there is no better tool for writing applications – specifically those that work with data and Web Services.
  • REST Starter Kit – an add-on for Visual Studio that makes consuming REST services a piece of cake. You can get more information here: https://bit.ly/JYxf - and the download (including source code) is available on CodePlex here: https://bit.ly/164EHm
  • SharePoint – I'm probably not telling anyone anything new here – but SharePoint can be the perfect platform for open and collaborative solutions. Publishing, content management, collaboration, and social networking are all out-of-the-box capabilities. It also happens to be a superb platform for customization, enabling the richest experiences, such as in the https://Recovery.gov site (which was built on SharePoint).
  • Bing – Bing? Isn't that just a search engine? Actually – Bing is really a very powerful platform for developers, and exposes a slew of APIs for consuming Bing search data (such as news items, images, business information, and web searches) – you can find more here: (https://www.bing.com/developer). Oh, and don't forget Bing Maps, of course… I'll never go back after using the new Silverlight interface. More info here: (https://www.microsoft.com/maps/developers/)
  • Silverlight – the key to building USABLE applications is having a user experience that is compelling and functional, and to that end, Silverlight provides a superb platform for Rich Internet Applications. There are some really cool examples of Silverlight use for OpenGov solutions – such as IDV Solutions' showcase (https://visualfusion.cloudapp.net/), ISC's Miami 311 (https://miami311.cloudapp.net/), and the City of Lakewood ezMaps (https://maps.lakewood.org/)
  • Windows Azure – Microsoft's Cloud Platform (https://bit.ly/yWv8a). Agencies will need a place to host the data and run the code that provides a web front-end as well as API interfaces. Since this data is all for public consumption, anyway, concerns about security and privacy in the public cloud are not an issue. Plus, the load on this data will likely be highly variable, and funding for capital expenditures on new data centers is likely scarce. Net net: public clouds are almost the perfect option for Open Government Data.

The Domain Experts – My favorite dataset up on Data.gov is the "2005 Toxics Release Inventory data for Guam"… it just sounds very interesting and cool. Unfortunately, I have no idea what a Toxics Release Inventory is, I'd probably be challenged to find Guam on a globe, and I have no idea how to decipher the information in the dataset. However, there is likely someone out there who completely understands the information and knows how it can be used. It is critical to engage such Domain Experts as we embark on the development of solutions – otherwise, most of the data publishing work will be a useless exercise. These experts can provide:

  • Insight to the meaning of the data and how it can be used (Contextualization/Semantic guidance)
  • Insight into what citizens and government agencies actually need - and how that data can be assembled to meet that need (Basically: good ideas for solutions)

The Incentives – this one is the kicker… how do we actually get people to write applications? Yes – there will be some that are built out of interest, curiosity, and even civic responsibility (e.g., SunlightLabs). But I think it would be unreasonable to expect those things, alone, to develop an entire ecosystem of solutions that truly meet the tenets of Open Government. There needs to be a carrot that helps incent people to spend time and money building such solutions, and that typically means that folks need visibility into how they're going to monetize their investment – either by selling solutions, using an advertising platform, gaining visibility in the market, providing consulting opportunities, etc. I know the idea of profiting may seem a bit counter to the noble tenets of Open Government – but without such prospects, it will be challenging to develop a fruitful marketplace of solutions, especially in this economic environment.

 

Thus, success of the Open Government Directive isn't just about getting the data right; it will depend heavily on execution across the above five elements (Data, APIs, Builders & Tools, Domain Experts, and Incentives) – as well as on formal mechanisms and a tangible strategy to bring them all together.

To date, I'd say we (everyone involved and interested in OpenGov solutions) are on a pretty good track – data sources are growing rapidly and quality is improving, the shift to APIs and advancement of the standards has been strong, and the developer community is definitely getting engaged. I don't think we've nailed the engagement of domain experts or models for incentives – but as real business opportunities start to emerge, I suspect that those areas will ramp up quickly.

Now – for my next trick, I plan to build a small "open gov" application using some of the resources mentioned above. I'd like to create a quick app that allows me to enter my zip code and find out who all of my Congressional representatives are, then find out the committees they're on, bills they've sponsored, how they vote based on some issues… and then provide me the tools to contact them, as well as enter a daily rating on how I think they're doing.
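The first step of that app could be sketched roughly like this. Everything here is a hypothetical stand-in: the endpoint URL, the API key, and the JSON field names are all invented for illustration – real services such as the Sunlight Labs or GovTrack APIs have their own URLs and schemas that would need to be substituted in.

```python
import json
import urllib.parse

# Hypothetical endpoint -- a real legislator-lookup service (e.g., from
# Sunlight Labs or GovTrack) would have its own URL and parameter names.
BASE_URL = "https://api.example.gov/legislators"

def lookup_url(zip_code, api_key):
    """Build the query URL for a zip-code lookup (endpoint is hypothetical)."""
    query = urllib.parse.urlencode({"zip": zip_code, "apikey": api_key})
    return BASE_URL + "?" + query

def format_reps(response_text):
    """Turn a JSON response into 'Name (Party, Chamber)' display strings.

    The field names here are assumed for illustration; a real API's
    schema would differ.
    """
    reps = json.loads(response_text)["legislators"]
    return ["%s (%s, %s)" % (r["name"], r["party"], r["chamber"]) for r in reps]

if __name__ == "__main__":
    sample = '{"legislators": [{"name": "Jane Doe", "party": "I", "chamber": "House"}]}'
    print(lookup_url("22101", "MY_KEY"))
    print(format_reps(sample))
```

From there, the committee memberships, sponsored bills, and voting records would each be another API call layered on top of the same pattern.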

Depending on how much it snows – and as long as the power doesn't go out – I hope to be done in the next day or so. I'll post the results and code once finished.

-Dan