The challenge for the 47 developers who entered was to come up with ways of using the vast collections of federal records the Obama administration is now putting online as part of its Transparency Initiative. Through websites like Recovery.gov, citizens can now track the dollars spent in the $787 billion stimulus package. Through a newly redesigned online version of the Federal Register, they can access copies of pending legislation, department rules and regulations, meeting notices, executive orders and other documents. In Data.gov, they can download and manipulate thousands of federal databases.
In almost every department of the federal government, websites, podcasts, even Twitter feeds are emerging to spread the government’s message. Have a simple question about swine flu? Try flu.gov, where you can download a widget that shows the weekly outbreaks of flu by state in a Flash animation map you can put on your website, post to your Facebook account or load on your mobile phone.
Never have the minutiae of government operations been so readily accessible, and never has it been more important to have trained researchers, academics and, yes, journalists schooled in the art of reading the data.
Apps for America
“Apps for America” was a contest conducted by the Sunlight Foundation for the Office of Science and Technology. Its focus was on Data.gov where, in addition to the databases, the website has a “Tools” section that hints at various widgets you can use to incorporate the databases into a wide variety of geo-mapping programs for graphic display. From the size of the prize money ($25,000) and the low volume of entries, it appears to be one of those press release contests designed more to build awareness than meaningful applications, but it is a useful demonstration of my point.
The winner was a program called Datamasher.org, an almost Fortran-like formula that allows you to use add, subtract, multiply or divide operators to compare one government data set against another. If, for instance, you put the mortgage foreclosure rate in every state against the suicide rate, you will find Nevada has the highest suicides per mortgage foreclosure in the country, an odd but not particularly useful piece of information.
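For readers who want to see what a "mash" of that kind amounts to, here is a minimal sketch in Python. It is not Datamasher's actual code. The function names (`mash`, `rank`) are my own, and the per-state figures are invented purely for illustration, so nothing in the output should be read as real data.

```python
import operator

# Hypothetical per-state rates, invented for illustration only.
SUICIDE_RATE = {"Nevada": 20.0, "Illinois": 10.0, "Ohio": 12.0}
FORECLOSURE_RATE = {"Nevada": 2.0, "Illinois": 4.0, "Ohio": 5.0}

# The four arithmetic operators the article mentions.
OPERATORS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def mash(left, right, symbol):
    """Combine two state-keyed data sets with one arithmetic operator."""
    combine = OPERATORS[symbol]
    result = {}
    # Only states present in both data sets can be compared.
    for state in left.keys() & right.keys():
        if symbol == "/" and right[state] == 0:
            continue  # skip states where the division is undefined
        result[state] = combine(left[state], right[state])
    return result


def rank(mashed):
    """Return (state, value) pairs sorted from highest to lowest value."""
    return sorted(mashed.items(), key=lambda pair: pair[1], reverse=True)


ranking = rank(mash(SUICIDE_RATE, FORECLOSURE_RATE, "/"))
for state, value in ranking:
    print(f"{state}: {value:.2f}")
```

The arithmetic is trivial, which is rather the point: the tool will happily divide any column by any other, and it is up to the person reading the ranking to decide whether the ratio means anything.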
The first runner-up was ThisWeKnow.org, a website that lets you type in your zip code to discover random facts kept by the government about your neighborhood. In 60647, for instance, there are 290 factories within 15 miles spewing 5,927,650 pounds of pollutants in a county with 310,000 unemployed and a city with 464,912 homeowners, 597,000 renters, and 6 legislative earmarks requested by 6 organizations. Aren’t you glad you asked?
The last finalist, and the best to my mind, is govpulse.us, a gateway to the Federal Register that lets citizens sift through this massive database by agency, topic, or date. Beyond the simplicity of browsing, govpulse.us also provides handy “sparklines” on new or pending rules and regulations and links to Google Maps with thumbtacks showing locations near you where a rule, hearing, or legislation might apply.
The app is not without its faults. I eagerly clicked on the thumbtack nearest my home and was surprised to find the Coast Guard was proposing a new regulation on when Chicago River drawbridges must open for recreational boats. Only at the bottom did I discover it had been promulgated in 1994.
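Sifting by agency, topic, or date, with a cutoff that keeps a 1994 rule from posing as news, can be sketched in a few lines of Python. This is only an illustration under my own assumptions, not govpulse.us code: the `Notice` record, the `sift` function, and every sample entry, including the exact dates, are hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Notice:
    """A hypothetical, simplified stand-in for one published notice."""
    agency: str
    topic: str
    published: date
    title: str


# Invented sample entries; titles and dates are placeholders.
NOTICES = [
    Notice("Coast Guard", "drawbridges", date(1994, 1, 1),
           "Chicago River drawbridge openings"),
    Notice("Coast Guard", "safety zones", date(2009, 8, 1),
           "Hypothetical harbor safety zone"),
    Notice("EPA", "air quality", date(2009, 9, 1),
           "Hypothetical emissions reporting rule"),
]


def sift(notices, agency=None, topic=None, since=None):
    """Keep notices matching every filter that was actually supplied."""
    results = []
    for notice in notices:
        if agency is not None and notice.agency != agency:
            continue
        if topic is not None and notice.topic != topic:
            continue
        if since is not None and notice.published < since:
            continue  # drop anything older than the cutoff date
        results.append(notice)
    return results


recent = sift(NOTICES, agency="Coast Guard", since=date(2009, 1, 1))
for notice in recent:
    print(notice.published.isoformat(), notice.title)
```

Without the `since` argument, the drawbridge entry comes back alongside the newer one, which is roughly the experience I had with the thumbtack.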
Data.gov is clearly a work in progress. There are only a couple hundred federal programs that offer “tools” for looking at their information, and 100 of those are Defense Department podcasts featuring briefings, blogs and videos supporting our troops abroad. (None help figure out what’s in the Defense Department budget.)
Of the 26 widgets now available on Data.gov, most of the good ones have been developed by federal agencies to reflect their own special interests.
The Environmental Protection Agency, for instance, has half a dozen widgets that let you type in your zip code to discover the air, water and land pollutants near you, see reports on local factories emitting pollutants, and/or report violators. The presumption, and it is not without merit, is that these can proliferate virally on the Internet like Smokey The Bear logos so you can report polluters as easily as forest fires.
My favorite widget so far comes from the FBI. It is a 10 most wanted list you can incorporate into your website, post on Facebook or carry around on your iPhone. At the top of the list is Osama bin Laden with a picture you can click through to a close-up if you happen to see him sitting next to you on the CTA.
President Obama has a lot riding on the success of Recovery.gov so it is no surprise it is getting a lot of scrutiny from tech heads, contractors, opponents and journalists. His promise in his State of the Union speech was that the public could track every dollar spent in every federal contract to see whether it is being spent wisely. But that is really, really hard.
The proof, six months into the project, is that the General Services Administration has already awarded a $9.5 million contract to Smartronix to redesign the site. Throwing contracts up on the net in raw data form is one thing; creating avenues for understanding what’s in them and parsing the data to come to meaningful conclusions quite another.
Fortunately, there are journalistic enterprises that took Obama at his word and have devoted the resources necessary to unravel the complexities.
Propublica.org is a website created in 2007 with a grant of $30 million over three years from former Golden West CEO Herbert Sandler to conduct investigative journalism in the public interest. Working out of a newsroom in New York, 32 seasoned reporters under the direction of former Wall Street Journal managing editor Paul Steiger are delving into topics like the bailout package, stimulus, Guantanamo, and gas drilling with technical resources few newspapers can muster.
They publish their reports online and in conjunction with newspaper partners. (Their best-known project was a long article, the result of a two-year investigation, on the hospital deaths during Hurricane Katrina published last month in the New York Times Magazine.)
I heard an example of their work on NPR a few days ago when a local reporter used some of their findings to show contractors in Illinois receiving stimulus money were employing only 9 percent minorities (mostly women) versus the 22 percent norm on other government jobs.
The Next New Thing
And that is what transparency is all about: making all this data understandable. It’s great that the government is pushing all information onto the net. (What are the alternatives? Hire more people to process Freedom of Information requests and xerox copies of documents to mail back in return?)
Rest assured, the political operatives in Washington, the lobbyists on K Street, the private corporations who traffic in government funding (who doesn’t these days?) have the money to probe and manipulate this gold mine of data to their advantage.
Who can the public rely on to do the same? In this era when newspapers can scarcely scrape up enough money to send a reporter to cover a three-alarm fire, organizations like Propublica are the next new thing in journalism. Just as they once did in forming the Associated Press to disseminate their news, newspapers would do well to invest in clusters of reporters armed with sophisticated data-mining tools to produce it.
Perhaps newspapers alone can’t fund the whole thing. Foundation support or benefactors may have to play a prominent role. This was the case back in the ’90s when Charles Lewis first established the Center for Responsive Politics to study campaign finance reports from the Federal Election Commission (a model that Sandler says inspired him). That pro bono effort has led to a website, OpenSecrets.org, that every political reporter uses in every campaign to report on where the money in politics is going.
Data mining the government is a growth industry. The widgets are cute. And there’s an argument that if you have enough bloggers looking at enough data, something good is bound to happen. It’s called the 1000 monkeys at a typewriter theory.
If you can find a way to apply all this technical data-mining know-how to the single purpose of showing the public how their government works––and put it in the hands of someone who can think and write at the same time––then you have something wonderful. A story people will want to read.