Saturday, May 22, 2010

Happy Birthday Data.gov. You’ve grown so in the last year… (from 47 to 272,677 datasets)

WhiteHouse.gov - Data.gov: Pretty Advanced for a One-Year-Old

“One year ago, data.gov was born with 47 datasets of government information that was previously unavailable to the public. The thinking behind this was that this data belonged to the American people, and you should not only know this information, but also have the ability to use it. By tapping the collective knowledge of the American people, we could leverage this government asset to deliver more for millions of people.

Today, there are more than 250,000 datasets, hundreds of applications created by third parties, and a global movement to democratize data. To date, the site has received 97.6 million hits, and following the Obama Administration’s lead, governments and institutions of all sizes are unlocking the value of data for their constituents. San Francisco, New York City, the State of California, the State of Utah, the State of Michigan, and the Commonwealth of Massachusetts have launched data.gov-type sites, as have countries such as Canada, Australia, and the UK as well as the World Bank.

…”

Data.gov

“…

Data.gov is leading the way in democratizing public sector data and driving innovation. The data is being surfaced from many locations making the Government data stores available to researchers to perform their own analysis. Developers are finding good uses for the datasets, providing interesting and useful applications that allow for new views and public analysis. This is a work in progress, but this movement is spreading to cities, states, and other countries. After just one year a community is born around open government data.

Just look at the numbers:

6 Other nations establishing open data
8 States now offering data sites
8 Cities in America with open data
236 New applications from Data.gov datasets
253 Data contacts in Federal Agencies
272,677 Datasets available on Data.gov

…”

Data.gov Catalog

“Data.gov Catalogs

Use the Data.gov catalog below to access U.S. Federal Executive Branch datasets. Click on the name of a dataset to view additional metadata for that dataset. By accessing the data catalogs, you agree to the Data Policy. Data.gov offers data in three ways: through the "raw" data catalog, using tools and through the geodata catalog. The "Raw" Data Catalog provides an instant download of machine readable, platform-independent datasets while the Tools Catalog provides hyperlinks which may lead to agency tools or agency web pages that allow you to mine datasets.

…”

Data.gov Developers Corner

“Are you interested in sharing your mashups, apps, and ideas? Do you want to learn how to create apps and mashups with some of the data hosted here on Data.gov? Whether you are here to share, learn, collaborate, or innovate–you've come to the right place.

…”

The Data-gov Wiki (Data.gov in RDF)

“The Data-gov Wiki is a project being pursued in the Tetherless World Constellation at Rensselaer Polytechnic Institute. We are investigating open government datasets using semantic web technologies. Currently, we are translating such datasets into RDF, getting them linked to the linked data cloud, and developing interesting applications and demos on linked government data. Most of the datasets shown on this page come from the US government's data.gov Web site, although some are from other countries or non-government sources.

…”

If only there was an API for Data.gov (cough… OData/“Dallas” would be very cool here… cough)

Still, there’s a ton of “data” here. Now if only we could turn it into information and, finally, wisdom…

2 comments:

Michael Hausenblas said...

Greg,

Nice post, thanks! What I don't get is why you say "If only there was a API for Data.gov". There is a very powerful one and you even have it in your post: RDF/Linked Data and SPARQL. It allows you to access the data in a standardized way. For example, go to [1] and execute a query against the public SPARQL endpoint. Or, once we've indexed all the data, you can look it up in Sindice [2].

Once you have URIs+HTTP+RDF (== Linked Data) in place, you have a uniform API. The only thing a developer needs to learn on a per-case basis is which vocabularies are used. Experience shows that there are a few quite widely used vocabularies, such as Dublin Core, FOAF, SIOC, etc., and additionally domain-specific ones, as we also find in data.gov.

Cheers,
Michael

[1] http://semantic.data.gov/sparql
[2] http://sindice.com/
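
To make the "uniform API" point concrete, here is a minimal sketch of how a query against the endpoint in [1] could be prepared. It assumes the endpoint follows the standard SPARQL protocol (the query is sent as a `query` parameter over HTTP GET) and that the data carries Dublin Core titles, as the comment suggests. The helper name `build_sparql_request` and the sample query are illustrative only, and the endpoint may no longer be online, so the sketch builds the request without sending it.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint taken from reference [1] above; it may no longer be live.
ENDPOINT = "http://semantic.data.gov/sparql"

# Illustrative query: assumes datasets are described with Dublin Core titles.
QUERY = """
PREFIX dc: <http://purl.org/dc/elements/1.1/>
SELECT ?dataset ?title
WHERE { ?dataset dc:title ?title }
LIMIT 10
"""


def build_sparql_request(query: str, endpoint: str = ENDPOINT) -> Request:
    """Build (but do not send) a SPARQL-protocol GET request.

    Hypothetical helper: the query text goes in the `query` URL parameter,
    and the Accept header asks for SPARQL JSON results.
    """
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_sparql_request(QUERY)
print(request.full_url)
```

Passing `request` to `urllib.request.urlopen` would actually run the query, provided the endpoint is reachable. The same helper works for any other SPARQL endpoint by changing `endpoint`; only the vocabularies in the query differ from one dataset to the next.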

Greg said...

Michael,
Nice! I had missed that and that was exactly what I was looking for...

Thank you!