"Today 30 organizations from across the political spectrum joined together to ask Congress to improve public access to legislative information. Our joint letters to congressional appropriators and rulemakers urges Congress to direct that the THOMAS legislative database be published online and to establish an advisory committee on further improvements.
THOMAS, Congress' legislative information website that provides basic information about legislative and congressional actions, has fallen far behind the needs of its users. Many have turned to important websites like GovTrack, OpenCongress, and WashingtonWatch to monitor congressional activities.
These sites and others, which repackage and add important context to legislative activities, extract data from the THOMAS website through a painstaking and often brittle process. To make this process easier and more reliable, the Library of Congress should publish THOMAS information "in bulk," making the entire legislative database available for download at once, rather than publishing it in a way that allows it to be gathered only by scraping data from hundreds or thousands of webpages.
Bulk access to legislative information is already common practice inside and outside the government. For example:
- The Government Printing Office publishes six major databases, including the Federal Register, in bulk;
- The House's Office of Law Revision Counsel publishes the U.S. Code in bulk;
- New Jersey and New Hampshire publish their legislative information in bulk; and
- Data.gov has nearly 400,000 datasets available in bulk, including 4,395 high-value datasets."
Why? It's the US Government Manual as XML.
I don't know why, but I just think that's cool. Just as cool as the rest of the bulk data mentioned in the above post. Now to noodle up some ideas for using this data... hmm...
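One place to start noodling, as a minimal sketch only: Python's standard-library `xml.etree.ElementTree` can walk a bulk XML file and tally what is in it. The element names below (`GovernmentManual`, `Entity`, `Name`, `Branch`) and the inline sample are made up for illustration; the real Government Manual schema may look quite different, so treat this as a shape to adapt rather than something that runs against the actual file.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample data. The real bulk file's element names are an
# assumption here and would need to be checked against the actual XML.
SAMPLE_XML = """
<GovernmentManual>
  <Entity>
    <Name>Library of Congress</Name>
    <Branch>Legislative</Branch>
  </Entity>
  <Entity>
    <Name>Government Printing Office</Name>
    <Branch>Legislative</Branch>
  </Entity>
  <Entity>
    <Name>Department of State</Name>
    <Branch>Executive</Branch>
  </Entity>
</GovernmentManual>
"""


def count_entities_by_branch(xml_text):
    """Return a Counter mapping each branch name to its number of entities."""
    root = ET.fromstring(xml_text.strip())
    counts = Counter()
    # iter() finds matching elements at any depth, so nesting does not matter.
    for entity in root.iter("Entity"):
        branch = entity.findtext("Branch", default="Unknown").strip()
        counts[branch] += 1
    return counts


if __name__ == "__main__":
    for branch, total in sorted(count_entities_by_branch(SAMPLE_XML).items()):
        print(f"{branch}: {total}")
```

The point of the sketch is that with a bulk file, the whole dataset is one parse away; there is no page-by-page scraping step to break when a site layout changes.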