Open Data 101:

Web Development in the Era of Open Data

Matthew Schwartz, Todd Davis

EPA Internet Service Center - CSC

What is "Data"?

Raw Data Image

 

In this context, data refers to information that is meant to be handled by other computers rather than directly read by people.


Very roughly: think databases, not PDFs; spreadsheets, not white papers.

What is "Open"?

Data.gov download image

 

Data becomes "open" when it is published in a way that makes it accessible to all consumers.

What is Open Data not?


As with any set of "buzz words," there are many interpretations of what fits into the category.

When we use the term "open data" we aren't referring to:

These are all tools built with open data.

Open Data Glossary: XML, CSV, JSON, etc

XML Sample Image

Text files used to structure, store and transport data.
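To make the idea concrete, here is a minimal sketch that reads a few records from CSV text and writes the same records out as JSON, using only Python's standard library. The field names and values are invented for illustration and do not come from any real dataset.

```python
import csv
import io
import json

# Hypothetical sample data: the columns and values are made up.
CSV_TEXT = """site_id,city,reading
A1,Denver,12.5
B2,Austin,9.0
"""


def csv_to_records(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def records_to_json(records):
    """Serialize the same records as JSON text."""
    return json.dumps(records, indent=2)


records = csv_to_records(CSV_TEXT)
json_text = records_to_json(records)
print(json_text)
```

Note that CSV carries every value as text, so the readings come back as strings; a consumer decides how to interpret them.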

Open Data Glossary: KML / Geodata

KML Sample Image

Same idea as XML - text files transporting information.

Used specifically for Geodata.
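As a rough sketch of how a program might consume KML, the snippet below pulls the name and coordinates out of a tiny hand-written KML document with Python's standard XML parser. The placemark itself is an invented example; the namespace is the standard KML 2.2 one, and KML lists coordinates as longitude, latitude, altitude.

```python
import xml.etree.ElementTree as ET

# Hand-written sample document; the placemark is illustrative only.
KML_TEXT = """<kml xmlns="http://www.opengis.net/kml/2.2">
  <Placemark>
    <name>Sample Site</name>
    <Point>
      <coordinates>-77.0365,38.8977,0</coordinates>
    </Point>
  </Placemark>
</kml>
"""

NS = {"kml": "http://www.opengis.net/kml/2.2"}


def read_placemarks(kml_text):
    """Return a list of {name, lat, lon} dicts for each point placemark."""
    root = ET.fromstring(kml_text)
    places = []
    for pm in root.iterfind(".//kml:Placemark", NS):
        name = pm.findtext("kml:name", default="", namespaces=NS)
        coords = pm.findtext(
            "kml:Point/kml:coordinates", default="", namespaces=NS
        ).strip()
        if not coords:
            continue  # skip placemarks that are not simple points
        # KML order is longitude, latitude, then optional altitude.
        lon, lat = (float(v) for v in coords.split(",")[:2])
        places.append({"name": name, "lat": lat, "lon": lon})
    return places


places = read_placemarks(KML_TEXT)
print(places)
```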

Open Data Glossary: APIs

"Application Programming Interface" - a method for computer programs to talk to each other.

In Open Data, APIs are a way for programs to request certain chunks of information over the web rather than the whole data set.

The distributor keeps the data on its own computers and provides a "drive-through window" through which other people can grab small chunks.

Most effective when dealing with very large or rapidly changing datasets.

Open Data Glossary: APIs (cont.)

Visit a specially formatted web address:

Image of Google Code

Receive a machine-formatted response:

Image of JSON
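The two steps above can be sketched in a few lines of Python. The endpoint `https://api.example.gov/v1/readings`, its `city` and `limit` parameters, and the sample response are all hypothetical, chosen only to show the shape of the exchange; no network call is made here.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint: not a real government API.
BASE_URL = "https://api.example.gov/v1/readings"


def build_request_url(base_url, **params):
    """Build the 'specially formatted web address' from query parameters."""
    return f"{base_url}?{urlencode(sorted(params.items()))}"


# Invented example of what a JSON response body might look like.
SAMPLE_RESPONSE = '{"count": 1, "results": [{"site_id": "A1", "reading": 12.5}]}'


def parse_response(body):
    """Turn the machine-formatted response into Python objects."""
    return json.loads(body)["results"]


url = build_request_url(BASE_URL, city="Denver", limit=1)
results = parse_response(SAMPLE_RESPONSE)
print(url)
print(results)
```

Against a real API, the response body would come from fetching the URL (for example with `urllib.request.urlopen`) rather than from a canned string.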

Raw Data, Not Forms

The USAJobs search forms expose data, but they don't provide open data.

Image of USAjobs

Raw Data, Not Forms (cont.)

Data Diagram

If the contents of USAJobs were also offered as "open" data, the public could potentially create a variety of new and interesting uses for this data.

Not only can the public create new uses for the data; the data is also portable.

Open Data is:

Why Provide Open Data?

Government as Platform

IT "platforms" are products that allow others to build on their success. They create opportunities.

The biggest success stories in the IT world often involve products becoming "platforms."

Government as Platform (cont.)

How can opening government data build a platform?

External Contribution

Stakeholders have the most to gain and the most to offer in return

Encourage Experimentation

It is hard for the government to be innovative on the web.

These barriers can be lower for the private sector. However, it lacks easy access to the data that could be used to innovate.

Opening up the raw data that many government sites are based upon allows the public to build supplemental and competing uses for the government's data.

Encourage Experimentation (cont.)

Enabling the public to utilize open data in its innovations is good for the government as well.

All of this means government web sites aren't saddled with finding "the one true way" right out of the gate.

Efficiency

Data tucked away in open formats doesn't "age" in the same way web sites do. It needs no redesigns, no re-templating, and no migration to newer platforms.

Assists in the "blending" of data across agency boundaries: Federal to Federal, or Federal to State/Local/Tribal.

Increase in "visibility": Innovators can expose the data to groups that might not have been its original intended audience.

Marketing tool: Attention is drawn to the source of the data as it is discovered and used by the public.

Examples

Data.gov

http://www.data.gov/


"The purpose of Data.gov is to increase public access to high value, machine readable datasets generated by the Executive Branch of the Federal Government. Although the initial launch of Data.gov provides a limited portion of the rich variety of Federal datasets presently available, we invite you to actively participate in shaping the future of Data.gov by suggesting additional datasets and site enhancements to provide seamless access and use of your Federal data."


Data.gov (cont.)

Data.gov

Examples: Data.gov and Apps for America

Apps for America is a project by the Sunlight Foundation. Sunlight is a non-profit group that advocates transparency in Congress.

Examples: Jobless Rate

The Jobless Rate for People Like You is a data visualization tool built by the New York Times.

Using raw data from the Bureau of Labor Statistics, this tool breaks out government data in a form that is more accessible to lay people. It is also a great example of the public (private sector) coming up with innovations that could be incorporated back into government web sites.

Every Block

http://www.everyblock.com/


A "Mashup" of a variety of data sources

Contact

Internet Services Center

Matthew Schwartz
EPA Internet Service Center - CSC
202-741-4162
Schwartz.Matthew@epa.gov


Todd Davis
EPA Internet Service Center - CSC
202-741-4354
Davis.Todd@epa.gov