USAID revamping data repository

The U.S. Agency for International Development is revamping its repository for agency-funded data to make it easier for partners on international development projects to submit and access frontline information.

The agency’s first open data policy, released in October 2014, required all data to be sent back to USAID so it could be archived and made publicly available, assuming it doesn’t violate any national security or personal privacy laws.

The policy was intended to address two major agency challenges. First, most of the agency’s work is done through partner grants and contracts. As those awards came to an end, the agency risked losing the data generated through funded projects, according to USAID Chief Data Officer Brandon Pustejovsky. Second, because data is often collected in remote areas and underdeveloped countries, it was lost when laptops filled with dust and crashed or servers overheated, he told the audience at the Open Data Innovation Summit in Washington, D.C., on Sept. 28.

To reduce data loss and to support the open data policy, USAID created the Development Data Library, a public repository of machine-readable data on agency-funded projects. Since early 2015, partners have submitted forms to the agency requesting to upload data to the library.

“The DDL’s primary focus was encouraging the public’s use of data to generate new research and insight,” Pustejovsky said. However, he described the DDL’s current website as a basic starter page. To harness more data from the front lines and make it more accessible, USAID is upgrading its repository.

A new DDL, scheduled for release in 2017, will be powered by Socrata, a provider of public-sector cloud-based data solutions. The front page will have a submission button directing development partners to submit data on USAID-funded projects directly to the DDL.

“We recognize that responsible data is just as important and arguably more important than open data,” Pustejovsky said, so the system also guides users through a detailed privacy risk analysis that delves into the content of the data.

The site does not accept classified data, and its risk assessment tab includes a check for individual identifiers. This way, if a dataset contains information on individuals, the site probes the data to get a sense of what the actual risk could be if it were released. It also has enhanced tools for submitting multiple similar datasets and an option to request an embargoed public release.

For the public, the site will feature easy-to-use search on the homepage, the ability to create data visualizations on certain datasets and visibility into the raw data. “We view this as a living repository and an ongoing effort to close the gap between those producing the data and those who can ultimately benefit from the data,” Pustejovsky said.

USAID already relies on open data to make decisions within the agency. For example, during the Ebola outbreak in West Africa, USAID mapped cell phone coverage so the agency and its partners could exchange information with community health workers regarding lab test results, patient diagnoses, equipment availability and training materials.

“We’re on track, not simply to create another open portal, but to create a best practice repository for using data for improving international development outcomes,” Pustejovsky said.