More standardization needed in open data

Government’s pivot toward open data is evident in the 2014 passage of the Data Act and the Obama administration’s focus on open government initiatives. But data standardization has not progressed fast enough to take advantage of the growing amount of open data, according to a new report from the Data Foundation and Grant Thornton.

Hudson Hollister, interim president of the Data Foundation and founder of the Data Coalition, is enthusiastic about the potential that standardization has for making data more usable and integral to decision-making processes. “The need to create consistent data standards and apply them -- that’s the main reason why all of the utopian benefits of open data haven’t happened yet,” Hollister said, “and the reason why, I think, across the entire government we’re just starting to reap the benefits.”

Little to no standardization in the government’s older datasets causes a number of problems, according to open data advocate and 18F staffer Waldo Jaquith, who was one of more than 40 people interviewed for the report. There is no metadata, so the information is not machine-searchable; there is no agreement on the core types of government data; there is no central repository of government datasets; and data portability is rare, which makes transferring data difficult. There are other issues too, Jaquith told interviewers.

“There are clever projects working on addressing all of these gaps, and some of these have achieved modest success on the sharing of data within the private sector and academia,” he said. “But they have not made any impact on the governmental practice of open data, nor is there any sign that they will soon.”

The Digital Accountability and Transparency Act was an important start for standardization, Hollister said. Prior to its passage, there was no one in charge of creating a consistent data structure for federal spending. Agencies were reporting some spending to the Treasury Department, some to the White House and other spending to the General Services Administration, he said. That’s changed now, but the Data Act covers only expenditures.

“It’s a real start,” he said. “We hope to see similar progress outside spending.”

One place to begin, Hollister said, is with the National Information Exchange Model, a project to create a core set of data fields for common elements like names and addresses. New standards could be built on this core, creating what Hollister calls “serendipitous interoperability” between datasets.

The report suggests the government’s job will be to make sure standards are put in place across the board. “Government can help to solve these issues by mandating standards and setting up processes for their development and maintenance,” it says.

The use of open data and standardization is not just good for government transparency, but also for management, Hollister said. The government will be able to save money on a “magnificent scale” when all government spending data is in one place, he said.

The term open data will eventually go away, he said, “when this isn’t just some special tech project anymore, but just the way that government works.”
