The Data Foundation and Grant Thornton have co-published the second edition of The State of the Union of Open Data, which combines the perspectives of more than 20 government and industry leaders collected through one-on-one interviews. Their consensus? Once government and the private sector transition to open data – information both electronically standardized and freely available – it becomes a valuable resource with both internal and external applications.



  • Adam Hughes, Grant Thornton Public Sector
  • Matt Rumsey, Independent Consultant



1. Introduction and Executive Summary

2. Survey and Results

3. Government Data

4. Compliance Data

5. Private-Sector Data

7. Appendix

Introduction and Executive Summary

Open data – the idea that information should be both electronically standardized and freely available – has taken hold in the United States, bringing sweeping changes for our government and society.

Government, nonprofit, and industry leaders in open data came together in September 2017 for Data Transparency 2017, the nation’s largest annual open data gathering. This fifth annual policy conference, hosted by the Data Foundation, came at a vital moment for the open data movement in the United States. By combining the perspectives of Data Transparency 2017 participants, this report seeks to capture the moment, provide a vision of the future, and catalyze further efforts.

2017 was a time of change and reckoning, with a new administration in the White House and major deadlines for implementation of the nation’s first open data law, the Digital Accountability and Transparency Act of 2014.

2017 was also a time of renewed focus. At Data Transparency 2017, participants heard a strong recommitment to open data from the White House. We heard new ideas from Congress, federal agencies, the nonprofit community, and the business community that will help grow the open data movement in years to come.

In the United States, the open data movement benefits from decentralized leadership, with energy coming from government, nonprofit, and tech-industry sectors. Data Transparency 2017 brought all of these sectors together in one venue to express and reflect the current state of open data, predict opportunities for the future, and identify challenges standing in the way of those benefits.

The combined perspectives of these stakeholders, expressed in a survey and interviews conducted by the Data Foundation and Grant Thornton, represent the State of the Union of open data in the United States.

The Data Foundation defines open data, at its most simple, through two steps: first, standardization, and second, publication.[1] The Data Foundation believes the benefits of open data can be recognized as three pillars: external transparency, internal management, and automated reporting (see matrix).

The open data movement began over a decade ago with a focus on government information, but it is no longer limited to government.[2] The Data Foundation also explores how open data can transform the compliance sector, defined as the interface between the public and private sectors.[3] And, at Data Transparency 2017, the Foundation featured a program track exploring purely private-sector open data: circumstances where private-sector organizations voluntarily standardize and publish their operational data, without any government mandate.

This focus on the private sector was brand new for the Data Foundation in 2017, but its ramifications are similar. We (authors Adam Hughes and Matt Rumsey) believe the same three pillars – better transparency, better management, and automated reporting – can be recognized across all three sectors (see matrix).

Accordingly, this report, just like the Data Transparency 2017 conference, will be broken down into three distinct sections, focusing in turn on government data, compliance data, and open data in the private sector.

When we interviewed open data leaders and witnessed sessions of Data Transparency 2017, certain themes were stressed again and again. Open data transformation is a marathon, not a sprint. Alternately, it is a relay race that requires continuity between administrations, cooperation across branches of government, and shared vision in the private sector. Open data is a non-partisan imperative, with stewardship of the public’s vital information at its core and innovation for the private sector’s information as its logical conclusion.

Interviewees found the three emerging open data sectors – government, compliance, and private sector – to be interesting on their own, but argued that even more potential comes through the sectors’ combination and commingling. Furthermore, interviewees noted that our three categories of benefits – transparency, management, and automated reporting – are similarly intertwined.

Based on our interviews and research, we believe that the State of the Union of Open Data is strong. This report attempts to assess just how strong, and why.

I believe data standardization is the new DNA. Data standards will allow for future advancement….We could not put a man on the moon or on Mars in the future without [standardized] data.
— Marcel Jemio, Chief Data Architect, Office of Personnel Management [4]

[Y]ou [couldn’t have] imagined when the lightbulb was invented what would have happened in the age that followed … I think we’re just starting to see, once we open the data, what can happen as a result of that and imagine those possibilities.
— Victoria Collin, Senior Policy Analyst, Office of Federal Financial Management, White House Office of Management and Budget [5]

…[D]ata has infiltrated everything that we do, purchases that we make, topics we’re interested in, things we want to learn more about, social connections, all of these things that should enrich our lives … I can’t think of an industry that’s not using observed data to be predictive of future outcomes or likely positive experiences and wiring that into what you see on a shelf or what you interact with or what’s presented to you.
— Nate Haskins, Chief Data Officer, S&P Global Market Intelligence [6]

Matrix: Three Sectors and Three Pillars of Benefits: Selected Applications of Open Data


Survey and Results

To best understand the State of the Union of open data, the Data Foundation and Grant Thornton conducted interviews with, and requested written submissions from, dozens of participants in Data Transparency 2017.

We conducted a standardized survey of all our interviewees. Its results back up our positive assessment of the State of the Union of Open Data while also highlighting vital nuance and revealing areas for improvement.

First, we asked our respondents whether or not data standardization and publication – the two basic steps of open data – had improved over the past year. The answer to both questions was, overwhelmingly, yes. However, several respondents indicated that data publication had not advanced particularly well in the last year and many expressed a feeling that both were not moving forward as fast as they would like.



In your field, has the standardization of data improved in the last year?


In your field, has the publication of data improved in the last year?

Second, we asked our interviewees to rank the significant benefits of open data, which the Data Foundation has categorized as transparency, management, and automated reporting. As we will discuss further, below, we saw a trend toward greater understanding of the value of open, standardized data for internal management.

Third, we asked our respondents about their views on the future of open data: are they optimistic that standards and publication will improve over the next year or two? Are they pessimistic? Do they think things will merely stay the same? The results, again, were encouraging, with most respondents expressing optimism, several arguing that things are likely to stay roughly the same, and no one expecting significant rollbacks.


Rank the significant benefits of open data.

The Data Foundation categorized the significant benefits of open data as transparency, management, and automated reporting.


Will standardization and publication of data improve, stay the same, or deteriorate in the next year?

After responding to the standardized survey, we informally interviewed our participants for further insights into their perspectives on the status, challenges, and future of open data in the United States. They provided the following observations on our three separate, but interwoven, open data sectors: government, compliance, and the private sector.

We recognize that modernizing gov is not a sprint, but it’s a marathon and it’s a relay race from one administration to the next. Only by working together can we truly achieve this progress for our country.
— Chris Liddell, Assistant to the President, White House Office of American Innovation [16]

... [W]e have to transform that relationship from purely being viewed as between the public and the government to being more about whoever needs it, including other government agencies, [and that] is something that we’d like to pursue further. Partly that’s the concept of eating your own dog food or drinking your own champagne …
— Phil Ashlock, Chief Architect for Data.gov, Acting Director of the Data Analytics Portfolio, Technology Transformation Service, General Services Administration [17]

Open data is not just a transparency exercise. It’s really integral to the management of the federal government itself.
— Margie Graves, Acting Federal Chief Information Officer, White House Office of Management and Budget [18]

Government Data

Thanks to energy and effort by public servants in Congress and across the federal government, as well as believers in the nonprofit and private sectors, the state of open government data in the United States is stronger than it has ever been.

The DATA Act of 2014[19] has resulted in more and better open data on how taxpayer dollars are spent. Data.gov, Project Open Data, and other governmentwide efforts launched during the Obama administration have resulted in unprecedented public access to all sorts of data from all across the government. They have also facilitated internal access: government executives now have easier access to managerial data to inform their decisions.

That does not mean, however, that the job is done.

A common theme in our interviews was that, despite the great work done so far, the path toward standardized and freely-available government data is long and there is plenty of work left to do. Overall, the experts we spoke with were optimistic that standardization and publication would continue to improve in the coming years, but only if those who believe in the power of open data inside and outside the government continue to work hard. Many of our interviews touched on the need to improve data quality across diverse types of data, the importance of data standards to improve internal data sharing and management, and the need for committed leadership to move open data forward in the coming years.


From External Transparency to Internal Management

Overall, the biggest trend in government open data appears to be a growing recognition that open data reforms initially enacted to improve external transparency are now playing a major role in increasing internal management efficiencies.  

Our interviews unearthed some examples of this trend being made real at federal agencies. For example, White House OMB senior policy analyst Victoria Collin explained how federal chief financial officers, who have already carefully managed their financial data under the Chief Financial Officers Act, are now starting to see that data “linked very explicitly to other data sets, and are starting to see more broad management tools and implications and ideas for where we can go now that we’re able to connect these pieces of information more deeply and fundamentally.”

Justin Marsico, a senior policy analyst at the U.S. Treasury, Bureau of the Fiscal Service, explained how open data can help break down silos that all too often exist, even within agencies. “There’s this idea that people who work in Federal agencies already have access to their own agency data, [but] in my experience that just isn’t true,” he said during our interview. “You might have access to it in theory, but in practice putting it on a website means that you can do your job more efficiently, because you don’t have to be a data expert, you don’t have to design a query to run against the database, you don’t have to get some kind of report and figure out how to interpret it. You can just go to the database yourself.”[20]

The responses to our survey questions further clarified this shift in perspective. When asked to rank the significant benefits of open data (which the Data Foundation has categorized as transparency, management, and automated reporting) our government respondents prioritized the benefits related to internal management nearly equally with external transparency.

On the stage at Data Transparency 2017, acting federal chief information officer Margie Graves stressed the importance of standardized data for government management: “[O]pen data is not just a transparency exercise. It’s really integral to the management of the federal government itself.”[21] When we talked with Ms. Graves after her address, she stressed the need to focus on achievable standardization goals that can be built upon. Using data to drive decision making, she noted, will also lead to improvements in “the quality of the data itself, because once people start using it for data-driven decision, [they’ll] pay attention to what they put in the box.”[22]

With a new administration in charge of the federal government, there was initially some concern that open data would be de-prioritized and rolled back across the board. While there have been a few instances in which the Trump Administration has removed previously-available open data sets as part of shifts in policy direction or reorganization, there has been no wholesale rollback of the open data progress of the Obama Administration.[23]

In fact, the Trump Administration – particularly OMB director Mick Mulvaney and the new Office of American Innovation – has made modernizing government, in part through open data, one of its top priorities. Chris Liddell, an Assistant to the President in the Office of American Innovation, took the stage at Data Transparency 2017 to affirm the Trump administration’s commitment to data-driven governance and its focus on leveraging data to build more efficient and effective government services for American taxpayers. In particular, the administration has shown a deep understanding of, and support for, the DATA Act, reinforcing the importance of open data for internal efficiency as well as external transparency.[24]


Management Across Agencies

Open data is a powerful tool to make government more transparent to the people it is supposed to serve.[25] But, perhaps as importantly, it is increasingly a powerful tool to improve the way that government is managed on the inside.

A common theme throughout Data Transparency 2017 and our interviews was the ability of standardized, freely accessible data to break down longstanding silos between government agencies. Data standards, at their core, are about getting diverse groups of stakeholders to agree on what they’re talking about and how to talk about it.

Implementation of the DATA Act, which produced the only governmentwide data standard of any kind currently in use, has helped show the importance of open data for internal management across multiple agencies. For example, the Treasury Department’s new Data Lab, unveiled at Data Transparency 2017, offers analytical applications for spending data, newly-transformed under the DATA Act, to create comparisons across agencies.[26]

Dave Mader, who was heavily involved in DATA Act implementation in his former role as Controller of the Office of Management and Budget, explained how standardizing data under the DATA Act has helped federal managers. “We’re able now to look at that whole spend [across government] and start [asking:] what is that telling us as managers? Are we issuing grants to similar kinds of organizations in a particular state, city, or zip-code? … Are we giving grants to similar kinds of organizations from different government agencies? [We can] step back and say: is there a better way to affect the outcomes that we want from those disparate programs and bring them all together?”[27]


Data Use Leads to Data Quality

The need for improved data quality was another common theme among the experts we spoke to about the current state of government data. Although strides have been made with respect to standardization and publication – especially for spending information, via the DATA Act – the need to ensure consistent quality across a wide variety of government data sets is ongoing and requires plenty of work.

Luckily, open data, by its very use, can create incentives to improve its own quality. Phil Ashlock, chief architect at Data.gov, explained that a central portal like his own can serve as a funnel for feedback to agencies, helping them improve the data sets they have made public. Ashlock has seen, “incrementally, data set by data set ... agencies are responding to that need or interest from the public and improving it.”[28]

Several interviewees talked about how a threshold of better data for management, for transparency, and for reporting and compliance will encourage more use, ultimately leading to more improvements in quality. As Inmar vice president Stephen Ibach concluded, the “benefits [that] are going to emerge as a result of the combination of greater transparency and clearer management while embracing complexity… [include that] you'll start doing discovery you haven’t done before.”[29]

The relationship between better data quality and expanded data use is self-reinforcing. What may have initially been seen as separate aspects of the drive to transform government information from disconnected documents into open data are increasingly being seen as integral and connected pieces of the larger whole.


Still Needed: Leadership

The importance of strong leadership for the continued progress of open data was stressed by almost everyone we spoke to. Several of our interviewees explained that the technical challenges involved with standardizing and opening data are not particularly complicated, but the cultural challenges require strong leaders capable of using a soft touch when necessary.

A few days after Data Transparency 2017, Deputy Assistant Secretary of the Treasury Christina Ho retired from government service after serving as the Treasury Department’s lead implementer of the DATA Act.[30] Ho’s work to evangelize the benefits of open data for government spending information was widely credited as a key reason for the governmentwide mandate’s successful and timely implementation.[31]

When we spoke to Ms. Ho, she explained that getting stakeholders to focus on common goods is a key prerequisite to establishing effective data standards. “In order to do data standards you have to be willing to think common good, because a data standard is inherently about giving up something for the greater good, which means giving up the way you’re doing it right now” for future benefits.[32]

Our interviewees cited the success of the DATA Act as an example of the need for, and benefit of, leadership. It took tremendous effort, leadership, and persuasion to get a wide, diverse range of stakeholders to agree on data standards and processes to implement the DATA Act. As Office of Personnel Management chief data architect Marcel Jemio explained to us while discussing Ms. Ho’s DATA Act leadership, you “need someone…who can break out of the pack and provide the leadership needed to advance standards. Her leadership allows me to take advantage of opportunities available …”[33]


Data Standards Remain Crucial

Why are data standards so important? We dug deep into that question last year in the first edition of State of the Union of Open Data.[34] Ultimately, data standards provide several indispensable benefits: efficiency of production and consumption, increased comparability, increased consistency, and greater attractiveness for investment.

Thanks in large part to the DATA Act – under which the Treasury Department created the DATA Act Information Model Schema (DAIMS), our first-ever governmentwide data standard – many of the experts we spoke to expressed cautious optimism about the future development of data standards across the federal government.

However, those conversations made two important challenges clear. First, data standards are not developing as fast as many would like. Second, a number of vital programs that drive standardization are at risk due to lack of funding or unclear planning. Even the DATA Act, which was broadly seen as successful, was held up as an example of how difficult it is to develop data standards without additional funding.

At Data Transparency 2017, we heard from several public servants involved in the National Information Exchange Model (NIEM), a “common vocabulary that enables efficient information exchange across diverse public and private organizations.”[35] NIEM is used by organizations across levels of government and in a wide variety of domains. It eases data communication across diverse domains including emergency management, justice, surface transportation, agriculture, health, and many more.

Interviews with NIEM stakeholders often came back to the importance of the program for working across different levels of government and across communities as well as across domains. As Treasury solution architect Justin Stekervetz told us, NIEM can be viewed as community work, not merely government work, with the “most valuable user base [being] the state and local community.”[36] NIEM was cited as a program that eases sharing the burden of work, as well as data, at a time of tight budgets and reduced resources.

Unfortunately, our interviews with several individuals involved with NIEM raised concerns about its future, citing its lack of specific funding or a clear owner among the federal agencies. On a more positive note, those interested in NIEM had plenty of ideas about how, specifically given its cross-domain and cross-level nature, it could be funded and structured in the future.


Spotlight: The OPEN Government Data Act Moves Toward Passage


Interviewees expressed optimism that Congress would continue to show leadership on open data issues. That optimism came, in part, from the OPEN Government Data Act,[37] which both chambers of Congress passed (though in different forms) in 2017.[38] Further action on the OPEN Government Data Act is expected in 2018.

If enacted, the OPEN Government Data Act will enshrine the 2013 governmentwide Open Data Policy[39] into law. It will require all federal agencies to publish their information online, using non-proprietary, machine-readable data formats, encourage agencies to address data quality and access, write open data definitions into law, and create a statutory Chief Data Officer role.

Our interviews revealed that open data leaders in the executive branch were heartened by the passage of the OPEN Government Data Act. First, its substantive mandates create a structure and time frames for agencies to continue their transformations. Second, the bill expresses Congress’ approval of the long-term transformation from a government powered by paper documents into one that is designed to operate seamlessly on open data. If enacted, the OPEN Government Data Act will provide a touchpoint for executive-branch leaders’ evangelism within their agencies.

As one regulatory agency’s data policy manager explained, “it really builds on some of the things that [open data leaders are] trying to accomplish” in her agency, and across the government, giving them “more leverage to continue in this direction. It gives us a good indication that there is bipartisan support and that this administration as well is supporting some of those same goals.”[40]


Spotlight: The DATA Act Takes Effect

Last year, at Data Transparency 2016, the open data community was eagerly anticipating the first significant deadlines for implementation of the Digital Accountability and Transparency (DATA) Act of 2014. In May 2017, every agency began reporting standardized spending data to the Treasury Department, and Treasury combined their submissions to create the first unified data set covering the entire executive branch’s spending.[41]

As a result, at Data Transparency 2017, we were able to look back at the implementation process to celebrate its success and highlight some of the vital lessons learned as the federal government embraced the nation’s first open data law.

Data Transparency 2017 showcased some of the first uses of standardized spending data for both external transparency and internal management. The Treasury Department announced a new suite of analytical tools to help agencies and the public understand the newly-published federal data set. Treasury's new Data Lab, hosted on GitHub, offers powerful new visualizations of federal accounts, contractors, spending classifications, and the government's workforce, and also offers free access to the source code.[42]

Moreover, several interviewees spoke highly of the development process used to implement the DATA Act and build associated technical infrastructure. Agile development practices were utilized to ensure technical issues were caught early, that the product was continually improved, and that it met the needs of data users.

Treasury’s Justin Marsico explained how this played out in practice: “[W]hen it was time for agencies to submit [standardized spending] data, [they] had [already] done it. They had been practicing for months. The team had come up with different, early versions of the broker which accepts all the data from federal agencies and they’d been iterating on that for a long time, so there weren’t really any surprises when it came time for go time.”[43] DATA Act implementation could serve as a model for future governmentwide efforts to transform information domains beyond spending.

While the mood around DATA Act implementation is positive, many of those we interviewed were very clear that it is only the beginning. They stressed the need for continued expansion of the DATA Act Information Model Schema (DAIMS), a commitment to oversight by Congress, and ongoing leadership to ensure the spending data is fully utilized within the executive branch.

Many of those we talked to were optimistic about the path forward for the DATA Act, noting the speed with which the Trump administration affirmed its commitment to the DATA Act’s principles and highlighting the diverse group of legislators in Congress who continue to track the Act’s progress and conduct oversight.[44]

If the government is going to dole out $600 billion [in grants], then we must utilize the incredible technology we have at our disposal to streamline the federal grant reporting process and learn something from all [those] hardworking taxpayer[‘s] money.
— Congresswoman Virginia Foxx [45]

Standardized, structured data provides an enhanced payload for [blockchain, artificial intelligence, and other emerging technologies]. Unless you have a highly structured payload, those emerging technologies are simply carrying around the digital equivalent of stone tablets, which would be just as cumbersome to access and use as they are today without the emerging technologies.
— Mike Willis, Assistant Director, Office of Structured Disclosure, Division of Economic and Risk Analysis, Securities and Exchange Commission [46]

[I]f there [are] so many different [regulatory] actions, not just at the federal level, but at state and local levels, that companies need to keep their eye on, it’s a very costly thing to track them and comply with them. A data approach could simplify that, lowering the costs for all parties and even allowing the private sector to become more involved in informing regulators about the good or bad components of proposals before they become actual regulations.
— Patrick McLaughlin, Director, Program for Economic Research on Regulation, Mercatus Center, GMU [47]

Compliance Data

At Data Transparency 2017, the Data Foundation hosted a program track focusing on the transformation of compliance – defined as the information exchanged between government and other organizations – from document-centric to data-centric. By replacing compliance documents with submissions of open data, regulatory agencies can deliver transparency for the constituencies (such as investors) they serve, create new ways to manage and enforce regulatory mandates, and open up opportunities to automate formerly-manual reporting tasks.

SEC Division of Economic and Risk Analysis assistant director Mike Willis described “one big benefit” of transparency during our interview: “[W]ith more data comes more insight. That’s useful not only to filers [regulated by the SEC] but most importantly to investors.”[48]

Another of our interviewees expressed a belief that, once modernized through open data, the nature of compliance will completely change, because the sharing and submission of compliance information will become seamless, immediate, and automatic. The use cases for compliance information, once expressed as open data instead of documents, will explode, to the benefit of all who compile, collect, analyze, and use it.

As Inmar’s Stephen Ibach explained it, “[G]etting that greater transparency, that greater operational efficiency in reporting ... leads you to the conclusion that the end of reporting is happening, it’s really about constant discovery on your data for the benefit of your organization.”[49]

In the short term, our interviewees continue to see the need to adopt consistent data standards and to transform regulatory reporting, and other forms of compliance, from unsearchable documents and PDFs to searchable and standardized data.[50] They are also closely tracking the movement to embrace a single, open identifier for legal entities across formerly siloed reporting regimes and efforts to bring open data to federal grant reporting.[51]

The benefits of transforming compliance into open data are as attractive at the state level as at the federal level. Conference of State Bank Supervisors analytics director Paul Ferree described the results of his organization’s successful standardization of payment data reported by money transmission agents: the CSBS is now able to capture data on nearly every money transmission performed in the United States.[52]

Efforts to modernize compliance data were among the first open data efforts in the United States.[53] When the SEC embraced the eXtensible Business Reporting Language (XBRL) open data format for corporate financial reporting nearly a decade ago, it kicked off the movement for better, standardized compliance data and processes. Unfortunately, the path from there has not always been smooth. Because the SEC’s data structure was poorly-conceived, it proved difficult for public companies to learn and for investors and the SEC’s own staff to use.[54] The Data Foundation recently published a research report exploring this tumultuous history while setting out a positive vision for the future.[55]

The SEC has incrementally improved the quality of its existing open corporate data, and recently proposed the first expansion of open corporate data since its initial adoption of XBRL. Under this proposal, public companies would begin reporting the information contained on the cover pages of the main corporate financial reports as open data, instead of unstructured text.[56]

While a number of our interviewees expressed support for the SEC’s ongoing effort to transform its corporate compliance documents into data, one expressed skepticism about the SEC’s recent announcement. He expressed a belief that most users would find only limited use in the standardization of the cover pages of such documents, beyond what can already be gleaned using text mining and language processing techniques.[57]


Spotlight: An Open Data Future for Federal Grants


The federal government awards roughly $650 billion in grants every year. Grantees must report to federal agencies on their receipt and use of those funds. These reports are submitted as disconnected documents and have not yet been transformed into open data. As a result, grantees suffer heavy compliance costs to compile and transmit their grant reports, while grantor agencies and the constituencies they serve cannot comprehensively track or understand grant spending.

The potential for open data to help spur savings and performance improvement in federal grant reporting was mentioned by several of our interviewees. As the White House OMB’s Victoria Collin put it, “[T]he average state or local government depends on the federal government for approximately a third of its revenue and that comes most often in the form of federal grants. So any time you can automate processes that create administrative burden you’re having cost avoidances which dramatically increase the impact of every dollar we spent and allow us to achieve our mission all the more effectively across the federal government.”[58] Interviewees expressed optimism that if performance data is included in standardization efforts around grant reporting, significant savings and performance improvements will be possible through better analytics and evaluation.

During her closing keynote address at Data Transparency 2017, Congresswoman Virginia Foxx announced she would introduce legislation to adopt a governmentwide open data structure for all federal grant reporting.[59] The Grant Reporting Efficiency and Agreements Transparency (GREAT) Act will build on work done to standardize spending data under the DATA Act by including grant reporting within the existing federal spending data structure.

Dr. Foxx explained what she intended to accomplish: “This bill will ensure the modernization of reporting by grant recipients, by mandating a standardized data structure for the information that recipients must report to federal agencies...It assigns to the executive branch the task of establishing these data standards. The bill also creates goals for these new standards, including searchability, consistency with accounting principles, and a non-proprietary product. The bill will ensure the executive branch consults with the grant recipient community and software providers to accomplish this. We named it the GREAT Act for a reason. The results of the passage will be great for stakeholders, government agencies, job creators, grantees and grantors.”[60]


Spotlight: The Potential of Open Entity Identification

Consistently identifying the non-governmental entities that do business with, or are regulated by, the federal government is a major hurdle standing in the way of better compliance data, more transparent grant reporting, truly useful spending data, and oversight of all kinds.

Across government, there is no consistent system for entity identification. This makes it harder to quickly cross-reference entities across their interactions with the federal government. Without significant manual research, it is difficult to tell whether a given organization is regulated by multiple agencies or how it reports to those agencies. A common entity identifier could be used as a bridge across various systems. However, the DUNS Number, which is the federal government’s most widely used entity identification code outside the tax system, is proprietary and thus seriously limited as a potential bridge. To download or analyze DUNS data, users must pay for a license, and, even then, further dissemination is limited.[61]

Luckily, there is a global, open alternative gaining traction. The Legal Entity Identifier (LEI) was initially created in the wake of the financial crisis last decade to help firms and governments across the world do a better job managing their risk. More recently, it has been adopted across different sectors and areas of government. In the short term, the LEI can help agencies map between various existing entity identification schemes; eventually, it can replace them.[62]
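To make the "bridge" idea concrete, the following is a minimal sketch of how an open identifier could link records held under different agency-specific schemes. It relies on one fact about the LEI, namely that it is a 20-character alphanumeric code whose last two characters are check digits computed under ISO 7064 MOD 97-10. Everything else is an assumption made for illustration: the crosswalk table, the scheme labels, the local IDs, and the example LEI are all hypothetical and do not describe any agency's actual system.

```python
import string

# Character values used by MOD 97-10: 0-9 keep their value, A=10 ... Z=35.
_CHAR_VALUES = {c: str(i) for i, c in enumerate(string.digits + string.ascii_uppercase)}


def _to_number(text: str) -> int:
    """Expand an alphanumeric string into the integer used by MOD 97-10."""
    return int("".join(_CHAR_VALUES[c] for c in text))


def add_check_digits(base18: str) -> str:
    """Append the two MOD 97-10 check digits to an 18-character prefix."""
    if len(base18) != 18:
        raise ValueError("an LEI prefix must be 18 characters long")
    remainder = _to_number(base18 + "00") % 97
    return f"{base18}{98 - remainder:02d}"


def is_valid_lei(lei: str) -> bool:
    """Check length, alphabet, and checksum of a candidate LEI."""
    if len(lei) != 20 or any(c not in _CHAR_VALUES for c in lei):
        return False
    if not lei[18:].isdigit():
        return False
    return _to_number(lei) % 97 == 1


# Hypothetical LEI for illustration only; it is not a registered identifier.
EXAMPLE_LEI = add_check_digits("0000EXAMPLEENTITY1")

# Hypothetical crosswalk: (scheme, agency-specific ID) -> LEI.
CROSSWALK = {
    ("DUNS", "000000001"): EXAMPLE_LEI,
    ("AGENCY_B", "B-42"): EXAMPLE_LEI,
}


def same_entity(id_a: tuple, id_b: tuple) -> bool:
    """Two local IDs refer to the same entity if they map to the same LEI."""
    lei_a, lei_b = CROSSWALK.get(id_a), CROSSWALK.get(id_b)
    return lei_a is not None and lei_a == lei_b
```

With a table like this, a question that would otherwise require manual research ("is this grantee also regulated by another agency?") becomes a lookup, and because the identifier is open, the table itself could be published and reused.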


Spotlight: Standard Business Reporting Provides a Model

Angus Taylor, a Member of Parliament and Assistant Minister for Cities and Digital Transformation in Australia, was one of the first speakers to take the stage at Data Transparency 2017. He was there to tell the story of Australia’s successful effort to embrace a governmentwide Standard Business Reporting (SBR) structure.[63]

Minister Taylor explained how his country adopted a single data taxonomy across everything that companies report to government. This means Australian companies can fulfill multiple regulatory reporting requirements in one software environment. The effort saves Australian companies $1 billion a year in compliance costs.
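The mechanics of a single taxonomy can be illustrated with a short sketch. It is not a description of Australia's actual SBR taxonomy or software: the element names, agencies, and values below are hypothetical, chosen only to show how data recorded once against shared definitions can satisfy several reporting requirements.

```python
from typing import Dict, List

# Hypothetical shared taxonomy: each data element is defined once,
# with one expected type, and reused by every agency.
TAXONOMY: Dict[str, type] = {
    "legal_name": str,
    "total_revenue": int,
    "employee_count": int,
    "tax_withheld": int,
}

# Hypothetical reporting requirements, each a subset of the taxonomy.
REPORTS: Dict[str, List[str]] = {
    "tax_authority": ["legal_name", "total_revenue", "tax_withheld"],
    "statistics_bureau": ["legal_name", "total_revenue", "employee_count"],
}


def build_report(agency: str, records: dict) -> dict:
    """Assemble one agency's report from data the company recorded once."""
    report = {}
    for name in REPORTS[agency]:
        if name not in records:
            raise KeyError(f"missing element: {name}")
        if not isinstance(records[name], TAXONOMY[name]):
            raise TypeError(f"wrong type for element: {name}")
        report[name] = records[name]
    return report


# The company keeps a single set of records (invented values)...
company_records = {
    "legal_name": "Example Pty Ltd",
    "total_revenue": 1_200_000,
    "employee_count": 14,
    "tax_withheld": 85_000,
}

# ...and every required report is derived from it.
all_reports = {agency: build_report(agency, company_records) for agency in REPORTS}
```

Because both reports draw "total_revenue" from the same definition, the figure is entered once and cannot drift between filings, which is where the compliance savings come from.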

Some people have argued the United States is too big and complex to implement such a system, but we (authors Rumsey and Hughes), along with a number of our interviewees, disagree. Former OMB controller Dave Mader stressed how important it is to learn from best practices elsewhere, and he specifically cited the Australian example: “[T]he fact of the matter is that the government can benefit so much by learning from best practices whether they’re in the US [or elsewhere]…I made a comment that Australia is not quite the size of the US, but there are lessons learned…”[64]

SBR may have started there, but it’s coming here. The only question is how.

I think what we have to recognize in open data in the private sector is that it’s an extremely complex dynamic system. As a result of this complexity and the dynamic interaction between entities in the ecosystem you’ve got these opportunities for emergence from and between data that can really enhance the overall relationships across the ecosystem multiplying the value.
— Stephen Ibach, Vice President - Product Development and Head of Products, Inmar [65]

Private-sector data just keeps growing; it grows in different ways. It doesn’t seem like it’s at the point where it surpasses government [open] data on value, but it’s really close. That’s where the data is, so thinking about data sharing across the board makes a ton of sense.
— Adam Neufeld, Former Deputy Administrator, GSA [66]

Private-Sector Data

This year, for the first time, the Data Foundation began thinking beyond government and compliance. Data Transparency 2017 showcased companies choosing, voluntarily, to standardize and publish their operational information as open data, motivated by a combination of public interest and self-interest.

To be sure, the private sector had long taken a lead role in the push for open government data. Private sector firms incorporate government data into their own products and services, they build off government data to create improved products and services, they analyze government data to provide better intelligence and help their clients make better decisions, and much more. Private-sector business models built on open government data represent one of the main drivers of reforms like the DATA Act and the OPEN Government Data Act.

These efforts are not likely to stop, but the private sector is now beginning to embrace the open data transformation for its own information, as well as for the government’s. By voluntarily standardizing and sharing operational information as open data, private-sector companies can realize benefits similar to those realized in other sectors: transparency for stakeholders, decision-making resources for internal managers, and automation for formerly-manual reporting processes.

One interviewee expressed pessimism about the future of private-sector open data, arguing instead that companies were likely to identify their internal data as a valuable asset that should only be shared in exchange for some sort of payment.[67] But our other interviewees suggested that open data in the private sector, while nascent and fragile, is promising.

Inmar vice president Steve Ibach, who spoke to this point during his presentation at Data Transparency 2017 as well as throughout our interview with him, expressed optimism about the future of private-sector open data. He discerned a “shared value” inherent in the private-sector data ecosystem and argued that data sharing will become vital across a range of emerging sectors including smart cities, autonomous vehicles, the Internet of Things, and many more.[68]

A number of interviews explored the idea that, for the private sector, open data may be less about the wide, free, and open sharing of comprehensive information and more about finding the right nodes, connections, and relationships to extract untapped value from shared data for businesses and broader society. It won’t always make sense for businesses to keep their operational data close to the vest or demand payment in exchange for its use.


Sharing Private-Sector Operational Data

More and more, private-sector organizations are treating their own operational information, expressed as standardized data, as an asset. Like other assets, such data might traditionally be considered proprietary and only shared in exchange for payment, or under restricted conditions. However, some companies are beginning to see the value in standardizing and sharing such data more broadly, for a variety of reasons.

First, in some cases, private-sector firms can build customer relationships by sharing operational information as open data. For example, OpenCorporates makes its database of information about businesses available for public browsing, while charging for high-volume downloads.[69] Customers might first browse the free option, then later choose to pay for a high-volume data feed. Yelp provides access to its API under similar conditions.[70]

A second reason is to improve products and services. OpenCorporates solicits the public’s feedback on its freely available database and makes corrections based on that feedback – improving the quality of the company’s main product.[71] Waze, through a partnership with OpenDataSoft, shares data from its app on accidents and road blockages with municipal governments; the governments, in turn, share data about planned roadwork projects with Waze. Waze’s product is improved by the addition of roadwork data.[72]

AHIMA Senior Director of Federal Relations Lauren Riplinger expressed optimism that open data can be useful in the healthcare arena, explaining: “I think that open data as a whole, particularly when we see the cost of healthcare still continuing to rise…offers a lot of opportunity in going about and thinking about how do we reduce those costs? What are the pain points? And from a consumer perspective, thinking about how can I potentially shop for different aspects of my healthcare?”[73]

Third, some private-sector firms share their data for philanthropic purposes – and for the positive public attention that results. For example, Google has worked to make election information, including data on the location of polling places, more readily available to citizens.[74]

In that vein, private-sector data can prove immensely valuable for governments, nonprofits, and other civically oriented organizations.

As Jon Sotsky, Strategic Investment Officer at the Knight Foundation, explained in our interview, “[C]orporate data could [be] utilized by government, funders, and nonprofits seeking to identify trends in communities where they operate and assess the effectiveness of their strategies and programs. Essentially, company data help the social sector evaluate impact as well as set strategy.” He cited applications related to urban planning, social service delivery, and more.[75]


Other Open Data Activities

Beyond sharing their operational information as open data, private-sector firms engage in related activities, such as developing open data standards and building open data communities of practice. Yelp’s development and promotion of the LIVES data standard for public health inspections (see Spotlight) is one example. A second example is the creation of blockchain-based ecosystems in which many participants, connected via a distributed ledger, share the challenge of maintaining a complex data set, as the startup TruSet plans to do.[76]


Spotlight: Yelp’s Open Data Work


Yelp, best known for its user reviews of local businesses, has worked closely with partners in local governments, healthcare providers, public interest researchers, and more to ease information sharing and build out the range of data it can offer to users.

In his presentation at Data Transparency 2017, Laurent Crenshaw, director of public policy at Yelp, highlighted three ways that the company is working with governments and nonprofits to organize, collect, and publish important information as open data.

Perhaps most impressively, Yelp partnered with cities on a data standard – the Local Inspector Value-Entry Specification (LIVES) – that makes it easier for cities to share restaurant inspection data with Yelp. Thanks to this data feed, Yelp can now help you avoid a big lunchtime mistake.

Expanding beyond the restaurant reviews it was first known for, Yelp worked with the California Health Care Foundation (CHCF) and Cal Hospital Compare to display maternity care measures for the roughly 250 hospitals that deliver babies in California.

Yelp also partners with ProPublica to incorporate health care statistics and consumer opinion survey data onto the Yelp business pages of more than 25,000 medical treatment facilities, including hospitals and nursing homes. Thanks to these partnerships, consumers can easily combine government data with feedback from other consumers to make more informed healthcare decisions.

Yelp’s direct users are not the only beneficiaries of these activities. Publishing health inspection results, maternity care statistics, and consumer feedback can also create pressure on restaurants, hospitals, and nursing homes to improve performance, and help health departments evaluate the impacts of their policies and enforcement.

Once we make open data a standard practice we’ll have a more responsive government. I think that’s why you’ve seen buy in from both Democrats and Republicans. Because we’re really focused on empowering positive change for our government and for citizens and for innovators.
— Representative Derek Kilmer (D-WA)


The State of the Union of Open Data is strong, but the road to fulfilling its promise is long. The task of creating a government and private sector that run on open, standardized data instead of paper documents will never really be over. We must be prepared to harvest the future benefits of open data that we cannot even dream of today.

To ensure data is as useful and available as possible, new efforts at standards-building will need to be undertaken and data publication will need to proceed with its potential users in mind. Public servants in all areas of government will need to learn that open data is not only useful for the sake of transparency, but can also help them do their jobs more efficiently and effectively.

Standardized data that is more widely available will drive benefits across government and the private sector. The three pillars of transparency, management, and compliance build on each other, helping all to thrive and accruing benefits that would not exist if just one were prioritized. These benefits pay dividends for government agencies, private-sector firms, and most importantly, American citizens.

Setbacks are inevitable, but they cannot be seen as failure. This is especially clear in the realm of compliance data, where years of effort will eventually lead to stronger data around grants; financial data that is more easily automated; a simple, open way to identify entities that interact with the federal government; and, ultimately, a system of Standard Business Reporting that will streamline how businesses comply with regulatory requirements and save billions of dollars.

Finally, new areas and applications must be cultivated. The private sector is starting to appreciate the benefits of open data, not simply from the government but also within its own information assets, and is exploring ways to combine both types of data for expanded insight and new use cases. The most valuable examples of this should be identified and championed.

We are on the road to a society – government, compliance, and private sectors – that operates more efficiently and seamlessly, thanks to open data.


Participant List

We thank all the government officials and policymakers, industry leaders, experts, and advocates who agreed to interview with us or provide their perspectives for this project.

●      Phil Ashlock, Chief Architect,

●      Victoria Collin, Senior Policy Analyst, Office of Federal Financial Management, White House Office of Management and Budget

●      Rob Cook, Director, Technology Transformation Service, General Services Administration

●      Laurent Crenshaw, Director of Public Policy, Yelp

●      Ren Essene, Manager, Policy Section of the Chief Data Office, Consumer Financial Protection Bureau

●      Paul Ferree, Data Analytics Product Director, Conference of State Banking Supervisors

●      Anthony Fung, Deputy Secretary of Technology, Commonwealth of Virginia

●      Margie Graves, acting Chief Information Officer of the United States

●      Rachel Han, Director of Business Development, OpenDataSoft

●      Nate Haskins, Chief Data Officer, S&P Global Market Intelligence

●      Christina Ho, Founder, Policy Insights

●      Steve Ibach, Vice President – Product Development, Inmar

●      Marcel Jemio, Chief Data Architect, Office of Personnel Management

●      Representative Derek Kilmer

●      Dave Mader, Chief Strategy Officer, Civilian Sector, Deloitte

●      Justin Marsico, Senior Policy Analyst, Department of the Treasury

●      Patrick McLaughlin, Director, Program for Economic Research, Mercatus Center, George Mason University

●      Adam Neufeld, former Deputy Administrator, General Services Administration

●      Lauren Riplinger, Senior Director, Federal Relations, AHIMA

●      Dominic Sale, Deputy Associate Administrator, Information Integrity and Access, General Services Administration

●      Jon Sotsky, Director, Strategy & Assessment, Knight Foundation

●      Justin Stekervetz, Solution Architect, Office of Accounting Policy and Financial Transparency, Department of the Treasury

●      Chris Traver, Senior Advisor for Information Sharing, Administration for Children and Families, Department of Health and Human Services

●      Mike Willis, Assistant Director, Office of Structured Disclosure, Division of Economic and Risk Analysis, Securities and Exchange Commission



[1] Alison Gil, Adam Hughes, and Hudson Hollister, Data Foundation and Grant Thornton, The State of the Union of Open Data 2016, October 2016 available at (“State of the Union of Open Data 2016”).

[2] See id.; see also The 8 Principles of Open Government Data, Sebastopol, December 8, 2007,

[3] See Hudson Hollister, Joseph Kull, Michael Middleton, and Michal Piechocki, Data Foundation and PwC, Standard Business Reporting: Open Data to Cut Compliance Costs, March 2017, available at (“SBR Report”); Scott Straub and Matt Rumsey, Data Foundation and LexisNexis, Who is Who and What is What? The Need for Universal Entity Identification in the United States, September 2017, available at (“LEI Report”); Marc Joffe, Data Foundation, Open Data for Financial Reporting: Costs, Benefits, and Future, September 2017, available at (“XBRL Report”).

[4] Interview with Marcel Jemio, September 26, 2017.

[5] Interview with Victoria Collin, November 2, 2017.

[6] Interview with Nate Haskins, October 17, 2017.

[7] See Frank Landefeld, Jamie Yachera, and Hudson Hollister, Data Foundation, The DATA Act: Vision and Value, July 2016, available at (“DATA Act Vision & Value”); Dave Mader, Tasha Austin, Christina Canavan, Dean Ritz, and Matt Rumsey, Data Foundation and Deloitte, DATA Act 2022: Changing Technology, Changing Culture, May 2017, available at (“DATA Act 2022”).

[8] See XBRL Report, supra note 3.

[9] See Carl Bialik, “Yelp’s Local Economic Outlook: Using Data to Track Small Business Opportunity & Economic Health in America,” Yelp Blog, October 17, 2017,

[10] See DATA Act Vision & Value, DATA Act 2022, supra note 7.

[11] See SBR Report, supra note 3.

[12] See Private-Sector Data discussion, infra.

[13] See Government Accountability Office, DATA Act: OMB, Treasury, and Agencies Need to Improve Completeness and Accuracy of Spending Data and Disclose Limitations, Report No. GAO-18-138, November 2017, available at Government Accountability Office, available at, Figure 1 (“Operation of the DATA Act Broker”); see also DATA Act Vision & Value, DATA Act 2022, supra note 7.

[14] See SBR Report, supra note 3.

[15] See Caroline Dumortier, “The OpenDataSoft platform allows Schneider Electric data scientists to focus on value-add analysis,” OpenDataSoft, July 25, 2017,

[16] Chris Liddell, White House Keynote: Public Sector Reforms for Private Sector Growth (speech), September 26, 2017, available at (video).

[17] Interview with Phil Ashlock, September 26, 2017.

[18] Margie Graves, Plenary: Modernizing Government Management with Data (speech), September 26, 2017, available at (video) (“Graves Plenary Address”).

[19] Digital Accountability and Transparency Act of 2014, Public Law No. 113-101 (May 9, 2014),; see also Data Coalition, “DATA Act” (website), (accessed December 2, 2017); DATA Act Vision & Value, DATA Act 2022, supra note 7.

[20] Interview with Justin Marsico, October 18, 2017.

[21] Graves Plenary Address, supra note 18.

[22] Interview with Margie Graves, September 26, 2017.

[23] Danny Vinik, “What happened to Trump’s war on data?”, July 25, 2017,

[24] The Trump administration has, however, overhauled a number of websites, removed certain documents, and pulled a few previously-available open data sets offline as part of shifts in policy direction or reorganization. See Andrew Bergman and Toly Rinberg, “In its first year, the Trump administration has reduced public information online,” Sunlight Foundation blog, January 4, 2018,

[25], About, (accessed January 13, 2018).

[26] See Amy Edwards and Justin Marsico, Government Data Demo: the DAIMS In Action (presentation), September 26, 2017, available at (video and slide deck); Department of the Treasury, “Data Lab at,” (accessed November 30, 2017) (providing management and transparency tools using standardized federal spending data) (“Data Lab”).

[27] Interview with Dave Mader, October 17, 2017.

[28] Interview with Phil Ashlock, September 26, 2017.

[29] Interview with Stephen Ibach, November 7, 2017.

[30] Meredith Somers, “‘Hero’ of federal management, DATA Act implementation leaving Treasury,” Federal News Radio, September 29, 2017,

[31] Id.

[32] Interview with Christina Ho, October 13, 2017.

[33] Interview with Marcel Jemio, September 26, 2017.

[34] State of the Union of Open Data 2016, supra note 1 (data standardization section).

[35] National Information Exchange Model, (accessed January 13, 2018).

[36] Interview with Justin Stekervetz, October 4, 2017.

[37] OPEN Government Data Act (115th Congress), S. 760 (introduced March 29, 2017), H.R. 1770 (introduced March 29, 2017); see also Data Coalition, “OPEN Government Data Act,” (accessed January 13, 2018).

[38] See Data Coalition, “Senate Passes OPEN Government Data Act” (media release), September 18, 2017,; Data Coalition, “OPEN Government Data Act Passes the House for the First Time” (media release), November 15, 2017,

[39] Office of Management and Budget, Memorandum No. M-13-13, Open Data Policy: Managing Information as an Asset, May 9, 2013, available at; Executive Order No. 13642, 78 Fed. Reg. 28111 (2013), available at

[40] Interview with Ren Essene, November 7, 2017.

[41] See Hudson Hollister, Data Coalition Blog, “This data set took six years to create. Worth every moment.,” May 9, 2017, (“Worth Every Moment”).

[42] Data Lab, supra note 26.

[43] Interview with Justin Marsico, October 18, 2017.

[44] Dave Mader identified a number of key policymakers who are invested in the future success of the DATA Act and similar programs, including Rep. Mark Meadows, Rep. Gerry Connolly, Sen. James Lankford, Sen. Mark Warner, and Comptroller General Gene Dodaro. Interview with Dave Mader, October 17, 2017.

[45] Representative Virginia Foxx, Closing Keynote: An Open Data Transformation for Federal Grants (speech), September 26, 2017, available at (video) (“Foxx Keynote”).

[46] Interview with Mike Willis, October 2, 2017.

[47] Interview with Patrick McLaughlin, October 17, 2017.

[48] Interview with Mike Willis, October 2, 2017.

[49] Interview with Stephen Ibach, November 7, 2017.

[50] See XBRL Report, supra note 3.

[51] See LEI Report, supra note 3.

[52] Interview with Paul Ferree, November 3, 2017 (“[T]he only way we’re not collecting transaction data from a money transmitter is if they’re not licensed in at least one of those 18 states, and if you think about the nationwide money transmitters, they’re going to be licensed in all the states, so we’ve got all the big guys. We have pretty close to a nationwide number of money transmissions every quarter, and that’s the first time such a number has existed”).

[53] See XBRL Report, supra note 3 (“The SEC's decision was the U.S. government's most ambitious open data project until the DATA Act was enacted, five years later”).

[54] Id.

[55] Id.

[56] Securities and Exchange Commission, FAST Act Modernization and Simplification of Regulation S-K (Proposed Rule), Release Nos. 33-10425, 34-81851, IA-4791, IC-32858; File No. S7-08-17 (October 11, 2017), available at; see also Hudson Hollister, “SEC Proposes to Transform Corporate Cover Pages from Documents into Data,” Data Coalition Blog, October 16, 2017,

[57] Interview with Nate Haskins, October 17, 2017.

[58] Interview with Victoria Collin, November 2, 2017.

[59] Foxx Keynote, supra note 45.

[60] Id.

[61] See Hudson Hollister, “To fix federal procurement, dump the DUNS number,” The Hill, December 6, 2016,

[62] See LEI Report, supra note 3.

[63] Angus Taylor MP, International Keynote (speech), September 26, 2017, available at (video) (“Taylor Keynote”).

[64] Interview with Dave Mader, October 17, 2017.

[65] Interview with Stephen Ibach, November 7, 2017. Inmar manages transaction and point-of-sale systems for retailers and manufacturers.

[66] Interview with Adam Neufeld, October 17, 2017.

[67] Interview with Laurent Crenshaw, November 3, 2017.

[68] Stephen Ibach, Spotlight: Private-Sector Data for Public Good (speech), September 26, 2017, available at (presentation) (“Ibach Presentation”).

[69] See OpenCorporates, “What Makes OpenCorporates Data Special,” (accessed January 13, 2018); see also Chris Taggart, Panel on Corporate Data Philanthropy (panel discussion), September 26, 2017, available at (presentation) (“Taggart Presentation”).

[70] Interview with Laurent Crenshaw, November 3, 2017.

[71] Id.

[72] Rachel Han, Panel on Corporate Data Philanthropy (panel discussion), September 26, 2017, available at (presentation) (“Han Presentation”).

[73] Interview with Lauren Riplinger, October 20, 2017.

[74] Interview with Laurent Crenshaw, November 3, 2017.

[75] Interview with Jon Sotsky, October 23, 2017.

[76] TruSet, (accessed January 13, 2018).