
Open data movement needs standardization — experts

The open data movement has grown in leaps and bounds during the Obama administration, but efforts to standardize that data are only in their infancy, experts say.

Waldo Jaquith, former director of the nonprofit U.S. Open Data and now a member of 18F, notes the problem succinctly in a new report called “The State of the Union of Open Data.”

“Currently, open data practices are appallingly crude,” he writes. “A list of 100 core types of government data would find that for more than 90, there exists no standard schema.”

The Data Foundation and Grant Thornton Public Sector, co-publishers of the report released Wednesday, interviewed more than 40 people who presented at the Data Transparency 2016 conference, including members of Congress, agency leaders, and open data experts and advocates. One of the report’s main conclusions is that progress on data standardization “has been limited and incremental.”


[Read more on that conference and the first White House Open Data Innovation Summit here: Open data’s journey through an administration]

“For many valuable information resources beyond spending and corporate finance, there is no clear responsibility to standardize,” the report notes. “Government can and should fix that by appointing more standard-setters.”

Jaquith’s assessment of the open data movement so far presents a pretty grim reality and calls for “standards, schemas and infrastructure.” In the report, Jaquith notes that government still lacks a central data set of its data repositories and inventories, metadata for many of its data sets, and implemented data portability practices.

“The open data ecosystem is less an ecosystem and more a collection of hacks and workarounds, dependent as they are on data otherwise trapped within ‘enterprise-grade’ software built by companies with a financial interest in locking agencies into their multi-million dollar software by keeping the data within from getting out,” Jaquith writes.

Jaquith notes there are projects working to address these issues, and they have had some success with private-sector and academic data sharing, but “they have not made any impact on the governmental practice of open data, nor is there any sign that they will soon.”


Data Foundation interim President Hudson Hollister told FedScoop that while many interviewed for the report were enthusiastic about open data, there were some who had become cynical even while embedded in the movement themselves.

“There’s some cynicism about open data for sure; it was a trendy topic five years ago, and it is not new anymore,” he said. “Some of our interviewees, even though we were interviewing people who are in the open data movement, some of them reflected that cynicism and said that ‘open data is just a buzzword.’”

“And I think we can explain that cynicism by pointing to the lack of standardization,” he added.

Several experts quoted in the report, from industry and government alike, note the importance of standards.

“Standards are crucial because, otherwise, data remain in silos,” said Robin Carnahan, director of state and local practice at 18F. “Good standards reduce the workload on people and save a lot of time through automation.”


Bryce Pippert, principal at Booz Allen Hamilton, said that “publishing data is not sufficient for value creation. We need data that can freely move and be understood – and doing that well requires standardization.”

But Brandon Pustejovsky, chief data officer at the United States Agency for International Development, said government is still a step away from standardization.

“Standardization is extremely important, but we are still at an intermediary step — namely, ensuring the broad availability and machine-readability of our data on a routine basis,” Pustejovsky said. “Standardization will become more of a priority as we get these initial steps in place. But when moving forward, we can’t allow the perfect to be the enemy of the good.”

The Digital Accountability and Transparency Act is one standardization effort that shows promise, but it is limited to spending data. Hollister notes that the effort itself isn’t perfect.

He also mentioned the recently announced U.S. Data Federation as another step in the right direction as it encourages agencies to look at good standardization examples. But he calls it a “baby step,” because “it doesn’t actually set up a new standardization effort.”


“The report says that we need to have more standards setters, like what the DATA Act did,” he told FedScoop. “The DATA Act says ‘there’s going to be a consistent data standard for spending. And here’s who’s in charge of it — everyone’s got to do what they say.’ That kind of policy change is needed in these other areas.”

The U.S. Data Federation launched in late September as a place data publishers can look for examples of successful standardized multi-agency data initiatives.

Future plans for the federation include tools to help publishers coordinate their efforts and use a preferred data standard, and a maturity model to monitor the progress of some of these initiatives, Philip Ashlock, chief architect of Data.gov, told FedScoop in a recent interview.

“The concept … of data federation is basically how do you coordinate among multiple data publishers so that you can pull all the data together in one place so that it’s sort of one cohesive whole?” Ashlock said then. “So this gets around sort of the concept of data standardization, or just the basic coordination of how information is published.”

[Read more: Meet the U.S. Data Federation: A new hub for standardized, coordinated open data]


Hollister did note that developing schemas for open data efforts takes work.

“A schema, it’s a mature standard,” he said. “It’s when everyone gets together and they figure out here are the data fields and here are how all the data fields relate to each other, and here are the organization methods maintaining it. The schema — that’s maturity.”
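
To make the idea concrete, the sketch below is a hypothetical illustration, not drawn from the report or any official standard, of what a minimal shared schema and a conformance check might look like in Python. The field names, types, and rules are assumptions chosen only for the example.

```python
# Hypothetical sketch of a minimal shared schema for a spending record.
# Field names, types, and rules are illustrative assumptions, not an
# official DATA Act or agency schema.

SPENDING_RECORD_SCHEMA = {
    "award_id":    {"type": str,   "required": True},   # unique award identifier
    "agency_code": {"type": str,   "required": True},   # issuing agency
    "amount_usd":  {"type": float, "required": True},   # obligation amount
    "fiscal_year": {"type": int,   "required": True},   # e.g. 2016
    "recipient":   {"type": str,   "required": False},  # awardee name
}

def validate(record: dict, schema: dict = SPENDING_RECORD_SCHEMA) -> list:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    for field, rule in schema.items():
        if field not in record:
            if rule["required"]:
                problems.append("missing required field: " + field)
            continue
        if not isinstance(record[field], rule["type"]):
            problems.append(field + " should be " + rule["type"].__name__)
    return problems

# Two publishers using the same schema can merge their data without
# writing per-agency translation code.
print(validate({"award_id": "A-123", "agency_code": "US-AID",
                "amount_usd": 50000.0, "fiscal_year": 2016}))
# -> []
```

The point of the sketch is the coordination it represents: once every publisher agrees on the fields and how they relate, the same validation and aggregation code works across agencies.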

On Jaquith’s comments, Hollister noted that “outside of spending, what he is saying is that for the most important government information we don’t have standards.”

Written by Samantha Ehlinger

Samantha Ehlinger is a technology reporter for FedScoop. Her work has appeared in the Houston Chronicle, Fort Worth Star-Telegram, and several McClatchy papers, including Miami Herald and The State. She was a part of a McClatchy investigative team for the “Irradiated” project on nuclear worker conditions, which won a McClatchy President’s Award. She is a graduate of Texas Christian University. Contact Samantha via email at samantha.ehlinger@fedscoop.com, or follow her on Twitter at @samehlinger.