
Controlled sandboxes and open data: A look inside GSA’s AI-themed hackathon

Agency leaders seek AI solutions for protecting websites as “hackers” pick apart Data.gov.
GSA Administrator Robin Carnahan delivers a speech during AITalks on April 18, 2024, in Washington, D.C. (Scoop News Group photo)

On a normal day at the General Services Administration, agency staffers interested in using artificial intelligence-based platforms must submit a request and wait for approval to proceed. On Wednesday, those platforms were freed during the agency’s inaugural AI hackathon, a three-city competition that saw public and private sector participants attempt to “hack” federal agency websites and build out solutions for prizes.

The event, held in Washington, D.C., Atlanta and New York City and co-sponsored by Microsoft and OpenAI, was part of the agency’s efforts to get the federal government to open up authoritative data and set it up for usability, GSA Administrator Robin Carnahan told reporters in D.C.  

That data focus was evident in one participant's hack of GSA's Data.gov, which houses the government's open, machine-readable datasets, and in conversations FedScoop had with GSA officials throughout the day. For an agency that has embraced generative AI tools to a significant degree, AI and data transparency go hand in hand, starting with the temporarily unblocked environment set up for Wednesday's event.

Building tall walls around a sandbox


For the hackathon, GSA Chief AI Officer Zach Whitman said the agency had unblocked certain AI tools from its network — including OpenAI — to ensure access for participants. The GSA offered hackers the ability to use open-source products and encouraged them to “run stuff locally, wherever they can.” If participants were blocked from a tool, the agency found a workaround.

Dave Shive, GSA’s chief information officer, told reporters during the event that while the hackathon felt “like it’s using open tooling,” the work being done was in a “tightly controlled sandbox,” which Whitman characterized as a “segmented infrastructure” so there would be no privacy concerns or agency risks. 

Protecting data and privacy were top priorities, Shive said. 

“We build those walls around what we’re doing,” Shive said. “We want to be thoughtful about risk; we’re risk managers first here. But then we also know that our agency partners are watching this very closely, and they have an expectation that we’re going to be creating the playbooks to be able to drive innovative and emerging technologies across the larger federal enterprise.”

In light of AI’s emergence not only in public spaces but also in government use, Whitman told FedScoop that the agency’s mindset is that “people will use these tools via desk swiveling or they’ll bring their own. … It’s just something that is now becoming the norm.”


The GSA’s policy, implemented around a year ago, allows agency staff to apply to use AI tools, according to Whitman. After an individual requests a tool, a security team unblocks the platform once it is cleared as safe to use, with guardrails in place.

Those guardrails include limiting tools to public information and non-sensitive use cases, while excluding use for internal work products. Whitman noted that “we don’t have clearance there yet.”

Following the release of President Joe Biden’s AI executive order and Office of Management and Budget guidance on AI, Whitman said the agency now has a permanent directive where “anyone at GSA can apply to use generative AI tools for these non-sensitive use cases.” 

Whitman reported that the agency expanded the directive to include applications for procurement use cases, research and development, and production workflow. 

Production workflow “is where we have the highest level of scrutiny because we need to make sure that we are very clear about what can go in and how this thing can go in,” Whitman said. “Ultimately, all of that needs to be funneled up to our AI inventory and published to the public, and we need to be very transparent with how that works.”


Transparency in data

While the final OMB memo for the Foundations for Evidence-Based Policymaking Act of 2018 has still not been issued to agencies five years after the law’s implementation, Whitman told FedScoop that Data.gov depends on the data agencies submit for statutory compliance.

“You think about [the missing memo] and then on top of that, has the landscape shifted so much that they need to start thinking about other stuff?” Whitman said. For that reason, Data.gov was a selected website for hackathon participants to target. The GSA wanted to see “what kind of ideas could we bring to bear that would make this stuff programmatically accessible but also understandable,” Whitman said.

Whitman called for DCAT-US, the standard metadata specification for describing datasets and APIs in agencies’ data inventories (formerly known as the Project Open Data Metadata Schema), to be updated, as Data.gov’s datasets also adhere to the standard.
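For readers unfamiliar with the standard, DCAT-US catalogs are JSON documents, typically published at an agency's /data.json endpoint. As a rough illustration only (the field names below come from the published Project Open Data Metadata Schema v1.1; the example values are invented), a dataset entry and a minimal required-field check might look like:

```python
# Illustrative sketch: a minimal dataset entry in the style of the
# DCAT-US / Project Open Data Metadata Schema (v1.1), plus a simple
# check for several of the schema's required fields. Example values
# are invented; they do not describe a real dataset.

REQUIRED_FIELDS = {
    "title", "description", "keyword", "modified",
    "publisher", "contactPoint", "identifier", "accessLevel",
}

dataset = {
    "title": "Example Agency Dataset",
    "description": "An invented dataset used to illustrate the schema.",
    "keyword": ["example", "open data"],
    "modified": "2024-04-18",
    "publisher": {"name": "Example Agency"},
    "contactPoint": {"fn": "Jane Doe", "hasEmail": "mailto:jane.doe@example.gov"},
    "identifier": "example-agency-dataset-001",
    "accessLevel": "public",
}

def missing_required(entry: dict) -> set:
    """Return any required fields absent from a dataset entry."""
    return REQUIRED_FIELDS - entry.keys()

print(missing_required(dataset))  # prints set() when all required fields are present
```

Inconsistent or missing metadata like this is exactly what makes search across agency submissions hard, since, as Whitman notes below, Data.gov cannot control the input of data.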

“The schema is old — DCAT-US is not new, needs updating,” Whitman said. “How much can they feasibly do with that metadata to make for a better search experience is really difficult; they have a really hard job because they can’t control the input of data.”


Data.gov’s issues, Whitman said, represent “a macro version, blown out of all the complexity of dealing with interagency cooperation.” Other agency sites also cannot control the input of data, but Whitman said issues are typically “an internal organizational problem.” 

With those challenges in mind, Whitman said he was “really excited” to see what hackathon participants could do with AI to exploit and then improve Data.gov. With another hackathon “already in the works,” according to Shive, the GSA and its chief AI officer are committed to leveraging the technology going forward in more data-centric ways. 

“Hopefully this hackathon … [will bring] this conversation to people’s mind in terms of, there might be a new future that we need to prepare for,” Whitman said. “It might not look like five years ago and might not look like today. It might be totally different.”

Written by Caroline Nihill

Caroline Nihill is a reporter for FedScoop in Washington, D.C., covering federal IT. Her reporting has included the tracking of artificial intelligence governance from the White House and Congress, as well as modernization efforts across the federal government. Caroline was previously an editorial fellow for Scoop News Group, writing for FedScoop, StateScoop, CyberScoop, EdScoop and DefenseScoop. She earned her bachelor’s in media and journalism from the University of North Carolina at Chapel Hill after transferring from the University of Mississippi.
