Agencies trying to find their ‘dark data’ face policy, leadership hurdles
Most IT managers agree finding and capturing dark and grey data should be a top priority, but antiquated policies and lack of senior-level support remain major hurdles.
Dark data describes all the unknown and therefore unused data across an agency, while grey data is known but unused.
San Francisco-based software company Splunk released a survey of 1,357 IT managers on April 30 that found 56 percent of public sector data is assumed to be dark or grey. While 77 percent of public sector respondents said locating and using that data was paramount, 76 percent said lack of support from senior agency leadership was a challenge.
The Federal Deposit Insurance Corp. is wrestling with how to more efficiently and securely share all the personally identifiable information it collects from banks — short of reengineering its systems.
“It’s not magic,” Howard Whyte, CIO and chief privacy officer at FDIC, said Tuesday at the Splunk GovSummit. “You have to set a policy, and you have to go out and market the capabilities you’re trying to deliver and show value to the corporation.”
First, FDIC has to model its data and then look at automation, Whyte said.
That’s easier said than done when 82 percent of public sector IT managers identified a mistrust of artificial intelligence and a lack of knowledge about what can be automated within their agencies, according to the Splunk report.
Whereas machine learning helps identify patterns to get value out of grey data, AI is the key to finding and analyzing dark data, Frank Dimina, vice president of public sector at Splunk, told FedScoop.
“But AI won’t work if we’re not supplying it with massive data sets to make the technology smart,” Dimina said. “And I think when the dust settles, that’s when we’ll see some really interesting use cases and success stories.”
FDIC has the luxury of owning its own data center, so it’s now able to map where its data sets are and make decisions about consolidating data for better usage and security. The agency has also started running analytics on the data where it resides to do work faster for banks, identify data misuse cases and flag them for immediate action, Whyte said.
Ensuring leadership understands the importance of investing in such capabilities is critical, he added.
The Joint Special Operations Command is grappling with who should have access to its data, said Col. Carl “Jeff” Worthington, director of C4 system within JSOC.
“Inside JSOC we have the Dothraki — those are the Rangers, we send them out ahead — we have the Tullys, and we have the Karstarks, and we have the Knights of the Vale,” Worthington said, alluding to groups of characters on the popular TV show “Game of Thrones.” “And no one really wants to tell everyone everything because there’s danger in that, when you open yourself up and you show them your cards, so it’s been a struggle at times.”
Worthington controls data sharing within JSOC IT operations but said even within his organization it’s hard getting people to understand the value of sharing data.
When it comes to making dark and grey data actionable, IT and cybersecurity are “low-hanging fruit,” Dimina said.
“But what they really should be using it for is to impact the mission of government — to make smarter investments at the public level, to have more openness with citizens, to deliver better services to their citizens, to improve the security of the nation,” he said.
The U.S. Postal Service is using its data to monitor the health of its applications and proactively go after cyberthreats by monitoring for excessive failed password attempts.
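As a rough illustration of what monitoring for excessive failed password attempts can look like, the Python sketch below counts failed logins per account inside a sliding time window and flags accounts that exceed a threshold. It is not the Postal Service's implementation, which the article does not describe; the function name `flag_excessive_failures`, the event format, the five-failure threshold and the 10-minute window are all assumptions made for the example.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta

# Hypothetical thresholds, chosen only for illustration.
MAX_FAILURES = 5
WINDOW = timedelta(minutes=10)


def flag_excessive_failures(events, max_failures=MAX_FAILURES, window=WINDOW):
    """Return accounts with more than max_failures failed logins in any window.

    events: iterable of (timestamp, account, succeeded) tuples,
    assumed to be sorted by timestamp.
    """
    recent = defaultdict(deque)  # account -> timestamps of recent failures
    flagged = set()
    for ts, account, succeeded in events:
        if succeeded:
            continue  # only failed attempts count toward the threshold
        failures = recent[account]
        failures.append(ts)
        # Discard failures that have aged out of the sliding window.
        while ts - failures[0] > window:
            failures.popleft()
        if len(failures) > max_failures:
            flagged.add(account)
    return flagged
```

Fed a time-ordered log of login events, the function returns the set of accounts that crossed the threshold, which an operations team could then route to an alert or a lockout step. Real deployments would typically read events from a log platform rather than an in-memory list.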
And the Department of Homeland Security’s DevOps team is tracking code check-ins, which occur when developers upload code for administrative review, not only to gauge efficiency but also to count the number of bugs.
Another sign agencies recognize dark and grey data are a problem is the OPEN Government Data Act’s requirement that they appoint a chief data officer (CDO), Dimina said. While many agencies have done so, not all have one yet.
FDIC is considering appointing a CDO.
“And that person should own that framework — the data architecture,” Whyte said. “I’m not saying that everyone can’t use data, but someone has to be responsible for sharing it through the organization and making sure it’s providing the value that’s needed.”