Agencies face continued challenges making security data ready for machine learning

While agencies have advanced capabilities to monitor and analyze user behavioral data, they need added analytics and ML support to improve cyber resilience, according to a new report.

By FedScoop Staff

October 25, 2021

The ability of federal agencies to harness artificial intelligence and machine learning to identify anomalous behavior on their networks increasingly depends on having robust data gathering, preparation and analytics capabilities in place.

Though a large majority of federal IT and agency leaders polled in a new FedScoop study say their agencies have above-industry-average capabilities to monitor, collect and analyze behavioral data across their networks, trying to use that data for machine learning (ML) remains a significant challenge, especially for identifying and responding to anomalous behaviors on their networks.

Chief among their challenges, according to the survey, are a lack of experience and requisite skills in training and testing machine learning algorithms; a lack of adequate tools to perform all the work in processing data, as well as a lack of clarity about what tools and services in the marketplace meet their ML needs; and a lack of reliable ML-ready data to work with.

This FedScoop study, released this week, surveyed 160 prequalified IT and program executives at large, medium and small federal agencies to explore the state of their data analytics and machine learning capabilities. The study also identified the obstacles agencies continue to face across the life cycle of gathering, processing and analyzing data. And the study looked at the types of services agencies are turning to for greater support. The survey was conducted online in August and September 2021 and underwritten by Cloudera.

Among other findings:

ML challenges vary by agency size: 4 in 10 respondents at large agencies (10,000-plus employees), which tend to deal with larger-scale data challenges, cited a lack of adequate ML-related skills as a top challenge, compared to 2 in 10 respondents at small agencies (fewer than 1,000 employees), which are often still ramping up ML efforts or rely more on third parties.

Conversely, 1 in 3 respondents at small agencies cited a lack of adequate tools among their biggest challenges, compared to fewer than 1 in 4 at large agencies. And more than twice as many respondents at small and mid-size agencies struggle with a lack of reliable ML-ready data, compared to their counterparts at large agencies.

Skills gaps across the ML process: Respondents at agencies of all sizes say they face significant skills deficiencies across the data processing and machine learning life cycle, from data ingestion, extraction, transformation and loading through analysis, ML training and operationalizing ML. The study suggests those deficiencies are hampering the ability of agencies to implement zero-trust models and establish greater cyber resiliency.

Agencies appear to have the data they need: There was positive news in the study, which found that agencies have the capabilities required to monitor, process, store and analyze behavioral data about users, devices and applications operating on their networks, with more than 2 in 3 respondents saying those capabilities meet or exceed industry- and NIST-accepted standards. What was less clear, the report said, was how fully or effectively agencies are harnessing those capabilities.

Reliance on external support: While federal IT leaders indicate they have the capabilities to handle anomalous behavior data, a sizeable portion also report they’re opting to tap the expertise of external service providers at every stage of the data-gathering-to-ML process. Agencies most often seek help with data analytics and with data integration and production, but there’s also high demand for help with ML governance and ML training.

The study also touched on other dimensions of data readiness, including:

Agencies’ capability to securely gather data at the edge of their networks as well as across their network environments.
Where agencies are storing their ML production data.
The extent to which agencies are relying on open-source solutions versus in-house and commercial solutions to prepare their ML data.

“While federal IT leaders maintain their agencies have the capabilities to ingest, prepare and analyze data, they still need help harnessing those capabilities to leverage machine learning in order to better detect and respond to anomalous behavior on their networks,” the study concluded.

Additionally, the deficiency in skills across most ML-related data processing stages — and the rapid evolution of data management and ML tools — suggest agencies “would benefit from moving to more modern, integrated platforms for ingesting and analyzing behavioral data to improve cyber resilience. They would also likely achieve zero trust frameworks faster by engaging with service providers specializing in modernized data and ML solutions that can spot anomalous behaviors,” the report said.

Download the full report, “Data analytics readiness for cyber resilience,” for detailed findings and guidance on improving data gathering, preparation and analytics for better threat detection.

This article was produced by FedScoop and sponsored by Cloudera.
