Agencies face continued challenges making security data ready for machine learning

While agencies have advanced capabilities to monitor and analyze user behavioral data, they need added analytics and ML support to improve cyber resilience, according to a new report.

By FedScoop Staff

October 25, 2021


The ability of federal agencies to harness artificial intelligence and machine learning to identify anomalous behavior on their networks depends increasingly on having robust data gathering, preparation and analytics capabilities in place.

Though a large majority of federal IT and agency leaders polled in a new FedScoop study say their agencies have above-industry-average capabilities to monitor, collect and analyze behavioral data across their networks, using that data for machine learning (ML) remains a significant challenge, especially for identifying and responding to anomalous behaviors.


Chief among their challenges, according to the survey, are a lack of experience and requisite skills in training and testing machine learning algorithms; a lack of adequate tools to perform all the work in processing data, as well as a lack of clarity about which tools and services in the marketplace meet their ML needs; and a lack of reliable ML-ready data to work with.

This FedScoop study, released this week, surveyed 160 prequalified IT and program executives at large, medium and small federal agencies to explore the state of their data analytics and machine learning capabilities. The study also identified the obstacles agencies continue to face across the life cycle of gathering, processing and analyzing data. And the study looked at the types of services agencies are turning to for greater support. The survey was conducted online in August and September 2021 and underwritten by Cloudera.

Among other findings:

ML challenges vary by agency size — 4 in 10 respondents at large agencies (10,000-plus employees), which tend to deal with larger-scale data challenges, cited a lack of adequate ML-related skills as a top challenge, compared to 2 in 10 respondents at small agencies (fewer than 1,000 employees), which are often still ramping up ML efforts or rely more on third parties.

Conversely, 1 in 3 respondents at small agencies cited a lack of adequate tools among their biggest challenges, compared to fewer than 1 in 4 at large agencies. And more than twice as many respondents at small and mid-size agencies struggle with a lack of reliable ML-ready data, compared to their counterparts at large agencies.

Skills gaps across ML process — Respondents at agencies of all sizes say they face significant deficiencies in skills across the data processing and machine learning life cycle, from data ingestion, extraction, transformation and loading to analysis, ML training and operationalizing ML. The study suggests those deficiencies are hampering the ability of agencies to implement zero-trust models and establish greater cyber resiliency.

Agencies appear to have the data they need — There was positive news in the study, which found that agencies have the capabilities required to monitor, process, store and analyze behavioral data about users, devices and applications operating on their networks, with more than 2 in 3 respondents saying those capabilities meet or exceed industry- and NIST-accepted standards. What was less clear, the report said, was how fully or effectively agencies are harnessing those capabilities.

Reliance on external support — While federal IT leaders indicate they have the capabilities to handle anomalous behavior data, a sizeable portion also report they’re opting to tap the expertise of external service providers at every stage of the data-gathering-to-ML process. Agencies most often seek help with data analytics and with data integration and production, but there’s also high demand for help with ML governance and ML training.

The study also touched on other dimensions of data readiness, including:

Agencies’ capability to securely gather data at the edge of their networks as well as across their network environments.
Where agencies are storing their ML production data.
The extent to which agencies are relying on open-source solutions versus in-house and commercial solutions to prepare their ML data.

“While federal IT leaders maintain their agencies have the capabilities to ingest, prepare and analyze data, they still need help harnessing those capabilities to leverage machine learning in order to better detect and respond to anomalous behavior on their networks,” the study concluded.

Additionally, the deficiency in skills across most ML-related data processing stages — and the rapid evolution of data management and ML tools — suggest agencies “would benefit from moving to more modern, integrated platforms for ingesting and analyzing behavioral data to improve cyber resilience. They would also likely achieve zero trust frameworks faster by engaging with service providers specializing in modernized data and ML solutions that can spot anomalous behaviors,” the report said.

Download the full report, “Data analytics readiness for cyber resilience,” for detailed findings and guidance on improving data gathering, preparation and analytics for improved threat detection.

This article was produced by FedScoop and sponsored by Cloudera.
