OPM drops Claude, adds Grok and Codex to AI use disclosure
The Office of Personnel Management removed Claude and added Grok and Codex in an update to its public disclosure of AI use cases dated Wednesday.
Removal of Claude comes after a disagreement between its maker, Anthropic, and the Department of Defense over the technology’s guardrails culminated in President Donald Trump issuing a governmentwide ban on the company late last week. In the days since, numerous federal agencies, including OPM, have moved to stop using Anthropic’s services.
While the changes to the disclosure were made at the same time, Grok and Codex were not added as the result of Claude’s removal, OPM spokeswoman McLaurine Pinover said in an emailed response to FedScoop. The human capital agency is “constantly working to provide the best tools to the OPM workforce. These initiatives were already underway,” Pinover said.
According to the new inventory, the “first production use” for both tools is listed as the first quarter of 2026. Pinover confirmed that date references the calendar year rather than the fiscal year. Grok, a product of Elon Musk’s xAI, is listed as in production, and Codex, a coding-specific AI tool from OpenAI, is being deployed in a sandbox phase, which generally describes a controlled testing environment.
OPM also added several other systems that deploy AI to its public disclosure, including Wiz, Zendesk, Waze, Google Maps, and the Apple iPhone. All of those use cases were backdated to quarters prior to the current year, and all appear to be commercial-off-the-shelf systems rather than custom applications.
Removal of Claude from the inventory comes after the agency, like others, said it stopped using Anthropic services earlier this week. According to a statement from Pinover at the time, the agency was “still in the initial steps of implementing the tool” and didn’t expect it to affect OPM functions.
Claude was listed on the agency’s previous disclosure as a tool “used across OPM for summarization, drafting, and decision support” and was in a sandbox phase.
Agency use case inventories are an annual accounting of planned, deployed and retired applications of AI in federal agencies. They were initially created by the first Trump administration, enshrined into statute by Congress, and generally improved under the Biden administration. The public inventories exclude most research-and-development uses, those related to national security, and any within the Department of Defense.
While OPM has a public inventory, it doesn’t follow the standard format used by many other agencies’ disclosures, including in the way it reports risk classifications. For example, Grok and Codex are both described as “Low-impact” and others are listed as “Medium-impact,” but neither category is defined in the Trump administration’s governance memo for AI.
In response to FedScoop questions, Pinover said the agency plans to modify both the format and the risk management classifications.
Grok in government
The addition of Grok is notable given the chatbot’s public missteps and pushback from some organizations against its use in government. While there are examples of xAI’s presence in federal agencies, it’s still not a common sight in agency use case disclosures.
Grok infamously faced backlash for producing racist and antisemitic responses, including calling itself “MechaHitler.” While Musk, at the time, said xAI had made improvements to the tool, Grok came under fire again in January for producing sexualized images of women and children on X, formerly known as Twitter.
OPM’s disclosure follows a recent statement from the Department of the Treasury that it’s exploring Grok’s use.
In response to questions about the Anthropic ban, a Department of the Treasury spokesperson told FedScoop earlier this week that software engineers at the department were using Codex and Google Gemini, and testing Grok, in lieu of Claude Code.
At least two other agencies have known use cases for xAI’s products. The Department of Energy and the Department of Health and Human Services both disclosed applications for the company’s services or Grok specifically in their inventories.
DOE’s Lawrence Livermore National Laboratory was piloting Grok, describing the outputs as: “General answers to questions, summarized documentation, document creation, general research.”
HHS, meanwhile, listed “xAI gov” for a couple of use cases on its inventory of services that are commercial-off-the-shelf. Those uses were “scheduling and managing social media posts using AI” and “generating first drafts of documents, briefing, or communication materials using AI.”