The Federal Reserve may need to move quickly to ensure AI doesn’t get further ahead of regulation. (Photo by Brooks Kraft/Getty Images)
On April 17, the Federal Reserve issued SR 26-2, its revised guidance on model risk management. The document updates SR 11-7, the framework that has governed how banks validate and oversee their models since 2011. It is the first major revision in fifteen years. And buried in the scope section is a sentence that should alarm anyone paying attention to how banks are deploying AI: “Generative AI and agentic AI models are novel and rapidly evolving. As such, they are not within the scope of this guidance.”
Read that again. The Federal Reserve just updated the rulebook for model risk in banking and excluded the single fastest-growing category of model deployment. Goldman Sachs is building autonomous agents powered by Anthropic’s Claude to handle trade accounting. Lloyds Banking Group expects agentic AI to generate £100 million in value this year. JPMorgan Chase has more than 400 AI use cases in production. The governance framework that applies to all of them, as of two days ago, says it does not cover the technology they are deploying.
A framework built for a different kind of model
SR 11-7 was designed for statistical models with defined inputs, fixed logic, and stable outputs. Credit scoring models. Value-at-risk calculations. Loan loss forecasting. These models operate within known parameters. They can be validated by testing whether their outputs match expectations. They do not change their own behavior between review cycles. As GARP’s analysis of the framework notes, the validation approaches SR 11-7 emphasizes (conceptual soundness assessments, outcomes analysis, benchmarking) assume the model’s structure and behavior remain stable. For a model that recalibrates autonomously or adapts based on ongoing interaction, these tools lose effectiveness.
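To see why, consider what outcomes analysis actually does. A minimal sketch in Python, with hypothetical default-rate figures: the test compares a model’s predictions against realized results over a review window, and it is only informative if the model that produced those predictions is the same model running today.

```python
# Minimal sketch of SR 11-7-style outcomes analysis. The premise is that
# the model under review is frozen between reviews; an agent that
# recalibrates itself mid-quarter breaks that premise.
import numpy as np

def outcomes_analysis(predicted, realized, tolerance=0.02):
    """Flag the model if mean prediction error exceeds tolerance."""
    return abs(predicted.mean() - realized.mean()) <= tolerance

# Quarterly review of a hypothetical credit default model.
predicted_pd = np.array([0.031, 0.029, 0.033, 0.030])  # predicted default rates
realized_pd = np.array([0.032, 0.030, 0.031, 0.034])   # observed default rates
print("within tolerance:", outcomes_analysis(predicted_pd, realized_pd))
```

A validator can run that check once a quarter precisely because nothing about the model changes in between. The moment the system adapts with each interaction, every quarterly result describes a model that no longer exists.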
Agentic AI operates on a fundamentally different basis. These systems receive goals, not instructions. They make decisions about how to achieve those goals, interact with external systems, and modify their approach based on results. Goldman’s autonomous agents for trade accounting and client onboarding interpret data, make judgment calls, and execute transactions with no fixed decision tree. Goldman also deployed Devin, an autonomous software engineering agent, across its developer workforce. The entire architecture of model risk management (validation, testing, documentation, change control) assumes you can examine what a model does before it does it. With agentic AI, the behavior emerges at runtime.
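The contrast is easiest to see side by side. A schematic sketch, not any bank’s actual system: the traditional model is a fixed function whose every branch can be enumerated before deployment, while the agent is a loop whose action sequence exists only at runtime. The `llm.decide` planner and the `tools` registry here are hypothetical placeholders, not a real API.

```python
# Schematic contrast between a validatable statistical model and an
# agent loop. `llm` and `tools` are hypothetical stand-ins.

def credit_model(income: float, debt: float) -> str:
    # Fixed logic: a validator can test every branch before deployment.
    return "approve" if debt / income < 0.4 else "decline"

def agent_loop(goal: str, llm, tools: dict, max_steps: int = 10) -> list:
    # Emergent logic: which tools run, in what order, with what arguments,
    # is decided at runtime and can differ on every invocation.
    history = [goal]
    for _ in range(max_steps):
        action, args = llm.decide(history)   # hypothetical planner call
        if action == "done":
            break
        history.append((action, args, tools[action](**args)))
    return history
```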
What banks are deploying right now
The scale of adoption is hard to overstate. JPMorgan Chase has increased its technology budget to roughly $18 billion annually, with a significant portion going to its OmniAI platform. The bank moved from pilot projects to 400 production use cases by early 2026. Lloyds expects agentic AI to add £100 million in value by automating fraud investigations and complex complaints.
The industry is moving from what the World Economic Forum describes as AI “assistance” to “transactional authority.” These systems are no longer summarizing reports or flagging anomalies for human review. They are settling trades, processing compliance checks, and making credit decisions. Citigroup, Morgan Stanley, Bank of America, and HSBC are embedding AI across middle and back-office operations, from trade reconciliation to fraud monitoring. According to NVIDIA’s 2026 State of AI in Financial Services survey, 61% of financial firms are already using or assessing generative AI, and 42% are using or assessing agentic AI specifically. Agentic AI is already embedded in core banking operations.
The risk nobody is modeling
Individual bank deployments carry their own risks: hallucination, data leakage, unauthorized actions, bias. Deloitte’s analysis of the MIT AI Risk Repository identifies more than 350 distinct risks that can arise from autonomous or agentic behavior, many of which pose direct threats to banking systems. But the larger concern is what happens when multiple banks’ agents interact in the same market at the same time.
This is not a theoretical concern. We have a precedent. On May 6, 2010, the flash crash erased roughly $1 trillion in market value in under five minutes. The cause was not a single rogue algorithm. It was the interaction of thousands of algorithmic trading systems digesting and responding to the same market data simultaneously. Research after the crash found that supposedly independent liquidity-providing algorithms had become statistically correlated, creating systemic interdependence that nobody had mapped. Early warning signs, including a rising frequency of “mini flash crashes” in individual securities throughout 2009 and 2010, were visible in hindsight but went unmonitored in real time.
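The statistical check that would have caught this is not exotic. A hedged sketch using synthetic data rather than real market records: measure the average pairwise correlation between the net order flows of strategies presumed to be independent. When nominally independent flows are dominated by a common signal, the number climbs toward one.

```python
# Sketch of correlation monitoring across "independent" strategies,
# run here on synthetic order-flow data.
import numpy as np

def mean_cross_correlation(order_flows: np.ndarray) -> float:
    """order_flows: (n_strategies, n_intervals) of signed net order flow.
    Returns the mean off-diagonal correlation across strategies."""
    corr = np.corrcoef(order_flows)
    n = corr.shape[0]
    return corr[~np.eye(n, dtype=bool)].mean()

rng = np.random.default_rng(0)
common = rng.normal(size=500)      # a shared market signal
flows = np.array([0.8 * common + 0.2 * rng.normal(size=500)
                  for _ in range(5)])  # five "independent" strategies
print(f"mean cross-strategy correlation: {mean_cross_correlation(flows):.2f}")
```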
Agentic AI amplifies every dimension of that risk. The 2010 algorithms followed fixed rules. Agentic systems adapt. They learn. They respond to the same market signals with behavior that can change from one interaction to the next. Recent research proposes an Agentic Financial Market Model that maps how agent design parameters (autonomy depth, execution coupling, infrastructure concentration) translate into market-level outcomes. The conclusion: rapid, potentially coordinated autonomous financial movements across multiple institutions could trigger market volatility or liquidity crises. The risk compounds because these agents share underlying model architectures. If five major banks all deploy agents built on the same large language model, responding to the same data, the diversity of market participants that normally dampens shocks is replaced by correlated behavior at machine speed.
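A toy illustration of that last point, invented for this article rather than drawn from the cited Agentic Financial Market Model: if many agents wrap the same base model, they effectively apply the same decision rule to the same signal, so one shock can flip them all at once.

```python
# Toy illustration of infrastructure concentration; the thresholds and
# agent counts are invented. Agents sharing a base model share one rule.
import random

def fraction_selling(signal: float, n_agents: int, shared: int) -> float:
    shared_threshold = 0.6   # one rule, reused across `shared` agents
    sells = 0
    for i in range(n_agents):
        threshold = shared_threshold if i < shared else random.uniform(0.3, 0.9)
        sells += signal > threshold
    return sells / n_agents

random.seed(42)
shock = 0.65                 # identical market data reaches every agent
print("diverse market:      ", fraction_selling(shock, 100, shared=0))
print("concentrated market: ", fraction_selling(shock, 100, shared=80))
```

The heterogeneous thresholds stand in for the diversity of human-designed strategies; collapsing most of them onto one shared rule is what turns an ordinary signal into a coordinated move.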
The regulators know. They are not ready.
The Bank of England’s Financial Policy Committee said in its April 2026 record that advanced AI is not yet being used in ways that present systemic risk in UK finance, but acknowledged that risks “could increase rapidly” as firms push into agentic AI. The Bank is running scenario analysis and simulations, looking at herding in financial markets, concentrated dependence on a small number of AI service providers, and the possibility that autonomous systems could start affecting financial decisions at scale. The FCA has warned banks that rapid adoption of agentic AI could expose consumers and the financial system to new risks as major UK lenders prepare customer-facing pilots.
In the US, the picture is less coordinated. The Fed’s decision to explicitly exclude agentic AI from SR 26-2 is defensible on narrow grounds: the technology is evolving too fast for prescriptive guidance to keep pace. But the practical effect is that the most consequential AI deployments in banking sit outside the model risk framework that examiners use to evaluate banks. When a Fed examiner walks into Goldman or JPMorgan to assess model risk, the guidance they carry does not apply to the agents actually making decisions.
What a governance framework would need to look like
The current gap is not about banning agentic AI or slowing adoption. It is about the absence of supervisory tools designed for systems that learn, adapt, and interact. A fit-for-purpose framework would need to address at least three things traditional model risk management does not. First, continuous validation rather than periodic review, because an agent that recalibrates autonomously can change its behavior between examination cycles. Second, interaction testing, because the systemic risk is not in any single agent but in the emergent behavior when multiple agents from different institutions operate in the same market. Third, concentration monitoring, because if the same underlying model architecture powers agents at five or six major banks, the illusion of diversified decision-making masks correlated exposure.
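Of the three, concentration monitoring is the most straightforward to sketch. A minimal, hypothetical example: track a Herfindahl-style index over which base models power the agents a supervisor can see. The provider names and shares below are invented for illustration.

```python
# Hypothetical concentration monitor over base-model providers.
# Shares are illustrative, not market data.

def hhi(shares: dict[str, float]) -> float:
    """Herfindahl-Hirschman index on 0-1 shares; 1.0 = one provider."""
    return sum(s ** 2 for s in shares.values())

agent_base_models = {"provider_a": 0.55, "provider_b": 0.30, "provider_c": 0.15}
print(f"provider concentration: {hhi(agent_base_models):.2f}")
# On this normalized scale, values above roughly 0.25 are treated as
# highly concentrated in antitrust practice.
```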
None of these capabilities exist in current regulatory practice. The Bank of England’s approach of scenario analysis and simulation is the closest anyone has come, but it remains exploratory. The US has no equivalent program. As Hogan Lovells observed in their analysis of the regulatory landscape, generative and agentic AI is entering financial markets faster than existing governance can adapt. Current frameworks assume static algorithms and one-time validations. Large language models and multi-agent trading systems learn continuously, exchange latent signals, and exhibit emergent behavior.
The banks themselves recognize the problem, at least privately. Risk officers at major institutions have acknowledged that their internal model validation teams were built to assess credit models and pricing engines, not autonomous systems that interact with live markets in real time. The skillsets are different. The tooling is different. The cadence of review that works for a quarterly recalibrated credit model is meaningless for an agent that adapts with every transaction it processes. Some banks are building internal AI governance functions from scratch, but these efforts are fragmented, inconsistent across institutions, and invisible to regulators who lack the examination framework to evaluate them.
The 2010 flash crash cost $1 trillion in five minutes and led to a decade of regulatory reform for algorithmic trading. The agents being deployed today are more autonomous, more interconnected, and less predictable than the algorithms that caused that crash. The difference is that in 2010, the algorithms were at least operating within a governance framework that, however inadequate, applied to them. The agents being deployed in 2026 operate in a space the regulator has formally acknowledged it does not yet cover. That is the gap. And it will stay open until something forces it closed.
