Author(s):

Matteo Aquilina | Bank for International Settlements (BIS)
Douglas Araujo | Central Bank of Brazil
Gaston Gelos | Bank for International Settlements (BIS)
Taejin Park | Bank for International Settlements (BIS)
Fernando Pérez-Cruz | Bank for International Settlements (BIS)

Keywords:

market dysfunction, liquidity, arbitrage, artificial intelligence, large language models, financial stability

JEL Codes:

G14, G15, G17

This policy brief is based on “Harnessing artificial intelligence for monitoring financial markets,” BIS Working Paper, No 1291/2025. This paper does not necessarily represent the views of the Bank for International Settlements or the Central Bank of Brazil.

Abstract

Predicting financial market stress has long proven to be a largely elusive goal. Advances in artificial intelligence and machine learning offer new possibilities to tackle this problem, given their ability to handle large datasets and unearth hidden nonlinear patterns. However, financial supervisors need not only good forecasts of market stress, but also tools that explain what drives emerging risks. This policy brief introduces a framework combining recurrent neural networks with large language models to address both needs. The neural network forecasts an indicator of stress periods while assigning transparent, time-varying weights to each input variable, revealing which market indicators matter most at any moment and addressing the “black box” problem in machine learning. These weights then direct automated searches of financial news to identify relevant economic narratives behind the statistical signals. The model identified, out-of-sample and three months in advance, elevated risks before the March 2023 banking turmoil and October 2023 Treasury stress while pinpointing underlying drivers. Our approach demonstrates how transparency can be built into predictive systems, making them both powerful and interpretable for real-time supervisory monitoring.

Policy motivation

Predicting financial market stress has long been a highly desirable yet elusive goal for researchers, policymakers and market participants. Despite advances in econometric modelling, anticipating episodes of market dysfunction remains extraordinarily challenging. The rarity of severe stress events limits the training data available for statistical models. Moreover, risks materialise through unique and often nonlinear transmission mechanisms across markets, which makes it difficult to identify common patterns across different crisis episodes (Brunnermeier and Oehmke (2013)). Traditional early warning systems have shown mixed success, often suffering from high false positive rates or failing to capture novel sources of systemic risk. These fundamental challenges have motivated researchers to explore alternative approaches, including machine learning techniques.

Recent advances in artificial intelligence (AI) and machine learning (ML) have opened new possibilities for financial stability monitoring. Machine learning models can process high-dimensional datasets and uncover complex nonlinear patterns that traditional econometric methods might miss. However, these models often operate as “black boxes,” making it difficult for supervisors to understand why a model flags certain periods as risky or which specific market developments deserve attention, an issue that is the focus of ongoing research and discussion (Aldasoro et al (2025)).

This policy brief presents a two-stage framework that addresses both challenges: first, a neural network model that forecasts market stress while transparently revealing which variables drive each forecast; second, a large language model (LLM) that searches and digests contemporaneous textual information to explain the economic narrative behind the statistical signals.

Monitoring market frictions: the TAP deviation

We demonstrate this approach using deviations from triangular arbitrage parity (TAP) in the Euro-Yen currency pair. In well-functioning markets, the direct exchange rate between two currencies should equal the indirect rate obtained by trading through a vehicle currency (the US dollar). Any difference can be quickly and easily arbitraged away by market participants. Therefore, occasions when the direct and indirect rates for the same pair of currencies diverge noticeably signal frictions in one of the world’s most liquid markets, a “canary in the coal mine” for broader dysfunction.
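The parity condition described above can be illustrated with a small sketch. The brief does not spell out the exact formula, so the definition used here (absolute gap between the direct and the implied rate, in basis points of the implied rate) is an assumption, as are the function names and the quoting convention (EUR/USD in dollars per euro, USD/JPY in yen per dollar).

```python
def implied_eur_jpy(eur_usd: float, usd_jpy: float) -> float:
    """Indirect EUR/JPY rate obtained by trading through the US dollar.

    Assumed quoting convention: eur_usd is USD per euro, usd_jpy is yen per USD.
    """
    return eur_usd * usd_jpy


def tap_deviation_bps(eur_jpy: float, eur_usd: float, usd_jpy: float) -> float:
    """Absolute TAP deviation in basis points of the implied rate.

    This is an illustrative definition, not necessarily the one used in the paper.
    """
    implied = implied_eur_jpy(eur_usd, usd_jpy)
    return 1e4 * abs(eur_jpy - implied) / implied
```

Under this definition, a reading near zero means the direct and indirect rates agree, while a persistently large reading points to frictions that arbitrageurs are not removing.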

Our model forecasts the 20-day average of TAP deviations over a window that begins 60 business days ahead. In effect, it predicts roughly three months in advance the market conditions prevailing over a one-month window. The inputs comprise over 100 daily financial variables, including currency spreads, equity volatility indices, government bond yields and forward exchange rates across major economies. The goal is to “cast a wide net” to pick up relevant signals from different corners of the economy.
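As a rough illustration of this forecast target, the sketch below builds, for each day, the average of a daily series over a 20-day window starting 60 business days later. The exact window alignment is an assumption, and the function name `forward_window_target` is hypothetical.

```python
from typing import List, Optional


def forward_window_target(
    series: List[float], horizon: int = 60, window: int = 20
) -> List[Optional[float]]:
    """For each day t, average series[t + horizon : t + horizon + window].

    Days for which the full future window is not yet available get None.
    Assumes `series` contains business-day observations only.
    """
    targets: List[Optional[float]] = []
    for t in range(len(series)):
        start = t + horizon
        end = start + window
        if end > len(series):
            targets.append(None)  # future window runs past the sample
        else:
            targets.append(sum(series[start:end]) / window)
    return targets
```

The last `horizon + window - 1` observations have no target, which is why the most recent days of any sample cannot be used for training.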

How the model works: transparent predictions

Unlike conventional neural networks, our model incorporates a dynamic variable weighting mechanism. At each point in time, the model assigns time-varying weights to input variables, automatically identifying which indicators matter most for predicting future market conditions (Araujo (2025)). Crucially, these weights are directly observable to supervisors, providing transparency about the model’s decision-making process.
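To make the idea of observable, time-varying weights concrete, here is a minimal sketch of one possible gating step. It is not the architecture of Araujo (2025), which the brief does not detail. It assumes that relevance scores for each variable are produced elsewhere in the network and normalised with a softmax, so the weights are non-negative and sum to one. All names are hypothetical.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax: non-negative weights that sum to one."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gate_inputs(
    inputs: List[float], scores: List[float]
) -> Tuple[List[float], List[float]]:
    """Weight the input variables observed at a single time step.

    `scores` stands in for the output of a learned layer (an assumption here).
    Returns the weights, which can be logged and inspected, and the weighted
    inputs, which would be passed on to the rest of the network.
    """
    if len(inputs) != len(scores):
        raise ValueError("inputs and scores must have the same length")
    weights = softmax(scores)
    weighted = [w * x for w, x in zip(weights, inputs)]
    return weights, weighted
```

Because the weights are an explicit intermediate output, they can be stored at every date and shown to a supervisor alongside the forecast.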

This design serves two purposes. First, the weights themselves provide early signals of shifting market dynamics. When the model suddenly increases the weight on a particular variable, supervisors can investigate why that indicator has become more relevant. Second, when the model forecasts heightened stress, the weights identify precisely which variables to examine for potential explanations.
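The first use of the weights, spotting sudden increases, could be monitored along the following lines. This is a sketch under assumptions: the threshold of 0.10, the variable names and the function name are illustrative and not taken from the paper.

```python
from typing import Dict, List, Tuple


def flag_weight_jumps(
    weight_history: Dict[str, List[float]], threshold: float = 0.10
) -> List[Tuple[str, int, float]]:
    """Return (variable, date index, increase) for each day-on-day weight
    increase larger than `threshold`.

    `weight_history` maps a variable name to its weights over time.
    """
    flags: List[Tuple[str, int, float]] = []
    for name, weights in weight_history.items():
        for t in range(1, len(weights)):
            change = weights[t] - weights[t - 1]
            if change > threshold:
                flags.append((name, t, change))
    return flags
```

For example, with a history such as `{"equity_vol_index": [0.10, 0.10, 0.35]}`, the function flags the jump at the third date, giving a supervisor a concrete variable and date to investigate.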

Out-of-sample performance

Out-of-sample testing on 3.5 years of data (2021–2024) demonstrates the model’s practical value. The model successfully identified periods of market stress, with dynamic weights highlighting variables subsequently confirmed as relevant to actual events. Notably, the model signalled elevated risks before the March 2023 banking turmoil, even though it had not been trained on any data after end-2020 (Graph 1). By contrast, episodes such as the market anomalies around the onset of Covid-19 were essentially not forecast by the model, consistent with the origins of that event lying entirely outside the financial system.

The model exhibits a particularly useful characteristic: its predictions are less volatile than both the actual market data and simpler autoregressive benchmarks. This means that supervisors can have greater confidence that when the model forecasts dysfunction, it is likely to materialise, reducing false alarms that can lead to “warning fatigue.”

Statistical tests confirm that the model’s forecasts meaningfully predict future market conditions. Whereas a benchmark autoregressive model essentially carries forward current values, our neural network forecasts market conditions approximately three months in advance.

From predictions to explanations: using LLMs

Next, we demonstrate how LLMs can transform the neural network’s variable weights into narrative explanations.

When the model forecasts elevated stress and assigns high weights to specific variables, we deploy an LLM to search contemporaneous financial news focusing on precisely those indicators. This targeted approach dramatically improves the signal-to-noise ratio compared to asking an LLM to summarise all financial news.
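The targeting step can be sketched as follows: rank the variables by their current weight, keep the top few, and build a focused instruction for the language model. The prompt wording, the variable names and the function names are hypothetical, and the sketch deliberately stops short of calling any specific LLM or news search service, since the brief does not name one.

```python
from typing import Dict, List


def top_weighted(weights: Dict[str, float], k: int = 3) -> List[str]:
    """Names of the k variables with the largest weights, highest first."""
    ranked = sorted(weights.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def build_news_prompt(weights: Dict[str, float], as_of: str, k: int = 3) -> str:
    """Build a focused instruction restricted to news available at `as_of`."""
    focus = ", ".join(top_weighted(weights, k))
    return (
        f"Using only financial news published up to {as_of}, "
        f"list potential sources of market stress related to: {focus}."
    )
```

Restricting the instruction to the highest-weighted indicators and to news available at the forecast date is what keeps the exercise targeted and free of hindsight.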

We illustrate this with the October 2023 “Treasury Tantrum” (Diercks and Asnani (2024)). Using only financial news from July 2023, when our model forecast heightened stress, the LLM shortlisted potential sources of risk, which included the key risk factor that subsequently proved relevant: “diverging views on Federal Reserve policy creating volatility”. This successful identification of a looming vulnerability was generated automatically, three months before the stress materialised.

Practical implications for financial supervisors

This framework offers several advantages for real-time financial stability monitoring:

Early intervention capability. Three-month forecasting horizons provide sufficient time for supervisory authorities to investigate emerging risks, engage with market participants, and prepare contingency plans before stress peaks.

Transparent decision-making. Unlike opaque “black box” models, the variable weights make the model’s reasoning explicit. Supervisors can see which market indicators drive each forecast, facilitating internal discussions and external communication.

Automated narrative generation. By combining quantitative forecasts with LLM-powered news analysis, the system delivers both statistical predictions and economic explanations, addressing the perennial challenge of translating technical findings into actionable policy insights.

Resource efficiency. The automated approach enables supervisors to monitor numerous markets simultaneously without manually reviewing thousands of news articles or market reports daily.

Adaptability to novel risks. The model’s dynamic weights adjust to changing market conditions, potentially detecting new transmission channels or emerging vulnerabilities that were not present during the training period.

Conclusion

For financial supervisors, this framework represents a concrete tool for enhancing real-time risk monitoring while maintaining interpretability, a critical consideration when analytical results must inform high-stakes policy decisions. As AI continues advancing, such hybrid approaches that combine quantitative forecasting with qualitative explanation may become central to financial stability surveillance.

References

Aldasoro, I, P Hördahl, A Schrimpf and S Zhu (2025): “Predicting financial market stress with machine learning”, BIS Working Papers, no 1250.

Araujo, D K G (2025): “Dynamic nonlinear variable selection in time series”, mimeo.

Brunnermeier, M and M Oehmke (2013): “Chapter 18 – Bubbles, Financial Crises, and Systemic Risk”, Handbook of the Economics of Finance, vol 2, part B, pp 1221–1288.

Diercks, A M and D Asnani (2024): “The Treasury tantrum of 2023”, FEDS Notes, 3 September.

About the authors

Matteo Aquilina

Matteo Aquilina is the Adviser for Financial Stability at the Bank for International Settlements (BIS). He leads the BIS’s contributions and deliverables to the G20 and the Financial Stability Board. His research has been published in several academic journals, including the Quarterly Journal of Economics, the Journal of Economic Dynamics and Control and the Journal of Financial Markets.

Douglas Araujo

Douglas Araujo is a research economist at the Central Bank of Brazil. His current work focuses on the relationship between financial stability and monetary policy, and the use of machine learning in econometrics. His experience includes the Bank for International Settlements, consultancy work for the International Monetary Fund, and the private sector.

Gaston Gelos

Gaston Gelos is Deputy Head of the Monetary and Economic Department and Head of Financial Stability Policy, and a member of the senior management team at the BIS. Previously, he spent 25 years in a range of roles at the International Monetary Fund (IMF). His research has focused mainly on financial stability, monetary policy and capital flows, and has been published widely in leading academic journals. He holds a PhD from Yale and is a CEPR Research Fellow.

Taejin Park

Taejin Park is Head of Financial Markets Research Support at the BIS. His main interest is the application of AI technologies in economic research.

Fernando Pérez-Cruz

Fernando Pérez-Cruz is a Senior Advisor on Innovation at the BIS and an adjunct professor in the Computer Science department at ETH Zurich. He was the Chief Data Scientist at the Swiss Data Science Centre. His current research interest lies in machine learning and its application to economics, sciences, and engineering. He has an h-index of 42.
