This policy brief is based on “Harnessing artificial intelligence for monitoring financial markets,” BIS Working Paper, No 1291/2025. This paper does not necessarily represent the views of the Bank for International Settlements or the Central Bank of Brazil.
Abstract
Predicting financial market stress has long proven to be a largely elusive goal. Advances in artificial intelligence and machine learning offer new possibilities to tackle this problem, given their ability to handle large datasets and unearth hidden nonlinear patterns. However, financial supervisors not only need good forecasts of market stress, but also tools that explain what drives emerging risks. This policy brief introduces a framework combining recurrent neural networks with large language models to address both needs. The neural network forecasts an indicator of stress periods while assigning transparent, time-varying weights to each input variable, revealing which market indicators matter most at any moment and addressing the “black box” problem in machine learning. These weights then direct automated searches of financial news to identify relevant economic narratives behind the statistical signals. Out-of-sample and three months in advance, the model identified elevated risks before the March 2023 banking turmoil and the October 2023 Treasury stress, while pinpointing the underlying drivers. Our approach demonstrates how transparency can be built into predictive systems, making them both powerful and interpretable for real-time supervisory monitoring.
Predicting financial market stress has long been a highly desirable yet elusive goal for researchers, policymakers and market participants. Despite advances in econometric modelling, anticipating episodes of market dysfunction remains extraordinarily challenging. Severe stress events are rare, which limits the data available to train statistical models. In addition, risks materialise through cross-market transmission mechanisms that are often nonlinear and specific to each episode, which makes it difficult to identify common patterns across crises (Brunnermeier and Oehmke (2013)). Traditional early warning systems have shown mixed success, often suffering from high false positive rates or failing to capture novel sources of systemic risk. These challenges have motivated researchers to explore alternative approaches, including machine learning techniques.
Recent advances in artificial intelligence (AI) and machine learning (ML) have opened new possibilities for financial stability monitoring. Machine learning models can process high-dimensional datasets and uncover complex nonlinear patterns that traditional econometric methods might miss. However, these models often operate as “black boxes,” making it difficult for supervisors to understand why a model flags certain periods as risky or which specific market developments deserve attention, an issue that is the focus of ongoing research and discussion (Aldasoro et al (2025)).
This policy brief presents a two-stage framework that addresses both challenges: first, a neural network model that forecasts market stress while transparently revealing which variables drive each forecast; second, a large language model (LLM) that searches and digests contemporaneous textual information to explain the economic narrative behind the statistical signals.
We demonstrate this approach using deviations from triangular arbitrage parity (TAP) in the euro-yen currency pair. In well-functioning markets, the direct exchange rate between two currencies should equal the indirect rate obtained by trading through a vehicle currency (here, the US dollar), because market participants can quickly and easily arbitrage away any difference. Occasions when the direct and indirect rates for the same pair diverge noticeably therefore signal frictions in one of the world’s most liquid markets, a “canary in the coal mine” for broader dysfunction.
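To make the parity condition concrete, the following minimal sketch computes the gap between a direct euro-yen quote and the rate implied by trading through the US dollar. The function name, the use of single mid-quotes and the basis-point scaling are illustrative assumptions; the brief does not specify how the deviation is measured in practice.

```python
def tap_deviation_bps(eur_jpy: float, eur_usd: float, usd_jpy: float) -> float:
    """Signed gap between the direct EUR/JPY rate and the USD-implied rate.

    Assumption: all three inputs are mid-quotes observed at the same time.
    The result is expressed in basis points of the implied rate.
    """
    # Yen per euro obtained indirectly: euros -> dollars -> yen.
    implied_eur_jpy = eur_usd * usd_jpy
    return (eur_jpy / implied_eur_jpy - 1.0) * 1e4


# Hypothetical quotes: the direct rate sits slightly above the implied 165.0.
gap = tap_deviation_bps(eur_jpy=165.165, eur_usd=1.10, usd_jpy=150.0)
print(f"TAP deviation: {gap:.2f} bps")
```

A value close to zero is what parity predicts; persistent non-zero values are the kind of friction the text describes.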
Our model forecasts the 20-day average of TAP deviations beginning 60 business days ahead. In other words, it predicts, roughly three months in advance, average market conditions over a window of about one month. The inputs comprise over 100 daily financial variables, including currency spreads, equity volatility indices, government bond yields and forward exchange rates across major economies. The goal is to “cast a wide net” to pick up relevant signals from different corners of the economy.
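The forecasting target described above can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, the input is a plain list of daily deviations on business days, and the window is taken to start exactly 60 observations after the forecast date (the exact alignment used by the authors is not given in the brief).

```python
def forward_average_target(series, horizon=60, window=20):
    """For each day t, average series[t + horizon : t + horizon + window].

    Returns None for days where the full future window is not yet available,
    so the output has the same length as the input.
    """
    n = len(series)
    targets = []
    for t in range(n):
        start = t + horizon
        end = start + window
        if end <= n:
            targets.append(sum(series[start:end]) / window)
        else:
            targets.append(None)  # not enough future data to form the target
    return targets


# Toy data: 100 days where the "deviation" simply equals the day index.
toy = list(range(100))
targets = forward_average_target(toy)
print(targets[0])  # average of days 60..79
```

Days near the end of the sample have no target, which is why a model trained this way always lags the most recent data by the horizon plus the window.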
Unlike conventional neural networks, our model incorporates a dynamic variable weighting mechanism. At each point in time, the model assigns time-varying weights to input variables, automatically identifying which indicators matter most for predicting future market conditions (Araujo (2025)). Crucially, these weights are directly observable to supervisors, providing transparency about the model’s decision-making process.
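One common way to obtain observable, time-varying variable weights is a softmax gate applied to the inputs. The sketch below illustrates that general idea only; it is not the architecture of Araujo (2025), whose details are not given in this brief. The function names are hypothetical, and the relevance scores, which a trained network would produce from the data, are supplied by hand here.

```python
import math


def softmax(scores):
    """Turn arbitrary real-valued scores into non-negative weights summing to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def gate_inputs(x_t, scores_t):
    """Weight each input variable at time t by its softmax weight.

    Assumption: scores_t comes from a learned scoring network in a real model;
    here it is passed in directly for illustration.
    """
    weights = softmax(scores_t)
    gated = [w * x for w, x in zip(weights, x_t)]
    return weights, gated


# Three hypothetical standardised inputs and hand-picked scores for one day.
weights, gated = gate_inputs(x_t=[0.5, -1.2, 2.0], scores_t=[0.1, 2.0, -0.5])
print([round(w, 3) for w in weights])
```

Because the weights are an explicit intermediate quantity, they can be stored and inspected for every forecast date, which is the property the text emphasises.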
This design serves two purposes. First, the weights themselves provide early signals of shifting market dynamics. When the model suddenly increases the weight on a particular variable, supervisors can investigate why that indicator has become more relevant. Second, when the model forecasts heightened stress, the weights identify precisely which variables to examine for potential explanations.
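The first use of the weights, spotting a sudden increase in the weight on a variable, could be screened for with a simple rule such as the one below. This is a hypothetical monitoring heuristic, not part of the model: the function name, the 20-day lookback and the 0.05 threshold are all assumptions chosen for illustration.

```python
def weight_jumps(weight_history, lookback=20, threshold=0.05):
    """Flag variables whose latest weight exceeds their recent average.

    weight_history maps a variable name to its list of daily weights (oldest
    first). A variable is flagged when its latest weight is more than
    `threshold` above the mean of the preceding `lookback` days.
    """
    flagged = {}
    for name, history in weight_history.items():
        if len(history) < 2:
            continue  # need at least one past value to form a baseline
        past = history[-(lookback + 1):-1]
        baseline = sum(past) / len(past)
        jump = history[-1] - baseline
        if jump > threshold:
            flagged[name] = jump
    return flagged


# Hypothetical weights: one variable jumps, the other stays flat.
history = {
    "equity_vol_index": [0.02] * 20 + [0.15],
    "us_10y_yield": [0.03] * 21,
}
print(weight_jumps(history))
```

The flagged variables are natural starting points for the investigation the text describes.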
Out-of-sample testing on 3.5 years of data (2021–2024) demonstrates the model’s practical value. The model successfully identified periods of market stress, with dynamic weights highlighting variables subsequently confirmed as relevant to actual events. Notably, the model signalled elevated risks before the March 2023 banking turmoil, even though it had not been trained on any data after end-2020 (Graph 1). By contrast, episodes such as the market anomalies around the onset of Covid-19 were essentially not forecast by the model, consistent with the event originating entirely outside the financial system.
The model exhibits a particularly useful characteristic: its predictions are less volatile than both the actual market data and simpler autoregressive benchmarks. This means that supervisors can have greater confidence that when the model forecasts dysfunction, it is likely to materialise, reducing false alarms that can lead to “warning fatigue.”
Statistical tests confirm that the model’s forecasts meaningfully predict future market conditions. Whereas a benchmark autoregressive model essentially carries forward current values, our neural network forecasts market conditions approximately three months in advance.

Next, we demonstrate how LLMs can transform the neural network’s variable weights into narrative explanations.
When the model forecasts elevated stress and assigns high weights to specific variables, we deploy an LLM to search contemporaneous financial news focusing on precisely those indicators. This targeted approach dramatically improves the signal-to-noise ratio compared to asking an LLM to summarise all financial news.
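The targeting step can be sketched as selecting the highest-weighted variables and turning them into a focused instruction for the LLM. Everything below is illustrative: the function names, the variable names, the weights and the prompt wording are hypothetical, and no particular LLM or news-search API is assumed, so the sketch stops at building the prompt text.

```python
def top_weighted(weights, k=3):
    """Return the names of the k variables with the largest weights."""
    return sorted(weights, key=weights.get, reverse=True)[:k]


def build_news_prompt(weights, news_month, k=3):
    """Compose a prompt that restricts the news search to the top-k variables.

    Assumption: the prompt would be passed to whichever LLM and news source
    the supervisor uses; that call is deliberately left out.
    """
    names = top_weighted(weights, k)
    lines = [
        f"Using only financial news published in {news_month},",
        "list potential sources of market stress related to these indicators:",
    ]
    lines += [f"- {name} (model weight {weights[name]:.2f})" for name in names]
    return "\n".join(lines)


# Hypothetical weights for one forecast date.
example_weights = {
    "US 10y yield": 0.30,
    "Equity volatility index": 0.10,
    "EUR/USD forward": 0.25,
    "JPY currency spread": 0.05,
}
print(build_news_prompt(example_weights, "July 2023"))
```

Restricting the prompt to a handful of indicators is what narrows the search relative to summarising all financial news.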
We illustrate this with the October 2023 “Treasury Tantrum” (Diercks and Asnani (2024)). Using only financial news from July 2023, when our model forecast heightened stress, the LLM shortlisted potential sources of risk, which included the key risk factor that subsequently proved relevant: “diverging views on Federal Reserve policy creating volatility”. This successful identification of a looming vulnerability was generated automatically, three months before the stress materialised.
This framework offers several advantages for real-time financial stability monitoring:
Early intervention capability. Three-month forecasting horizons provide sufficient time for supervisory authorities to investigate emerging risks, engage with market participants, and prepare contingency plans before stress peaks.
Transparent decision-making. Unlike opaque “black box” models, the variable weights make the model’s reasoning explicit. Supervisors can see which market indicators drive each forecast, facilitating internal discussions and external communication.
Automated narrative generation. By combining quantitative forecasts with LLM-powered news analysis, the system delivers both statistical predictions and economic explanations, addressing the perennial challenge of translating technical findings into actionable policy insights.
Resource efficiency. The automated approach enables supervisors to monitor numerous markets simultaneously without manually reviewing thousands of news articles or market reports daily.
Adaptability to novel risks. The model’s dynamic weights adjust to changing market conditions, potentially detecting new transmission channels or emerging vulnerabilities that were not present during the training period.
For financial supervisors, this framework represents a concrete tool for enhancing real-time risk monitoring while maintaining interpretability, a critical consideration when analytical results must inform high-stakes policy decisions. As AI continues advancing, such hybrid approaches that combine quantitative forecasting with qualitative explanation may become central to financial stability surveillance.
Aldasoro, I, P Hördahl, A Schrimpf and S Zhu (2025): “Predicting financial market stress with machine learning”, BIS Working Papers, no 1250.
Araujo, D K G (2025): “Dynamic nonlinear variable selection in time series”, mimeo.
Brunnermeier, M and M Oehmke (2013): “Chapter 18 – Bubbles, Financial Crises, and Systemic Risk”, Handbook of the Economics of Finance, vol 2, part B, pp 1221–1288.
Diercks, A M and D Asnani (2024): “The Treasury tantrum of 2023”, FEDS Notes, 3 September.