Luigi Federico Signorini | Bank of Italy


Keywords: economic surveys, data integration, unconventional data

JEL Codes: C81, C82, C83

Scientifically designed sample surveys have long been and continue to be a key tool for measuring the policy-relevant dimensions of heterogeneity across populations of economic agents. Despite their important role in supporting policymakers, surveys are facing various challenges such as declining response rates and the proliferation of more timely and competitive alternative sources of data. This policy brief argues that many of these challenges can be tackled, at least to some extent, by making extensive use both of newly available (massive) data, for instance by integrating administrative data into surveys, and of technological advances in survey data collection and handling. Turning existing challenges into opportunities while preserving the methodological rigour of well-designed surveys is not an easy task. It may take time and require great caution. More importantly, the involvement of economic analysts in designing surveys and collecting data along with a strong investment in statistical human capital may enhance the value of surveys and prepare them for the challenges of the 21st century.

Survey data have played and continue to play a key role in shaping our understanding of the economy and our ability to design effective policies.

The Bank of Italy has a long tradition in this area. Following some even earlier experiments, we launched our Survey of Household Income and Wealth in the early 1960s. While there have been many changes since those early days, our SHIW is still very much alive; it has one of the longest continuous histories in this field. Soon afterwards, we started our annual Survey of Industrial and Service Firms, also still running. Over time, we have further expanded our toolbox, with higher-frequency surveys of firms and households, banks and real-estate agents.

We have always prized our ability to collect, through surveys, pieces of data that are not provided by general statistics but are meaningful for economic analysis in our fields of action. When I say ‘the ability to collect’, I do not refer only to legal freedom (or authority), or to practical means, which are just prerequisites. Even before Zvi Griliches’s famous 1985 critique¹ (that while economists were increasingly using and even designing surveys, they were broadly uninterested in, and unaware of, the crucial steps of data collection and quality assessment), our surveys had always entailed, first, the close involvement of economic analysts at all stages and, second, strong investment in statistical human capital, that is, both theoretical and practical expertise in statistics.

We are proud that our experience spans decades. However, or perhaps I should say because of this, we are also keenly aware of the evolving challenges and opportunities.

Among today’s challenges, one that I believe is common to many surveys around the world is that response rates have been declining for some time.² Poor response rates drain budgets, as surveyors need to make more effort to enrol interviewees, and threaten the quality of the data, especially when the propensity to respond is correlated with the phenomena under investigation, like income and wealth. The usual statistical corrections are not always enough to do away with this kind of bias.
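The mechanism can be illustrated with a small simulation. The sketch below is purely illustrative and does not describe any actual survey procedure: the income distribution, the response probabilities (0.6 below the median, 0.3 above it) and the function name `simulate_nonresponse` are all assumptions. It compares the true mean income with the unadjusted respondent mean and with an inverse-propensity-weighted mean that uses the true response probabilities.

```python
import random


def simulate_nonresponse(n=20000, seed=0):
    """Compare the true mean income with respondent-based estimates.

    All numbers are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    # Assumed population: log-normally distributed household incomes.
    incomes = [rng.lognormvariate(10.0, 0.8) for _ in range(n)]
    median = sorted(incomes)[n // 2]

    respondents = []  # (income, response probability) for responding units
    for income in incomes:
        # Assumption: households above the median are less likely to respond.
        p = 0.3 if income > median else 0.6
        if rng.random() < p:
            respondents.append((income, p))

    true_mean = sum(incomes) / n
    # Unadjusted estimate: simple mean over respondents only.
    naive_mean = sum(y for y, _ in respondents) / len(respondents)
    # Weighted estimate: each respondent counts for 1/p population units.
    weight_sum = sum(1.0 / p for _, p in respondents)
    ipw_mean = sum(y / p for y, p in respondents) / weight_sum
    return true_mean, naive_mean, ipw_mean


if __name__ == "__main__":
    true_mean, naive_mean, ipw_mean = simulate_nonresponse()
    print(f"true mean:           {true_mean:,.0f}")
    print(f"respondent mean:     {naive_mean:,.0f}")
    print(f"propensity-weighted: {ipw_mean:,.0f}")
```

In this simulation the weighted estimate moves back towards the population mean only because the response probabilities are known exactly. In practice they must be estimated from auxiliary variables, and when those variables do not capture the link between response and income, part of the bias survives the correction.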

This, moreover, is happening just as researchers and policymakers are looking for new, better and more comprehensive micro data. As societies become increasingly complex, and unprecedented shocks pose new questions for economic analysis, the need for timely and granular data grows ever more pressing.

One is tempted to look for a response to both challenges in the proliferation of competitive alternative sources, such as administrative and unstructured (‘big’) data. Such sources often have clear advantages in terms of timeliness, granularity and number of observations. Efficiency is also a consideration, as the marginal social cost of putting existing data to new uses is usually negligible. (The market price for accessing such data, of course, is not necessarily negligible in the case of proprietary data, especially if they possess monopoly value.)

On the face of it, there are limits to what statisticians can do with data that are not collected primarily for statistical purposes. The sample design, when the concept is at all applicable to such data, is often obscure. Coverage, accuracy, policy relevance and integrity are also typically unsatisfactory from the statistician’s point of view.

However, given the phenomenal, ever-growing amount of data that is nowadays available somewhere in the global IT ecosystem, finding ways to overcome these difficulties is a challenge that statisticians should be eager to take on.

One promising avenue (among many) is integrating administrative and/or unstructured data into surveys. Ideally, this would combine the richness and cheapness of the information contained in large databases with the methodological rigour of well-designed surveys. There are various possibilities. Administrative data, when they more or less cover the universe, can be (and are) used to improve the sampling design and correct any nonresponse bias. Unit-record linkage of surveys with administrative data (or sometimes even big unstructured data) can be used to populate certain survey variables, reducing the need for lengthy questionnaires and thus lessening the response burden. Data integration may also facilitate matching between surveys.
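One simple way in which register data covering the universe can correct for uneven response is post-stratification: respondents are reweighted so that each group counts in proportion to its known size in the administrative source. The following is a minimal sketch under stated assumptions, not a description of any institution's weighting scheme; the function name `poststratify`, the stratum labels and all figures are hypothetical.

```python
from collections import Counter


def poststratify(sample, population_counts):
    """Reweight survey respondents to known administrative stratum counts.

    sample: list of (stratum, value) pairs for responding units.
    population_counts: dict mapping stratum -> number of units in the
        administrative register (assumed to cover the whole population).
    Returns the list of weights and the weighted mean of the values.
    """
    respondents_per_stratum = Counter(stratum for stratum, _ in sample)
    # Each respondent represents N_h / n_h units of its stratum.
    weights = [
        population_counts[stratum] / respondents_per_stratum[stratum]
        for stratum, _ in sample
    ]
    weighted_total = sum(w * value for w, (_, value) in zip(weights, sample))
    return weights, weighted_total / sum(weights)


if __name__ == "__main__":
    # Hypothetical data: high-income households are under-represented
    # among respondents (1 in 4) relative to the register (half of all units).
    sample = [("low", 10), ("low", 20), ("low", 30), ("high", 100)]
    register = {"low": 50, "high": 50}
    weights, mean = poststratify(sample, register)
    print("unweighted mean:", sum(v for _, v in sample) / len(sample))
    print("post-stratified mean:", mean)
```

The adjustment removes bias only to the extent that respondents and nonrespondents are alike within each stratum, which is why the choice of stratifying variables from the administrative source matters.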

Admittedly, exploiting such opportunities is not always an easy task; it may take time, and it should always be done with care. Integration of surveys with non-traditional data sources to produce multivariate indicators, such as the ratio of debt to income, may be challenging, since the two sources are likely to suffer from different quality issues.

The linkage may be subject to respondents’ consent and thus introduce consent bias in lieu of nonresponse bias. Difficult negotiations among institutions, and/or payment to for-profit data owners, will often be needed to access such data. Privacy law or other confidentiality considerations will require adequate methods to protect the anonymity of respondents. And so on. There is little, however, that cannot be done with enough goodwill and ingenuity on all sides (assuming, that is, that finance is no constraint, given the savings that the use of external data may entail).

Other opportunities for the development of surveys stem from advances in data collection techniques that enable statisticians to run mixed-mode surveys in a better way. In fact, the form of the interview (whether face-to-face, through the web, by telephone and so on) influences participants’ willingness to respond and the frankness or completeness of responses, and mixed approaches can improve participation rates if the mode is tailored to different groups’ preferences. While more research is needed to evaluate its effects more accurately, mixed-mode data collection may indeed be useful.

Good distributional information remains essential for studying the transmission of shocks and underpins the decision-making of central banks and other institutions. Scientifically designed sample surveys have long been a key tool for measuring the policy-relevant dimensions of heterogeneity across populations of households, individuals, firms and so on. They are still invaluable as a way to answer meaningful analytical questions, and to tailor the data to the constantly changing needs of research and policy debate. While surveys face increasing costs and other challenges, these can be tackled at least in part by, among other things, making clever use of newly available massive data. Such usage can help enhance the value of surveys and make them fit for the 21st century, although it may also put a question mark over the survival of surveys as self-contained sets of information.

As a data producer, the Bank of Italy is trying to exploit some of the new opportunities. The sampling design of the last wave of the SHIW has changed thanks to newly available administrative information on households’ income and indebtedness. This change has materially improved our ability to observe segments of the population that, though small, account for a high share of the target variables, such as wealth or debt. This helps to reconcile survey-based data with aggregate variables and to compute more accurate distributional statistics. Our Income, Consumption and Wealth joint project with the National Institute of Statistics (Istat) aims at creating a synthetic database for three different surveys,³ using statistical matching techniques that rely on a common set of administrative records. Together with the ECB, we have also started a Distributional Wealth Accounts project that is meant to bridge the gap between two survey waves by linking macro and micro data. Many similar experiments are taking place across the world, and this meeting is a good opportunity to exchange experiences, including beyond or between formal sessions.
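To give a flavour of what statistical matching involves, the sketch below fuses two hypothetical survey files by nearest-neighbour matching on a variable that both share through administrative records. It is a generic textbook-style illustration, not the method used in the joint project mentioned above; the function name `nearest_neighbour_match`, the field names and all figures are assumptions.

```python
def nearest_neighbour_match(recipients, donors, common_key, donated_key):
    """Attach a donor variable to each recipient record.

    For every recipient, the donor with the closest value of the common
    (administrative) variable is selected and its `donated_key` value is
    copied over. Donors may be reused for several recipients.
    """
    fused = []
    for recipient in recipients:
        donor = min(
            donors,
            key=lambda d: abs(d[common_key] - recipient[common_key]),
        )
        record = dict(recipient)  # copy, so the input file is left untouched
        record[donated_key] = donor[donated_key]
        fused.append(record)
    return fused


if __name__ == "__main__":
    # Hypothetical wealth survey (recipient) and consumption survey (donor),
    # both linked to the same administrative income variable.
    wealth_survey = [
        {"admin_income": 20000, "wealth": 50000},
        {"admin_income": 61000, "wealth": 400000},
    ]
    consumption_survey = [
        {"admin_income": 19000, "consumption": 15000},
        {"admin_income": 60000, "consumption": 35000},
    ]
    for row in nearest_neighbour_match(
        wealth_survey, consumption_survey, "admin_income", "consumption"
    ):
        print(row)
```

A fused file of this kind reproduces the joint distribution of wealth and consumption only insofar as the two are related through the common variables, so the richer the shared administrative information, the more credible the synthetic database.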

  1. Zvi Griliches (1985), ‘Data and Econometricians--The Uneasy Alliance’, American Economic Review, vol. 75(2), pages 196-200.

  2. For instance, Britain’s Labour Force Survey (LFS) response rate halved from 2011 to 2021. The share of households responding to the US Current Population Survey (CPS) has decreased by about 20 percentage points over the last decade. The Bank of Italy’s Survey of Household Income and Wealth suffered a similar reduction between 2010 and 2020. The rest of Europe and Canada have seen similar trends.

  3. (1) the Survey on Income and Living Conditions (EU-SILC) for income; (2) the Household Budget Survey (HBS) for consumption; and (3) the Survey on Household Income and Wealth for wealth.

About the authors

Luigi Federico Signorini

Luigi Federico Signorini is Senior Deputy Governor of the Bank of Italy and President of the Insurance Supervisory Authority (IVASS) (Presidential Decree of 12 March 2021, published in the Italian Official Gazette No 97, dated 23 April 2021).
