A Brief History of Survey Data Harmonization

by Joshua Kjerulf Dubrow, Polish Academy of Sciences and CONSIRT

This article gives a brief overview of ex post cross-national survey data harmonization (SDH) projects in the social sciences from the 1980s to the 2010s (see also Burkhauser and Lillard 2005; Granda, Wolf, and Hadorn 2010; Dubrow and Tomescu-Dubrow 2014).

There are two major types of SDH projects. The first type comprises large-scale projects designed to produce data on a range of research topics with open research questions. They involve multiple institutions – including governments, and especially their financing – and large numbers of researchers and assistants. These projects produce harmonized data and corresponding user manuals, as well as publications on the use of these data for addressing substantive issues. The second type comprises projects designed by small research teams to answer specific, pre-determined research questions. Here, harmonization is limited to the variables needed to answer the research questions. This article focuses on large-scale projects.

One of the earliest attempts to integrate data from different extant surveys, and perhaps the most successful, is the Luxembourg Income Study, now simply called LIS. The idea for LIS grew out of a conference on poverty in cross-national perspective, held in Luxembourg in 1982 (for a detailed history, see Smeeding, Schmaus, and Allegrezza 1985: 2-4).

While LIS was getting off the ground, scholars interested in the concept of “time use” also started to consider how to compare all of the Time Use Studies (TUS) conducted in various countries, past and present. The resulting project, named the Multinational Time Use Study (MTUS), has its roots in the 1970s, but only took shape as a harmonized time use study in the 1980s (for a detailed history, see MTUS User’s Guide 2013: Chapter 2). MTUS is based on time use diaries. The European Foundation for the Improvement of Living and Working Conditions (EFILWC), an agency of the European Union, paid for the initial release of MTUS; the collaboration between MTUS researchers and the EU led to the Harmonized European Time Use Study, or HETUS.

One of the most significant SDH projects initiated in the 1990s is the Cross-national Equivalent File (CNEF) (see Lillard’s article in this Newsletter). CNEF is simultaneously based on the successful LIS model and designed to be different from LIS. Unlike LIS, CNEF harmonizes household panel studies and was designed to be developed and enhanced by its user community. CNEF can be called a bottom-up approach, with users having a strong say in the direction of CNEF’s target variables, in contrast to LIS’ top-down approach. When it comes to top-down or bottom-up in SDH, there are no ideal solutions; even the top-down LIS uses its working papers to understand how users employ the data.

The early 2000s saw the maturation of LIS, CNEF and HETUS, and the creation of new SDH projects. An early project was the Consortium of Household Panels for European Socio-economic Research (CHER). CHER was initially funded by the European Commission with over one million euros between 2000 and 2003, and was coordinated by Centre de Recherche en Sciences Sociales (CEPS), a research bureau in Luxembourg. CHER is substantively similar to CNEF in its harmonization aims, namely the harmonization of already collected panel data. By 2003, CHER had data going back to the 1980s. CHER ended in 2003 and has not been updated since.

The European Union Statistics on Income and Living Conditions (EU-SILC) was formally created in 2004 and is run by Eurostat. Like CNEF and CHER, EU-SILC deals with ex post harmonized data from coordinated larger-scale surveys; it includes cross-sectional and longitudinal surveys on income, poverty, social exclusion and living conditions in the European Union.

The 2010s have seen the continuation of CNEF, EU-SILC, and the International Stratification and Mobility File (ISMF), as well as MTUS and HETUS. In 2013, the Harmonization Project joined the group of large-scale SDH projects. It is led by sociologists Kazimierz M. Slomczynski of the Polish Academy of Sciences and The Ohio State University, and J. Craig Jenkins, who represents the OSU Mershon Center for International Security Studies. The Harmonization Project focuses on political protest and its micro- and macro-determinants, while also keeping open the possibility of harmonizing variables relevant to other topics. This newsletter features a description of the study.

Lessons from History

The history of social science SDH projects since the 1980s offers evidence that these projects learn from each other: a new methodological field is emerging. Yet this field is emerging without a coordinated effort to build a comprehensive theoretical and methodological base. One reason is that SDH has no institutionalized apparatus: no journal, no professional association, no academic department, and no research center; SDH does not even have a separate handbook. Only in the last fifteen years has there been, in the social sciences, some attempt at a theory of SDH and the development of an appropriate methodology. Exemplary works in this regard are those of Hoffmeyer-Zlotnik and Wolf (2003), Minkel (2004), Granda and Blasczyk (2010), and Granda, Wolf and Hadorn (2010). The Harmonization Project has recognized these achievements, and is addressing the problems already raised by pushing for a theory of data harmonization and by focusing on methodological issues.

Joshua Kjerulf Dubrow is Associate Professor at the Institute of Philosophy and Sociology, Polish Academy of Sciences, and Projects and Labs Coordinator at CONSIRT. His edited book, Political Inequality in an Age of Democracy: Cross-national Perspectives, was published by Routledge in 2014.

References

Bratton, Michael. 2009. “Democratic Attitudes and Political Participation: An Exploratory Comparison across World Regions.” Paper presented at the Congress of the International Political Science Association, Santiago, Chile, July.

Burkhauser, Richard V. and Dean R. Lillard. 2005. “The Contribution and Potential of Data Harmonization for Cross-National Comparative Research.” DIW Diskussionspapiere, No. 486.

Granda, Peter and Emily Blasczyk. 2010. Data Harmonization. Guidelines for Best Practices in Cross-Cultural Surveys. Ann Arbor, MI: Survey Research Center at the Institute for Social Research, University of Michigan.

Granda, Peter, Christof Wolf and Reto Hadorn. 2010. “Harmonizing Survey Data.” Pp. 315-334 in Janet A. Harkness, Michael Braun, Brad Edwards, Timothy P. Johnson, Lars Lyberg, Peter Ph. Mohler, Beth-Ellen Pennell and Tom W. Smith (eds.) Survey Methods in Multinational, Multiregional, and Multicultural Contexts. New York: Wiley.

Hoffmeyer-Zlotnik, Jurgen H. P. and Christof Wolf. 2003. “Comparing Demographic and Socio-Economic Variables across Nations: Synthesis and Recommendations.” Pp. 389-406 in Jurgen H. P. Hoffmeyer-Zlotnik and Christof Wolf (eds.) Advances in Cross-national Comparison: A European Working Book for Demographic and Socio-Economic Variables. New York: Springer Science+Business Media.

Minkel, Hartmut. 2004. Report on Data Conversion Methodology. CHINTEX Working Paper no. 20.

Multinational Time Use Study. 2013. User’s Guide and Documentation. timeuse.org

Smeeding, Timothy, Günther Schmaus, and Serge Allegrezza. 1985. “An Introduction to LIS.” Luxembourg Income Study Working Paper Series. Working Paper No. 1 (June).