Harmonizing Corruption Items in Cross-national Surveys

by Ilona Wysmułek, Graduate School for Social Research, Polish Academy of Sciences


In corruption research, surveys are among the major sources of our knowledge about the subject (Heath, Richards and de Graaf 2016; Karalashvili, Kraay and Murrell 2015).  However, there are several methodological challenges to studying cross-national trends in corruption with public opinion data. Corruption, given its secretive nature, is a phenomenon that is hard to capture in the interview situation. Some respondents are reluctant to answer sensitive questions and some may understand the concept differently than intended by researchers (Azfar and Murrell 2009; Bertrand and Mullainathan 2001). Moreover, international survey projects dealing with corruption continue to face challenges of unequal country representation. Estimation of rare event determinants also remains problematic, given that reported corruption instances are, for most modern democracies, highly infrequent.

To overcome some of these methodological problems, I apply ex-post harmonization of cross-national survey data in corruption research. In my dissertation project, I study corruption perception and individual corruption experience of giving informal payments (as a bribe or a gift) in public schools in Europe. I use cross-national survey data on corruption in public schools in Europe combined with country-level indicators, for example from the World Bank Education Statistics and OECD’s Education at a Glance. I follow the Survey Data Recycling (SDR) framework developed by the research team of Kazimierz M. Slomczynski, which provides a blueprint for ex-post survey data harmonization and for integrating surveys and other data sources (please see corruption project for more detailed information).


Harmonizing Ethnic Minority Status in International Survey Projects

by Olena Oleksiyenko, Graduate School for Social Research, Polish Academy of Sciences

This article focuses on issues of harmonizing information on ethnic minority status as part of a larger project on patterns of electoral and non-electoral political participation in post-Soviet states. Specifically, I am interested in differences in political participation between a given country’s Russian-speaking minority and the majority population in Armenia, Azerbaijan, Belarus, Estonia, Georgia, Kazakhstan, Kyrgyzstan, Latvia, Lithuania, Moldova, Tajikistan, Uzbekistan and Ukraine.

There is no single international survey project that adequately covers all the former Soviet republics from the Soviet Union’s collapse to the present. Even projects with the broadest country coverage, such as Life in Transition, do not allow for meaningful over-time comparisons. Hence, for the purpose of ex-post harmonization, I selected international projects that measure people’s electoral and non-electoral participation and ethnic identification in any of the post-Soviet countries. Table 1 presents the list of the international survey projects I included, which, taken together, span the period 1993-2015.

Table 1. International Survey Projects with Relevant Data


Cross-national comparisons of ethnic groups are not as straightforward as they may seem, since in many cases the underlying concept of “minority group” is different in each state. The literature proposes different approaches to increase comparability of the concept. The “absolutist” approach suggests that only one marker of minority status should be taken into account, e.g. citizenship or language. The advantage of such a solution is conceptual clarity, but one can argue that the complexity of minority status cannot be precisely studied with only one indicator. An alternative is the “relativist” approach to harmonization of items on minority status. This involves cross-classification of different ethnic referents to obtain a single, cross-nationally equivalent score on “ethnic minority status” (Lambert 2005). The problem with the “relativist” approach is the low availability of the same markers across all surveys.
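The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the marker names, the 0/1 coding, and the rule of summing markers are hypothetical and are not taken from the project or from Lambert (2005).

```python
# Hypothetical markers, each coded 1 = minority, 0 = majority.
MARKERS = ("language", "citizenship", "self_identification")

def absolutist_status(respondent, marker="language"):
    """Use a single marker; None means the survey did not ask it."""
    return respondent.get(marker)

def relativist_score(respondent, markers=MARKERS):
    """Cross-classify several markers into one score (here: a simple count).

    Returns None when any marker is unavailable, which is the
    availability problem noted for the "relativist" approach.
    """
    values = [respondent.get(m) for m in markers]
    if any(v is None for v in values):
        return None
    return sum(values)

# Made-up respondents from two surveys with different item coverage.
full_coverage = {"language": 1, "citizenship": 0, "self_identification": 1}
partial_coverage = {"language": 0, "citizenship": 0}

scores = [
    (absolutist_status(r), relativist_score(r))
    for r in (full_coverage, partial_coverage)
]
```

The second respondent gets an absolutist value but no relativist score, because one of the markers is missing from that survey.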


Latest Issue of Harmonization Newsletter Out Now

The Harmonization Project team, in coordination with Cross-national Studies: Interdisciplinary Research and Training program (CONSIRT.osu.edu), has published the latest issue of Harmonization: Newsletter on Survey Data Harmonization in the Social Sciences.

You can download and view the newsletter here.

This issue features articles on a variety of methodological topics. Tom Smith, of NORC at the University of Chicago, discusses recent projects in survey data harmonization. Claire Durand and colleagues at the University of Montreal present their projects on analyzing trust in institutions using surveys pooled across time and countries. Zbigniew Sawinski, long-time methodologist of the Polish Panel Survey POLPAN, presents a schema of inter-wave harmonization of panel data. Two graduate students at the Graduate School for Social Research of the Polish Academy of Sciences discuss their dissertation projects on harmonizing ethnic minority status in surveys of post-Soviet nations (Olena Oleksiyenko) and on harmonizing corruption items in international survey projects (Ilona Wysmulek). We also include news from IPUMS (Catherine A. Fitch) and GESIS (Kristi Winters).

The harmonization community continues to present their research at conferences and workshops around the world. In this issue, we have reports from the International Political Science Association meeting in Poland, the QDET2 in Miami, Florida, the 3MC conference in Chicago, Illinois, and the International Social Survey Programme meeting in Lithuania.
As always, we invite all scholars interested in survey data harmonization to read our newsletter and contribute their articles and news to future editions.


New Book! Democratic Values and Protest Behavior: Harmonization of Data

The Harmonization Project has released their first book:


Democratic Values and Protest Behavior: Harmonization of Data from International Survey Projects by Kazimierz M. Słomczyński, Irina Tomescu-Dubrow, and J. Craig Jenkins with Marta Kołczyńska, Przemek Powałko, Ilona Wysmułek, Olena Oleksiyenko, Marcin W. Zieliński and Joshua K. Dubrow. 2016. Warsaw: IFiS Publishers.

This book is available on our website free to download and read.

Across the world, mass political protest has shaped the course of modern history. Building on decades of theory, we hypothesize that the extent and intensity of political protest is a function of micro-level democratic values and socio-demographics, country-level economic development and democratic practices, and the discrepancy (i.e. cross-level interaction) between a country’s democratic practices and people’s trust in key democratic institutions – that is, political parties, the justice system, and parliament.

This book is a Technical Report on the logic of, and methodology for, creating a multi-year multi-country database needed for comparative research on political protest. It concerns both the selection and ex-post harmonization of survey information and the manner in which the multilevel structured data can be used in substantive analyses.

The database we created contains information on more than two million people from 142 countries or territories, interviewed between the 1960s and 2013. It stores individual-level variables from 1,721 national surveys stemming from 22 well-known international survey projects, including the European Social Survey, the International Social Survey Programme, and the World Values Survey. We constructed comparable measures of people’s participation in demonstrations and signing petitions, their democratic values and socio-demographic characteristics. We complemented the harmonized individual-level data with macro-level measures of democracy, economic performance, and income inequality gathered from external sources. In the process, we pulled together three strands of survey methodology – on data quality, ex-post harmonization, and multilevel modeling.

This book is funded by the (Polish) National Science Center under a three-year international cooperation grant for the Institute of Philosophy and Sociology of the Polish Academy of Sciences (IFiS PAN), and The Ohio State University (OSU) Mershon Center for International Security Studies (grant number: Harmonia-2012/06/M/HS6/00322).

Survey Weights as Indicators of Data Quality

“Survey Weights as Indicators of Data Quality” by Marta Kolczynska, Marcin W. Zielinski, and Przemek Powalko appears in Harmonization newsletter (Summer 2016, v2 n1) 

In recent decades, more and more scholars have used weights as a procedure for correcting distortions in surveys. The improvement in the quality of the data using weights is conditional upon the quality of the weights themselves, as well as their ability to correct the discrepancies between the realized sample and the population. In cross-national research, especially when combining survey data from different survey projects, the additional challenge is making sure that, across national samples, the quality of the weights and the quality of weighted data are comparable and allow for meaningful analyses of the combined data.

Over time, weighting data has gained popularity as a way of dealing with sampling and non-response errors.

We propose four properties of weights that can be considered both as indicators of their quality and as indicators of the quality of the data, in terms of the degree of distortion between the targeted sample and the achieved sample. First, the mean value of weights in a sample should be equal to 1; otherwise, weighting the data would change the sample size and thus artificially alter standard errors and confidence intervals, leading to unfounded conclusions in hypothesis testing. Second, while weights usually lead to an increase in variance in the data, weights with a smaller variance are generally preferred over weights with greater variance. Weight variance depends on the discrepancy between the achieved sample and the population, that is, on the extent to which the raw data need to be corrected to represent the population. Thus, in some sense, the weight variance can be treated as a rough indicator of the quality of the sample. Third, to avoid case exclusion and the loss of information, weights should have values greater than 0; a case with a weight of 0 would be excluded from analyses. Fourth, extreme values should be avoided because they can introduce bias if the individuals who have been assigned very high weights are specific, unusual, and deviate from the average.
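The four properties can be checked for any national sample with a few lines of Python. The sketch below is a minimal illustration, not the authors' procedure: the function name, the made-up weights, and in particular the `extreme_ratio` cut-off for what counts as an "extreme" weight are assumptions introduced here.

```python
from statistics import mean, pvariance

def weight_diagnostics(weights, extreme_ratio=3.0):
    """Summarize the weight properties discussed above for one sample.

    extreme_ratio is a hypothetical cut-off: a weight is flagged as
    extreme when it exceeds extreme_ratio times the mean weight.
    """
    avg = mean(weights)
    return {
        # Mean of weights: should equal 1 so the sample size is unchanged.
        "mean": avg,
        # Variance of weights: smaller values are generally preferred.
        "variance": pvariance(weights),
        # Weights of 0 (or below) drop cases from the analysis.
        "n_nonpositive": sum(w <= 0 for w in weights),
        # Very large weights give a few respondents outsized influence.
        "n_extreme": sum(w > extreme_ratio * avg for w in weights),
    }

# Made-up weights for illustration only.
sample_weights = [0.5, 1.0, 1.5, 2.0, 0.0, 7.0]
report = weight_diagnostics(sample_weights)
```

For these made-up weights the report flags a mean different from 1, one zero weight, and one extreme weight, so the sample would fail three of the four checks.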

[Table 1. Survey weights]


Quality of Survey Data: How to Estimate It and Why It Matters

A new article in Harmonization: Newsletter on Survey Data Harmonization in the Social Sciences by Melanie Revilla, Willem Saris and the Survey Quality Predictor (SQP) team

There is no measurement without error. However, the size of the error can vary depending on the measure used. In social science survey data in particular, the size of the error can be very large: on average, 50 percent of the observed variance in answers to survey questions is error (Alwin 2007). The size of the error can vary considerably depending on the exact formulation of the survey questions used to measure the concepts of interest (Saris and Gallhofer 2014), and also across languages or across time. Thus, one of the main challenges for cross-sectional and longitudinal surveys, in order to make meaningful comparisons across groups or time, is to be able to estimate the size of the measurement error and to correct for it.
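To see why the size of the error matters, consider the textbook correction for attenuation, in which an observed correlation is divided by the square root of the product of the two measures' quality. The sketch below assumes that "quality" means the share of observed variance that is not error; it is a generic illustration with made-up numbers, not the SQP procedure itself.

```python
from math import sqrt

def quality(observed_variance, error_variance):
    """Share of the observed variance that is not measurement error."""
    return 1.0 - error_variance / observed_variance

def disattenuate(observed_corr, quality_x, quality_y):
    """Classical correction for attenuation of a correlation."""
    return observed_corr / sqrt(quality_x * quality_y)

# Made-up variances: half of the observed variance is error in both items.
q_x = quality(observed_variance=2.0, error_variance=1.0)
q_y = quality(observed_variance=4.0, error_variance=2.0)

# An observed correlation of 0.30 is corrected upward.
corrected = disattenuate(0.30, q_x, q_y)
```

With half of the variance being error in both questions, the corrected correlation is about twice the observed one, which shows how strongly uncorrected comparisons across groups can be distorted when quality differs between them.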

SQP is based on 3,700 quality estimates of questions obtained in more than 30 European countries and languages…


Harmonization Newsletter for Summer 2016 Out Now

The new issue of Harmonization: Newsletter on Survey Data Harmonization in the Social Sciences is now available. Harmonization is a product of the Harmonization team and is organized by Cross-national Studies: Interdisciplinary Research and Training program (CONSIRT.osu.edu). Working together, we share news and communicate with the growing community of scholars, institutions and government agencies who work on harmonizing social survey data and on other projects with a similar focus.

Articles in this issue:

Quality of Survey Data: How to Estimate It and Why It Matters by Melanie Revilla, Willem Saris and the Survey Quality Predictor (SQP) team

Estimation Bias due to Duplicated Observations: A Monte Carlo Simulation by Francesco Sarracino and Małgorzata Mikucka

Survey Weights as Indicators of Data Quality by Marta Kołczyńska, Marcin W. Zieliński, and Przemek Powałko


The Importance of Data Documentation for Survey Data Harmonization

by Marta Kołczyńska, The Ohio State University and Polish Academy of Sciences

Data, according to the United Nations Statistical Commission, are “the physical representation of information in a manner suitable for communication, interpretation, or processing by human beings or by automatic means” (UNSC 2000: 6). In other words, for information to qualify as data, it needs to be usable. Usable survey data depend on the availability and high quality of documentation.

Survey documentation refers to information on when, where, how and by whom the study was conducted, including information on the type of the sampling, size of the sample, response rate, preparation of the questionnaire and other instruments, as well as pretesting, and fieldwork control. In the Internet age, this information should accompany the survey data set in the form of one or more documents electronically available for viewing and downloading.

The main goal of any statistical analysis using survey data is to draw inferences about the target population. The precondition is that the survey sample is representative of the population. Representativeness can be approached in different ways and met to different degrees.

The researcher ultimately has to decide whether a given survey sample is sufficiently representative to solve their research problem. This decision requires knowledge about sampling, including the sampling scheme, the sampling frame and, if such is the case, details of stratified samples or other methods. For researchers, additional aspects of the survey process, such as response rates and control of fieldwork, are also important to review in order to assess survey data quality.


Big Data and Political Behavior

by J. Craig Jenkins, Kazimierz M. Slomczynski, and Joshua Kjerulf Dubrow

An upsurge in popularity is not necessarily a revolution.

The wealth of quantitative data—including data from cross-national survey projects, official governmental and nongovernmental organization (NGO) statistics, newspapers and electronic newswires, and a variety of Internet-based websites, blogs, and social media sites—has generated a large and growing empirically based literature on political behavior.

Yet, social scientists have only begun to use this wealth to its fullest capacity, as advances in computing infrastructures, methods, and Internet communication technologies create new opportunities for developing and integrating diverse types of information into social science data. Social science faces the challenge of “big data,” a new era of the quantification and analysis of political behavior on an unprecedented scope and scale.

Will it rise to this challenge?

We guest edited a special issue of the International Journal of Sociology on this topic. In our introduction, “Political Behavior and Big Data,” we address recent uses of “big data,” its multiple meanings, and the potential that this may have in building a stronger understanding of political behavior.


The Harmonization Project

by Irina Tomescu-Dubrow and Kazimierz M. Slomczynski, Polish Academy of Sciences and CONSIRT

The Democratic Values and Protest Behavior: Data Harmonization, Measurement Comparability, and Multi-Level Modeling study is financed by the (Polish) National Centre of Science and supported by The Ohio State University. CONSIRT hosts the project in Poland. While there are a number of survey data harmonization projects that have informed our own, each with their own acronyms (Dubrow and Tomescu-Dubrow 2014), we have come to call this large-scale research, simply, the Harmonization Project.

To test this model, we need data at both the individual and the country level that vary over time and across space. The Harmonization Project sets out to create comparable measurements of political protest, social values, and demographics via ex-post harmonization of variables from international survey projects, and to supplement them with macro-level variables from external sources such as the World Bank, OSCE, UN agencies, Transparency International, and others.
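In code, the two steps described here amount to recoding each source variable onto a common target and then attaching country-year indicators. The following Python sketch is purely illustrative: the project names, answer codes, country codes, and the `democracy_index` value are all made up and do not reproduce the Harmonization Project's actual recodes or data.

```python
# Hypothetical source-to-target recodes for "took part in a demonstration".
RECODES = {
    "SURVEY_A": {1: 1, 2: 0},        # 1 = yes, 2 = no
    "SURVEY_B": {1: 1, 2: 0, 3: 0},  # 1 = have done, 2 = might do, 3 = would never do
}

# Hypothetical macro-level indicators keyed by (country, year).
MACRO = {("AA", 2010): {"democracy_index": 0.7}}

def harmonize(record):
    """Map one source record onto the common target variable."""
    mapping = RECODES[record["project"]]
    target = dict(record)
    # Codes without a mapping (e.g. refusals) become missing (None).
    target["demonstration"] = mapping.get(record["raw_answer"])
    return target

def attach_macro(record, macro):
    """Add country-year indicators when they are available."""
    merged = dict(record)
    merged.update(macro.get((record["country"], record["year"]), {}))
    return merged

# Made-up respondents from two projects.
respondents = [
    {"project": "SURVEY_A", "country": "AA", "year": 2010, "raw_answer": 2},
    {"project": "SURVEY_B", "country": "BB", "year": 2012, "raw_answer": 1},
    {"project": "SURVEY_B", "country": "AA", "year": 2010, "raw_answer": 9},
]
pooled = [attach_macro(harmonize(r), MACRO) for r in respondents]
```

Keeping the raw answer next to the harmonized value, as this sketch does, makes it possible to trace every target code back to its source, and country-years without macro data simply remain without those fields.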
