Data – Page 2 – Use It for Good

Tools for Combining Data Across IPUMS Global Health Surveys

May 14, 2025 / March 3, 2025 by mpcblog

By Miriam King, Devon Kristiansen, and Anna Bolgrien

IPUMS Global Health includes integrated data from three international health surveys: Demographic and Health Surveys (IPUMS DHS), Multiple Indicator Cluster Surveys (IPUMS MICS), and Performance Monitoring for Action (IPUMS PMA). All three surveys are nationally representative, primarily focus on low- and middle-income countries, and address issues related to the health and well-being of women and young children. These commonalities make combining integrated data across these data collections appealing. As Figure 1 shows, IPUMS DHS and IPUMS MICS cover different countries; combining them extends the geographic coverage of harmonized versions of data covering similar topics. Researchers can also combine data for those countries included in both IPUMS DHS and IPUMS MICS to provide additional observation points for time-series analyses.

Figure 1: Countries covered by IPUMS DHS and IPUMS MICS

[Image: world map with the countries included in IPUMS MICS and IPUMS DHS shaded in]

Researchers who want to carry out cross-survey analyses face practical challenges. IPUMS imposes consistent variable names and codes within one kind of survey (DHS, MICS, or PMA); harmonized variable names and codes differ between these surveys. On each project’s website, the documentation for each variable highlights comparability issues to keep in mind when combining multiple samples, either within one type of survey or across survey types. IPUMS users must make separate customized data files from each database and merge those files. And subtle differences in question wording, skip patterns, geographic boundaries, and sampling procedures—such as MICS’ taking reports on child health from caretakers other than the biological mother—can introduce inconsistencies and inadvertent errors.
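To make the "separate files, then merge" step concrete, here is a minimal Python sketch of one way to pool two extracts once you have decided which variables are comparable. The column names (`dhs_age`, `mics_weight`, and so on) and the crosswalk are hypothetical placeholders, not actual IPUMS variable names; a real crosswalk would come from the comparability documentation on each project’s website.

```python
# Hypothetical crosswalks: survey-specific column name -> shared name.
# These names are illustrative placeholders, not real IPUMS variables.
CROSSWALKS = {
    "dhs": {"dhs_age": "age", "dhs_weight": "weight"},
    "mics": {"mics_age": "age", "mics_weight": "weight"},
}


def harmonize(rows, survey):
    """Rename survey-specific columns to shared names and tag each row with its source."""
    crosswalk = CROSSWALKS[survey]
    harmonized = []
    for row in rows:
        new_row = {shared: row[original] for original, shared in crosswalk.items()}
        new_row["survey"] = survey  # keep the source so analyses can control for it
        harmonized.append(new_row)
    return harmonized


def combine(extracts):
    """Stack harmonized rows from several extracts into one list."""
    combined = []
    for survey, rows in extracts.items():
        combined.extend(harmonize(rows, survey))
    return combined


# Toy records standing in for two separately downloaded extracts.
dhs_rows = [{"dhs_age": 24, "dhs_weight": 1.2}]
mics_rows = [{"mics_age": 31, "mics_weight": 0.8}]
pooled = combine({"dhs": dhs_rows, "mics": mics_rows})
```

Keeping a `survey` column on every pooled record makes it possible to check later whether a result is driven by differences between the survey types rather than by the population being studied.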

Even More IPUMS Data Available in the SDA Online Data Analysis Tool

June 15, 2025 / December 19, 2024 by mpcblog

By Daniel Backman

Beyond offering the ability to create and download customized datasets from the IPUMS microdata collections, we also support web-based analysis of the data through the SDA (Survey Documentation and Analysis) online data analysis tool. SDA empowers users to analyze IPUMS data directly from their web browsers without the need for additional software or advanced programming skills. Whether you’re a seasoned researcher or a student exploring data for the first time, the SDA tool makes it easier than ever to unlock insights from our datasets. If you’re a current SDA user and ready to get started, check out the new datasets from IPUMS CPS and IPUMS MEPS. Otherwise, read on to learn more about SDA and how to use this tool to analyze IPUMS data.

About IPUMS & SDA

What is SDA?

The SDA tool is a web-based interface that allows you to generate frequency tables, cross-tabulations, and summary statistics; create customized data visualizations, including bar charts, line graphs, and scatter plots; perform regression analysis; and export results as a CSV file for presentations or further analysis.

SDA increases the accessibility of data by allowing users to analyze data through a web interface without needing to use (or purchase!) statistical software. Detailed guidance explains how to use the tool for analyses and how to manipulate variables. Additionally, SDA provides exceptionally fast real-time processing of data, making it ideal for use in the classroom or other interactive settings. See our data training exercises page for exercises that will guide you through using SDA to analyze IPUMS data.

Updated Land Cover Summaries for Census Tracts, County Subdivisions, Counties, and Places

June 17, 2025 / December 2, 2024 by mpcblog

By David Van Riper, ISRDI Director of Spatial Analysis

What’s new?

We just released updated land cover summaries for census tracts, county subdivisions, counties, and places. Our land cover summaries describe the proportion of a particular geographic unit (e.g., a county or a census tract) that is covered by a particular land cover class (e.g., deciduous forest, evergreen forest, or cultivated crops). This release provides users with land cover summaries from nine vintages of the National Land Cover Database (NLCD) – 2001, 2004, 2006, 2008, 2011, 2013, 2016, 2019, and 2021. Summaries are available for 2010, 2020, and 2022 census tracts, county subdivisions, counties, and places. We include 2022 versions of these geographic units because that was the year the Census Bureau began identifying planning regions in Connecticut. These regions replaced Connecticut’s historical counties, which have long had no official administrative function. The new planning regions changed the unique identifiers for census tracts, county subdivisions, and counties.

Why did IPUMS NHGIS create these land cover summaries?

Land cover data is commonly used to study the impacts of natural events such as hurricanes or human modifications such as converting forest to agriculture or agricultural land to developed land. Land cover data is typically released as a high spatial resolution, gridded spatial dataset where each grid cell (or pixel) is assigned to a land cover category (Figure 1, Panel A). The gridded data almost never align with the geographic units, and the high spatial resolution yields massive files that can be slow to process. A single NLCD file is 25 gigabytes in size.

We summarized nine versions of the NLCD to multiple sets of geographic units so that users can easily integrate the data into analyses already structured around geographic units. This reduces the burden on individual users to create such summaries themselves.
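As a rough illustration of what such a summary involves, the Python sketch below computes the share of each geographic unit covered by each land cover class from two aligned toy grids. It is not the NHGIS production workflow: it assumes the geographic units have already been rasterized onto the same grid as the land cover data, that all cells have equal area, and the zone labels and class names are made up for the example.

```python
from collections import Counter, defaultdict


def land_cover_proportions(zone_grid, cover_grid):
    """Return {zone: {cover_class: share of that zone's cells}} for two aligned grids."""
    counts = defaultdict(Counter)
    for zone_row, cover_row in zip(zone_grid, cover_grid):
        for zone, cover in zip(zone_row, cover_row):
            if zone is None:  # cell falls outside every geographic unit
                continue
            counts[zone][cover] += 1
    return {
        zone: {cover: n / sum(by_class.values()) for cover, n in by_class.items()}
        for zone, by_class in counts.items()
    }


# Toy 2 x 3 grids: which unit each cell belongs to, and its land cover class.
zones = [
    ["A", "A", "B"],
    ["A", "A", "B"],
]
cover = [
    ["forest", "forest", "crops"],
    ["forest", "crops", "crops"],
]
shares = land_cover_proportions(zones, cover)
# shares["A"] is {"forest": 0.75, "crops": 0.25}; shares["B"] is {"crops": 1.0}
```

On a real 25-gigabyte raster this cell-by-cell loop would be far too slow, which is part of why precomputed summaries are useful.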

Census Data for Good: Analysis to Action

June 15, 2025 / October 9, 2024 by mpcblog

By Lara Cleveland

IPUMS International regularly asks representatives of National Statistical Offices (NSOs) around the world to share their data with the research community. While IPUMS offers a license payment to countries for the right to redistribute microdata, NSO representatives are most interested in how sharing data with IPUMS will benefit the people of their countries. After 30 years of harmonizing data that NSOs have shared with us, IPUMS can indeed point to innovative research from data users all over the world, many at major universities in these partner countries. Directors of statistical offices, especially those with close ties to academia, are thrilled that the data are used for scholarly scientific production and for the purpose of educating the next generation. However, most of these leaders are much more interested in how data sharing leads to effective policy. And they want examples. They are essentially asking how the data have been “used for good,” as the original IPUMS tagline, “Use it for good!” implores.

IPUMS supports the Sustainable Development Goals

In response, IPUMS has been following data-to-policy trails where we can find them. The United Nations’ efforts to establish and measure the Sustainable Development Goals (SDGs) have provided wins in this area. Early in the life of the SDGs, colleagues from the World Health Organization visited IPUMS to leverage detailed information in the occupational variables for locating the health workforce. Microdata from censuses helped them measure the density of a range of health worker classifications at subnational levels. The International Organization for Migration (IOM) did similar work to disaggregate census-based SDGs by migratory status. At the start of the pandemic, the United Nations Population Fund (UNFPA) used IPUMS census microdata to spin up a dashboard showing the living arrangements of older adults, again at subnational levels. Each of these applications of IPUMS International data resulted in policy recommendations, informed by additional data, additional policy research, and pilot projects.

Constructing comparable intimate partner violence indicators across the DHS, MICS, and PMA health surveys

June 4, 2025 / September 16, 2024 by mpcblog

By Miriam King, Anna Bolgrien, Mehr Munir, and Devon Kristiansen

The three data series comprising IPUMS Global Health—IPUMS DHS, IPUMS PMA, and IPUMS MICS—contain intersecting subjects related to women’s and children’s health, while retaining distinct patterns of temporal and geographic coverage. This content overlap opens the door to combining harmonized data across the three surveys, to extend time series and/or increase the number of countries in comparative analyses. However, there are important yet subtle differences between these survey types, in sample frames, questionnaire wording, and variable responses and universes, which require cautious consideration. As the example below demonstrates, researchers must use extra care to avoid errors when combining data across IPUMS DHS, MICS, and PMA.

A July 2024 article in the Journal of Public Health Policy, “Constructing Comparable Intimate Partner Violence Indicators across DHS, MICS, and PMA Health Surveys,” describes some challenges and solutions to combining data across these IPUMS databases, using measures of intimate partner violence as an example. The piece, authored by Devon Kristiansen and colleagues at IPUMS, notes two necessary steps in combining data across survey types:

Identify and combine only variables with similar question wording
Adjust the samples to include only comparable subpopulations
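The two steps above can be sketched in Python as a simple filter over pooled records. Everything specific here is a hypothetical stand-in: the wording groups, the field names (`survey`, `age`, `ever_partnered`), and the age range are chosen only to show the pattern, and the actual comparable variables and universes are the ones described in the article and in the IPUMS documentation.

```python
# Step 1 (hypothetical metadata): which question-wording group each survey's item belongs to.
WORDING_GROUP = {"dhs": "wording_a", "mics": "wording_a", "pma": "wording_b"}


def comparable_records(records, wording_group, min_age, max_age):
    """Keep records from surveys with matching wording, restricted to a shared universe."""
    kept = []
    for record in records:
        # Step 1: only combine items asked with similar wording.
        if WORDING_GROUP.get(record["survey"]) != wording_group:
            continue
        # Step 2: restrict every sample to the same subpopulation.
        if not (min_age <= record["age"] <= max_age):
            continue
        if not record["ever_partnered"]:
            continue
        kept.append(record)
    return kept


# Toy pooled records from three survey types.
pooled = [
    {"survey": "dhs", "age": 30, "ever_partnered": True},
    {"survey": "mics", "age": 17, "ever_partnered": False},
    {"survey": "mics", "age": 45, "ever_partnered": True},
    {"survey": "pma", "age": 28, "ever_partnered": True},
]
analysis_sample = comparable_records(pooled, "wording_a", min_age=15, max_age=49)
```

In this toy example only the first and third records survive: the second is outside the assumed universe, and the fourth comes from a survey whose item is assumed to use different wording.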

Harmonized Malaria Indicator Survey (MIS) Data Now in IPUMS DHS

September 5, 2024 / September 3, 2024 by mpcblog

By Miriam King, Senior Research Scientist

Malaria is a pressing global health problem, with nearly 250 million malaria cases in 2022, according to the World Health Organization. Approximately 95 percent of malaria deaths were in Africa, with three-quarters of those deaths to children under 5. Climate change is increasing the transmission of mosquito-borne diseases, such as malaria. When IPUMS DHS recently received supplemental funding to support research on Climate Change Effects on Health, adding data on malaria was a top priority. Specifically, IPUMS DHS chose to integrate data from the DHS Malaria Indicator Surveys (MIS).

MIS have been fielded in nearly 30 African countries during the twenty-first century. Developed under an international partnership coordinating efforts to fight malaria, MIS surveys include some standard DHS variables on topics such as demographics, fertility, and household characteristics. MIS questionnaires also include hundreds of questions related to malaria. Topics covered include people’s knowledge about malaria causes, symptoms, and prevention; use of bednets; diagnosis and treatment of malaria, especially for pregnant women and children; exposure to public health messaging; and diagnostic blood testing for malaria in children under 5.

Figure 1: Countries with MIS Data in IPUMS DHS

[Image: map of Africa with the countries with MIS data in IPUMS DHS filled in with purple]

IPUMS DHS users now have access to harmonized data from 38 MIS samples, with geographic coverage shown in Figure 1. We prioritized harmonizing responses to MIS questions that matched variables already in the IPUMS DHS database, for approximately 700 widely available variables.

Digitizing and Exploring Qatar’s Population Censuses

June 15, 2025 / June 24, 2024 by mpcblog

By Shine Min Thant

Qatar, a small yet influential state in the Middle East, is a compelling case study for demographic research because of its rapid development over the past thirty years. Qatar occupies a peninsula only slightly larger than the U.S. state of Rhode Island that juts out into the Persian Gulf from its border with Saudi Arabia. The country has experienced rapid economic growth since the late 20th century, mainly due to its vast reserves of natural gas and oil. This newfound wealth allowed Qatar to invest heavily in its healthcare, infrastructure, and education, making the country an ideal case study for social change and development. A recent surge in Qatar’s immigrant population (which constitutes over 78 percent of the population) also makes it an ideal country in which to study social mobility.

As part of the ISRDI Diversity Fellowship Program, I worked with Dr. Tracy Kugler, Professor Steven Manson, Professor Evan Roberts, and undergraduate student Rawan AlGahtani on a project to examine Qatar’s change using census data from 1984, 1997, and 2004. Summary tables from all three censuses were previously available only as printed documents. As a first step, we needed to transform the data from a hard-to-get printed format to the widely accessible IPUMS IHGIS format. This process included multiple steps, from conducting optical character recognition (OCR) to conducting data quality checks using R scripts (Figure 1).

Figure 1: IPUMS IHGIS Workflow
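To give a sense of what a data quality check on OCR output can look like, here is a small sketch of one common kind of check: confirming that the parts of each table row add up to the printed total. The project’s own checks were written in R; this Python version, along with the table layout and field names (`label`, `male`, `female`, `total`), is a hypothetical illustration rather than the actual IHGIS scripts.

```python
def check_row_totals(rows, part_keys, total_key):
    """Return (label, reported_total, computed_sum) for rows whose parts do not add up."""
    problems = []
    for row in rows:
        computed = sum(row[key] for key in part_keys)
        if computed != row[total_key]:
            problems.append((row["label"], row[total_key], computed))
    return problems


# Toy table as it might come out of OCR; the second total has a misread digit.
table = [
    {"label": "Area 1", "male": 120, "female": 100, "total": 220},
    {"label": "Area 2", "male": 310, "female": 290, "total": 500},
]
issues = check_row_totals(table, ["male", "female"], "total")
# issues holds one entry: ("Area 2", 500, 600)
```

A flagged row does not say which cell is wrong, only that the row needs to be compared against the printed page.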