The IPUMS Archive has moved!

By Diana Magnuson

Looking in the window of the Institute for Social Research and Data Innovation, 50 Willey Hall, at the IPUMS Archive shelves filled with accordion folders and boxes

For a quarter century, archival staff and the physical collection of IPUMS archival materials have sojourned in spaces on the West Bank at the University of Minnesota. The People’s Center on Riverside Avenue, the fifth floor of Heller Hall, 50 Willey Hall, 1200 Washington Avenue, and the West Bank Office Building on South 2nd Street have all been home to the IPUMS Archive. These moves were embedded in the organizational growth, development, and change experienced by the IPUMS projects from 1999 to the present. Since 2004, IPUMS headquarters have been in 18,500 square feet of renovated space in Willey Hall on the West Bank. The space was created by a $1.8 million remodel of an art gallery and restaurant, funded by the College of Liberal Arts. Now home to the Institute for Social Research and Data Innovation, which houses the Minnesota Population Center (MPC), IPUMS, the Life Course Center (LCC), and the Minnesota Research Data Center (MnRDC), the space currently contains six private offices, ninety-one cubicles, lounge spaces, and twelve conference rooms, including a large ninety-seat seminar room.

The IPUMS Archive began in earnest with the launch of IPUMS International (affectionately known as IPUMS-I in some circles), which, as of this writing, includes 104 countries, 656 censuses and surveys, and over 1 billion person records. Beginning in 1999, with a social science infrastructure grant from the National Science Foundation (NSF), IPUMS-I set the ambitious goal of preserving the world’s microdata resources and democratizing global access to them. In 2025, the project goals continue to be collecting and preserving census and survey data and documentation, harmonizing these data, and disseminating the harmonized data free of charge.

Continue reading…

Working with Subnational Geographies in IPUMS Global Health

By Divya Pandey and Anna Bolgrien

In a research project combining data from IPUMS MICS and IPUMS DHS, IPUMS Global Health staff examined trends in the relationship between open defecation and high infant mortality rates (IMR) in the Eastern Indo-Gangetic Plains. The project focused on selected bordering regions in Nepal, Bangladesh, and India. By analyzing these environmentally and agro-climatically comparable regions, the study aimed to isolate the impact of national and local policies on open defecation and infant mortality rates.

Figure 1: Regions included in the study

A map of the border and surrounding regions of Nepal, Bangladesh, and India that highlights the sub-national regions included in the study.

The study pooled data from IPUMS MICS and IPUMS DHS to look at trends over almost two decades. IPUMS DHS includes data for all three countries, and IPUMS MICS provides additional years of data for Nepal and Bangladesh. Since the study focused on selected bordering geographies, the authors worked with data from lower administrative levels: divisions in Bangladesh, states in India, and regions in Nepal. Leveraging the geography resources provided by IPUMS, the team used both spatially harmonized and sample-specific geography variables (learn more about IPUMS DHS geography variables and IPUMS MICS geography variables). Spatially harmonized geography variables identify geographic regions using a consistent spatial footprint, which allows the same physical space to be compared over time. Sample-specific geography variables are not harmonized across time; as their name suggests, they use the geographic boundaries that were in effect in a country at the time of each survey.
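To make the distinction concrete, here is a minimal sketch of why a consistent footprint matters when boundaries change between surveys. Everything in it is an assumption for illustration: the region names, the crosswalk, the weights, and the outcome values are invented, and it does not use actual IPUMS DHS or IPUMS MICS variable names.

```python
from collections import defaultdict

# Hypothetical crosswalk from (survey_year, sample-specific region) to a
# consistent spatial footprint. All names are invented for illustration.
CROSSWALK = {
    (2005, "Region A"): "Footprint 1",
    (2015, "Region A North"): "Footprint 1",  # Region A was split by 2015
    (2015, "Region A South"): "Footprint 1",
    (2005, "Region B"): "Footprint 2",
    (2015, "Region B"): "Footprint 2",
}

# Toy records: (survey_year, sample_specific_region, weight, outcome 0/1).
records = [
    (2005, "Region A", 2.0, 1),
    (2005, "Region A", 2.0, 0),
    (2015, "Region A North", 1.0, 1),
    (2015, "Region A South", 3.0, 0),
    (2005, "Region B", 1.0, 1),
    (2015, "Region B", 1.0, 0),
    (2015, "Region B", 1.0, 1),
]

def rate_by_footprint(rows, crosswalk):
    """Weighted share of outcome == 1 for each (footprint, survey_year)."""
    weighted_ones = defaultdict(float)
    weight_total = defaultdict(float)
    for year, region, weight, outcome in rows:
        key = (crosswalk[(year, region)], year)
        weighted_ones[key] += weight * outcome
        weight_total[key] += weight
    return {key: weighted_ones[key] / weight_total[key]
            for key in sorted(weight_total)}

for (footprint, year), rate in rate_by_footprint(records, CROSSWALK).items():
    print(f"{footprint} {year}: {rate:.2f}")
```

Grouping on the sample-specific names alone would treat "Region A North" and "Region A South" as new, unrelated places in 2015; grouping on the footprint keeps the 2005 and 2015 estimates tied to the same physical area.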

Continue reading…

What is going on with the weighted counts in the January 2025 CPS?

By Kari Williams & Sarah Flood

The signature activity of IPUMS is data harmonization, or making variables interoperable across time, to facilitate pooling of multiple months or years of data, as well as comparative and trend analyses. It’s easy to get carried away in the magic of not needing to perform routine data cleaning and having documentation organized at the variable level, and perhaps miss some bigger picture considerations. The Current Population Survey (CPS) annual population controls adjustment is an excellent example.

Each January, the Census Bureau revises the CPS weights to incorporate new population controls, based on the Census Bureau’s updated population estimates. However, the Census Bureau doesn’t re-release previous weights for the CPS based on the new population controls. If you look at trendlines of weighted count estimates using CPS monthly data, you might notice a discontinuity between each December and January – these are the annual population control adjustments at work. In January 2025, the shift is particularly abrupt; this is because the 2024 vintage population estimates (i.e., the population controls for the 2025 CPS) reflect an improvement in the Census Bureau’s methodology for measuring net international migration.

Line chart showing a general upward trend from 2020-2025 with disruptions each January

Figure from Jed Kolko’s “Population adjustments will cause the next jobs report to be misinterpreted and misconstrued.”
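A small sketch of how the December-to-January break shows up in weighted counts. The records, weights, and field layout below are made up for illustration and are not real CPS data or IPUMS CPS variable names; the point is only that a weighted count is a sum of person weights, so a revision to the weights shifts the level of the series.

```python
from collections import defaultdict

# Toy person-level records as (year, month, person_weight). The values and the
# field layout are invented for illustration; they are not real CPS data.
records = [
    (2024, 11, 1500.0), (2024, 11, 1500.0),
    (2024, 12, 1510.0), (2024, 12, 1500.0),
    (2025, 1, 1600.0), (2025, 1, 1580.0),
]

def weighted_counts(rows):
    """Sum person weights within each (year, month), in calendar order."""
    totals = defaultdict(float)
    for year, month, weight in rows:
        totals[(year, month)] += weight
    return dict(sorted(totals.items()))

def month_over_month(totals):
    """Return (period, change, is_january) for each period after the first."""
    periods = list(totals)
    changes = []
    for prev, curr in zip(periods, periods[1:]):
        changes.append((curr, totals[curr] - totals[prev], curr[1] == 1))
    return changes

totals = weighted_counts(records)
for period, change, is_january in month_over_month(totals):
    note = "  <- new population controls take effect" if is_january else ""
    print(f"{period[0]}-{period[1]:02d}: {change:+.0f}{note}")
```

In this toy series the January change is much larger than the December change even though the number of records is the same, which is the pattern to watch for before reading a January jump as a real change in the population.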

Continue reading…

IPUMS Announces 2024 Research Award Recipients

IPUMS research awards

IPUMS is excited to announce the winners of its annual IPUMS Research Awards. These awards honor both published research and nominated graduate student papers from 2024 that use IPUMS data to advance or deepen our understanding of social and demographic processes.

The 2024 competition awarded prizes for the best published and best graduate student research in eight categories, each associated with specific IPUMS data collections:

  • IPUMS USA, providing data from the U.S. decennial censuses and the American Community Survey, including full count data, from 1850 to the present.
  • IPUMS CPS, providing data from the monthly U.S. labor force survey, the Current Population Survey (CPS), from 1962 to the present.
  • IPUMS International, providing harmonized data contributed by more than 100 international statistical office partners; it currently includes information on over 1 billion people in more than 547 censuses and surveys from around the world, from 1960 forward.
  • IPUMS Health Surveys, which makes available the U.S. National Health Interview Survey (NHIS) and the Medical Expenditure Panel Survey (MEPS).
  • IPUMS Spatial, covering IPUMS NHGIS, IPUMS IHGIS, and IPUMS CDOH. NHGIS includes GIS boundary files from 1790 to the present; IHGIS provides data tables from population and housing censuses as well as agricultural censuses from around the world; CDOH provides access to measures of disparities, policies, and counts, by state and county, for historically marginalized populations in the US.
  • IPUMS Global Health, providing harmonized data from the Demographic and Health Surveys (DHS), Multiple Indicator Cluster Surveys (MICS), and the Performance Monitoring for Action (PMA) data series, for low- and middle-income countries.
  • IPUMS Time Use, providing time diary data from the U.S. and around the world from 1930 to the present.
  • IPUMS Excellence in Research. The IPUMS mission of democratizing data is strengthened by broad representation among our data users and the research that we highlight. This award was created to recognize outstanding work using any of the IPUMS data collections by authors who belong to groups that are underrepresented in social science and population health research*.

Over 1,300 publications based on IPUMS data appeared in journals, magazines, and newspapers worldwide last year. From these publications and from nominated graduate student papers, the award committees selected the 2024 honorees.

Continue reading…

IPUMS CPS Checks on Basic Monthly Data

By Sarah Flood, Renae Rodgers, and Kari Williams

Federal data are critical for understanding much about the US population, from its size and composition to its health and employment. The Current Population Survey (CPS) is our nation’s official source of information about the labor force. At the beginning of each month, we eagerly await the first Friday when the Employment Situation Summary (aka the monthly jobs report) will be released (it isn’t just us, right??). The monthly snapshot of the US labor force serves as a bellwether for how our economy is faring.

The Wednesday after the jobs report is released, we at IPUMS clear the decks in preparation for the release of the CPS Basic Monthly Survey (BMS) by the Census Bureau. The CPS BMS is the individual-level data from which the jobs report is generated. Our goal is always to process these data as soon as they’re released by the Census Bureau so that we can deliver them to IPUMS CPS users as quickly as possible. Those who rely on CPS BMS data each month might be familiar with coping strategies while waiting for the data: obsessive page refreshing, some nervous pacing, maybe wondering why they haven’t yet been released (iykyk).

While quickly processing CPS Basic Monthly data is a priority, so, too, is ensuring data quality. Each month, we carefully inspect CPS BMS data at several points in our process. First, we review all of the variables for codes that are undocumented or have suspicious frequencies. Second, we rely on a suite of tools during our integration process that alert us to any codes in the data that we haven’t accounted for in our variable-level harmonizations. Third, after harmonization, we compare univariate statistics from the newest month of data to those from the previous month. Generally, we expect very little change across months, and we have built tools that flag variable-level differences above a certain threshold as well as new codes on either end of the distribution.
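The month-to-month comparison can be sketched roughly as follows. This is not the IPUMS tooling, just a minimal illustration under stated assumptions: the function names, the 0.05 threshold, and the toy code values are all hypothetical, and this sketch flags any code that appears in only one month rather than only codes at the ends of the distribution.

```python
from collections import Counter

def shares(values):
    """Proportion of records carrying each code."""
    counts = Counter(values)
    total = sum(counts.values())
    return {code: n / total for code, n in counts.items()}

def compare_months(previous, current, threshold=0.05):
    """Flag codes whose share moved by more than `threshold`, plus codes
    that appear in only one of the two months."""
    prev, curr = shares(previous), shares(current)
    flags = {
        "new_codes": sorted(set(curr) - set(prev)),
        "dropped_codes": sorted(set(prev) - set(curr)),
        "large_shifts": [],
    }
    for code in sorted(set(prev) & set(curr)):
        diff = curr[code] - prev[code]
        if abs(diff) > threshold:
            flags["large_shifts"].append((code, round(diff, 3)))
    return flags

# Toy values for one variable in two consecutive months (hypothetical codes).
december = [1] * 60 + [2] * 30 + [3] * 10
january = [1] * 50 + [2] * 31 + [3] * 10 + [9] * 9

print(compare_months(december, january))
```

Here code 9 is reported as new in January and code 1 is flagged because its share dropped by ten percentage points, while the one-point change in code 2 stays under the threshold.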

Continue reading…

Unlocking Spatial and Social Data with R: Introducing the R Spatial Notebook Series

By Kate Vavra-Musser

Introduction: What is the R Spatial Notebooks Project?

The R Spatial Notebooks Project is a series of R code notebooks, structured like a textbook, designed to guide users through the intricacies of data extraction, integration, cleaning, analysis, and visualization using R. The notebooks are specifically tailored for social science research and applications using spatial data. The modular, textbook-style structure supports comprehensive skill development as users work through sequences of notebooks. The project was developed through a partnership between the Institute for Social Research and Data Innovation (ISRDI), which houses IPUMS, and the Institute for Geospatial Understanding through an Integrated Discovery Environment (I-GUIDE). IPUMS provides census and survey data from around the world integrated across time and space. I-GUIDE is cyberinfrastructure that combines distributed geospatial data with computing for researchers, students, and policymakers.

The initial R Spatial Notebooks release includes roughly 20 freely available notebooks on topics including IPUMS data extraction via API, accessing open-source data, data cleaning, foundational spatial data principles, exploratory data analysis, and mapping.

Continue reading…

New Data! IPUMS International Spring 2025 Data Release

By Derek Burk, Lara Cleveland, Jane Lee, Rodrigo Lovaton, and Sula Sarkar

Megaphone with Exciting news speech bubble banner.

Great news for IPUMS International (IPUMSI) users! Our ever-expanding census and survey data collection has just released new harmonized census samples from Honduras (2013), Kenya (2019), Malawi (2018), Mongolia (2010, 2020), and Mozambique (2017). We now have an average of 4.5 censuses per country. The Kenya census collection now spans 50 years!

This release also includes a large series of quarterly Labor Force Surveys from the Philippines (1997-2019). The 91 waves of the Philippines Labor Force Survey contain a total of 18 million person records.

Many thanks to the National Statistical Office partners in these countries for their ongoing contributions.

Continue reading…