Making IPUMS extracts from Stata

By Renae Rodgers

IPUMS has released a beta version of the Extract API that supports the IPUMS USA and IPUMS CPS microdata collections! Check out An Introduction to the IPUMS Extract API for Microdata for a brief introduction to the IPUMS Extract API for microdata. This blog post will demonstrate how to leverage the IPUMS Extract API and the ipumspy Python library to make IPUMS extracts directly from your Stata .do files! No prior knowledge of or interest in learning Python is required, but you will need Stata 16 or higher, an IPUMS user account, and an API key. All the code in the examples below is available in a template .do file if you would like to follow along.

Setting up Python for use with Stata

While you don’t need to be a Python user to make an IPUMS extract via Stata, a little bit of Python setup is required. Chuck Huber at Stata has put together a great series of blog posts about how to use Python in Stata, and this section is heavily inspired by the first post in that series.

Step 1: Download and install Miniconda

Even if you already have Python installed on your computer, I highly recommend setting up a separate Python installation in a conda environment for your ipumspy-in-Stata work. Miniconda is a lightweight version of the Anaconda distribution that comes with the conda package manager, which will allow you to install Python, ipumspy, and all of the necessary dependencies in a separate environment that you can access from Stata without disturbing anything else on your system that might be using Python. Download Miniconda for your operating system and install it.
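Once an environment like this exists, it can be handy to confirm which Python interpreter a session is actually using and whether ipumspy can be imported from it. The following is a minimal sketch using only the Python standard library. The function name `describe_python` is made up for this illustration and is not part of ipumspy or Stata. It could be run in a regular Python session or, presumably, inside a `python:` block in a .do file.

```python
import importlib.util
import sys


def describe_python(required=("ipumspy",)):
    """Report the running interpreter and any required packages it cannot find.

    `describe_python` is a hypothetical helper written for this example only.
    """
    return {
        # Full path of the interpreter that is executing this code
        "executable": sys.executable,
        # Python version as a "major.minor.micro" string
        "version": "{}.{}.{}".format(*sys.version_info[:3]),
        # Top-level package names that cannot be located for import
        "missing": [
            name for name in required
            if importlib.util.find_spec(name) is None
        ],
    }


if __name__ == "__main__":
    info = describe_python()
    print("Interpreter:", info["executable"])
    print("Version:", info["version"])
    print("Missing packages:", info["missing"] or "none")
```

If the interpreter path does not point into the conda environment you created, or ipumspy shows up as missing, the session is probably using a different Python installation than the one you intended.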

Continue reading…

An Introduction to the IPUMS Extract API for Microdata

By Renae Rodgers

Have you heard the news?! The IPUMS Extract API now supports microdata! Users who have been clamoring for this feature for some time should feel free to skip to the final section for resources to get started. For users who haven’t been awaiting this announcement with bated breath, and who may be saying to themselves, “ok…great…but…”, this blog post will give a brief introduction to APIs, show some examples of ways to use the IPUMS Extract API in your workflow, and share some more in-depth resources.

What is an API?

API stands for Application Programming Interface. An API is an intermediate layer between a user and a server that allows the user to interact programmatically with another program or a service. First, the user’s program talks to the API – this is known as making an API call or a request. The API, in turn, talks to the server, translating the user’s request into something the server can understand. The server returns the requested information to the API, and the API then returns that information to the user. For example, Google Maps has an API that allows developers to request and retrieve information from Google Maps from within their applications, without needing to go through a web interface.
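The request and response flow described above can be sketched as a toy, in-process model. This is only an illustration of the general pattern, not the IPUMS Extract API: the function names (`client_call`, `api_handle`, `server_lookup`), the JSON fields, and the records are all invented for this example, and no network traffic is involved.

```python
import json

# Hypothetical stand-in for data held on a server; these records are made up.
_SERVER_DATA = {
    "usa": {"description": "toy record for collection 'usa'"},
    "cps": {"description": "toy record for collection 'cps'"},
}


def server_lookup(key):
    """The 'server': it only understands exact, lowercase dictionary keys."""
    return _SERVER_DATA.get(key)


def api_handle(request_json):
    """The 'API': translate a JSON request into a server lookup, then wrap
    whatever the server returns in a JSON response."""
    request = json.loads(request_json)
    # Normalize the user's input into the form the server expects
    key = str(request.get("collection", "")).strip().lower()
    record = server_lookup(key)
    if record is None:
        return json.dumps({"status": 404, "error": "unknown collection: " + key})
    return json.dumps({"status": 200, "data": record})


def client_call(collection):
    """The 'user's program': make an API call and decode the response."""
    response_json = api_handle(json.dumps({"collection": collection}))
    return json.loads(response_json)


if __name__ == "__main__":
    print(client_call("USA"))    # request the API can translate for the server
    print(client_call("nhgis"))  # request for something this toy server lacks
```

The user's program never touches `_SERVER_DATA` directly. It only sends a request and reads the response, which is the role the API layer plays in the description above.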

At this point you may be thinking, “great, now I have a general idea of what an API is, but I am not a software developer so… thanks anyway.”



The IPUMS Extract API opens up many possibilities for easing collaboration and creating efficient workflows with only a few simple lines of code. Please read on!

Continue reading…

2022 Data-Intensive Research Conference

By Kari Williams

Registration is now open for the 2022 Data-Intensive Research Conference, being held July 20-21 in person in Minneapolis, MN and online. This event is sponsored by the Network for Data-Intensive Research on Aging (NDIRA), part of the University of Minnesota’s Life Course Center (which is co-located with IPUMS).

The conference will feature research presentations and discussions focused on the 2022 theme: Contextualizing Work and Health across the Life Course. The program includes research that leverages large-scale population data (including some #poweredbyIPUMS work) to explore these concepts. Here’s a sneak peek at the conference program. We are excited for research that digs into many different aspects of work (paid work, caregiving, labor market transitions) and health (health behaviors, health conditions, disability, wellbeing) with a life course perspective.

Continue reading…

IPUMS International 2022 Data Release

By Jane Lyon Lee, IPUMS International

IPUMS International has added 7 new census samples and new labor force surveys, including the first-time data release from the Slovak Republic and historical samples from Egypt 1848 and 1868. The other newly added samples extend pre-existing series. The growing IPUMSI labor force survey collection has expanded with the addition of quarterly surveys from Mexico (ENOE 2005-2020) and more data from Spain and Italy. See a summary of the full IPUMS collection on the IPUMSI samples page.

Continue reading…

IPUMS Announces 2021 Research Award Recipients

IPUMS is excited to announce the winners of its annual IPUMS Research Awards. These awards honor the best published research and nominated graduate student papers from 2021 that use IPUMS data to advance or deepen our understanding of social and demographic processes.

This year we are pleased to announce the IPUMS Excellence in Research Award. The IPUMS mission of democratizing data demands that we increase representation of scholars from groups that are systemically excluded in research spaces. This award is an opportunity to highlight and reward outstanding work using any of the IPUMS data collections by authors who are underrepresented in social science research*. In addition to the Excellence in Research Award, the 2021 competition awarded prizes for the best published and best graduate student research in seven categories, each associated with specific IPUMS data collections:

  1. IPUMS USA, providing data from the U.S. decennial censuses and the American Community Survey, including full count data, from 1850 to the present.
  2. IPUMS CPS, providing data from the monthly U.S. labor force survey, the Current Population Survey (CPS), from 1962 to the present.
  3. IPUMS International, providing harmonized data contributed by more than 100 international statistical office partners for over 500 censuses and surveys from around the world for 1960 forward as well as full count historical (NAPP) data.
  4. IPUMS Health Surveys, which makes available the U.S. National Health Interview Survey (NHIS) and the Medical Expenditure Panel Survey (MEPS).
  5. IPUMS Spatial, covering IPUMS NHGIS, IPUMS IHGIS, and IPUMS Terra. NHGIS includes GIS boundary files from 1790 to the present; IHGIS provides data tables from population and housing censuses as well as agricultural censuses from around the world; Terra provides data on population and the environment from 1960 to the present.
  6. IPUMS Global Health, providing harmonized data from the Demographic and Health Surveys and the Performance Monitoring and Accountability surveys, for low- and middle-income countries from the 1980s to the present.
  7. IPUMS Time Use, providing time diary data from the U.S. and around the world from 1965 to the present.

Over 2,000 publications based on IPUMS data appeared in journals, magazines, and newspapers worldwide last year. From these publications and from nominated graduate student papers, the award committees selected the 2021 honorees.

Continue reading…

How the COVID-19 Pandemic Impacted the 2020 ACS 1-year PUMS Data

By Danika Brockman and Megan Schouweiler

One of the highlights of the past IPUMS USA release was the 2020 ACS 1-year Public Use Microdata Sample (PUMS) file. Due to the effects of the COVID-19 pandemic on 2020 ACS data collection and data quality, the Census Bureau did not release the standard PUMS data. Instead, they released the 2020 ACS 1-year data with experimental weights designed to account for the impact of the COVID-19 pandemic on data quality. In this blog post, we discuss the impact of the COVID-19 pandemic on the 2020 ACS and the development of the experimental weights, and we provide some recommendations for using the 2020 ACS 1-year PUMS file.

Impact of the COVID-19 Pandemic on Data Collection and Data Quality and the Development of the Experimental Weights

The COVID-19 pandemic severely disrupted data collection for the 2020 ACS. All methods of data collection were either shut down or significantly reduced from March 2020 through the end of the year. Data collection for group quarters was particularly disrupted by the COVID-19 pandemic; in-person visits to group quarters facilities were suspended or greatly reduced from March 2020 through the end of the year, and telephone interviews were not conducted due to logistical constraints. Beyond the impacts to data collection methods, the 2020 ACS had significant variability across the 2020 data collection year in response rates for both housing units and group quarters, and had the lowest overall response rate in the history of the ACS [1].

Continue reading…