Automating monthly workflows using IPUMS CPS and the IPUMS Microdata Extract API

By Renae Rodgers

As many readers will know, the Current Population Survey (CPS) is a monthly labor force survey that is, among other things, the data source for the monthly jobs report (or more formally the Employment Situation reports) from the Bureau of Labor Statistics.

In this blog post, I will show you how to create a reproducible, sustainable monthly workflow to update previous analyses using new data with IPUMS CPS data, IPUMS Microdata Extract API, and the ipumspy Python library.

If this is not your first CPS rodeo, you may already have a monthly workflow for working with IPUMS CPS data that suits your needs just fine – perhaps written in Stata. Did you know you can use ipumspy to make IPUMS CPS extracts from Stata?! Check out the set up instructions and template .do file in this blog post and optimize your monthly analysis even more with the IPUMS Microdata Extract API!

But I digress. In this blog post, I will first walk through a simple analysis using the IPUMS Microdata Extract API and ipumspy. I will then show you how to package that workflow so that it can be simply executed monthly when the most recent data becomes available from IPUMS CPS for refreshed analysis including the newest data.

An example IPUMS CPS, IPUMS Microdata Extract API workflow: teleworking due to COVID-19

Let’s suppose that we’re interested in looking at trends in telework due to COVID-19 over the course of the pandemic. The IPUMS CPS variable COVIDTELEW indicates whether the respondent worked from home at any time during the past 4 weeks due to COVID-19. This example will show us the overall trend in remote work due to COVID-19 as well as how teleworking breaks down by educational attainment. First we’ll define an IPUMS CPS extract that contains COVIDTELEW and EDUC variables and all months from May 2020 to June 2022.

Continue reading…

Helpful Functions for Constructing Reproducible Workflows in the Absence of an IPUMS Microdata Metadata API

by Renae Rodgers

As you may have heard, the IPUMS Microdata Extract API is now in beta for our IPUMS USA and IPUMS CPS collections! Yay! You may have also heard that an IPUMS Metadata API is not yet available. Bummer. This means that users still need to do data discovery via the IPUMS web sites, which means a lot of clicking. It also makes it difficult to set up a reproducible workflow that can be updated as new data become available (and who doesn’t want that?). This blog post will show you some easy functions you can use to retrieve sample metadata to create a monthly or annual workflow that doesn’t require visiting an IPUMS website, and walk through some ipumspy functionality to access metadata for variables already included in your IPUMS extracts. The workflow in this blog post is using ipumspy v0.2.1.

If Stata is your stats package of choice, you can leverage ipumspy  to make IPUMS extracts from a Stata do file! You can spruce up your workflow with any of the functions in the blog post below by incorporating them into the template .do file offered in our blog post on making IPUMS extracts from Stata.

Sample Metadata

To get started, import the utilities  module from ipumspy. All of the helper functions in this blog post will be using the  CollectionInformation  class from this module. The sample_ids attribute of  CollectionInformation  returns a dictionary with sample descriptions as keys and sample ids as values. This information is pulled from the sample ID page for the specified IPUMS data collection. This page is the source of the sample_ids dictionary for IPUMS CPS, and this page is the source of the sample_ids  dictionary for IPUMS USA.

from ipumspy import utilities

Functions for retrieving IPUMS CPS sample IDs

First, let’s take a look at the sample ID dictionary for IPUMS CPS.

Continue reading…

An Introduction to the IPUMS Extract API for Microdata

By Renae Rodgers

Have you heard the news?! The IPUMS Extract API now supports microdata! For users who have been clamoring for this feature for some time, feel free to skip to the final section for resources to get started. For our users who haven’t been awaiting this announcement with bated breath, and who may be saying to themselves, “ok…great…but…”


This blog post will give a brief introduction to APIs, give some examples of ways to use the IPUMS Extract API in your workflow, and share some more in-depth resources.

What is an API?

API stands for Application Programming Interface. An API is an intermediate layer between a user and a server that allows the user to interact programmatically with another program or a service. First, the user’s program talks to the API – this is known as making an API call or a request. The API, in turn, talks to the server, translating the user’s request into something the server can understand. The server returns the requested information to the API, and the API then returns that information to the user. For example, Google Maps has an API that allows developers to request and retrieve information from Google Maps from within their applications, without needing to go through a web interface.

At this point you may be thinking, “great, now I have a general idea of what an API is, but I am not a software developer so… thanks anyway.”


The IPUMS Extract API opens up many possibilities for easing collaboration, and creating efficient workflows with only a few simple lines of code. Please read on!

Continue reading…