Helpful Functions for Constructing Reproducible Workflows in the Absence of an IPUMS Microdata Metadata API

by Renae Rodgers

As you may have heard, the IPUMS Microdata Extract API is now in beta for our IPUMS USA and IPUMS CPS collections! Yay! You may have also heard that an IPUMS Metadata API is not yet available. Bummer. This means that users still need to do data discovery via the IPUMS web sites, which means a lot of clicking. It also makes it difficult to set up a reproducible workflow that can be updated as new data become available (and who doesn’t want that?). This blog post will show you some easy functions you can use to retrieve sample metadata to create a monthly or annual workflow that doesn’t require visiting an IPUMS website, and walk through some ipumspy functionality to access metadata for variables already included in your IPUMS extracts. The workflow in this blog post is using ipumspy v0.2.1.

If Stata is your stats package of choice, you can leverage ipumspy  to make IPUMS extracts from a Stata do file! You can spruce up your workflow with any of the functions in the blog post below by incorporating them into the template .do file offered in our blog post on making IPUMS extracts from Stata.

Sample Metadata

To get started, import the utilities  module from ipumspy. All of the helper functions in this blog post will be using the  CollectionInformation  class from this module. The sample_ids attribute of  CollectionInformation  returns a dictionary with sample descriptions as keys and sample ids as values. This information is pulled from the sample ID page for the specified IPUMS data collection. This page is the source of the sample_ids dictionary for IPUMS CPS, and this page is the source of the sample_ids  dictionary for IPUMS USA.

from ipumspy import utilities

Functions for retrieving IPUMS CPS sample IDs

First, let’s take a look at the sample ID dictionary for IPUMS CPS.

Continue reading…

IPUMS International has brand new low level geographic variables and shapefiles

By Quinn Heimann

Map showing percentage of households with internet access in the 2014 Myanmar census by township
Map of Myanmar Internet Access

An ongoing goal and challenge for IPUMS-International (IPUMSI) is providing users with the most detailed geography possible. A unique obstacle to this is the confidentiality requirements agreed upon in order to distribute these census and survey samples. Nevertheless, IPUMSI has started launching lower-level geographic variables in samples where data is sufficient and confidentiality thresholds are still met. As of spring 2022, twenty samples have been released with third administrative level geographic data, covering ten countries across Africa and Asia. In addition, accompanying shapefiles are also being distributed to supplement these variables. Shapefiles can be used in conjunction with these more granular geographic variables to map out population trends in greater detail.

Screenshot to IPUMS International third level download page
IPUMSI third level download page

Many of these countries have multiple samples with lower level geography variables available. It is always a goal of IPUMSI to provide users with as much detail as possible for each sample, but this is sometimes hindered by a lack of sufficient data or detail. Some countries, such as Bangladesh and Mali, contain sufficient detail to provide lower level geography for all available samples in IPUMSI. More recent samples often contain more detail and more thorough documentation, whereas oftentimes this level of information is not present for samples produced longer ago.

Continue reading…

An Introduction to the IPUMS Extract API for Microdata

By Renae Rodgers

Have you heard the news?! The IPUMS Extract API now supports microdata! For users who have been clamoring for this feature for some time, feel free to skip to the final section for resources to get started. For our users who haven’t been awaiting this announcement with bated breath, and who may be saying to themselves, “ok…great…but…”


via GIPHY

This blog post will give a brief introduction to APIs, give some examples of ways to use the IPUMS Extract API in your workflow, and share some more in-depth resources.

What is an API?

API stands for Application Programming Interface. An API is an intermediate layer between a user and a server that allows the user to interact programmatically with another program or a service. First, the user’s program talks to the API – this is known as making an API call or a request. The API, in turn, talks to the server, translating the user’s request into something the server can understand. The server returns the requested information to the API, and the API then returns that information to the user. For example, Google Maps has an API that allows developers to request and retrieve information from Google Maps from within their applications, without needing to go through a web interface.

At this point you may be thinking, “great, now I have a general idea of what an API is, but I am not a software developer so… thanks anyway.”


via GIPHY

The IPUMS Extract API opens up many possibilities for easing collaboration, and creating efficient workflows with only a few simple lines of code. Please read on!

Continue reading…