Reproducible Research with R Markdown, ipumsr, and the IPUMS API

By Dan Ehrlich

Have you ever wanted to share a project using IPUMS data with a colleague, but then thought, “Oh wait! It is against the terms of use to redistribute my IPUMS data file!”

Maybe you’d like a colleague to explore your findings. Or maybe you’re a teacher with an exercise you’d like your students to review and replicate. In the past, if you wanted someone to use the same IPUMS data that you did, you would need to provide a list of samples and variables and instructions for your collaborator on how to navigate the online data extract system.

If you’re thinking that sounds like a pain, don’t worry: the brand-new IPUMS microdata API makes it easier than ever to share your extract definitions with fellow IPUMS users! Using the microdata API, you and your collaborators can:

  • Save an extract definition as a .json file that can be shared freely
  • Submit a new extract request based on a .json definition
  • Download data and metadata directly into your project directory (this feature is a personal favorite)
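To make the first two bullets concrete, here is a minimal sketch of saving an extract definition to a .json file and reading it back. It uses plain Python rather than ipumsr, and the field names in `extract_definition` are illustrative assumptions, not the exact schema the IPUMS API expects.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical extract definition; the keys are illustrative only and
# may not match the schema used by the IPUMS API.
extract_definition = {
    "description": "Example extract shared with a collaborator",
    "collection": "usa",
    "samples": ["us2019a"],
    "variables": ["AGE", "SEX"],
}


def save_definition(definition: dict, path: Path) -> None:
    """Write an extract definition to a JSON file that can be shared freely."""
    path.write_text(json.dumps(definition, indent=2), encoding="utf-8")


def load_definition(path: Path) -> dict:
    """Read an extract definition back from a shared JSON file."""
    return json.loads(path.read_text(encoding="utf-8"))


# Round trip through a temporary directory to show nothing is lost.
with tempfile.TemporaryDirectory() as tmp_dir:
    definition_path = Path(tmp_dir) / "my_extract.json"
    save_definition(extract_definition, definition_path)
    reloaded = load_definition(definition_path)
```

The file holds only the recipe for the extract (samples and variables), not the data itself, which is why it can be passed around without running into the redistribution terms.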

Continue reading…

Making IPUMS extracts from Stata

By Renae Rodgers

IPUMS has released a beta version of the Extract API that supports the IPUMS USA and IPUMS CPS microdata collections! Check out An Introduction to the IPUMS Extract API for Microdata for a brief introduction to the IPUMS Extract API for microdata. This blog post will demonstrate how to leverage the IPUMS Extract API and the ipumspy Python library to make IPUMS extracts directly from your Stata .do files! No prior knowledge of or interest in learning Python is required, but you will need Stata 16 or higher, an IPUMS user account, and an API key. All the code in the examples below is available in a template .do file if you would like to follow along.
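For orientation, the Python side of this workflow might look roughly like the sketch below. It assumes ipumspy is installed and that the `IpumsApiClient` and `UsaExtract` names and the submit/wait/download methods match your installed release; these may differ between versions, so check the ipumspy documentation. The function name, sample ID, and variable list are placeholders.

```python
def request_usa_extract(api_key, samples, variables, download_dir):
    """Submit an IPUMS USA extract, wait for it to finish, and download it.

    Assumption: the ipumspy names used here exist in your installed
    version. The import is deferred so that this function can be defined
    even on a machine where ipumspy is not installed yet.
    """
    from ipumspy import IpumsApiClient, UsaExtract

    client = IpumsApiClient(api_key)           # authenticate with your API key
    extract = UsaExtract(samples, variables)   # define samples and variables
    client.submit_extract(extract)             # send the request to IPUMS
    client.wait_for_extract(extract)           # block until the extract is ready
    client.download_extract(extract, download_dir=download_dir)
    return extract


# Example call (requires a real API key and network access):
# request_usa_extract("YOUR_API_KEY", ["us2019a"], ["AGE", "SEX"], "data")
```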

Setting up Python for use with Stata

While you don’t need to be a Python user to make an IPUMS extract via Stata, there is a little bit of Python setup required. Chuck Huber at Stata has put together a great series of blog posts about how to use Python in Stata, and this section is heavily inspired by the first post in that series.

Step 1: Download and install Miniconda

Even if you already have Python installed on your computer, I highly recommend setting up a separate Python installation in a conda environment for your ipumspy-in-Stata work. Miniconda is a lightweight version of the Anaconda distribution that includes the conda package manager. It will allow you to install Python, ipumspy, and all of the necessary dependencies in a separate environment that you can access from Stata without disturbing anything else on your system that might be using Python. Download Miniconda for your operating system and install it.

Continue reading…

An Introduction to the IPUMS Extract API for Microdata

By Renae Rodgers

Have you heard the news?! The IPUMS Extract API now supports microdata! For users who have been clamoring for this feature for some time, feel free to skip to the final section for resources to get started. For our users who haven’t been awaiting this announcement with bated breath, and who may be saying to themselves, “ok…great…but…”, this blog post will give a brief introduction to APIs, give some examples of ways to use the IPUMS Extract API in your workflow, and share some more in-depth resources.

What is an API?

API stands for Application Programming Interface. An API is an intermediate layer between a user and a server that allows the user to interact programmatically with another program or a service. First, the user’s program talks to the API – this is known as making an API call or a request. The API, in turn, talks to the server, translating the user’s request into something the server can understand. The server returns the requested information to the API, and the API then returns that information to the user. For example, Google Maps has an API that allows developers to request and retrieve information from Google Maps from within their applications, without needing to go through a web interface.
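As a rough illustration of what “making an API call” means, the sketch below builds (but does not send) an HTTP request using Python’s standard library. The endpoint URL, query parameters, and use of an `Authorization` header are assumptions for illustration; consult the IPUMS API documentation for the actual values.

```python
from urllib.request import Request

# Assumed endpoint and query parameters, for illustration only.
API_URL = "https://api.ipums.org/extracts?collection=usa&version=2"


def build_extract_list_request(api_key: str) -> Request:
    """Build a GET request asking the API for a user's recent extracts.

    The request is only constructed here, not sent; passing it to
    urllib.request.urlopen() would make the actual API call.
    """
    return Request(API_URL, headers={"Authorization": api_key}, method="GET")


request = build_extract_list_request("my-placeholder-key")
print(request.get_method(), request.full_url)
```

The request object bundles everything the API needs: where to send the call, what kind of call it is, and who is asking. The server’s reply would come back as structured text (typically JSON) that your program can read.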

At this point you may be thinking, “great, now I have a general idea of what an API is, but I am not a software developer so… thanks anyway.”


The IPUMS Extract API opens up many possibilities for easing collaboration and creating efficient workflows with only a few simple lines of code. Please read on!

Continue reading…

IPUMS International 2022 Data Release

By Jane Lyon Lee, IPUMS International

IPUMS International has added seven new census samples and new labor force surveys, including the first-time data release from the Slovak Republic and historical samples from Egypt (1848 and 1868). The other newly added samples extend pre-existing series. The growing IPUMSI labor force survey collection has expanded with the addition of quarterly surveys from Mexico (ENOE 2005-2020) and more data from Spain and Italy. See a summary of the full IPUMS collection on the IPUMSI samples page.

Continue reading…

How the COVID-19 Pandemic Impacted the 2020 ACS 1-year PUMS Data

By Danika Brockman and Megan Schouweiler

One of the highlights of the past IPUMS USA release was the 2020 ACS 1-year Public Use Microdata Sample (PUMS) file. Due to the effects of the COVID-19 pandemic on 2020 ACS data collection and data quality, the Census Bureau did not release the standard PUMS data. Instead, they released the 2020 ACS 1-year data with experimental weights designed to account for the impact of the COVID-19 pandemic on data quality. In this blog post, we discuss the impact of the COVID-19 pandemic on the 2020 ACS and the development of the experimental weights, and we provide some recommendations for using the 2020 ACS 1-year PUMS file.

Impact of the COVID-19 Pandemic on Data Collection and Data Quality and the Development of the Experimental Weights

The COVID-19 pandemic severely disrupted data collection for the 2020 ACS. All methods of data collection were either shut down or significantly reduced from March 2020 through the end of the year. Data collection for group quarters was particularly disrupted by the COVID-19 pandemic; in-person visits to group quarters facilities were suspended or greatly reduced from March 2020 through the end of the year, and telephone interviews were not conducted due to logistical constraints. Beyond the impacts to data collection methods, the 2020 ACS had significant variability across the 2020 data collection year in response rates for both housing units and group quarters, and had the lowest overall response rate in the history of the ACS.¹

Continue reading…

IPUMS FAQs: How do the original occupation and industry codes map onto harmonized versions created by IPUMS?

By Kari Williams

As part of the IPUMS mission to democratize data, our user support team strives to answer your questions about the data. Over time, some questions are repeated. This blog post is an extension of an earlier series addressing frequently asked questions. Maybe you’ll learn something. Perhaps you’ll just find the information interesting. Regardless, we hope you enjoy it!

Here’s one of those questions:

How do the original occupation and industry codes map onto harmonized versions created by IPUMS?

Continue reading…