Reproducible Research with R Markdown, ipumsr, and the IPUMS API

By Dan Ehrlich

Have you ever wanted to share a project using IPUMS data with a colleague, but then thought, “Oh wait! It is against the terms of use to redistribute my IPUMS data file!”

Maybe you’d like a colleague to explore your findings. Or maybe you’re a teacher with an exercise you’d like your students to review and replicate. In the past, if you wanted someone to use the same IPUMS data that you did, you would need to provide a list of samples and variables and instructions for your collaborator on how to navigate the online data extract system.

If you’re thinking that sounds like a pain, don’t worry: the brand new IPUMS microdata API makes it easier than ever to share your extract definitions with fellow IPUMS users! Using the microdata API, you and your collaborators can:

  • Save an extract definition as a .json file that can be shared freely
  • Submit a new extract request based on a .json definition
  • Download data and metadata directly into your project directory (this feature is a personal favorite)

Continue reading…

Sharing IPUMS Extract Definitions Using ipumspy

By Renae Rodgers

What is an Extract?

IPUMS users will already be familiar with the concept of an extract, but for those who may just be joining us, we’ll do a brief recap. Public use data files are often large, unwieldy blocks of data, many variables wide and many, many records long. Most analyses require only a small subset of the available variables in any given dataset, but downloading public data from government agencies is an all-or-nothing endeavor. In addition to offering public use data that is harmonized across time and place, IPUMS allows users to choose only their variables of interest for download. These individualized datasets and accompanying metadata are IPUMS extracts.

What is an Extract Definition?

In short, an IPUMS extract definition is all the information needed to create a user’s personalized extract data file and accompanying metadata – everything short of those files themselves.

An IPUMS extract is defined by:

  1. The name of the IPUMS collection (e.g. “usa”, “cps”)
  2. A list of sample names or IDs (to be) included in the extract file
  3. A list of variable names (to be) included in the extract file
  4. An extract description (e.g. “2022 ACS demographic variables”)

IPUMS users build these extract definitions piece by piece when they create an extract through the IPUMS website, selecting samples, variables, and formats.
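To make the four parts of a definition concrete, here is a minimal sketch that stores them in a small Python class and round-trips them through JSON text. It uses only the standard library. The class name `ExtractDefinition`, its field names, and the sample ID and variable names are illustrative assumptions, not the schema that the IPUMS API or ipumspy actually uses.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class ExtractDefinition:
    """Hypothetical container for the four parts of an extract definition."""

    collection: str          # e.g. "usa" or "cps"
    samples: List[str]       # sample names or IDs
    variables: List[str]     # variable names
    description: str = ""    # free-text description

    def to_json(self) -> str:
        """Serialize the definition to JSON text that can be shared."""
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_json(cls, text: str) -> "ExtractDefinition":
        """Rebuild a definition from JSON text produced by to_json()."""
        return cls(**json.loads(text))


# Placeholder sample ID and variable names, chosen for illustration only.
definition = ExtractDefinition(
    collection="usa",
    samples=["us2022a"],
    variables=["AGE", "SEX", "RACE"],
    description="2022 ACS demographic variables",
)

shared_text = definition.to_json()                   # what you would send a colleague
restored = ExtractDefinition.from_json(shared_text)  # what they would load
```

Note that the definition contains no data records at all, which is why it can be passed around freely while the extract file itself cannot.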

Continue reading…

Making IPUMS extracts from Stata

By Renae Rodgers

IPUMS has released a beta version of the Extract API that supports the IPUMS USA and IPUMS CPS microdata collections! Check out An Introduction to the IPUMS Extract API for Microdata for a brief introduction to the IPUMS Extract API for microdata. This blog post will demonstrate how to leverage the IPUMS Extract API and the ipumspy Python library to make IPUMS extracts directly from your Stata .do files! No prior knowledge of or interest in learning Python is required, but you will need Stata 16 or higher, an IPUMS user account, and an API key. All the code in the examples below is available in a template .do file if you would like to follow along.

Setting up Python for use with Stata

While you don’t need to be a Python user to make an IPUMS extract via Stata, there is a little bit of Python setup required. Chuck Huber at Stata has put together a great series of blog posts about how to use Python in Stata, and this section is heavily inspired by the first post in that series.

Step 1: Download and install Miniconda

Even if you already have Python installed on your computer, I highly recommend setting up a separate Python installation in a conda environment for your ipumspy-in-Stata work. Miniconda is a lightweight version of the package manager Anaconda that will allow you to install Python, ipumspy, and all of the necessary dependencies in a separate environment that you can access from Stata without disturbing anything else on your system that might be using Python. Download Miniconda for your operating system and install it.

Continue reading…

An Introduction to the IPUMS Extract API for Microdata

By Renae Rodgers

Have you heard the news?! The IPUMS Extract API now supports microdata! If you have been clamoring for this feature for some time, feel free to skip to the final section for resources to get started. If you haven’t been awaiting this announcement with bated breath, and may be saying to yourself, “ok…great…but…”, this blog post will give a brief introduction to APIs, give some examples of ways to use the IPUMS Extract API in your workflow, and share some more in-depth resources.

What is an API?

API stands for Application Programming Interface. An API is an intermediate layer between a user and a server that allows the user to interact programmatically with another program or a service. First, the user’s program talks to the API – this is known as making an API call or a request. The API, in turn, talks to the server, translating the user’s request into something the server can understand. The server returns the requested information to the API, and the API then returns that information to the user. For example, Google Maps has an API that allows developers to request and retrieve information from Google Maps from within their applications, without needing to go through a web interface.
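The request-and-response flow described above can be sketched as a toy, in-process example. Nothing here touches a network or any real service: `server_lookup`, `api_call`, the request format, and the stored coordinates are all hypothetical stand-ins that only show the API translating between a user's request and what the server understands.

```python
# Hypothetical stand-in for data held by a server (approximate values).
_SERVER_DATA = {"minneapolis": {"lat": 44.98, "lon": -93.27}}


def server_lookup(key):
    """The 'server': only understands exact, lowercase keys."""
    return _SERVER_DATA.get(key)


def api_call(request):
    """The 'API': translate the request, ask the server, wrap the answer."""
    # Translate the user's loosely formatted request into the server's terms.
    key = request.get("place", "").strip().lower()
    result = server_lookup(key)
    # Return the server's answer in a predictable shape for the user's program.
    if result is None:
        return {"status": 404, "body": None}
    return {"status": 200, "body": result}


# The user's program makes a "call" without knowing how the server stores data.
response = api_call({"place": " Minneapolis "})
print(response["status"], response["body"])
```

A real web API works the same way in outline, except that the request and response travel over HTTP rather than through a function call.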

At this point you may be thinking, “great, now I have a general idea of what an API is, but I am not a software developer so… thanks anyway.”

The IPUMS Extract API opens up many possibilities for easing collaboration and creating efficient workflows with only a few simple lines of code. Please read on!

Continue reading…

2022 Data-Intensive Research Conference

By Kari Williams

Registration is now open for the 2022 Data-Intensive Research Conference, being held July 20-21 in person in Minneapolis, MN and online. This event is sponsored by the Network for Data-Intensive Research on Aging (NDIRA), part of the University of Minnesota’s Life Course Center (which is co-located with IPUMS).

The conference will feature research presentations and discussions focused on the 2022 theme: Contextualizing Work and Health across the Life Course. The program includes research that leverages large-scale population data (including some #poweredbyIPUMS work) to explore these concepts. Here’s a sneak peek at the conference program. We are excited for research that digs into many different aspects of work (paid work, caregiving, labor market transitions) and health (health behaviors, health conditions, disability, wellbeing) with a life course perspective.

Continue reading…

IPUMS International 2022 Data Release

By Jane Lyon Lee, IPUMS International

IPUMS International has added 7 new census samples and new labor force surveys, including the first-time data release from the Slovak Republic and historical samples from Egypt 1848 and 1868. The other newly added samples extend pre-existing series. The growing IPUMSI labor force survey collection has expanded with the addition of quarterly surveys from Mexico (ENOE 2005-2020) and more data from Spain and Italy. See a summary of the full IPUMS collection on the IPUMSI samples page.

Continue reading…