Accessing IPUMS NHGIS in R: A Primer

By Finn Roberts & Jonathan Schroeder

R users have a powerful new way to access IPUMS NHGIS!

The July 2023 release of ipumsr 0.6.0 includes a fully featured set of client tools enabling R users to get NHGIS data and metadata via the IPUMS API. Without leaving their R environment, users can find, request, download, and read in U.S. census summary tables, geographic time series, and GIS mapping files for years from 1790 through the present. This blog post gives an overview of the possibilities and describes how to get started.

What you can do with ipumsr

Request and download NHGIS data

You can use ipumsr to specify the parameters of an NHGIS data extract request and submit that request for processing by the IPUMS servers. You can request any of the data products that are available through the NHGIS Data Finder: summary tables, time series tables, and shapefiles. You can also specify general formatting parameters (e.g., file format or time series table layout) to customize the structure of your data extract.

Once you have specified a data extract, you can use a series of ipumsr functions to:

  • submit the extract request to the IPUMS servers for processing
  • check on the extract status
  • wait for the extract to complete
  • download the extract as soon as it’s ready
  • load the data into R with detailed data field descriptions.

This workflow allows you to go from a set of abstract NHGIS data specifications to analyzable data, all without having to leave your R session!
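
As a rough illustration of the steps listed above, here is a minimal sketch that chains them together. It assumes an IPUMS API key has already been registered (for example with set_ipums_api_key()) and that `nhgis_extract` is an extract definition created beforehand with define_extract_nhgis(). The wrapper name `run_nhgis_extract` is our own, not part of ipumsr, and the file-name pattern used to pick out the tabular data is an assumption about how NHGIS names its downloads.

```r
library(ipumsr)

# Hypothetical helper (not an ipumsr function): runs an already-defined
# NHGIS extract from submission through to loading the data.
run_nhgis_extract <- function(nhgis_extract, download_dir = tempdir()) {
  # Submit the extract request to the IPUMS servers for processing
  submitted <- submit_extract(nhgis_extract)

  # Check on the extract status once, just to report it
  if (!is_extract_ready(submitted)) {
    message("Extract submitted; waiting for IPUMS to finish processing...")
  }

  # Wait (polling the API) until the extract is complete
  completed <- wait_for_extract(submitted)

  # Download the extract files; an NHGIS extract may include both a
  # tabular data archive and a shapefile archive
  files <- download_extract(completed, download_dir = download_dir)

  # Assumption: shapefile archives end in "_shape.zip", so anything else
  # is treated as the tabular data archive
  data_file <- files[!grepl("_shape\\.zip$", files)][1]

  # Load the data into R, with field descriptions attached as labels
  read_nhgis(data_file)
}
```

Calling `run_nhgis_extract(nhgis_extract)` would then return a data frame if the extract holds a single data file. Extracts that contain several files may need an extra argument to read_nhgis() to choose among them.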

Continue reading…

IPUMS IHGIS: Unlocking International Population and Agricultural Census Data

By Tracy Kugler

Nearly all countries throughout the world conduct population and housing censuses at least every ten years, and most also conduct agricultural censuses or surveys regularly. These censuses collect information on demographics, education, employment, housing characteristics, migration, agricultural land ownership, agricultural workforce, livestock, crops, and more. The resulting data can be used to study a wide range of questions, from the character of demographic transitions within and across countries, to utilization of irrigation, to educational trends among women. 

Unfortunately, this wealth of data has remained largely inaccessible to researchers. The data are typically published in reports as tables summarizing population characteristics. In recent decades, many of these reports have been published as PDF documents and made available on national statistical office websites. While the reports are available, data from a PDF document cannot be easily imported into a statistical or GIS package. Furthermore, the table structures are highly heterogeneous, both across countries and even within the same report.

The International Historical Geographic Information System (IPUMS IHGIS) is designed to give researchers access to these data in a form they can readily use for analysis. In the early phases, IHGIS was known internally as “Project Mako,” named after the Mako shark, which has a global range, voracious appetite, and a reputation for a broad-ranging diet. Like the shark, IHGIS (née Project Mako) will encompass the world and ingest all kinds of data tables.

Continue reading…

In the Archive: “25 Years of IPUMS Data”

“25 Years of IPUMS Data,” the current IPUMS/MPC archive exhibit, highlights a dynamic quarter-century history of data innovation at the University of Minnesota. In the late 1980s, the Social History Research Laboratory at the University of Minnesota’s History Department proposed “the creation of a single integrated microdata series composed of public use samples for every year … with the exception of the 1890 census, which was destroyed by fire.” The primary aim was to make the U.S. census microdata “as compatible over time as possible while losing little, if any, of the detail in the original datasets” (Integrated Public Use Microdata Series: A Prospectus).

Continue reading…