Accessing IPUMS NHGIS in R: A Primer

By Finn Roberts & Jonathan Schroeder

R users have a powerful new way to access IPUMS NHGIS!

The July 2023 release of ipumsr 0.6.0 includes a fully featured set of client tools enabling R users to get NHGIS data and metadata via the IPUMS API. Without leaving their R environment, users can find, request, download, and read in U.S. census summary tables, geographic time series, and GIS mapping files for years from 1790 through the present. This blog post gives an overview of the possibilities and describes how to get started.

What you can do with ipumsr

Request and download NHGIS data

You can use ipumsr to specify the parameters of an NHGIS data extract request and submit that request for processing by the IPUMS servers. You can request any of the data products that are available through the NHGIS Data Finder: summary tables, time series tables, and shapefiles. You can also specify general formatting parameters (e.g., file format or time series table layout) to customize the structure of your data extract.

Once you have specified a data extract, you can use a series of ipumsr functions to:

  • submit the extract request to the IPUMS servers for processing
  • check on the extract status
  • wait for the extract to complete
  • download the extract as soon as it’s ready
  • load the data into R with detailed data field descriptions.

This workflow allows you to go from a set of abstract NHGIS data specifications to analyzable data, all without having to leave your R session!
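
The submit, check, wait, download, and load steps above follow a common pattern for asynchronous data requests. The sketch below models that pattern in Python with an in-memory stand-in for the server. Everything in it (FakeExtractService, wait_for_extract, run_workflow, and the fields of the request specification) is hypothetical and invented for illustration; none of it is the ipumsr or IPUMS API.

```python
import time
from dataclasses import dataclass, field


@dataclass
class FakeExtractService:
    """Hypothetical in-memory stand-in for a remote extract server."""

    # Number of status checks that report "started" before the extract completes.
    polls_until_ready: int = 2
    _requests: dict = field(default_factory=dict)

    def submit(self, spec):
        """Register an extract request and return its extract number."""
        number = len(self._requests) + 1
        self._requests[number] = {"spec": spec, "polls": 0}
        return number

    def status(self, number):
        """Report "started" until enough checks have been made, then "completed"."""
        request = self._requests[number]
        request["polls"] += 1
        if request["polls"] > self.polls_until_ready:
            return "completed"
        return "started"

    def download(self, number):
        """Return one placeholder record per requested table."""
        request = self._requests[number]
        if request["polls"] <= self.polls_until_ready:
            raise RuntimeError("extract is not ready yet")
        spec = request["spec"]
        return [{"table": t, "geog_level": spec["geog_level"]} for t in spec["tables"]]


def wait_for_extract(service, number, interval=0.0, max_polls=10):
    """Check the status up to max_polls times; return True once it is completed."""
    for _ in range(max_polls):
        if service.status(number) == "completed":
            return True
        time.sleep(interval)
    return False


def run_workflow(service, spec):
    """Submit a request, wait for it to complete, then download the result."""
    number = service.submit(spec)
    if not wait_for_extract(service, number):
        raise TimeoutError(f"extract {number} did not complete in time")
    return service.download(number)
```

Separating the waiting step from the download step mirrors the list above: a caller can check status once and move on, or block until the extract is ready, without changing how the request was defined.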

Continue reading…

IPUMS IHGIS: Unlocking International Population and Agricultural Census Data

By Tracy Kugler

Nearly all countries throughout the world conduct population and housing censuses at least every ten years, and most also conduct agricultural censuses or surveys regularly. These censuses collect information on demographics, education, employment, housing characteristics, migration, agricultural land ownership, agricultural workforce, livestock, crops, and more. The resulting data can be used to study a wide range of questions, from the character of demographic transitions within and across countries, to utilization of irrigation, to educational trends among women. 

Unfortunately, this wealth of data has remained largely inaccessible to researchers. The data are typically published in reports as tables summarizing population characteristics. In recent decades, many of these reports have been published as PDF documents and made available on national statistical office websites. While the reports are available, data from a PDF document cannot be easily imported into a statistical or GIS package. Furthermore, the table structures are highly heterogeneous, both across countries and even within the same report.

The International Historical Geographic Information System (IPUMS IHGIS) is designed to provide access to these data in a form that researchers can readily use for analysis. In the early phases, IHGIS was known internally as “Project Mako,” named after the mako shark, which has a global range, a voracious appetite, and a reputation for a broad-ranging diet. Like the shark, IHGIS (née Project Mako) will encompass the world and ingest all kinds of data tables.

Continue reading…

In the Archive: “25 Years of IPUMS Data”

“25 Years of IPUMS Data,” the current IPUMS/MPC archive exhibit, highlights a dynamic quarter-century history of data innovation at the University of Minnesota. In the late 1980s, the Social History Research Laboratory at the University of Minnesota’s History Department proposed “the creation of a single integrated microdata series composed of public use samples for every year … with the exception of the 1890 census, which was destroyed by fire.” The primary aim was to make the U.S. census microdata “as compatible over time as possible while losing little, if any, of the detail in the original datasets” (Integrated Public Use Microdata Series: A Prospectus).

Continue reading…

New IPUMS CPS Variable Links Records Across Data Files

 

IPUMS CPS has added a new variable, CPSIDP, that unlocks longitudinal information in CPS data. The identifier, the result of a major initiative from the Minnesota Population Center, uniquely identifies individuals in the CPS and is assigned to each individual at each of the (up to) eight times over sixteen months that the individual is in the survey.

Continue reading…

Learning from Data: From Systemization to Investigation

Benjamin Hartman is a 2016 Summer Diversity Fellow at the MPC. As part of his fellowship, he learned how to turn unprocessed data into harmonized IPUMS-I data and documentation, make GIS maps, and conduct his own case study investigating the spatial dimensions of internal migration using the Cambodian census. Hartman worked with his colleagues in IPUMS International to create this blog post.

Continue reading…

A History of Data: Information Technology and the MPC

Bethel University Professor of History Diana L. Magnuson is documenting the growth of the Minnesota Population Center. Believing that preserving institutional memory is vital, the Center is supporting Magnuson’s work to capture oral histories of past and present MPC faculty and staff.

This is the second in a three-part series, with oral histories from the information technology (IT) side of the MPC. For over 16 years, the IT staff has collaborated with the MPC research staff to recode and disseminate data, develop specialized software, and make research more efficient. The “secret sauce of the MPC” is the longstanding synergistic collaboration between IT and research staff.

Continue reading…