Unlocking Spatial and Social Data with R: Introducing the R Spatial Notebook Series

By Kate Vavra-Musser

Introduction: What is the R Spatial Notebooks Project?

The R Spatial Notebooks Project is a series of R code notebooks, structured like a textbook, that guides users through data extraction, integration, cleaning, analysis, and visualization in R. The notebooks are tailored for social science research and applications that use spatial data, and the modular structure lets users build skills comprehensively by working through sequences of notebooks. The project was developed through a partnership between the Institute for Social Research and Data Innovation (ISRDI), which houses IPUMS, and the Institute for Geospatial Understanding through an Integrated Discovery Environment (I-GUIDE). IPUMS provides census and survey data from around the world, integrated across time and space. I-GUIDE provides cyberinfrastructure that combines distributed geospatial data with computing for researchers, students, and policymakers.

The initial R Spatial Notebooks release includes roughly 20 freely available notebooks on topics including IPUMS data extraction via API, accessing open-source data, data cleaning, foundational spatial data principles, exploratory data analysis, and mapping.

Why Was This Series Created?

Throughout my career in public health, environmental health, and climate impacts, I have repeatedly run into the same challenge: the silos separating social science domain experts from technical specialists. I have met many social scientists who wish they had more experience with big data and coding, skills that could push their research in novel directions. On the other hand, many coders have the skills to manipulate large datasets but lack the contextual knowledge to apply those skills meaningfully within the social sciences.

The R Spatial Notebooks Project came from a desire to help social scientists expand their research toolkit by building R coding skills specifically geared toward social science research and data workflows. By equipping social scientists with the necessary technical skills in R, I hope these notebooks can help users expand their approach to social science research.

The project aims to help social scientists improve their technical fluency, and to help coders and analysts understand the unique challenges of social science data so they can contribute more effectively to social science research. I hope that all members of a research team, from domain experts to analysts, can de-silo their skill sets, improve cross-discipline communication, and push into research spaces previously unavailable to them.

Who is This For? Why Should You Use It?

Although I originally designed these notebooks for public health, demography, and environmental health researchers interested in enhancing their R coding skills, they are a resource for anyone interested in expanding their R skills (including R newbies!) or applying those skills to high-quality data sources relevant to a broad range of topics. The project emphasizes a few key user goals:

  • Get into IPUMS: Learn to leverage the extensive IPUMS data repositories for your research.
  • Turbocharge Your IPUMS Experience: Utilize ipumsr with the IPUMS API for direct data extraction to your workspace – great for easy extraction modification and batch processing.
  • Level Yourself Up: Develop some of the skills I’ve most often heard my colleagues say they wish they knew more about, including automated workflow development, API data extraction (IPUMS and others), data cleaning, and more.
  • Think Outside the Desktop GIS: Learn the skills unique to using spatial data and applying spatial processes in R.

How Do I Learn More?

Learn more about the project on the R Spatial Notebooks Project Page, which includes links to all notebooks, an overview of the chapter structure, and additional background information on the project. Join the R Spatial Series mailing list to be the first to receive updates, new notebook release notifications, and opportunities to learn more.

Join me on April 9 at 11:00 am CT for a webinar exploring IPUMS USA workflows using the R Spatial Notebooks.

This hands-on session will guide you through the process of extracting and managing IPUMS USA data in R, providing practical skills you can immediately apply to your research. Notebook demonstrations will be presented on the I-GUIDE Platform – attendees who are interested in coding along are encouraged to set up an account and log into the I-GUIDE Platform prior to the session. Whether you’re new to R or looking to deepen your expertise, this webinar is designed to provide valuable insights for all skill levels.

Register for the webinar here.