The IPUMS spatial team is excited to introduce two new products that expand the ways you can access NHGIS data. IPUMS GeoMarker enables you to easily attach contextual characteristics from ACS data to address or point data, and the first public IPUMS API provides programmatic access to NHGIS data and metadata. Both products officially moved out of beta in December 2019.
Let’s say you have conducted a survey or other study that included collecting the addresses of participants, and you want to know about the characteristics of the neighborhood in which each participant lives. GeoMarker is the tool you need! Using GeoMarker, you can upload addresses or latitude/longitude point locations and select from a series of neighborhood characteristics, such as proportion unemployed or persons per square kilometer. GeoMarker determines which census tract each of your locations falls in, and attaches tract identifiers and the characteristics you selected.
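Conceptually, matching a location to a census tract is a point-in-polygon test. The Python sketch below illustrates the idea with a simple ray-casting check. It is not GeoMarker's actual implementation, and the tract IDs and square "tract" boundaries are invented for illustration.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: return True if (x, y) lies inside the polygon.

    polygon is a list of (x, y) vertices. Points that fall exactly on an
    edge are not handled specially in this sketch.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's y value can be crossed
        # by a horizontal ray cast to the right of the point.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def assign_tract(x, y, tracts):
    """Return the ID of the first tract containing the point, or None."""
    for tract_id, polygon in tracts.items():
        if point_in_polygon(x, y, polygon):
            return tract_id
    return None


# Hypothetical tract IDs with toy square boundaries (not real geography).
TRACTS = {
    "tract_A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "tract_B": [(1, 0), (2, 0), (2, 1), (1, 1)],
}
```

Calling `assign_tract(0.5, 0.5, TRACTS)` returns the ID of the toy tract that contains the point, and a point outside every polygon yields `None`.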
The initial version of GeoMarker delivers 2017 American Community Survey 5-year data for census tracts for ten commonly used contextual variables:
- Proportion unemployed
- Proportion population in poverty
- Median household income
- Income inequality
- Proportion family households headed by a single woman
- Proportion occupied housing units that are owner-occupied
- Proportion African American
- Proportion of adults who completed high school
- Persons per square kilometer
- Housing units per square kilometer
Future versions may include additional variables, data years, and/or geographic levels. As we consider enhancements, we want to serve your needs. Please let us know which enhancements are most important to you by emailing firstname.lastname@example.org!
GeoMarker is housed in the University of Minnesota Academic Health Center’s Secure Data Enclave and is designed to meet University standards regarding HIPAA compliance. This means that you can use GeoMarker for health-related data, such as addresses of participants in a clinical trial. (As a good practice, you should remove any other identifying or health data and upload only locations with an ID you can use to link to the rest of your data.)
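One way to follow that practice is to build the upload file from a stripped-down copy of your data. The sketch below keeps only a linking ID and the location field. The column names (`participant_id`, `address`, and so on) and the sample records are hypothetical, and the file layout GeoMarker expects is not described here, so treat this only as an illustration of the de-identification step.

```python
import csv
import io


def make_upload_rows(records, id_field, location_fields):
    """Keep only the linking ID and the location fields from each record."""
    keep = [id_field] + list(location_fields)
    return [{key: rec[key] for key in keep} for rec in records]


def to_csv(rows, fieldnames):
    """Serialize the rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Invented study records; column names are assumptions for illustration.
RECORDS = [
    {"participant_id": "P001", "name": "Jane Doe",
     "diagnosis": "asthma", "address": "100 Main St, Springfield"},
    {"participant_id": "P002", "name": "John Roe",
     "diagnosis": "none", "address": "200 Oak Ave, Springfield"},
]

upload_rows = make_upload_rows(RECORDS, "participant_id", ["address"])
upload_csv = to_csv(upload_rows, ["participant_id", "address"])
```

The resulting `upload_csv` text contains only the ID and address columns, so the returned tract characteristics can be joined back to the full dataset on `participant_id` after download.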
Do you have a favorite NHGIS extract that you update regularly, or want to run a series of similar extracts? Our new NHGIS Metadata and Data Extract APIs allow you to access NHGIS data programmatically, without having to click through the user interface. The APIs provide access to all the data and functionality offered on the regular website, including dataset and table metadata, source tables, time series tables, shapefiles, and more.
Our new IPUMS Developer Portal provides everything you need to get started, from instructions for getting your API key to sample code in Python, R, and curl. Using the API involves three major steps: browsing metadata, constructing an extract request, and submitting and retrieving your extract.
- Browsing metadata – You can use API calls to retrieve information about the datasets available in NHGIS, the tables available in a specified dataset, and details about a specified table, including the variables it includes. Metadata are also available for time series tables and shapefiles. You will need pieces of information from the metadata to construct your extract request.
- Constructing an extract request – Extract requests to be submitted via API are structured as JSON-formatted text. The most basic extract request would specify a geographic level for a table from a dataset. Of course, you can include multiple geographic levels, tables, and datasets. You can also request shapefiles and time series tables and specify more specialized parameters such as breakdown values, data format and layout, years, and geographic extent.
- Submitting and retrieving your extract – You will submit your JSON-formatted extract request via an API POST. You can then monitor the status of your request via API. When your extract is completed, you can use an API GET to obtain URLs from which you can download your codebook, data table(s), and shapefile(s).
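The last two steps can be sketched in Python as follows. This is a minimal illustration, not the official client code: the JSON field names (`datasets`, `data_tables`, `geog_levels`, `description`), the `"completed"` status string, and the placeholder dataset and table codes are assumptions, and the actual POST and GET calls are left to a caller-supplied function. The IPUMS Developer Portal documents the real request schema and endpoints.

```python
import json
import time


def build_extract_request(dataset, data_tables, geog_levels, description=""):
    """Build a basic extract definition: tables and geographic levels
    for one dataset. Field names are assumed, not taken from the API docs."""
    return {
        "description": description,
        "datasets": {
            dataset: {
                "data_tables": list(data_tables),
                "geog_levels": list(geog_levels),
            }
        },
    }


def wait_for_extract(get_status, poll_seconds=5.0, max_polls=100):
    """Call get_status() repeatedly until it reports 'completed'.

    get_status is a hypothetical callable that would wrap the API request
    for the extract's status and return it as a string.
    """
    for _ in range(max_polls):
        status = get_status()
        if status == "completed":
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("extract did not complete within the polling limit")


# Placeholder codes; look up real ones through the metadata API.
request = build_extract_request(
    "DATASET_CODE", ["TABLE_CODE"], ["tract"], description="demo extract"
)
request_body = json.dumps(request, indent=2)  # text to send in the POST
```

Keeping the status check behind a plain callable means the polling logic can be exercised without network access and later pointed at whatever HTTP library you prefer.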
So, what might you want to do with the NHGIS API? Here are a few of the many possibilities:
- Manage a regularly updated extract – If you run a similar extract each time new 1-year or 5-year ACS data are released, you could construct the JSON definition for the current version of that extract. The next time new data are released, you could use the metadata API to update dataset and table codes in the JSON definition as necessary, then resubmit the extract.
- Share extract definitions with colleagues – The JSON definition provides a complete and precise description of your extract. You can share it with a colleague, and they can then submit and retrieve the identical extract.
- Run a series of similar extracts – If you need a large amount of data, it is often more efficient to submit several smaller extracts rather than one large extract. For example, if you need block-level data across several states, you could construct the JSON definition for one state, then modify it slightly to get data for each other state, submitting and retrieving the data one state at a time.
- Work seamlessly in your analysis environment – The IPUMS Developer Portal includes code examples for Python and R that walk you all the way through to retrieving your data and opening the files in your analysis environment. This means you can develop complete scripts that cover everything from defining your extract to conducting analyses. The code examples in R make use of the ipumsr package. Future versions of ipumsr will feature even tighter integration with the NHGIS APIs.
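As an illustration of running a series of similar extracts, the Python sketch below copies one template definition and changes only the geographic extent for each state. The `geographic_extents` field name, its placement in the definition, and the state codes are assumptions made for this example. Check the Developer Portal for the real schema and extent codes.

```python
import copy


def per_state_requests(template, state_codes, extent_key="geographic_extents"):
    """Return one extract definition per state, leaving the template untouched.

    extent_key is an assumed field name for limiting an extract's
    geographic extent.
    """
    base_description = template.get("description", "extract")
    requests = {}
    for code in state_codes:
        req = copy.deepcopy(template)  # avoid sharing nested dicts and lists
        req[extent_key] = [code]
        req["description"] = f"{base_description} (state {code})"
        requests[code] = req
    return requests


# Template with placeholder dataset and table codes (assumptions).
TEMPLATE = {
    "description": "block-level counts",
    "datasets": {
        "DATASET_CODE": {"data_tables": ["TABLE_CODE"], "geog_levels": ["block"]}
    },
}

# Hypothetical state codes used only to show the pattern.
state_requests = per_state_requests(TEMPLATE, ["27", "55"])
```

Each definition in `state_requests` can then be submitted and retrieved on its own, one state at a time, and the same template can be shared with a colleague or reused when new data are released.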
We hope you enjoy using the NHGIS APIs, and we are excited to see what you do with them. Let us know what you’re doing and how we can make the APIs better by visiting the API category of the IPUMS user forum (also a great place to learn more) or emailing email@example.com!
Story by Tracy Kugler