Bivariate Proportional Symbol Maps, Part 2: Design Tips with Instructions for ArcGIS Pro

By Jonathan Schroeder, IPUMS Research Scientist, NHGIS Project Manager

How to make effective bivariate proportional symbol maps

A map of the share of population under age 18 in the Miami area in 2020. There is one colored circle for each census tract. There are five colors ranging from dark blue (representing less than 15% under age 18) to light green (representing 20 to 25% under age 18) to brown (representing 30% or more). The circle sizes correspond to tract populations. Most circles have similar sizes, representing around 1,000 to 10,000 people. The circles cluster together forming groups where there are more tracts and more people. The circles in central Miami and along the coast are bluer than elsewhere.
A bivariate proportional symbol map.

In Part 1 of this blog series, I introduced bivariate proportional symbol maps and shared some examples to demonstrate their advantages. In short, when they’re well designed, they can make it easy to see multiple dimensions of a population all at once: size, composition, and spatial distribution.

A key part of that statement is, “when they’re well designed.” Standard mapping tools can make it easy to get started, but getting all the way to a good design still takes some extra effort.

In this Part 2 post, I discuss some key design considerations for bivariate proportional symbol maps, and I provide specific instructions to help you get to a good design.

Software considerations

I used Esri’s ArcGIS Pro to create the examples here and in Part 1. The design tips I share below should be relevant for any mapping tool, but my instructions are specifically for ArcGIS Pro (version 3.2). I expect there are ways to achieve similar designs with QGIS, R, Python, etc., quite possibly more easily than with ArcGIS Pro. I can only say that it’s easier to create effective bivariate proportional symbols now with ArcGIS Pro than it was with its predecessor, ArcMap.

As I proceed, I’ll flag which instructions pertain specifically to ArcGIS Pro. All other tips are “tool neutral.”

General tip: Match size to “size” and color to “character”

When selecting which characteristics to map, a framework that works consistently well is to use symbol color to represent an intensive property (e.g., the share of population under 18 years old, average household size, median household income, or the share of votes cast for a candidate) and symbol size to represent the number of cases to which that property pertains (e.g., the total population when color corresponds to a population share, or the count of households when color corresponds to average household size or median household income).
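
As a rough, tool-neutral illustration of this pairing, the Python sketch below sizes a circle from a count and picks a color class from a share. It assumes the common convention that circle area (not diameter) is proportional to the count. The class breaks are inferred from the five-class Miami map description above, and the function names and the maximum diameter are hypothetical choices for illustration, not settings taken from ArcGIS Pro.

```python
import math
from bisect import bisect_right

# Assumed class breaks (percent under 18), inferred from the map description:
# under 15, 15-20, 20-25, 25-30, 30 or more.
CLASS_BREAKS = [15.0, 20.0, 25.0, 30.0]
CLASS_LABELS = ["under 15%", "15-20%", "20-25%", "25-30%", "30% or more"]


def symbol_diameter(count, max_count, max_diameter):
    """Return a circle diameter whose AREA is proportional to count.

    Area scales with diameter squared, so the diameter scales with the
    square root of the count relative to the largest count on the map.
    """
    if count < 0 or max_count <= 0:
        raise ValueError("count must be >= 0 and max_count must be > 0")
    return max_diameter * math.sqrt(count / max_count)


def color_class(share_pct):
    """Return the index of the color class containing share_pct."""
    return bisect_right(CLASS_BREAKS, share_pct)


# Hypothetical tract: 2,500 people, 17% under 18; largest tract has 10,000.
diameter = symbol_diameter(2500, 10000, max_diameter=20.0)
label = CLASS_LABELS[color_class(17.0)]
print(diameter, label)
```

With these assumptions, a tract with a quarter of the maximum population gets a circle with half the maximum diameter, which is what makes its area one quarter as large.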

This framework enables the map to illustrate both the spatial distribution of the mapped characteristics and the frequency distribution of the intensive property—e.g., not only where a candidate received large or small vote shares but also how many votes were cast in each of those areas. Other frameworks can also work well (e.g., see the change maps in Part 1), but it’s generally very helpful if the two mapped characteristics relate to each other in a way that corresponds intuitively with “size” and “color.”
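
To make the "frequency distribution" idea concrete, here is a minimal Python sketch that totals the votes cast in each vote-share class. On the map, this corresponds to the combined circle area shown in each color. The function name, the class breaks, and the sample numbers are all hypothetical.

```python
from bisect import bisect_right


def votes_by_share_class(areas, breaks):
    """Total the votes cast in each vote-share class.

    areas  -- iterable of (candidate_votes, total_votes) pairs, one per area
    breaks -- ascending class breaks, in percent
    Returns a list with len(breaks) + 1 totals, lowest class first.
    """
    totals = [0] * (len(breaks) + 1)
    for candidate_votes, total_votes in areas:
        if total_votes <= 0:
            continue  # no votes cast, so no share to classify
        share_pct = 100.0 * candidate_votes / total_votes
        totals[bisect_right(breaks, share_pct)] += total_votes
    return totals


# Three made-up areas with shares of 60%, 30%, and 20%.
sample = [(60, 100), (300, 1000), (10, 50)]
print(votes_by_share_class(sample, breaks=[25.0, 50.0]))
```

In this made-up case, most votes were cast in the middle class even though each class contains exactly one area, which is the kind of pattern a map of colored areas alone would hide.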


Bivariate Proportional Symbol Maps, Part 1: An Introduction

By Jonathan Schroeder, IPUMS Research Scientist, NHGIS Project Manager

A powerful, underused mapping technique

The world could use a lot more bivariate proportional symbol maps. These maps pair two basic visual variables—size and (usually) color—to symbolize two characteristics of mapped features. When designed well, they convey multiple key dimensions of a population all at once: size and composition as well as spatial distribution and density.

A map of the share of population under age 18 in the Miami area in 2020. There is one colored circle for each census tract. There are five colors ranging from dark blue (representing less than 15% under age 18) to light green (representing 20 to 25% under age 18) to brown (representing 30% or more). The circle sizes correspond to tract populations. Most circles have similar sizes, representing around 1,000 to 10,000 people. The circles cluster together forming groups where there are more tracts and more people. The circles in central Miami and along the coast are bluer than elsewhere.
A bivariate proportional symbol map.

Unfortunately, standard mapping software hasn’t made it easy to create good versions of these maps, and most introductions to statistical mapping stick to simpler strategies. As a result, bivariate proportional symbols aren’t used very often. With few examples and little guidance to go on, it’s understandable that mapmakers don’t realize how often they’re a viable, well-suited option.

This two-part blog series aims to spark more interest by providing a “few examples” (Part 1) and a “little guidance” (Part 2).

Picking up where I left off

In a previous blog post, I shared an example of a bivariate proportional symbol map and described some of the technique’s advantages. But that post focuses on a mapping resource (census centers of population) rather than on mapping techniques. Most of the examples in the post are also simply “proportional symbol maps,” without the more intriguing “bivariate” part.

To close that post, I suggested “a tantalizing next step” would be to use bivariate proportional symbols with small-area data (for census tracts or block groups), and I shared a few technical notes and design tips without much detail. I later expanded on those ideas in a conference talk, sharing some new examples with small-area data and going a little deeper with design tips.

In these new posts, I’m sharing and building on the examples and tips from the conference talk.


Accessing IPUMS NHGIS in R: A Primer

By Finn Roberts & Jonathan Schroeder

R users have a powerful new way to access IPUMS NHGIS!

The July 2023 release of ipumsr 0.6.0 includes a fully featured set of client tools enabling R users to get NHGIS data and metadata via the IPUMS API. Without leaving their R environment, users can find, request, download and read in U.S. census summary tables, geographic time series, and GIS mapping files for years from 1790 through the present. This blog post gives an overview of the possibilities and describes how to get started.

What you can do with ipumsr

Request and download NHGIS data

You can use ipumsr to specify the parameters of an NHGIS data extract request and submit that request for processing by the IPUMS servers. You can request any of the data products that are available through the NHGIS Data Finder: summary tables, time series tables, and shapefiles. You can also specify general formatting parameters (e.g., file format or time series table layout) to customize the structure of your data extract.

Once you have specified a data extract, you can use a series of ipumsr functions to:

  • submit the extract request to the IPUMS servers for processing
  • check on the extract status
  • wait for the extract to complete
  • download the extract as soon as it’s ready
  • load the data into R with detailed data field descriptions.

This workflow allows you to go from a set of abstract NHGIS data specifications to analyzable data, all without having to leave your R session!
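
The shape of that submit, check, wait, and download sequence can be sketched in a language-neutral way. The Python below is only a simulation: `FakeExtractService`, its methods, the status strings, and the extract specification are hypothetical stand-ins invented for illustration. They are not ipumsr functions and not the IPUMS API.

```python
import time


class FakeExtractService:
    """Hypothetical in-memory stand-in for an extract server (NOT the IPUMS API)."""

    def __init__(self, polls_until_done=3):
        self.polls_until_done = polls_until_done
        self.poll_counts = {}

    def submit(self, spec):
        """Register a request and return a made-up extract number."""
        extract_id = len(self.poll_counts) + 1
        self.poll_counts[extract_id] = 0
        return extract_id

    def status(self, extract_id):
        """Report 'completed' once the extract has been polled enough times."""
        self.poll_counts[extract_id] += 1
        done = self.poll_counts[extract_id] >= self.polls_until_done
        return "completed" if done else "started"

    def download(self, extract_id):
        """Return a made-up file name for the finished extract."""
        return f"extract_{extract_id}.zip"


def wait_for_extract(service, extract_id, interval=0.0, max_polls=20):
    """Poll the status until the extract completes or max_polls is reached."""
    for _ in range(max_polls):
        if service.status(extract_id) == "completed":
            return True
        time.sleep(interval)
    return False


def run_workflow(service, spec):
    """Submit a request, wait for it, then download the result."""
    extract_id = service.submit(spec)
    if not wait_for_extract(service, extract_id):
        raise TimeoutError(f"extract {extract_id} did not complete")
    return service.download(extract_id)


service = FakeExtractService(polls_until_done=3)
print(run_workflow(service, {"description": "hypothetical request"}))
```

The point of the sketch is the control flow: the request is defined once, and the waiting step is a bounded polling loop rather than a manual check.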


Better Maps with Census Centers of Population

By Jonathan Schroeder, IPUMS Research Scientist, NHGIS Project Manager

The best mapping resource no one’s using?

In the domain of U.S. population mapping, the Census Bureau’s centers of population may be the nation’s most underused data resource. Before I explain why, let’s cover some basics…

What are they? A center of population represents the mean location of residence for an area’s population, roughly the average latitude and longitude, adjusting for the curvature of the earth. For the last three decennial censuses (2000, 2010, 2020), the Census Bureau has published centers of population separately for U.S. states, counties, census tracts, and block groups.
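
For intuition, here is a minimal Python sketch of one way to compute a population-weighted mean location on a sphere. It averages 3D unit vectors and converts the result back to latitude and longitude. This is an illustrative approach under a spherical-earth assumption, not necessarily the Census Bureau's exact computation, and the function name and sample points are hypothetical.

```python
import math


def center_of_population(points):
    """Return (lat, lon) in degrees for the population-weighted mean location.

    points -- iterable of (lat_deg, lon_deg, population) tuples
    Each location becomes a unit vector on a sphere; the vectors are averaged
    with population weights, and the mean vector is converted back to degrees.
    """
    x = y = z = total = 0.0
    for lat_deg, lon_deg, population in points:
        lat = math.radians(lat_deg)
        lon = math.radians(lon_deg)
        x += population * math.cos(lat) * math.cos(lon)
        y += population * math.cos(lat) * math.sin(lon)
        z += population * math.sin(lat)
        total += population
    if total <= 0:
        raise ValueError("total population must be positive")
    x, y, z = x / total, y / total, z / total
    mean_lon = math.degrees(math.atan2(y, x))
    mean_lat = math.degrees(math.atan2(z, math.hypot(x, y)))
    return mean_lat, mean_lon


# Two made-up places on the equator; the larger one pulls the center toward it.
print(center_of_population([(0.0, 0.0, 300), (0.0, 90.0, 100)]))
```

Over an area as small as a census tract, the result is nearly the same as a simple weighted average of the coordinates. The curvature adjustment matters more for states or the nation as a whole.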

Where can you get them? Through the Census Bureau website, you can download files containing the latitude and longitude coordinates for centers of population. To facilitate mapping and analysis, IPUMS NHGIS has transformed the coordinates into point shapefiles, available for download through the NHGIS Data Finder.

What are they used for? At the moment, not much! But there are dozens of settings where they’d be helpful. I’m hoping this blog will help get the word out, and if it does, you might now be reading this in some future age, marveling at how we ever went so long without using them!

OK, how should we use them? In the case of statistical maps—my focus here—centers of population are wonderfully effective for placing proportional symbols. I share lots of examples down below to demonstrate, but first, let’s consider the general advantages of proportional symbol maps compared to a more common alternative: choropleth maps…


IPUMS Announces 2020 Research Award Recipients

IPUMS is excited to announce the winners of its annual IPUMS Research Awards. These awards honor the best published research and nominated graduate student papers from 2020 that used IPUMS data to advance or deepen our understanding of social and demographic processes.

IPUMS, developed by and housed at the University of Minnesota, is the world’s largest individual-level population database, providing harmonized data on people in the U.S. and around the world to researchers at no cost.

There are six award categories, each tied to one of the following IPUMS projects:

  • IPUMS USA, providing data from the U.S. decennial censuses, the American Community Survey, and IPUMS CPS from 1850 to the present.
  • IPUMS International, providing harmonized data contributed by more than 100 international statistical office partners; it currently includes information on 500 million people in more than 200 censuses from around the world, from 1960 forward.
  • IPUMS Health Surveys, which makes available the U.S. National Health Interview Survey (NHIS) and the Medical Expenditure Panel Survey (MEPS).
  • IPUMS Spatial, covering IPUMS NHGIS and IPUMS Terra. NHGIS includes GIS boundary files from 1790 to the present; Terra provides data on population and the environment from 1960 to the present.
  • IPUMS Global Health, providing harmonized data from the Demographic and Health Surveys and the Performance Monitoring and Accountability surveys, for low- and middle-income countries from the 1980s to the present.
  • IPUMS Time Use, providing time diary data from the U.S. and around the world from 1965 to the present.

Over 2,500 publications based on IPUMS data appeared in journals, magazines, and newspapers worldwide last year. From these publications and from nominated graduate student papers, the award committees selected the 2020 honorees.


Mapping Block-Level Segregation: The Twin Cities’ Black Population, 1980-2010

Research, data preparation, story and graphics by Amalea Jubara and Yaxuan Zhang (Minnesota Population Center, Summer Diversity Fellows), mentored by Jonathan Schroeder (IPUMS Research Scientist) and Ying Song (Assistant Professor, Department of Geography, Environment & Society)

Edited by Jonathan Schroeder (IPUMS Research Scientist)

IPUMS NHGIS Block Data: An Expanding Collection

The most spatially precise U.S. census data are block-level tables, summarizing population and housing characteristics for millions of blocks throughout the country. IPUMS NHGIS provides block-level tables for the 1970 to 2010 decennial censuses as well as block boundary files for 1990, 2000 and 2010. This collection is set to grow substantially in the next few years as NHGIS adds new 2020 census block data and as we continue with a major initiative to construct 1980 and 1970 block boundary files. This expansion will open up new possibilities for high-precision spatial analysis across a longer time span.

A Case Study of the Twin Cities’ Black Population

To demonstrate some of the potential value of this expanding collection, we use NHGIS block data, including some not-yet-released 1980 block boundaries, to explore the recent history of racial segregation and integration in the Black population of the Twin Cities of Minneapolis and St. Paul, Minnesota, from 1980 to 2010. We present the block data in an interactive map along with data on early-20th-century racial covenants and the “redlining” zones of the Home Owners’ Loan Corporation (HOLC), recently published by the Mapping Prejudice and Mapping Inequality projects.

The block-level changes since 1980 show a striking trend toward greater dispersion and integration of Black residents, but segregation persists; several neighborhoods still have uniformly low or high proportions of Black residents. By overlaying racial covenants and HOLC zones with the block data, we can also find cases where the historical discriminatory practices appear to have left a lasting imprint on the distribution of Black residents.


New Products! IPUMS GeoMarker and NHGIS APIs

The IPUMS spatial team is excited to introduce two new products that expand the ways you can access NHGIS data. IPUMS GeoMarker enables you to easily attach contextual characteristics from ACS data to address or point data, and the first public IPUMS API provides programmatic access to NHGIS data and metadata. Both products officially moved out of beta in December 2019. 
