Bivariate Proportional Symbol Maps, Part 2: Design Tips with Instructions for ArcGIS Pro

By Jonathan Schroeder, IPUMS Research Scientist, NHGIS Project Manager

How to make effective bivariate proportional symbol maps

A map of the share of population under age 18 in the Miami area in 2020. There is one colored circle for each census tract. There are five colors ranging from dark blue (representing less than 15% under age 18) to light green (representing 20 to 25% under age 18) to brown (representing 30% or more). The circle sizes correspond to tract populations. Most circles have similar sizes, representing around 1,000 to 10,000 people. The circles cluster together forming groups where there are more tracts and more people. The circles in central Miami and along the coast are bluer than elsewhere.
A bivariate proportional symbol map.

In Part 1 of this blog series, I introduced bivariate proportional symbol maps and shared some examples to demonstrate their advantages. In short, when they’re well designed, they can make it easy to see multiple dimensions of a population all at once: size, composition, and spatial distribution.

A key part of that statement is, “when they’re well designed.” Standard mapping tools can make it easy to get started, but getting all the way to a good design still takes some extra effort.

In this Part 2 post, I discuss some key design considerations for bivariate proportional symbol maps, and I provide specific instructions to help you get to a good design.

Software considerations

I used Esri’s ArcGIS Pro to create the examples here and in Part 1. The design tips I share below should be relevant for any mapping tool, but my instructions are specifically for ArcGIS Pro (version 3.2). I expect there are ways to achieve similar designs with QGIS, R, Python, etc., quite possibly more easily than with ArcGIS Pro. I can only say that it’s easier to create effective bivariate proportional symbols now with ArcGIS Pro than it was with its predecessor ArcMap.

As I proceed, I’ll flag which instructions pertain specifically to ArcGIS Pro. All other tips are “tool neutral.”

General tip: Match size to “size” and color to “character”

When selecting which characteristics to map, a framework that works consistently well is to use symbol color to represent an intensive property (e.g., the share of population under 18 years old, average household size, median household income, or the share of votes cast for a candidate) and symbol size to represent the number of cases to which that property pertains (e.g., the total population when color corresponds to a population share, or the count of households when color corresponds to average household size or median household income).

This framework enables the map to illustrate both the spatial distribution of the mapped characteristics and the frequency distribution of the intensive property—e.g., not only where a candidate received large or small vote shares but also how many votes were cast in each of those areas. Other frameworks can also work well (e.g., see the change maps in Part 1), but it’s generally very helpful if the two mapped characteristics relate to each other in a way that corresponds intuitively with “size” and “color.”
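For readers working in a scripting tool rather than ArcGIS Pro, the size-and-color pairing can be sketched in a few lines of Python. This is only an illustrative sketch, not the method used to build the maps in these posts. It assumes that circle area (not radius) is scaled to the count, which is a common convention for proportional symbols, and it assumes class breaks at 15, 20, 25, and 30 percent to mirror the five classes in the example map. The function names, the `pop` and `under18` field names, and the two intermediate color names are hypothetical.

```python
import math
from bisect import bisect_right

# Assumed class breaks (percent under age 18), mirroring the five classes
# described for the example map: <15, 15-20, 20-25, 25-30, 30 or more.
BREAKS = [15.0, 20.0, 25.0, 30.0]

# Color names are placeholders; "light blue" and "tan" are assumptions.
COLORS = ["dark blue", "light blue", "light green", "tan", "brown"]


def symbol_radius(count, max_count, max_radius=20.0):
    """Return a radius such that circle AREA is proportional to count."""
    if count <= 0 or max_count <= 0:
        return 0.0
    return max_radius * math.sqrt(count / max_count)


def color_class(share_pct):
    """Return the color for an intensive property given as a percentage."""
    return COLORS[bisect_right(BREAKS, share_pct)]


def symbolize(tracts):
    """Pair each tract's size (population) with its color (share under 18)."""
    populated = [t for t in tracts if t["pop"] > 0]  # a share needs a nonzero base
    max_pop = max(t["pop"] for t in populated)
    return [
        {
            "id": t["id"],
            "radius": symbol_radius(t["pop"], max_pop),
            "color": color_class(100.0 * t["under18"] / t["pop"]),
        }
        for t in populated
    ]
```

Because the color comes from a share of the same population that sets the size, each symbol shows both how large a group is and what that group is like.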

Continue reading…

Bivariate Proportional Symbol Maps, Part 1: An Introduction

By Jonathan Schroeder, IPUMS Research Scientist, NHGIS Project Manager

A powerful, underused mapping technique

The world could use a lot more bivariate proportional symbol maps. These maps pair two basic visual variables—size and (usually) color—to symbolize two characteristics of mapped features. When designed well, they convey multiple key dimensions of a population all at once: size and composition as well as spatial distribution and density.

A map of the share of population under age 18 in the Miami area in 2020. There is one colored circle for each census tract. There are five colors ranging from dark blue (representing less than 15% under age 18) to light green (representing 20 to 25% under age 18) to brown (representing 30% or more). The circle sizes correspond to tract populations. Most circles have similar sizes, representing around 1,000 to 10,000 people. The circles cluster together forming groups where there are more tracts and more people. The circles in central Miami and along the coast are bluer than elsewhere.
A bivariate proportional symbol map.

Unfortunately, standard mapping software hasn’t made it easy to create good versions of these maps, and most introductions to statistical mapping stick to simpler strategies. As a result, bivariate proportional symbols aren’t used very often. With few examples and little guidance to go on, it’s understandable that mapmakers don’t realize how often they’re a viable, well-suited option.

This two-part blog series aims to spark more interest by providing a “few examples” (Part 1) and a “little guidance” (Part 2).

Picking up where I left off

In a previous blog post, I shared an example of a bivariate proportional symbol map and described some of the technique’s advantages. But that post focuses on a mapping resource (census centers of population) rather than on mapping techniques. Most of the examples in the post are also simply “proportional symbol maps,” without the more intriguing “bivariate” part.

To close that post, I suggested “a tantalizing next step” would be to use bivariate proportional symbols with small-area data (for census tracts or block groups), and I shared a few technical notes and design tips without much detail. I later expanded on those ideas in a conference talk, sharing some new examples with small-area data and going a little deeper with design tips.

In these new posts, I’m sharing and building on the examples and tips from the conference talk.

Continue reading…

Making Your Customized IPUMS MICS Data File

By Anna Bolgrien

The newest IPUMS data collection, IPUMS MICS, has many similarities with other IPUMS microdata collections. However, there is one major difference: the IPUMS MICS Data Extract System only uses Stata.

Yes, you read that right. Users of IPUMS MICS must use Stata to open and create their customized data file.

Let’s start with how using IPUMS MICS is the same as using other IPUMS microdata collections.

If you are an IPUMS user, you will find the process of browsing the variables, looking at documentation, and adding samples to your data cart completely familiar. If you are not familiar with IPUMS, you can read more about browsing and selecting variables.

However, when you finish choosing variables and samples in IPUMS MICS and click “Create Extract,” things start to look different.

Normally, you could change the data format, but the only option currently available for IPUMS MICS is a .dat (fixed-width text) file format.

Continue reading…

Geospatial Contextuals from IPUMS International

By Ryan Gavin & Quinn Heimann

IPUMS International launched a new platform that will aid researchers using geospatial contextual data along with IPUMS International census microdata!

What is geospatial contextual data?

Geospatial contextual data describe features of the physical and social environment of a geographic area, and allow users to explore how contextual factors interrelate with individual characteristics and outcomes. For example, in their 2020 paper in Global Environmental Change, Mueller et al. estimated the effects that climate-related variables had on migration in Botswana, Kenya, and Zambia between 1989 and 2011. Often, however, these data are large, complex, and packaged in unfamiliar ways. With this new platform, IPUMS International simplifies the process of identifying and linking contextual data with our robust repository of census microdata.

Geospatial contextual data can vary across space, time, or both and often do not obey administrative boundaries. IPUMS International is unique in offering spatiotemporally harmonized administrative geography variables, which, when linked to time-variant contextual data, allow researchers to explore the relationship between social phenomena and temporally dynamic geospatial data using a consistent spatial footprint.

For example, researchers might be interested in studying how changing January precipitation in Bangladesh from 1991-2011 is associated with social or demographic variables. In this case, harmonized geographic variables are ideal because of administrative boundary changes in Bangladesh between 2001 and 2011.

Maps of Bangladesh in 1991, 2001, and 2011 showing the total January Precipitation using year-specific geography and harmonized geography.
Bangladesh map showing January precipitation totals for each census year, showing the difference between year-specific and harmonized geography for measuring effects.

Continue reading…

IPUMS International: 2023 Highlights & Heading Into 2024

By Jane Lee, IPUMS International

IPUMS International is entering 2024 with a strong head start on partner relations and great energy for continued data engagement with partners and with data users. Thanks to user feedback and productive engagement with existing and prospective national statistical office (NSO) partners, users can expect access to additional census and survey data and new, exciting enhancements in 2024.

2023 was packed fuller than usual with renewed interactions with National Statisticians and statistical offices worldwide. Our attendance at the UN Statistical Commission meetings in February led to productive conversations with countries, and we moved those conversations closer to next steps at the ISI WSC in July and at the International Conference of Labour Statisticians in October. The latter was an opportunity for IPUMS to connect with NSO representatives from more than 25 countries specifically about sharing labor force survey data.

Group of people standing in front of backdrop at the IAOS Conference workshop

IPUMS remains committed to regional and conference-based engagement. In May, we hosted a pre-conference workshop in conjunction with the IAOS (International Association for Official Statistics) Conference in Livingstone, Zambia.

The 14+ NSO labor force and census experts who attended participated in robust cross-country discussions and shared expertise, tools, and technology related to census. In partnership with UNESCWA, IPUMS International joined NSOs and data users in October at the Regional Workshop on Population Projection and Use of Microdata in Rabat, Morocco. There, IPUMS piloted a new training for statistical offices on the preparation of public-use files for the 40+ attendees.

Continue reading…

Introducing the MEPS Prescribed Medicines Data

By Julia A. Rivera Drew

The Household Component of the Medical Expenditure Panel Survey (MEPS), administered by the Agency for Healthcare Research and Quality (AHRQ), is a short panel survey collecting information for a nationally representative sample of the civilian, noninstitutionalized population. Since 1996, the MEPS has collected information on demographic and socioeconomic characteristics; health status; medical conditions; and health care access, utilization, and expenditures.

Based on information provided by a family respondent about each family member at each interview, AHRQ produces a dataset of all reported fills of prescribed medicines purchased by family members during the calendar year (including refills). For example, if a prescription was filled monthly, there would be 12 records for that specific prescribed medicine (DRUGID) in the annual file. The prescribed medicines data include information such as the medication name (RXNAME), national drug code (RXNDC), therapeutic classification (MULTC1), when the person began taking the medication (RXBEGMM and RXBEGYR), amounts paid (RXFEXPTOT), and source of payment (RXFEXPSRC).
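To make the one-record-per-fill structure concrete, here is a minimal Python sketch that counts fill records for each person and medicine. The records are made-up toy data, not MEPS values, and the `person_id` field and the function name are hypothetical. Only DRUGID comes from the variable names above.

```python
from collections import Counter


def fills_per_medicine(records):
    """Count fill records for each (person, DRUGID) pair."""
    return Counter((r["person_id"], r["DRUGID"]) for r in records)


# Toy data (invented for illustration): person 1 fills medicine "A" every
# month of the year, and person 2 fills medicine "B" three times.
records = (
    [{"person_id": 1, "DRUGID": "A"} for _ in range(12)]
    + [{"person_id": 2, "DRUGID": "B"} for _ in range(3)]
)

counts = fills_per_medicine(records)
```

In this toy case, the monthly prescription contributes 12 records to the annual file, so analyses at the person or medicine level would typically need to aggregate fills first.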

IPUMS MEPS provides a harmonized and integrated version of the MEPS Household Component data, including data from the prescribed medicines files.

Continue reading…

2022 ATUS Eating and Health Module Data: New Variables and Updates

By Annie Chen & Sarah Flood

The American Time Use Survey Eating and Health Module, funded by the Economic Research Service, asks a series of questions related to grocery shopping, food preparation, and nutrition. The most recent module was fielded in 2022 during the COVID-19 pandemic and was previously fielded in 2006 to 2008 and 2014 to 2016. The 2022 Eating and Health Module, set to be fielded again in 2023, asks new questions, asks similar questions in different ways than previously fielded modules, and contains additional variables of high interest to researchers.

New Variables in 2022

The 2022 ATUS Eating and Health Module asks a series of new questions related to exercise/physical activity, grocery shopping, meal preparation, and food quality. The food quality questions are especially interesting because they provide researchers with the opportunity to assess relationships between food quality and time use, which hasn’t been possible previously with these data. This is the first time that the ATUS has asked for any information about respondents’ food intake on the ATUS diary day. The module is also responsive to changes in shopping behavior during the pandemic, specifically online grocery shopping and grocery delivery/pickup options. The shopping and meal preparation enjoyment questions might allow for comparisons to the ATUS Well-Being Module (fielded in 2010, 2012, 2013, and 2021).

Continue reading…