By Sarah Flood, Renae Rodgers, and Kari Williams
Federal data are critical for understanding much about the US population, from its size and composition to its health and employment. The Current Population Survey (CPS) is our nation’s official source of information about the labor force. At the beginning of each month, we eagerly await the first Friday, when the Employment Situation Summary (aka the monthly jobs report) will be released (it isn’t just us, right??). The monthly snapshot of the US labor force serves as a bellwether for how our economy is faring.
The Wednesday after the jobs report is released, we at IPUMS clear the decks in preparation for the release of the CPS Basic Monthly Survey (BMS) by the Census Bureau. The CPS BMS is the individual-level data from which the jobs report is generated. Our goal is always to process these data as soon as they’re released by the Census Bureau so that we can deliver them to IPUMS CPS users as quickly as possible. Those who rely on CPS BMS data each month might be familiar with coping strategies while waiting for the data: obsessive page refreshing, some nervous pacing, maybe wondering why they haven’t yet been released (iykyk).
While quickly processing CPS Basic Monthly data is a priority, so, too, is ensuring data quality. Each month, we carefully inspect CPS BMS data at several points in our process. First, we review all of the variables for codes that are undocumented or have suspicious frequencies. Second, we rely on a suite of tools during our integration process that alert us to any codes in the data that we haven’t accounted for in our variable-level harmonizations. Third, after harmonization, we compare univariate statistics from the newest month of data to those from the previous month. Generally, we expect very little change across months, and we have built tools that are designed to flag variable-level differences above a certain threshold as well as new codes on either end of the distribution.
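For readers curious what a month-over-month check like this might look like, here is a minimal Python sketch. It is not the IPUMS tooling: the function names, the 2-percentage-point threshold, and the sample values are all assumptions made up for illustration, and it compares unweighted shares for a single categorical variable.

```python
from collections import Counter


def code_shares(values):
    """Return each code's share of all records for one variable."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {code: n / total for code, n in counts.items()}


def compare_months(previous, current, threshold=0.02):
    """Flag codes that appear, disappear, or shift by more than `threshold`.

    `threshold` is a hypothetical cutoff expressed as a share (0.02 = 2 points).
    """
    prev_shares = code_shares(previous)
    curr_shares = code_shares(current)
    flags = []
    for code in sorted(set(prev_shares) | set(curr_shares)):
        before = prev_shares.get(code, 0.0)
        after = curr_shares.get(code, 0.0)
        if code not in prev_shares:
            flags.append((code, "new code", before, after))
        elif code not in curr_shares:
            flags.append((code, "missing code", before, after))
        elif abs(after - before) > threshold:
            flags.append((code, "share changed", before, after))
    return flags


# Made-up values for a hypothetical categorical variable (not real CPS data).
previous_month = [1] * 60 + [2] * 30 + [3] * 10
current_month = [1] * 50 + [2] * 39 + [3] * 10 + [9] * 1

for code, reason, before, after in compare_months(previous_month, current_month):
    print(f"code {code}: {reason} ({before:.1%} -> {after:.1%})")
```

A flag from a check like this is a prompt for a human to look closer, not a verdict that the data are wrong.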
Of course, consistency across adjacent months is not always a reasonable assumption. For example, during COVID, unemployment rates skyrocketed. In this case, rather than using consistency as our measuring stick for data quality, we relied on context to evaluate whether the changes we were observing across months were as expected.
In the coming months, you can count on us to take extra care to evaluate the CPS BMS data while maintaining rapid data release timelines. We will continue to look for consistency across months in the demographic composition of couples as well as the racial and ethnic composition of the US population, and we will rely on broader context to evaluate differences in the data, such as specific sectors hit hard by job loss. IPUMS CPS users can expect to find documentation of any unexpected differences or notable changes in our revision history (e.g., changes to population controls with the January 2025 BMS) in addition to other types of documentation that we produce as needed (e.g., working papers on occupation harmonization and linking keys).