IPUMS Announces 2020 Research Award Recipients

IPUMS is excited to announce the winners of its annual IPUMS Research Awards. These awards honor the best published research and nominated graduate student papers from 2020 that used IPUMS data to advance or deepen our understanding of social and demographic processes.

IPUMS, developed by and housed at the University of Minnesota, is the world’s largest individual-level population database, providing harmonized data on people in the U.S. and around the world to researchers at no cost.

There are six award categories, each tied to one of the following IPUMS projects:

  • IPUMS USA, providing data from the U.S. decennial censuses, the American Community Survey, and IPUMS CPS from 1850 to the present.
  • IPUMS International, providing harmonized data contributed by more than 100 international statistical office partners; it currently includes information on 500 million people in more than 200 censuses from around the world, from 1960 forward.
  • IPUMS Health Surveys, which makes available the U.S. National Health Interview Survey (NHIS) and the Medical Expenditure Panel Survey (MEPS).
  • IPUMS Spatial, covering IPUMS NHGIS and IPUMS Terra. NHGIS includes GIS boundary files from 1790 to the present; Terra provides data on population and the environment from 1960 to the present.
  • IPUMS Global Health, providing harmonized data from the Demographic and Health Surveys and the Performance Monitoring and Accountability surveys for low- and middle-income countries from the 1980s to the present.
  • IPUMS Time Use, providing time diary data from the U.S. and around the world from 1965 to the present.

Over 2,500 publications based on IPUMS data appeared in journals, magazines, and newspapers worldwide last year. From these publications and from nominated graduate student papers, the award committees selected the 2020 honorees.


How has COVID-19 affected 2020 data collection efforts?

By Julia A. Rivera Drew, Sarah M. Flood, and Renae Rodgers

IPUMS integrates data from several major US surveys that collect data throughout the year. Below, we discuss how COVID-19 has affected how US statistical agencies have collected these survey data in 2020.

Current Population Survey (CPS)

The Bureau of Labor Statistics (BLS) and the Census Bureau have continued to collect data on a monthly basis during the COVID-19 pandemic, implementing some procedural modifications to protect the safety of respondents and Census Bureau employees and adding a short supplement to capture the effects of the pandemic on work in the United States.

Changes to Interviewing Procedures

Current Population Survey (CPS) data collection for March had already begun when the Census Bureau suspended in-person data collection on March 20th, 2020. Two call centers that assist with CPS data collection also closed down at this time. However, data collection continued exclusively by phone through June of 2020. In July, in-person interviews began in some areas of the country and the call centers that had been closed in March re-opened. In-person interviews were resumed in all areas of the country in September 2020 and data collection has returned to a normal routine. More information on how alternative data collection procedures affected response rates, attrition, and employment data is available on the IPUMS CPS website.

Additional COVID-related content

The COVID-19 outbreak prompted the BLS to add five questions to the monthly CPS about work in the time of COVID-19. These questions were first asked in May 2020. Though the question about forgoing medical care due to the pandemic was dropped from the survey after October 2020, all other questions will remain in the survey until further notice. Researchers may preview the questions or access the COVID-specific variables via IPUMS CPS.

IPUMS CPS will continue to update our documentation on the effects of the pandemic on CPS data collection and to make new data available as quickly as possible. Follow @ipums on Twitter for the latest updates.


Overview of NHIS Data Collection, 1997-2018

By Julia A. Rivera Drew, Kari C.W. Williams, and Natalie Del Ponte

The IPUMS NHIS project offers integrated versions of the National Health Interview Survey (NHIS) data, the leading source of nationally representative information on the health of the U.S. population. The National Center for Health Statistics (NCHS) collects the NHIS data through face-to-face interviews covering information about health, health insurance coverage, health care utilization, socioeconomic characteristics, and demographics of all household members. The survey is representative of the civilian, non-institutionalized U.S. population, with annual samples ranging between 30,000-50,000 households and 75,000-100,000 people. NCHS has collected the NHIS annually since 1957 (with digital copies of the data available going back to 1963), making it the longest-running annual survey of health in the world.

Periodically, aspects of data collection, such as the sampling frame, oversampled populations, or questionnaire content, are revised to better capture shifts in the most pressing health concerns of Americans or in the demographic makeup of Americans and where they reside within the U.S. Most of these revisions are modest, reflecting changes in U.S. population composition and distribution detected in the most recent decennial census. However, 2019 heralded the largest change in NHIS data collection since 1997. In fall 2020, the NCHS will release the 2019 public use data files, the first data collected under the newly redesigned NHIS. The upcoming release of the 2019 data warrants a look back at how NCHS collected the NHIS data over the 1997-2018 period.

1997-2018 at a Glance

The data collection design of the 1997-2018 NHIS was largely comparable over time. There were a few minor changes during this period, the largest taking place between 2005 and 2006 to update the sampling frame to reflect the 2000 Census and add an oversample of Asian persons. Most oversamples were discontinued in 2016 (see the IPUMS NHIS note on Sample Design for more information). Under the 1997-2018 design (illustrated in Figure 1), the NHIS was a sample of households, where each household could potentially contain multiple families. One representative from each family, the family respondent, provided demographic, health status, and health insurance coverage information about all family members. In addition to the data collected for all family members, interviewers randomly sampled one adult and one child per family to complete additional interviews (the “sample adult” and “sample child” questionnaires, respectively). Through this mechanism, the NHIS collected further information on topics such as Body Mass Index, mental health, access to health care, health behaviors, and (for adults) sexual orientation and details about paid employment. NCHS releases standalone data files for each of these content areas (households, families, family members, sample adults, and sample children) every year. IPUMS NHIS allows users to review variables from all content areas and include them in a single data extract.

Figure 1. 1997-2018 NHIS Data Collection

Illustration of sampling of data for NHIS
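The household, family, and person hierarchy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the `select_sample_persons` helper, the age cutoff of 18, and the use of a simple equal-probability draw are all hypothetical and are not taken from NCHS documentation, which should be consulted for the actual selection procedure.

```python
import random

ADULT_AGE = 18  # assumption for this sketch: adults are 18 and older


def select_sample_persons(family, rng):
    """Pick at most one sample adult and one sample child from a family.

    `family` is a list of person dicts with an "age" key. A simple
    equal-probability draw is assumed here for illustration only.
    """
    adults = [p for p in family if p["age"] >= ADULT_AGE]
    children = [p for p in family if p["age"] < ADULT_AGE]
    sample_adult = rng.choice(adults) if adults else None
    sample_child = rng.choice(children) if children else None
    return sample_adult, sample_child


# Hypothetical household containing two families.
household = {
    "families": [
        [{"id": 1, "age": 40}, {"id": 2, "age": 38},
         {"id": 3, "age": 9}, {"id": 4, "age": 5}],
        [{"id": 5, "age": 70}],  # no children, so no sample child
    ]
}

rng = random.Random(0)  # seeded so the sketch is repeatable
for number, family in enumerate(household["families"], start=1):
    adult, child = select_sample_persons(family, rng)
    print(number, adult["id"] if adult else None, child["id"] if child else None)
```

Every family member still contributes the core information reported by the family respondent; only the selected adult and child carry the additional questionnaire content.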

For IPUMS NHIS users interested in combining information collected on different parts of the survey, understanding the NHIS data collection process is important for two reasons. First, when users design analyses of the NHIS data, they must take into account the extent to which the overlap of topical supplements collected for sample adults and sample children varies by subject area and over time. Second, which variables analysts combine determines which sampling weight is most appropriate for analyses that utilize data from these different content areas.

Overlapping Sample Adult and Sample Child Content

Users interested in the rich topical content of NHIS may wish to design analyses that take advantage of the occasional and recurring supplements asked of sample adults and sample children. However, it is important to note that the items collected by the sample adult questionnaire are not necessarily also part of the sample child questionnaire, and vice versa. Even when similar topics are covered, the two questionnaires may not include identical measures. IPUMS NHIS combines sample adult and sample child measures into a single integrated variable wherever they overlap to make it easier for users interested in looking at both groups.

Additionally, because NCHS fields some supplements only in certain years, some topical combinations are not possible because the relevant supplemental questions were never asked in the same year (e.g., the balance problems supplement never overlaps with the complementary and alternative medicine supplement). IPUMS NHIS users who add variables of topical interest to their data requests without confirming that they are available for all the relevant years may be confused to find missing values where they did not expect any.
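One way to catch this before running an analysis is to tabulate, for each variable in an extract, the years in which it has any non-missing values. The sketch below assumes the extract has been loaded as a list of dictionaries with a YEAR field and that missing values are represented as `None`; real extracts use variable-specific missing and not-in-universe codes, so that check would need adapting. The variable names `SUPP_A` and `SUPP_B`, the helper functions, and all values are hypothetical.

```python
def years_available(records, variable):
    """Return the sorted years in which `variable` has a non-missing value."""
    years = set()
    for rec in records:
        if rec.get(variable) is not None:
            years.add(rec["YEAR"])
    return sorted(years)


def common_years(records, variables):
    """Return the sorted years in which every listed variable is available."""
    year_sets = [set(years_available(records, v)) for v in variables]
    if not year_sets:
        return []
    return sorted(set.intersection(*year_sets))


# Hypothetical extract: SUPP_A and SUPP_B stand in for two supplements
# that were never fielded in the same year.
extract = [
    {"YEAR": 2008, "AGE": 34, "SUPP_A": 1, "SUPP_B": None},
    {"YEAR": 2008, "AGE": 51, "SUPP_A": 2, "SUPP_B": None},
    {"YEAR": 2012, "AGE": 47, "SUPP_A": None, "SUPP_B": 1},
]

print(years_available(extract, "SUPP_A"))         # years with SUPP_A data
print(years_available(extract, "SUPP_B"))         # years with SUPP_B data
print(common_years(extract, ["SUPP_A", "SUPP_B"]))  # empty if they never overlap
```

An empty result from `common_years` signals that the two variables cannot be analyzed together, no matter how the extract is subset.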

Selection of Appropriate Sampling Weight

As described above, NHIS is a complex, multistage probability sample. Users must make use of sampling weights to produce population-representative point estimates. For information on producing correct standard errors and statistical tests, see the IPUMS NHIS user note on variance estimation. Because NCHS releases standalone data files for each content area, the original NCHS data include multiple weight variables (at least one for each file). Most person-level analyses using IPUMS NHIS will use PERWEIGHT or SAMPWEIGHT.

The IPUMS NHIS variable PERWEIGHT corresponds to WTFA in the original NCHS data files. PERWEIGHT is appropriate for analyses that use variables collected for all family members. The IPUMS NHIS variable SAMPWEIGHT combines two separate weights, one for sample adults and one for sample children, from the original NCHS data files. SAMPWEIGHT reports the sample adult weight only if the person is the selected sample adult and the sample child weight only if the person is the selected sample child. SAMPWEIGHT is 0 for all other persons. SAMPWEIGHT is appropriate for analyses that include variables collected as part of the sample adult or sample child content of the questionnaire. In cases where both types of variables are included, users should apply the more restrictive of the two weights (SAMPWEIGHT in this case).
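The weight-selection rule above can be sketched as a small Python example. It is a minimal illustration, not an official IPUMS tool: the records and values are invented, `INSURED` and `SAMPLE_ITEM` are placeholder variable names, and the helpers `choose_weight` and `weighted_mean` are hypothetical. Only PERWEIGHT and SAMPWEIGHT are actual IPUMS NHIS variable names. The sketch produces point estimates only; standard errors require the survey design information covered in the variance estimation user note.

```python
# Hypothetical person records shaped like a small IPUMS NHIS extract.
# INSURED stands in for a variable collected for all family members;
# SAMPLE_ITEM stands in for sample adult / sample child content.
records = [
    {"INSURED": 1, "SAMPLE_ITEM": 1, "PERWEIGHT": 3000.0, "SAMPWEIGHT": 6000.0},
    {"INSURED": 1, "SAMPLE_ITEM": None, "PERWEIGHT": 3000.0, "SAMPWEIGHT": 0.0},
    {"INSURED": 0, "SAMPLE_ITEM": 0, "PERWEIGHT": 2000.0, "SAMPWEIGHT": 4000.0},
]

# Variables that come from the sample adult or sample child questionnaires.
SAMPLE_CONTENT = {"SAMPLE_ITEM"}


def choose_weight(variables):
    """Use SAMPWEIGHT if any analysis variable is sample content."""
    if SAMPLE_CONTENT.intersection(variables):
        return "SAMPWEIGHT"
    return "PERWEIGHT"


def weighted_mean(rows, variable, weight):
    """Weighted mean over rows with a positive weight and a non-missing value."""
    numerator = 0.0
    denominator = 0.0
    for row in rows:
        value, w = row[variable], row[weight]
        if value is None or w <= 0:
            continue  # e.g. persons who were not the selected sample person
        numerator += w * value
        denominator += w
    if denominator == 0:
        raise ValueError("no records with a positive weight")
    return numerator / denominator


# Family-level content only: PERWEIGHT applies.
print(weighted_mean(records, "INSURED", choose_weight(["INSURED"])))

# Mixing both kinds of content: the more restrictive SAMPWEIGHT applies.
mixed = ["INSURED", "SAMPLE_ITEM"]
print(weighted_mean(records, "SAMPLE_ITEM", choose_weight(mixed)))
```

Because SAMPWEIGHT is 0 for everyone who was not a selected sample adult or sample child, those persons drop out of the second estimate automatically.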

Look for a future post describing the 2019 NHIS redesign after the 2019 data are released in the fall of 2020. Until then, you may be interested in these IPUMS NHIS user notes on Sample Design, Sampling Weights, and Variance Estimation in NHIS data.