At IPUMS, we try to address every user’s questions and suggestions about our data. It is just one feature that adds value to IPUMS data. Over time, many of the same questions come up again and again. In this blog series, we share some of these frequently asked questions. Maybe you’ll learn something. Perhaps you’ll just find these interesting. Regardless, we hope you enjoy.
Here’s one of those questions:
Where are the data quality flags for income variables?
In the Current Population Survey (CPS), income and earnings data are often subject to considerable item nonresponse. That is, individuals who otherwise respond to the survey leave questions about specific items blank. Some estimates put the nonresponse rate for income and earnings questions as high as 30% in the CPS basic monthly samples and 20% in the ASEC samples.
The Census Bureau’s solution is to impute income and earnings values for these individuals based on other information (such as occupation, industry, age, sex, and geographic location) so that they are not omitted from empirical labor force analysis. This is a valuable service since, as highlighted by Bollinger and Hirsch in their 2013 Review of Economics and Statistics article, those who respond to income and earnings questions in the CPS differ from those who do not.
Nevertheless, whether to include imputed responses in empirical analysis remains far from settled. For this reason, the Census Bureau provides data quality flags that identify whether or not an income or earnings value has been imputed. This way, each user of CPS data can make their own choice about how to handle observations with imputed values.
Sometimes IPUMS CPS users write to us and ask: Where are the data quality flags for income variables such as HHINCOME, INCTOT, and FTOTVAL? The answer is not immediately obvious. Each IPUMS CPS variable has an associated “Flags” tab, but the “Flags” tab for INCTOT is empty. So, what gives? Here is a brief explanation.
HHINCOME, INCTOT, and FTOTVAL are all aggregate variables that combine the values of other, more specific income variables. For example, the comparability tab for INCTOT describes the components of INCTOT in various sample years. In ASEC samples from 1988 onward, INCTOT includes values from:
INCWAGE (Wage and salary income)
INCBUS (Non-farm business income)
INCFARM (Farm income)
INCSS (Social Security income)
INCWELFR (Welfare [public assistance] income)
INCRETIR (Retirement income)
INCSSI (Income for supplemental security)
INCINT (Income from interest)
INCUNEMP (Income from unemployment benefits)
INCVET (Income from veteran’s benefits)
INCSURV (Income from survivor’s benefits)
INCDISAB (Income from disability benefits)
INCDIVID (Income from dividends)
INCRENT (Income from rent)
INCEDUC (Income from educational assistance)
INCCHILD (Income from child support)
INCALIM (Income from alimony)
INCASIST (Income from assistance)
INCOTHER (Income from other source not specified)
Some of these variables have their own data quality flags. For example, INCASIST has QINCASSI and INCCHILD has QINCCHIL. Others, like INCWAGE and INCBUS, are themselves aggregates of multiple variables. For example, INCWAGE is derived from OINCWAGE and INCLONGJ, which both have data quality flags: QOINCWAGE and QINCLONG, respectively.
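For users who want a single indicator for an aggregate such as INCTOT, one option is to combine the component flags. The following is a minimal Python sketch of that idea, not an official IPUMS method. It assumes that a flag value of 0 means the value was not imputed and that any other value indicates some form of imputation; check the codes on each flag's documentation before relying on this. The mapping lists only the flags named above, the function name is hypothetical, and the records are made-up values for illustration.

```python
# Component variables mapped to their data quality flags.
# Only the pairs named in the text are listed here; a real analysis
# would extend this from the "Flags" tab of every component.
COMPONENT_FLAGS = {
    "INCASIST": ["QINCASSI"],
    "INCCHILD": ["QINCCHIL"],
    "INCWAGE": ["QOINCWAGE", "QINCLONG"],  # via OINCWAGE and INCLONGJ
}


def any_component_imputed(record, flags_by_component=COMPONENT_FLAGS):
    """Return True if any listed component flag on this record is nonzero.

    Assumption: 0 means "not imputed" and any other code means the value
    was imputed in some way. A flag missing from the record is treated
    as 0, which may not be appropriate for every sample.
    """
    for flags in flags_by_component.values():
        for flag in flags:
            if record.get(flag, 0) != 0:
                return True
    return False


# Made-up person records, for illustration only.
records = [
    {"INCTOT": 42000, "QOINCWAGE": 0, "QINCLONG": 0, "QINCASSI": 0, "QINCCHIL": 0},
    {"INCTOT": 58000, "QOINCWAGE": 0, "QINCLONG": 1, "QINCASSI": 0, "QINCCHIL": 0},
]

# Keep only records where no listed component was imputed.
reported_only = [r for r in records if not any_component_imputed(r)]
```

Dropping flagged records is only one way to handle imputed values. The same indicator could instead be used to compare results with and without those observations.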
So, the data quality flags for income variables are indeed available in IPUMS CPS. They just may not be in the first place you look if the income variable you are using is an aggregation of other income variables.
Story by Jeffrey R. Bloem
PhD Student, Department of Applied Economics
Graduate Research Assistant, Minnesota Population Center