IPUMS MEPS – Use It for Good

Family Interrelationships Variables in IPUMS MEPS

July 1, 2026 by mpcblog

By Etienne Breton

Health and family are inextricably tied. Their interplay is complex and dynamic, ranging from biological transmissions to the presence or absence of familial support over the life course. Elucidating these associations often requires vast datasets collected over multiple decades – to account for the ever-changing health and family circumstances of our lives. Researchers interested in investigating these questions at scale may now add a new tool to their toolkit: IPUMS family interrelationship variables are now available in IPUMS MEPS!

Also known as family pointers, these variables identify the location of a person’s probable co-resident spouse and/or parent(s) in the household. They increase reproducibility, flexibility and ease of use when analyzing family units and relationships within households. Whether interested in studying simple parent-child dyads or complex multigenerational arrangements, users may now seamlessly attach characteristics of in-household family members to a person’s records in MEPS.
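As a rough illustration of what attaching a family member's characteristics involves, the sketch below looks up each person's mother within the same household and copies her age onto the child's record. It assumes that MOMLOC holds the mother's within-household person number and is 0 when no mother is identified, as the term "location" suggests. The field names `hh_id`, `pernum`, `age` and `age_mom` are hypothetical stand-ins for whatever identifiers your own extract contains.

```python
def attach_mother_age(people):
    """Return copies of person records with the mother's age added.

    Assumes each record is a dict with hypothetical keys 'hh_id' and
    'pernum' (household and within-household person number), 'age',
    and 'momloc' (0 when no mother is identified in the household).
    """
    # Index every person by (household, person number) for direct lookup.
    by_location = {(p["hh_id"], p["pernum"]): p for p in people}

    result = []
    for p in people:
        mother = None
        if p["momloc"] != 0:
            mother = by_location.get((p["hh_id"], p["momloc"]))
        enriched = dict(p)
        enriched["age_mom"] = mother["age"] if mother is not None else None
        result.append(enriched)
    return result


# Toy household: a mother (person 1) and her child (person 2).
people = [
    {"hh_id": 10, "pernum": 1, "age": 41, "momloc": 0},
    {"hh_id": 10, "pernum": 2, "age": 12, "momloc": 1},
]
enriched = attach_mother_age(people)
```

The same lookup works for any other characteristic of the mother, and for fathers or spouses by swapping in POPLOC or SPLOC.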

IPUMS has pioneered the development of family pointers on nationally representative samples of households and individuals, and these variables have since been added to most of our data collection projects. Their recent addition to IPUMS MEPS presents exciting opportunities owing to the unique richness of the MEPS data, including the possibility of eventually expanding these pointers to a panel format.

How do the IPUMS MEPS family pointers compare to those in other IPUMS data collections?

The construction of these family interrelationship variables is comparable with that in other IPUMS microdata collections centered in the US: IPUMS USA¹, IPUMS CPS, IPUMS ATUS and IPUMS NHIS. The logic underpinning both common and project-specific codes is best described in the rule variables (as exemplified in the variable descriptions for MEPS: SPRULE and MOMRULE). These variables detail how pointers were attributed to certain individuals and not others, which further allows users to adjust the strictness of pointer attributions.

Let us provide a very brief overview of these procedures. In IPUMS MEPS, as in other IPUMS data collections, the assignment of family pointers and the corresponding rule variables rely primarily on information provided by the variable RELATE (denoting relationship to the householder or household reference person), and additionally on information from variables AGE, SEX and MARSTAT (marital status). The vast majority of family pointers are assigned using direct links established by RELATE (i.e., when a respondent is listed as the child or spouse of the householder). In IPUMS MEPS, these direct attributions represent between 94.7% and 98.9% of all assigned pointers depending on the year and the family pointer variable under consideration.

There remain, therefore, cases that RELATE does not directly solve. For instance, RELATE identifies persons who are grandchildren of the householder but does not specify which of the householder's children are the parents of those grandchildren. In such clear but indirect cases, our codes algorithmically assign parent-child and spouse-spouse links based on information from RELATE as well as respondents’ age and marital status. These assignments are not probabilistic but instead follow a predefined logic which relies on a small number of well-defined assumptions². Crucially, the values of the rule variables listed above correspond to how direct (first digit) and unambiguous (second digit) each case is, with lower numbers indicating more direct and/or unambiguous cases. This means that users can rely on these rule variables to tailor the levels of directness and clarity they prefer for assigning family pointers.
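To make the two-digit logic concrete, here is a minimal sketch of how a user might tighten pointer assignments with a rule variable. It assumes the rule codes are stored as two-digit integers, with the tens digit capturing directness and the ones digit capturing ambiguity. The function names and the threshold parameters are hypothetical, and the right cutoffs depend on the code meanings documented in the variable descriptions.

```python
def split_rule(rule_code):
    """Split a two-digit rule code into (directness, ambiguity) digits."""
    return divmod(int(rule_code), 10)


def keep_pointer(pointer, rule_code, max_direct, max_ambiguous):
    """Return the pointer if its rule is strict enough, else 0 (no link).

    Lower digits mean more direct and less ambiguous assignments, so a
    pointer is kept only when both digits are at or below the thresholds.
    """
    direct, ambiguous = split_rule(rule_code)
    if direct <= max_direct and ambiguous <= max_ambiguous:
        return pointer
    return 0


# Keep only links whose first digit is 1 or lower, whatever the second digit.
strict = keep_pointer(pointer=3, rule_code=21, max_direct=1, max_ambiguous=9)
lenient = keep_pointer(pointer=3, rule_code=21, max_direct=2, max_ambiguous=9)
```

Here `strict` drops the link while `lenient` keeps it, which is the kind of adjustment the rule variables are meant to support.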

Note that MEPS data are collected in a panel format: they encompass five interview rounds carried out over two calendar years. Currently, we provide family pointers for person records reported at the annual level (or full-year consolidated files); variables reported at this level may differ from individual round-level observations, for which we do not yet offer family pointers. These variables should, therefore, be interpreted as reflecting household membership and family interrelationships within households as of December 31 of the survey year under consideration. As noted earlier, most of these pointers rest on direct links that RELATE establishes to the householder³.

How accurate are IPUMS MEPS family pointers?

While there is no omniscient vantage point allowing us to determine whether any given attribution of a family pointer is accurate or not, we possess at least two ways of assessing the plausibility (or plausible accuracy) of family pointers in IPUMS MEPS. The first is to compare the population-level prevalence of family pointers between IPUMS MEPS and other IPUMS data collections centered in the US. All of these data collections can be used to generate nationally representative statistics of the non-institutionalized population over a long time period. Once weighted, they should therefore provide reasonably convergent demographic estimates.

In brief, such a comparison reveals that IPUMS MEPS pointers describe a family demography within households similar to that described by family pointers in other major US surveys. For instance, as shown in Figure 1, the proportion of all survey respondents who were assigned a mother in their household declined in all US-centered IPUMS data collections between the mid-1990s and the mid-2020s. This trend may well be explained by the ongoing fertility decline in the US, but nonetheless deserves further scrutiny as it could also be due to changes in patterns of living arrangements or even to changes in household rostering accuracy.

Figure 1 – Weighted Proportion of Respondents With Mother in the Household (MOMLOC!=0)
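The statistic plotted in Figure 1 can be sketched as a weighted share of respondents with a non-zero MOMLOC. The example below is a minimal illustration only: the `weight` key is a hypothetical name for the person-level survey weight, and the calculation ignores variance estimation and any sample restrictions the actual figure may apply.

```python
def weighted_share_with_mother(records):
    """Weighted proportion of records with a mother in the household.

    Each record is a dict with 'momloc' (0 means no mother identified)
    and a hypothetical 'weight' key holding the person-level weight.
    """
    total = sum(r["weight"] for r in records)
    if total == 0:
        raise ValueError("weights sum to zero; the share is undefined")
    with_mother = sum(r["weight"] for r in records if r["momloc"] != 0)
    return with_mother / total


sample = [
    {"momloc": 0, "weight": 100.0},
    {"momloc": 2, "weight": 300.0},
]
share = weighted_share_with_mother(sample)
```

Computing this share by survey year in each data collection would give series comparable in spirit to those in the figure.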

A second way to assess the plausible accuracy of our IPUMS-constructed family pointers is to compare them to family pointers provided in the original MEPS data from AHRQ (the Agency for Healthcare Research and Quality, which fields MEPS). These AHRQ pointers are provided at the round level and not at the annual level. They are initially reported by the respondents themselves and are then validated or imputed by AHRQ based on internal procedures (which include tests of age plausibility in parent-child relationships). These respondent-reported pointers have benefits, but they remain subject to reporting errors from respondents and enumerators. Furthermore, it is worth noting that many other U.S. federal data sources do not provide self-reported, much less agency-validated, family interrelationship variables⁴. The respondent-reported pointers in MEPS nonetheless provide a meaningful comparison for pointers constructed strictly from algorithmic rules based on a small number of variables⁵.
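Footnote 5 defines agreement differently for parental and spousal pointers: parental pointers must match on all non-missing rounds, spousal pointers on any non-missing round. A minimal sketch of that definition follows. It assumes the annual IPUMS pointer and the round-level respondent-reported pointers have already been converted to the same identifier scheme, that missing rounds are represented as `None`, and that a case with no non-missing round is left unclassified; all three are simplifying assumptions, and the function name is hypothetical.

```python
def agrees(ipums_pointer, round_pointers, kind):
    """Classify one person-year as agreeing (True), disagreeing (False),
    or unclassifiable (None) under the footnote 5 definition.

    kind is 'parent' (match required on all non-missing rounds) or
    'spouse' (match required on any non-missing round).
    """
    if kind not in ("parent", "spouse"):
        raise ValueError("kind must be 'parent' or 'spouse'")

    observed = [r for r in round_pointers if r is not None]
    if not observed:
        return None  # no non-missing round to compare against

    matches = [r == ipums_pointer for r in observed]
    return all(matches) if kind == "parent" else any(matches)


parent_case = agrees(2, [2, None, 2], kind="parent")
spouse_case = agrees(2, [3, 2, None], kind="spouse")
```

The looser rule for spouses reflects the expectation, stated in the footnote, that marital changes within a calendar year are more common than changes in living arrangements with one's parents.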

Figure 2 – Agreement between IPUMS and Respondent-Reported Pointers by Type of Pointer

As shown in Figure 2, there is a very high level of agreement between IPUMS and respondent-reported pointers for mothers and fathers (MOMLOC and POPLOC compared to MOMPIDRD and POPPIDRD), both of which show more than 98% agreement from 1997 onward. However, Figure 2 also shows a declining level of agreement between the IPUMS-constructed and respondent-reported location of the spouse (SPLOC and SPOUSEPNUMRD) in the household. At first glance, this decline appears to be almost monotonic throughout the whole period. Yet this overall trend hides two distinct components, as shown in Figure 3.

Figure 3 – Discrepant Cases Between IPUMS and Respondent-Reported Pointers by Selected SPRULE Values

The first component of this declining rate of agreement is due to the presence of unmarried partners of household heads (RELATE code 30). These individuals cannot be designated as spouses in respondent-reported pointers, while IPUMS-constructed pointers do designate them as spouses in SPLOC. Hence this simply represents a case of IPUMS-constructed pointers relying on a broader definition of union, one that includes some cohabiting couples, to define their spousal pointer. Fortunately, this discrepancy can be directly addressed by using the variable SPRULE. Indeed, SPRULE code 21 contains all and only cases of unmarried partners of household heads coded as spouses in SPLOC. Users can therefore remove this source of discrepancy in their own extracts by simply recoding SPLOC as 0 for all observations that have SPRULE code 21. Figure 3 shows that the use of this rule has become more prevalent since MEPS was initiated, reaching a peak prevalence in the mid-2010s and declining afterward.
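The recode described above is short enough to sketch directly. The example assumes each person record is a dict with lowercase `sploc` and `sprule` keys, which is a hypothetical layout, and it writes the result to a new key (`sploc_married`, also hypothetical) so the original pointer is preserved.

```python
def married_only_sploc(sploc, sprule):
    """Return SPLOC with unmarried-partner links (SPRULE code 21) set to 0."""
    return 0 if sprule == 21 else sploc


records = [
    {"sploc": 2, "sprule": 21},  # unmarried partner coded as spouse
    {"sploc": 1, "sprule": 21},
    {"sploc": 0, "sprule": 0},   # no spouse identified
]
for rec in records:
    rec["sploc_married"] = married_only_sploc(rec["sploc"], rec["sprule"])
```

Keeping both versions of the pointer makes it easy to check how much a given analysis depends on the broader definition of union.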

The second component of the declining rate of agreement in spousal pointers is more puzzling. This component has been growing in importance since the mid-2010s and cannot be addressed directly. These are a subset of individuals with SPRULE code 00; more specifically, individuals for whom the IPUMS-constructed family pointers find no spouse but who have a respondent-reported spouse located at any round of interview. For the most part, these are respondents living in one-person households reporting that they are married with a spouse present with them in the household and who provide what appears to be a valid PID for that person. This spouse is therefore only identified in the variable SPOUSEPNUMRD. In other words, our IPUMS programming rules cannot find any possible spouse for those respondents living in one-person households. It is unclear whether these discrepant cases result from incomplete household rostering on AHRQ’s part or from inaccurate respondent reports. Additional research on this issue is under way, notably to investigate whether recent trends in one-person households converge between IPUMS MEPS and other major US surveys.

In conclusion

Researchers interested in using family pointers in IPUMS MEPS should keep three caveats in mind. The first is the deterministic nature of the pointer attribution rules. Our family pointers are highly accurate but remain imperfect, and users can manage these imperfections with a great degree of flexibility using the rule variables. The second is the inclusion of some unmarried but cohabiting partners as spouses in SPLOC, which users can directly manage using SPRULE code 21. These two caveats apply to all IPUMS data collections centered in the US. The third is specific to IPUMS MEPS, where we are observing a growing proportion of one-person households in which respondents provide a PID for their spouse’s location in the household. We’ll keep you posted on this one.

Taken together, our family pointers are reliable, comparable, and provide new flexible opportunities for combining person-level and family-level analyses. Use these newly added variables to expand your research in both familiar and unfamiliar directions (pun very much intended)!

¹IPUMS USA applies a comparable methodology for 1970-forward samples and uses a similar but unique methodology for pre-1970.

²This predefined logic states, for instance, that where there are multiple potential spouses in the household those individuals who are closer in age are more likely to be each other’s spouse than those individuals with a larger age gap; or that the older of two sets of dependent children in a household are more likely to have as parents the older of two sets of spouses in that same household (given a plausible age gap between parents and children). There are also rules assigning family pointers to dependent children with no clear parent in the household. For instance, IPUMS rules prioritize assigning those children to relatives over non-relatives; ever-married adults over never-married adults; older adults over younger adults; and so on. This serves as a reminder that IPUMS family pointers for parents represent social in addition to biological relationships within households.

³Users should note that, in IPUMS MEPS, the householder is not strictly the first person listed on the household roster.

⁴ The Current Population Survey provides such self-reported pointers for 2007-onward.

⁵ We define agreement as IPUMS-constructed pointers correctly predicting parental pointers on all non-missing rounds of a given survey year, and as correctly predicting spousal pointers on any non-missing round of a given survey year. This is because we expect marital instability to be more prevalent within a calendar year than changes in living arrangements with one’s own parents.

Measuring Food Security with U.S. Federal Data

October 10, 2025 (updated October 30, 2025) by mpcblog

By Kari Williams & Isabel Pastoor

The U.S. Department of Agriculture (USDA) defines a household as being food secure when all household members at all times have access to “enough food for an active, healthy life;” it sets a minimum threshold for food security of “ready availability of nutritionally adequate and safe foods” and the “assured ability to acquire acceptable foods in socially acceptable ways” (USDA Economic Research Service, 2025). The USDA provides survey modules for assessing food security in the U.S. (see Table 1), which are used in a number of federal surveys.

Following the recent announcement by the USDA that they plan to cease data collection for the Food Security supplement fielded as part of the December Current Population Survey, we are highlighting data sources for studying food security in the U.S. Table 2 provides an overview of a number of federal data sources that can be used to study aspects of food security in the U.S. This list of data sources is not exhaustive; we have prioritized data available through IPUMS and other long-running and large-scale population surveys. Additional sources covering shorter time periods or more specific focal populations can be found from the USDA’s Food Security in the United States Documentation page and the Food Access Research Atlas.

Even More IPUMS Data Available in the SDA Online Data Analysis Tool

December 19, 2024 (updated June 15, 2025) by mpcblog

By Daniel Backman

Beyond offering the ability to create and download customized datasets from the IPUMS microdata collections, we also support web-based analysis of the data through the SDA (Survey Documentation and Analysis) online data analysis tool. SDA empowers users to analyze IPUMS data directly from their web browsers without the need for additional software or advanced programming skills. Whether you’re a seasoned researcher or a student exploring data for the first time, the SDA tool makes it easier than ever to unlock insights from our datasets. If you’re a current SDA user and ready to get started, check out the new datasets from IPUMS CPS and IPUMS MEPS. Otherwise, read on to learn more about SDA and how to use this tool to analyze IPUMS data.

About IPUMS & SDA

What is SDA?

The SDA tool is a web-based interface that allows you to generate frequency tables, cross-tabulations, and summary statistics; create customized data visualizations, including bar charts, line graphs, and scatter plots; perform regression analysis; and export results as a CSV file for presentations or further analysis.

SDA increases the accessibility of data by allowing users to analyze data through a web interface without needing to use (or purchase!) statistical software. There is detailed guidance on how to use the tool for analyses and how to manipulate variables. Additionally, it provides exceptionally fast real-time processing of data, making it ideal for use in the classroom or other interactive settings. See our data training exercises page for exercises that will guide you through using SDA to analyze IPUMS data.

Introducing the MEPS Variable Builder!

November 4, 2024 (updated June 17, 2025) by mpcblog

By Julia A. Rivera Drew

Earlier this year, IPUMS MEPS launched a new feature – the MEPS Variable Builder – to make it dramatically easier to create customized person-level variables that summarize information from the medical event and condition records and add them to your IPUMS extract. If you have ever thought about using the MEPS event and condition data but didn’t know where to begin because of the complexity of the data, the MEPS Variable Builder is for you!

The Medical Expenditure Panel Survey Household Component (MEPS-HC, referred to as MEPS here) provides comprehensive information on characteristics of people residing in responding households, as well as information about their medical encounters during the calendar year – e.g., office-based provider visits, emergency room (ER) visits, and hospitalizations – and medical conditions associated with those medical encounters. This unique combination of information makes the MEPS data ideal for research questions that need detailed health care utilization and/or expenditure data alongside individual-level correlates of health. However, these rich data can be difficult to work with, creating barriers for researchers who wish to use the MEPS data.

IPUMS MEPS created the MEPS Variable Builder to enable users to easily build person-level variables summarizing information from the MEPS-HC event and medical condition records, also known as “event summary variables.” Using a point-and-click interface, researchers can create custom event summary variables that count the number of events or sum expenditures across event records, filtered on selected characteristics of events and/or medical conditions. Users can then include these custom event summary variables in their IPUMS extract. At this time, the variable builder does not include prescribed medicines data.

In this blog post, we run through an example where we create a variable that is the sum of all expenditures paid for by Workers’ Compensation for medical visits due to a workplace injury.
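For readers who want a feel for what an event summary variable computes, the sketch below aggregates event records to one value per person, either counting events or summing an expenditure field after a filter. It illustrates the logic only and is not how the Variable Builder itself is implemented; the record layout and the keys `person_id`, `work_injury` and `wc_paid` are hypothetical.

```python
from collections import defaultdict


def summarize_events(events, keep, amount_key=None):
    """Aggregate event records to one number per person.

    keep is a function that returns True for events to include.
    When amount_key is None the result counts events; otherwise it
    sums that field across the retained events.
    """
    totals = defaultdict(float if amount_key else int)
    for ev in events:
        if keep(ev):
            totals[ev["person_id"]] += ev[amount_key] if amount_key else 1
    return dict(totals)


events = [
    {"person_id": 1, "work_injury": True, "wc_paid": 100.0},
    {"person_id": 1, "work_injury": True, "wc_paid": 50.0},
    {"person_id": 1, "work_injury": False, "wc_paid": 0.0},
    {"person_id": 2, "work_injury": False, "wc_paid": 0.0},
]

# Sum of Workers' Compensation payments for visits due to a workplace injury.
wc_total = summarize_events(events, lambda ev: ev["work_injury"], "wc_paid")
# Number of such visits per person.
wc_visits = summarize_events(events, lambda ev: ev["work_injury"])
```

People with no retained events are absent from the result, so a person-level merge would typically fill them in with 0.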

Introducing the MEPS Prescribed Medicines Data

November 20, 2023 (updated May 28, 2025) by mpcblog

By Julia A. Rivera Drew

The Household Component of the Medical Expenditure Panel Survey (MEPS), administered by the Agency for Healthcare Research and Quality (AHRQ), is a short panel survey collecting information for a nationally representative sample of the civilian, noninstitutionalized population. Since 1996, the MEPS has collected information on demographic and socioeconomic characteristics; health status; medical conditions; and health care access, utilization, and expenditures.

Based on information provided by a family respondent about each family member at each interview, AHRQ produces a dataset of all reported fills of prescribed medicines purchased by family members during the calendar year (including refills). For example, if a prescription was filled monthly, there would be 12 records for that specific prescribed medicine (DRUGID) in the annual file. The prescribed medicines data includes information such as the medication name (RXNAME), national drug code (RXNDC), therapeutic classification (MULTC1), when the person began taking the medication (RXBEGMM and RXBEGYR), amounts paid (RXFEXPTOT), and source of payment (RXFEXPSRC).
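Because the file holds one record per fill, most analyses start by collapsing fills to the level of a person and medicine. The sketch below shows one way to do that, counting fills and summing RXFEXPTOT for each DRUGID. The `person_id` key and the dict-based record layout are hypothetical, and the amounts are made up for illustration.

```python
from collections import defaultdict


def fills_per_drug(fill_records):
    """Collapse fill-level records to (number of fills, total paid)
    for each (person, DRUGID) pair."""
    counts = defaultdict(int)
    paid = defaultdict(float)
    for rec in fill_records:
        key = (rec["person_id"], rec["DRUGID"])
        counts[key] += 1
        paid[key] += rec["RXFEXPTOT"]
    return {key: (counts[key], paid[key]) for key in counts}


# A prescription filled monthly appears as 12 records in the annual file.
monthly = [
    {"person_id": 1, "DRUGID": "A", "RXFEXPTOT": 10.0} for _ in range(12)
]
one_off = [{"person_id": 1, "DRUGID": "B", "RXFEXPTOT": 25.0}]
summary = fills_per_drug(monthly + one_off)
```

In this toy example the monthly prescription collapses to 12 fills and the one-off purchase to a single fill.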

IPUMS MEPS provides a harmonized and integrated version of the MEPS Household Component data, including data from the prescribed medicines files.

IPUMS Announces 2020 Research Award Recipients

May 17, 2021 (updated June 10, 2025) by mpcblog

IPUMS is excited to announce the winners of its annual IPUMS Research Awards. These awards honor the best published research and nominated graduate student papers from 2020 that used IPUMS data to advance or deepen our understanding of social and demographic processes.

IPUMS, developed by and housed at the University of Minnesota, is the world’s largest individual-level population database, providing harmonized data on people in the U.S. and around the world to researchers at no cost.

There are six award categories, and each is tied to the following IPUMS projects:

IPUMS USA, providing data from the U.S. decennial censuses, the American Community Survey, and IPUMS CPS from 1850 to the present.
IPUMS International, providing harmonized data contributed by more than 100 international statistical office partners; it currently includes information on 500 million people in more than 200 censuses from around the world, from 1960 forward.
IPUMS Health Surveys, which makes available the U.S. National Health Interview Survey (NHIS) and the Medical Expenditure Panel Survey (MEPS).
IPUMS Spatial, covering IPUMS NHGIS and IPUMS Terra. NHGIS includes GIS boundary files from 1790 to the present; Terra provides data on population and the environment from 1960 to the present.
IPUMS Global Health, providing harmonized data from the Demographic and Health Surveys and the Performance Monitoring and Accountability surveys, for low and middle-income countries from the 1980s to the present.
IPUMS Time Use, providing time diary data from the U.S. and around the world from 1965 to the present.

Over 2,500 publications based on IPUMS data appeared in journals, magazines, and newspapers worldwide last year. From these publications and from nominated graduate student papers, the award committees selected the 2020 honorees.

New to IPUMS Time Use and IPUMS MEPS: Rectangularizing Down

May 16, 2019 (updated August 3, 2025) by mpcblog

Don’t be a square—rectangularize!

Introducing…the option to rectangularize down for extracts in IPUMS Time Use and IPUMS MEPS.

What do we mean by “rectangularize”?