At IPUMS we try to address every user’s questions and suggestions about our data. This support is one of the features that adds value to IPUMS data. Over time, many of the same questions come up again and again. In this blog series, we share some of these frequently asked questions. Maybe you’ll learn something, or perhaps you’ll just find these interesting. Regardless, we hope you enjoy.
Here’s one of those questions:
How can I merge my IPUMS CPS file with an NBER CPS file?
Currently, not all variables collected by the Bureau of Labor Statistics through the CPS are available through IPUMS CPS. This is because the IPUMS CPS team prioritizes variables that add value to, or improve on the functionality of, the CPS data already accessible through NBER.
As a result, users often need to merge an IPUMS CPS data file with an NBER data file. This can be done in a couple of different ways. One method, which applies to ASEC samples, is a sequential merge. This type of merge matches the first observation in one dataset to the first observation in the other, the second to the second, and so on. Since IPUMS and NBER datasets share the same sort order for ASEC samples, a sequential merge should successfully add any available NBER variable to an existing IPUMS CPS data extract.
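A sequential merge can be sketched in Python with pandas. This is a minimal illustration, not an official IPUMS procedure: it assumes both files are already loaded as DataFrames covering the same ASEC sample, in their original sort order. The function name `sequential_merge`, the `_nber` suffix, and the toy column names are made up for the example.

```python
import pandas as pd


def sequential_merge(ipums: pd.DataFrame, nber: pd.DataFrame) -> pd.DataFrame:
    """Attach NBER columns to IPUMS rows purely by row position."""
    # A positional merge is only meaningful when both files have the same
    # number of rows (and, by assumption, the same sort order).
    if len(ipums) != len(nber):
        raise ValueError(f"Row counts differ: {len(ipums)} vs {len(nber)}")
    # Reset both indexes so concat aligns on position, not on old index labels.
    left = ipums.reset_index(drop=True)
    # Suffix the NBER columns so names never collide with IPUMS columns.
    right = nber.reset_index(drop=True).add_suffix("_nber")
    return pd.concat([left, right], axis=1)


# Toy data standing in for real extracts (values are invented).
ipums_toy = pd.DataFrame({"AGE": [34, 29, 61], "SEX": [1, 2, 2]})
nber_toy = pd.DataFrame({"a_age": [34, 29, 61], "extra_var": [10, 20, 30]})

merged_seq = sequential_merge(ipums_toy, nber_toy)
print(merged_seq.columns.tolist())
```

The row-count check is deliberately strict, because a positional merge on files of different lengths would silently misalign every record after the first discrepancy.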
A second method, which applies to all CPS samples, is to use the variables HRHHID, HRHHID2, HUHHNUM, HRSAMPLE, HRSERSUF, and LINENO as linking keys when merging the datasets. This method is possible because, with the exception of LINENO (which is called PULINENO in NBER files), these variables have the same names in IPUMS CPS as in the raw CPS files available via NBER.
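A key-based merge might look like the following pandas sketch. It assumes both files cover the same sample, that the key columns carry identical (uppercase) names and compatible data types in both DataFrames, and that the keys uniquely identify a person within the sample. The function name `key_merge` and the toy values are invented for illustration.

```python
import pandas as pd

# Linking keys named in the text; LINENO is PULINENO on the NBER side.
KEYS = ["HRHHID", "HRHHID2", "HUHHNUM", "HRSAMPLE", "HRSERSUF", "LINENO"]


def key_merge(ipums: pd.DataFrame, nber: pd.DataFrame) -> pd.DataFrame:
    """Merge NBER columns onto IPUMS rows using the shared identifiers."""
    # Rename the NBER line number so all six keys share the same names.
    nber = nber.rename(columns={"PULINENO": "LINENO"})
    return ipums.merge(
        nber,
        on=KEYS,
        how="left",               # keep every IPUMS record
        validate="one_to_one",    # raise an error if keys are duplicated
        suffixes=("", "_nber"),   # mark overlapping non-key NBER columns
        indicator=True,           # adds _merge: "both" or "left_only"
    )


# Toy data (invented values); the NBER rows are deliberately out of order.
ipums_toy = pd.DataFrame({
    "HRHHID": [1001, 1001, 1002], "HRHHID2": [5011, 5011, 5011],
    "HUHHNUM": [1, 1, 1], "HRSAMPLE": ["A77", "A77", "A78"],
    "HRSERSUF": ["A", "A", "B"], "LINENO": [1, 2, 1], "AGE": [34, 29, 61],
})
nber_toy = pd.DataFrame({
    "HRHHID": [1002, 1001, 1001], "HRHHID2": [5011, 5011, 5011],
    "HUHHNUM": [1, 1, 1], "HRSAMPLE": ["A78", "A77", "A77"],
    "HRSERSUF": ["B", "A", "A"], "PULINENO": [1, 2, 1],
    "extra_var": [30, 20, 10],
})

merged_key = key_merge(ipums_toy, nber_toy)
print(merged_key["_merge"].value_counts())
```

Unlike a positional merge, this approach does not depend on sort order, and the `_merge` column makes it easy to spot IPUMS records that found no NBER counterpart.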
After the merge is completed, using either method, we suggest confirming that everything worked properly by checking that the sex, age, and race of respondents match between the datasets.
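One way to run such a check in pandas is sketched below. The column names (`SEX`, `sex_nber`, and so on) and the helper `count_mismatches` are hypothetical; substitute the names in your own merged file. The sketch also assumes the two sources code each variable the same way. Where the coding schemes differ, a cross-tabulation such as `pd.crosstab` of the two columns may be more informative than a direct equality test.

```python
import pandas as pd


def count_mismatches(merged: pd.DataFrame, pairs: list[tuple[str, str]]) -> dict[str, int]:
    """Count rows where an IPUMS column and its NBER counterpart disagree."""
    # Note: a missing value (NaN) on either side is counted as a mismatch.
    return {
        ipums_col: int((merged[ipums_col] != merged[nber_col]).sum())
        for ipums_col, nber_col in pairs
    }


# Toy merged file (invented values) with one deliberate age discrepancy.
merged_toy = pd.DataFrame({
    "SEX": [1, 2, 2], "sex_nber": [1, 2, 2],
    "AGE": [34, 29, 61], "age_nber": [34, 29, 16],
})

report = count_mismatches(merged_toy, [("SEX", "sex_nber"), ("AGE", "age_nber")])
print(report)
```

A count of zero for every variable is what a successful merge should produce; any nonzero count is a signal to inspect the offending rows before using the merged data.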
Story by Jeff R. Bloem
PhD Student, Department of Applied Economics
Graduate Research Assistant, Minnesota Population Center