By Rodrigo Lovaton Davila
IPUMS International recently added twenty-one new harmonized variables that expand the thematic coverage of the data collection and open new possibilities for research. Most notably, the data release introduces harmonized variables representing sample-level information, including selected characteristics of the statistical operation and the sampling design (accessible under the technical household variables). This information was previously available only in the sample descriptions section, but it is now also accessible through variables that can be included in data extracts. Read on for more details on these new sample-level variables and a few new work and household amenity variables!
New variables about the statistical operation describe whether the data correspond to a census or a survey; whether enumeration was de jure or de facto; the type of form received by respondents in the sample; and the month of data collection. The IPUMS International data collection currently includes 395 census samples, 233 labor force surveys, and 27 population surveys.
FORMTYPE allows users to identify whether the data for each sample consist of responses to a single, standard questionnaire applied to the entire population; responses to a short or long form, in a census that gathered more information from a sample of the population; or records derived from administrative registers (with no questionnaire used in data collection). Most datasets in the collection correspond to one standard questionnaire (79% of 395 census samples). For censuses where both a short and a long form were applied, the samples in IPUMS typically correspond to the long questionnaire (78% of 78 samples), which includes additional questions and is richer for research purposes.
ENUMTYPE indicates whether the enumeration was de jure or de facto, an important distinction for understanding how the population was counted in the census operation. Some censuses combine de jure enumeration (usual residents) with de facto enumeration (those present on the census reference date, whether resident or visitor), and this new variable reflects that combination. Importantly, users can work with the existing variable RESIDENT to eliminate double-counting of persons who were enumerated both at their permanent residence and at the residence they were visiting on census night. ENUMMO complements the variable YEAR by recording the month of data collection, which gives a more precise indicator of when the data were collected.
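To make the double-counting point concrete, here is a minimal Python sketch of how RESIDENT might be used to build either a de jure or a de facto population from a sample that combines both approaches. The numeric codes, the PERNUM field, and the helper function names are assumptions made for illustration only; check the RESIDENT variable description for the actual codes in each sample.

```python
# Hypothetical RESIDENT codes, for illustration only.
# Look up the real codes in the variable description before using this.
PRESENT_RESIDENT = 1  # usual resident, present on census night
ABSENT_RESIDENT = 2   # usual resident, away on census night
VISITOR = 3           # present on census night, usually lives elsewhere


def de_jure_population(records):
    """Keep usual residents (present or absent) and drop visitors."""
    keep = {PRESENT_RESIDENT, ABSENT_RESIDENT}
    return [r for r in records if r["RESIDENT"] in keep]


def de_facto_population(records):
    """Keep everyone physically present and drop absent residents."""
    keep = {PRESENT_RESIDENT, VISITOR}
    return [r for r in records if r["RESIDENT"] in keep]


# Toy person records standing in for rows of a data extract.
records = [
    {"PERNUM": 1, "RESIDENT": PRESENT_RESIDENT},
    {"PERNUM": 2, "RESIDENT": ABSENT_RESIDENT},
    {"PERNUM": 3, "RESIDENT": VISITOR},
]

de_jure = de_jure_population(records)
de_facto = de_facto_population(records)
```

Either filter counts each person once: an absent resident appears only in the de jure population, and a visitor appears only in the de facto population. Which one to use depends on the research question and on how ENUMTYPE describes the sample.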