IPUMS International has brand-new low-level geographic variables and shapefiles

By Quinn Heimann

Map showing percentage of households with internet access in the 2014 Myanmar census by township
Map of Myanmar Internet Access

An ongoing goal and challenge for IPUMS-International (IPUMSI) is providing users with the most detailed geography possible. A particular obstacle is the confidentiality requirements agreed upon in order to distribute these census and survey samples. Nevertheless, IPUMSI has started releasing lower-level geographic variables for samples where the data are sufficient and confidentiality thresholds are still met. As of spring 2022, twenty samples have been released with third-administrative-level geographic data, covering ten countries across Africa and Asia. Accompanying shapefiles are also being distributed. Used together with these more granular geographic variables, they make it possible to map population trends in greater detail.
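To illustrate how a lower-level geographic variable and a shapefile fit together, here is a minimal sketch in plain Python. It assumes household records that carry a third-level unit code, a yes/no indicator, and a household weight, and it stands in for a shapefile's attribute table with a list of dictionaries. The field names (`geo3`, `internet`, `weight`, `name`) and the toy values are hypothetical placeholders, not actual IPUMSI variable names or data. In practice a GIS library would read the shapefile and draw the map.

```python
from collections import defaultdict

def weighted_share_by_unit(households, unit_key, flag_key, weight_key):
    """Return {unit_code: weighted percent of households where the flag is true}."""
    totals = defaultdict(float)
    hits = defaultdict(float)
    for hh in households:
        unit = hh[unit_key]
        w = hh[weight_key]
        totals[unit] += w
        if hh[flag_key]:
            hits[unit] += w
    return {u: 100.0 * hits[u] / totals[u] for u in totals if totals[u] > 0}

def attach_to_boundaries(boundaries, values, code_key, value_name):
    """Copy each boundary record and add value_name (None if the unit has no data)."""
    return [{**b, value_name: values.get(b[code_key])} for b in boundaries]

# Toy household records (hypothetical field names and values).
households = [
    {"geo3": 101, "internet": True, "weight": 10.0},
    {"geo3": 101, "internet": False, "weight": 30.0},
    {"geo3": 102, "internet": True, "weight": 20.0},
    {"geo3": 102, "internet": True, "weight": 20.0},
]

# Stand-in for a shapefile attribute table: one record per boundary polygon.
boundaries = [
    {"geo3": 101, "name": "Unit A"},
    {"geo3": 102, "name": "Unit B"},
    {"geo3": 103, "name": "Unit C"},
]

shares = weighted_share_by_unit(households, "geo3", "internet", "weight")
mapped = attach_to_boundaries(boundaries, shares, "geo3", "pct_internet")
```

The join is keyed on the shared unit code, so a boundary with no matching records (Unit C here) keeps a missing value rather than being dropped, which makes gaps in coverage visible on a map.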

Screenshot of the IPUMS International third-level download page
IPUMSI third level download page

Many of these countries have multiple samples with lower-level geographic variables available. IPUMSI always aims to provide users with as much detail as possible for each sample, but this is sometimes hindered by a lack of sufficient data or detail. Some countries, such as Bangladesh and Mali, have enough detail to support lower-level geography for all of their samples in IPUMSI. More recent samples often contain more detail and more thorough documentation, whereas older samples frequently lack this level of information.

Map series showing third administrative boundaries in Bangladesh, called Upazilas, in the 1991, 2001, and 2011 censuses for the entire country and the Dhaka urban area
Map of Bangladesh showing complete level3 series


Another challenge in distributing more granular geographic data is producing the related shapefiles. IPUMSI aims to provide accompanying shapefiles for all lower-level geographic variables it produces. However, these files can be harder to produce for some samples, either because adequate maps are not available or because the country is very large. One example is China, for which IPUMSI has just released lower-level variables. Because China is geographically very large and consists of more than 2,500 counties, processing is slower than for other countries. As a compromise, the IPUMSI team has released all currently available county-level variables for each China sample, along with a special GIS file that highlights select urban areas across the country for the 2000 sample. This combination is intended to give users as much data as possible, with supplemental geographic files available while the complete lower-level file is being processed.

Map showing median age by counties in Chongqing and Shanghai cities as well as their surrounding prefectures
Map of select China cities, showing adjacent areas

As IPUMSI moves forward with creating further low-level geographic variables, it is important to note the great amount of effort these variables require. Many datasets provided to IPUMS lack sufficient detail to publish geography beyond the first or second administrative level. Most of the time spent on these variables goes to matching the many codes and labels in the datasets to real-world boundaries. Often the data are present but adequate maps or shapefiles are not, as is the case with Ethiopia and Senegal. In these cases, IPUMSI works hard to disseminate as many years of data as possible, but the earlier years are omitted. IPUMSI hopes to obtain further funding and resources to continue producing low-level geographic variables and shapefiles.

IPUMS International 2022 Data Release

By Jane Lyon Lee, IPUMS International

IPUMS International has added 7 new census samples and new labor force surveys, including the first data release from the Slovak Republic and historical samples from Egypt for 1848 and 1868. The other newly added samples extend pre-existing series. The growing IPUMSI labor force survey collection has expanded with the addition of quarterly surveys from Mexico (ENOE 2005-2020) and more data from Spain and Italy. See a summary of the full IPUMS collection on the IPUMSI samples page.

IPUMS Announces 2020 Research Award Recipients

IPUMS is excited to announce the winners of its annual IPUMS Research Awards. These awards honor the best published research and nominated graduate student papers from 2020 that used IPUMS data to advance or deepen our understanding of social and demographic processes.

IPUMS, developed by and housed at the University of Minnesota, is the world’s largest individual-level population database, providing harmonized data on people in the U.S. and around the world to researchers at no cost.

There are six award categories, each tied to one of the following IPUMS projects:

  • IPUMS USA, providing data from the U.S. decennial censuses, the American Community Survey, and IPUMS CPS from 1850 to the present.
  • IPUMS International, providing harmonized data contributed by more than 100 international statistical office partners; it currently includes information on 500 million people in more than 200 censuses from around the world, from 1960 forward.
  • IPUMS Health Surveys, which makes available the U.S. National Health Interview Survey (NHIS) and the Medical Expenditure Panel Survey (MEPS).
  • IPUMS Spatial, covering IPUMS NHGIS and IPUMS Terra. NHGIS includes GIS boundary files from 1790 to the present; Terra provides data on population and the environment from 1960 to the present.
  • IPUMS Global Health, providing harmonized data from the Demographic and Health Surveys and the Performance Monitoring and Accountability surveys, for low- and middle-income countries from the 1980s to the present.
  • IPUMS Time Use, providing time diary data from the U.S. and around the world from 1965 to the present.

Over 2,500 publications based on IPUMS data appeared in journals, magazines, and newspapers worldwide last year. From these publications and from nominated graduate student papers, the award committees selected the 2020 honorees.

