How Historical Data Become Public

Census Bureau employees in 1950 photographing records from the 1900 census for storage on microfilm. Photo: U.S. Census Bureau

An enormous amount of information about the characteristics and activities of ordinary people is about to become available for researchers to analyze: records on two billion people and their households, spanning more than 100 countries, from 1703 to the present day. All of these data will be available to the general public for computer analysis, free of charge, by 2018.

Counting—and Redefining—the Cost of War

Associate Professor of History and MPC Faculty Member J. David Hacker made headlines in 2011 when he published a groundbreaking study of the total number of U.S. Civil War dead. Hacker argued that the widely accepted figure of 620,000 was far too low. Using IPUMS, Hacker showed that the number of dead was at least 750,000, and possibly more. His article, “A Census-Based Count of the Civil War Dead,” published in Civil War History, was introduced by the journal's editors as “among the most consequential pieces ever to appear in this journal’s pages.”

Few demographic historians expect attention from the mainstream press when they publish their research, but Hacker’s study attracted national interest, including interviews with the New York Times and National Public Radio.

New Frontiers in Big Data

By 2020, MPC will make complete-count (100%) U.S. Census microdata through 1940 freely available to researchers worldwide. This dataset will include over 650 million individual-level records (1850-1940) and 7.5 million household-level records (1790-1840). These microdata represent the fruition of longstanding collaborations between MPC and the nation’s two largest genealogical organizations, Ancestry.com and FamilySearch, to leverage genealogical data for scientific purposes.

“The importance of this massive donation of census data would be difficult to overstate,” says MPC Director Steve Ruggles. “This is one of the largest-scale data-entry efforts ever undertaken.”
