Associate Professor of History and MPC Faculty Member J. David Hacker made headlines in 2011 when he published a groundbreaking study of the total number of U.S. Civil War dead. Hacker argued that the widely accepted figure of 620,000 was far too low. Using IPUMS, Hacker showed that the number of dead was at least 750,000, if not more. His article, “A Census-Based Count of the Civil War Dead,” published in Civil War History, was introduced by the journal’s editors as “among the most consequential pieces ever to appear in this journal’s pages.”
Few demographic historians expect attention from the mainstream press when they publish their research, but Hacker’s study attracted national interest, including interviews with the New York Times and National Public Radio.
Hacker’s work is not done, however. The 750,000 figure came from existing IPUMS 1% samples of the 1850, 1860, 1870, and 1880 censuses. The MPC’s forthcoming major data expansion promises full-count historical census data, which will allow Hacker to make an even more accurate Civil War count and to examine the impact of the war on population and family structure. Coupled with the MPC’s charge to link these huge datasets, this data enables historical scholarship that was never before possible. The new information stands to redefine how we think about the demographic costs and consequences of war, and could have a major impact outside of American social history and demography.
Data Expansion and Linkage
In our last newsletter, we reported on the collaboration among the MPC, Ancestry.com, and FamilySearch that will bring 100% count U.S. census data through 1940 (over 650 million individual-level records and 7.5 million household-level records) to the public by 2020. The MPC will add significant value to this massive influx of data by digitizing variables necessary for demographic study, such as occupation, and then linking individuals across the 1850, 1860, 1870, and 1880 censuses. The mortality schedules will also be linked with the 1860, 1870, and 1880 population schedules, and slaves in the 1860 slave schedules will be linked to owners in the 1860 free schedules.
When the data linkage process is finished, historians and other social science researchers will have unprecedented access to historical evidence from the 19th century. If this data represents the “holy grail of American social history,” as Matt Sobek, MPC Director of Data Integration, has suggested, then 19th-century American historians in particular stand to benefit from revolutionary access to a “unique laboratory for developing and testing models of demographic and economic change.”
The Holy Grail of Data
The individual-level, linked, full-count census data will include variables crucial for advances in demographic research. Occupational and wealth variables are particularly illuminating for researchers. Individual-level mortality data, for example, allow a close look at relationships among socioeconomic status, illness, and disability to better illustrate patterns in mortality. For marriage scholars, this data enables longitudinal analysis of racial differences in marriage and of the differential effect of postwar economic opportunity on the marriage market.
“The war is by far the largest demographic shock in American history,” explains Hacker. Prior to the Civil War, the United States was an exceptional case when it came to demographic growth. Rapid population growth was a defining characteristic of the U.S. on the international stage, and was lauded as proof of the strength of the new nation by American leaders. Growth was predictable, as well; early demographic observers could calculate population growth well into the future with confidence.
The Civil War changed this demographic regime abruptly. There was the obvious effect of 750,000 dead young men. In addition, women and couples began consciously limiting births, greatly affecting the typical family size. At the same time, the beginnings of the public health movement and other factors meant a declining mortality rate. And all of these changes accompanied another demographic shock: the emancipation of 4.5 million enslaved African Americans.
What Does the Data Do?
Hacker’s 750,000 figure had its admitted limitations. With only 1% sample microdata at his disposal, he could calculate losses only for the white population, and had to rely on existing estimates for the African American population, which he and others have long suspected are too low. The new data are full-count, and will be of sufficient quality and density to estimate more accurately than ever the impact of the war on the African American population.
Prior to emancipation, the African American population was tightly constrained by the white population, including close regulation of family formation. Emancipation meant that 4.5 million people were free for the first time to decide where they wanted to live, how they wanted to earn a living, and what their households and families would look like. This was, to put it mildly, a transformative moment for the Black population.
To be sure, historians and social scientists have struggled for over a century to understand the effects of slavery and emancipation on the African American population, particularly the long-term effects. There are many studies on post-emancipation family structure, for example, but most are based on small microdata samples. Outside of demographic circles, there are long-standing social and political debates on the long-term effects of slavery. “The full-count data will allow us to do a much better job on these questions than ever before,” says Hacker.
Fertility, mortality, family structure, migration, and other big topics of research enabled by this data expansion are all pieces of a bigger story: the demographic costs and consequences of war. The consequences of war, emancipation, and slavery are with us in 2016, in subtle and obvious ways, and this data allows us to know much more about them. But this is not just about expanding knowledge. What counts (literally) as the consequences of the Civil War? This major issue for scholars and the public is one that the data expansion stands to transform.
Story by Melissa Kelley