Delivering data: Technology at the MPC

“Good IT is invisible,” says MPC IT Core director Fran Fabrizio. “You want the users to have the idea that it’s a magic black box.” Though the intent is for the technology behind IPUMS and the other MPC data tools to seem effortless, Fabrizio understands the extent of the human work that goes into producing good technology. Getting 2.6 terabytes of data out to users each week takes no small amount of technology.

“Simply put, we do two things: we get data ready for delivery and we build sites and tools that get the data to the user.” That simple-sounding task requires the work of 25 people to deliver data to 70,000 global users each week. The IT unit at the MPC is divided into three groups: data delivery, data production, and operations, with Fabrizio in the lead.

It’s a huge IT staff for a population center, but Fabrizio considers it a skeleton crew when compared to his private industry peers. “We try to be as efficient as possible so that we can get the most out of our grant money.”

The IT core recently made an investment in user experience. “It’s a new focus for us,” Fabrizio notes. The user experience work has focused primarily on the Terra Populus website. “We hired our first dedicated UX designer, Alex McWhinnie, last year to help us have a user-centered design process when we build our sites. We want to think through issues with a usability perspective, and get feedback about our work from user testing so that we can be data-driven about design.”

Fabrizio and his group are looking into the future of data access, as well as the design of the sites themselves. “Right now we are giving a lot of consideration into how we get data to our users,” he says. The MPC data websites were first launched in the 1990s as a way to get data to users quickly. This was a significant improvement over the previous method, which meant filling out a form and waiting for data to arrive in the mail, but as technology has changed, we now have the potential for alternate delivery methods. The group is researching new ways to get data to users, including giving IPUMS users open access to data and metadata via API. Fabrizio says, “Our current method, via the websites, will continue to be the primary way we deliver data, but it can be constraining to some of our users, particularly those who want to interact with our data via their own software. We need to think about how we can get that data out to them more easily, more efficiently. There are a lot of technically difficult issues, and it’s going to take us a while to figure this out.”

The sheer amount of data that the MPC is on track to release by 2020, 2 billion records, is spurring the IT group’s progress. “If you consider how much the size of the data has grown over the years, it’s easy to understand how scalability plays a role in the future of the MPC,” Fabrizio explains. “Our current tools were created with assumptions of data sizes that we have long grown past.”

“It’s an interesting situation to be in,” he continues. “We are happy that there is a continuous need for our products, and we are focused on finding faster, easier and more friendly ways to continue our mission — getting data to users.”

