A Magazine for the George Mason University Community

Bringing Big Data to Social Science

March 12, 2013

By Andy Brown

Anne L. Washington started her career working for start-up technology firms, where employees ate catered meals, played with toys, and worked extraordinary hours. Then she found herself working in the technology department of a long-established corporation.

“Everybody had gone to Franklin-Covey training and was carrying the same day planner,” she says. “I thought, ‘Whoa, there are many, many ways to work in this world.’”

Anne Washington

With many different ways of working come challenges to sharing knowledge. “My job in the technology department was helping to weave information together into something coherent, and that’s when I first got really interested in how we organize information into what is important and not important,” Washington says.

An assistant professor in Mason’s Organization Development and Knowledge Management program, Washington has since extended her interests to the federal government. “Congress is fascinating from a knowledge management perspective,” she says. “One of the key phrases in knowledge management is, ‘If we only knew what we knew.’ In other words, the knowledge is in here, everyone knows this, but we can’t find it.”

Another difficulty is that government collects so much data that individual researchers cannot work through it without the help of computational science. “There’s a whole part of how government functions that’s not in the light, but it could be if you used computational methods,” Washington says. Recently, she was awarded a National Science Foundation (NSF) grant to help make that happen.

“The idea of the grant is to bring big data to social science,” she says. “Ultimately, it’s about getting social scientists to ask new research questions and expand the research methods available to them.”

Social scientists researching Congress often work alone and with small data sets. A typical data set, for example, might be the Congressional Record from a single congressional session.

“A lot of research on Congress is about what happens on the floor of the House and Senate, because there’s the Congressional Record, because there are voting records,” Washington says. “The data exist to ask those questions, so these are the questions people ask.”

Jennifer Barrett

However, between the 2009 Open Government Directive and recent congressional committee initiatives, government is now systematically publishing digital information. Computational science methodologies could help social scientists expand their data sets to include entire legislative histories, committee records, witness testimonies, agency reports, rule-making comments, and other documents.

Washington’s grant supports interdisciplinary research designed to pair computer scientists with social scientists. In the first year, she and her co-lead investigator will build a repository of government data for the academic community. The data will come from existing open government information, including text, video, and audio files, and will be of interest to both computational and social scientists. Researchers will be encouraged to develop papers and present at conferences in their fields, whether in computational science, linguistics, technology, political science, or others.

In year two, everyone who used the data to generate papers will be invited to a workshop funded by the NSF grant. “Once we’re in the same room and begin conversations, from there we’ll build research partnerships,” Washington says.

In year three, Washington will host a second workshop where collaborators will present the findings from their research. Another part of the grant calls for the creation of a learning module, so that the next generation of social scientists and policy analysts can be trained with the computational skills needed to conduct big data research.

Beyond the grant’s immediate goal of creating interdisciplinary networks, Washington sees it having a broader impact tied to government transparency. Disseminating more data and research leads to a more fully informed public. And compared with other countries, the United States is far behind.

“In the European Union, they’ve already done this,” Washington says. “You can click on a bill, see schedules, see who supported it, how one party voted with another party, how coalitions formed, what members said. It’s not secret. It’s all online and very easily manipulated data. And that’s where we’re going.”
