Scientists are just as confused as you are about the ethics of big data research
When a researcher published 70,000 OkCupid profiles last week, complete with usernames and sexual preferences, people were pissed off. When Facebook researchers manipulated how stories appeared in news feeds for a 2014 mood contagion study, people were really pissed off. OkCupid filed a copyright claim to take down the dataset; the journal that published the Facebook study issued an “expression of concern.” Outrage has a way of shaping ethical boundaries. We learn from our mistakes.
Surprisingly, the researchers behind both of these big data blowups never anticipated the public outcry. (The OkCupid research does not appear to have gone through any ethics review process, and Cornell’s ethics review board declined to review the Facebook study, citing the limited involvement of the two Cornell researchers.) Which shows how untested the ethics of this new field of research are. Unlike medical research, which has been shaped by decades of clinical trials, the risks and benefits of crawling large semi-public databases are just starting to become clear.
And the patchwork of review boards tasked with overseeing those risks is only slowly entering the 21st century. Under the Common Rule in the United States, federally funded research must undergo ethics review. Rather than a single unified system, each university has its own Institutional Review Board, or IRB. Most IRB members are university researchers, typically in the biomedical sciences. Few are professional ethicists.
Even fewer have computer science or security expertise, which may be necessary to protect participants in this new kind of research. “The IRB can make very different decisions depending on who sits on the board, what university it is, and how they feel that day,” says Kelsey Finch, policy counsel at the Future of Privacy Forum. There are hundreds of such IRBs in the United States, and they grapple with the ethics of research in the digital age largely on their own.
The Common Rule and the IRB system were also born of outrage, but over a far more serious mistake. In the 1970s, the public finally learned of the decades-long Tuskegee experiment, in which the US government withheld syphilis treatment from African-American sharecroppers in order to study the disease’s progression. The scandal led to new regulations on human subjects research conducted for the US Department of Health and Human Services, which then spread to all federal agencies. Now, any institution that receives federal funding must set up an IRB to oversee research involving humans, whether it’s a new flu vaccine or an ethnography of carpet sellers in Turkey.
“The structure was largely developed by health agencies for experimental research,” says Zachary Schrag, a historian at George Mason University and author of a book on IRBs in the social sciences. But not all human research is medical in nature, and many social scientists find the process ill-suited to their work, where the risks are usually subtler than life or death.
Some IRB requirements can seem ludicrous when applied to the social sciences. Informed consent forms, for example, often include the phrase “the alternative to participation is…” to allay a patient’s fear that refusing to participate might mean being denied medical treatment. But if you are recruiting volunteers for a survey about texting habits, the only way to complete the phrase is the obvious “the alternative to participating is not participating.”
Social scientists have been complaining loudly about IRBs for some time now. The American Association of University Professors has advised increasing the number of social scientists on IRBs, or establishing separate boards to review only social science research. In 2013, it went so far as to issue a report recommending that researchers themselves decide whether their minimal-risk work requires IRB approval, which would also free up IRBs to devote more time to biomedical research with life-or-death stakes.
This is not to say that social science research in general, and big data research in particular, is risk-free. But with new technology, a system that never worked perfectly now works even less well.
Elizabeth Buchanan, an ethicist at the University of Wisconsin-Stout, sees internet research entering a third phase that raises new ethical questions. The first phase began in the 1990s with internet surveys; the second came with data from social media sites. Now, in the third phase, researchers can buy, say, years’ worth of Twitter data and merge it with other publicly available data. “It’s in that combining that you can see the tension between ethics and privacy,” she says.