A Vision for Open Science Beyond the Reproducibility Crisis
- Authored by John D. Boy
This Saturday I was invited to give a talk at the very first ReproHack in the Netherlands, held at the university library in Leiden. Because it was my first time talking on the topic of open science, I wrote up some of my thoughts that have been brewing over the past decade and spoke mostly on the basis of the notes that follow. I thought I’d share them here.
The slides for my talk included some wonderful photographs from my colleague Andrew Littlejohn’s fieldwork that I am not reproducing here, so you’ll have to use your imaginations—something that I encourage doing anyway!
I’m here to talk about open science. The title of my talk is “A vision for open science beyond the reproducibility crisis.” With this title I wanted to hint at two areas I’m hoping to touch on in the next twenty or so minutes.
First, I’ll talk a bit about what open science is or can be in fields not afflicted by the reproducibility crisis, such as my own.
I am speaking about this topic from my background as a mostly qualitative researcher. My background is in sociology, and I now work in the anthropology department here in Leiden. In a way, I’m joining you here today as a stranger from another land to give you a native’s point of view on our curious habits and customs.
My anthropology colleagues like to say that there’s much to be learned from studying the unfamiliar—not just because it makes the unfamiliar more familiar, but also because it makes the familiar unfamiliar. It helps us to clarify our own assumptions, such as our ways of classifying and knowing the world around us.
Second, I will advocate for imagining what we want open science to entail beyond concerns for reproducibility, transparency and accountability. Our imaginations can be enriched by our encounters with strangers (I’m sure there’s robust psychological research to back me up on that!), so I’ll use my standpoint to sketch some parts of that vision. But I also want that part to be more of a conversation, because I don’t have all the answers, just a few suggestions.
So let me tell you a little bit about the world I come from. This is my colleague Andrew Littlejohn. He is an anthropologist of Japan, and for a major research project of his, he studied the aftermath of the tsunami in rural areas along the coast.
You may remember the 2011 tsunami in particular because of the catastrophic accident at the Fukushima nuclear power plant. But aside from that, the tsunami killed over 20,000 people in coastal regions. Its impact was devastating.
In the midst of this crisis, policymakers wanted to find ways to protect the coastal areas from being impacted like this again. One popular fix is the sea wall. Sea walls can be up to 15 meters high, and advocates claim that they can prevent the worst when another tsunami happens. Andrew became interested in these sea walls because they became targets of protests and resistance by locals. “Why are they resisting something that is supposed to keep them safe?” he asked.
Andrew writes, “The walls might protect people’s lives, but they come with a cost for identities and livelihoods: things ‘without which,’ one local ecologist told me, ‘it doesn’t matter if one lives.’” Identities were threatened by sea walls because they completely destroy the sense of place among residents of coastal villages. Livelihoods were also at stake, because sea walls have the consequence that fishers cannot survey what they call “the face of the sea” to decide on a suitable fishing tactic when their vision is cut off by a giant wall.
Why do I bring this up? For two reasons.
First, Andrew’s research project is a good illustration of how we qualitative researchers work. We often enter the field with a broad question or puzzle that we want to address, but without any specific hypotheses to put to the test. We go into a research setting, build relationships, ask questions, take lots of notes, gather documents, draw maps, take pictures and do all manner of things to build up a detailed understanding of the situation. As our understanding evolves, we’ll often have conversations with our informants about the interpretation we’re arriving at and ask for their input, so that the knowledge gained isn’t just an individual researcher’s achievement, but actually a collective one (even if ethnographies are often later published as books with just a single author).
Here you see Andrew with a group of citizens who were protesting the construction of a sea wall. As part of his research, he became friends with them and shared many a meal with them.
Second, there’s a conceptual point that Andrew’s research makes that I think is applicable to my topic today. Andrew argues that institutional or infrastructural fixes that are devised in the midst of a crisis often have unintended consequences further down the line, because those trying to fix the crisis are blind to other things going on. Sea walls may help solve one issue (though their effectiveness is very much in question), but they create a host of new ones, and as such they imperil what they were supposed to protect: lives and livelihoods.
Many champions of open science today are pushing open science as a fix for the reproducibility crisis. Be transparent, preregister your hypotheses, share your data, open up your code—these are all ways to address the crisis of confidence (leaving aside for the moment the question of whether these things are intrinsically right to do as a matter of sound science). The question is, in pushing for this form of fix—the institutionalization of open science practices—in the midst of a crisis, is there perhaps something we are failing to take into account, something that will have to bear the brunt of some unintended consequence down the road?
I’m afraid the answer is: very likely, yes.
boundary-work
Take this statement by an influential American sociologist: “To be a social science discipline, sociology needs to adopt standards for transparency and reproducibility. All science is moving this way. Some parts of the discipline can’t or won’t. This may solidify quant/qual as science/nonscience & I’m not sure the discipline can survive it.”
What he’s saying is that, due to the ineluctable institutionalization of open science practices, research that can’t or won’t fall in line will be considered less scientific, and as such may become marginalized and delegitimated. The end result may be that the discipline of sociology breaks apart along a fault line.
In other words, the institutionalization of open science may become bound up with boundary-work, that is, the delimitation of more or less legitimate ways of knowing.
So what is it about “qual” work that means that it “can’t or won’t” adopt transparency and reproducibility practices? A lot of that should already be clear from what I said earlier about the way most qualitative research works.
We can’t preregister hypotheses because we’re not doing null hypothesis significance testing. We follow a different logic of inquiry.
We can’t share our data because it consists of records of personal interactions and intimate settings that we cannot simply make available to others. We can be transparent about some aspects of our research process, but sometimes even revealing the exact location or groups studied can be too much. There’s a discussion among ethnographers about the extent to which we should use “masking” as a default (i.e., hiding people behind pseudonyms), but even so, most ethnography simply wouldn’t work if we couldn’t assure participants some level of anonymity.
Matthew Desmond, an ethnographer who wrote the Pulitzer Prize-winning book Evicted, “hired an independent, named fact-checker who was given access to otherwise confidential materials” (as reported here). That is probably the exception that proves the rule that one-size-fits-all standards for “transparency” or “accountability” are unattainable in most ethnographic work. (Fact-checking also only covers one thing, namely facts, and not conclusions or interpretations, where things are arguably more likely to go wrong.)
Finally, sharing code. “Coding” is now a pretty standard part of qualitative research in the sense that source material is assigned labels (called “codes”) to aid the recognition of recurring patterns. But even with the most rigid coding scheme, qualitative work still fundamentally rests on interpretation, and interpretation cannot be translated into an algorithm. It is a manual, reflexive process, and it is also—as I said before—a collective process that may involve a dialogue with the “subjects” being studied.
interpretive labor
So, does that mean there are no open science practices in my world? Absolutely not. But the reaction to the term “open science” among my colleagues tends to be a reaction to the idea of imposing one-size-fits-all transparency policies. Not because they are anti-transparency or because they can’t argue that the principle of “as open as possible, as closed as necessary” exempts them from sharing their fieldnotes and other raw data from the field.
It’s because of the interpretive labor that’s demanded of them when yet another aspect of their work is bureaucratized. And you may be able to guess by now that their research work is really not compatible with bureaucracy, because it’s the work of forming relationships and forming understandings through an open-ended, iterative process of inquiry.
The autonomy of research has already been severely circumscribed because we now have to write research grants for everything. They see the dominant open science discussion as another way to take away their autonomy over their own work. That would be another unintended consequence.
epistemology
Finally, reproducibility is not as big a concern in my world because we follow a different epistemology. The reason there’s a crisis of confidence in psychology and other fields afflicted by the reproducibility crisis has a lot to do with the kind of knowledge that is produced. Scholars in these fields often seek to isolate effects that are then held up as general laws, like laws of nature. Some of this is a problem with how this research is reported on in popular outlets—“scientists find that we are hard-wired to X”—but psychologists often invite these kinds of impressions. So then when it turns out that the effect in question was an artefact of the research process, this has pretty strong repercussions. Whereas in my world, we emphasize how contextual and contingent our findings are, and we rarely formulate them in law-like terms. They are offered as concepts that help us make sense of the complex web of relations and meanings we live our lives in.
As an aside, I hope that one of the lessons learned from the reproducibility crisis is to practice a bit more humility regarding how knowledge is presented. Even if findings are reproducible, it doesn’t necessarily mean they have a great degree of external or ecological validity: it doesn’t tell us whether we’d find an effect outside of the highly artificial conditions of a lab, survey or field experiment. In fact, since reproducibility is greatest when the process of inquiry takes on the characteristics of a pure function that yields the same outputs for the same set of inputs, it may even come at a cost to external validity.
But there are open science practices that we fully embrace; they’re just not the ones getting all the attention. First, anthropologists are already some of the best when it comes to open access publication. And in 2020, 13 major anthropology journals will turn fully open access as the result of an initiative called Libraria aiming to build a “more open, diverse, community-controlled scholarly communication system.”
Second, anthropological professional associations already long ago made it part of their professional ethics codes to “make your results accessible” which also means that anthropologists are mostly barred from doing wholly proprietary or classified research. Sociologists similarly have a principle of “social responsibility” which states that they should “apply and make public their knowledge in order to contribute to the public good.”
These concerns and principles point in three directions—and this is where we get to the imagination part of my talk.
First, projects like Libraria are attempting to increase the autonomy of researchers by putting publication channels under the cooperative control of those who do the work rather than under corporate control. In this context open science doesn’t become an administrative imposition and burden, but a project to seize control of the means of knowledge production.
Second, the ethical codes and professional norms aim at enriching what we could call the knowledge commons. What is the commons? Historically it has referred to the shared grasslands that were not under private ownership where, for instance, everybody’s sheep could graze. When the commons was enclosed and turned into privately owned land, that led to a stratification between those who could and those who couldn’t afford land to have their sheep graze on. This cemented class differences, and meant that those without land had to sell their labor power to those with land in order to make a living. Hello capitalism.
In the field of knowledge production there have also been enclosures that have cut some people off from the means of knowledge production. There’s the boundary between academia and the general public, and there are class differences within academia between successful Principal Investigators who get grants and those who do the grunt work.
Open science in my view should enrich the knowledge commons by bringing those outside academia into the process of knowledge production, not just “knowledge utilization,” as it’s known in the jargon here. That means empowering them with tools, literature and all the other means needed to create knowledge.
It should also help to level class differences within academia. Open access can be a big part of this, because it makes it possible for universities without huge budgets to have access to the latest findings. Free and open source software is another important part of this, because software licensing fees, software patents and so forth are a way of keeping some people out of the fields of knowledge production. This is something I’d like to see open science advocates talking much more about: what does it mean for something to be “free as in academic freedom” (#faiaf)?
Third, the open science initiatives in my world are collective projects. They don’t try to make individual researchers more accountable, but they aim at the institutional conditions and community norms in which knowledge production is embedded. If we continue thinking along these lines of open science as a collective project, then we have to reach the conclusion that open science also has to be a political project. At some level, it has to take aim at the political economy of knowledge production. That can be at a structural level—governmental and other large funders. But at the level of our institutions we can also push for the changes that would improve scholarly work. One example is pushing for slower cycles so that researchers have the breathing room to do what they need to do.
By way of summary and to begin the discussion, let me end with this call to action: Let us imagine open science as a collective project that expands our autonomy and enriches the knowledge commons.