Peer-to-Peer Review and Its Aporias
Over the course of last week, a huge number of friends and colleagues of mine posted links and notes on Twitter and around the blogosphere about Mike O’Malley’s post on The Aporetic about crowdsourcing peer review.
It probably goes without saying that I’m in great sympathy with the post overall. I’ve invested a lot of time over the last couple of years in testing open peer review, including the experiment that we conducted at MediaCommons on behalf of Shakespeare Quarterly, which has been written about extensively in both the Chronicle of Higher Education and the New York Times. And of course there was my prior experiment with the open review of my own book manuscript, whose first chapter focuses in great detail on this new model of peer review, and which has been available online for just over a year now.
It’s gratifying to see other scholars getting interested in these wacky ideas about reinventing scholarly publishing that I’ve been pushing for over the last several years. In particular, the entry of scholars who are relatively new to the digital into these discussions confirms my sense that we’re at a tipping point of sorts, in which these new modes, while still experimental, are beginning to produce enough curiosity in mainstream academic circles that they’re no longer automatically dismissed out of hand.
All that said, I do feel the need to introduce a few words of caution into these discussions, because the business of open peer review isn’t quite as straightforward as simply throwing open the gates and letting Google do its thing. O’Malley argues that Google “is in effect a gigantic peer review. Google responds to your query by analyzing how many other people — your ‘peers’ — found the same page useful.” And this is so, up to a point; Google’s PageRank system does use inbound links to help determine the relevance of particular pages with respect to your search terms. But what relationship PageRank bears to the category of folks you might consider your “peers,” however democratically you construct that term, needs really careful consideration. On the one hand, Google’s algorithm remains a black box to most of us; we simply don’t know enough about how its machine intelligence self-adjusts to take it on faith as a reliable measure of scholarly relevance. And on the other, the human element of PageRank (the employment of Search Quality Raters who evaluate the relevance of search results, and whose evaluations then affect the algorithm itself), together with the fact that this human element has been kept so quiet, indicates that we haven’t yet turned the entire business of search on the web over to machine intelligence, and that we’re still relying on the kinds of semi-secret human ratings that peer review currently employs.
To put it plainly: I am absolutely committed to breaking scholarly publishing of its dependence on gatekeeping and transforming it into a Shirkyesque publish-then-filter model. No question. But our filters can only ever be as good as our algorithms, and it’s clear that we just don’t know enough about Google’s algorithms. O’Malley acknowledges that, but I’m not sure he goes quite far enough there; the point of opening up peer review is precisely to remove it from the black box, to foreground the review process as a discussion amongst peers rather than an act of abstracted anonymous judgment.
That’s problem number 1. The second problem is that peer review as we currently practice it isn’t simply a mechanism for bringing relevant, useful work into circulation; it’s also the hook upon which all of our employment practices hang, as we in the US academy have utterly conflated peer review and credentialing. As a result, we have a tremendous amount of work to do if we’re going to open peer review up to crowdsourcing and/or make it an even partially computational process: we must simultaneously develop credible ways of determining the results of that review and, even more importantly, ways of analyzing and communicating those results to other faculty, to administrators, and to promotion and tenure committees, such that they will understand how these new processes construct authority online. It’s clear that the open peer review processes that I’ve been working with provide far more information than does the simple binary of traditional peer review’s up-or-down vote, but how to communicate that information in a way that conventional scholars can hear and make use of is no small matter.
And the third issue, one that often goes unremarked in the excitement of imagining these new digital processes, is labor. Most journal editors will acknowledge that the hardest part of their job is reviewer-wrangling; however large their list of potential peer reviewers may be, a tiny fraction of that list does an overwhelming percentage of that work. Crowdsourcing peer review presents the potential for redistributing that labor more evenly, but it’s only potential, unless we commit ourselves to real participation in the work that open peer review will require. It’s one thing, after all, for me to throw my book manuscript open for review — a process in which I received nearly 300 comments from 44 unique commenters — but what happens when everyone with such a manuscript uses a similar system? How much time and energy are we willing to expend on reviewing, and how will we ensure that this work doesn’t end up being just as unevenly distributed as is the labor in our current systems of review?
This difficulty is highlighted by the fact that many of the folks who have written excitedly about the post on The Aporetic are people who know me, who know my work, and yet who were not commenters on my manuscript. Not that they needed to be, but had they engaged with the manuscript they might have noted the similarities and drawn relevant comparisons in their comments on this later blog post. This is the kind of collaborative connection-drawing that will need to live at the forefront of any genuinely peer-to-peer review system, not simply so that the reviews can serve as a form of recommendation engine, but so that scholars who are working on similar terrain can find their way to one another’s work, creating more fruitful networks for collaboration.
There are several other real questions that need to be raised about how the peer-to-peer review system that I hope to continue building will operate. For instance, how do we interpret silence in such an open process? In traditional, closed review, the only form of silence is a reviewer who fails to respond; once a reviewer takes on the work of review, she generally comments on a text in its entirety. In open review, however, and especially one structured in a form like CommentPress, which allows for very fine-grained discussion of a text section by section and paragraph by paragraph, how can one distinguish between the silence produced by the absence of problems in a particular section of a text, the silence that indicates problems so fundamental that no one wants to point them out in public, and the silence that results from the text simply having gone overlooked?
And that last possibility raises the further question of how we can keep such a peer-to-peer review system from replicating the old boys’ club of publishing systems of yore. However much I want to tear it down, the currently existing system of double-blind peer review was in no small part responsible for the ability of women and people of color to enter scholarly conversations in full; forcing a focus on the ideas rather than on who their author was or knew had, at that time, a profoundly inclusive result.
That blind review is now at best a fiction is apparent; that it has produced numerous flaws and corruptions is evident. It’s also clear from my work that I am no apologist for our current peer review systems.
But nonetheless: I’d hate to find us in a situation in which a community of the like-minded (the cool kids, the in-crowd, the old boys) inadvertently excludes from its consideration those who don’t fall within its sphere of reference. If, as I noted above, our computational filters can only ever be as good as our algorithms, the same is doubly so in a human filtering system: peer-to-peer review can only be as open, or as open-minded, as those who participate in it, those whose opinions will determine the reputations of the texts on which they comment and the authors to whom they link.
Most of the information about Search Quality Raters came to me through a conversation with Julie Meloni, who also pointed out that for a glimpse of what a purely machine-intelligence driven search engine might produce, we can look at the metadata train wreck of Google Books. For whatever reason, Google has refused to allow the metadata associated with this project to be expert-reviewed, a situation that becomes all the more puzzling when you take the Search Quality Raters into account.