Recently, in the New York Times editorial pages - the most prominent platform for short-form opinion writing in the United States and equal to any in the English-speaking world - Armand Marie Leroi, a professor of developmental biology, argued that the humanities, if they are to have a future, must make the transition to a mathematically based science.
There is much in Leroi's argument that I agree or sympathize with. The digitization projects of the last decade or more truly have changed what is possible in the humanities. We have easy access to a breadth of sources that was entirely unknown just a few decades ago. It is also true that scholars in the humanities sometimes make overly broad statements based on slim evidence, and that we sometimes make assertions with statistical implications without bothering to gather data or test the likelihood of those assertions. As Leroi states, "Digitization breeds numbers; numbers demand statistics." I've beaten this drum a few times myself. At the conceptual if not the computational level, statistical and computational methods are not out of our reach. With less work than it takes to learn Latin, we in the humanities can make these methods our own.
And yet several times while reading the essay, I found myself staring at the text and wondering what Leroi could possibly be thinking. Now that we have a decade of experience with digitization, we can recognize both its promise and its limits. Digitization does not turn "caterpillars into butterflies"; we have seen media change before, and we know that there are both gains and losses. The easy access to facsimile images obscures the difficulty of determining what other pamphlets were bound together as a single volume, for example, an important fact that would have once been obvious to anyone visiting an archive in person. And it is not only scientists who "know that impressions lie"; humanists have been studying representation and memory for a long time.
A telling episode in Leroi's essay involves a hypothetical graduate student who reacts to a traditional scholar's argument from textual evidence by downloading texts, running algorithms, applying statistical analysis, and visualizing the results in order to disprove the traditional scholar's point. Now, digital texts are marvelous things, but you have to understand what it is they represent. Is it an autograph? The first edition? The last authorized edition? A critical edition? A transcription of an early, fragmentary manuscript, or a late, complete one? You can postpone some of these questions, but you can't avoid them forever. Textual editing, electronic or not, is difficult, painstaking, and often thankless work. The point is that these downloadable texts don't simply exist; they have been created by people with particular outlooks and specific places and histories, and serious work in the humanities has to be aware of those aspects. And what algorithms should the graduate student run? The coin of the realm in the textual humanities remains close reading, with careful attention to context and levels of meaning. At the moment, the algorithms at our disposal enable only distant reading. It's certainly true that the graduate student and the scholar may end up talking past each other, but that won't do anyone any good (especially the graduate student, who will be fishing for recommendation letters when he or she hits the job market in a few years).
If we do in fact reach the point where the digital humanities expresses its results "not in words, but equations," where the "analog scholar won't even know how to read the results," then the digital humanities will fail. The humanities as academic disciplines have a particular set of guiding questions, and if a would-be contribution to the field does not address any of those questions, or does so in a way that is incomprehensible to practitioners, then it will be ignored. Leroi, a biologist, thinks that the new humanities disciplines will resemble evolutionary biology, with contributions from "biologists, economists, and physicists." While all of these disciplines have useful insights and methods for the humanities, what they do not have is a grasp of the questions that are of primary importance to humanists, or of the language humanists use to express their findings. It is furthermore not at all clear that the tools of evolutionary biology, where reproduction is the first imperative of the most basic building blocks of life, should apply to culture, where it is not.
It is not as if we have not been down this path before. There is a history of mathematical approaches to the humanities, and it is a history littered with dead ends. While lurking in the stacks as a graduate student at the University of Illinois, I would regularly come across books published in the 1970s and 1980s, precursors of a sort to the Mittelhochdeutsche Begriffsdatenbank, that attempted to semantically encode various medieval texts so that one could search not just for textual collocations but for significant semantic ones. I know of no useful scholarship that ever came out of these efforts. On a happier note, corpus linguistics has been a well-established discipline for several decades - but it has supplemented, not supplanted, other approaches to syntax and morphology. Leroi holds up the "unforgiving terms and journals that scientists read," and yet STEM peer review has not proved to be an effective arbiter of quality in the humanities. Instead, leading STEM journals have regularly published headline-grabbing articles that apply computational and statistical methods to historical linguistics, for example, and failed to recognize in the nonsensical results a mirror image of the Sokal affair. If the basic assumptions of one field (for example, that rates of gene mutation are predictable) simply don't apply to another (linguists really and truly reject glottochronology), then the methods will not be transferable, not because analog scholars are hidebound, but because they have a grounding in their disciplines that their neighbors across the quad simply do not have.