Kari Kraus is an assistant professor at the University of Maryland with a joint appointment in the College of Information Studies and the Department of English. Her research and teaching interests focus on new media and the digital humanities, digital preservation, intellectual property, game studies, and textual scholarship and print culture.
Authored for DHQ; migrated from original DHQauthor format
Broadly conceived, this article re-imagines the role of conjecture in textual scholarship at a time when computers are increasingly pressed into service as tools of reconstruction and forecasting. Examples of conjecture include the recovery of lost readings in classical texts, and the computational modeling of the evolution of a literary work or the descent of a natural language. Conjectural criticism is thus concerned with issues of transmission, transformation, and prediction. It has ancient parallels in divination and modern parallels in the comparative methods of historical linguistics and evolutionary biology.
The article develops a computational model of textuality, one that better supports
conjectural reasoning, as a counterweight to the pictorial model of textuality that
now predominates in the field of textual scholarship. Computation is here
broadly understood to mean the manipulation of discrete units of information, which,
in the case of language, entails the grammatical processing of strings, rather than
the mathematical calculation of numbers, to create puns, anagrams, word ladders, and
other word games. The article thus proposes that a textual scholar endeavoring to
recover a prior version of a text, a diviner attempting to decipher an oracle by
signs, and a poet exploiting the combinatorial play of language collectively draw on
the same library of semiotic operations, which are amenable to algorithmic
expression.
The intended audience for the article includes textual scholars, specialists in the digital humanities and new media, and others interested in the technology of the written word and the emerging field of biohumanities.
In an essay published in the
"scientific," a word whose positivistic associations conflict with traditional humanistic values of ambiguity, open-endedness, and indeterminacy
Ramsay's rich and nuanced answer to that question has been formulated across two publications, of which
stochastic aesthetics of the
"generative aesthetics" of the Dadaists and Oulipo
"Ironically," Ramsay writes, "it is not the methods of the scholar that reveal themselves as 'computational,' but the methods of the gematrist and the soothsayer"
With these observations in mind, the principal purpose of this essay is to establish an
ancient lineage for what we might call
computational processes within the rich tradition of
interpretive endeavors [are] usually aligned
(my emphasis) pre-interpretive,
Broadly conceived, then, this essay re-imagines the role of conjecture in textual scholarship at a time when computers are increasingly pressed into service as tools of reconstruction and forecasting. Examples of conjecture include the recovery of lost readings in classical texts and the computational modeling of the evolution of a literary work or the descent of a natural language. Conjectural criticism is thus concerned with issues of transmission, transformation, and prediction (as well as retrodiction).
In the first section of the essay I attempt to define conjecture, often viewed in the humanities as a misguided and anti-methodical pursuit, and rationalize it as a form of subjunctive knowledge, knowledge about what might have been or could be or almost was. The object of conjecture is notional rather than empirical; possible rather than demonstrable; counterfactual rather than real. This subjunctive mode, I contend, is not antithetical to the humanities, but central to it. Whether it is a student of the ancient Near East deciphering a fragmented cuneiform tablet or a musician speculatively completing Bach's unfinished final fugue or a literary scholar using advanced 3D computer modeling to virtually restore a badly damaged manuscript, the impulse in each instance — vital and paradoxical — is to go beyond purely documentary states of objects.
The essay develops a computational model of textuality, one that better supports
conjectural reasoning, as a counterweight to the material model of textuality that now
predominates. By "string," I mean a sequence of
non-numeric symbols, such as letters. The term, which derives from computer science,
is explained below.
In an effort to overcome common objections to analogies between the humanities and
sciences,
In textual scholarship, conjecture is the proposal of a reading not found in any extant witness to the text. It is predicated on the idea that words are always signs of other words; the received text also harbors the once and future text. When A. E. Housman alters a line from Catullus's
"Opis" so that it is understood not as the Latin term for "power," but as the genitive of "Ops," the mother of Jupiter in Roman mythology, he substantially alters the meaning of the line: "Peleus, most dear to his son, is the protector of the power of Emathia," becomes "Peleus, protector of Emathia, most dear to the son of Ops [Jupiter]." The emendation, as the fictional Housman explains in Tom Stoppard's
How can Peleus be carissime nato, most dear to his son, when his son has not yet been born?
Expressed in grammatical terms, conjecture operates in the subjunctive and
conditional moods.
Conjecture has at various moments in history held a place of honor in our repertory
of editorial paradigms. Taken collectively, for example, Housman’s critical writings
offer a magnificent apologia of the fine flower
of textual criticism he calls it, choosing
metaphor over definition
The scholarly language can at times suggest the intervention of a supernatural agent.
The classicist Robin Nisbet speaks of a Muse of Textual Conjecture,
whom he
playfully christens, appropriately enough, a happy conjecture
; demands more than humanity
possesses
What does this general silence indicate about conjecture, its status in textual
criticism, and its practice? E. J. Kenney’s position (which he officially co-opts
from the nineteenth-century critic Boeckh, but which nevertheless seems to be
prevalent among Classical editors) that the solution to the more difficult
cases
of conjecture comes in a flash or not at all
At its inception, a [romantic] poem is
an involuntary and unanticipated donnée
Would it be possible for conjecture to assume again a position of prominence in editorial theory? It would take a concerted inquiry into history, method, and theory. We would need to seriously engage the psychology of intuition, the AI and cognitive literature of inference, and the history of conjecture as divinatory art and scientific inquiry. We would need to keep pace with advancements in historical linguistics and evolutionary biology (two fields whose relevance to textual criticism is vastly under-appreciated) and the burgeoning psycholinguistic literature of reading. Indeed, we would reap many rewards from outreach to the linguistic community. We would need to dispel the myth that conjecture is a riddle wrapped in a mystery inside an enigma. A curriculum for conjecture would give us better insight into the method underlying the prediction (or retrodiction) of ancient readings of texts. Not least, we would need to work decisively to bring conjectural criticism into the 21st century. Because it has traditionally been described as a balm to help heal a maimed or corrupted text, conjecture is in desperate need of a facelift; the washed-up pathological metaphors long ago ceased to strike a chord in editorial theory.
So what would an alternate language of conjecture sound like?
We might, for starters, imagine conjecture as a knowledge toolkit designed to perform
"what if" analyses across a range of texts. In this view, the text is a
semiotic system whose discrete units of information can be artfully manipulated into
alternate configurations that may represent past or future states. Of course the
computing metaphors alone are not enough; they must be balanced by, among other
things, an appreciation of the imponderable and distinctly human qualities that
contribute to conjectural knowledge. But formalized and integrated into a curriculum,
the various suggestions outlined here have the potential to give conjecture a new
lease on life and incumbent editorial practices — much too conservative for a new
generation of textual critics — a run for their money.
What do I mean by conjecture, then? Giovanni Manetti's pithy definition, "inference to
the imperceptible," is good insofar as it captures the cognitive leap ("inference")
from the known to the unknown. "Cognitive," however, is a misleading word choice. The
inferential leap may be either biological or artificial (with "artificial" understood
as "computational," "mechanical," or "automated").
Pictures differ from other sign systems, such as writing, by being continuities in which every mark is interdependent, rather than operating through a combination of independent markers like the alphabet. (In computational terminology, they are "analog" rather than "digital" representations.)
By "allographic inference," then, I mean
If we set aside for a moment the original terms of my definition (
This framework allows us to associate activities, behaviors, and practices that wouldn't otherwise be grouped together: wordplay, divination, textual transmission, and conjectural reconstruction, for example. It also allows us, as we shall see, to perceive a number of underlying similarities among a set of disciplines that for most of the twentieth century didn't take much interest in one another, namely textual criticism, historical linguistics, and evolutionary biology.
Historically, we have referred to the source of a message about the future as an
"oracle" or "sibyl" or "seer"; within the contexts specified in this essay, we can
alternately refer to that agent, whether human or mechanical, as a "computer."
From this standpoint, divination is not "knowledge in advance of fact" so much as
knowledge that is a (computational) permutation of fact
"Futurese," a projection of American English in the year 3000 AD, proposes that the
word "build" will be pronounced /bIl/ in some American dialects within a century or
more as a consequence of consonant cluster simplification
emendations
of textual scholars; the doublets of Lewis Carroll; the
molecular reconstructions of geneticists; the projections of conlangers (creators of
imaginary languages); the transformation series of historical linguists, textual
scholars, and evolutionary biologists.
These devices and systems are unified by their semiotic and, in some cases, cognitive
properties and behaviors: a pun resembles an editorial reconstruction resembles a
speech error resembles a genetic mutation, and so forth. Each of these transitive
relations is explored in the course of this essay. Consider, as a preliminary
example, the analogy between textual and genetic variation. It is the allographic
equivalence between the two — the fact that both are digitally encoded — that makes
possible a strong theory of translation, allowing one system to be encoded into the
other and, once encoded, to continue to change and evolve in its new state.
Brazilian-born artist Eduardo Kac, who coined the term
The key element of the work is an "artist's gene," a synthetic gene that was created by Kac by translating a sentence from the biblical book of Genesis into Morse Code, and converting the Morse Code into DNA base pairs according to a conversion principle specially developed by the artist for this work. The sentence reads: "Let man have dominion over the fish of the sea, and over the fowl of the air, and over every living thing that moves upon the earth." It was chosen for what it implies about the dubious notion of divinely sanctioned humanity's supremacy over nature. The Genesis gene was incorporated into bacteria, which were shown in the gallery. Participants on the Web could turn on an ultraviolet light in the gallery, causing real, biological mutations in the bacteria. This changed the biblical sentence in the bacteria. The ability to change the sentence is a symbolic gesture: it means that we do not accept its meaning in the form we inherited it, and that new meanings emerge as we seek to change it.
Kac’s elaborate game of code-switching is enabled by what Matthew Kirschenbaum calls
"formal materiality," the condition whereby a system is able to propagate the illusion
(or call it a working model) of immaterial behavior.
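The chain of translations described above (sentence to Morse code, Morse code to DNA bases, mutation, and back again) can be sketched in a few lines. The passage quoted here says only that Kac used "a conversion principle specially developed by the artist," so the table below (dash to T, dot to C, letter gap to G, word gap to A) should be read as an assumption made for illustration, and the function names are hypothetical.

```python
# International Morse code for the 26 letters of the Latin alphabet.
MORSE = {
    "a": ".-", "b": "-...", "c": "-.-.", "d": "-..", "e": ".", "f": "..-.",
    "g": "--.", "h": "....", "i": "..", "j": ".---", "k": "-.-", "l": ".-..",
    "m": "--", "n": "-.", "o": "---", "p": ".--.", "q": "--.-", "r": ".-.",
    "s": "...", "t": "-", "u": "..-", "v": "...-", "w": ".--", "x": "-..-",
    "y": "-.--", "z": "--..",
}
FROM_MORSE = {code: letter for letter, code in MORSE.items()}

# ASSUMED conversion table, for illustration only.
TO_BASE = {"-": "T", ".": "C"}
FROM_BASE = {"T": "-", "C": "."}
LETTER_GAP = "G"
WORD_GAP = "A"


def text_to_dna(text):
    """Encode a sentence as a string of bases, ignoring punctuation."""
    words = []
    for word in text.lower().split():
        letters = [ch for ch in word if ch in MORSE]
        if not letters:
            continue
        encoded = ["".join(TO_BASE[s] for s in MORSE[ch]) for ch in letters]
        words.append(LETTER_GAP.join(encoded))
    return WORD_GAP.join(words)


def dna_to_text(dna):
    """Decode a string of bases; unreadable letters become '?'."""
    words = []
    for word in dna.split(WORD_GAP):
        letters = []
        for chunk in word.split(LETTER_GAP):
            code = "".join(FROM_BASE[base] for base in chunk)
            letters.append(FROM_MORSE.get(code, "?"))
        words.append("".join(letters))
    return " ".join(words)


def point_mutation(dna, index, base):
    """Replace the base at one position, leaving the rest untouched."""
    return dna[:index] + base + dna[index + 1:]


gene = text_to_dna("man")
mutated = point_mutation(gene, 0, "C")
print(gene, "->", dna_to_text(gene))
print(mutated, "->", dna_to_text(mutated))
```

Because both the sentence and the gene are sequences of discrete symbols, a single substituted base is enough to yield a different, still decodable, string of letters.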
I first became sensitive to the convergence of conjecture and wordplay while studying
Shakespeare as a graduate student. The ludic language of Shakespeare's fools,
soothsayers, and madmen seemed to me to uncannily resemble the language of
Shakespeare's editors as they juggled and transposed letters in the margins of the
page, trying to discover proximate words that shadow those that have actually
descended to us in the hopes of recovering an authorial text. The pleasure George Ian
Duthie, a postwar editor of Shakespeare, shows in permutating variants — juxtaposing
and repeating them, taking a punster's delight in the homophony of stockt, struckt,
and struck; hare and hart; nough and nought
"near echolalia" on the heath in the storm scene, the
"strange, homeless babble" that
presses up from within Tom's lists, in their jamming up and disjunctions of sense, their isolation of bits of language[:] . . . toad/tod-pole, salads/swallows, wall-newt/water.
Consider, too, the soothsayer Philarmonus, solicited by Posthumus near the close of
When as a lion's whelp shall, to himself unknown, without seeking find, and be embrac'd by a piece of tender air; and when from a stately cedar shall be lopp'd branches which, being dead many years, shall after revive, be jointed to the old stock, and freshly grow; then shall Posthumus end his miseries, Britain be fortunate and flourish in peace and plenty
[To CYMBELINE] The piece of tender air, thy virtuous daughter,
Which we call mollis aer, and mollis aer
We term it mulier; which mulier I divine
Is this most constant wife, who even now
Answering the letter of the oracle,
Unknown to you, unsought, were clipp'd about
With this most tender air
love of sound.
"Every fool can play upon the word," quips
The "capaciousness of ear" that Kenneth Gross attributes to Hamlet applies in equal measure to Shakespeare’s soothsayers, clowns, fools, tricksters, madmen and, I would add, editors
"great stage of fools" (4.6.187)
Let me try to connect the figure of the fool, like that of the prophet, more directly
to the motifs of error, wordplay, and conjecture. A self-styled "corrupter of words"
(3.1.34)
(3.1.10-15)
Viola: Nay, that's certain; they that dally nicely with words may quickly make them wanton
Feste: I would, therefore, my sister had had no name, sir.
Viola: Why, man?
Feste: Why, sir, her name's a word; and to dally with that word might make my sister wanton.
Considered within the context of these lines, "corrupter" denotes someone who
has the power to pervert the world by deforming ("dallying with") the language
used to signify it. But Shakespeare mobilizes other meanings as well. In the domain
of textual scholarship, the term
"I did impeticos thy gratillity" (2.3.27)
words here are at liberty and have
little meaning apart from that which editors, at the cost of great labor,
finally manage to impose upon them
The "great labor" to which Michel Grivelet alludes involves exploiting
the digitally encoded phonological structure of language to conjecturally recover
authentic words from corrupt ones: corruptions
— easily confused with its fictional diachronic vectors in
synchrony.
Let me pause here to gather together a number of threads: my central tenet is that
the linguistic procedures of Shakespeare's editors often seem to mimic those of
Shakespeare's most inveterate "computers" of language. To emend a text (to
insert, delete, and rearrange letters, phonemes, or sequences of words) is to
ritualistically invoke a divinatory tradition of wordplay that dates back to at least
the third millennium BC. For a benign alien power observing human textual rites from
afar, the linguistic manipulations of a Duthie or a Housman would, I imagine, be for
all intents and purposes indistinguishable from those of a Philarmonus or Feste or
Tom O’Bedlam or Chaldean or, for that matter, a computer programmer using
string-rewriting rules to transform one word into another. What all these examples
have in common are digital units manipulated by computers (programmers, soothsayers,
madmen, fools, punsters, poets, scribes, editors) using a small set of combinatorial
procedures (insertion, deletion, transposition, substitution, relocation) for
conjectural, predictive, or divinatory ends. Duthie and Tom O'Bedlam, computer
programmers and Kabbalists, Philarmonus and historical linguists: they all compute
bits of language.
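The small set of combinatorial procedures named above can be made concrete as string-rewriting functions. What follows is a minimal sketch, assuming single-character operations over the lowercase Latin alphabet; the function names are illustrative only, and the sample words are the variants Duthie is described as juggling (struck/struckt, hare/hart, nough/nought).

```python
import string

ALPHABET = string.ascii_lowercase


def insertions(word):
    """Every string obtained by inserting one letter anywhere in the word."""
    return {word[:i] + c + word[i:]
            for i in range(len(word) + 1) for c in ALPHABET}


def deletions(word):
    """Every string obtained by deleting one letter."""
    return {word[:i] + word[i + 1:] for i in range(len(word))}


def substitutions(word):
    """Every string obtained by replacing one letter with a different one."""
    return {word[:i] + c + word[i + 1:]
            for i in range(len(word)) for c in ALPHABET if c != word[i]}


def transpositions(word):
    """Every string obtained by swapping two adjacent letters."""
    swapped = {word[:i] + word[i + 1] + word[i] + word[i + 2:]
               for i in range(len(word) - 1)}
    swapped.discard(word)  # swapping identical neighbours changes nothing
    return swapped


def relocations(word):
    """Every string obtained by moving one letter to another position."""
    moved = set()
    for i in range(len(word)):
        rest = word[:i] + word[i + 1:]
        for j in range(len(rest) + 1):
            moved.add(rest[:j] + word[i] + rest[j:])
    moved.discard(word)
    return moved


def variants(word):
    """All strings one operation away from the word."""
    return (insertions(word) | deletions(word) | substitutions(word)
            | transpositions(word) | relocations(word))


print("hart" in substitutions("hare"))      # substitution: hare / hart
print("struckt" in insertions("struck"))    # insertion: struck / struckt
print("nough" in deletions("nought"))       # deletion: nought / nough
```

The functions are indifferent to who applies them or why: the same one-step neighbourhood of a word serves a punster, a compositor's slip, or an editor's emendation. Judging which neighbour is the right one is exactly what the operations themselves cannot do.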
As an editorial method, such alphabetic computation is dismissed by Housman as a
wanton orthographic game. Finding a kindred spirit in a nineteenth-century German
predecessor, he approvingly quotes the following:
Some people, if they see that anything in an ancient text wants correcting,
immediately betake themselves to the art of palaeography . . . and try one dodge
after another, as if it were a game, until they hit upon something which they think
they can substitute for the corruption; as if forsooth truth were generally
discovered by shots of that sort, or as if emendation could take its rise from
anything but a careful consideration of the thought. (Qtd. in Housman 1921.)
"monstrous Faults" disfiguring the masterpiece, alongside Bentley's proposed emendations
"Words of a like or near Sound in Pronunciation" were substituted throughout for what Milton intended
"is Judicious" is emended to "Unlibidinous"; "Nectarous" to "Icarus"; "Subtle Art" to "Sooty Chark"; "Wound" to "Stound"; "Angelic" to "Adamic."
The "near echolalia" that characterizes Tom O'Bedlam's speech in the storm scene also characterizes Bentley's editorial method.
To adjudicate among variants, Bentley finds evidence and inspiration in diverse
knowledge domains. Milton's
in the opening lines of
"sacred top" through an appeal to literary tradition, geology, meteorology, and logic. According to Bentley,
"secret top" finds little precedent in the works of antiquity, while
"sacred top" has parallels in the Bible, Spenser, and various classical authors
Whatever the shortcomings of Bentley's appeal to precedent, it has the effect of
helping systemize and guide his wordplay. His paronomastic methods are constrained by
other factors as well: the double articulation of natural language, composed of a
first level of meaningful units (called morphemes) and a second level of meaningless
units (called phonemes), imposes order and rules on the process. There are in English
a total of twenty-six letters of the alphabet capable of representing some 40-45
separate phonemes, and phonotactic and semantic constraints on how those letters and
sounds may be combined. The sequence "ptk" in English, for example, is
unpronounceable, and the sequence "paf," while pronounceable, is at the time of
this writing meaningless, except perhaps as an acronym. Bentley's wordplay is thus
bounded by the formal and historically contingent properties of the natural language
with which he works, properties that theoretically prevent his substitutions and
transpositions from degenerating into mere gibberish.
The importance of digital or allographic units to computation, as I have defined it
here, is underscored by Gross's observation that what Tom O'Bedlam manipulates are
(my emphasis). Likewise, Elizabeth Sewell
repeatedly emphasizes that nonsense poetry and wordplay require the divisibility of their material into
"ones, units from which a universe can be built"
must never be more than the sum of its parts, and must never
fuse into some all-embracing whole which cannot be broken down again into the
original ones. It must try to create with words a universe that consists of
(my emphasis)
The similarities between scribal and poetic computation (understood in the sense in
which I have defined it) are brought home by even a casual look at the editorial
apparatus of any critical edition of Shakespeare, where editors have traditionally
attempted to ascertain whether a particular verbal crux is a poetic device in need of
explication or a misprint in need of emendation. The Shakespearean text is one in
which an error can have all the color and comedic effect of an intentional
malapropism, and an intentional malapropism all the ambiguity and perplexity of an
inadvertent error. Is Cleopatra's "knot intrinsicate" a deliberate amalgam of
"intricate" and "intrinsic," one that deploys half a dozen meanings, or, by contrast,
an accidental blend introduced into the text by a distracted compositor? Is Hamlet's
"sallied flesh" an alternative spelling of "solid," one that also plays on "sullied,"
or, more mundanely, a typesetting mistake? Does the Duke's "headstrong weedes" in
headstrong steeds?
"twists and turns" of a great poet's mind
That Mahood feels compelled to devote several pages to the problem of distinguishing
between errors and wordplay in Shakespeare's text is worth lingering over. It
suggests that there is something fundamentally poetic about the changes — or
To clarify the point, consider the phenomenon of metathesis, the simple transposition of elements. Metathesis is a law of sound change (historical linguistics), a cipher device (cryptography), a poetic trope (classical rhetoric), a scribal error (textual criticism), and a speech error (psycholinguistics). There are entries for metathesis in both
Cognitive evidence supports this view. Clinical experiments, slips of the tongue,
slips of the pen, and the speech and writing disorders of aphasics (those whose
ability to produce and process language has been severely impaired by a brain injury)
all provide insight into the structure and organization of the mental lexicon.
Current research models that lexicon as a web whose connections are both acoustic and
semantic: like-sounding and like-meaning words are either stored or linked together
in the brain
"the problems of aphasic patients are simply an exaggeration of the difficulties
which normal speakers may experience"
"There is a sense in which a great poet or punster is a human being able to induce
and select from a Wernicke aphasia," writes George Steiner in
Steiner’s insight repays further attention. Poets, like artists in general, often creatively stress-test the system or medium with which they work, probing its edges, overloading it, and pushing it beyond normal operational capacity. Discovering where language breaks down or deviates from regular use is the business of both poet and neuroscientist, providing a means of gaining insight into the mechanisms of language perception, processing, and production. But whereas the neuroscientist gathers data from aphasic patients in a clinical setting, the poet becomes, as it were, his own research subject, artificially manipulating the cognitive networks of meaning and sound that form his dataset. As Steiner notes, when intentionally created, ordered, and embedded in larger textual or linguistic structures, such distortions become poetry.
The mantic wordplay in Shakespeare's
"word dice," which in the context of Mayan daykeeping makes links among wordplay, divination, and games explicit
We can also demonstrate the formalism of divinatory practice by looking at some of
the earliest inscribed prophecies of the ancient Near East, which make extensive use
of the same conditional blocks found in modern computer programs to control the flow
and output of the mantic code:
If it rains (zunnu iznun) on the day (of the feast) of the god of the city, [then] the god will be (angry) (zeni) with the city.
If the bile bladder is inverted (nahsat), [then] it is worrying (nahdat).
If the bile bladder is encompassed (kussa) by the fat, [then] it will be cold (kussu).
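The three omens just quoted can be written out as the kind of conditional table a program would consult. This is only a sketch: the English glosses and transliterated key words are taken from the passage above, while the data layout, the `divine` lookup, and the use of a shared onset as a rough stand-in for paronomastic association are assumptions introduced for illustration.

```python
# Each rule pairs a protasis ("if") with an apodosis ("then"), together with
# the transliterated key word that the passage gives for each clause.
OMENS = [
    {"if": "it rains on the feast day of the god of the city",
     "if_word": "zunnu",
     "then": "the god will be angry with the city",
     "then_word": "zeni"},
    {"if": "the bile bladder is inverted",
     "if_word": "nahsat",
     "then": "it is worrying",
     "then_word": "nahdat"},
    {"if": "the bile bladder is encompassed by the fat",
     "if_word": "kussa",
     "then": "it will be cold",
     "then_word": "kussu"},
]


def divine(observation):
    """Return the prediction attached to an observed sign, or None."""
    for omen in OMENS:
        if omen["if"] == observation:
            return omen["then"]
    return None


def shared_onset(a, b):
    """Longest run of identical letters at the start of both words."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return "".join(prefix)


for omen in OMENS:
    onset = shared_onset(omen["if_word"], omen["then_word"])
    print(omen["if_word"], "/", omen["then_word"], "share", repr(onset))
```

In each pair the prediction echoes the sound of the portent, which suggests that the "then" clause is reached by manipulating the wording of the "if" clause rather than by any causal reasoning about rain or bile bladders.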
The divinatory apparatus consists of an "if" clause (in grammatical terms the
protasis) and a "then" clause (the apodosis).
all noise and no laws
formed by the possibility of a chain of
associations between elements of the protasis and elements of the
apodosis,
Only exceptionally are we able to
detect any logical relationship between portent and prediction, although
often we find paronomastic associations and secondary computations based on
changes in directions of numbers. In many cases, subconscious association
seems to have been at work, provoked by certain words whose specific
connotations imparted to them a favourable or an unfavourable character,
which in turn determined the general nature of the prediction.
The divinatory mechanism may be construed as a program for generating textual
variants: the baru, the divinatory priest or technician,
inputs the protasis-omen into the system. A finite set of legal operations
(substitution, insertion, deletion, relocation, transposition) is performed on its
linguistic counters, which are then output in their new configuration to the
apodosis-oracle.
As may be seen, all the protases are
constructed according to a structural principle of binary opposition between
the |Threshold| and |the middle| of the Door of the Palace, between |right|
and |left|, between |slit| and |lengthwise slit|. It is thus the system
itself, understood in an ante litteram structural sense, which has prime
importance. The cases which have been observed in the past are no longer the
only ones registered, but rather all possible, conceivable cases are laid
out according to a system based on oppositions and abstract rules.
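The point that the system lays out "all possible, conceivable cases" rather than only those observed can be illustrated by enumerating every combination of the three oppositions named in the passage. This is a hedged sketch: the dictionary keys (`location`, `side`, `mark`) and the example of a single observed case are hypothetical labels, not terms from the source.

```python
from itertools import product

# The three binary oppositions named in the quoted passage.
OPPOSITIONS = {
    "location": ("threshold", "middle"),
    "side": ("right", "left"),
    "mark": ("slit", "lengthwise slit"),
}


def all_cases(oppositions):
    """Every combination of one term from each opposition."""
    names = list(oppositions)
    combos = product(*(oppositions[name] for name in names))
    return [dict(zip(names, combo)) for combo in combos]


cases = all_cases(OPPOSITIONS)

# A hypothetical record of what has actually been observed so far.
observed = [{"location": "threshold", "side": "right", "mark": "slit"}]
unobserved = [case for case in cases if case not in observed]

print(len(cases), "conceivable cases;", len(unobserved), "never observed")
```

Three oppositions generate eight protases, seven of which need never have been witnessed, so the omen list here behaves as a grammar of possible signs and not as a record of past events.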
"machine" or "artificial" language to "natural" or "human" language, in this way
distinguishing Pascal or C++ or Java from
Italian or English or Chinese. It is this more generic sense of the term that I'm
drawing on here. The fact that so-called machine or artificial languages in the
broader sense are almost always written in English adds a further layer of
cultural, technological, and linguistic complexity to these various
discriminations.
The allographic operations used to generate a future text in Mesopotamian divination
are formally consonant with those that produce variant readings in an open print or
manuscript tradition. Mesopotamian divination baru, which consist of a writing tablet and stylus.
The scholarship on Mesopotamian omen sciences published within the last two decades
suggests that divinatory systems in the ancient Near East exerted profound influence
not only on the content of the literary genres of the ancient world, including the
written record of the Israelites, but also on the interpretive practices of their
exegetes.
Sometimes these [Mesopotamian prophecies] are merely playful jeux de mots; but, just as commonly, there is a concern to guard esoteric knowledge. Among the techniques used are permutations of syllabic arrangements with obscure and symbolic puns, secret and obscure readings of signs, and numerological ciphers. The continuity and similarity of these cuneiform cryptographic techniques with similar procedures in biblical sources once again emphasizes the variegated and well-established tradition of mantological exegesis in the ancient Near East, a tradition which found ancient Israel a productive and innovative tradent [i.e., transmitter of the received tradition].
Fishbane’s work demonstrates through exhaustive research that mantic practice of the ancient Near East is a crucially important locus for understanding early textual analysis and emendation of the Hebrew Bible. Our working hypothesis must therefore be that the connections between these hermeneutic traditions are causal as well as formal. Historically, then, and contrary to popular belief, prediction has been as much a part of the knowledge work of the humanities as of the sciences — as much our disciplinary inheritance as theirs. It is time that we own that legacy rather than disavow it. We have for some time now outsourced conjecture to the Natural, Physical, and Computational Sciences. Paradoxically, then, it is only through interdisciplinarity that we can possibly hope to reclaim disciplinarity.
This section looks at the shared "arboreal habits,"
to use Darwin's term
"making creatures from scratch," which is discussed in
"it might be possible to re-create the elusive ancestor of all human life on Earth, a hypothetical organism known as LUCA, or the last universal common ancestor," since
"the remnants of LUCA should be scattered across the genomes of all living things." He goes so far as to entertain the possibility of bringing LUCA
"back to life." It is for this reason that the material difference between texts and bodies needs to be continually acknowledged. Otherwise we risk glib pronouncements that fail to take into account what is at stake in asserting only their likeness. For discussion of the origins and limits of the analogy between genetic and textual (or informational) code, see
In an article entitled
"trees of history," glossed as
"branching diagrams of genealogical descent and change," across the disciplines of evolutionary biology, historical linguistics, and textual criticism. Figures 1, 2, and 3 after O’Hara reproduce three such trees
Figure 1 is Darwin’s well-known tree of life from
"trees of history" stems from his work in cladistics, a system of classification based on phylogenetic relationships in evolutionary groups of organisms. Over the last several decades, cladistics has become a highly sophisticated computer-assisted methodology, one that has outpaced stemmatics and its linguistic counterpart technically if not conceptually. It is unsurprising, then, that in addition to satisfying a general theoretical curiosity about the coincidence of trees of history in three seemingly disparate fields, O’Hara’s research is also an outreach project. He has, for example, collaborated with Peter Robinson to produce cladistic analyses of Chaucerian manuscripts
The evidence suggesting a direct historical relationship between stemmatics in
textual criticism and the comparative method in linguistics is relatively hard to
come by in the secondary literature, and what little of it there is, is
circumstantial. The English-language textual scholarship has almost nothing to say on
the subject,
Any account of the origins of the comparative method in historical linguistics must take into account the achievements of A. Schleicher (1821-1868), the founding father of the
Comparative method: Comparisons of variants between two or more related languages are undertaken with the purpose of reconstructing their genealogy and proto-language.
Stemmatics: Comparisons of variants between two or more related texts are undertaken with the purpose of reconstructing their genealogy and archetype.
One could formulate a parallel definition for phylogenetics: comparisons of characters between two or more related taxa are undertaken with the purpose of reconstructing their genealogy and archetype. Until recently, however, many scholars would have inserted a full stop after "genealogy." Less than two decades ago, for example, H. Don Cameron insisted that "For the zoologist the reconstructed ancestor is a fiction. It is a convenient way of presenting the organized information about the relationships of the real animals in question. Nobody tries to reconstruct a living, breathing theocodont or the protodipteron." He contrasted these attitudes and beliefs with those in textual criticism, where the reconstructed text is "the real article, in a literal sense the text that Euripides wrote. Reconstruction is a serious business, and the only point of studying manuscripts at all." On the contrary, reconstruction has been a desideratum of evolutionary biology from the start: one encounters a conjectural ethos in Darwin's Descent of Man and in the writings of, for example, Thomas Huxley, the eminent Victorian biologist and staunch defender of Darwin's model of evolution. The ethos has always been there, the methodology has not; or rather, the methodology has been hobbled by the lack of a clear unit of comparison and conjecture. But whereas conjectural criticism in recent decades has experienced a downtick in popularity in textual studies, the reverse is true in evolutionary and synthetic biology, where allographic methods have taken hold. With DNA and protein sequences (digital, discrete, and abstract) now firmly established as the clear units of conjectural analysis in evolutionary biology, the reconstructive paradigm has begun to flourish.
In order to better understand how an unattested text, language, or genome is inferred
from attested forms, I want to briefly look at two competing classes of stemmatic
algorithms: maximum parsimony and clustering. Broadly speaking, clustering or
distance methods compute the overall similarities between manuscript readings,
without regard to whether the similarities are coincidental or inherited, whereas maximum parsimony methods seek the shortest
tree containing the least number of change events still capable of
accommodating all the variants. The difference is often expressed in evolutionary
biology as that between a phenetic (clustering) versus a cladistic (parsimonious)
approach. Here is how Arthur Lee distinguishes the two: "Cladistic analysis is sharply differentiated from cluster analysis by that which it measures. Cluster analysis groups the objects being analysed or classified by how closely they resemble each other in the sum of their variations, using statistical distance measures. Cladistic analysis, on the other hand, analyses the objects in terms of the evolutionary descent of their individual variants, choosing the evolutionary tree which requires the smallest number of changes in the states of all the variants."
In cladistic analyses, the evidentiary gold standard excludes most of the textual data from consideration: only shared derived readings as opposed to shared ancestral readings are believed to have probative value. An ancestral reading, or retention, is any character string inherited without change from an ancestor. A derived reading, by contrast, is a modification of a reading inherited from the ancestor, often motivated rather than arbitrary from a paleographic, metrical, phonological, manual, technological, aesthetic, grammatical, cognitive, or some other perspective (for example, a scribe who, in the line "But tel me, why hidstow with sorwe" in the Wife of Bath's Prologue, alters "why" to "wherfor" to fill out the meter). When such readings are contained in two or more manuscripts, they are regarded as potentially diagnostic. Having made it through this initial round of scrutiny, they are then subject to a second round of inspections, which will result in exclusions of variants that have crept into the text via routes not deemed genealogically informative. The transposition of the
Phenetic or clustering analyses involve the use of a distance metric to determine degrees of similarity among manuscripts. The Levenshtein Distance algorithm, for example, named after the Russian scientist who created it, tabulates the minimum number of primitive operations (insertions, deletions, and substitutions) required to transform one variant into another. We can illustrate its application with one of the most notorious variants in the Shakespearean canon. In Hamlet's first soliloquy (1.2.129-30), the First Folio reads "solid flesh," while the second quarto has "sallied flesh." The Levenshtein or edit distance between them is determined as follows: let "solid" equal the source reading and "sallied" the target reading. The first task is to align the words so as to maximize the number of matches between letters:
s o l - i - d
s a l l i e d
Sequence alignment, as the method is called, is a form of collation. Identifying all the pairings helps us compute the minimum number of procedures needed to change one string into another. A mismatch between letters indicates that a substitution operation is called for, while a dash signals that a deletion or insertion is necessary. Three steps get us from source to target: a substitution operation transforms the "o" of "solid" into the "a" of "sallied"; an insertion operation supplies the "l" of "sallied"; and another insertion operation gives us the penultimate "e" in our target variant. The edit distance between the two readings is therefore three.
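The procedure just described can be sketched in a few lines of Python. This is a minimal illustration of the standard dynamic-programming formulation of edit distance, not a reconstruction of any particular collation tool; the function name `edit_script` and the wording of the transcript entries are assumptions made for the example. Note that more than one minimal transcript can exist for a given pair of readings, so the sequence of operations returned is one optimal solution among possibly several.

```python
def edit_script(source: str, target: str) -> tuple[int, list[str]]:
    """Return the Levenshtein distance and one minimal edit transcript."""
    m, n = len(source), len(target)
    # dist[i][j] = fewest edits needed to turn source[:i] into target[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i  # delete every letter of the prefix
    for j in range(n + 1):
        dist[0][j] = j  # insert every letter of the prefix
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # match or substitution
            )
    # Walk back from the bottom-right cell to recover one transcript.
    ops, i, j = [], m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and source[i - 1] == target[j - 1]:
            ops.append(f"match {source[i - 1]}")
            i, j = i - 1, j - 1
        elif j > 0 and dist[i][j] == dist[i][j - 1] + 1:
            ops.append(f"insert {target[j - 1]}")
            j -= 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(f"substitute {source[i - 1]}->{target[j - 1]}")
            i, j = i - 1, j - 1
        else:
            ops.append(f"delete {source[i - 1]}")
            i -= 1
    ops.reverse()
    return dist[m][n], ops
```

Calling `edit_script("solid", "sallied")` returns the distance together with a list of operations; the entries that are not matches are the edits a collator would record as variation between the two witnesses.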
Although heterogeneous in their approach, both clustering and maximum parsimony are
In biology, textual criticism, and historical linguistics, a transformation series
(TS) is an unbroken evolutionary sequence of character states. The changes that a
word or molecular sequence undergoes are cumulative: state A gives rise to state B,
which in turn generates C, and so on. The assumption is that each intermediary node
serves as a bridge between predecessor and successor nodes; state C is a modified
version of B, while D is a derivation of C. The changes build logically on one
another and follow in orderly succession. Consider, for example, a hypothetical (and
admittedly idealized) textual series: head/heal/teal/tell/tall/tail. "If we can discover such a transformation series," writes H. Don Cameron, "we have strong evidence for the relationship of the manuscripts" in which the readings occur.
Those readings might relate to one another in the vertical direction, in which case
the task of the textual critic is to determine their order and polarity: Conversely, they might relate to one another as leaf nodes in the
horizontal direction,
Both distance and parsimony methods shed light on transformation series. The degree
of difference between two leaf nodes, for example, reflects their degree of
relationship: in our reconstruction, match.
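To make the parsimony side of this concrete, here is a minimal sketch of Fitch's small-parsimony procedure, a standard technique from phylogenetics that I am substituting for illustration; it is not drawn from Cameron or from any stemmatic software. It assumes a fixed binary tree and equal-length readings, and the tree shape and the choice of four leaf words from the series above are hypothetical. For each letter position, the procedure keeps the letters shared by both children if there are any, and otherwise pools the children's letters and counts one change.

```python
from itertools import product

def fitch(tree):
    """Return (candidate letter sets per position, change count) for the root.

    `tree` is either a word (a leaf) or a pair (left_subtree, right_subtree).
    All leaf words are assumed to have the same length.
    """
    if isinstance(tree, str):
        return [{ch} for ch in tree], 0
    left_sets, left_changes = fitch(tree[0])
    right_sets, right_changes = fitch(tree[1])
    sets, changes = [], left_changes + right_changes
    for a, b in zip(left_sets, right_sets):
        shared = a & b
        if shared:
            sets.append(shared)   # children agree: no change needed here
        else:
            sets.append(a | b)    # children disagree: one change somewhere
            changes += 1
    return sets, changes

def candidate_ancestors(sets):
    """Spell out every word compatible with the per-position letter sets."""
    return sorted("".join(letters) for letters in product(*sets))

# Hypothetical tree: (heal, teal) form one branch, (tell, tall) the other.
TREE = (("heal", "teal"), ("tell", "tall"))
```

On this toy tree, `fitch(TREE)` yields a set of equally parsimonious candidates for the unattested root reading rather than a single answer, which is a fair picture of the conjectural situation: the method narrows the field, and the critic still has to choose.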
Now an admission: the source of the head/heal/teal/tell/tall/tail series is the
nineteenth-century British author and mathematician Charles Lutwidge Dodgson, better
known by the pseudonym Lewis Carroll, who uses it to illustrate not textual
transmission or biological evolution, but rather a word game of his own invention,
which he called "doublets." Also known as "word ladders," doublets is played by first designating a start word and end word. The objective is
to progressively transform one into the other, creating legitimate intermediary words
along the way. The player who can accomplish this in the fewest number of steps
wins.
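The rules of doublets amount to a shortest-path search, which the following sketch expresses as a breadth-first search over a word list. The function name `doublet` and the tiny lexicon are assumptions made for the example; Carroll's requirement that every intermediary be a legitimate word is modeled simply as membership in that list, and only one-letter substitutions are allowed.

```python
from collections import deque

def doublet(start, end, lexicon):
    """Return a shortest chain of one-letter substitutions, or None."""
    words = sorted(set(lexicon) | {start, end})
    queue = deque([[start]])
    seen = {start}
    while queue:
        chain = queue.popleft()
        last = chain[-1]
        if last == end:
            return chain
        for word in words:
            if word in seen or len(word) != len(last):
                continue
            # Adjacent words differ in exactly one letter position.
            if sum(a != b for a, b in zip(word, last)) == 1:
                seen.add(word)
                queue.append(chain + [word])
    return None

# Hypothetical lexicon: the intermediaries of the series plus two distractors.
LEXICON = ["heal", "teal", "tell", "tall", "held", "toll"]
```

Because breadth-first search explores all chains of length n before any chain of length n + 1, the first chain that reaches the end word is guaranteed to be among the shortest, which is the winning condition of the game.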
The congruity between doublets and biological evolution is the subject of an ingenious essay by scientist David Searls, which turns on the opening stanza of Carroll's "Jabberwocky":
"Twas brillig, and the slithy toves / Did gyre and gimble in the wabe; / All mimsy were the borogoves, / And the mome raths outgrabe."
Humpty Dumpty's gloss in Through the Looking-Glass follows: "there are plenty of hard words there. BRILLIG means four o'clock in the afternoon — the time when you begin BROILING things for dinner."
"Upon hearing this explanation," writes Searls, "computational biologists will of course feel an irresistible urge to do something like the following:"
That "something" is sequence alignment, a preliminary step to computing the edit distance between the two strings and creating an edit transcript specifying the semiotic operations required to mutate one into the other. As Searls puts it, "the notion of the most parsimonious interconversions among strings of letters is . . . at the basis of many string-matching and phylogenetic reconstruction algorithms used in computational biology."
Interestingly, Carroll himself alludes to the resemblance between doublets and
Darwinian evolution in an example that anticipates by more than half a century the
linguistic and scriptural foundations of modern molecular genetics. He satirically
bridges the evolutionary gap between apes and humans with a six-step transformation
series:
Extending Carroll's experiment to sets of words, Searls reconstructs hypothetical ancestors for eight leaf nodes:
Here the words at the bottom level are used to infer ancestors, but they might also
be used to project descendants. They support conjectures that
While Carroll initially conceived of doublets as a game of substitutions, involving
the exchange of one letter for another, he later also admitted transpositions, as in
the third step of the following series:
What should we make of the fact that scientific theories of transmission and
reconstruction can be so effectively illustrated with a parlor game? Recall that a
correspondence between conjecture and wordplay was established earlier in this essay.
There we discovered that a textual critic endeavoring to recover a prior text and a
diviner attempting to decipher an oracle by signs were often united in their reliance
on letter substitutions, puns, anagrams, and other permutational devices. Noting that
the procedures underlying such word games were amenable to algorithmic expression, we
labeled both the textual critic and the prophet who perform them "computers." At
the same time, we maintained that they were computers of a special sort, proficient
in the semiotic processing of strings rather than the mathematical reckoning of
numbers. Into their ranks we can also admit evolutionary biologists and historical
linguists, for whom the grammatical operability of signs is no less germane.
In a post-industrial, Westernized society, an individual with the proper education
who displays an aptitude for string manipulation might find gainful employment as an
evolutionary biologist or historical linguist or cryptographer. In another milieu,
that same individual might instead be inducted into the ways of the poet or prophet.
The Mayan priest who practices calendrical divination through puns, the Mesopotamian
baru who deciphers oracles through wordplay, and the Shakespearean soothsayer who
interprets auguries by means of substitutive sounds and false etymologies find their
21st-century scientific complement in the figure of the computational biologist.
Searls remarks that Carroll's doublet puzzles reveal a turn of mind well suited to
methodologies used in modern computational biology.
In this essay I have tried to advance the proposition that conjectural knowledge in the humanities is a manifestation of the inalienable human need to imagine what might have been or could be or almost was. This mandate of the subjunctive is beautifully conveyed by George Steiner:
"It is the constructive powers of language to conceptualize the world which have been crucial to man's survival in the face of ineluctable biological constraints, this is to say in the face of death. It is the miraculous — I do not retract the term — capacity of grammars to generate counter-factuals, if-propositions and, above all, future tenses, which have empowered our species to hope, to reach far beyond the extinction of the individual. We endure, we endure creatively due to our imperative ability to say "No" to reality, to build fictions of alterity, of dreamt or willed or awaited "otherness" for our consciousness to inhabit."
This "imperative ability" is more than a psychological or linguistic impulse; more, even, than a creative or ethical force; it is also a life instinct.
In the course of this work I have also tried to show that algorithms on strings —
vital to conjecture — are as computationally sound as algorithms on numbers. We are
so accustomed to thinking that punsters and textual critics somewhat imprecisely
reshuffle symbols, whereas mathematicians and engineers methodically calculate them,
that it will perhaps take a class of scientists who process strings rather than
numbers to convince us otherwise. We have looked at just such a group of
professionals, computational biologists, whose work on the chemical "letters" of
DNA resembles nothing so much as the parlor game of doublets or word ladders. And yet
this form of wordplay is couched in the language of computation rather than poetry,
vetted by biologists rather than poets or literary scholars, and published in
scientific journals rather than poetry rags or literary monographs. Perhaps most
significantly, the similarities between the methods of the computational biologist
and the textual scholar — like those between the soothsayer and textual scholar — are
not coincidental, but causal and historical.
One of my primary objectives has been to dispute the commonplace assumption that the
inner workings of conjecture are ineffable and opaque. R. J. Tarrant, for example,
writes that conjectural criticism advances slowly and unsteadily, proceeding at a glacial pace, and is not amenable to discussion of methods and trends; on this view, conjecture proceeds from no method and conveys no certainty.
But what about conjectural methods in humanistic fields other than textual studies
and linguistics that don't rely on systems of inheritance and variation? Take
history, for instance, where the historical imaginary is our predominant tool for
filling in the missing details of the past; or digital preservation, where the
archivist must designate a primary user community for the cultural objects she
preserves. By projecting the knowledge base of this user group into the future, the
archivist makes decisions about what kind of descriptive metadata and contextual
information to include in order to render the object intelligible to posterity.
My own hunch is that there are deep structures common to many types of conjecture and
new methods available to others that await discovery and experimentation. Our success
in identifying and exploiting them, however, will depend in no small measure on how
committed the digital humanities community is in coming years to building an
information infrastructure that supports new models of scholarly communication and
research. Patrick Juola, a computational linguist at Duquesne University, for
example, is currently developing a prototype system for exploring corpus-based
approaches to conjecture in the humanities. As Juola explains, his idea is to program
the computer to make essentially random emendations to texts and then rank them for plausibility using a set of agreed-upon criteria, the underlying assumption being that the machine is capable of devising and examining many more possible readings than even the most prolific human author, and even if 99.9% of the possible readings are flat gibberish, the one in a thousand may include interesting and provocative readings that human authors have missed (Patrick Juola, email to the author, 3 Oct. 2008).
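A rough sense of what "generate, then rank" might involve can be given in a short sketch. Nothing here describes Juola's actual prototype: the function names, the restriction to single-letter edits, and the use of a word-frequency table as the sole plausibility criterion are all assumptions chosen to keep the example small. A serious system would presumably score candidates against metrical, paleographic, and semantic criteria as well.

```python
import random
import string

def single_edits(word):
    """All strings one substitution, insertion, or deletion away from word."""
    letters = string.ascii_lowercase
    edits = set()
    for i in range(len(word) + 1):
        for ch in letters:
            edits.add(word[:i] + ch + word[i:])          # insertion
        if i < len(word):
            edits.add(word[:i] + word[i + 1:])           # deletion
            for ch in letters:
                edits.add(word[:i] + ch + word[i + 1:])  # substitution
    edits.discard(word)  # a substitution of a letter for itself is no edit
    return edits

def propose(word, k, seed=0):
    """Draw k random emendations of word (reproducible for a given seed)."""
    rng = random.Random(seed)
    return rng.sample(sorted(single_edits(word)), k)

def rank(candidates, frequency):
    """Order candidates by a toy plausibility score: attested frequency."""
    return sorted(candidates, key=lambda w: (-frequency.get(w, 0), w))

# Hypothetical frequency table standing in for a reference corpus.
FREQUENCY = {"sullied": 3, "sallies": 1}
```

Most of what `propose` returns will be gibberish, as the passage above anticipates; the ranking step exists to push the few attested or otherwise plausible strings to the top, where a human reader can weigh them.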
"I could equally compare American and British literature, or 19th and 20th century literature, or first-person and third-person novels, or any other interesting category. Essentially, what the prototype I envision will do is generate (and test superficially) random conjectures like '[Works in Category 1] use [Semantic category B] more than [Works in Category 2]'" (Juola, email to the author, 3 Oct. 2008).
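The template Juola describes lends itself to a very small sketch. The figures in `COUNTS` are invented for illustration and carry no evidential weight, and the function names are likewise assumptions; the "superficial test" is reduced here to a bare comparison of two rates, whereas a real system would need corpus-derived counts and some measure of statistical significance.

```python
import random

# Hypothetical occurrences per 10,000 words of two semantic categories.
COUNTS = {
    "American": {"nature": 42, "commerce": 30},
    "British": {"nature": 35, "commerce": 38},
}

def random_conjecture(counts, rng):
    """Fill the template with two distinct work categories and one semantic one."""
    first, second = rng.sample(sorted(counts), 2)
    semantic = rng.choice(sorted(counts[first]))
    return first, semantic, second

def check_conjecture(conjecture, counts):
    """Superficial test: does the first category use the term more often?"""
    first, semantic, second = conjecture
    return counts[first][semantic] > counts[second][semantic]

def render(conjecture):
    """Spell the conjecture out in the form of the template."""
    first, semantic, second = conjecture
    return f"[{first} works] use [{semantic}] more than [{second} works]"
```

A loop that calls `random_conjecture` repeatedly and keeps only those conjectures for which `check_conjecture` returns true would produce a list of candidate claims for a scholar to evaluate, which is the division of labor the passage envisions.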
The promise of Juola's corpus-based conjectural criticism, however, is predicated on the kind of mass digitization from which humanities researchers, as Christine Borgman emphasizes in
generally speaking, the humanities are more interpretive than data-driven, with digital humanists being one obvious exception
I would like to express my gratitude to the following friends, colleagues, and mentors for the substantive feedback they provided on earlier versions of this work: Morris Eaves, Kenneth Gross, Matthew Kirschenbaum, Julia Flanders, Alan Liu, and Katie King. Their judicious comments and ideas enriched my thinking about the project and formed the basis for my revisions. I am also indebted to the anonymous readers for DHQ whose detailed suggestions helped strengthen the essay.