“Theory in Text Encoding”

Paul Caton, Scholarly Technology Group, Brown University, Paul_Caton@brown.edu

The existing body of theoretical work in text encoding suffers from two interrelated problems: firstly, confusion over the specific nature of the work; and secondly, the absence of any truly critical theory. Theory’s purest signified, and a fundamental predicate of progress in the natural sciences, involves a representation that meets discursively bounded criteria of “truth.” According to Renear (1997) the electronic document encoding and processing community “has evolved a rich body of illuminating theory about the nature of text” (107), a claim that expands that of Renear, Durand, and Mylonas (1996) and anticipates Mylonas and Renear (1999), who assert that the principal goal of the research community that develops and applies the TEI Guidelines and other text markup schemes is a greater “theoretical understanding of textual representation” (my emphasis). In making their case the authors invoke Lakatosian criteria that, they confidently pronounce (some disclaimers notwithstanding), this research community’s work meets. Yet the development of OHCOs 1-through-3, for example, surely one of text encoding’s founding “theoretical” moments, demonstrably fails to qualify as a progressive problemshift in Lakatosian terms. Text encoding has not generated a rich body of illuminating theory because it cannot. Asking, for example, “what is text, really?” already poses the wrong question because it assumes an undiscovered essence the encoding community can find with a progressive research program. This will not happen because no mysterious core exists whose explanation falls to home-grown text encoding theory. Renear argues that despite falsification of all OHCO variants there is nevertheless “no reason to give up the common-sense view that texts do have an objective structure independent of our methods and theories about them” (1997, 122).
Structure there may be, but discovering it is not like positing the double-helix of DNA; we need not strive to understand-by-modelling because of inadequate observational technology or limited knowledge. Praxis prompts reflection which generates principle to guide subsequent praxis. Poetry, for example, shows a self-conscious praxis continually reviewing, refining, and codifying (prescriptively) its methodology, and this is as true of its encoding as its writing. Out of “successful” text encoding praxis comes not theory but principle. This is not to say that text encoding has no theoretical component, but that pursuing the theory behind a principle takes us to another place: to linguistics or rhetoric or semiotics or the cognitive psychology of visual perception, and so on. Such pursuits can produce sophisticated and thought-provoking borrowings that unquestionably enrich the literature, but these are the exception, not the rule. In particular, scholarly work in text encoding rarely positions itself with respect to work in modern literary/cultural theory, and even more rarely does it use text encoding as a springboard for a sustained engagement with such theory. This seems to me a significant and unfortunate absence, a product both of text encoding’s desire to differentiate itself as a specific field of intellectual inquiry and of a positivist, utilitarian bias against the perceived negativity and self-imprisoning reflexivity of contemporary theory. Renear’s narrative of the development of text encoding theory exemplifies the latter stance with its Realist/Anti-realist distinction, associating the former with common-sense and characterizing the latter as “consistent with post-structuralist epistemologies” (122).
This imitates an exclusionary move Zavarzadeh and Morton (1991) identify in literary studies: positing deconstruction as the boundary of theory beyond which it is unthinkable to go because (supposedly) deconstruction represents the limits of the thinkable, the point where theory swallows itself in absolute relativism. Ironically, Renear’s account positions itself precisely as theoretically reflexive while exposing its own refusal of genuine reflexivity. Text encoding—indeed humanities computing as a whole—can too easily think of itself as related to every humanities discipline but also marginal or even external to all of them. This licenses attitudes such as Renear’s contention that what he calls text encoding theory brings “a much needed fresh perspective on textuality” (107), as if text encoding occupied a different space from traditional disciplines. Contrarily I would argue that whatever its origins in non-academic praxes, text encoding forms itself in and of humanities disciplines. I should stress that I do not consider these disciplines stable sites: they can and should be challenged. Text encoding therefore offers a locus for work that tries to think through the tensions, contradictions, and faultlines that constitute those disciplines qua humanities disciplines. However, unless it can stand on sufficiently equal terms to enter a critical dialogue with theory of exemplary reflexivity and philosophical rigor, the text encoding community’s theoretical work will have limited significance and appeal—a fate already shared by much of the humanities computing literature (Corns 1991; Warwick 1999). My own position is that the historical materialism that comes down to us from Marx, enriched and updated by thinkers such as Lenin, Adorno, Althusser, and many others, offers us the best critical tools for a dialectical engagement with what it means to encode texts in the humanities. 
Althusser (1982) memorably describes the problem: “Left to itself, a spontaneous (technical) practice produces only the ‘theory’ it needs as a means to produce the ends assigned to it: this ‘theory’ is never more than the reflection of this end, uncriticized, unknown, in its means of realization, that is, it is a by-product of the reflection of the technical practice’s end on its means. A ‘theory’ which does not question the end whose by-product it is remains a prisoner of this end and of the ‘realities’ which have imposed it as an end” (171, emphasis in original). Currently a prisoner of its pragmatic roots, theoretical work on text encoding has only its chains to lose.

REFERENCES

Louis Althusser. For Marx. London: Verso, 1982.

Dino Buzzetti. “Text Representation and Textual Models.” Paper presented at ACH/ALLC 1999, June 1999, University of Virginia.

Paul Caton. “Towards a Politics of Text Encoding.” Paper presented at ACH/ALLC 2001, June 2001, New York University.

Thomas Corns. “Applications in the Study of English Literature.” Literary and Linguistic Computing 6 (1991): 127-30.

Elli Mylonas and Allen H. Renear. “The Text Encoding Initiative at 10: Not Just an Interchange Format Anymore – But a New Research Community.” Computers and the Humanities 33 (1999): 1-9.

Wendell Piez. “Beyond the ‘Descriptive vs. Procedural’ Distinction.” Paper presented at Extreme Markup Languages 2001, August 2001, Montreal.

Allen H. Renear. “Out of Praxis: Three (Meta)Theories of Textuality.” Electronic Text: Investigations in Method and Theory. Ed. Kathryn Sutherland. Oxford: Oxford University Press, 1997.

Allen H. Renear. “The Descriptive/Procedural Distinction is Flawed.” Paper presented at Extreme Markup Languages 2000, August 2000, Montreal.

Claire Warwick. “English Literature, Electronic Text, and Computer Analysis: an Impossible Combination?” Paper presented at ACH/ALLC 1999, June 1999, University of Virginia.

Mas’ud Zavarzadeh and Donald Morton. Theory, (Post)Modernity, Opposition: An “Other” Introduction to Literary and Cultural Theory. PostModern Positions, vol. 5. Washington, D.C.: Maisonneuve Press, 1991.