2021
Volume 15 Number 1

Hierarchical or Non-hierarchical? A Philosophical Approach to a Debate in Text Encoding

Abstract

Is hierarchical XML apt for the encoding of complex manuscript materials? Some scholars have argued that texts are non-hierarchical entities and that XML is therefore inadequate. This paper argues that the nature of text is such that it supports both hierarchical and non-hierarchical representations. The paper distinguishes (1) texts from documents and document carriers, (2) writing from “texting”, (3) actions that can be performed by one agent only from actions that require at least two agents to come about (“shared actions”), (4) finite actions from potentially infinitely ongoing actions. Texts are described as potentially infinitely ongoing shared actions which are co-produced by author and reader agents. This makes texts into entities that are more akin to events than to objects or properties, and shows, moreover, that texts are dependent on human understanding and thus mind-dependent entities. One consequence of this is that text encoding needs to be recognized as an act participating in texting, which in turn makes hierarchical XML as apt a markup for “text representation”, or rather for texting, as non-hierarchical markup. The encoding practices of the Bergen Wittgenstein Archives (WAB) serve as the main touchstone for my discussion.

1. Introduction

Amongst the many theoretical questions about text, there is a philosophical, or more specifically, an ontological question. The most general form of this question is perhaps ‘What is text?’. In Digital Humanities, the issue has partly centred on the question of whether texts are hierarchical or rather non-hierarchical structures. Examples of this discussion include the statement that “text is best represented as an ordered hierarchy of content objects (OHCO), because that is what text really is” [DeRose et al. 1990, 3], or, in opposition to it, the statement that “humanists are trying to represent what they all agree are non-hierarchical structures” [Schmidt 2010, 344]. Conflicting lessons for text encoding have been drawn from these two opposed approaches to the question. Some have concluded that the hierarchical markup grammar XML can be regarded as adequate for text encoding, because that is in their view what texts basically are, viz., hierarchical entities. Others reject XML and embedded markup more generally as inadequate, precisely on the basis of the view that texts have, or at least can have, non-hierarchical structures. In this paper, I want to argue that both conclusions have been drawn prematurely due to an erroneous approach to the ontological question about text.
I shall start by presenting a brief example of philosophical authorship from the last century. A reflection on the editorial history of this example and other writings from the same authorship will lead us to a view into editorial philology and, in particular, digital editorial philology. It is in this digital context that the question about the ontological nature of text and its consequences for text encoding has most forcefully been asked. I shall attempt to demonstrate that a philosophical reflection on the hermeneutical nature of our text practices not only helps us to better understand the question about the ontology of texts, but also helps to dispel the idea that the nature of text would as such, i.e. independently of our text practices, dictate either hierarchical or non-hierarchical markup.

2. Writing

In July 1931, a philosopher in Cambridge reads Augustine’s Confessiones. Augustine’s account of how he learned to speak as a child makes a strong impression on him. Our philosopher reads the account at a time when he is struggling with theoretical questions about language and meaning. He therefore is very sensitive to anything that even remotely deals with these things. Augustine’s description seems generally fair and representative of how we think language acquisition works. But our philosopher gets puzzled about a few sentences. Perhaps he draws a line in the margin of the book, to highlight the passages that he finds perplexing. Later, then, he notes down his thoughts in a notebook, recording what he believes was right and what he believes was wrong with the account. Later still he returns to these notes, and develops his ideas into a longer discussion. He develops an entire argument around Augustine’s account. He regards his discussion of Augustine as a way of becoming clearer about his own thought concerning linguistic meaning, and about the role that humans play in establishing the relation between words and objects. The exact intentions behind Augustine’s original account are of less importance to him now.
He has the discussion of Augustine, together with many other notes and remarks, typed. The resulting typescript he then cuts into paper slips. The slips are of varying sizes: some contain one remark or even only part of a remark, others contain series of remarks or even an entire page. He collects the slips together with cuttings from other typescripts. Next, he reorganizes the contents of his collection. He inserts additional sheets, with handwritten titles for chapters and subchapters. Soon he has this new arrangement of his remarks typed again, thereby producing a large new typescript. It contains more than 4,000 remarks; he calls them “Bemerkungen”. The Bemerkungen are typically separated from each other by one or more blank lines. The typescript looks much like an advanced book manuscript. But soon our philosopher starts to make changes, namely adding, deleting, rearranging and revising remarks and sentences in it. In many places, he adds alternative phrasings. Some parts of the typescript he goes through more than once, making changes in pencil, black ink and red ink. The number of changes and revisions grows larger and larger. The changes now also begin to extend into parallel notebooks and other writing books. Entire new sections are added, some in the margins of the typed pages, others on the typescript’s verso pages, and yet others in separate notebooks.
About a year later he begins considering the idea of making his discussion of Augustine the beginning of a new book in philosophy, to appear in a parallel German-English edition. About fifteen years earlier he had published the Logisch-philosophische Abhandlung; this book had given him some status. He now produces a concise summary of his argument about Augustine, making it the beginning of a discussion about small and well-defined samples of language use. These he calls “language games” (Sprachspiele), and he intends to make the idea of language games the backbone of his entire new book project.
Eventually, after ten more years’ hard work of producing many new Bemerkungen and revising, rearranging, adding and deleting, he has yet another typescript produced that looks ready for the press. The typescript even includes a title (Philosophische Untersuchungen), a motto and a preface. However, in the remaining five years of his life, our philosopher can never bring himself to finish the work for publication. Two years after his death, in 1953, his friends finally edit and publish it with a parallel translation in English.[1]

3. Scholarship

So far I have done nothing but portray a real example of philosophical authorship from the twentieth century. Note that I did not use the word “text” even once. I could have used the word, but I need not have used it. In some places, I could have said “text” instead of “remark” or, in others, instead of “discussion” or “argument” or “book” or “work”. But in all these cases it would have been replaceable. Our author may sometimes have asked himself: Which text should I choose here? Will I ever finish my text? Is my text good enough? Etc. However, again, the occurrences of the expression “text” in such questions are replaceable with words such as “manuscript”, “book”, “phrasing”, “sequence” or “version”. Regarding the notion of text, our real case example did not seem to pose any special theoretical difficulties. Most importantly, the ontological question of what text is, clearly need not have bothered our author. For the author the notion of text need not be problematic at all. An author may just write, delete, rearrange, rewrite, compose, and so on. Neither did I have to be bothered by the notion when telling the story of the example. So, if there is a specific philosophical, ontological issue about text, where does it come in?
Considering the further development of our philosopher’s story may help to find the answer to our question. Let us first try to locate, with the help of our narrative, the points at which text can become a theoretical issue of any sort. As this particular tale goes, before he dies, our philosopher appoints three friends who are to manage the publication of his writings. The three find themselves confronted with a huge mass of pages (which they first have to collect from different places), some handwritten, some typed, some bound in notebooks, some on loose sheets, some in orderly dossiers. This is now standardly referred to as our philosopher’s “Nachlass”. For some of the books and pieces that they decide to edit from this Nachlass, they are able to use neat enough typescripts — for most of the publications, however, they have to make selections and combinations on both large and small scales, and need to do some substantial editing. They have to decide what to choose for publication; which version to use; how to arrange it; how many and which of our author’s variant phrasings to include; whether to use also variants added in other manuscripts; whether to obey all his instructions or only those that they find conducive; whether always to omit what our philosopher himself had deleted; whether to stick to at least some of his idiosyncratic style and punctuation; whether, and how much, to expand on his elliptic references to either his own ideas or also the ideas and works of others; how much to bother the reader with information about the character of the original Nachlass source; etc. etc.
If not for our author, text now seems to have become an issue at least for his editors, or for any editors of a Nachlass such as Wittgenstein’s. In the processes of editorial decision-making, such editors will often refer to precisely this thing, the text, and find themselves confronted with issues of so-called “textual criticism”. We can imagine them discussing and debating these issues both amongst themselves and with the users of their editions. Both the editors and their critics will argue for their respective positions by reference to what they call “the text”; and this invocation of the text, while the word itself often seems to refer to different things for the different sides, always seems to lend their respective standpoints and arguments strength and significance. Surely, for many of the arising disputes, the expression “text” will again be replaceable by some other words, e.g. “source”. In several cases, however, the expression clearly carries something which is not contained in those other expressions, something like the marker of a norm or standard, or of the right interpretation, and “the text” is precisely the expression to be used.
Let us complete the story with some perceptions and questions from the readers’ side. The Wittgenstein readers asked: Have we received all the text, or are parts missing? Is the text displayed in the correct sequence? Does it contain transcription errors? Does the edition maybe mislead me to adopt a wrong interpretation? Have I been given the right text? Have I been given the text as it was intended by the author? To what extent is the text authorized by its author Ludwig Wittgenstein? Does the edited text correspond to the original? The “textual” situation in the Wittgenstein Nachlass itself is often far from clear. The edited text could be something that physically never existed before, or no longer existed — thus, was something that had to be (re)constructed. Or, the editors came at different times to different conclusions about what “the text” was referring to, and for a few items different editions, as also different translations, were produced. Readers would again ask: Which is the text / translation I should use for my interpretation?
Now, it is true that these questions and issues bring us closer to theoretical discourse about text. But none of them necessarily brings us to the ontological question about what text is. Moreover, there are disciplines that not only treat these questions and issues, but also provide answers and solutions to them. I think here in particular of editorial philology, and, of course, especially digital editorial philology. In the following, I will first stress that digital editorial philology provides solutions to the above-mentioned issues and questions. But in this context we will notice that the very same disciplines that provide the solutions, also in fact seem to give rise to our ontological question about text.[2]

4. Digital scholarship

Methods of textual criticism have been developed for many purposes, including for finding solutions to exactly the kind of issues and questions brought up in the previous section. Twentieth-century textual criticism has improved these methods further through the application of digital techniques. For instance, while the practice of producing editions comprising both facsimiles and transcriptions, ranging from ultra-diplomatic to so-called “students” versions, already existed in the pre-digital age, the introduction of digital techniques has made producing such editions easier, cheaper and more efficient. But the digital medium has not only provided improved ways of implementing solutions that already existed before; it has furthermore brought new solutions and possibilities. XML-based user-steered or “interactive dynamic presentation” [Pichler and Bruvik 2014, 181] of online text archives is entirely new, that is, a genuine achievement of digital editorial philology, and it offers something that had not been possible before.
These achievements of digital editorial philology have become possible through text encoding. At the same time, it is also precisely scholars of text encoding who have forcefully embarked on the ontological question “What is text?”.

5. Philosophy

Hierarchical vs. non-hierarchical representation

It appears that it is exactly digital editorial philology, with text encoding at its heart, which has motivated the emergence, or at least the notable reinforcement, of what I have called the ontological question about text. It is particularly the question whether hierarchical text encoding grammars such as XML are adequate for the transcription of manuscript source materials that has caused considerable controversy.[3] Those who deny that they are adequate often justify their position by invoking a non-hierarchical conception of the nature of text: it is the belief that texts are non-hierarchical which leads them to conclude that hierarchical encoding or markup cannot be the correct method. Paradigmatic cases they appeal to include complex manuscript materials which, in their view, are fundamentally characterized by non-hierarchy or at least by multiple structures which overlap with each other. Our philosopher’s Nachlass could be regarded as one such case. Against this kind of argument, in turn, proponents of hierarchical markup grammars, though they grant that overlap and multiple hierarchies exist, have argued in favour of adopting the precisely opposite conception of the nature of text, namely a chiefly hierarchical one. Thus, the fundamental issue is no longer one about “Which is the right text?”, but concerns the ontological nature of text.
But is the view that text is a hierarchical object [DeRose et al. 1990] or, in opposition to it, the view that it is a non-hierarchical object [Schmidt 2010], justified? And if either of the two is justified, does this lend argumentative support to a hierarchical or a non-hierarchical approach in text encoding? In answering this question more fully I would have to address at least the following two sets of questions. First, can the general assumption according to which texts are either hierarchical or non-hierarchical, put any demands on the structure of any particular markup system? Does the fact that a particular object of encoding is hierarchical, entail the demand that the encoding itself be hierarchical or, if it is non-hierarchical, that the encoding be non-hierarchical? Against the view that it does, one could argue that we ordinarily accept that three-dimensional entities are represented in two-dimensional structures. Similarly, we make use of hierarchical taxonomies for domains that in fact can be regarded as non-hierarchical; and, whilst being fully aware of the general vagueness, context-sensitivity, ambiguity etc. of ordinary language, we nevertheless take advantage of exact grammars, logics, strictly organized thesauri or computational ontologies for their analysis and processing. What, then, is it that makes it unacceptable to use hierarchical markup-languages for non-hierarchical sources, or non-hierarchical markup-languages for hierarchical sources? Secondly, are the assumptions that texts are either hierarchical or non-hierarchical objects themselves justified? On what grounds, and in what sense, can it be said that the nature of text is either of a hierarchical or a non-hierarchical structure?

Document carriers — Documents — Texts

In this paper, I have a direct focus on the second set of questions, but will provide at least a partial answer also to the first set of questions. Now, to answer the question whether texts are hierarchical entities, we should first try to find out what sort of entities texts could be on a general level. This is after all also what Renear and others wanted: To answer the question what text (really) is. But this ontological question, in turn, should first bring us back to the issue of writing. What is writing? It seems a safe thing to say that writing is an action, and as such it should be possible to describe it in terms of action theory. This implies the application of concepts such as “agent”, “basic action”, “action result”, and others. I would like to suggest the following characteristics of writing:
• First: Writing is, at least in terms of its physical movements, a basic action [Danto 1963, 435f] and thus not caused by other actions.
• Second: Writing produces a finite action result, the written. The written is writing’s intended result; we call it a document.
• Third: Writing does not need more than one agent.
It seems important to appreciate the fact that producing documents, writing, is not the same as producing texts, and thus, to distinguish the action of producing documents from the action of producing texts. One important difference is that producing texts is producing documents with meaning, as we normally do when we write, or also furnishing documents with meaning, as we do when we read with understanding. Writing on the other hand does not need to produce meaningful documents and can also be performed by machines. Reading as such can equally be performed by machines (namely “reading machines”), but not reading with understanding.[4] I would now like to introduce for the rest of this paper the technical term “texting” for the action of producing texts. Let us look at some more differences between writing and texting in terms of action theory:
• First, texting is not a basic action but is co-caused by two other actions, writing and reading. (Or: If you look at the matter as one of spoken communication, the two actions that co-cause texting are speaking and hearing.)
• Second (and consequently), while the action of writing can be performed by only one agent, texting is performed by more than one agent. One agent is the author, another is the understanding reader (naturally, the author and the reader can coincide in one and the same person). Consequently, texting is, unlike writing, not under the sole control of the author. Rather, texting evolves through actions that are shared among a multitude of agents. Therefore, when attempting to adequately describe texting, it is vital to include not only the author agent, but also the reader agent.
• Third, while writing produces a finite and rather stable result (namely documents), texting does not; rather, it produces an unstable, potentially continuously ongoing and open-ended result. Writing has a clearly determinable beginning and end in time. Texting can have a clearly determinable beginning in time, coinciding with the beginning of the action of writing with understanding, but it does not have a clearly determinable end. Now, ontologically speaking: what sort of entities exactly are the results of texting, namely texts?
If we start from a widely accepted tripartite division of what exists into objects, properties and events, it seems to make perfect sense to think of written documents, the products of writing, as objects. Equally, it seems to make perfect sense to conceive of the carriers of written documents (paper, trees, stone, parchment etc.) as objects. More specifically, documents and document carriers are concrete, material objects. But does the same hold true of texts? Very often the expression “text” is used to mean the same as “document”. However, it is important to note that “text” often also denotes something very different from a document, and that the conditions of identity in the case of text in this sense are not the same as the conditions of identity for documents. This applies for example when we say “The work exists in many drafts and different versions” (one text, many documents), or to any ambiguous sentence, e.g. “John went to the bank,” as well as to cases of homonymy and polysemy (one document, many texts). Texts in this sense clearly cannot be concrete objects.[5] Some have suggested that texts are abstract objects (e.g. Renear in [Hockey et al. 1999]; [Huitfeldt et al. 2012]). But there are also some factors which speak against this view, whether text is conceived as an abstract object in the sense of a type or as an abstract object in the sense of an immaterial object.[6] Consequently, though both “document” and “text” are nouns, and many nouns denote objects, it may be that “text” does not denote an object, or that, to speak with Wittgenstein, the “surface grammar” of “text” misleads us into believing that it denotes an object [Wittgenstein 2009, §664].
Some of the arguments which speak against the view that texts are some kind of abstract object are the same arguments which actually support the view that texts may be events. To classify texts as events rather than as abstract objects or properties will at first seem a strange thing to do, but this is merely because we are used to thinking of texts in analogy to documents, or even document carriers: manuscripts, books, sheets of paper, computer screens etc., which all belong to the domain of objects rather than events. One of the aspects which speak in favour of the event view is that a text at no single (non-durative) point in time seems to be present in its entirety, which is a characteristic of events [Kanzian 2015, 897]. A consequence of the event view of text is that the locus of a text is temporally and spatially distributed: as any event’s locus is the locus of its bearers, so must a text’s locus be the locus of its bearers. The text bearers cannot, however, be only books or computer screens; these, considered by themselves, are document bearers rather than text bearers. If the event view of text is correct, then not only the document itself must be regarded as a text bearer, but also the author and the understanding reader. Thus, the text event will need to be seen as taking place exactly in the geographically and chronologically dispersed interplay between authors, documents and readers. This fits very well with our observation above, namely that texts are shared among and co-produced by authors and readers. One advantage of the event view of texts seems to be that it does not, ontologically speaking, demand more than the following ontologically rather uncontroversial entities: as bearers of the event, the concrete object document, the concrete object author, and the concrete object reader; and as the event proper, the action of (understanding) reading.
This implies that it not only makes sense to conceive of texts as events, but indeed events of a special kind, namely actions. Thus, texts not only seem to be produced by actions — they seem themselves to be actions. Within the group of actions, texts can then further be characterized by being actions which are co-produced by authors and readers, thus shared actions.[7]

Text encoding

Text encoding can record data about the document carrier, the document as well as the text. Saying that the source is a notebook or a typescript or that it is written in ink or pencil pertains to the first; recording which words it contains or which letters are deleted and which are added pertains to the second; talking about the document’s meaning and stating that there are implicit references and allusions to a work by another author in the document pertains to the third.[10] On whatever level text encoding moves, it will always also record data about the encoder’s engagement with the source. This becomes particularly clear where it aims at recording the text and thus moves on the third level. However, already on the level of recording data about the document carrier, the encoding attributes structure to the source rather than simply depicting a pre-existing structure (D.R. Raymond in [Biggs and Huitfeldt 1997, 358]). In the language of the event conception of text suggested above, one could say that the encoder herself inevitably becomes one of the bearers of the text.
What, then, are the implications of our philosophical investigation for our question whether hierarchical or rather non-hierarchical markup is appropriate for the encoding of texts? I think the main implication is, to make a long story short, that both are equally appropriate. For, following the present argument, what we encode are as much our own signifying text actions as the source (the source “as such”, as one is tempted to say). Transcription is, in Sahle’s words, “a protocol of perception, mapping and interpretation” [Sahle 2015]. Whether the text itself will be hierarchical or non-hierarchical will therefore depend on us as encoders. Accordingly, both the position holding that markup is to be hierarchical because text itself is hierarchical and the opposed view can be seen to be correct in one sense, but wrong in another. Both seem to draw their consequences for text encoding on an (at least ontologically) unfounded basis. They make it sound as though the question were essentially a matter of finding out which is the right representation of a pre-given structure of text. But “hierarchical” and “non-hierarchical” describe aspects of our active engagement with the source and therefore concern the nature of our own actions rather than the nature of independent entities. “An ‘OHCO structure’ is”, as Dino Buzzetti says, “not a model of the text, but a possible model of its expression” [Buzzetti 2002, 71]. The OHCO view of text could thus be rephrased to: “Text is a hierarchical ordering of content objects”. According to Desmond Schmidt, complex manuscript variant structures pose overwhelming challenges for hierarchical markup, and consequently form a primary case for the non-hierarchical approach (as also for non-embedded markup). However, text variants are, against the background of the argument proposed here, not independent entities that put insurmountable constraints on our mapping acts either. For what makes up a text variant is already co-constituted by our reading and mapping of the source. With Wittgenstein we could thus say that both sides of the debate mix sign talk and symbol talk, and that the primary field of text encoding belongs to the realm of symbols rather than that of signs. A symbol is the sign with meaning: the sign as symbolized [Wittgenstein 1963, 3.32]. Whether to encode a source in hierarchical or non-hierarchical ways is a question of how to map (symbolize) the signs of the source.
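The point that hierarchy belongs to the mapping rather than to the signs can be illustrated with a small sketch. It is only an illustration under stated assumptions: the sample sentence, the labels "remark", "sentence" and "line", and the helper names (Span, overlaps, is_hierarchical) are all invented here and are not taken from any actual transcription or encoding system. The sketch attaches two different sets of standoff ranges to one and the same character sequence and checks each for partial overlap.

```python
from dataclasses import dataclass
from itertools import combinations

# The bare sequence of signs; the wording is an invented sample.
SIGNS = "Every word has a meaning. This meaning is correlated with the word."


@dataclass(frozen=True)
class Span:
    """A standoff annotation: a label attached to a character range of SIGNS."""
    label: str
    start: int  # inclusive
    end: int    # exclusive


def overlaps(a: Span, b: Span) -> bool:
    """True if a and b intersect without one containing the other."""
    intersect = a.start < b.end and b.start < a.end
    nested = (a.start <= b.start and b.end <= a.end) or (
        b.start <= a.start and a.end <= b.end
    )
    return intersect and not nested


def is_hierarchical(spans) -> bool:
    """A mapping is tree-like if no two of its spans overlap partially."""
    return not any(overlaps(a, b) for a, b in combinations(spans, 2))


first_end = SIGNS.index(".") + 1            # end of the first sentence
line_break = SIGNS.index("is correlated")   # hypothetical line break in the source

# Mapping 1: the encoder reads the signs as a remark made of sentences.
BY_SENTENCE = [
    Span("remark", 0, len(SIGNS)),
    Span("sentence", 0, first_end),
    Span("sentence", first_end + 1, len(SIGNS)),
]

# Mapping 2: the encoder reads the same signs as physical lines.
BY_LINE = [
    Span("line", 0, line_break),
    Span("line", line_break, len(SIGNS)),
]

print(is_hierarchical(BY_SENTENCE))            # each mapping alone is tree-like
print(is_hierarchical(BY_LINE))
print(is_hierarchical(BY_SENTENCE + BY_LINE))  # overlap arises only when both are kept
```

On this toy case, overlap is not found in the signs themselves; it appears only once the encoder decides to record both mappings together.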
What could, or rather what should, then bring us to encode hierarchically rather than non-hierarchically, or the other way around? In the end, it can only be our scholarly interests and needs. If we are interested in encoding document structures, then it may be important to record what we regard as overlapping structures, e.g. overlapping structures at the cross points between sentence or paragraph units on the one hand and page units on the other, through non-hierarchical encoding, or even standoff markup. If we are interested in encoding the sequence of (as such: genetically linear) writing acts, a markup system that permits recording the points where these writing acts’ manifestations cross may equally be the thing to choose. But even in these cases, practising one of the TEI’s recommendations for handling overlap through hierarchical XML may be equally appropriate.[11] In any case, it seems problematic to hold that it is the text’s nature, as something independent of us, which requires overlap markup. It is rather the nature of our representation of the source which requires hierarchical or non-hierarchical markup, and thus something which is under our control, not the source’s.
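One way of handling the paragraph/page crossing within hierarchical XML is to keep paragraphs as containers and mark page boundaries with empty milestone elements. The following is a minimal sketch of that idea, assuming an invented sample fragment that uses the TEI element names text, p and pb; the helper names text_by_page and text_by_paragraph are hypothetical, and nothing here is meant to reproduce WAB's actual encoding.

```python
import xml.etree.ElementTree as ET

# Invented sample: the second paragraph runs across a page boundary.
SAMPLE = """<text>
  <pb n="1"/>
  <p n="a">First remark, wholly on page one.</p>
  <p n="b">Second remark begins on page one <pb n="2"/>and ends on page two.</p>
</text>"""


def text_by_paragraph(xml_source: str) -> list[str]:
    """Read the encoding along its container hierarchy (paragraphs)."""
    root = ET.fromstring(xml_source)
    return ["".join(p.itertext()) for p in root.iter("p")]


def text_by_page(xml_source: str) -> dict[str, str]:
    """Read the same encoding along its milestones (page beginnings)."""
    pages: dict[str, list[str]] = {}
    current = None  # number of the page we are currently on

    def collect(chunk):
        # Store non-empty character data under the current page.
        if chunk and chunk.strip() and current is not None:
            pages[current].append(chunk.strip())

    def visit(elem):
        nonlocal current
        if elem.tag == "pb":
            current = elem.get("n")
            pages.setdefault(current, [])
        collect(elem.text)
        for child in elem:
            visit(child)
            collect(child.tail)  # text following the child, still inside elem

    visit(ET.fromstring(xml_source))
    return {n: " ".join(parts) for n, parts in pages.items()}


print(text_by_paragraph(SAMPLE))
print(text_by_page(SAMPLE))
```

Both the paragraph view and the page view are recoverable from the one hierarchical file; which of the two is given container status and which is reduced to milestones is a decision of the encoder.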
If this were not true, and consequently if it were not true that we can adequately transcribe complex primary sources in hierarchical XML, it would be quite mysterious why so many projects manage to encode and edit intricate and multifaceted, so-called overlapping and non-hierarchical handwritten materials with hierarchical XML. They do so in an effective manner, living up to the (still evolving) standards for digital scholarly editions. One example is the editorial work on the Wittgenstein Nachlass by the Wittgenstein Archives at the University of Bergen (WAB). It is the ambition of WAB’s XML transcriptions to contain an accurate graphemic record of every single letter that Wittgenstein wrote in the Nachlass, and of the writing acts by which it was produced or to which it was subjected. This information is converted to “diplomatic” version outputs in HTML which, in short, represent the source on the level of its letters and the author’s writing acts. At the same time, our XML transcriptions also permit the production of “linearized” and “normalized” versions, and make possible yet other, strongly user-steered outputs produced via “interactive dynamic presentation” [Pichler and Bruvik 2014, 181] in the spirit of Web 2.0. A characteristic of the twenty-thousand-page Wittgenstein Nachlass is the abundance of, partly rather complicated, text variance. Each of the around 65,000 occurrences is encoded in XML at WAB not only on letter level, but also on word level, which again makes outputs in diplomatic, linearized, normalized and other formats possible. It is XML that permits all this. However, at least in my view there is nothing in the source which requires us to choose hierarchical XML over a non-hierarchical approach for achieving all this, or a non-hierarchical approach over hierarchical XML.[12]
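How one XML transcription can yield both a diplomatic and a normalized output may be sketched as follows. This is a toy example under explicit assumptions: the sample sentence is invented, the element names (s, del, add, choice, orig, reg) are generic TEI names and not necessarily those of WAB's transcriptions, and the bracket conventions in the diplomatic output as well as the function name render are made up for this illustration. WAB's own outputs are produced with XSLT; Python is used here only for brevity.

```python
import xml.etree.ElementTree as ET

# Invented sample with one substitution and one orthographic slip.
SAMPLE = (
    "<s>Every <del>sign</del><add>word</add> has a "
    "<choice><orig>meanig</orig><reg>meaning</reg></choice>.</s>"
)


def render(elem: ET.Element, mode: str) -> str:
    """Render elem as 'diplomatic' or 'normalized' text."""
    # Content of the element: its own text, then each child with its tail.
    inner = (elem.text or "") + "".join(
        render(child, mode) + (child.tail or "") for child in elem
    )
    if elem.tag == "del":   # deletion: shown diplomatically, dropped otherwise
        return f"[-{inner}-]" if mode == "diplomatic" else ""
    if elem.tag == "add":   # addition: flagged diplomatically, silent otherwise
        return f"[+{inner}+]" if mode == "diplomatic" else inner
    if elem.tag == "orig":  # original spelling: diplomatic output only
        return inner if mode == "diplomatic" else ""
    if elem.tag == "reg":   # regularized spelling: normalized output only
        return inner if mode == "normalized" else ""
    return inner


root = ET.fromstring(SAMPLE)
print(render(root, "diplomatic"))  # Every [-sign-][+word+] has a meanig.
print(render(root, "normalized"))  # Every word has a meaning.
```

The two outputs differ only in which of the encoder's recorded readings the rendering rule selects, which is the sense in which a single transcription supports several versions.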

6. Conclusion

The version of the comment on Augustine’s account of language acquisition that our Cambridge philosopher, Ludwig Wittgenstein, eventually ended up with, includes the following passage:

In this picture of language we find the roots of the following idea: Every word has a meaning. This meaning is correlated with the word. It is the object for which the word stands.  [Wittgenstein 2009, §1]

An analogous observation can be made about the debate on hierarchical vs. non-hierarchical markup. The way in which this debate is largely conducted suggests that the central issue concerns the accurate representation of some mind- and action-independent reality. It is assumed that, if texts are hierarchical, the correct depiction must be hierarchical; if they are non-hierarchical, the correct depiction must be non-hierarchical. According to this picture, text encoding is an act of correlating codes with objects and structures in and of themselves. But any text action, including text encoding, is a creative symbolizing action and, thus, already in the realm of symbols. This is nothing out of the ordinary; it is simply what meaningfully engaging with the world looks like on an everyday basis; it is what each of us does all the time, without running into any theoretical difficulties. Moreover, though one sometimes hears that the need to escape relativism and to produce encodings and editions that will benefit others (including future generations) requires strict avoidance of interpretation in the domain of encoding, it needs to be said that the way of looking at things proposed here does not entail any support for relativism. Rather than worrying about relativism, we simply have to ensure (and all the time work to ensure!) that there is sufficient agreement in our interpretations. Successful communication is not dependent on there being non-interpreted facts, but on there being shared interpretations (or rather, more generally, shared understandings). The TEI substantially helps with that.
The issues from Section 3, as we are now in a position to appreciate, are not to be regarded as pre-given. Rather, as much as they concern the sources to be edited, studied, translated etc., they equally concern ourselves: as authors, editors, readers and scholars, with our preferences, intentions, and the purposes of our actions. A simple question such as “Should the edition follow the physical, chronological or content order of the written?” is as much about what we want to do with the source as about the source “as such”. Questions of this kind ask for an engaged action. Through websites such as WAB’s “Interactive Dynamic Presentation” platform this aspect is brought to the fore, and the fact that actions are required is, at least by way of example, made explicit. Users of the WAB site can use XML transcriptions and XSLT tools as a basis for creating texts that follow their own editorial choices. The resulting texts will be shared actions, co-produced by at least the following agents: Ludwig Wittgenstein, WAB’s transcribers and editors, the software authors, and the interacting users. The ways in which we talk and argue about text manifest that texts originate in and are carried by understanding and acting human subjects. Texts are mappings of signs onto symbols. Thus, when discussing which of the texts emerging from a rich and complex Nachlass to choose, or what to identify as a “work” in it, etc., we are discussing, first, how best to map this Nachlass’ significatory potential onto symbols and, secondly, which of the symbolizations to give preference to. It lies in the nature of talk about actions that it can be evaluative and normative. If it is true that texts are actions, then it therefore lies in the nature of text-talk, too, that it can be evaluative and normative. With the later Wittgenstein, we might say that scholarly talk about “text” typically exhibits a normative grammar. This explains why the editorial issues described in Section 3 are indeed issues.
In this paper, I have tried to show that the debate about hierarchical vs. non-hierarchical markup can be resolved by a reflection on the “depth grammar”  [Wittgenstein 2009, §664] or “logical grammar”  [Wittgenstein 1963, 3.325] of “text”. This grammar is, due to texts’ specific ontological nature, categorially different from the grammar of “document”. Texts, in the sense in which they are different from documents, are ontologically difficult entities and may, as I have tried to argue here, not be objects. Writing alone does not produce texts, but documents. It seems, however, to be a fact that texts are in their existence dependent on human understanding, and that it is the meaning- and structure-constituting aspects of document understanding which at the same time make texts something under our command and responsibility. Therefore, text encoding is no passive depiction but co-constitutes its subject: it never records the mind-independent state of the source alone; rather, it always also records its own actions of recording, its specific representation of the source. Naturally, this also goes for WAB’s own XML transcriptions of the Wittgenstein Nachlass: they are not understanding-free depictions of the source, but are themselves already the results of acts of understanding. The point that texts, and also transcriptions, result from acts of understanding does not, however, as I have tried to explain, need to involve any sort of unwanted relativism. The fact that it is we as understanding subjects who decide on the structure of texts explains in turn why XML can be as successful a markup system for the encoding of complex manuscript materials as it indeed is. It should not be so successful if it were an independent hierarchical or non-hierarchical structure of the source that decided on the success or failure of our encoding. It is only when these central points are neglected that the debate about hierarchical vs. non-hierarchical markup can arise in the first place.[13]

Notes

[1]  This is a very brief account of the story of the Austrian–British philosopher Ludwig Wittgenstein’s book Philosophical Investigations [Wittgenstein 2009], with a focus on its first four paragraphs where Augustine’s description (in Confessiones I, 6 and 8 [Augustinus 2013]) of how he learned to speak and understand is discussed. The book was published posthumously in 1953 by Wittgenstein’s heirs from his Nachlass. The Wittgenstein Nachlass is described and catalogued by G.H. von Wright in “The Wittgenstein Papers” ([Wright 1982], first published 1969). The earliest preserved version of Philosophical Investigations §§1–4 [Wittgenstein 2009] is from July 1931 and can be found in Ms-111 (http://wittgensteinsource.org/Ms-111,15_f [Wittgenstein 2015]). For a concise and easily accessible account of Wittgenstein’s way of working in the early 1930s see Joachim Schulte on http://wittgensteinsource.org/Ms-111_m [Wittgenstein 2015].
[2]  For an illustrative assessment of challenges and achievements in the pre-digital era of editing Wittgenstein as well as the scholarly reactions to the editions see [Hintikka 1991].
[4]  The distinction between reading with and reading without understanding is also explicitly commented upon by our philosopher [Wittgenstein 2009, §156].
[5] Some may object that we frequently use “text” and “document” interchangeably and that the distinction between “text” in the sense of document and “text” in a sense in which it is very different from documents and document carriers is counterintuitive and goes against standard usage of the expression. However, it is generally accepted that words can have a variety of different uses, and from a philosophical perspective it is defensible to distinguish between the surface and the depth grammar of our concepts and expressions. Moreover, the view that there is a categorial distinction between document and text is held by many digital humanists and scholars of textual criticism (see for example Renear in [Hockey et al. 1999] and [Gabler 2012]).
[6]  Some of the challenges that conceptions of text as abstract object have to face have been addressed by philosophical critiques of the FRBR (Functional Requirements for Bibliographic Records, [IFLA 1998]) ontology which conceives text “works”, “expressions” and “manifestations” as types (see e.g. [Renear and Dubin 2007]). Other challenges to that conception have to do with the fact that texts have a beginning in time and also change with time and place, while abstract objects in a standard sense do not.
[7] There is a long tradition in twentieth-century literary theory, poststructuralism, phenomenology, hermeneutics, reader-response criticism, linguistic pragmatics and speech act theory as well as semiotics of seeing an intimate connection between text and event / action, or even of viewing texts as some kind of event or action. However, one has to pay attention to the fact that each of these schools comes with its own specific terminology and conceptualizations, which may not agree with each other or with the approach taken here. Here, the view that texts are (shared) actions is advanced from the perspective of analytic philosophy, more specifically analytic ontology and action theory, although discussing these views in great detail is beyond the scope and space of the current paper.
[8]  The point of mind-dependency is discussed by [Kanzian 2015] for artefacts in general, with a particular focus on works of art. My view that texts do not exist if they are not produced and maintained as such through understanding reading, has stronger implications for their mind-dependency than Kanzian’s position. Kanzian holds that artefacts are for their subsistence mind-dependent only in the sense that they need a mind that can, but does not actually need to recognize them as such: “Artefakte hängen hinsichtlich ihres Bestehens zu jedem Zeitpunkt davon ab, dass mindestens ein Bewusstsein in der Lage ist, sie als solche anzuerkennen.”  [Kanzian 2015, 901]
[9] It is interesting to note that Augustine himself (the very “opponent” criticized in Wittgenstein’s Philosophical Investigations §1 [Wittgenstein 2009]) in fact promoted the view that includes the human as a sine qua non (e.g. De Dialectica V [Augustinus 1975]) and that will inform Wittgenstein’s entire mature philosophy: Nothing is a sign unless it is understood (and practiced) as a sign by a human.
[10]  The term “text technology” (cf. TEXT Technology: The Journal of Computer Text Processing) can also be applied at all three levels, thus using “text” in a wide sense. Preservation methods, for example, deal with the document carrier; OCR addresses the level of the document itself; and semantic technologies in turn refer to the level of text in the narrow sense.
[11]  The string “The trees are green with white flowers” can be seen to contain overlap between the italics of “are green with white” and the underlining of “trees are green”. A transcription such as this one:
The <underline>trees <italic>are
green</underline> with white</italic>
flowers.
would not be well-formed in terms of XML, since it is non-hierarchical, that is: the content of the <italic>-element is not fully embedded in the content of the <underline>-element, but overlapping with it. However, the passage can also be transcribed as well-formed XML text in the following way, applying what is called “fragmentation”:
The <underline>trees <italic part="I">are
green</italic></underline><italic
part="F"> with white</italic>
flowers.
On overlap see more in http://www.tei-c.org/release/doc/tei-p5-doc/en/html/NH.html. It should be emphasized that TEI XML equally allows for stand-off markup; see more in http://www.tei-c.org/Activities/Workgroups/SO/sow06.xml.
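The contrast between the two transcriptions above can be checked mechanically. The following is a minimal sketch in Python, using only the standard library; it is an illustration and not part of WAB's tooling. It assumes a hypothetical `<p>` wrapper element so that each snippet parses as a standalone document, and it replaces the line breaks of the samples with single spaces. The helper names `is_well_formed` and `span_text` are likewise introduced here for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical <p> wrapper added so each snippet is a complete document;
# line breaks from the note's samples are normalized to single spaces.
OVERLAPPING = (
    "<p>The <underline>trees <italic>are green</underline>"
    " with white</italic> flowers.</p>"
)
FRAGMENTED = (
    '<p>The <underline>trees <italic part="I">are green</italic></underline>'
    '<italic part="F"> with white</italic> flowers.</p>'
)


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML, else False."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def span_text(xml_text, tag):
    """Rejoin, in document order, the text covered by every <tag> element."""
    root = ET.fromstring(xml_text)
    return "".join("".join(el.itertext()) for el in root.iter(tag))


print(is_well_formed(OVERLAPPING))       # False: the tags overlap
print(is_well_formed(FRAGMENTED))        # True: the italics are split in two
print(span_text(FRAGMENTED, "italic"))     # are green with white
print(span_text(FRAGMENTED, "underline"))  # trees are green
```

Rejoining the two `<italic>` fragments recovers the full italic span, which suggests that fragmentation loses no information about the overlap; it only relocates it into a convention (here the `part` attribute) that the reader of the encoding has to understand.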

[12]  A simple example from Wittgenstein Nachlass Ms-106 can be used to exemplify the basic idea of distinguishing document and text levels through diplomatic and normalized versions, respectively: In this passage, Wittgenstein did not actually write out the two variants “Allgemeinheitsbezeichnung” and “Allgemeinheit”, as he did with the two variants “brauchen” and “verwenden”. He wrote only “Allgemeinheitsbezeichnung” and subsequently deleted part of it, yielding our reading of the passage as containing the two variants “Allgemeinheitsbezeichnung” (being eventually discarded) and “Allgemeinheit”. The diplomatic version can look something like this:

Dann aber scheint es mir als könne
man die Allgemeinheitsbezeichnung — alle etc —
in der Mathematik überhaupt nicht brau-
chen verwenden. [Wittgenstein 2016, Ms–106,90[4]et92[1]]

In the diplomatic version, “Allgemeinheitsbezeichnung” is not spelled out as two words (“Allgemeinheitsbezeichnung” and “Allgemeinheit”), while in a normalized version it of course is:

Dann aber scheint es mir als könne
man die Allgemeinheitsbezeichnung Allgemeinheit — alle etc. —
in der Mathematik überhaupt nicht
brauchen verwenden. [Wittgenstein 2016, Ms–106,90[4]et92[1]]

Both the diplomatic and the normalized presentation are produced from one and the same transcription in XML:

<s type="es">Dann aber scheint es mir als
k&ouml;nne man die <choice type="em"><orig type="em1">Allgemeinheit
<del type="d">sbezeichnung</del></orig><orig type="em2">
<choice type="dsl"><orig type="alt1">Allgemeinheitsbezeichnung</orig>
<orig type="alt2">Allgemeinheit</orig></choice></orig></choice> &dash;
alle <abbr type="abb">etc<corr type="tra">&p.abb;</corr>
</abbr> &dash; in der Mathematik &uuml;berhaupt nicht <choice type="dsl">
<orig type="alt1"><del type="d">brau<lb rend="shyphen"/>chen</del></orig>
<orig type="alt2">verwenden</orig></choice>&p.es;</s>

[Wittgenstein 2016, Ms–106,90[4]et92[1]]

The same transcription can also be converted to other outputs, and, through “interactive dynamic presentation”, this can also be done by the external user (see http://wittgensteinonline.no/). More WAB Wittgenstein Nachlass transcription samples in XML are available from http://wab.uib.no/cost-a32_xml/.
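How one transcription can yield both presentations may be sketched as follows. This is a hedged illustration in Python, not WAB's actual XSLT. It works on a simplified copy of the transcription above in which the entities are replaced by literal characters (assuming that `&p.abb;` and `&p.es;` stand for a period and `&dash;` for a dash) and incidental whitespace is removed. The rendering rules are assumptions read off the two sample outputs: the diplomatic view takes the `em1` branch, keeps the hyphenated line break and omits `<corr>`; the normalized view takes the `em2` branch, drops the line break and keeps `<corr>`.

```python
import xml.etree.ElementTree as ET

# Simplified copy of the sample: entities resolved by hand (an assumption),
# whitespace between elements reduced to what the outputs require.
SAMPLE = (
    '<s type="es">Dann aber scheint es mir als könne man die '
    '<choice type="em"><orig type="em1">Allgemeinheit'
    '<del type="d">sbezeichnung</del></orig>'
    '<orig type="em2"><choice type="dsl">'
    '<orig type="alt1">Allgemeinheitsbezeichnung</orig> '
    '<orig type="alt2">Allgemeinheit</orig></choice></orig></choice>'
    ' \u2014 alle <abbr type="abb">etc<corr type="tra">.</corr></abbr>'
    ' \u2014 in der Mathematik überhaupt nicht <choice type="dsl">'
    '<orig type="alt1"><del type="d">brau<lb rend="shyphen"/>chen</del></orig> '
    '<orig type="alt2">verwenden</orig></choice>.</s>'
)


def render(node, mode):
    """Flatten node to plain text; mode is 'diplomatic' or 'normalized'."""
    if node.tag == "lb":
        # The hyphenated line break belongs to the document level only.
        return "-\n" if mode == "diplomatic" else ""
    if node.tag == "corr" and mode == "diplomatic":
        # Editorial corrections are not part of what was written.
        return ""
    if node.tag == "choice" and node.get("type") == "em":
        # Pick one encoding of the emendation, depending on the view.
        wanted = "em1" if mode == "diplomatic" else "em2"
        return "".join(render(o, mode) for o in node if o.get("type") == wanted)
    # Default: own text, then each child followed by its tail text.
    # Deleted text is kept; its strike-through is not marked in plain text.
    out = [node.text or ""]
    for child in node:
        out.append(render(child, mode))
        out.append(child.tail or "")
    return "".join(out)


root = ET.fromstring(SAMPLE)
diplomatic = render(root, "diplomatic")
normalized = render(root, "normalized")
print(diplomatic)
print(normalized)
```

The point of the sketch is that the choice between the two views is made in `render`, that is, by whoever produces the presentation, and not by the transcription alone.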

[13]  I am indebted to many colleagues for exchanges on the ideas appearing in this paper, including D. Apollon, S. Bangu, R. Falch, N. Gangopadhyay, S. Gradmann, A. Greve, S. Greve, C. Huitfeldt, C. Kanzian, J. Macha, S. Markewitz, G. Meggle and A. Renear. I would also like to thank the organizers and participants of GDDH 2016 (6.6.2016; see http://www.etrap.eu/activities/gddh-2016/) for giving me the opportunity to present and discuss an earlier version of this paper, and seminars in Bergen (8.9.2016) and Innsbruck (28.4.2017) for the opportunity to discuss the philosophical ontology behind it. Further, I would like to thank two anonymous reviewers from DHQ for helpful and constructive comments.

