“Revisiting Revisions: employing XML and XSL to display
deeply encoded, multi-versioned text”
Susan
Schreibman
New Jersey Institute of Technology
susan.Schreibman@njit.edu
Jose
B.
Chua
New Jersey Institute of Technology
Many humanities computing projects working in SGML are contemplating porting
their data into XML for direct display on the Internet. In addition, for many
projects who found the barriers to working with SGML too great, XML and its
related technologies offers them the opportunity of working with low cost (free
in many cases), more human-friendly tools to encode and display their texts.
One of the many areas that the suite of XML technologies promises much greater
functionality over SGML is in the area of display. Although the editors of a
digital project might develop a robust encoding scheme in SGML, unless they
could afford the $1000,000 price tag that went with DynaWeb, in addition to the
salary of a programmer to work with the system, their encoding was for
naught.
XSL (Extensible Stylesheet Language) provides the breakthrough that projects with
limited resources have been waiting for to display deeply-encoded texts. The
display possibilities afforded by XSLT (Extensible Stylesheet Transformation
Language) provide new challenges and opportunities to the humanities computing
community in developing more complex inter- and intra-textual encoding which
displays several distinct yet interconnected objects.
One of the areas begging for investigation in this regard is the use of XML/XSL
for versioning poetry. Encoding and presenting multiple versions of poetry has
been one of the most challenging areas of SGML encoding using the TEI. There
are, indeed, very few examples of it (notable exceptions are The Blake Archive,
The Rosetti Archive, and The Piers Plowman Archive).
Our Poster Session will explore the encoding and display issues of XML-encoded
poetry across several versions, using XSL to display multiple views of the texts
and its related apparatus. XSL is particularly interesting in this regard in
that it is flexible enough to handle nested revisions easily, and can even
assign identification numbers based on an elementís nesting level (see Figure
1).
This type of identification makes it easier to determine the parent of the
revision, and the XSL stylesheet can attach and present it appropriately.
Publishing textual apparatus has always presented editorial difficulties. In
print publication the conundrum of whether to place apparatus at the foot of the
text or at the end of the text was never successfully resolved. Digital editing
has only compounded these problems, and presenting apparatus across versions
presents even greater challenges. Placement depends on which line or lines the
note is referring to, and the existence of those lines depends on which version
is being viewed. One solution we are currently experimenting with is to place
the comment separately from the poem, and to assign linespan and versionspan
parameters. The versionspan parameter would indicate which versions the comment
is applicable to, and the presentation logic (XSL stylesheet) would suppress or
show the comment depending on what version was on display. The linespan
parameter would allow the presentation logic to determine where to place the
comment. This placement could run alongside the line it referring to, or could
be suppressed until a specific user action occurred.
Since the same comment may be displayed across several versions, it would be
beneficial to the reader if he or she knew that a particular comment was already
read. However, persistence is not a goal of XML or XSL. The solution would lay
in either a server-side or client-side application logic language. The simplest
solution would be to give each comment an identification number. A persistence
datastore (either on the server or client side) would keep track of which
comments were already read and the application logic could then take appropriate
action. To give a more concrete example, if an XSLT document transformed an XML
document into HTML, a Javascript routine could mark unread and read comments in
different colors.
Another particularly challenging editorial issue has been the representation of
the intra-textual writing process. When working with a particular manuscript
draft, an editor may make an educated guess as to when additions, deletions and
emendations were made on a particular manuscript draft (based on handwriting,
ink colour, the logic of the sentences, etc). Indeed, one of the beauties of
publishing facsimile editions of manuscripts is that no editorial statement need
be made in this regard ñ it is left up the reader to decide on the writing
process based on an examination of the evidence. Of course, this textual
ambiguity can be replicated in the digital environment by publishing scanned
versions of text. Unfortunately, however, encoded texts allow no such
ambiguity.
Representing the intra-textual revision process in an encoded edition is a
particularly interesting and challenging problem. In these cases, relative
ordering may alleviate the encoding process. We are experimenting with assigning
ordering comparative identifications in XML, regardless of the exact revision
time relative to other emendations. In the case where two or more revisions
cannot be ordered, they are assigned the same ordering identification. It is
then up to the presentation logic to determine how to display it. These
revisions could be displayed side-by-side, or may require an explicit action on
the readerís part to be viewed. Figure 2 shows a very simple example of the
types of XML encoding we are experimenting with in encoding an intra-textual
variation with apparatus.