Because It's Not There: Ekphrasis and the Threat of Graphics in Interactive Fiction
The genre of interactive fiction has enjoyed increasing critical attention over the past
few years, particularly since the publication of Nick Montfort's
Twisty Little Passages: An Approach to Interactive Fiction.
[1]
According to Eric Eve's definition, an interactive fiction is “a turn-based program driven by textual input from
the player, responding with output that is principally or wholly textual, and
involving a parser and a world model”
[
Eve 2007, ¶1]. In other words, IF is a program that (1) simulates a diegetic world containing
various spaces and objects (the world model), (2) presents that world to the user/player
through the medium of unillustrated or sparsely illustrated text, and (3) permits the user
to interact with its simulated world by inputting textual commands. IF, then, is
distinguished from other genres of video games by its lack of images, and from other forms
of recombinatory or procedural textuality by its inclusion of a world model.
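The three elements of this definition can be made concrete with a minimal sketch in Python. It is not drawn from any actual IF system: the two rooms, their texts, and the function names parse and step are illustrative assumptions of my own, and real authoring systems are far more elaborate.

```python
# Illustrative sketch only: a toy world model, parser, and turn function.
# All room names, texts, and function names are hypothetical.

WORLD = {
    "kitchen": {"description": "You are in a kitchen. A door leads north.",
                "exits": {"north": "garden"}},
    "garden": {"description": "You are in a garden. A door leads south.",
               "exits": {"south": "kitchen"}},
}


def parse(command):
    """Split textual input into a (verb, noun) pair; the noun may be empty."""
    words = command.lower().split()
    if not words:
        return ("", "")
    return (words[0], " ".join(words[1:]))


def step(location, command):
    """Run one turn: read textual input, consult the world model, return text."""
    verb, noun = parse(command)
    room = WORLD[location]
    if verb == "look":
        return location, room["description"]
    if verb == "go":
        if noun in room["exits"]:
            new_location = room["exits"][noun]
            return new_location, WORLD[new_location]["description"]
        return location, "You can't go that way."
    return location, "I don't understand that."


if __name__ == "__main__":
    here = "kitchen"
    for typed in ["look", "go north", "go west"]:
        here, reply = step(here, typed)
        print("> " + typed)
        print(reply)
```

The dictionary stands in for the world model, parse for the parser, and the returned strings for the textual output; nothing in the sketch draws an image.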
Up to this point, IF has typically been examined from the viewpoint of its textual and
programmatic aspects. For Montfort and others, IF descends from the canonical traditions
of riddle-making and ergodic textuality and participates in the contemporary movement of
electronic literature. According to these claims, the value of IF for scholarly study lies
in what it tells us about textuality, literariness, and the transformations of both in the
digital era. The existing critical discourse presents IF as a primarily textual,
procedural and ludic phenomenon: as an art form or communicative medium which is composed
of verbal signifiers that are subject to rule-based manipulations, and which has
historically been used to produce games.
[2]
With rare exceptions, critics have neglected the visuality of IF. In this paper I will
explain the necessity of rectifying this neglect, and take tentative steps toward doing
so.
The distinction between the textual and the visual, or between the verbal and the visual
signifier, is impossible to define precisely, because, as W.J.T. Mitchell argues, such
distinctions are always already political: “Every theoretical answer to the questions,
What is an image? How are images different from words? seemed inevitably to fall back
into prior questions of value and interest that could only be answered in historical
terms”
[
Mitchell 1990, 3]. For the narrow purposes of the present analysis, we might define a textual
signifier as a sign whose visual appearance is not directly linked to its signifying
value. For example, the visual appearance of the letter P can vary, to a certain
predetermined extent, without altering its semantic value. Similarly, a novel can be set
in a variety of typefaces while still being understood as the “same”
text, and a computer program will carry out the same processes regardless of the font in
which it is written. A visual signifier or image, by contrast, is one whose semantic or
affective value is linked directly to its specific visual appearance, including its
material embodiment and/or its phenomenological effect on the viewer. I will use the term
image to refer interchangeably to “real” images and
mental visualizations. While “pictorial images are inevitably conventional
and contaminated by language”
[
Mitchell 1990, 42], the image at least tries to claim that its
meaning is contingent on
its physical appearance. “The image is the
sign that pretends not to be a sign, masquerading as (or for the believer, actually
achieving) natural immediacy and presence”
[
Mitchell 1990, 43]. Obviously, this word-image distinction is
as problematic and open to critique as any such distinction; see, for example, [
Drucker 2002, 154–160] for an argument that something
is
indeed lost when the materiality of a letter is altered. I claim merely that this
distinction represents a commonsensical understanding of what distinguishes words from
images. It reflects the way in which IF critics typically understand these terms when they
don't interrogate them further.
Critics have typically paid little attention to the visuality of IF — which includes both
its use of actual visible signs, and the visual images it may evoke in the player's mind.
This may seem hardly surprising since, by the second element of the above definition, IF
consists mostly or entirely of textual signifiers and makes limited use of images. In the
present paper, however, I will suggest that interacting with IF is in fact a visual
experience in crucially important ways, and that IF therefore has important things to
teach us about the fate of the visual aspects of verbal signifiers in the digital era.
Without denying that IF participates in various traditions of potential literature and
ludic textuality, as Montfort and others suggest, I want to propose here that IF is also
an heir to equally longstanding traditions of ekphrasis and of visual prose. As such, IF
poses questions of the relation between descriptive text and readerly visualization that
go back as far as Homer's description of the shield of Achilles — though by virtue of its
ergodic nature, IF also significantly transforms those questions. By viewing IF as a
visual-textual phenomenon, we can improve our understanding of the transformation of
visual prose and readerly visuality in the digital era.
Moreover, a focus on the visuality of IF can improve our understanding of how the genre
defines itself. A recurring concern of ekphrastic poetry is the definition of the relation
of poetry to painting and, more recently, to still photography and film. As Mitchell
argues, ekphrasis is the genre in which text (in the narrow sense given above) confronts
its other: “Ekphrastic poetry is the genre
in which texts encounter their own semiotic ‘others,’ those rival,
alien modes of representation called the visual, graphic, plastic, or
‘spatial’ arts”
[
Mitchell 1994, ¶9]. This argument expands on James Heffernan's
reading of ekphrasis as paragonal — that is, as enacting a competitive struggle between
word and image. Elizabeth Bergmann Loizeaux suggests, by contrast, that ekphrasis may also
be motivated by “such modest, and profound, feelings as
companionship or friendship, the terms in which poets often describe their ekphrastic
motives”
[
Loizeaux 2008, 15]. Under either model, however, a central drive behind ekphrasis is the desire to
define poetry or “textual” art itself by contrast to its other. By
directly addressing the image, poetry makes claims for what it can do that the image
can't, and/or asks how it can do what images seem capable of doing more effectively.
This task becomes especially pressing in the present cultural moment. Ekphrastic
literature has perhaps always been both fascinated and repelled by the apparent mimetic
superiority of images over text. As Murray Krieger argues, ekphrasis entails “the defensive concession that language, as arbitrary
and with a sensuous lack, is a disadvantaged medium in need of emulating the natural and
sensible medium of the plastic arts,” which exists in an ambivalent relation to “the prideful confidence in language as a
medium privileged by its very intelligibility”
[
Krieger 1992, 12]. However, the more images advance in both ubiquity and mimetic power, the more
unequal the terms of this relation become. Loizeaux observes that twentieth-century poets'
interest in ekphrasis arises from ambivalent reactions to the growing cultural importance
of the image:
The widespread presence of ekphrasis in
twentieth-century poetry can be understood as both a response to and a participant in
what W.J.T. Mitchell has called “the pictorial turn” from a culture of words into a
culture of images that began in the late nineteenth century with the advent of
photography and then film, and has accelerated since the mid twentieth century with
the invention of television and, now, digital media. Excited — and haunted — by a
sense of images' increasing power in western culture, poets have taken up ekphrasis as
a way of engaging and understanding their allure and force.
[Loizeaux 2008, 3–4]
At the same time that images have attained unprecedented cultural power, poetry has
now “further lost popular readership and its
significant social role”
[
Loizeaux 2008, 6]. Explicit confrontation with the image now becomes a way of justifying the continued
appeal, if not the very existence, of poetry itself.
Similarly, IF authors and critics feel a need to distinguish IF from graphical video
games in order to explain why IF should continue to exist today, despite its apparent
commercial and technological inferiority to graphical video games. Graphics both threaten
and fascinate IF authors in much the same way that paintings both threaten and fascinate
poets.
As an example of the study of IF from a visual perspective, in this essay I offer
readings of two recent works of IF that represent opposing conceptions of the genre's
visual aspects. My first text, Nick Montfort's
Ad Verbum
(2000), goes further than perhaps any other work of IF in stressing the genre's textual
properties at the expense of its visual properties. In calling attention to the textual
nature of the IF interface and of the player's input,
Ad
Verbum defines itself as a purely verbal artifact. My second text, Emily Short's
City of Secrets (2003), seeks instead to accentuate its own
visuality by providing evocative descriptions accompanied by abstract imagery. Yet the
mode of visuality that this game proposes is affective, evocative and phantasmal, rather
than vivid and immediately present. This game proposes that IF can be a visual experience,
but that its visuality differs in significant ways from that of the graphical video game.
Though these games approach visuality in very different ways, a central question for both
games is whether and how the visual properties of text can compete with those of more
mimetic forms of imagery. This, I would argue, is as crucial a question for interactive
fiction as it is for ekphrastic poetry, because it touches upon the larger question of
what happens to less explicitly transparent forms of visuality and textuality at a time
when transparent forms of visuality seem to have attained a position of cultural
dominance. As I will argue, IF, like ekphrastic poetry, offers visual experiences which
are indirect, phantasmal, and dependent on the player's imagination. How can such visual
experiences compete with the transparent visual experiences offered by media like computer
games and CG film? Do we still want or need such visual experiences, and if so,
why?
[3] The two games
I'll be discussing represent two possible answers to these pressing questions.
Toward a Theory of IF Visuality
For most IF critics, IF is a verbal, textual and literary medium whose closest affinities
are with the tradition of ergodic textuality that extends from the I Ching and the
Exeter Book, through the Oulipo and Cortázar, to hypertext
fiction. On this assumption, the visual aspects of IF, if any, are usually ignored. Espen
Aarseth, for example, treats IF as “a new type of literary artifact”
[
Aarseth 1997, 107]. His reading of the Infocom game
Deadline considers
only its literary and ludological aspects. Montfort, the leading authority on the genre,
has equally little to say about its visual qualities. According to the historical
narrative he provides, the antecedents of IF are textual genres, including riddles and
Oulipian potential literature [
Montfort 2005, 37, 65]. The major
exception to this neglect of IF’s visuality is Dennis Jerz’s article “Somewhere Nearby is Colossal Cave,” which compares the geography of Will
Crowther and Don Woods’s
Colossal Cave (or
Adventure), usually considered the first work of interactive
fiction, to the geography of the real cave on which the game was based. In a photo-essay,
Jerz juxtaposes Crowther's room descriptions with photographs of the real-world locations
on which those descriptions are based. However, Jerz's stated goal here is to “establish that Crowther's original was not only
faithful to the geography of the real Colossal Cave, but was also a fantasy
remediation of that site”
[
Jerz 2007, ¶2]. The question that interests Jerz is the extent to which the simulated cave
faithfully reflects the real one. What he leaves unexamined is the general question of
whether the exploration of such simulated spaces can be a spatial and visual
experience.
This critical neglect of the visuality of IF seems unsurprising, given that one might
have difficulty identifying any visual aspects of the genre. What could be the importance
of visuality in a medium which, by definition, includes few or no visual images and relies
primarily on text? If we distinguish visual and textual signifiers according to the
definitions given above, the signifiers that make up a work of IF seem to fall into the
latter category, as their semantic value doesn't depend on their precise visible
instantiation. Contemporary IF interpreters give the player the option of altering details
such as the font, text color and background color, without altering either the precise
text that the program generates, or the code that generates it.
According to a common-sense understanding, the IF work
is the source code,
or perhaps the string of signifiers produced in the execution of that code, but not the
material instantiation of that code. Two players who play the same version of
Ad Verbum using the two sets of interface options shown in figures
1 and 2 are playing the “same” game; the differences in font and color
are purely cosmetic. This is analogous to the commonsensical assumption that the identity
of a literary text resides in the text — the ordered array of signifiers — and that the
material instantiation of those signifiers is merely a cosmetic feature.
[4]
Yet I argue, counterintuitively, that IF may be viewed as a visual and visual-verbal
genre. In the first place, and even before we consider the visual aspects of the IF
interface itself, a central element of nearly all works of IF is the ancient rhetorical
trope of ekphrasis. In ekphrasis, an absent object is described in terms which permit the
reader or listener to visualize that object, to “see” it in the mind's eye as if it were
physically present.
From the reader's perspective, the principal textual components of IF are room
descriptions and object descriptions. The basic purpose of both these types of text is
to enable the player to visualize the phenomena they describe. As Eric
Eve explains, in IF,
the physical world is generally modelled as a
series of discrete locations known as rooms. The totality of rooms in a
given work of IF is often referred to as the map. Such rooms could
correspond to rooms in a building, but they need not and frequently do not[...].
Conceptually, a room is that segment of physical space that is immediately accessible
to the player character.
[Eve 2007, ¶7] (emphasis in original)
In other words, the typical arrangement of space in IF is that the gameworld is
divided or segmented into several discrete, mutually exclusive chunks. Such a spatial
arrangement is not unique to IF. The fifth item of Mark J.P. Wolf's taxonomy of video game
spatial structures is “adjacent spaces displayed
one at a time”
[
Wolf 2002, 59].
[5] In graphical video games dating back to the late 1970s, such as
Superman and
Berserk, “adjacent spaces or rooms are displayed as a
series of nonoverlapping static screens which cut directly one to the next without
scrolling”
[
Wolf 2002, 59]. However, in a text adventure game, by definition, these chunks of space cannot be
represented by onscreen images.
[6] Instead, a block of onscreen text — the “room description” — is
used to make the player aware of the relevant properties of the present room, including
the exits from that room and the objects it contains. The room description might be said
to take the place of the absent graphical image of the room, although this formulation is
anachronistic insofar as IF predates graphical adventure games. Furthermore, the image of
the room is not “absent” in the sense of having been removed or
abstracted, inasmuch as it never existed to begin with.
Consider, for example, the following room description from
Zork I:
The Great Underground Empire:
You are in the living room. There is a doorway to the east, a wooden door with
strange gothic lettering to the west, which appears to be nailed shut, a trophy
case, and a large oriental rug in the center of the room.
Above the trophy case hangs an elvish sword of great antiquity.
A battery-powered brass lantern is on the trophy case. [Blank et al. 1980]
This text names the room and enumerates all the
visible exits from the room (the doorway and the wooden door) and the visible objects in
it (the door again, the trophy case, the rug, the sword and the lantern). These objects
are all “implemented.” That is, they are defined in the game’s source
code as objects that have certain properties, one of which is that the avatar may be able
to interact with them. The description mentions no objects that aren’t implemented
(although room descriptions often do mention such objects), and it does not fail to
mention any visible objects that are implemented.
The qualifier “visible” is necessary because there's a trap door under the rug. This
object is left unmentioned because on first entering the room, the avatar can't see it.
Finding the trap door (by moving the rug) is a puzzle. The player may well know about the
trap door before moving the rug, perhaps from having played the game before, but such
knowledge does not extend to the avatar. If the player inputs a command referring to the
trap door before moving the rug, the game responds, “You can’t see any trap door
here!” In this case the player may be able to visualize the trap door under the rug,
and perhaps the avatar can even imagine that there's a trap door there, if we imagine the
avatar as being capable of having cognitive operations that the player doesn't share.
However, the avatar still can't
see the trap door in the sense that it is not
physically within his or her visual field.
[7] Thus the room description represents what the
avatar, not the player, sees when he or she looks around the room. It is a translation of
the avatar's direct visual experience into words. The player then has the opportunity to
back-translate those words by activating the faculty of readerly visuality — by forming an
imaginary visualization of the things the avatar
sees.
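The distinction between an object that is implemented and one that is currently visible to the avatar can be sketched in Python as follows. This is a hypothetical illustration, not the actual Zork I source: the class, the function names and the wording of the successful responses are my own assumptions, and only the refusal message follows the one quoted above.

```python
# Illustrative sketch only: an object must be implemented AND currently
# visible to the avatar before a command may refer to it.
# Class and function names, and the placeholder responses, are hypothetical.


class Thing:
    def __init__(self, name, visible=True):
        self.name = name
        self.visible = visible


def make_living_room():
    """Build the room's objects; the trap door starts out concealed."""
    return {
        "rug": Thing("rug"),
        "trophy case": Thing("trophy case"),
        "sword": Thing("sword"),
        "lantern": Thing("lantern"),
        "trap door": Thing("trap door", visible=False),
    }


def examine(room, noun):
    """Refuse any noun that is absent or outside the avatar's visual field."""
    thing = room.get(noun)
    if thing is None or not thing.visible:
        return "You can't see any " + noun + " here!"
    return "You see the " + thing.name + "."  # placeholder wording


def move_rug(room):
    """Solving the puzzle brings the trap door into view."""
    room["trap door"].visible = True
    return "Moving the rug reveals a trap door."  # placeholder wording
```

In this sketch the player's prior knowledge of the trap door makes no difference: until move_rug has been called, examine answers exactly as it would for an object that does not exist at all.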
The primacy of
seeing in IF is indicated by the ubiquitous presence of light
sources in
Adventure and games descended from it. Exploration
can't take place in the absence of light, and light source conservation and transport are
common puzzle themes. As Jeremy Douglass observes, this made sense in
Adventure
“as it is highly dangerous to wander around
cave systems in the dark”
[
Douglass 2007, 132], but the need for light sources subsequently became divorced from its original
context and evolved into a generic convention. Games like Taro Ogawa's
Enlightenment (1998), where the player's goal is to extinguish all
the light sources in a room, or Andrew Plotkin's
Hunter, in
Darkness (1999), where exploration takes place via senses other than sight, are
deliberate reactions against this primacy of sight [
Douglass 2007, 134]. The default assumption in IF is that the avatar experiences the gameworld through the
visual faculty, and that the text presents the avatar's visual experience to the
player.
As translations of visual objects in the medium of language, IF room descriptions (and
object descriptions, of which room descriptions are special cases) are examples of
ekphrasis. In current critical discourse ekphrasis is most often defined as the verbal
description of a visual work of art, but Janice Hewlett Koelb argues that this meaning of
the term is a twentieth-century invention, dating back no earlier than Leo Spitzer’s 1955
essay on Keats's “Ode on a Grecian Urn”
[
Koelb 2006, 2]. Ancient rhetoricians defined ekphrasis as “[a] speech which leads one around (periegematikos) bringing the subject matter vividly (enargos) before the eyes”
[
Koelb 2006, 23], whatever that subject matter might be. IF games like
Zork certainly meet this definition. The degree of vividness (or enargeia) with which
the subject matter is “brought before the eyes” is a factor that varies between different
games, and also between different players, since players might visualize the gameworld
more or less vividly depending on how visually inclined they happen to be. On an anecdotal level,
I tend to visualize extensively when I play IF games, but I know other IF players who
claim that they don't do so, and that they understand room descriptions in a conceptual or
propositional way. However, I suggest that IF games must supply the
potential
for visualization in order to provide a meaningful play experience.
[8] What we might call
visualizability is a basic requirement
for traversing most if not all interactive fictions, especially those that include
multiple rooms or rooms with multiple objects in physical contact with each other. In
order to productively interact with the gameworld, the player must possess at least a
minimal understanding of the spatial relationships between the objects in each room and
between the rooms themselves. This requires constructing a mental (or actual) map, which
is, to a substantial degree, a visual operation. As Eve observes, “[t]he totality of rooms in a given work of IF is
often referred to as the map” [Eve 2007, ¶7] (emphasis in original),
and this is “probably because someone designing a work of IF
containing more than a handful of rooms almost certainly needs to draw a map
indicating their spatial relations before attempting to write the game, and players
often find it useful to draw schematic maps as they play”
[
Eve 2007, fn5].
When visualizability breaks down — that is, when room and object descriptions fail to
accurately represent what the avatar can see — meaningful play and the ability to traverse
the game successfully may be impeded.
[9] This may happen, for example, when an object
mentioned in a room description is not implemented. By convention, if the player tries to
interact with such an object, the game responds that the object is not important.
Sometimes, however, the game fails to acknowledge the object’s existence and instead
outputs a standard response to commands that reference nonexistent objects, such as “You
can’t see any such thing” or “I don’t see that here.” This behavior is generally
considered a design flaw or even a bug, as Eve explains: “It looks very clumsy if, having told the player that the room is
decorated with striped wallpaper, the game responds with ‘You see no such thing’
when the player tries to examine it”
[
Eve 2007, ¶15].
[10] Such behavior creates a gap between the visual experience
of the avatar and the verbal experience of the player. Somehow, the player can read about
things the avatar can’t see, and this destroys the illusion that the room description
represents the avatar’s visual experience.
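The convention at issue can be illustrated with a short Python sketch. The data and function names are hypothetical, and the lamp description is invented for the example; the wallpaper is borrowed from Eve's example, and the two fallback messages paraphrase the conventional responses discussed above.

```python
# Illustrative sketch only: things mentioned in a room description but not
# implemented as full objects are registered as "scenery", so that the game
# can acknowledge them instead of denying that they exist.
# All names and the lamp text are hypothetical.

IMPLEMENTED = {"lamp": "A brass lamp, slightly tarnished."}
SCENERY = {"wallpaper"}  # mentioned in the room text, but not interactive


def examine(noun):
    if noun in IMPLEMENTED:
        return IMPLEMENTED[noun]
    if noun in SCENERY:
        # The conventional response: the object exists but does not matter.
        return "The " + noun + " is not important."
    # The standard response for nouns the game knows nothing about.
    return "You can't see any such thing."
```

The flaw Eve describes corresponds to leaving "wallpaper" out of SCENERY, in which case examine falls through to the final message and contradicts the room description.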
An opposite but perhaps more egregious breach of visualizability occurs when the text
fails to mention objects that are implemented and that the avatar should be able to see.
For example, in Dave Baggett and Carl de Marcken's 1994 game
+=3, the avatar must give three objects to a troll as a toll to cross a bridge.
The INVENTORY command reveals that the avatar is holding just one object, and the game's
single room contains no other objects that can be acquired. The solution is to take off
the avatar’s shirt, shoes, pants, socks, glasses and/or underwear, thereby supplying the
missing two items. This solution, though perfectly logical, is cruelly unfair because none
of these articles of clothing are referred to anywhere in the game.
[11] In particular, they aren't mentioned in the responses to the commands
INVENTORY and EXAMINE ME. According to conventions which were well established by 1994,
experienced players would thus conclude that the avatar was wearing nothing important,
because on looking at himself or herself, the avatar sees nothing worth mentioning. The
player would assume that the avatar is wearing clothes (otherwise the avatar's nudity
would be mentioned), but that the clothes have no relevance to gameplay. Objects left
unmentioned are assumed to be below the avatar’s perceptual threshold, and thus either
nonexistent, or irrelevant to the task of traversing the game. The underlying assumption
here is that everything the avatar sees will be translated into descriptive text. In
violating this assumption,
+=3 precludes meaningful play.
Thus, IF is an ekphrastic medium because it consists of texts which describe visual
phenomena and which prompt the reader to create imaginary visualizations of those
phenomena. However, IF differs from other ekphrastic media by virtue of being
prescriptive rather than autotelic. The reading of a static
ekphrastic text, like Diderot's
Salons or Ruskin's
word-paintings, is a self-contained experience.
[12] These texts describe absent visual phenomena in such a way as to permit
the reader to visualize them, but they do not prompt him or her to take any action in
response to these visualizations. The experience of imagining what the text describes is
its own reward. By contrast, when an IF player reads a room or object description, he or
she is expected to take an action in response (i.e. to do work, hence the term
ergodic). The player is prompted to give commands to the avatar based on
the visual and other information in the description.
My argument, thus, is that ekphrasis is the characteristic mode of visual representation
in IF. During the commercial era of IF (approximately coinciding with the lifespan of
Infocom, from 1979 to 1989), ekphrasis, as a means of visual rendering, had certain
comparative advantages over graphics. Graphical video games predate
Colossal Cave by at least 15 years, but these games ran on mainframes or
dedicated arcade machines. The creation of sophisticated graphics was beyond the
technological capabilities of contemporary home computers. Displaying text was much less
labor-intensive. For example, the first commercially successful portable computer was the
Osborne 1, released in 1981. This computer had a monochrome screen which was incapable of
displaying bitmap graphics [
Wikipedia 2009]. On such a platform, a visual
depiction of a building with keys, a brass lamp, food and water on the ground would have
been out of the question. Text made it possible to “show” visual
phenomena that could not have been depicted with the graphic resources available.
Furthermore, text was far more cross-platform than graphics. Infocom games were designed
for the Z-Machine, a “software computer [which]
could be implemented on many different platforms, including almost all of the popular
microcomputers in the United States during the 1980s” including business machines
as well as dedicated gaming machines [
Montfort 2005, 126]. Since all of
these computers were capable of displaying text, all the Infocom games could be ported to
any platform at once simply by writing a new interpreter for that platform. The use of
graphics, by contrast, would have posed an insurmountable obstacle to such cross-platform
availability.
[13]
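The principle behind this portability can be suggested by a toy Python sketch in which a single platform-independent story is run through interchangeable text-output routines. The instruction names, the story text and the two "platforms" are invented for the example and bear no relation to the real Z-Machine instruction set.

```python
# Illustrative sketch only: one platform-independent "story file" executed
# through small platform-specific output routines. The opcodes, story text,
# and platforms are hypothetical, not the real Z-Machine design.

STORY = [
    ("print", "You are in a small room."),
    ("newline",),
    ("print", "There is a lamp here."),
]


def run(story, write):
    """Interpret the story, sending all text through the platform's `write`."""
    for instruction in story:
        opcode = instruction[0]
        if opcode == "print":
            write(instruction[1])
        elif opcode == "newline":
            write("\n")
        else:
            raise ValueError("unknown opcode: " + opcode)


def make_plain_terminal(buffer):
    """A hypothetical platform with an ordinary text display."""
    return lambda text: buffer.append(text)


def make_uppercase_terminal(buffer):
    """A hypothetical platform whose display only has capital letters."""
    return lambda text: buffer.append(text.upper())
```

Supporting a further platform in this sketch means writing one more output routine; STORY itself is never touched, which is the sense in which a text-only game travels more easily than a graphical one.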
For these and other reasons, the use of ekphrasis rather than graphics made the
commercial success of IF possible. According to the standard view of the genre’s history,
however, IF's reliance on text was also the cause of its commercial decline.
[14] Over the
course of the 1980s, as the graphical capabilities of home computers advanced, the new
genre of the graphical adventure gradually rendered IF obsolete.
[15] According to
Espen Aarseth, this was a natural succession because graphics, compared to ekphrasis, are
a naturally superior mode of visuality: “Images, especially moving images, are more powerful representations of spatial
relations than texts, and therefore this migration from text to graphics is natural
and inevitable”
[
Aarseth 1997, 102].
[16] By Aarseth's logic, the purpose of a game is to serve as a transparent
window into an imagined space. According to what Bolter and Grusin call the logic of
transparency [
Bolter & Grusin 2003], the game seeks to erase its own materiality and
present the player with a vivid, sensuously present experience of existence in another
world. For this purpose to be fulfilled, the gameworld must be presented with maximum
visual richness. Clearly games that translate the avatar's visual experience into text do
all these things less effectively than games that display the avatar's visual experience
onscreen.
The assumption here is that video game history follows a teleological progression from
lesser to greater transparency. IF becomes commercially unviable because it represents an
earlier stage in this progression. For some authors, this is only natural: the fact that
computer graphics have outstripped the capacities of IF is cause for celebration. An
example of such a view is Julian Dibbell's dismissive description of
Adventure as an inferior precursor to
Myst: “It's hard to believe that that world once
represented the high frontier of computer gaming. Where players of latter-day quests
like
Myst point-and-click their way through complex
graphical environments of an almost liquid radiance […]
Adventure was strictly hunt and peck”
[
Dibbell 2001]. Other authors characterize the gaming industry's ideology of transparency as
unfortunate, and describe IF nostalgically as having been sacrificed on the altar of
progress. Aarseth regrets that the text adventure game, a “young, vigorous, if somewhat bland tradition
of textual entertainment [...] was quickly overrun by the entertainment market”
[
Aarseth 1997, 128]. More recently, Andy Klein began a 2005 article on IF by writing, “Only once in my life have I seen a wonderful
medium effectively wiped out by new technology”
[qtd. in Douglass 2007, 21–22].
Yet interactive fiction still exists today, when the graphical capabilities of personal
computers are far more sophisticated than at the time of IF's commercial collapse. New IF
games are now produced by independent hobbyists and artists rather than by commercial
firms. However, for contemporary IF authors graphics represent an elephant in the room, a
topic that may not be directly discussed but that can't be ignored. Authors of IF in the
post-graphical era cannot avoid the question of why they should bother, since graphics are
now better than IF text at doing what IF text does, for which reason IF will probably
never again be a commercially viable medium. By way of answering this question, IF authors
and critics have sought to claim for IF another type of legitimacy, emphasizing its
aesthetic and scholarly appeal rather than its commercial appeal. If IF can't be a popular
and commercial medium, it can be an auteurist and artistic medium. But in order to prove
the aesthetic legitimacy of IF, it becomes necessary to show that IF is a medium
independent of the graphical video game, because IF text has properties that graphics
lack.
Where contemporary IF authors and critics differ is in their conception of the precise
nature of these distinctive properties of IF. Within contemporary IF work we can
distinguish two very different approaches to defining the specificity of the genre. The
first approach is to argue that IF is a linguistic and anti-visual medium.
Ad Verbum: Interactive Fiction and Representational Friction
One way in which IF responds to the seemingly superior representational capabilities of
graphics is by ignoring ekphrasis almost entirely and foregrounding the textual and verbal
qualities of the IF interface. The paradigmatic example of this approach is Nick
Montfort's 2000 game Ad Verbum.
The player's goal in this game is to remove all the objects from a house belonging to the
Wizard of Wordplay. Nearly all of the game’s puzzles must be solved by entering commands
according to various linguistic constraints. Exploiting Bolter and Grusin's logic of
hypermediacy, this game forcibly reminds the player of its nature as a text-based computer
program, rather than a window into a simulated world. This is evident immediately in the
introductory text of the game:
With the cantankerous Wizard of Wordplay evicted from his mansion, the worthless
plot can now be redeveloped. The city regulations declare, however, that the
rip-down job can't proceed until all the items within have been removed.
That's what the demolition contractor explains to you, anyway, as you stand eagerly
on the adventurer's day labor corner. Once he learns of your penchant for
puzzle-solving and your kleptomaniacal tendencies, he hires you for the job. You hop
into the bed of his truck, type a few Zs, and arrive at the site, eager … [Montfort 2000]
“Z” is the standard abbreviation for the
“wait” command, so the last sentence erases the boundaries between
player and avatar, between typing commands and performing actions. Throughout the game the
player is consistently reminded that he or she is not exploring a diegetic world, but
typing commands in response to verbal descriptions. Some of
Ad
Verbum's puzzles in fact involve no interaction with objects or spaces, only
manipulation of language. For example, on the first floor of the mansion, the player
encounters a little boy, Georgie, who refuses to give up his toy dinosaur unless the
player can name more dinosaurs than Georgie can. Georgie knows an arbitrarily large number
of real dinosaur names, so the solution is to input fake dinosaur names — i.e. nonsense
words ending in “saur” or “saurus” — until Georgie
gets frustrated and gives up. Since all the player has to do to solve this puzzle is think
of nonsense words, it doesn't matter whether or how the player visualizes the space where
Georgie is located.
Other puzzles in the game do force the avatar to interact with rooms and objects, but in
order to make the avatar do so, the player has to satisfy certain linguistic constraints.
Most notably, the game contains several “constrained rooms” where the
output text consists entirely of words starting with a specific letter. For example, at
the bottom of
figure 4 we see the initial room description
of the “Wee Wardrobe.”
This same constraint applies to the player's input. Obvious solutions like TAKE
WEAPON don't work; if the player enters a command containing a word that doesn't start
with W, the parser replies, “Wha? Wha? Withhold wrong words. Write wholesomely.” The
puzzle, therefore, is to command the avatar to take the two objects in the room and then
leave, using only words beginning with W.
[17] This constraint applies even to nondiegetic commands like HINT,
SAVE, RESTART, RESTORE and QUIT, and on first entering a constrained room, the player must
read a warning alerting him or her to this fact.
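The constraint described above can be illustrated with a minimal sketch. This is not Montfort's code; the function names and the tokenizing rule are assumptions made for illustration, and only the refusal message is taken from the game as quoted above. The sketch shows a filter that rejects any command, diegetic or not, containing a word that does not begin with the room's letter.

```python
import re

# Refusal text as quoted from the game; everything else here is hypothetical.
REFUSAL = "Wha? Wha? Withhold wrong words. Write wholesomely."


def offending_words(command: str, letter: str = "w") -> list[str]:
    """Return the words in `command` that do not start with `letter`."""
    words = re.findall(r"[A-Za-z']+", command)
    return [w for w in words if not w.lower().startswith(letter.lower())]


def respond(command: str, letter: str = "w") -> str:
    """Refuse any command that breaks the room's first-letter constraint."""
    if offending_words(command, letter):
        return REFUSAL
    # A real game would now hand the command to its parser and world model.
    return "(command passed on to the parser)"
```

Under these assumptions, `respond("TAKE WEAPON")` and `respond("QUIT")` are both refused, while an input such as `respond("WREST WEAPON")` passes the filter. Whether the actual game understands that particular verb is a separate matter, which is precisely the difficulty the puzzle exploits.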
The constrained rooms call attention to the fact that the world of this game is a
linguistic construct, a tissue of words and letters. Of course, this is true in a sense of
the diegetic world of any IF game: the white house in
Zork
doesn't exist independently of the language that describes it.
[18]
Ad Verbum’s innovation is to make explicit the linguistic
nature of the IF gameworld. Since the spaces of
Ad Verbum are
called into being by language, it's logical that these spaces can have linguistic
properties, like the property of only containing objects that start with W. However, by
virtue of being defined in purely verbal terms, these spaces resist translation into
images. What would a room look like if it contained only things beginning with S?
The first letter of an object’s name is not a property which can be perceived by looking
at it, especially if the object has various possible names. One can imagine a space based
on the physical form of a letter — for example, an S room where the walls, ceiling and
furniture have sinuous, snaky curves, or a V room full of sharp, severe triangles. But
there is no suggestion that the constrained rooms in
Ad
Verbum are organized according to the visual properties of their corresponding
letters. These are entirely linguistic spaces, and the language of which they are composed
is in a sense stripped of visuality. In
Ad Verbum, a letter
is defined purely in relational terms, as a member of a set with 26 members. The question
of the physical instantiation of letters is ignored.
[19]
If descriptions in IF are translations of what the avatar sees into words, the Ad Verbum avatar sees things that can't be seen — for
example, what letter an object starts with, or whether it contains the letter E. This
avatar’s visual experience is fundamentally anti-visual. So the game frustrates the
player’s ability to imaginatively reproduce the avatar’s visual experience. If the things
the avatar “sees” are unseeable, the player can't imagine what it's
like to see those things. This forcibly reminds the player that IF is at bottom a
linguistic and programmatic rather than a spatial experience.
Montfort thereby demonstrates that the world represented in an IF game is dissimilar to
the material, namely language, that represents that world. This is what James Heffernan, a
scholar of ekphrastic poetry, describes as the trope of representational friction, in
which the ekphrastic poem calls attention to the artificiality of the artwork it describes
[
Heffernan 2004, 4, 18–19, 37]. For example, Homer's description of
the shield of Achilles includes the statement that “the earth darkened behind [the ploughmen]
and looked like earth that has been ploughed / though it was gold ”
[
Heffernan 2004, 19]. At the same time that Homer celebrates the amazing power of art to reproduce
reality, he reminds the reader that the work of art is ontologically dissimilar to the
reality it reproduces. Homer celebrates “the
wonder [...] of graphic verisimilitude” specifically by telling the reader “that what appears on the shield is not the
ploughed earth itself, but gold that has been somehow made dark enough to resemble
it”
[
Heffernan 2004, 19]. Because the shield is made of gold, not dirt, it can represent dirt only via
artifice and convention. By analogy, because poetry is made of language and not images, it
can represent images only through a similar artifice. Representational friction, thus, is
a trope that foregrounds the dissimilarity between the descriptive poem and what it
describes. It reminds the reader that the poem is a poem, not a painting or sculpture:
that the reader is not beholding a physically present picture, but imagining a picture
based on his or her interpretation of graphic signifiers. Representational friction reminds
the reader of the nature of the activity he or she performs in reading a poem. It defines
the specificity of poetry as distinct from painting and sculpture.
But of course IF players perform an activity that readers of poetry typically don't. In
IF, the player does more than interpret signifiers; he or she also enters commands in
response to those signifiers. These commands produce changes, often of a permanent nature,
in the diegetic gameworld, and thereby determine what signifiers will be given for
interpretation next. Montfort also reveals the verbal nature of the process of entering
commands. The standard conceit is that when the player types a command, this is equivalent
to, and can be visualized as, the avatar performing that action. When I type “take
lantern” and press the enter key, I may imagine that my avatar reaches out his or her
hand and takes the lantern. Of course, what actually happens is that the game program
interprets the words “take lantern” as an action, then checks whether the action
can succeed or not in the present condition of gameplay. If it can succeed, the lantern is
moved from its current position and added to the player's inventory [
Nelson 2001, 87]. But when Montfort places constraints on the player's ability to
enter commands, he reminds the player that commands don't actually involve interaction
with objects in or attributes of a diegetic world; all they involve is the generation of
signifiers. One puzzle requires the avatar to acquire four books using commands that
follow the linguistic constraints used in the text of the books. For example, the
“dust casing” does not accept commands that include the letter E, and
the “abecedarian book” only accepts commands in which the first word
starts with A and the second word starts with B. If the player tries to take these books
using inappropriate commands, “a mysterious force holds the book to the … shelves.”
Possible solutions include ACQUIRE BOOK and LIFT CASING.
[20]
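The process described in this paragraph, in which a command is interpreted, checked against the state of the world model, and then checked against a purely verbal constraint, can be sketched as follows. This is a simplified illustration and not the game's implementation: the class, the function names, the synonym list, and the reduction of each book to a single noun are all assumptions. The two constraints and the refusal message follow the description above.

```python
from dataclasses import dataclass, field

# Refusal text as quoted above, with its elision left in place.
REFUSAL = "a mysterious force holds the book to the … shelves."


def has_no_e(command: str) -> bool:
    """Constraint of the "dust casing": no letter E anywhere in the command."""
    return "e" not in command.lower()


def is_abecedarian(command: str) -> bool:
    """Constraint of the "abecedarian book": first word starts with A, second with B."""
    words = command.lower().split()
    return len(words) >= 2 and words[0].startswith("a") and words[1].startswith("b")


# Hypothetical synonym list: all four verbs denote the same world-model action.
TAKE_VERBS = {"take", "get", "acquire", "lift"}
CONSTRAINTS = {"casing": has_no_e, "book": is_abecedarian}


@dataclass
class World:
    """A toy world model: objects are either on the shelf or in the inventory."""
    shelf: set = field(default_factory=lambda: {"casing", "book"})
    inventory: set = field(default_factory=set)


def try_take(world: World, command: str) -> str:
    words = command.lower().split()
    if len(words) != 2 or words[0] not in TAKE_VERBS:
        return "I don't understand that."
    noun = words[1]
    if noun not in world.shelf:
        return "You can't see that here."
    # The action is identical for every verb; only the wording is tested here.
    if not CONSTRAINTS.get(noun, lambda c: True)(command):
        return REFUSAL
    world.shelf.remove(noun)
    world.inventory.add(noun)
    return "Taken."
```

In this sketch, TAKE BOOK and GET CASING are refused while ACQUIRE BOOK and LIFT CASING succeed, even though all four commands map to the same change in the world model. The constraint check inspects only the string the player typed, never the state of the simulated world.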
In the context of obtaining a book, the words TAKE, GET, ACQUIRE, and LIFT all describe
the same action. When I pick up a book, I can use any of these verbs interchangeably to
describe what I'm doing. But in Ad Verbum, the “mysterious
force” that governs the books will accept only some of these actions and not others.
The force allows the avatar to rip the casing or uproot the copybook but not take or get
them, merely because the former two actions satisfy the constraint and the latter two
don't, even though the four actions are not semantically distinguishable and can all be
visualized in the same way. Here Montfort is deliberately subjecting the player to the
notorious “guess the verb” situation, where the player knows what he or
she wants the avatar to do, but has difficulty finding the specific verb that tells the
avatar to do it. When this phenomenon occurs in games, players typically see it as a design
flaw, because it violates the logic of transparency. In real life, if one knows what one
wants to do and if one is physically capable of doing it, one can simply do it. In a
graphical video game, the player can just press the button that makes the avatar take the
desired action. So why should it be any different in an IF game? Though this is a
rhetorical question, Montfort answers it by arguing that an IF game does not follow the
procedures of real life, nor those of a graphical video game. An IF game is neither the
real world nor a transparent representation thereof, but rather a computer program in
which both the input and the output consist entirely of text.
In Ad Verbum, representational friction and guess-the-verb
puzzles ultimately serve to define the specificity of IF as opposed to graphical video
games. Since IF is clearly incapable of competing with graphical video games in terms of
commercial appeal, Montfort seeks to claim for IF another type of legitimacy in terms of
aesthetic or academic appeal. Montfort does this by stressing that the visual and spatial
aspects of IF are metaphorical, not literal, because IF is a fundamentally linguistic
medium. IF is an independent and aesthetically legitimate medium because of, not despite,
its lack of graphics. Contemporary IF is not an atavistic throwback to the era before the
graphical video game, but an artistic medium in its own right. By situating IF as a
textual medium, Montfort is also able to connect it to earlier, more canonical forms of
ludic textuality. Thus, Ad Verbum contains explicit
references to famous constrained texts like Walter Abish's Alphabetical Africa and Georges Perec's La
Disparition. In Twisty Little Passages, Montfort
continues this project by arguing that IF has important similarities to the literary genre
of the riddle.
Montfort doesn't refute the allegation that computer graphics are more effective in some
ways than words at representing the contents of fictional spaces. He tacitly accepts this
critique and suggests that the true strength of IF lies elsewhere, in its ability to
manipulate the material of language, an ability that graphical video games lack. If the
graphical video game is a visual medium, then IF is a textual medium. Visual effects are
the proper province of graphical games, while textual effects are specific to IF.
A similar strategy is at work in many other more recent games that exploit the textual
properties of the IF browser, although I don't know of any other game that does this to
the same extent as
Ad Verbum. For example, Jeremy Freese's
Violet, the winner of the 2008 Interactive Fiction
Competition, features a parser which is personified as the avatar's eponymous girlfriend.
This effect is possible in IF because the parser is simultaneously the voice of a narrator
and the means by which the diegetic world is presented to the player. The parser not only
narrates the events of the gameworld, but actually produces that world for the player. In
graphical video games, these two functions are separated. If
Violet were a graphical game, Violet would be no more than what André
Gaudreault calls a delegated narrator (see [
Gaudreault & Barnard 2009, 135–146]).
It would be difficult to create the illusion that Violet was actually creating the
gameworld by speaking about it.
Moreover, if IF is an independent artistic medium in its own right, rather than an
atavistic precursor of graphical video games, then it becomes reasonable to use IF for
purposes other than gaming. This is the idea behind the genre of puzzleless IF, which uses
IF scripting languages but often abandons the elements of spatial exploration and
puzzle-solving. The classic example of puzzleless IF is Adam Cadre's Photopia (1998), and the genre also includes sophisticated chatbots like Emily
Short's Galatea (2000).
But Montfort's strategy of stressing the linguistic and anti-visual properties of IF is
only one way of arguing for the aesthetic legitimacy of the genre. Another approach is to
argue that IF is in fact a visual genre, but that it possesses a type of visuality which
is in some degree unavailable to graphical games. By coincidence, one of the key advocates
of this approach to IF is the aforementioned Emily Short.
[21] Eve probably chose to use wallpaper
as an example because of Andrew Plotkin’s game
Delightful
Wallpaper, in which the avatar is an incorporeal ghost, and is thus unable to
interact with the titular wallpaper or with any other object. Nonetheless, Plotkin
includes many implemented objects in the game and goes to the trouble of including
descriptions for all these objects. According to one reviewer, it was precisely these
descriptions that made Plotkin's game more than a mere puzzlefest [
Bond 2006].
Affective Ekphrasis in City of Secrets
City of Secrets (2003) is a game about spaces. For most of
this game the avatar's goal is simply to explore the setting of the game, known only as
the City, in order to find a mysterious woman named Evaine. The game's puzzles are mostly
about overcoming barriers to further exploration, and the primary reward the player gets
for solving these puzzles is the ability to explore previously unseen spaces. The City
itself is inherently worth exploring because it's a tourist destination, a place of great
historical and cultural importance. Short's innovation in City of
Secrets is to encourage the player to see this space rather than simply read
about it. Short's descriptive language is precise and detailed, but also deliberately
limited in terms of what it reveals. By thus limiting the visual
information she provides, Short encourages the player to supply this information by
exercising the faculty of readerly visuality.
The descriptions reproduced in
figure 5 accomplish the
primary practical tasks of an IF room description: they enumerate the exits from each room
and the implemented objects in them, thereby making this part of the game's geography
visualizable. However, the descriptions are in no way ultraprecise; they provide
insufficient information to permit the player to visualize exactly what these spaces look
like. Short neglects to describe the architectural style of the buildings or to specify
the number of buildings or the things depicted in the statues. This omission of detail is
a deliberate choice on Short's part, since she has also written descriptions which are
obsessively detailed. Her 2000 game
Metamorphoses contains a
number of murals which can be both examined and looked at through a magnifying glass,
revealing additional details which can themselves be examined. Short comments, “In writing
Metamorphoses I did think of what I was doing as specifically ekphrasis,
and that’s one reason there are so many layers of detail within the scenery,
especially the murals: I was trying to capture a little of the sense, found in Ovid
and Catullus, that worked pictorial objects have astounding levels of detail”
[
Short 2009].
What happens instead in
City of Secrets is that the omission
of details from the text creates gaps in the player's visualization of the scene, gaps
which the player then has the opportunity to fill. As Wolfgang Iser has argued, filling in
gaps in a text is one of the major cognitive operations performed by readers. Iser
characterizes this process as a propositional or linguistic one, but Peter Schwenger, a
theorist of readerly visuality, suggests that readers perform this process with images as
well as words. Schwenger notes that Iser “speaks of syntheses below the level of
consciousness, which he calls ‘passive syntheses’. Of such syntheses the basic
element is the image”
[
Schwenger 1999, 57]. Another way to theorize this process is through Scott McCloud's concept of
closure, the process whereby the reader of a comic creates mental images that fill the
gaps (or gutters) between the comic's panels [
McCloud 1993, 66–68]. If
the concept of closure was designed to account for texts that consist of sequences of
images, then it applies to the IF text insofar as IF, as encountered by the player,
involves precisely such a sequence.
[22] As explained above,
in playing IF the player is presented with a series of visual experiences translated into
verbal terms. Closure is what sutures the gaps in this sequence of disparate images.
Schwenger and Iser's visual “filling in,” which operates when we read a verbal
narrative, is closely analogous to McCloud's “closure,” which operates when we read a
narrative composed of images. Both these modes of reading involve a synesthetic interplay
between the viewer's imagination and the signifiers of the text, whether these signifiers
are defined as visual or verbal in nature. Indeed, the similarity of “closure” to
“filling in” suggests that these two modes of reading are less distinct than they
may appear — that the decision of whether to define a narrative as visual or verbal is to
some extent an arbitrary decision, one which is influenced by cultural politics as well as
by the phenomenology of the reading experience. Even if we choose to define IF as a genre
that employs purely verbal means, the experience of playing IF may not be all that
different from the experience of playing a game that employs (ostensibly) visual
means.
Playing IF, then, could be as much a visual experience as playing a graphical video game.
However, that doesn't rule out the possibility that these two experiences could be visual
in different ways: the visuality of IF might differ from the model of visuality associated
with graphical video games. As early as 1983, Infocom took precisely this position,
arguing in an advertisement that their games “unleash[ed] the world's most powerful
graphics technology,” i.e. the human brain: “We draw our graphics from the
limitless imagery of your imagination — a technology so powerful, it makes any picture
that's ever come out of a screen look like graffiti by comparison.” This argument,
however, still adheres to the logic of transparency: it holds that imagined visuality is
more transparent than graphical visuality and therefore better.
A more nuanced way to distinguish between readerly and graphical visuality might be to
emphasize the personal, subjective or affective aspects of the former. For Schwenger,
reading is necessarily accompanied by a continuous passive process of image generation,
but the reader's preexisting visual inclinations and his or her mental repertory of visual
images affect the way in which he or she concretizes the text's descriptions:
[L]iterature consists of a steady stream of
erased imperatives, according to Elaine Scarry, imperatives that are often
instructions to produce mental pictures. Yet no matter how detailed or precise those
instructions may be, they are never comprehensive enough to override the individual’s
memory bank of images and associations. These play upon the author’s dictated
pictures, an obbligato of the unconscious, of memory and desire. [Schwenger 1999, 4]
Even if Short's room descriptions were more
detailed than they are, they would be unable to supersede the reader's preexisting mental
pictures of analogous rooms; for example, however Short described the Sun Court temple, I
would inevitably imagine it as looking like the U.S. Capitol. (By contrast, when I visit a
similar location in a graphical video game — say, the Bevelle Temple in
Final Fantasy X — I see
only what the game designers
want me to see, and I see the same temple as every other player. The way I understand this
visual image is specific to me, but the way I visualize it is not.) What Short does do,
however, is to condition how the player sees whatever it is that he or she sees, to
suggest the affective resonances of the mental pictures that the player may form. Short's
descriptions say less about what precisely the avatar sees than about
how the avatar is affected by what is seen, as Short notes: “With
City of
Secrets, though, it’s true that I was trying to do something a little bit
different [as compared to
Metamorphoses]: to hint at the protagonist’s
perceptual filters by describing styles and trends rather than straightforward
physical detail”
[
Short 2009].
For example, the description of the mosaic in the Sun Court reads, “The mosaic is an elegant job and executed in rich materials, but the
design has a facile modern quality that does not entirely appeal to you.” The
temple is described as “[b]uilt in an old style,
but unworn, unchipped, unpolluted.” Combined with the profusion of illusionistic
artwork in this area of the City, especially the façade-painting, these descriptions
suggest that the Sun Court is an insincere place. It is recognizably less ancient than it
appears to be. This suggests that the City's government, of which this space is the public
architectural symbol, is trying to pass itself off as something it's not. Inasmuch as it
is conditioned by such hints as these, the player's visualization of this space becomes
affectively charged. As a counterpoint to this, here is Short's description of a nightclub
called Scheherazade:
Despite the light that leaks
in through the windows, the place seems to be trying for a dark and anonymous ambiance,
with high-backed booths and wood paneling, a ceiling painted black, and hanging swatches
of brocaded purple velvet. The decorations are mostly allusions to the City's distant
shady past as an outpost of thieves and smugglers on the Vuine.
Most of these
details, again, are not relevant to completing the game, but they assist the player in
creatively visualizing the place. The few details that Short does provide — the black
ceiling, high-backed booths, and purple velvet — hint at what gives this place a “dark and anonymous ambiance,” but the player is
invited to fill in the remaining details in his or her own way. The decorations, involving
thieves and smugglers, suggest why the place is “trying for” such an ambiance: it is
a place of darkness, of secrecy and anonymity, a hideout for outlaws or at least for
people who have something to conceal. But at least this is a place that doesn't seek to
present itself as something it's not.
What all these descriptions do is to condition how the player visualizes the room. They
add an affective dimension to the mental picture of the room that the player involuntarily
creates for himself or herself in response to the textual representation of the room.
This effect is further complicated by Short's limited use of graphics.
City of Secrets includes a frame containing images, located to the
left of the main gameplay window. However, these images are more suggestive or symbolic
than mimetic. They suggest the dominant mood or tonality of the scene the player is
witnessing, rather than showing anything in that scene. Accordingly, Jeremy Douglass calls
the images in this game “ambient illustrations”
[
Douglass 2007, 45]. In
figure 5, for example, we see a stylized
representation of the sun against a field of orange fading into white. This image doesn't
depict anything in the Sun Court, except perhaps the sun symbol on the pavement, but it
suggests the offputting, blinding sunniness of the scene.
[23] What we see here is a complex, synaesthetic interplay
between the images described
in the text and the images that the text
is. The actual images help to shape the player's mental images, at the same
time that the latter inflect the player's interpretation of the former.
This is a text that attends to the way in which text is inescapably a visual phenomenon.
In this context it's worth noting that although City of
Secrets allows the player to change the font, text color and other such options,
the title screen and the left-hand window include text which is not affected by such
changes.
Without speculating on the metaphorical associations of this font, I merely note that it
was chosen deliberately. The player enters this game through the threshold of an image
which is primarily composed of textual signifiers, yet contrary to my commonsensical
definition of text, the precise visual instantiation of these signifiers is
clearly important.
For a certain subset of the game's audience,
City of Secrets
was an even more material and visual experience than it is today. On releasing the game,
Short offered players the opportunity to purchase a special edition of the game that came
with a boxed set of “feelies.” The term
feelies refers to “[m]ultimedia epitexts such as journals,
maps, and artifacts, bundled to illustrate the IF work. Popularized by Infocom”
[
Douglass 2007, 392]. Commercial IF games were physical artifacts — floppy discs packaged in boxes and
sold in brick-and-mortar stores — and the inclusion of feelies further intensified the
physicality of those objects. (Feelies served the additional practical function of copy
protection; games like
Sorcerer and
Leather Goddesses of Phobos were unsolvable without information which was
printed on the feelies, and which, in a pre-World Wide Web era, would have been otherwise
unavailable.) This physical side of the IF experience was lost when IF moved to a digital
model of distribution. Seeing this as an unfortunate development, Short helped to create a
website,
feelies.org, that produced and distributed
feelies for contemporary works of IF:
feelies.org started with a conversation that I
had with some of my friends in the IF community, about how the one aspect of
commercial IF we really missed (as players) was the feelies. Some modern IF comes with
“virtual feelies” — PDF files or fake Websites or whatever that
are distributed in a Zip file with the game — and I like those, but we were also
missing the tangible physical objects.
[Loguidice 2004, n.p.]
The
City of Secrets feelies included such items as a
“[t]ourist guide to the City, including
map, digitally-offset printed by Imagers.com in full color on glossy paper” and a
“[q]uantity of dried liontail in a
labeled plastic bag, contained in velvet and/or satin gift bag from boutique magic
shop.” For players who did not purchase the paper feelies, Short also created a
website for the Southern Light Rail company (this website is now defunct, but has
been cached by the Internet Archive's Wayback Machine). This website prominently features the same
font used in the game's title screen.
The fact that Short paid so much attention to the physical and material aspects of City of Secrets indicates that for her, the visual instantiation
of an IF game is not an irrelevant cosmetic detail. It directly influences the player's
experience of the game, an experience which is visual in multiple senses. The visuality of
City of Secrets results from a collaboration between the
preexisting visual memory of the player and the visual details, verbal and graphical,
supplied by the author, as focalized through the “perceptual filters”
of the avatar — who, unlike the avatars in Zork and Ad Verbum, is a well-defined character with a particular
personality and history. The visual experience of this game depends on a complex and
shifting interplay between the player's visual memory, the details the author provides via
the protagonist-avatar, and the imagetextual aspects of the game itself.
Now I suggest that such a visual experience has little to do with transparent immediacy.
A transparent visual representation, by definition, is minimally mediated; it presents the
visualized scene without distorting filters, so that it looks the way it would if it were
present before the viewer. The goal of Short's language in this game is not to create such
visual representations. In a text-based game, the only way to create such visual
transparency would be to provide a large amount of precise descriptive detail, so as to
permit the reader to imagine exactly what every aspect of the scene looks like. However,
Short argues in her blog post “The Prose Medium and IF” that such “detail for detail's
sake” is unnecessary and potentially harmful in IF, where ekphrasis is
prescriptive, not autotelic.
[24] The purpose of details in IF prose is to give the player
the information he or she needs to complete the game. Players are expected not just to
process the details but to use them as a guide for how to interact effectively with the
game's operations and its diegetic world. Providing excessive detail would be distracting
and tiresome. Short explains, however, that detail can do something else:
Some of the most effective writers of mood
create their effect not with a large number of common details (the flowers are red,
the door is yellow, etc) but with a small number of very particular ones; and I think
that that is especially true in IF. Words in interactive fiction individually carry
more weight than they carry in static prose, if only because of the amount of
attention we demand the player give to each one. […] I think I would find [P.D.
James's descriptions] to be overkill in an IF game. They’d need to be shortened and
focused, because each sentence would do the work of three or four sentences in the
static prose version. In this respect IF is closer to poetry than to conventional
prose: it is worth taking more time to select fewer words, because each one will be
inspected through a jeweler’s loupe.
[Short 2008, ¶19, 20]
Short suggests here that the purpose of details in IF is not to create a vivid,
immediate and sensuously present mental picture of a scene, but to suggest the mood
associated with that scene. It does this by providing sparse but carefully selected
details, which serve the player as building blocks around which a more complex and
personal vision of the scene can be created.
[25] When Short mentions that Scheherazade has high-backed booths, a
dark ceiling, and decorations that show thieves and smugglers, she does more than simply
inform us that these things are present; she also hints at the affective resonance of this
place. She doesn't tell us what precisely this place looks like, but she provides us with
affective lenses that we can apply to our own visualization of the place. Short's goal in
this game is not to match the transparency of graphical video games, but to activate a
mode of visuality which is affectively rather than sensuously vivid. Ekphrasis has been
used for this purpose since ancient times: Quintilian wrote, for example, that lawyers
should use ekphrasis only where “motivated […] by the speaker’s emotional
engagement with and amplification of his client’s plight” (qtd. in [Koelb 2006, 29]).
For ancient rhetoricians, ekphrasis was not a transparent means of visual
representation but a tool for augmenting the emotional resonance of the described scene.
City of Secrets suggests that this effect becomes, if
anything, more potent when the described scene is an interactive one.
In Ad Verbum, the player needs to directly engage with the
verbal properties of IF in order to finish the game. City of Secrets doesn't similarly
require the player to visualize in order to complete
the game (except at the minimal level described above with reference to Zork), but this is
because City of Secrets is a deliberately simple game. As Short writes in the game's ABOUT text,
“This game is meant to be playable even by someone who has never encountered
interactive fiction before, and be a gentle introduction to the genre. It is not
terribly difficult, nor is it possible to die until the very end.” However, her other
works do often require the player to visualize and to do so in an affective and critical
way. In Savoir Faire (2002), a deliberately challenging
adventure game, the player has the magical power to create “links”
between two similar objects, whereby one object takes on the properties of the other or is
affected by events that occur to the other. In order to use this power effectively, the
player has to observe visual (and other) similarities between the two objects, which
may require minute inspection of both. For example, the first puzzle
in the game is to open a locked pair of doors.
[26] The description of the doors
reads, “A pair of white-painted doors that lead into the upstairs corridor of the house.
Each door panel is decorated with the family crest, picked out in ostentatious gold, as
though to warn servants not to wander that direction uninvited.” In a nearby room the
player finds a teapot, whose object description reads, “In order to make the linkages
possible, however, it has been painted a glossy white, and the crest of the family
executed on one side in intricate detail.” The solution to the puzzle is to link the
doors to the teapot, then open the lid of the teapot, causing the doors to open. This
works because the teapot and the doors are both white, openable, and decorated with the
same crest. To notice these similarities, the player has to read the descriptions of both
objects “through a jeweler's loupe.” In doing so, the player may visualize the two
objects; even without visualizing them, however, the player's close reading of
the descriptions is equivalent to the avatar's close examination of the objects.
Solving this puzzle requires engaging in a mental operation in which reading and
looking are inextricably linked. Yet this reading/looking process is not
exclusively goal-directed. At the same time that the object descriptions provide the
player with the information necessary to solve the puzzle, they also help the player to
imagine both the visual appearance and the affective resonances of the objects referenced.
As this example suggests, affective ekphrasis can be a technique of both puzzle-solving
and worldbuilding; using Douglass's distinction, it contributes to both the
“gamelike” and the “narrative” qualities of IF.
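The logic of the linking puzzle can be made concrete with a small sketch of a world model. What follows is a hypothetical illustration in Python, not the code of Savoir Faire itself: the names Thing, link and open_thing, the trait labels, and the rule that three shared traits suffice for a link are all assumptions introduced here. The sketch shows only that the similarities the player must notice in the prose (white, openable, bearing the family crest) can also be treated as properties in the program's model of the world.

```python
from dataclasses import dataclass, field


@dataclass
class Thing:
    """A hypothetical world-model object described by a set of traits."""
    name: str
    traits: frozenset
    is_open: bool = False
    linked_to: list = field(default_factory=list)


def link(a, b, required=3):
    """Link two things if they share enough traits.

    The threshold of three shared traits is an assumption made for this sketch.
    """
    if len(a.traits & b.traits) < required:
        return False
    a.linked_to.append(b)
    b.linked_to.append(a)
    return True


def open_thing(thing, _seen=None):
    """Open a thing and propagate the event to everything linked to it."""
    seen = _seen if _seen is not None else set()
    if id(thing) in seen:  # avoid looping back along a two-way link
        return
    seen.add(id(thing))
    thing.is_open = True
    for other in thing.linked_to:
        open_thing(other, seen)


# Trait labels paraphrase the two object descriptions quoted above.
doors = Thing("doors", frozenset({"white", "openable", "family crest", "gold"}))
teapot = Thing("teapot", frozenset({"white", "openable", "family crest", "glossy"}))
candle = Thing("candle", frozenset({"white"}))  # invented object with too little in common

linked = link(doors, teapot)  # succeeds: three traits are shared
open_thing(teapot)            # opening the teapot also opens the linked doors
```

In this sketch the player's act of close reading corresponds to discovering which traits the two objects share; the program can only confirm or refuse a link that the player has already inferred from the descriptions.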
City of Secrets combines the emotional vividness of visual
prose with the ability to interact with the visualized world through an avatar, a
combination which is perhaps unique to IF. Instead of trying to match the transparent
visuality of the graphical video game, it provides an IF-specific experience of affective
textual visuality. This is a second possible way in which IF can define itself as an
artistically viable medium and not an inferior precursor to the graphical video game.
Ad Verbum and City of Secrets
adopt two opposing strategies for demonstrating the continuing value of IF in a
post-graphical age. Ad Verbum suggests that IF needn't try to
compete with the visuality of graphical games because IF's strengths lie in its nonvisual
aspects. City of Secrets, by contrast, demonstrates that IF
can be visual in a way which may be inaccessible to graphical games. What both games
implicitly argue is that even if IF games can't (or shouldn't) compete with the visual
transparency of graphical video games, the creation of IF games can still be a viable
artistic pursuit. The coming of graphics doesn't kill IF, but it does force IF to
adapt.
To summarize, I have argued that IF is an ekphrastic medium insofar as it provides the
player with a textual translation of the avatar's direct visual experience. Unlike
traditional ekphrastic poetry and prose, however, IF is prescriptively ekphrastic in that
it asks the player to perform concrete actions in response to its textual pictures. In the
post-graphical age, prescriptive ekphrasis becomes a threatened mode of visual
representation because computer graphics seem to have a superior ability to model the
diegetic world of the game. In order to justify the continued production of IF,
contemporary IF authors have adopted at least two strategies for responding to this
threat. The point of both approaches is to argue that IF offers players experiences that
graphical video games cannot match — an argument which ekphrastic poetry often implicitly
makes with respect to painting. Where the two approaches differ is in how they
characterize these experiences which are unique to IF. One strategy, as demonstrated in
Ad Verbum, is to abandon prescriptive ekphrasis and
concentrate on the purely textual experiences that IF can offer. The other strategy, which
we find in City of Secrets, is to employ an affective rather
than a mimetic mode of ekphrasis, thereby creating emotional effects that would be
difficult to replicate with graphics.
Even the first strategy, however, is still predicated on the visual properties of the IF
genre. Despite claiming to present a world composed purely of linguistic signifiers,
Ad Verbum still structures those signifiers according to a
world model composed of rooms and objects, and such a world model, as I've argued, must be
visualizable in order to be navigable. In City of Secrets,
visualization of the world model becomes the primary appeal of the game. To differing
extents, both texts ultimately offer the player the opportunity to collaborate with the
author in imagining a world. As the product of the player's affective visualization, this
world is, at least ostensibly, more intimate and personal than the vivid, transparent
worlds of commercial video games can possibly be. If authors like Montfort and Short still
write IF, and if players like me still play it, then this testifies to the existence of a
desire for spatial and visual experiences which are more imaginary or affective than
transparent. Regardless of the vivid immediacy of the spaces that graphical video games
allow us to inhabit, we still want to inhabit spaces which, to quote the inscription on
the living room door in Zork, are intentionally left
blank.