Tom Elliott is Associate Director for Digital Programs and Senior Research Scholar at the Institute for the Study of the Ancient World at New York University. Information about his current work — which spans digital approaches to epigraphy, papyrology, historical geography and other aspects of ancient studies — is provided on his home page at http://homepages.nyu.edu/~te20/
Sean Gillies is a computer programmer and pioneer in the field of open source geographic information systems. He has been a member of the MapServer Project's Steering Committee and now leads the GIS-Python Laboratory, an international effort to develop excellent GIS tools for the Python programming language. His sometimes influential blog focuses on the geospatial industry, open source software, and the Web. He currently directs software development at New York University's Institute for the Study of the Ancient World.
The authors open by imagining one possible use of digital geographic techniques in the context of humanities research in 2017. They then outline the background to this vision, from early engagements in web-based mapping for the Classics to recent, fast-paced developments in web-based, collaborative geography. The article concludes with a description of their own Pleiades Project (http://pleiades.stoa.org), which gives scholars, students and enthusiasts worldwide the opportunity to use, create and share historical geographic information about the Greek and Roman World in digital form.
The history of computing, classics and geography can be seen as a rich and profitable dialogue between many disciplines and practitioners, now reinvigorated by a wave of technological and societal change that is breaking down artificial boundaries between scholars and a wider public whose skills, hobbies and interests coincide at the fascinating intersection of humanity, space and time.
As I settle into my chair, a second cup of morning coffee in my hand, an expansive view of the eastern Mediterranean fades in to cover the blank wall in front of me. It's one of my favorite perspectives: from a viewpoint a thousand kilometers above the Red Sea, you can look north and west across an expanse that encompasses Jordan, Egypt and Libya in the foreground and Tunisia, Calabria and the Crimea along the distorting arc of the horizon. A simple voice command clears some of my default overlays: current precipitation and cloud cover, overnight news hotspots, and a handful of icons that represent colleagues whose profiles indicate they're currently at work. Now I see a new overlay of colored symbols associated with my current work: various research projects, two articles I'm peer-reviewing and various other bits of analysis, coding, writing and reading. These fade slowly to gray, but for two. Both of them are sprawling, irregular splatters and clumps of dots, lines and polygonal shapes.
The view pivots and zooms to frame these two symbol groups. The pink batch indicates
the footprint of one of the review pieces, a survey article comparing Greek, Roman
and Arabic land and sea itineraries. The other group, in pale blue, corresponds to my
own never-ending Roman boundary disputes
project. In both cases, I've
previously tailored the display to map findspots of all texts and documents cited or
included, as well as all places named in notes or the cited modern texts and ancient
sources. These two sets of symbols remain highlighted because potentially relevant
information has recently appeared.
I suspect the hits for the boundary disputes are just articles in the latest (nearly last) digitized Supplementband of the Pauly-Wissowa.
I can't resist making some quick explorations. It's easy enough to select the new
publication and ask for an extract of the itinerary's geography, as well as the place
names in their original orthographies. I follow up with a request for the results of
two correlation queries against all of the itineraries cited in the review article:
how similar are their geographies and how similar are their place names? My
programmatic research assistant reports back quickly: there's a strong correlation
between the sequential locations in a portion of the new document and an overlapping
portion of our only surviving copy of a Roman world map, the so-called Peutinger
Map
, but the forms of several place names differ in some particulars.
But now it's time to teach. Another voice command stores the research stuff and clears the projected display, which reforms in two parts: another virtual globe draped with the day's lecture materials, and a largely empty grid destined to be slowly populated over the next half-hour by the various icons, avatars and video images of my students. I double-check that I'm set to do-not-disturb and lean back to review my lecture notes. Today, Caesar will cross the Rubicon.
We've taken an equally imagined, but less frenetic, view of the scholarly future a decade from now, akin to visions recently sketched for freelance business consultants. Our forecast draws on current developments in web mapping, neogeography, social cartography, the geoweb, webGIS and volunteered geographic information (VGI). Those unfamiliar with this fast-moving field – one in which speculation and innovation are widespread – will gain some appreciation of its breadth and heterogeneity from the projects and literature cited throughout this article.
For humanists, general research tasks will remain largely unchanged: the discovery, organization and analysis of primary and secondary materials with the goal of communicating and disseminating results and information for the use and education of others. Much, however, will depend on the success of the vast range of digitization and digital publication efforts – commercial, public and consortial – now underway.
It would be impossible in an article-length treatment to produce a comprehensive history of geography and the classics, digital or otherwise. The non-digital, cartographic achievements of the field to 1990 have been surveyed in detail elsewhere, and developments in historical GIS have been addressed in recent literature.
Readers will detect this gap (and the compounding effects of limited space and time) in several regrettable categories of omission. Firstly, we have been unable to address many important comparanda for other periods, cultures and disciplines, for example the China Historical GIS (http://www.fas.harvard.edu/~chgis/) and the AfricaMap Project (http://isites.harvard.edu/icb/icb.do?keyword=k28501), both spearheaded by Harvard's Center for Geographic Analysis. Secondly, our examples here are heavily weighted toward the English-speaking world; we have not been able to address seminal efforts conducted and published in other languages.
On the other hand, the initial impact on historical work of the publication in 2000 of the Barrington Atlas of the Greek and Roman World is taken up below in connection with our own project.
The ability to identify and locate relevant items of interest is of prime importance to users of scholarly information systems, and geography is a core axis of interest. As Buckland and Lancaster put it in their 2004 D-Lib Magazine article: "The most fundamental advantage of the emerging networked knowledge environment is that it provides a much-improved technological basis for sharing resources of all sorts from all sources. This situation increases the importance of effective access to information. Place, along with time, topic, and creator, is one of the fundamental components in how we define things and search for them."
Obviously, much of the information that the web search providers seek to index is not in an explicitly geospatial format such as KML or GeoRSS. To determine the geographic relevance of web-facing content in HTML, MSWord and PDF files and the like, a process called geoparsing (cf. named entity recognition) must be employed, in which proper nouns in unstructured or semi-structured texts
are disambiguated from other words and then associated with known or notional
entities of interest (e.g., places, persons, concepts). The potential of course
exists for false identifications, ambiguous matches (i.e., one name for many discrete
entities) and complete failures (e.g., due to the obscurity of a name or a referenced
entity). Assuming some level of success in disambiguation and identification of
geographic entities in a text, the next step is to associate coordinates with each
named entity. Unless the textual source is itself a geographic gazetteer containing
map coordinates that can be parsed accurately, the coordinates must be supplied from
an external reference dataset that ideally holds the exact name variants occurring in
the text as well as the geospatial coordinates; we must recognize that it will often
be the case, especially with regard to primary sources and academic literature on
historical topics, that such a reference dataset does not exist. Nonetheless, once
coordinates are assigned to as many entities as possible, and the results are stored
in an index with a link to the original document, mapping, geosearch and other
geographic computations can be performed.
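To make this pipeline concrete, the following is a minimal sketch in Python of the gazetteer-matching step just described. The tiny gazetteer, the sample sentence and all coordinates are illustrative assumptions rather than a real reference dataset, and the capitalized-word heuristic is a crude stand-in for true named entity recognition.

```python
import re

# Toy reference dataset: name -> list of candidate (lon, lat) pairs.
# Multiple candidates model the ambiguity problem (one name, many places).
GAZETTEER = {
    "Olympia": [(21.63, 37.64), (-122.90, 47.04)],   # Greece; Washington State
    "Syracuse": [(15.29, 37.07), (-76.15, 43.05)],   # Sicily; New York
    "Ispir": [(40.99, 40.48)],                       # northeastern Turkey
}

def geoparse(text):
    """Report gazetteer candidates for each capitalized token; tokens
    absent from the reference dataset are complete failures."""
    matches, failures = [], []
    for token in sorted(set(re.findall(r"\b[A-Z][a-z]+\b", text))):
        if token in GAZETTEER:
            matches.append((token, GAZETTEER[token]))
        else:
            failures.append(token)
    return matches, failures

matches, failures = geoparse("The army marched from Ispir toward Syracuse.")
for name, candidates in matches:
    status = "ambiguous" if len(candidates) > 1 else "resolved"
    print(name, status, candidates)
print("unmatched:", failures)
```

Note that even this toy run exhibits two of the failure modes described above: "Syracuse" is ambiguous between two candidates, and "The" is a spurious unmatched token.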
Despite clear challenges and the real possibility of incomplete results for many texts (we shall consider an example below), we believe that geoparsing-based services will mature and diversify rapidly. By 2017, we expect that all web-facing textual resources will be parsed (rightly or wrongly) for geographic content. Audio and video works, where these can be machine-transcribed, will receive similar treatment, whether they are accompanied by extensive metadata or not. Some texts, especially those produced by academics and specialist communities will have coordinates encoded into the digital texts in a way that facilitates automatic extraction, making their geographic indexing relatively simple (e.g., KML, GeoRSS or the like). The majority of texts, however, will still have only geographic names and geographic description, most with neither special tagging nor authority control. The search engines will automatically identify these components and add associated entries to their geographic indexes by matching them with coordinate data in reference datasets. We believe that these datasets will include information drawn from KML and GeoRSS-enabled documents on the web, as well as previously geo-parsed texts (especially those whose genre, topic and regional nature are easily identified using automated techniques). Topic- and genre-aware geoparsing may well prove essential to accurate disambiguation of non-unique place names.
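As a small illustration of why embedded coordinate markup makes indexing comparatively simple, the sketch below extracts georss:point elements from a GeoRSS-enhanced Atom entry using only the Python standard library. The feed content is invented for the example; real-world feeds also use GeoRSS lines, polygons and GML encodings that a robust harvester would need to handle.

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
GEORSS = "{http://www.georss.org/georss}"

# An invented, minimal GeoRSS-enhanced Atom feed for illustration.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xmlns:georss="http://www.georss.org/georss">
  <entry>
    <title>Ispir</title>
    <georss:point>40.48 40.99</georss:point>
  </entry>
</feed>"""

root = ET.fromstring(FEED)
for entry in root.findall(ATOM + "entry"):
    title = entry.findtext(ATOM + "title")
    point = entry.findtext(GEORSS + "point")
    if point:                        # GeoRSS-simple points are "lat lon"
        lat, lon = (float(v) for v in point.split())
        print(title, lat, lon)       # -> Ispir 40.48 40.99
```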
The first fruits of geographic search are already apparent in information sets being surfaced to the web. Consider, for example, Google Books. A copy of J.S. Watson's 1854 translation of Xenophon's Anabasis (published with a geographical commentary by W.F. Ainsworth) has been scanned and indexed, and its "About this book" page demonstrates the range of connections the search company's software is so far able to make between the book and other resources. Among these is a section entitled "Places mentioned in this book", which at present consists of a map and an index of select places, as illustrated in Figure 1.
It would appear that most of the places
currently highlighted in the Google
Books presentation (evidently a maximum of 10) have been parsed from Ainsworth's
commentary or the unattributed "Geographical Index", which we assume to be
Watson's (pp. 339-348). As is the case with many Google search results, we are
given no information about how the indexing or correlation with geographic
coordinates (i.e., the geo-parsing) was accomplished. Some of the results are mapped
correctly, but others have clearly gone awry. Consider, for example, the placement of
Olympia, Syracuse and Tempe (see Figure 2). The placement of these ancient features in North America illustrates one of
the most obvious failure modes of geo-parsing algorithms. If the algorithm succeeds
in finding a match at all, it may find more than one match or a single, but incorrect
match. Given the information publicly available, we cannot know for certain which of
these two circumstances produced the results in question here. It seems probable,
however, that at least the Olympia and Syracuse errors result from selecting the
wrong result of several, as both of these well-known sites appear in many of the
standard gazetteers and geographic handbooks.
There are published methods for resolving such ambiguities on the basis of proximity to other locations more reliably geo-parsed from the work, but these presuppose knowledge of a coherent geographic footprint for the work under consideration, as well as a relatively comprehensive and highly relevant reference dataset that can provide at least a single match for most geographic names.
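A hedged sketch of one such heuristic follows: given several gazetteer candidates for an ambiguous name, prefer the one nearest the centroid of points already resolved with confidence from the same work. All names and coordinates are invented, and the planar distance is a deliberate simplification of a proper great-circle computation.

```python
from math import hypot

def resolve(candidates, resolved):
    """Choose the candidate (lon, lat) closest to the centroid of
    points already resolved from the same document."""
    cx = sum(lon for lon, _ in resolved) / len(resolved)
    cy = sum(lat for _, lat in resolved) / len(resolved)
    return min(candidates, key=lambda p: hypot(p[0] - cx, p[1] - cy))

# "Olympia": the sanctuary in Greece vs. the city in Washington State.
candidates = [(21.63, 37.64), (-122.90, 47.04)]
# Unambiguous points already geo-parsed from the same (Greek) text.
resolved = [(23.73, 37.98), (22.43, 37.07), (21.82, 39.16)]
print(resolve(candidates, resolved))   # -> (21.63, 37.64), the Greek site
```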
It seems reasonable to assume that Google and its competitors will continue to seek
or construct more comprehensive reference datasets to support geo-parsing
improvements. Indeed, Google's embrace of VGI can be seen as a deliberate strategy
for this purpose. The Google Earth Community Bulletin Board (http://bbs.keyhole.com) provides a venue for
the sharing of thematic sets of labeled geographic coordinates. The expansion of the Googlebot's competencies to include KML and GeoRSS expands the harvest potential to the entire web. This circumstance highlights a class of research and publication work of critical importance for humanists and geographers over the next decade: the creation of open, structured, web-facing geo-historical reference works that can be used for a variety of purposes, including the training of geo-parsing tools and the population of geographic indexes (a theme to which we return below).
The automated features outlined above were added to Google Books in January 2007. The stated motivation was cast in terms of helping users "plan your next trip, research an area for academic purposes or visualize the haunts of your favorite fictional characters."
Google has since taken a complementary step. Whereas the "Places in this book" component of Google Books is intended to help users explore the geography of the book, the Google Books layer in Google Earth offers users the opportunity to discover books by way of the geography. If, by chance, we should find ourselves browsing northeastern Turkey in Google Earth, the Google Books layer offers us a book icon in the vicinity of the town of İspir. This icon represents Watson's 1854 translation of the Anabasis. Links on the popup description kite will take us back to specific pages of the scanned Harvard copy we discussed above. Another link provides us with a way to "Search for all books referencing Ispir".
On 30 September 2007, we were offered information about 706 books. By refining the search to include the keyword "ancient", we narrowed the result set to 167 books. We judged these results to be highly relevant to our interests. We then selected a further suggested refinement from a list of links at the bottom of the page: "Geography, Ancient". The resulting 7 matches are unsurprisingly spotty and largely out-of-date, given the current state of Google's book digitization project, copyright restrictions and the degree of publisher text-sharing partnerships. Taking this into account, we again judged the results highly relevant.
We should not be surprised that web-wide geo-search is with us already in embryo. It follows naturally, as a user requirement, once you have assembled a large index of web content and developed the ability to parse that content and discriminate elements of geographic significance.
Quick and accurate geographic visualization has long been one of the holy grails of the web. The now-defunct Xerox PARC Map Viewer, launched in 1993, was an early landmark in this quest.
Recent innovations in on-line cartographic visualization are revolutionary. We have
only just grown accustomed to digital globes (like NASA WorldWind and Google Earth)
and continuous-panning, "slippy" browser-side maps. Google, Yahoo, Microsoft Live Search and MapQuest have all gone this route, and there is now an excellent, open-source toolkit for building this sort of client-side map: OpenLayers (http://openlayers.org). Many of these
applications are doing more than giving us compelling new visual environments. Many
of them are also breaking down traditional divisions between browse and search,
thematic layers, web content, spatial processing and geographic datasets, not least
through the mechanisms known generically as mashups
(web applications that
dynamically combine data and services from multiple, other web applications to
provide a customized service or data product). These tools remain an area of intense,
active development, so we should expect more pleasant surprises. Moreover, as the
costs of projectors, large-format LCD displays and graphics cards fall, and as
digital television standards encourage the replacement of older televisions, display
sizes and resolutions will at last improve. Efficient techniques for the simple
mosaicing of multiple display units and for the ad-hoc interfacing of mobile devices
and on-demand displays provided by third parties expand the horizon still further. It will not be long before a broader canvas opens for cartographic display; some envision "pixels everywhere" in the context of mobile computing.
There is not space in this article to treat other forms of geographic visualization, especially 3D modeling, in detail. It must suffice for us to say that this too is an area of vigorous change, from the earliest QuickTime VR panoramas (which classicists will remember) onward.
Geo-mapping of texts has been a part of classics since the late 1990s, a
development that paralleled interdisciplinary innovations in digital library research
and development. Although its technology has now been eclipsed by rapid and recent
developments in web-based cartographic visualization, the Perseus Atlas stands as a
pivotal early exemplar of the power of this approach.
The first version of the Perseus Atlas was rolled out in June 1998, shortly after Rob Chavez joined the Perseus team with the atlas as his primary task. Its initial scope was the Greek world, and it was designed solely to provide geographic context (illustrative maps) for discrete content items in the Perseus collection, e.g., individual texts (or portions of texts) or object records. Perseus was already
in control of a significant amount of geographical information drawn from its texts
through geo-parsing.
The Perseus Atlas saw two upgrades subsequent to its 1998 debut.
It is clear from even a cursory review of contemporaneous developments that the Perseus team was in the vanguard of an important revolution in web-based cartographic visualization. When Perseus first fielded its atlas in 1998, the Alexandria Digital Library (ADL) Project team had been working for four years to realize Larry Carver's 1983 vision of geographic search in a digital maps catalog. The resulting "geo-library" concept was endorsed by the National Research Council Mapping Science Committee (National Research Council Mapping Science Committee 1999).
It is clear that, despite significant challenges, geo-library functionality like that pioneered by ADL and Perseus is rapidly maturing into one of the standard modes for browsing and searching all manner of digital information on the web. Consider, for example, a typical web map portal's "Find Businesses" widget: you enter (e.g.) "Pizza" in the "What" text box; the "Where" text box is prepopulated with the default value "Current View" (effectively a bounding box). MapQuest provides geographic proximity search through its "Search Nearby" feature (http://help.mapquest.com/jive/entry.jspa?externalID=349&categoryID=6). Yahoo! Local provides similar capability (http://local.yahoo.com/). Other examples and experiments abound; their enumeration here would overly belabor the point. We expect that "geo-aware" searches will occur more frequently outside special map-oriented interfaces and services, with the target coordinates being provided automatically and transparently by search software and other computational proxies.
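In effect, a "Current View" search reduces to filtering candidate records by the bounding box of the visible map before any keyword ranking is applied. A minimal sketch follows (place names and coordinates invented; production systems would use a spatial index rather than a linear scan):

```python
def within_bbox(point, bbox):
    """True if (lon, lat) lies inside bbox = (min_lon, min_lat, max_lon, max_lat)."""
    lon, lat = point
    min_lon, min_lat, max_lon, max_lat = bbox
    return min_lon <= lon <= max_lon and min_lat <= lat <= max_lat

view = (22.0, 36.0, 28.0, 41.0)                    # current map view: the Aegean
places = {"Athens": (23.73, 37.98), "Rome": (12.48, 41.89)}
print({n: p for n, p in places.items() if within_bbox(p, view)})
# -> {'Athens': (23.73, 37.98)}
```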
These developments are being fueled by now-familiar business models, and by vigorous patenting around location-based services and information retrieval; although we have not screened them rigorously, we believe the number of patents and applications in this field over the past decade may easily number in the thousands. A web search for a phrase like "location-based search" or "location-aware search" will retrieve a significant number of relevant blog postings, articles and business websites.
Spatial search will help manage more seamlessly the inevitable heterogeneity of the 2017 web. Formats, delivery mechanisms, cost models and access challenges will surely continue to proliferate. We assume that some resources will be little advanced from the average 2007-vintage web page or blog post, whereas other works will exhibit the full range of meaningful structures and linkages now being worked out under the rubrics "semantic web" and "linked data".
We expect current distinctions between geographic content and other digital works will continue to erode over the next decade. As the geographic content in most documents and datasets is identified, surfaced and exploited, "born geographic" works will be fashioned in increasingly flexible and interoperable ways. To some extent, specific industries and institutional consortia will surely continue to use specialized protocols, service-oriented architectures and specialist formats to provide lossless interchange and rich contextualization of critical data. Smaller players, and big players wishing to share data with them, will increasingly use RESTful models, simple URLs and widely understood open formats for basic information publishing and exchange, even when these methods are lossy or convey incomplete data. Even now, a KML file with appropriate descriptive content and hyperlinks can function as a table of contents, geographic index or abstract for a document collection coded in TEI or a custom web application running in Plone. We expect that extensible web feeds (like the recently codified Atom Syndication Format, RFC 4287: http://www.atompub.org/rfc4287.html) will form a simple, web-wide infrastructure for notification and metadata exchange, alongside the more complex and difficult-to-implement special protocols that are already linking federated digital repositories and grid systems.
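To make the table-of-contents idea concrete, here is a sketch that emits a minimal KML document whose placemarks link back to items in a hypothetical collection. The title, URL and coordinates are invented; a real index would carry one placemark per document or section.

```python
import xml.etree.ElementTree as ET

KML = "{http://www.opengis.net/kml/2.2}"

def toc_kml(items):
    """Build a KML 'table of contents': one Placemark per collection
    item, each description linking back to the source document."""
    ET.register_namespace("", KML[1:-1])
    kml = ET.Element(KML + "kml")
    doc = ET.SubElement(kml, KML + "Document")
    for title, url, (lon, lat) in items:
        pm = ET.SubElement(doc, KML + "Placemark")
        ET.SubElement(pm, KML + "name").text = title
        ET.SubElement(pm, KML + "description").text = url
        point = ET.SubElement(pm, KML + "Point")
        ET.SubElement(point, KML + "coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")

print(toc_kml([("Boundary dispute inscription",
                "http://example.org/texts/42", (5.93, 35.62))]))
```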
Many commercial innovations for online search, retrieval and visualization will be of benefit to humanist scholars and their students. Such benefits will depend, to a significant degree, on the willingness and ability of humanists, their employers and their publishers to adapt their practices to fit the new regime.
To date, humanists have taken one of two approaches in preparing and disseminating geo-historical information on the web. Large projects with significant funding have tended to follow the lead of the larger geo-science and commercial GIS communities in placing emphasis on the elaboration of extensive metadata describing their datasets, thereby creating a basis for their discovery and inclusion in digital repositories. In some countries (notably the U.K.) emphasis by national funding bodies has encouraged such behavior. At the other end of the spectrum, hand-crafted datasets and web pages are posted to the web with minimal metadata or distributed informally within particular research communities. The neo-geographical revolution, and in particular the explosive interest in Google Earth, has recently provided a much richer and easier mechanism for the dissemination of some types of work in this latter class.
In the United States, e-science approaches have long been dominated by ideas expressed in an executive order issued by President Clinton in 1994 and amended by President Bush in 2003, which mandated coordinated acquisition of and access to geographic data under the rubric of a National Spatial Data Infrastructure. A related archival effort, the National Geospatial Digital Archive, has focused its initial collection development on content relating to the east and west coasts of the United States (http://www.ngda.org/research.php#CD).
The national repository
approach has similarly found a coherent advocate for
academic geospatial data in the UK by way of EDINA, the national academic data center
at the University of Edinburgh (http://edina.ac.uk/); however, the recent governmental decision to de-fund
the UK Arts and Humanities Data Service has introduced a significant level of
uncertainty and chaos into sustainability planning for digital humanities projects
there (details at http://ahds.ac.uk/). More
broadly, the Global Spatial Data Infrastructure Association seeks to bring together
representatives of government, industry and academia to encourage and promote work in
this area (http://gsdi.org), but to date it seems
to have focused exclusively on spatial data as it relates to governance, commerce and
humanitarian activities.
The top-down, national repository model (and its cousin, the institutional
repository) contrasts sharply with the neo-geographical methods now rapidly
proliferating on the web. To a certain degree, the differences reflect alternative
modes of production. The repository model was born in a period when spatial datasets
were created and used almost exclusively by teams of experts wielding specialized
software. Concerns about efficiency, duplication of effort and preservation were key
in provoking state interest, and consequently these issues informed repository
specifications. VGI, however, has only recently been enabled on the large scale by innovations in web applications, and its products lend themselves to lightweight, "webby" distribution. For example, KML and GeoRSS-tagged web feeds can now be posted and managed just like other basic web content. Longstanding concerns about data quality and user needs, familiar from the map and GIS user-needs literature, have not disappeared. It may be that the inherently public aspect of VGI efforts makes it far easier to discover such oversights;
however, we are not aware of any authoritative study addressing the issue in
quantitative terms. There can be no doubt that many VGI datasets, and many of the
web applications that enable their creation and use, fail to address issues as
fundamental as precision, accuracy and data origin. It is particularly
frustrating, for example, that content from Google's geo-aware photo sharing
service, Panoramio (http://www.panoramio.com/), is surfaced in Google Earth with no
reflection of its spatial accuracy. This despite the fact that Panoramio content
includes both images that were geo-tagged manually using an arbitrarily scaled map
interface and images whose EXIF headers originally contained high-resolution
coordinates and other metadata provided automatically by GPS-enabled digital
cameras. On the other hand, an increasing number of VGI projects whose technical
staff include academic or professional geographers are addressing these concerns
and finding innovative ways to solve them in the new media.
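The provenance distinction just described is, in principle, recoverable: a GPS-derived photograph carries its coordinates in standard EXIF headers. The sketch below, assuming a reasonably recent version of the third-party Pillow imaging library and a hypothetical file photo.jpg, converts the EXIF GPS degrees/minutes/seconds rationals to decimal degrees.

```python
from PIL import Image   # third-party Pillow imaging library (assumed installed)

GPSINFO = 0x8825        # standard EXIF pointer to the GPS information IFD

def gps_decimal_degrees(path):
    """Return (lat, lon) in decimal degrees when GPS EXIF data is
    present, else None. Refs 'S' and 'W' flip the sign."""
    gps = Image.open(path).getexif().get_ifd(GPSINFO)
    if not gps:
        return None
    def to_deg(dms, ref):
        deg = float(dms[0]) + float(dms[1]) / 60 + float(dms[2]) / 3600
        return -deg if ref in ("S", "W") else deg
    return (to_deg(gps[2], gps[1]),    # tags 1/2: latitude ref, value
            to_deg(gps[4], gps[3]))    # tags 3/4: longitude ref, value

print(gps_decimal_degrees("photo.jpg"))   # hypothetical file path
```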
In 1997, the Electronic Cultural Atlas Initiative (ECAI) was chartered in Berkeley to ease and promote cross-project sharing of spatio-temporal data in the humanities. Affiliated projects describe their resources using ECAI metadata (a Dublin Core derivative), which is then added to the
clearinghouse. Geographic and temporal components of the search are facilitated by a
map-and-timeline interface realized with server-side TimeMap software (see http://www.timemap.net). This tool may be
seen, on one level, as a reflection of the dominant web search mode of the day, the
directory. But it also anticipated the future of VGI, appreciating that many
humanities projects with spatial components would not be able to fit their content
into the formats supported by the big repositories. Instead, it assumed that these
projects would arrange for web hosting themselves, and take voluntary steps to share
it with others. ECAI's Clearinghouse was also innovative in providing for the
discovery of indexed projects through a combination of search and browse methods that
exploited both spatial and temporal information in intuitive ways. ECAI remains a
vibrant international community, organizing regular conferences and maintaining
linkages across a wide range of projects and disciplines (see http://ecai.org/Activities/conferences.asp).
The first attempt at full-fledged VGI in Classics was the Stoa Waypoint Database, a
joint initiative of Robert Chavez, then with the Perseus Project, and the late Ross
Scaife, on behalf of the Stoa Consortium for Electronic Publication in the Humanities
(http://www.stoa.org). In its public unveiling, the database was billed as "an archive [and] freely accessible source of geographic data...for archaeologists...students...digital map makers, or anyone else engaged in study and research". At initial publication, the dataset comprised
slightly over 2,000 point features (settlements, sites and river mouths), drawn from
work Chavez and Maria Daniels had done for the Perseus Atlas and personal research
projects. The points included both GPS coordinates and coordinates drawn from various
public domain (mostly US government) gazetteers and data sources. Chavez and Scaife
also invited contributions of new data, especially encouraging the donation of GPS
waypoints and tracks gathered in the field. A set of Guidelines for Recording Handheld GPS Waypoints was promulgated to support this work.
The Stoa's interest in GPS waypoints reflected a worldwide trend: in April 1995, the
NAVSTAR Global Positioning System constellation had reached full operational
capability, marking one of the pivotal moments in the geospatial revolution we are
now experiencing. Originally conceived as a military aid to navigation, GPS quickly
became indispensable to both civilian navigation and map-making, as well as the
widest imaginable range of recreational and scientific uses. In May 2000, an executive order eliminated the intentional degradation of publicly accessible GPS signals, a practice known as "selective availability".
Work on the Stoa Waypoints Database was clearly informed by the Perseus Atlas
Project, and also by another concurrent project at the Stoa: the Suda Online (SOL;
http://www.stoa.org/sol/). This
all-volunteer, collaborative effort undertook the English translation of a major
Byzantine encyclopedia of significance for Classicists and Byzantinists alike. The
Suda provides information about many places and peoples. An obvious enhancement to
the supporting web application would have been a mapping system like the Perseus
Atlas. In exploring the possibility, Scaife and Chavez realized that the variant
place names in the Suda and in the Perseus Atlas were a significant impediment to
implementation. The solution they envisioned in collaboration with Neel Smith was
dubbed the Register of Geographic Entities (RAGE). They imagined an inventory of
conceptual spatial units and a set of associated web services that would store
project-specific identifiers for geographic features, together with associated names.
This index would provide for cross-project lookup of names, and dynamic mapping. Some
development work was done subsequently at the University of Kentucky.
Our own project, Pleiades (http://pleiades.stoa.org), is heavily influenced both by the scholarly practices of our predecessors and by on-going developments in web-enabled geography. We are producing a standard reference dataset for Classical geography, together with associated services for interoperability. Combining VGI approaches with academic-style editorial review, Pleiades will enable (from September 2008) anyone — from university professors to casual students of antiquity — to suggest updates to geographic names, descriptive essays, bibliographic references and geographic coordinates. Once vetted for accuracy and pertinence, these suggestions will become a permanent, author-attributed part of future publications and data services. The project was initiated by the Ancient World Mapping Center at the University of North Carolina, with development and design collaboration and resources provided by the Stoa. In February 2008, the Institute for the Study of the Ancient World joined the project as a partner and it is there that development efforts for the project are now directed.
Pleiades may be seen from several angles. From the point of view of the Classical
Atlas Project and its heir, the Ancient World Mapping Center, Pleiades is an
innovative tool for the perpetual update and diversification of the dataset
originally assembled to underpin the Barrington Atlas, which is being digitized and
adapted for inclusion. From an editorial point of view, Pleiades is much like an
academic journal, but with some important innovations. Instead of a thematic
organization and primary subdivision into individually authored articles, Pleiades
pushes discrete authoring and editing down to the fine level of structured reports on
individual places and names, their relationships with each other and the scholarly
rationale behind their content. In a real sense, then, Pleiades is also like an
encyclopedic reference work, but with the built-in assumption of on-going revision
and iterative publishing of versions (an increasingly common model for digital
academic references). From the point of view of "neo-geo" applications, Pleiades
is a source of services and data to support a variety of needs: dynamic mapping,
proximity query and authority for names and places.
Pleiades incorporates a data model that diverges from the structure of conventional GIS datasets. Complexity in the historical record, combined with varying uncertainty in our ability to interpret it, necessitates a flexible approach sensitive to inherent ambiguity and the likelihood of changing and divergent interpretations. We understand a place as a bundle of associations between attested names and measured (or estimated) locations (including areas), and we call these bundles "features". Individual features can be positioned in time, and the confidence of the scholar or analyst can be registered with respect to any feature using a limited vocabulary.
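As an illustration only (a simplified sketch, not the actual Pleiades schema), the bundle model might be expressed along the following lines, with open-ended time spans and a closed confidence vocabulary:

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

CONFIDENCE = ("certain", "less-certain", "uncertain")   # illustrative vocabulary

@dataclass
class TimeSpan:
    """Attestation window; open bounds accommodate historical uncertainty."""
    not_before: Optional[int] = None    # e.g., -550 for 550 BCE
    not_after: Optional[int] = None

@dataclass
class Name:
    attested: str                       # form as attested in a source
    language: str
    when: TimeSpan = field(default_factory=TimeSpan)
    confidence: str = "certain"

@dataclass
class Location:
    coords: Tuple[float, float]         # (lon, lat); areas would use polygons
    confidence: str = "certain"

@dataclass
class Feature:
    """A bundle of associations between attested names and locations."""
    names: List[Name] = field(default_factory=list)
    locations: List[Location] = field(default_factory=list)

olympia = Feature(
    names=[Name("Ὀλυμπία", "grc", TimeSpan(not_before=-776))],
    locations=[Location((21.63, 37.64), confidence="certain")],
)
print(olympia)
```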
We intend for Pleiades content to be reused and remixed by others. For this reason,
we release the content in multiple formats under the terms of a Creative Commons
Attribution Share-Alike license (http://creativecommons.org/licenses/by-sa/3.0/). The Pleiades website
presents HTML versions of our content that provide users with the full complement of
information recorded for each place, feature, name and location. In our web services,
we employ proxies for our content (KML and GeoRSS-enhanced Atom feeds) so that users
can visualize and exploit it in a variety of automated ways. In this way, we provide
a computationally actionable bridge between a nuanced, scholarly publication and the
geographic discovery and exploitation tools now emerging on the web. But for us,
these formats are lossy: they cannot represent our data model in a structured way
that preserves all nuance and detail and permits ready parsing and exploitation by
software agents. Indeed, we have been unable to identify a standard XML-based data format that simply and losslessly supports the full expression of the Pleiades data model. The recent addition of markup for places (in addition to place names) by the Text Encoding Initiative might provide another lossless alternative, but it seems unlikely to see wide adoption outside scholarly humanities circles (see TEI Consortium 2008, sub 13. Names, Dates, Peoples and Places, http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ND.html).
In 2009, we plan to address users' need for a lossless export format by implementing code to produce file sets composed of ESRI Shapefiles and attribute tables in comma-separated value (CSV) format. The addition of a Shapefile+CSV export capability will facilitate a download-oriented dissemination method, as well as position us to deposit time-stamped versions of our data into the institutional repository at NYU and other archival contexts as appropriate. Indeed, this is the most common format requested from us after KML. Although the Shapefile format is proprietary, it is used around the world in a variety of commercial and open-source GIS systems and can be readily decoded by third-party and open-source software. It is, in our judgment, the most readily useful format for individual desktop GIS users and small projects, and because of its ubiquity has a high likelihood of translation into new formats in the context of long-term preservation repositories.
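A minimal sketch of such an export follows, under stated assumptions: we use the third-party Fiona library for Shapefile output (one possible tool, not necessarily what Pleiades will deploy) and the standard csv module for the attribute table. The record content is invented, and a real export would also declare a coordinate reference system.

```python
import csv
import fiona   # third-party OGR wrapper, assumed installed

# Invented sample records: (id, title, (lon, lat)).
records = [("p-0001", "Olympia", (21.63, 37.64))]

# Shapefile: geometry plus a join key back to the CSV attribute table.
schema = {"geometry": "Point", "properties": {"pid": "str"}}
with fiona.open("places.shp", "w", driver="ESRI Shapefile", schema=schema) as shp:
    for pid, title, (lon, lat) in records:
        shp.write({"geometry": {"type": "Point", "coordinates": (lon, lat)},
                   "properties": {"pid": pid}})

# CSV attribute table carrying the richer, non-spatial attributes.
with open("places.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["pid", "title"])
    for pid, title, _ in records:
        writer.writerow([pid, title])
```

Keeping the geometry thin and joining richer attributes by a shared key sidesteps the Shapefile format's well-known limits on field names and types.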
Historical time remains a problematic aspect of the web that cannot be divorced from geography. The use of virtual map layers to represent time periods remains a common metaphor. Timelines and animated maps are also not uncommon, but all these techniques must either remain tied to specific web applications or rely on cross-project data in standard formats that only handle the Gregorian calendar and do not provide mechanisms for the representation of uncertainty. Bruce Robertson's Historical Event Markup Language (HEML; http://heml.mta.ca) remains the most obvious candidate for providing the extra flexibility humanists need in this area, both for modeling and for expression in markup. It is our hope that it too will find modes of realization in the realm of web feeds and semantic interchange.
In its final report, the American Council of Learned Societies Commission on Cyberinfrastructure for the Humanities and Social Sciences highlighted the importance of accessibility, openness, interoperability and public/private collaboration as prerequisites for a culture of vigorous digital scholarship.
We view the history of computing, classics and geography as a rich and profitable dialogue between many disciplines and practitioners. It is tragic indeed that we lost much too soon our friend and mentor Ross Scaife, who emerges as a pivotal figure in our narrative of this history. It was he who invited Elliott to the University of Kentucky in 2001 to give the first public presentation about the proposed Pleiades Project at the Center for Computational Studies, and it was he who provided the development server that supported the first two years of Pleiades software development. We are confident that, were Ross with us today as he was at the workshop that occasioned the original version of this paper, he would still be motivating us with challenging examples and stimulating ideas, connecting us with new collaborators and encouraging us to push harder for the changes we wish to see in our discipline.
Despite the sense of loss that inevitably runs through the papers in this volume, and despite the challenges we face in a chaotic and interdisciplinary milieu, we also view the history of geographic computing and the classics as a hopeful omen for the future. We have a spectacular rising wave to ride: a wave of technological and societal change that may well help us conduct research and teach with more rigor and completeness than before while breaking down artificial boundaries between scholars in the academy and members of an increasingly educated and engaged public whose professional skills, public hobbies and personal interests coincide with ours at the fascinating intersection of humanity, space and time.