Laura Matthew is a historian of Spanish colonial Guatemala and associate professor at Marquette University. She is the co-editor with Michel Oudijk of
Michael Bannister is an independent programmer analyst specializing in digital humanities and internet-based instruction in Milwaukee, Wisconsin.
The digital archive
This article discusses the Nahuatl/Nawat in Central America digital archive and its challenges.
Some thirty years ago, in
ontological and epistemic choices with distinct ideological and even specifically political implications
"Náhuatl" and "Náhuat" in Spanish, "Nahuatl" and "Nawat" in English, are currently the most conventional ways of referencing these related but distinct languages. As Hansen 2016 explains, the orthographic conventions of Nahuan languages are fluid and we do not intend any definitive statement by selecting these particular ones. On the politics of orthography and revitalization see also Olko and Sullivan 2013 esp. pp. 201-11, and van Zantwijk 2011.
NECA assembles a corpus of handwritten, colonial-era texts produced in Central America in variations of the related Mesoamerican languages Nahuatl and Nawat, from eight repositories in Guatemala, Mexico, Spain, and the United States. It emphasizes the fact that these oft-ignored documents exist, and encourages their collaborative study across national, scholarly, community, and disciplinary lines. Neither goal is neutral or apolitical, although the significance of studying these texts may vary depending on whether the user is, for example, an Indigenous rights activist from Mexico City or Los Angeles, a linguist of Mayan languages from Guatemala, a native speaker from Guerrero, a primary school teacher from El Salvador, or a doctoral candidate from Europe.
In this essay we explain our rationale for creating a digital archive of Nahuatl texts from Central America in the first place, arguing that NECA's content should be studied not only by individuals analyzing particular texts for the purposes of geographically or disciplinarily bounded research and revitalization projects, but also collaboratively and more experimentally as a standalone corpus. We then review the ontological and epistemic as well as technical choices we made in the project's design to encourage this outcome. NECA's form attempts to prod users towards a variety of actions both within and outside the digital archive. The success or failure of the affordances we created to increase the usefulness and usability of the site, and thus to direct the user toward specific activities, can be measured in the site's analytics. These indicate not just where the digital environment we created is working well or can be improved, but also where it may not be the best workspace available — or at least, not yet.
Nahuatl, best known as the language of the Aztec empire, was spoken by tens of
millions of people in the early sixteenth century. It is not a single language but a
range of mutually intelligible Nahuan
variants spoken from northern Mexico to
Nicaragua since at least the second half of the first millennium A.D. (see Figures 1 and 2). Many
Nahuan languages have died out, especially in the last 150 years. Others persist but
are threatened by continued and increasing contact with and preference for European
languages such as Spanish and English. Today, there are approximately 1.5 million
native speakers of Nahuatl variants in Mexico and the United States diaspora, and
around 200 native speakers of the related language Nawat in the Izalcos and Santo
Domingo de Guzmán areas of Sonsonate and in Tacuba, Ahuachapán, both in western El
Salvador (http://www.unesco.org/languages-atlas/index.php).
"Pipil" or "mexicana corrupta" by the Spanish.
Both "Nawat" and "Pipil" are common terms for the same language spoken
in El Salvador today. To avoid confusion, in this article we refer only to
Nawat.
When the Spanish arrived in 1519, central Mexico was the most urbanized, politically
powerful, and densely populated part of Mesoamerica. The Spanish made the defeated
Aztec capital of Tenochtitlan the bureaucratic heart of their own nascent empire, and
engaged Indigenous intellectuals in a remarkable, sometimes violent merging of
Mesoamerican and European writing systems.
This large corpus of Nahuatl documentation from central Mexico has spawned a number of digital projects with a variety of aims, such as increasing access to lesser-known texts and making databases of glyphic and linguistic information searchable online for comparative study. For instance, the
Significant colonial-era Nahuan language documentation also exists from outlying
regions of the former Aztec empire. Like the Aztecs, the Spanish used central Mexican
Nahuatl as an imperial
In Central America, Nahuatl's usefulness as a tool of empire was augmented by its
mutual intelligibility with Nawat and other Eastern Peripheral Nahuan languages
natively spoken in what today is Chiapas (Mexico), southwestern Guatemala, and El
Salvador.
In neighboring El Salvador, by contrast, Nawat — the only surviving natively-spoken
Nahuan language in Central America — is simultaneously valorized as part of the
national patrimony and discriminated against in everyday life. In 1932, Salvadoran
state forces massacred tens of thousands of peasants, most of them Nawat speakers, in
response to an uprising against coffee plantations. Fearful of further repression,
survivors avoided speaking Nawat in public or teaching it to their children.
For all these diverse and contradictory reasons, few Central Americans have studied historical documents in Nahuan languages from their own region (although this is beginning to change; see Romero 2017, Cossich 2012). Indeed, it has long been assumed that hardly any such documentation existed. The most basic goal of NECA is to correct this false impression. Our central claim, however, is not merely that these documents exist, but that they are worth studying.
Linguistically, Central American documents in Nahuan languages bring an entirely new
data set to debates about the historical evolution of Nahuan languages, especially in
areas beyond the imperial center. Linguists generally agree on the basic dialectal
features of the two main branches of Nahuatl, Eastern and Western, and of the urban,
imperial Nahuatl developed in fifteenth- and sixteenth-century Mexico-Tenochtitlan
archaic
Nahuan language that
predated and continued to be used in Central America alongside the Aztec/Spanish
NECA is also notable for its range of dates and genres: catechisms, wills, letters to
Spanish officials, town council memos, bills of sale, community annals, tributary
rolls, judicial testimony and denunciations, land titles, musical manuscripts, and
confraternity books from the mid-sixteenth to the early eighteenth century. Religious
texts in Indigenous languages are a foundational genre in Mesoamerican studies, and
have been analyzed for the cadences of Mesoamerican ceremonial speech as well as the
intense and sometimes antagonistic back-and-forth between European and Indigenous
intellectuals
"vulgar" Nahuatl used alongside Nawat and the central Mexican
"vulgar" dialect but frequently slipped back into the central Mexican variety with which he was more familiar
Bureaucratic documentation generated mostly by Indigenous
Beyond philology, translations and transcriptions of the documents assembled by NECA
would enrich the social history of the region. The vast majority of lives revealed
are of non-native speakers of Nahuan languages: African urbanites, Oaxacan plantation
workers, Maya choirmasters and cofradía officials, French merchants, and innumerable
Indigenous political leaders: Mam, K'iche', Tzeltal, Tojolabal, Jakalteko, Kaqchikel,
etc. Contact points between friars, Spanish administrators, and local authorities are
also plentiful in these documents. Family relations simmer underneath accusations of
adultery, bigamy, and incest. Inventories and wills track the material culture of
everyday life and the globalization of Mesoamerican commerce. Witchcraft, land and
inheritance disputes, and the forced labor of women all make an appearance. The input
of scholars and community members who may not have Nahuan language skills but who
bring deep expertise in Mayan and Central American history, anthropology,
archaeology, geography, and art history is crucial for contextualizing such
information and incorporating it into larger narratives.
"Pipil" territory; Luján-Muñoz (1988) and Herrera (2003) on Spanish Guatemala; Stevenson (1964), Borg
(1985), and Morales (2015) on musical
traditions; Lutz (1994) and Lokken (2000) on Afro-descendants in Guatemala; and
Viqueira Albán (2002) on Chiapas.
To our assertion of NECA's potential for advancing Nahuan linguistics and Central
American history, we add the possibility of supporting Nawat revitalization efforts
in El Salvador. Diverse and overlapping intercultural and intergenerational campaigns
have been underway in that country since the early 2000s, including a language nest
primary school immersion program
Preliminary discussions with Salvadorans involved in Nawat revitalization indicate
that while there may be a place for NECA in the future, for now the urgency of
recording and promoting modern Nawat overshadows interest in historical documents.
How NECA might contribute to Nawat revitalization is uncertain, in part, because the
linguistic identification of so many of our documents remains unclear and the
majority are from Guatemala, where Nawat was historically spoken but is no longer.
Again, further study via transcriptions and translations is needed in order to
clarify how the NECA corpus may speak to the case of Salvadoran Nawat. In the
meantime, we hope that NECA's expression of international scholarly interest in
Central American Nahuan languages, free access to downloadable, high-quality images
of colonial-era documents for anyone with an internet connection, and public witness
to the long history of Nawat in El Salvador stand as "one more symbol of cultural identity and pride ... [the] first step in any language
revitalization process."
NECA began with a list of over 40 documents compiled by Sergio Romero (University of
Texas at Austin) and Laura Matthew (Marquette University), in collaboration with a
dozen other colleagues, for an encyclopedia project that never materialized. As
Romero and Matthew sought alternate ways to publish the list, new items continued to
surface. It became clear that given the number of Nahuan language documents that go
unrecorded in archive catalogs and the extent to which scholars tend to run across
them unexpectedly, the list could easily grow longer and a traditional print
publication would quickly become outdated. Simply posting the list online might
stimulate interest, but the need to travel to physical archives represented a
significant barrier to serious engagement since those with the most capacity to read
early modern manuscripts in Nahuan languages tend not to live or work in Guatemala
and Chiapas, where the main repositories of NECA's documents are located. Working
with programmer Michael Bannister, and with permission from the original
repositories, Matthew decided in 2015 to create a digital archive of high-quality
images using Omeka, the popular open-source content management system for digital
collections from the Roy Rosenzweig Center for History and New Media (CHNM) at George
Mason University. For the remainder of this essay, "we" refers to Matthew and
Bannister as the sole creators and curators of NECA.
Our first curatorial decision was conceptual: to restrict the archive's geographical
range to Central America as defined by colonial-era administrative boundaries. This
meant including documents from Chiapas but not from neighboring and similarly
multilingual places like Oaxaca, where Nahuatl also functioned as a vehicular
We also took seriously Justyna Olko's and John Sullivan's assertion that "more research on this topic [of local and regional differences and
their relation to standardization] is greatly needed; especially useful would be a
systematic comparison between regions as well as between higher and lower-ranking
scribes/authors within a given locality."
As we began to build the site, created and solicited feedback from an advisory board, and presented at conferences in the United States, Guatemala, and El Salvador, overlapping and mismatched interests in the NECA corpus became increasingly apparent. Historians, anthropologists, and archaeologists working in Central America were enthusiastic about sharing their archival references and interested in the information the documents contained, which they often could not read. Linguists and philologists working primarily in Mexico were interested in the dialectal features of the documents but were unfamiliar with their Central American context and history. Scholars and activists working on Nahuan languages in Central America expressed interest but lacked the financial and human resources to engage NECA without diverting valuable attention from existing projects, especially those supporting revitalization of Nawat in El Salvador.
We began to think about how NECA’s structure could more actively facilitate communication across these disciplinary, regional, and national borders. Unlocking the information inside the documents would be the essential first step for any kind of macro-analysis of the entire corpus, computational or otherwise, and for connecting scholars with similar interests and complementary skills. Could we help scholars find not just the documents, but each other? Could we create an online workspace that encouraged scholars to share their expertise and begin to generate data for comparative and collaborative analysis? Taking inspiration from crowdsourcing projects such as
"Add a Document" feature using a Simple Contact Form plugin to encourage contributions of new documents. A separate, linked WordPress site (https://nahuatlnawat.wordpress.com) became the project blog and discussion space.
The backbone of Omeka is the items list, supported by Dublin Core-based metadata.
Most metadata elements are obvious: date, title, source, etc. Nevertheless, each
element reflects a curatorial decision made by us with certain goals in mind. We
added new metadata elements for the "number of folios" to emphasize the variety
and extent of the corpus, and for at-a-glance decisions by users about whether or not
to transcribe; "sample text" to spark the potential transcriber's and/or
translator's interest; "location" with the modern countries, states, and/or
departments in addition to the colonial-era information to allow for sub-regional
searches and future experiments in mapping; "date of creation" of the item
itself to keep a record of the corpus's growth; and the "contributor" of the
document in order to acknowledge her or his research and participation.
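
To make the shape of such a record concrete, the following is a minimal Python sketch of an item carrying a few Dublin Core-style elements alongside the added ones. It is an illustration only, not NECA's actual Omeka configuration: the class name `NecaItem`, the field names, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, asdict
from datetime import date
from typing import List, Optional


@dataclass
class NecaItem:
    """Hypothetical item record; field names are assumptions, not Omeka's."""
    # Dublin Core-style elements mentioned in the text
    title: str
    date: str    # date of the original document, kept as free text
    source: str  # physical repository and archival signature
    # Added elements described in the essay
    number_of_folios: Optional[int] = None
    sample_text: Optional[str] = None
    colonial_location: Optional[str] = None
    modern_location: Optional[str] = None  # modern country, state, department
    item_created: Optional[date] = None    # when the item entered the archive
    contributor: Optional[str] = None


def missing_elements(item: NecaItem) -> List[str]:
    """List the elements still empty, in field order."""
    return [name for name, value in asdict(item).items() if value is None]


# Invented sample values, for illustration only
example = NecaItem(
    title="Example will",
    date="17th century",
    source="Example Archive, leg. 1",
    number_of_folios=2,
    modern_location="Guatemala",
)
print(missing_elements(example))
```

A check like `missing_elements` is one way a curator might spot records that lack, say, a sample text before inviting transcribers to them.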
Metadata omissions also reveal the synergy between form, content, and curation. A
primary goal of NECA is to encourage the linguistic study of a larger corpus of
Nahuan documents from Central America than usual, and eventually to gather the
results in a database of linguistic features for comparative analysis. Some of our
documents conform to a single, clear Nahuan variant. Most, however, present a mix of
attributes, as one might expect of writing produced by non-native speakers in a
context of ongoing (or decreasing) standardization, colonial power dynamics, and the
adoption by Indigenous people of foreign writing technologies. This linguistic
heterogeneity makes the NECA corpus an exceedingly valuable resource for exploring
the history of Nahuan languages at linguistic borderlands
Decisions about the items themselves predetermine what researchers can and cannot do
with them. Most of NECA's items are fragments within larger documents — sometimes,
much larger. On a mostly non-existent budget, we faced issues of server space, labor,
and funding: photographers require payment, and repositories may charge publication fees.
Additionally, in this first iteration of the project we were focused on access and
translation. We therefore chose to publish only the Nahuatl portions of any given
document, for both practical reasons and in order to attract Nahuatl translators.
This decision has consequences. For better or worse, it denies the user access to any
Spanish translation that might have appeared in the original document. It also
separates the fragment from its larger documentary context, digitally replicating the
same de-contextualization that has been suffered by many Mayan-language documents. A
fuller understanding of the document's creation and information can only be achieved
by consulting the original document in relation to its archival context. Data sets of
people, places, and other kinds of information contained in the digital archive — for
instance, paying attention to geographical location or scribal networks — will also
remain incomplete without access to the full original. Researchers will have to
return to the physical archives in order to get the whole picture, and we run the
danger that they will not.
Finally, anticipating the user experience led to some programming alterations.
Omeka's automatically generated citations omitted the original archive; we changed
the code to cite the document's physical repository and archival signature first,
followed by NECA and the date of access. To guide users towards specific activities,
we turned Omeka's "featured items" into a "sample transcription" and
"featured collections" into "document teams."
Omeka's built-in
internationalization combined with the plugin Locale Switcher made the site
bilingual, allowing users to choose in real time whether to view the site in Spanish
or English. Because we had significantly altered the standard Omeka framework with
new navigation headings, metadata categories, etc., Spanish versions had to be added
to the internationalization code, as did all Spanish translations of all the text
within the transcription tool Scripto. However, these changes affected only the user
interface, not the items' metadata. Assuming that most of our users would be
competent in Spanish but not necessarily in English, we decided to make Spanish the
primary language of the site (and in doing so, officially baptized it as NECA: in
Spanish,
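
The citation order described in this section (physical repository and archival signature first, then NECA and the date of access) can be sketched as follows. This is a hypothetical Python illustration, not the code we changed: Omeka itself is written in PHP, and the function name, parameters, punctuation, and sample values here are assumptions.

```python
from datetime import date


def format_citation(repository: str, signature: str, title: str,
                    accessed: date, archive_name: str = "NECA") -> str:
    """Put the physical archive first, then the digital archive and access date."""
    return (f"{repository}, {signature}. {title}. "
            f"{archive_name}, accessed {accessed.isoformat()}.")


# Invented sample values, for illustration only
print(format_citation("Example Archive", "A1, leg. 10", "Sample item",
                      date(2019, 1, 5)))
```

Leading with the repository keeps the original archive visible even when a citation is truncated or copied without its final elements.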
At every structural opportunity we emphasized the collaborative, open nature of the
project and minimized our own gatekeeping. Conversations during beta testing between
anthropologist Janine Gasco and historians Julia Madajczak and Agnieszka Brylak
inspired us to create mechanisms for interdisciplinary document teams to work on
single items. Contributors of new citations are individually added to the "About
Us" page as well as to their items' metadata. Transcribers and translators are
encouraged to register for Scripto with their full name so they can be properly
identified in the versioning of transcriptions and translations and credited in
future publications, as we require under our Creative Commons Attribution-NonCommercial
3.0 United States license. NECA is not a crowdsourcing project, but it does invite
researchers to share their documents, modern
Archival research and the transcription and translation of idiosyncratic documents
written in difficult handwriting, often in foreign languages, require patience,
time, resources, and above all, advanced skills that accrue over the years. Doctoral
degrees, job offers, tenure, and future funding depend on demonstrating the fruits of
this individual labor. There is nothing wrong with claiming the privacy to work, and
what we have labeled "document teams" in NECA can also form via email,
conferences, special journal issues, and edited volumes. If NECA's first iteration –
the digital archive – produces a flurry of new publications and dissertations created
outside our platform, this will be a positive result.
NECA nevertheless encourages scholars to go beyond individual documents and to work
beyond their comfort zones. It identifies common research interests across
disciplines and national and academic communities, and presents the opportunity to
share citations, translations, and knowledge in a public forum; to compare notes
online; and eventually, given transcriptions and translations, to create databases,
analyze the corpus as a whole, and experiment with different digital and
computational tools. The NECA corpus is large and geographically varied enough to
reveal not only the dialectal features of Nahuan languages in Central America, but
also the documents’ production related to colonial settlement, ecclesiastical
influence, social and political networks, the economy, and geography. We see great
future value, especially, in thinking through NECA’s data using spatial analysis and
mapping tools. Bringing linguists and translators of Nahuatl together with
non-
So far, the answer is yes and no. NECA's analytics from Reclaim Hosting show that since the digital archive went online in July 2016, it has received the most intensive and consistent use (measured by bandwidth used, the ratio of pages to hits, and annual location data) from Mexico, El Salvador, and Guatemala, as well as Spain, Germany, Poland, and France in Europe — the last three being major centers of Mesoamerican and Nahuatl studies — and the United States, Brazil, and Canada in the Americas. Presentations at the University of Texas at Austin in March 2017, the Congreso de Estudios Mayas in Guatemala City in July 2017, the Asociación Centroamericana de Lingüística annual meeting in San Salvador in August 2017, the American Historical Association annual meeting in January 2018, and the Sociedad Mexicana de Historiografía Lingüística in Mexico City in October 2018, each produced temporary bumps in the number of unique visitors and/or intensity of use, which then tapered off. The Austin presentation acted as an official launch of the project with the power of social media behind it, resulting in an eighteen-fold increase in unique visitors immediately afterwards (March-April 2017). Subsequent presentations in Guatemala and El Salvador produced the most remarkable user data in the site's history thus far. In the two months following (July-August 2017) — and with no official social media push — the number of unique visitors to the site quadrupled. More importantly, the bandwidth and pages-to-hits ratio indicated significantly more searching through the site's most complex pages, such as those containing document images, than after the Austin presentation. The Central Americans' more intensive use is visible in the contrast between their relatively low number of unique visitors (yellow) relative to pages, hits, and bandwidth (blue and green):
From 2016 through 2018, the United States and Ukraine generated most of the site’s
hundreds of thousands of page views, 75% of which lasted thirty seconds or less.
Presumably, a large portion of these were bots. The next largest proportions of
visits, however, lasted for over one hour (around 8%), thirty minutes to an hour
(around 6%), and fifteen to thirty minutes (around 4%), suggesting that a significant
minority of users were seriously engaging the site. Notably, when we ceased to
actively promote the site in 2019 we saw a drop in unique visitors, a consistent
narrowing of the pages-to-hits ratio indicating shallower exploration of the site,
and 88% of visits lasting less than thirty seconds. (For the first time, a large
number of such visits in 2019 came from the Netherlands, bumping Ukraine to third
place in the "probably a bot" category). Nevertheless, in 2019 the most
intensive users — those spending thirty minutes to over an hour on the site at a time
— still constituted our next largest user group, or 7% of the total number of
visits.
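
For readers who want to reproduce this kind of breakdown from raw visit logs, a minimal Python sketch follows. It assumes visit durations are available in seconds, which is an assumption on our part: hosting analytics typically report such ranges already aggregated. The bucket labels, the cutoffs between thirty seconds and fifteen minutes, and the sample data are all invented for illustration.

```python
from collections import Counter
from typing import Dict, List

# Upper bounds in seconds; labels loosely follow the ranges named in the text
BUCKETS = [
    (30, "30s or less"),
    (15 * 60, "30s to 15min"),
    (30 * 60, "15 to 30min"),
    (60 * 60, "30min to 1h"),
]


def bucket(seconds: float) -> str:
    """Assign one visit duration to its range."""
    for upper, label in BUCKETS:
        if seconds <= upper:
            return label
    return "over 1h"


def duration_shares(durations: List[float]) -> Dict[str, float]:
    """Share of visits falling in each range (empty input gives an empty dict)."""
    counts = Counter(bucket(s) for s in durations)
    total = len(durations)
    return {label: count / total for label, count in counts.items()}


# Made-up durations, not NECA's data
sample = [5, 10, 20, 25, 100, 1000, 2000, 4000]
print(duration_shares(sample))
```

A large share in the shortest range, as in the figures above, is the pattern that suggests automated traffic rather than readers.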
As a digital archive, therefore, NECA is doing reasonably well even when we do not
take advantage of conferences, social media, and other means to publicize and promote
it. As an online platform for collaborative transcription, it has been less
successful. A few people have used the "Add a Document" feature to provide new
citations and high-quality images, but most of the 19 new documents added since the
site’s inception have come from our own research or direct outreach. The same is true
of the Discussion area, where invited essays by Janine Gasco on Nahuan agricultural
terms in the Soconusco and by Adriana Álvarez on Nahuatl instruction at the
Universidad de San Carlos in Guatemala have generated a handful of comments from
community members mostly from El Salvador or of Salvadoran descent in the United
States, but no serious scholarly engagement, without which we cannot move forward to
better understand why, how, or to whom these documents might be important.
The document teams and Scripto's transcription tool have attracted no users at all since beta testing in March 2017. This may be a design issue. This first iteration of NECA is based on a pre-designed Omeka platform and utilizes only the Scripto features made available through the plug-in. Certainly we could improve the transcription and translation tool to be more appealing and effective, including a simpler user interface, better versioning, an improved commenting feature that identifies the user and is always visible, side-by-side images and workspace, progress bars, and the ability to toggle between transcriptions, translations, and versioning on a single page. The features and functionality of the transcription tool at the
How the digital archive's form can encourage engagement with its content is not,
however, only a design issue. The most successful transcription projects come from
outward-facing institutions digitizing items from their physical archives and making
them available to "citizen humanists"
with the clear goal of public engagement —
for instance (among many other examples), the Smithsonian Institution's
A search through the transcription platform FromThePage's various collections suggests that more academic projects often involve fewer participants, especially where handwritten manuscripts from earlier time periods with idiosyncratic paleography in languages other than English are concerned. Online transcription in these circumstances seems to work best as a collaboration tool between professors and students, or between small groups of colleagues with similar skills. This is the case of the
Comparing these projects, and NECA, to the
To re-design the weakest link of NECA, its transcription tool, would require at minimum a switch from the current pre-designed website and/or outsourcing of the tool, and possibly changing from MediaWiki to a standalone database. It is not clear that, at this stage of the project, the effort would be worth it. While some have expressed interest in using the site as a teaching tool for advanced students who are simultaneously learning Nahuatl and paleography, there is no way to know whether this is happening. Likewise, if more established scholars are working with documents from NECA, they are doing so outside the context of the site. At a practical level, scholars may find online transcription and translation, which requires working within the confines of the program and/or between multiple formats, less efficient than traditional methods. They may also appreciate opportunities for face-to-face discussion prior to performing their work online. Scholarship is risky and takes time. Sturdy, creative collaborations between people who have not traditionally worked together (such as the local, national, disciplinary, and academic networks that have expressed interest in NECA yet remain siloed from each other) may initially develop better in person. Rather than immediately overhauling the site or the transcription tool, a better next step for NECA may be more old-fashioned: to convene scholars and community members in different combinations and venues, with the goal of creating collaborative teams and identifying viable research questions and interests in common.
Digital humanities promises more than a new marriage between mathematical, qualitative, and design methodologies and tools. It also proposes a paradigmatic change in how scholars collaborate, flattening research and/or learning communities and vaunting an idealized, non-hierarchical community where people willingly share their research, promote interdisciplinarity, and work in teams of members with complementary skill sets, none of which is seen as more important than another. Despite the ways in which this mimics Silicon Valley-ese (rightly criticized for its hypocrisy), there is much to hold onto here: the potential of digital humanities to communicate with broader publics, to democratize the production of knowledge, to make the fruits of scholarship more accessible, and to make us all more flexible thinkers. As NECA argues, digital archives also have the potential to push scholarship in certain directions by calling attention to understudied texts or problematics and by making the materials for studying them available.
But the digital humanities’ optimistic, even utopian view of the scholarly workplace is tinged with disciplinary, financial, and intergenerational anxieties. In the United States, humanities scholars of all stripes fear the devaluation of their work in the information age. The younger generation faces an increasingly freelance economy and shrinking humanities job market from the peculiar position of being simultaneously valued for their digital savvy (writing code, understanding algorithms, managing project teams, marketing their work), expected to be innovators and jacks-of-all-trades, and suspected of not doing the kinds of specialized research that got their professorly elders tenure. Established scholars are suspected of lagging behind the digital turn, but have more freedom to experiment with digital tools — or not — with far less risk to their future careers. They are also the gatekeepers of the academy.
It is therefore incumbent upon senior scholars, especially, to ponder the lessons of
creative failure in digital humanities projects. NECA shows the potential for digital
archiving to turn a wide range of people's attention towards a particular corpus of
historical documentation and set of questions. NECA also highlights the difficulty of
attracting scholars to skills-intensive transcription and translation online in
collaborative projects without prior commitments, goals, and relationships in common.
While we maintain the first iteration of the NECA digital archive, our next best step
for transcription and translation — the necessary building blocks of any future
database — will involve human, not digital, development: recruiting and funding new
team members, acquiring grant money to pay for skilled transcriptions and
translations, and organizing conferences. With data in hand and new ideas on the
table, we can start to contemplate smaller, more limited digital tools — what
Rockwell and Sinclair [2016] call
"náhuatl cotidiano," el de "doctrina," y el de "escribanía" en Cuauhnáhuac entre 1540 y 1671.