TEI in Libraries: Guidelines for Best Practices


David Seaman, Electronic Text Center, University of Virginia, dms8f@virginia.edu
LeeEllen Friedland, Preservation Directorate, Library of Congress, lfri@loc.gov
Chris Powell, Humanities Text Initiative, University of Michigan, sooty@umich.edu
Chris Ruotolo, Electronic Text Center, University of Virginia, cjr2q@virginia.edu
Jackie Shieh, Alderman Library, University of Virginia, ejs7y@virginia.edu
Natalia Smith, Wilson Library, University of North Carolina at Chapel Hill, nsmith@email.unc.edu
Perry Willett, Main Library, Indiana University, pwillett@indiana.edu

David Seaman, Chair

On June 30-July 1, 1998, the Digital Library Federation organized a meeting at the Library of Congress in Washington, D.C., on the Text Encoding Initiative (TEI) and Extensible Markup Language (XML). Representatives attended from libraries all over North America and Europe. For more information, see:

The practical result of this conference is rapid and ongoing work by that part of the library community most involved with TEI; the aim is to provide, in short order, firm, "library-centric" guidelines in the following areas:

1. encoded full-texts
2. metadata (especially MARC/TEI/Dublin Core interchange)
3. the use of TEI to manage page-image projects like JSTOR

Each working group spawned a smaller taskforce to continue the work started at this two-day meeting. The taskforces for two of the three groups, encoded texts and metadata, have already held follow-up meetings and are issuing draft reports; they met again at ALA in late January 1999 to incorporate comments and finalize the draft Guide to Recommended Practices. The metadata group was able to draw gratefully on the findings of last year's "Putting our Headers Together" TEI meeting at Oxford University (immediately prior to DRH97). Its format was a useful model for the text encoding group too, which started with a similar meeting to discuss what we already do ("Putting our Bodies Together", we called it).

The ACH/ALLC conference in June 1999 is the ideal first venue to unveil and discuss these recommendations publicly. It will give our users a chance to comment; it will allow librarians in attendance to challenge, augment, and adopt our findings; and, just as importantly, it will allow us to start a dialogue with major etext publishers to see whether they can begin to deliver texts in forms that integrate closely with the data, and crucially the metadata, that we create in-house. The level of commitment to this endeavor has been high, and presenting the first version of the Guidelines at the 1999 ACH/ALLC conference (which many of us are attending) gives us the best and quickest feedback and an opportunity to discuss how we will continue into a second year of meetings. It is my hope that we can sustain our current momentum to hone the Guidelines and to provide further guidance and tools, such as the TEI/MARC webform program under way at Virginia and the collective guide to TEI tag usage, also in progress, which draws on examples of use from a variety of holdings.

We believe that our presentation at ACH/ALLC will benefit both us and the general TEI community, since libraries are a major but often under-represented component of the larger TEI universe. The panel will begin with a brief background summary by the Chair and move swiftly to reports from each of the three taskforce areas. It is particularly important to elucidate not only our findings and recommendations but also the lively process we have gone through, and are still going through, to reach this consensus, along with our reasons for rejecting certain alternatives. Given the nature of this set of Guidelines, and our desire for feedback and modification, we expect to allow plentiful time for discussion.

The taskforce members are all TEI and metadata practitioners: librarians from institutions with a long investment in online delivery of encoded data and a firm sense of mission about what they do and why. This attempt by a major TEI user community to define for itself common practices in data and metadata creation should hold some useful lessons for other clearly defined TEI communities (textual editors, non-library university producers, publishers, journal producers) that so far have not articulated a core set of Best Practices, or undertaken a critical self-examination of current practices as a way to define recommendations for the future.