Digital Humanities Abstracts

“Into the Depths of Data. Methods of Subject Specific Content Retrieval”
Kurt Gärtner, University of Trier, gaertnek@mailer.uni-marburg.de
Gisela Minn, University of Trier, minn@uni-trier.de
Andrea Rapp, University of Trier, rappand@uni-trier.de
Martin Raspe, University of Trier, raspe@uni-trier.de
Ruth Christmann, University of Trier, christma@uni-trier.de
Thomas Schares, University of Trier, schares@uni-trier.de

In April 1998, the Competence Centre for Electronic Retrieval and Publishing Techniques in the Humanities was founded at the University of Trier. One of its main goals is the use of international hardware- and software-independent standards such as SGML/XML for full-text digitization, especially of critical editions, dictionaries, and important reference works. Information scientists and humanists from various disciplines work closely together to ensure that the electronic resources developed at the Centre meet scholarly requirements. Furthermore, the team aims at complex and powerful retrieval mechanisms that remain easy to handle thanks to a consistently user-oriented design of the graphical user interfaces. An important overall feature that has often been ignored in the field of digitization, but is characteristic of the research done at the Competence Centre, is the close linking of software development to the scholarly background of the material. The following three papers give examples of user-oriented software development in different projects and of how the Centre's activities are embedded in research done by universities and the German academies of sciences: (A) the Rhine-Meuse Net, (B) the WIRE project, and (C) the digitization of the Deutsches Wörterbuch, that is, a history project, an art history project, and a German language and literature project.

(A) The Rhine-Meuse Net originated in the activities of a Collaborative Research Centre (SFB 235) that examined the history of a European core area from the Ancient World to the 19th century. Over more than 12 years, a large amount of valuable data was accumulated in multiple document types and formats. Not all of this material has been published, although in many cases even the unpublished material is of great interest to researchers inside and outside the SFB.
Therefore, the existing data will now be encoded to ensure its longevity and, at the same time, entered into a database. The data will thus remain usable even though the funding of the SFB by the Deutsche Forschungsgemeinschaft (DFG) is due to cease in 2002.

(B) In contrast to the Rhine-Meuse Net, which deals with existing material, WIRE, the Word and Image Retrieval Environment, is primarily intended as a tool for scholars who need support in building new (digital) collections of scholarly relevant texts and images. The internet-based system integrates texts, structured data, images, and bibliographies into a relational database. As WIRE can be configured according to specific needs, it not only supports individual scholars but is also well suited to teams of scholars working together on a particular object of research. Since various retrieval functions are implemented, WIRE is useful not only for scholars who build new collections but also for those who only want to browse collections built by their colleagues.

(C) The retrodigitization of the Deutsches Wörterbuch by Jacob and Wilhelm Grimm has to be seen in the broader context of dictionary making at the University of Trier. When work on a new Middle High German dictionary started in 1994, lexicographers wished to have access to as many electronic texts and dictionaries as possible. However, to fully exploit the advantages of an electronic dictionary, one needs not only a fairly thorough markup of the entries but also a convenient way of presenting the dictionary on screen so that it remains readable: several entries of the Deutsches Wörterbuch cover more than 300 columns in print.
The demonstration of the CD-ROM prototype of the Deutsches Wörterbuch may serve as a good example of how thorough, in-depth retrieval contributes to the development of software that gives access to the dictionary data in new ways. It will be very interesting to see how new possibilities of accessing data of varied provenance and of many kinds will lead to new questions, new methods, and new insights into the digitally edited source material.

Title A: The Information and Reference Network for the History of the Rhine-Meuse Area. An Area-Oriented Subject Information System for the Humanities

Dr. Gisela Minn, Dr. Andrea Rapp
1. General and Institutional Preconditions
Alongside the parameter "time", the parameter "area" has received increased attention in recent years as a fundamental category of human existence. Regions in particular, as medium-sized units of area, have established themselves in many disciplines as ideal units of investigation. In the Rhine-Meuse Net, the region serves as the central category of access and ordering for integrating research results that are far apart in time and differ in document type, method, and topic. The international research network of the Collaborative Research Centre "Between the Meuse and the Rhine. Connections, Encounters, and Conflicts in a European Core Area from the Ancient World to the 19th Century" (SFB 235) has acquired a large amount of valuable data that are very heterogeneous in document type, yet share a common area of investigation and are closely connected in content. This complex body of data forms the nucleus of a projected database serving as a reference system for European regional history. The project has been funded by the Deutsche Forschungsgemeinschaft (DFG) since 1 November 2001. Besides history with all its specialist research interests, related disciplines such as art history, archaeology, history of law, and the history of the German and Romance languages are involved; they all take part in the research network, as do various national and international, university and non-university cooperation partners. The project therefore aims, firstly, to take into account the changed information needs of a growing international research community and, secondly, to lay the foundations for European historical research beyond the borders of nation-states.
The network is particularly well suited to this purpose, as it opens up a European core area at the intersection of Western and Central Europe from ancient times to the present and will present the results of international research collaboration. Long-term preservation of the data and their platform-independent use are ensured by the consistent application of international standards based on SGML/XML.
2. Content-Related Principles of the Network
The realization of the network starts from two core units. Firstly, the annotated bibliography of the entire publication output of the SFB (about 900 items) will be edited, including all unpublished dissertations and theses, which document the full scope of research. Because of the area-oriented interest of the SFB, cartographic methods and techniques of representation are among its most important research procedures. Secondly, therefore, an electronic archive of about 500 maps has been built and will be linked to the bibliography. From these two core units, which are representative of the whole scope of the network, thesauri of places, persons, and subjects will be compiled and structured hierarchically for in-depth indexing of the data. They form the basic framework for further indexing and will be extended into a dynamic research tool that grows more extensive and complex with each new reference unit that is integrated. A sophisticated system of indexes and metadata will ensure that these units are linked.
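The idea of a hierarchically structured thesaurus whose terms point to reference units can be illustrated with a minimal Python sketch. It is not the project's implementation: the class `Term`, its fields, and the unit identifiers such as `bib-0001` are assumptions made up for this example.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Term:
    """One thesaurus entry; `narrower` holds its child terms."""
    label: str
    narrower: list[Term] = field(default_factory=list)
    units: set[str] = field(default_factory=set)  # IDs of linked reference units

    def add(self, child: Term) -> Term:
        """Attach a narrower term and return it for further use."""
        self.narrower.append(child)
        return child

    def all_units(self) -> set[str]:
        """Units indexed under this term or under any narrower term."""
        found = set(self.units)
        for child in self.narrower:
            found |= child.all_units()
        return found


# Hypothetical place hierarchy and unit IDs, for illustration only.
region = Term("Rhine-Meuse area")
trier = region.add(Term("Trier", units={"bib-0001", "map-0017"}))
worms = region.add(Term("Worms", units={"bib-0002"}))

print(sorted(region.all_units()))
```

A query on a broader term thus also reaches everything indexed under its narrower terms, which is one simple way a thesaurus can grow with each newly integrated unit.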
3. Variety of Document Types
The document types that represent the cultural heritage and the results of scholarly research in digital form are very heterogeneous: texts, maps, pictures, plans, images, tables, archival finding-aids and repository guides, indices, bibliographies, etc. At the same time, these document types are closely related in content in a complex and multidirectional manner. In the Humanities especially, far-reaching methodological and content-related impulses can be expected from an explicit representation of these relations. Moreover, the general approach requires interdisciplinary and comparative studies, new access to digital resources, and the development of cartographic methods for analysis and documentation. We therefore aim at linking, retrieving, and integrating these digital reference units of different document types in a reference network. The following document types form the database of the network and have to be indexed and interlinked:
  • Units of information referring to area and region such as local registers and catalogues, complex place lexica, single maps and series of maps, annotated atlases that combine maps, place catalogues, and commentaries.
  • Units of information referring to persons and institutions such as registers and catalogues of persons, prosopographies and biograms of persons, catalogues, lexica, tables, and lists of institutions.
  • Units of information that combine information on texts and pictures such as text or picture catalogues, visualizations, and reconstructions.
  • Units of information that represent sources, archival finding-aids, and instruments for the documentation of research such as special bibliographies, region-related source editions of different genres, repository guides, literature and review services, and documentation of research.
We have therefore provided the following ways of access via thesauri: access by place (supplemented by visual representations such as two- or three-dimensional maps), access by time, access by person, access by topic, access by object via document type (e.g. only maps, only sources), and access by funding organization (or the respective research institution).
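How these access paths might combine in a single query can be sketched in Python as follows. This is an illustration under stated assumptions, not a description of the actual system: the `Unit` record, the `search` function, and all sample identifiers, names, and dates are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """A reference unit with the index terms assigned to it."""
    uid: str
    doc_type: str              # e.g. "map", "source", "bibliography"
    places: frozenset = frozenset()
    persons: frozenset = frozenset()
    topics: frozenset = frozenset()
    years: tuple = (0, 0)      # first and last year covered
    institution: str = ""


def search(units, place=None, year=None, person=None,
           topic=None, doc_type=None, institution=None):
    """Return the units matching every criterion that is not None."""
    hits = []
    for u in units:
        if place is not None and place not in u.places:
            continue
        if year is not None and not (u.years[0] <= year <= u.years[1]):
            continue
        if person is not None and person not in u.persons:
            continue
        if topic is not None and topic not in u.topics:
            continue
        if doc_type is not None and doc_type != u.doc_type:
            continue
        if institution is not None and institution != u.institution:
            continue
        hits.append(u)
    return hits


# Invented sample units; IDs, terms, and dates are placeholders.
UNITS = [
    Unit("map-0017", "map", places=frozenset({"Trier"}),
         topics=frozenset({"trade"}), years=(1200, 1300),
         institution="SFB 235"),
    Unit("bib-0001", "bibliography", places=frozenset({"Trier", "Worms"}),
         persons=frozenset({"Person A"}), years=(1000, 1500),
         institution="SFB 235"),
    Unit("src-0042", "source", places=frozenset({"Worms"}),
         topics=frozenset({"trade"}), years=(1400, 1450),
         institution="Archive X"),
]

maps_of_trier = search(UNITS, place="Trier", doc_type="map")
print([u.uid for u in maps_of_trier])
```

Each criterion narrows the result set independently, so a user can start from a place, a period, or a document type and refine from there.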
4. Methods, Technical Bases
Because of the complexity of the structures and the implicit relations characteristic of the Humanities, such networks cannot be constructed by automatic means alone but have to be completed and supervised by human researchers. It is therefore all the more important to develop standards-based mechanisms that effectively support the construction of a complex structure and the corresponding retrieval mechanisms. Moreover, these mechanisms have to be well documented and safely stored for further research in times of rapid technical change. Variable, differentiated, and efficient strategies for searching and visualization have to be created for convenient use. Because the document types brought together in the network differ in structure, existing DTD schemes have to be checked for usability, adapted, and expanded, and new DTD schemes have to be developed for document types not yet available in an SGML-compliant format. These schemes have to be applicable to other research projects as well. The resulting open conception of the network is the precondition for transferring these structures and methods to other information and reference networks.
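As a rough illustration of how standards-based encoding can support retrieval, the following Python sketch reads a small XML fragment and builds an index from thesaurus references to unit identifiers. The element and attribute names (`unit`, `ref`, `kind`, `key`) and the sample records are assumptions invented for this example; they are not taken from the project's DTD schemes.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical encoding of two reference units with index references.
SAMPLE = """\
<units>
  <unit id="bib-0001" type="bibliography">
    <title>Hypothetical bibliography entry</title>
    <ref kind="place" key="trier"/>
    <ref kind="topic" key="trade"/>
  </unit>
  <unit id="map-0017" type="map">
    <title>Hypothetical map</title>
    <ref kind="place" key="trier"/>
  </unit>
</units>
"""


def build_index(xml_text):
    """Map each (kind, key) reference to the IDs of the units carrying it."""
    index = defaultdict(list)
    root = ET.fromstring(xml_text)
    for unit in root.iter("unit"):
        for ref in unit.iter("ref"):
            index[(ref.get("kind"), ref.get("key"))].append(unit.get("id"))
    return dict(index)


index = build_index(SAMPLE)
print(index[("place", "trier")])
```

Because the references are explicit in the markup, the index can be rebuilt at any time from the encoded data alone, independently of the software that first produced it.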
5. Comparisons and Prospects
The information and reference network is open to cooperation projects with university and non-university institutions that offer further information beyond the scope of research of the SFB. Special emphasis is laid on the integration of libraries and archives. For example, the SFB bibliography as a core unit is linked to the OPAC of Trier University Library. The integration of archival finding-aids, beginning with those of the municipal archive of Worms, may serve as an example of cooperation with other archives. Furthermore, cooperation has been established with scholars from neighbouring countries who focus on common region-related aspects and methods; through shared use of the information and reference network, this cooperation is intended to be long-lasting. In some respects, the Rhine-Meuse Net was inspired by the project "The Valley of the Shadow. Two Communities in the American Civil War", which was carried out at the University of Virginia, Charlottesville (http://jefferson.village.virginia.edu/vshadow2/). Especially the regional aspect as well as the variety of document types offe