“Electronic Resources for Historical Linguists. Part 2:
Dictionaries and Related Resources”
Christian Kay, University of Glasgow, UK
Irené Wotherspoon, University of Glasgow, UK
Susan Rennie, Scottish National Dictionary Association, UK
Anne McDermott, University of Birmingham, UK
Chair: Christian Kay
The set of papers will be followed by a discussion, which may include points
from Part 1: Medieval Studies (session 1.2 on Saturday 22 July).
Developing a User Interface for the Historical Thesaurus of English Ingres Database
Irené Wotherspoon
This paper effectively continues one published in Literary and Linguistic Computing, Vol. 7 (1992),
which described the issues and processes involved in transferring
the legacy database of the Historical Thesaurus of
English (HTE) from dBase 11.4 into Ingres 6.1, chosen for its
ability to support the retrieval of complex information. Over a
long-term project like HTE, spanning decades rather than years,
the available software changes greatly, and the software in use
becomes obsolete. It is therefore vital to migrate the database
to newer systems where feasible. The paper
describes the problems encountered in developing a front-end for the
Ingres database, now in OpenIngres O.I.2.0, that would cope with
the multiple conditions of the queries required in what is
essentially a research tool, while remaining acceptably
user-friendly. The user interface now in use is BI/query, the
successor to GQL, and this will be described and demonstrated. The
complexities of retrieval from the HTE arise from the data on
the currency of words, taken from the Oxford English Dictionary (OED). These data consist of: a) actual
starting and finishing dates for periods of currency; b) the
qualifiers ante and circa, as in the OED (these are not used in the
OED CD queries, but are necessary for retrieving, e.g., words current
before a specified date); c) indicators for continuous or
discontinuous currency; d) an indicator of present currency; e) Old
English, covering all dates before 1150; f) style and status labels
from the OED attached to periods of currency: a word may be
labelled, e.g., 'vulgar' for only part of its currency.
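The currency data listed in this abstract can be pictured with a small sketch. It is not the HTE schema or the BI/query interface: the class names, field names and sample words below are all invented for illustration, circa is treated as an exact date, and an Old English period is simply assumed to precede any query year from 1150 onwards.

```python
from dataclasses import dataclass
from typing import List, Optional

OE_CUTOFF = 1150  # the abstract treats all dates before 1150 as Old English


@dataclass
class Period:
    """One period of currency for a word (hypothetical layout)."""
    start: Optional[int]        # first attested year; None for Old English
    end: Optional[int]          # last attested year; None if still current
    start_qualifier: str = ""   # "" (exact), "a" (ante) or "c" (circa)
    old_english: bool = False   # attested before 1150, no exact year held
    label: str = ""             # style/status label, e.g. "vulgar"


@dataclass
class Word:
    form: str
    periods: List[Period]       # more than one period = discontinuous currency


def began_before(period: Period, year: int) -> bool:
    """True if the period can be said to start before `year`."""
    if period.old_english:
        # Assumption: query years earlier than 1150 cannot be resolved here.
        return year >= OE_CUTOFF
    if period.start is None:
        return False
    if period.start < year:
        return True
    # "ante 1500" means some time before 1500, so it qualifies for year 1500.
    return period.start == year and period.start_qualifier == "a"


def covers(period: Period, year: int) -> bool:
    """True if `year` falls inside the period (Old English has no lower bound)."""
    lower_ok = period.old_english or (
        period.start is not None and period.start <= year
    )
    upper_ok = period.end is None or year <= period.end
    return lower_ok and upper_ok


def current_before(words: List[Word], year: int) -> List[str]:
    """Forms with at least one period of currency starting before `year`."""
    return [w.form for w in words
            if any(began_before(p, year) for p in w.periods)]


def labels_in(word: Word, year: int) -> List[str]:
    """Style/status labels that apply to `word` in `year`."""
    return [p.label for p in word.periods if p.label and covers(p, year)]


# Invented sample data, not taken from the OED or the HTE.
WORDS = [
    Word("alpha", [Period(None, 1400, old_english=True)]),
    Word("beta", [Period(1500, None, start_qualifier="a")]),
    Word("gamma", [Period(1500, 1600), Period(1800, None, label="vulgar")]),
]
```

With this layout, a label attached to only one period of a word, as in the 'vulgar' case mentioned above, falls out of the per-period `label` field: `labels_in(WORDS[2], 1850)` finds the label while `labels_in(WORDS[2], 1550)` does not.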
The Electronic Scottish National Dictionary: Work in Progress
Susan Rennie
The Scottish National Dictionary (SND) is the standard historical
dictionary of modern Scots, covering the period from 1700 to the
present. This presentation will describe the current project to
digitise the SND to produce the eSND, which will eventually be
published on the Internet. It will include a brief description of the
SND itself, outlining its history, content and structure, and
describe how the eSND will differ from the printed text, in
particular by integrating the original Supplement and adding new
material from the SNDA's ongoing research. The various stages of the
eSND project will then be discussed, using examples from the work in
progress:
1. the data capture, which is being done through scanning and OCR of the printed text;
2. the conversion of the OCR data to full XML mark-up, including details of the actual mark-up scheme (which is based on the TEI guidelines), and how this has been adapted to suit the SND text;
3. the integration of the original Supplement and new material;
4. the development of search tools and a web interface.
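As a rough illustration of stages 2 to 4, the sketch below builds a dictionary entry as XML, folds in a supplement entry, and looks entries up by headword. The element names (entry, form, orth, gramGrp, pos, sense, def, cit, quote, bibl, date) are taken from the TEI dictionary module, but the actual eSND scheme is not given in this abstract, so the structure, the `type="supplement"` marker, the function names and the sample entry are all assumptions made for the example.

```python
import copy
import xml.etree.ElementTree as ET


def build_entry(headword, pos, definition, quotations):
    """Build a TEI-style <entry>; `quotations` is a list of (year, text)."""
    entry = ET.Element("entry")
    form = ET.SubElement(entry, "form")
    ET.SubElement(form, "orth").text = headword
    gram = ET.SubElement(entry, "gramGrp")
    ET.SubElement(gram, "pos").text = pos
    sense = ET.SubElement(entry, "sense")
    ET.SubElement(sense, "def").text = definition
    for year, text in quotations:
        cit = ET.SubElement(sense, "cit", {"type": "example"})
        ET.SubElement(cit, "quote").text = text
        bibl = ET.SubElement(cit, "bibl")
        ET.SubElement(bibl, "date").text = str(year)
    return entry


def merge_supplement(main_entry, supplement_entry):
    """Append copies of the supplement's senses to the main entry.

    The type="supplement" marker is a hypothetical convention used here
    only to keep the origin of the added material visible.
    """
    for sense in supplement_entry.findall("sense"):
        added = copy.deepcopy(sense)
        added.set("type", "supplement")
        main_entry.append(added)
    return main_entry


def find_entries(entries, headword):
    """Return the entries whose <orth> text equals `headword`."""
    return [e for e in entries if e.findtext("form/orth") == headword]


# Invented sample material, not taken from the SND.
main = build_entry("exampleword", "n.", "An invented definition.",
                   [(1820, "An invented quotation.")])
supplement = build_entry("exampleword", "n.", "A later invented sense.", [])
merged = merge_supplement(main, supplement)
xml_text = ET.tostring(merged, encoding="unicode")
```

Keeping the supplement senses as marked copies, rather than rewriting the main entry, is only one possible design; it leaves the printed text and the added material distinguishable in later searches.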
Early Dictionaries of English and Historical Corpora
Anne McDermott
This paper discusses the implications for historical corpora of an
examination of some early dictionaries of English. Comparison is
made between Samuel Johnson's Dictionary of the
English Language (1755) and works such as Thomas
Blount's Glossographia (1656), Nathan
Bailey's Dictionarium Britannicum (1730),
and Robert Ainsworth's influential Linguae Latinae
Compendiarius (1736). The claim that early
English-English dictionaries contained many "hard words", i.e.
words, often from classical sources, which had infrequent or nil
occurrence in the language at the time, is examined by checking the
occurrence of such words in historical corpora. This leads to an
examination of the corpora themselves and comparison with corpora of
modern English. Suggestions are made about how historical corpora
might be developed in future, for example by offering
domain-specific search facilities. Reference will be made to A dictionary of the English language on
CD-ROM: the first and fourth editions, by Samuel Johnson,
edited by Anne McDermott (Cambridge: CUP in association with the
University of Birmingham, 1996).
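The check described in this abstract, looking up dictionary headwords in a historical corpus to see whether they occur at all, can be sketched as a simple frequency count. This is not the method or the corpora used in the paper: the function names, the headword list and the one-sentence "corpus" are invented for illustration, and the sketch ignores the spelling variation that any real historical corpus would require handling.

```python
import re
from collections import Counter


def tokenise(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def headword_counts(headwords, corpus_text):
    """Map each headword to its number of occurrences in the corpus."""
    counts = Counter(tokenise(corpus_text))
    return {h: counts[h.lower()] for h in headwords}


def unattested(headwords, corpus_text):
    """Headwords with no occurrence in the corpus ('hard word' candidates)."""
    return [h for h, n in headword_counts(headwords, corpus_text).items()
            if n == 0]


# Invented sample data; the headwords are not drawn from the dictionaries named.
SAMPLE_TEXT = "The house stood by the river, and the house was old."
HEADWORDS = ["house", "river", "obstupefy"]

counts = headword_counts(HEADWORDS, SAMPLE_TEXT)
missing = unattested(HEADWORDS, SAMPLE_TEXT)
```

A zero count in a small corpus shows only that the word is absent from that sample, which is one reason the abstract goes on to examine the corpora themselves.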