“Computer-Aided Acquisition of Language Teaching
Materials from Corpora”
Svetlana Sheremetyeva
New Mexico State University, USA
Awareness of the domain-specific linguistic peculiarities of expository
texts helps students develop reading and writing competency in terms of
genre literacies.
Support for this point of view comes from the analysis of academic written
genres, competing demands for limited resources, the tyranny of scheduling
and from graduate students' verbal protocols about their reading process
(Sengupta 1997).
The genre literacy, or sublanguage, approach to instructed SLA advocated in
this paper exploits the lexical, morphological, syntactic and semantic
restrictions on the specialized languages that experts use for communication
in certain fields of knowledge or in particular types of texts (technical
and scientific articles, instructions, installation manuals, etc.).
Notions of sublanguage distinctiveness rely on linguistic knowledge
concerning different kinds of sublanguage regularities and restrictions
(Kittredge and Lehrberger 1982). Sublanguages are special subsystems of a
natural language with restricted vocabulary and grammar which, on the one
hand, share some properties with a language as a whole and, on the other
hand, are characterized by some deviations from "general" language.
As far as language instruction is concerned, neither the content of this
knowledge nor the way to elicit sublanguage knowledge has a single agreed
definition. Despite a long-standing interest in the analysis of written
genres (sublanguages), little research has focused on how to put genre
specificity to practical use in language instruction.
This presentation explores critical issues in the selection of an appropriate
methodological framework for the analysis of profession-related texts. It
suggests a sublanguage analysis method that can form the basis for
developing a system of typological parameters useful in the acquisition of
teaching materials, and thus for tuning language instruction to the needs of
professional communication.
To describe a particular sublanguage it is necessary to study both the laws
underlying natural language phenomena and the laws that make a sublanguage
differ from the language as a whole. Sublanguages can be described in many
ways. Language instruction is influenced by such practical parameters as the
scope and nature of the vocabulary, grammar specificity, potential for
ambiguity, and correlations between lexis and grammar, if any, which can and
should be discovered on the basis of corpus analysis (Biber et al. 1998;
Wichmann et al. 1997).
This study focuses on verbs, as they are central to the structure of a
sentence and consequently to text structure (Levin 1993; Aarts and Meyer
1995). The reason is that in professional reading most problems derive not
from technical nouns and noun expressions, which are relatively easy to find
in specialized dictionaries, but from grammar, which is often characterized
by extended sentences with long, telescopically embedded structures.
The current study also proposes and tests a sublanguage-specific hypothesis
of correlation between lexical meaning, morphological representation (tense,
voice, finiteness) and syntactic realization (subject, object, predicate,
attribute, etc.) of a particular verb in a sublanguage.
Material for the research includes five corpora of 50,000 words each from
different technical sublanguages: aerospace engineering, automobile
engineering, mechanical engineering, technology engineering and patents. The
sample corpora are taken from four technical journals (Space Flight, Automobile and Tractor,
Materials Engineering, and Machine Design) and a corpus of US patent claims.
The main method of analysis is a computer-aided corpus-based combination of
qualitative and quantitative (statistical) techniques applied to a
pre-tagged corpus, which proved to be useful for linguistic knowledge
elicitation (Sheremetyeva 1998). Tagging, done manually by trained
linguists, codes morphosyntactic realizations of sublanguage verbs. For
example, in the sentence "Making_TIA this apparatus they used_2IA a new
technology", the tag TIA means that the verb "make" is used as an adverbial
modifier in the form of the Present Participle, while the tag 2IA shows that
the verb "use" is realized as a predicate in the form of the Past Simple Active.
This methodology allows for a standard automatic frequency count procedure to
be applied to provide:
- a) a verb inventory and its size in terms of verb occurrences;
- b) a verb morphology and grammar inventory and their sizes in terms of occurrences of specific values of tense, aspect, voice, finiteness/nonfiniteness and syntactic functions as well as in terms of co-occurrence of grammatical features (for example, in the sublanguage of automobile engineering the most frequently used nonfinite realization of verbs is the Past Participle in the function of attribute, while no realization of verbs as Gerunds or Infinitives in the function of subject was found);
- c) an inventory of lexical and morphosyntactic correlations (for example, in the sublanguage of automobile engineering the verb "use" is most often realized as the Past Participle in the function of attribute while the most frequent realization of the same verb in the aerospace engineering sublanguage is the Present Participle in the function of adverbial modifier).
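The frequency counts in a) to c) can be illustrated with a minimal Python sketch. It is not the tooling used in the study. It assumes that tagged tokens have the form word_TAG, as in the example sentence above. The lemma table, the function names and the second toy sentence are hypothetical and serve only as illustration.

```python
import re
from collections import Counter, defaultdict

# Assumed token format: word form, underscore, tag (as in "Making_TIA").
TAGGED_TOKEN = re.compile(r"\b([A-Za-z]+)_([A-Z0-9]+)\b")

# Hypothetical lemma table; a real corpus would need full lemmatization
# or lemma annotation supplied by the taggers.
LEMMAS = {"making": "make", "used": "use", "using": "use"}


def tagged_verbs(text):
    """Yield (lemma, tag) for every tagged token found in the text."""
    for form, tag in TAGGED_TOKEN.findall(text):
        key = form.lower()
        yield LEMMAS.get(key, key), tag


def build_inventories(corpus):
    """Count verbs (a), tags (b) and verb/tag co-occurrences (c)."""
    verb_counts = Counter()
    tag_counts = Counter()
    per_verb = defaultdict(Counter)
    for sentence in corpus:
        for lemma, tag in tagged_verbs(sentence):
            verb_counts[lemma] += 1
            tag_counts[tag] += 1
            per_verb[lemma][tag] += 1
    return verb_counts, tag_counts, per_verb


def most_frequent_realization(per_verb, lemma):
    """Return the most frequent tag of a verb and its relative frequency."""
    tag, count = per_verb[lemma].most_common(1)[0]
    return tag, count / sum(per_verb[lemma].values())


# Toy corpus: the first sentence is the example from the text,
# the second is invented for illustration only.
toy_corpus = [
    "Making_TIA this apparatus they used_2IA a new technology",
    "Using_TIA the same method they used_2IA a lighter alloy",
]

verb_counts, tag_counts, per_verb = build_inventories(toy_corpus)
print(verb_counts.most_common())
print(tag_counts.most_common())
print(most_frequent_realization(per_verb, "use"))
```

Running the same counts separately on each sublanguage corpus would give the kind of cross-sublanguage comparison described in c).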
Conclusions
The paper presents a computer-aided methodology for selecting teaching materials that optimize students' reading and writing competencies in terms of genre literacies, together with the results of applying it to the material of four technical sublanguages. The results of the study show "deviations" of every sublanguage from the general language and from each other. They also confirm that there exists a correlation between the lexical meanings of many sublanguage verbs and their morphosyntactic realizations. These deviations can be used for selecting professionally oriented language teaching materials that foster language proficiency development most effectively. The approach was tested and proved to be very useful at the Department of Foreign Languages of South Ural State University (Russia). It is expected to be portable to other sublanguages and can be used to address both theoretical and practical issues in applied linguistics.
References
The Verb in Contemporary English: Theory and Description. Ed. B. Aarts and Ch. F. Meyer. Cambridge: Cambridge University Press, 1995.
D. Biber, S. Conrad and R. Reppen. Corpus Linguistics: Investigating Language Structure and Use. Cambridge: Cambridge University Press, 1998.
R. Kittredge and J. Lehrberger. Sublanguage: Studies of Language in Restricted Domains. Berlin, 1982.
B. Levin. English Verb Classes and Alternations. Chicago: University of Chicago Press, 1993.
S. Sengupta. “Academic reading skills for L2 learners: Does teaching selective reading help?” Proceedings of the Annual Conference of American Association for Applied Linguistics. Seattle, March 13-17, 1997.
S. Sheremetyeva. “Acquisition of Language Resources for Special Applications.” Proceedings of the workshop Adapting Lexical and Corpus Resources to Sublanguages and Applications, in conjunction with The First International Conference on Language Resources and Evaluation, Granada, Spain, May 1998.
Teaching and Language Corpora. Ed. A. Wichmann, S. Fligelstone, T. McEnery and G. Knowles. New York: Addison Wesley Longman Inc., 1997.