“An hypothesis of formalization of literary data for
text analysis: a case study on Karl Kraus' writings”
Daniela Alderuccio
ENEA/UDA (Italy)
alderuccio@casaccia.enea.it
Introduction
The growing availability of literary heritage on the Web is making humanistic research easier: on the one hand it facilitates access to information sources and documents, and on the other it provides a knowledge representation of texts that can be shared and reused. One of the major problems in knowledge representation is the formalization of literary data. The main difficulty is to capture the richness of word meanings in an established form that allows automatic data processing while preserving the essence of what is represented. This challenge stems from the different natures of Computer Science and the Humanities. The former is founded on establishing a formal representation of what exists (formal languages and modeling of reality); the latter is based on interpretation, whose subjectivity escapes classification and rules. It is recognized that accuracy in literary analysis depends on cultural background and literary sensibility, but the underlying ambiguity of natural languages poses further difficulties for researchers: a specific term may have different or even contradictory meanings and interpretations, and authors frequently use different words or expressions to refer to the same meaning. By developing common formalisms, Computer Science tools aim at reaching a shareable agreement on world representation. Similarly, applying this formal approach in the literary domain may give concepts (the starting point of the analysis) an objective basis: it may allow experts to define and share a common vocabulary and to reach an agreement on word senses, thus reducing ambiguity.
In the hypothesis proposed in this paper, the use of a reference tool (such as an ontology) seems to offer a means of facing this challenging task successfully. By preventing misunderstandings in reading texts and by limiting subjectivity in their analysis, the first expected result is a better comprehension of literary phenomena; by improving the knowledge representation of a literary text, the second expected effect of formalization is the retrieval of more relevant texts for research purposes.
Application and Results
In the analysis of a literary phenomenon, some of the aspects to be considered are:
- the ambiguity of natural languages, which makes it difficult for experts to limit subjectivity in interpreting texts;
- the heterogeneity of the information sources to select (historical, cultural, geo-political), which creates the need to retrieve relevant documents for the analysis.
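As a rough illustration of how these two aspects might be addressed together, the following minimal sketch indexes documents by semantic field rather than by surface word. It is not the tool used in this study: the concepts, the word lists and the sample sentences are invented for the example, and the names SEMANTIC_FIELDS, concepts_of and build_index are hypothetical.

```python
from collections import defaultdict

# Hypothetical toy ontology: each concept (semantic field) lists the
# surface words assumed to express it. Invented for illustration only.
SEMANTIC_FIELDS = {
    "truth": {"truth", "fact", "honesty"},
    "propaganda": {"propaganda", "slogan", "phrase"},
    "press": {"press", "newspaper", "journalist"},
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    cleaned = "".join(ch.lower() if ch.isalpha() else " " for ch in text)
    return cleaned.split()


def concepts_of(text):
    """Return the concepts whose semantic field shares a word with the text."""
    tokens = set(tokenize(text))
    return {c for c, words in SEMANTIC_FIELDS.items() if tokens & words}


def build_index(documents):
    """Map each concept to the sorted ids of the documents that express it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for concept in concepts_of(text):
            index[concept].add(doc_id)
    return {c: sorted(ids) for c, ids in index.items()}


# Invented sample sentences standing in for a document collection.
DOCUMENTS = {
    "doc1": "The newspaper repeats the slogan every day.",
    "doc2": "Honesty is the first duty of a writer.",
    "doc3": "A journalist should report the fact.",
}

INDEX = build_index(DOCUMENTS)
```

In this sketch, looking up INDEX["truth"] returns doc2 and doc3, although neither sentence contains the word "truth": the shared vocabulary maps different expressions onto the same concept, which is the effect that semantic indexing aims at.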
Conclusions
The achieved results show that literary data formalization based on ontologies can improve the accuracy of literary research. By including definitions of the basic concepts in the domain (also in a machine-interpretable form), by identifying relations among them and by defining semantic fields, WordNet allows experts to share information in a domain, to provide critical notes and comments on texts, and to interpret them. Furthermore, it emerges from this study that defining the semantic field of words (by applying the definitions provided by an ontology) and indexing documents according to a semantic categorization is an effective way of representing the content of a text: the ability to bring to light word meanings that are only implicit in texts improves the retrieval of relevant documents, matching the needs of humanistic research.
References
AA.VV. “.” Information Processing & Management: An International Journal. New York: Elsevier Science Ltd, 2001. 37: .
D. Alderuccio. “Dualism Truth vs. Propaganda in Karl Kraus. Methodology for a computer-assisted literary analysis.” ENEA/University of Rome »La Sapienza«, 2000.
H. Arntzen. Karl Kraus und die Presse. Muenchen: Wilhelm Fink Verlag, 1975.
T. De Mauro. Capire le parole. Roma-Bari: Editore Laterza, 1999.
N. Guarino, R. Poli. “The role of Ontology in the Information Technology.” Int’l J. Human-Computer Studies. 1995. 43: 623-965.
M. Gruninger, M. Uschold. “Ontologies: principles, methods and applications.” Knowledge Engineering Review. The University of Edinburgh, 1996. 11: .
P. Kipphof. “Der Aphorismus im Werke von Karl Kraus.” Muenchen, 1961.
K. Kraus. Die Fackel. : Koesel Verlag, 1968.
K. Kraus. Beim Wort genommen. Passau: Koesel Verlag, 1955.
W. Mieder. “Karl Kraus und der sprichwoertliche Aphorismus.” Muttersprache. 1979. 89: 97-115.
G. A. Miller. “WordNet: a lexical data base for English.” Communications of the ACM. 1995. 38: 39-41.
G. A. Miller et al. “WordNet: An on-line lexical database.” International Journal of Lexicography. 1990. 3: .
J. F. Sowa. Knowledge representation: logical, philosophical, and computational foundations. Pacific Grove, CA: Brooks Cole Publishing Co., 2000.
E. M. Voorhees. “Natural Language Processing and Information Retrieval.” Information extraction - Towards scalable adaptable systems. Berlin: Springer Verlag, 1999.