In his keynote speech “From Punched Cards to Treebanks: 60 Years of Computational Linguistics”,
delivered at the Eighth International Workshop on Treebanks and Linguistic Theories (Milan, 2009), Busa sketched three
main types of informatics currently in use: “documentaristic” informatics, comprising all the
informatic services that allow efficient “information retrieval”; “editorial” informatics, referring to the wide range of “multimedia” devices for reading books,
watching films, and browsing the Internet; and “hermeneutic informatics”, which concerns “computerized text analysis, or
language hermeneutics, i.e. interpretation, […] of all our ways of questioning the whys of language” [Busa 1999, 5].
The last of these was the type of informatics to which Busa devoted his
inexhaustible attention throughout his life. The focus on language, according to
him, is strictly required by the nature of the relationships between man and
computers, which interact by means of specific programming codes. Moreover, “the computer allows and exigently
demands, as its specific capacities, an exhaustive, detailed, deep,
quantitative knowledge, derived from huge amounts of natural texts” [Busa 1999, 6].
As a consequence, it is arguable that computers and human beings,
technological devices and the humanities are not competitors or antagonists but
potential allies, insofar as both are “human expressions” [Busa 1999, 6].
Nevertheless, what do we really know about the language (our language)
on which human communication is based and by which it is conveyed? Are we really
aware of the intrinsic logic, the inner dynamism, and the psychological
implications that are set in motion every time we communicate? Roberto Busa
spoke of the “language that [is] unknown” [Busa 1999, 6] [3],
meaning that establishing an interaction with the computer/machine implies a
more profound level of language awareness than that to which we are accustomed.
This heightened awareness is necessary in order to bring established
philological and linguistic methods into the “new qualitative dimensions” [Busa 1980]
made available by informatics. In the light of this, applying computer
methods to the humanities “can help us to be more humanistic than before” [Busa 1980, 89]
because it leads us, first of all, on an inner journey along the rational paths
triggered by linguistic expression. This conviction mirrors the current debate
regarding the necessity of an epistemological reflection on the making of the
digital humanities: for example, Stephen Ramsay and Geoffrey Rockwell have
recently stated that “the understanding of underlying
theoretical claims is the sine qua non of humanistic enquiry” [Ramsay and Rockwell 2012].
Roberto Busa would certainly have endorsed this view and reasserted
that these “underlying theoretical claims” primarily involve the dynamics of language and linguistic issues.
More specifically, these underlying implications concern meaning, the “semantics” of words and sentences, which
cannot be attained merely by producing a quantity of data.
Busa claimed that “we do not speak in words but in
sentences. A sentence has a global meaning which is not the pure sum of
the values of its single components. The heart of this problem is
whether we are able to formalise the global meaning of sentences with
something less than the whole sentence itself; in other words, whether
we can succeed in identifying in each sentence something which can be
taken as characteristic of its global meaning” [Busa 1980, 88].