DHQ: Digital Humanities Quarterly
2021
Volume 15 Number 1
Founding the Special Interest Group Audio-Visual in Digital Humanities: An Interview with Franciska de Jong, Martijn Kleppe, and Max Kemman
Abstract
An interview with Professor Franciska de Jong (Director at CLARIN ERIC), Dr. Martijn Kleppe (Head of Research at the KB, National Library of the Netherlands), and Dr. Max Kemman (Researcher/Consultant at Dialogic) on the founding of the ADHO Audiovisual in Digital Humanities (AVinDH) Special Interest Group. They are interviewed by Stefania Scagliola (Centre for Contemporary and Digital History), who co-founded the group and is a co-editor of this special issue.
Introduction
The special issue was initiated by the Special Interest Group Audio/Visual in Digital Humanities, a group that brings
together community participants who share an interest in audio and/or visual data. But
how did the SIG come into being? The following is an interview about why and how the
AVinDH SIG was founded. The interview was conducted on June 20th 2020 via the online
application Microsoft Teams.
Stefania: How did you sense the need for such a SIG in the context
of DH?
Franciska: We were interested in a form of dialogue with scholars
working with audiovisual data and computational methods. As a linguist who had built up
expertise in search technology to open up spoken word archives, I had already expanded my
focus from data related to the study of language to audio and video collections that are
part of the cultural heritage domain. Both Erasmus University Rotterdam and the
Netherlands Institute for Sound and Vision were keen on presenting their work stemming
from two international projects, AXES and Post-Yugoslav Voices. The best way to engage
with peers, in our view, was to organise a workshop for the audience that we envisioned.
However, the overwhelming interest and enthusiasm from the side of the presenters and the
audience was a surprise to us. We were clearly filling a gap. After having heard about the
possibility of setting up a Special Interest Group that could be endorsed by ADHO (Alliance of
Digital Humanities Organisations), we immediately took action and submitted a proposal.
Stefania: I specifically recall how one of the presenters involved
in research on sound archives confided to the audience that "at last she did not feel as
the odd one out at a DH conference". Does that mean that DH approaches to audiovisual data
had a backlog compared to textual data?
Franciska: I don't think so. I think they operated in different
circles. They presented their work at conferences on media studies, sound studies, oral
history, and computer science. They published their work in journals that stemmed from
these specific networks. If I think of the first projects that were set up in the
Netherlands to apply state-of-the-art software to open up cultural heritage archives, we
published in proceedings of conferences about software development for the study of
language, or journals about digitisation.
Max: I agree, but at the same time, when compared to the tools that
existed at the time to search for patterns in massive amounts of text, the audiovisual
realm did have a backlog. In the workshop proposal that we put together for DH2014, we
claimed that it was a matter of urgency to develop new tools for extracting information
from audiovisual archives in the same way as could be done with text, particularly given
the prospect of the exponential growth of (moving) images on the web that was envisioned
in 2014. We could, at the time, still refer to the audiovisual as a "blind" medium for
retrieval, quoting Sandom and Enser, because of the need for sequential viewing to extract
knowledge from the source (Sandom and Enser, 2001). It is remarkable how in only six years'
time this claim no longer seems valid, given the progress in Computer Vision and Speech
Retrieval.
Stefania: The central themes of the subsequent workshops seem to
reflect this progress. We tackled obstacles for the integration of AV in DH at DH2014 in
Lausanne, formally installed the SIG at DH2015 in Sydney, discussed multimodality at
DH2016 in Krakow, explored computer vision at DH2017 in Montreal, hosted a tutorial on
Distant Viewing at DH2018 in Mexico City, and set up hands-on sessions with audiovisual
research infrastructures at DH2019 in Utrecht. Could you say that the SIG had a
considerable role in this progress?
Martijn: I think that gives too much credit to the SIG, as if we had
formulated specific objectives from the start and were consciously selecting topics and
papers that reflected the state-of-the-art. Our approach was actually quite pragmatic. We
wanted to create a structure for people with similar research interests to meet, either
through the mailing list, or via the yearly conference. I think what was key to the role
of the SIG was its institutional embedding. Being endorsed by ADHO meant a secured slot
in each pre-DH conference program that could draw the attention of scholars who would
otherwise primarily share their work within their traditional mono-disciplinary scholarly
realm.
Franciska: Computational approaches to audiovisual data were gaining
momentum around that time. You could say that the SIG presented itself at the optimal
moment, coinciding with other important developments, such as national and cross-national
funding for opening up audiovisual heritage for the general audience and
scholars.
Stefania: Could you give some examples of these initiatives, and
were key people in these projects involved in our workshops?
Martijn: Well, the first key lecture was given by Prof. Andreas
Fickers, who was already a central figure in the EU project on cultural heritage of
television, EU Screen, and now runs the Centre for Contemporary and Digital History (C2DH)
in Luxembourg. Scholars working on building up large-scale research infrastructures for
humanities research, such as CLARIAH in the Netherlands, including professor of Digital
Culture Heritage Julia Noordegraaf, were well represented at all the SIG's workshops. This
also applies to the Media Ecology Project of Mark Williams from Dartmouth, and to Lauren
Tilton and Taylor Arnold's work on Distant Viewing. Yet, there is no way of telling
whether our program really reflected the state-of-the-art in this field. I recall that at
some point at DH2017 in Montreal I approached a researcher who I expected would have had
an interest in presenting at our SIG pre-conference workshop, but the reply was that he
preferred to present at a slot within the conference. This gave me food for thought about
whether our SIG served as a kind of incubator for scholars who at a later stage moved on
to a slot in the main conference.
Stefania: With regard to representativeness, you are right that we
cannot make any claim, but I can offer a rough sketch of the contributions from 2014 until
now, based on a review of the workshops at the annual DH conference. What I see is that
most papers or short talks were about innovations in technologies to extract visual or
aural features either from a single collection, from a homogeneous archive, or from an
archive with collections in different formats. Next in line was the topic of annotation in
music and audiovisual collections. The innovation in these types of papers lay in
the possibility to automatically enrich manual annotations with computer-generated
links by applying the principles of Linked Open Data. There were some papers
that focused less on tool development and more on the analysis of content, like the study
of changes in voices in news coverage or the use of colours in film. Specifically at the
workshop in Krakow, the theme of multimodality featured contributions that revolved around
education and pedagogy. There was also some interest in restoration, preservation, and
loss of data on the web. Other topics across the workshops included copyright, oral
history, art, metadata and interface design. The most striking shift in the programme of
the SIG, which in a way reflects the increasing maturity of the field, is from a space for
paper presentations to hands-on tutorials. In Mexico 2018 and Utrecht 2019, the workshop
followed more closely a traditional workshop format and offered opportunities to try out
the tools that had been envisioned in talks in the previous workshops.
Max: This may sound like the field is constantly growing within DH,
but when you step out of our AVinDH bubble and look at the statistics that Scott Weingart
collected on the previous DH conferences, you can see that it is still a niche: well
established, but with an abundance of alternative venues to present work on computational
approaches to AV. There are a lot of factors that play a role in determining the interest
of scholars working with audiovisual data to profile themselves as digital humanists.
It would really be worthwhile to conduct a more detailed study about the type of
contributions, the affiliations, the platforms where work is published, and the
professional trajectories of scholars involved in this type of DH work to get a better
sense of how AV in DH researchers are advancing the field.
Stefania: To conclude, I would like to pay tribute to our late
Canadian colleague Clara Henderson, an ethnomusicologist working at Indiana University,
who joined the SIG as a representative of the American hemisphere at DH2015 in Sydney. This
was where the AVinDH SIG was formally approved by ADHO. Her background in ethnomusicology
helped widen the scope of the SIG beyond the realm of media studies. We were deeply
saddened when we received the news in Autumn 2016 that Clara had passed away after a short
illness. Thinking of all the plans with the SIG that she had in mind and tasks that she
was willing to take up, I would like to dedicate this little thread in the immense
tapestry of DH scholarship that brings people together across the continents to her
memory.
With special thanks to Scott Weingart and Max Kemman for providing the
visualisation of AV in DH conferences and to the members of the former Steering Group
who were willing to share their memories and insights.