DHQ: Digital Humanities Quarterly
Volume 15 Number 1

Founding the Special Interest Group Audio-Visual in Digital Humanities: An Interview with Franciska de Jong, Martijn Kleppe, and Max Kemman


An interview with Professor Franciska de Jong (Director at CLARIN ERIC), Dr. Martijn Kleppe (Head of Research at the KB, National Library of the Netherlands), and Dr. Max Kemman (Researcher/Consultant at Dialogic) on the founding of the ADHO Audiovisual in Digital Humanities (AVinDH) Special Interest Group. They are interviewed by Stefania Scagliola (Centre for Contemporary and Digital History), who co-founded the group and is a co-editor of this special issue.


The special issue was initiated by the Special Interest Group Audio/Visual in Digital Humanities, a group that brings together community participants who share an interest in audio and/or visual data. Yet how did the SIG come into being? The following is an interview about why and how the AVinDH SIG was founded. The interview was conducted on June 20th, 2020, via the online application Microsoft Teams.
Stefania: How did you sense the need for such a SIG in the context of DH?
Franciska: We were interested in a form of dialogue with scholars working with audiovisual data and computational methods. As a linguist who had built up expertise in search technology to open up spoken word archives, I had already expanded my focus from data related to the study of language to audio and video collections that are part of the cultural heritage domain. Both Erasmus University Rotterdam and the Netherlands Institute for Sound and Vision were keen on presenting their work stemming from two international projects, AXES and Post-Yugoslav Voices. The best way to engage with peers, in our view, was to organise a workshop for the audience that we envisioned. However, the overwhelming interest and enthusiasm from the presenters and the audience came as a surprise to us. We were clearly filling a gap. After hearing about the possibility of setting up a Special Interest Group that could be endorsed by ADHO (Alliance of Digital Humanities Organisations), we immediately took action and submitted a proposal.
Stefania: I specifically recall how one of the presenters involved in research on sound archives confided to the audience that "at last she did not feel as the odd one out at a DH conference". Does it mean that DH approaches to audiovisual data had a backlog compared to textual data?
Franciska: I don't think so. I think they operated in different circles. They presented their work at conferences on media studies, sound studies, oral history, and computer science. They published their work in journals that stemmed from these specific networks. If I think of the first projects that were set up in the Netherlands to apply state-of-the-art software to open up cultural heritage archives, we published in proceedings of conferences about software development for the study of language, or journals about digitisation.
Max: I agree, but at the same time, when compared to the tools that existed at the time to search for patterns in massive amounts of text, the audiovisual realm did have a backlog. In the workshop proposal that we put together for DH2014, we claimed that it was a matter of urgency to develop new tools for extracting information from audiovisual archives in the same way as could be done with text, particularly given the prospect of the exponential growth of (moving) images on the web that was envisioned in 2014. We could, at the time, still refer to the audiovisual as a "blind" medium for retrieval, quoting Sandom and Enser, because of the need for sequential viewing to extract knowledge from the source (Sandom and Enser, 2001). It is remarkable how, in only six years' time, this claim no longer seems valid given the progress in Computer Vision and Speech Retrieval.
Stefania: The central themes of the subsequent workshops seem to reflect this progress. We tackled obstacles for the integration of AV in DH at DH2014 in Lausanne, formally installed the SIG at DH2015 in Sydney, discussed multimodality at DH2016 in Krakow, explored computer vision at DH2017 in Montreal, hosted a tutorial on Distant Viewing at DH2018 in Mexico City, and set up hands-on sessions with audiovisual research infrastructures at DH2019 in Utrecht. Could you say that the SIG had a considerable role in this progress?
Figure 1. 
Number of works accepted at the ADHO DH conferences between 2013 and 2018 related to A/V and Multimedia. Figure provided by Scott Weingart.
Martijn: I think that gives too much credit to the SIG, as if we had formulated specific objectives from the start and were consciously selecting topics and papers that reflected the state-of-the-art. Our approach was actually quite pragmatic. We wanted to create a structure for people with similar research interests to meet, either through the mailing list or via the yearly conference. I think what was key for the role of the SIG was its institutional embedding. Being endorsed by ADHO meant a secured slot in each pre-DH conference program that could draw the attention of scholars who would otherwise primarily share their work within their traditional mono-disciplinary scholarly realm.
Franciska: Computational approaches to audiovisual data were gaining momentum around that time. You could say that the SIG presented itself at the optimal moment, coinciding with other important developments, such as national and cross-national funding for opening up audiovisual heritage for the general audience and scholars.
Stefania: Could you give some examples of these initiatives, and were key people in these projects involved in our workshops?
Martijn: Well, the first key lecture was given by Prof. Andreas Fickers, who was already a central figure in the EU project on cultural heritage of television, EU Screen, and now runs the Centre for Contemporary and Digital History (C2DH) in Luxembourg. Scholars working on building up large-scale research infrastructures for humanities research such as CLARIAH in the Netherlands, including professor of Digital Culture Heritage Julia Noordegraaf, were well represented at all the SIG's workshops. This also applies to the Media Ecology Project of Mark Williams from Dartmouth, and to Lauren Tilton and Taylor Arnold's work on Distant Viewing. Yet, there is no way of telling whether our program really reflected the state-of-the-art in this field. I recall that at some point at DH2017 in Montreal I approached a researcher who I expected would have had an interest in presenting at our SIG pre-conference workshop, but the reply was that he preferred to present in a slot within the conference. This gave me food for thought about whether our SIG served as a kind of incubator for scholars who at a later stage moved on to a slot in the main conference.
Stefania: With regard to representativeness, you are right that we cannot make any claim, but I can offer a rough sketch of the contributions from 2014 until now, based on a review of the workshops at the annual DH conference. What I see is that most papers or short talks were about innovations in technologies to extract visual or aural features either from a single collection, from a homogeneous archive, or from an archive with collections in different formats. Next in line was the topic of annotation in music and audiovisual collections. The innovation in these types of papers lay in the possibility to automatically enrich manual annotations with computer-generated links by applying the principles of Linked Open Data. There were some papers that focused less on tool development and more on the analysis of content, like the study of changes in voices in news coverage or the use of colours in film. Specifically at the workshop in Krakow, the theme of multimodality featured contributions that revolved around education and pedagogy. There was also some interest in restoration, preservation, and loss of data on the web. Other topics across the workshops included copyright, oral history, art, metadata, and interface design. The most striking shift in the programme of the SIG, one that in a way reflects the increasing maturity of the field, is from paper presentations to hands-on tutorials. In Mexico City in 2018 and Utrecht in 2019, the workshop followed a traditional workshop format more closely and offered opportunities to try out the tools that had been envisioned in talks in the previous workshops.
Max: This may sound like the field is constantly growing within DH, but when you step out of our AVinDH bubble and look at the statistics that Scott Weingart collected of the previous DH conferences, you can see that it is still a niche, well established, but with an abundance of alternatives to present work on computational approaches to AV. There are a lot of factors that play a role in determining the interest of scholars working with audiovisual data to profile themselves as digital humanists.
It would really be worthwhile to conduct a more detailed study about the type of contributions, the affiliations, the platforms where work is published, and the professional trajectories of scholars involved in this type of DH work to get a better sense of how AV in DH researchers are advancing the field.
Stefania: To conclude, I would like to pay tribute to our late Canadian colleague Clara Henderson, an ethnomusicologist working at Indiana University, who joined the SIG as a representative of the American hemisphere at DH2015 in Sydney. This was where the SIG AVinDH was formally approved by ADHO. Her background in ethnomusicology helped widen the scope of the SIG beyond the realm of media studies. We were deeply saddened when we received the news in Autumn 2016 that Clara had passed away after a short illness. Thinking of all the plans for the SIG that she had in mind and the tasks that she was willing to take up, I would like to dedicate to her memory this little thread in the immense tapestry of DH scholarship that brings people together across the continents.
Figure 2. 
Clara Henderson's photo on her Twitter account.
With special thanks to Scott Weingart and Max Kemman for providing the visualisation of AV in DH conferences and to the members of the former Steering Group who were willing to share their memories and insights.

Works Cited