The collegial spirit of the project prohibits us from analyzing the dataset of the 2009 iteration in a definitive way. We do not believe that the project curators have a special relationship to the resulting dataset, since the project was conceived of and organized as an open, shared one, and it could also be argued that we are too close to the project. We were encouraged that participants like Stéfan Sinclair applied analytical tools to the RSS feed on the day in question, a sign that the community felt they had the right to analyze the results as they were happening. Nonetheless, as organizers we have analyzed the project in order to understand what worked and what didn't. We share this analysis in that spirit and not as a definitive analysis of the community.
3.1 Answering the Question
The first question we should be asking is whether the project has advanced our knowledge of what digital humanists do and what the nature of the field is. The answer is not simple because the project has become part of the field it describes. Yes, the majority of participants posted entries about what they did on March 18th, but there has also been a level of critical reflection on the project that makes clear that the entries can't be trusted as an “objective” record of participant activity.
[8] As a result, our view of what it is that we are doing has shifted. We went from a simple idea for a project that would be a form of community auto-ethnography (and an experiment in social media in the humanities), to wondering if we haven't stumbled upon an alternative form of (un)conference where people gather on their own time for a day to discuss things without paying to attend timed events. While most participants still do document their day, there is a degree of self-reflection and disciplinary reflection that makes the project more than just a community documenting everyday practices.
What is clear is that the “How do you define Humanities Computing / Digital Humanities?” feature of the registration process is probably a better answer to the question of what the digital humanities is than the full dataset. This feature, initially designed to help us filter applicants, has proven to be a useful collection of short definitions for all to use.
We have also had to be honest with ourselves about our motivation, especially as we annually consider whether we want to go through with another iteration when we have no specific funding for the project. In the Fall of 2008 when we conceived of it, we thought of it not only as a way to answer a question, but also as a way to experiment with crowdsourcing in the humanities inexpensively – without having to get a grant, which is what we usually do in our field. As such, it has successfully answered a different question, i.e. whether we could design a crowd-sourced discussion about the digital humanities.
[9] Each year there have been more participants and a modicum of attention. In its success, the project may be taking on a life of its own.
3.2 Participation
A second type of analysis is to look at participation statistics. Participants for the Day of DH 2009 were primarily recruited in two ways: through direct invitation and through an application process advertised on the Humanist email list. Word of mouth filled in the rest of the group. The most common occupations among participants were teaching roles, such as professors and instructors, and research roles. Also common were administrative heads, programmers, and librarians. However, there was a noticeable lack of students within the group. Despite our efforts to encourage a diverse crowd of participants, students did not feel confident enough in the value of their experiences to want to participate in the project.
Reflecting the linguistic and geographic bias of the organizers, Canada was the most represented country (which is not surprising given that the project was organized out of Canada), followed by the USA, Great Britain, Ireland, and Germany.
[10]
Country | Number of Participants |
Canada | 30 |
United States | 23 |
Great Britain | 13 |
Ireland | 5 |
Germany | 4 |
Slovenia | 1 |
Australia | 3 |
Luxembourg | 1 |
Italy | 1 |
Sweden | 1 |
Poland | 1 |
Switzerland | 1 |
Netherlands | 1 |
Total | 85 |
Table 1.
National Distribution
We can’t escape concluding that, despite the open call and our efforts, Anglophones were much more likely to participate. This could be due to an Anglo-centric bias in the digital humanities. It could be that people doing this sort of work elsewhere don’t associate with the term digital humanities, preferring something like informatics. It could be that social networking projects tap into existing networks despite the potential for broader distribution. Whatever the cause, it is one of the limitations of the project and one which we are addressing (see below).
The participants described their 2009 “Day in the Life of Digital Humanities” with a total of 668 posts, 434 images, and 181 comments. The distribution of these items among different blogs is illustrated in the above graph, where the x-axis represents the count of items in a blog and the y-axis represents how many blogs had each count. This basic analysis reveals that the largest group of blogs (twenty-six) contained about six posts each. Many blogs had ten or fewer posts, but a few had as many as eighteen.
The graph also shows that several blogs had a fairly large number of images. However, as many as twenty-two blogs included no images. We don't know whether exclusively textual entries were intended or if any technical problems discouraged participants from sharing images. The number of comments per blog follows a similar pattern. Perhaps in a future event, something can be done to encourage more images and comments.
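The kind of distribution described above, in which each item count is paired with the number of blogs that had that count, can be tabulated in a few lines. The following is a minimal sketch and not the script used for the project; the function name `count_distribution` is hypothetical and the `posts_per_blog` values are invented for illustration.

```python
from collections import Counter


def count_distribution(items_per_blog):
    """Map each item count (x-axis) to the number of blogs with that count (y-axis)."""
    return dict(sorted(Counter(items_per_blog).items()))


# Hypothetical data: the number of posts in each of eight blogs.
# These values are invented and are NOT taken from the Day of DH dataset.
posts_per_blog = [6, 6, 3, 10, 6, 18, 0, 3]

distribution = count_distribution(posts_per_blog)
for count, blogs in distribution.items():
    print(f"{count:>2} posts: {blogs} blog(s)")
```

The same function could be applied to per-blog image or comment counts; a zero key would then correspond to blogs with no images or no comments.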
3.3 The Tagging
While keeping the tags applied by participants, we decided after the event to add a common set of classification terms to the data. Before this could be done, a consistent system had to be developed, beginning with the decision of whether to pursue a controlled vocabulary or free text. WordPress uses its own terms for both: 'categories' for controlled vocabularies and 'tags' for free text. We decided against free-text tagging, as the dataset and the range of subjects covered within the Day of DH 2009 were too small to benefit from it. Instead, a controlled vocabulary was created.
[11]
The first step in establishing a controlled vocabulary is determining a level of specificity [Taylor 2005]. Should categories be very specific or more general? While the former offers a finer description of its text, the latter is less time-consuming and less prone to error. With the Day of DH 2009 tag set, we pursued a more general set of umbrella terms. Going through a partial set of results, we identified overarching concepts until the set reached saturation. In the first draft, concepts were hierarchical, with relatively abstract top-level headings such as “actions” and “events”. As we began to work with this approach, we realized that it was more specific and complex than the project required, and by the final version the hierarchies were mostly flattened and considerably more direct. Finally, it was decided that terms should define explicit concepts and avoid those which are implicit. For example, location and time were categorized only when they were part of the content. In these and other cases, for the indexer to extrapolate data not in the text would be to risk introducing inconsistencies. The choice of broad categories appears to have been an appropriate one, as we were able to keep the classification process in the purview of just one coder, which benefited the consistency of the task.
Once the category vocabulary had been established, a single member of the team tagged each post in the interests of consistency. Several additional categories of tags were created as the range of activities explored by the bloggers emerged. Initially, the process of categorizing the individual posts seemed straightforward within the context of the controlled vocabulary. However, it soon became clear that a controlled vocabulary, while needed in order to structure our dataset for eventual export and quantitative analysis, did not fully encompass the complexities inherent in the average Day of a Digital Humanist. The main issue with using the controlled vocabulary for tagging the blog entries was interpretive (cf. [McCarty 1991]; [Hockey 2000]). Though a category tag gives the impression of being an objective label of any particular content, the tag chosen to represent any particular post is colored by the tagger’s personal biases, unconscious or otherwise. In some instances, the labels applied by the Day of DH 2009 research team were different from those applied by the researchers themselves.
This brings up the question of whether one interpretation is more valid than another. In the case of a single blog, an activity labeled “learning” by a participant but labeled “research” by the Day of DH 2009 organizing group is not a large issue. The externally applied tag could easily be changed to accommodate the author’s intended interpretation of his or her own work. However, since hundreds of posts were tagged, a consistent interpretation of activities was necessary to control the quality of the exported data, and in some cases it therefore superseded the interpretation of the author.
In addition to the interpretive gap between the original author and the coder, there is the possibility that a tag applied to a post is going to bias others’ interpretation of that post. If the Day of DH 2009 tagger tags a post “Research”, then what influence does that category tag have on the next reader? Would that person have interpreted that post as representative of another activity entirely? Ultimately, it was decided that the interpretive issues with tagging were all appropriate for the project. In the time since, we have found that our subsequent work with the data has not been limited by the choices made in classification. The Humanities as a whole encompass disciplines in which multiple interpretations are usual and welcome. We acknowledge that the system of tagging in place for the Day of DH 2009 posts is only one possible interpretive framework given the range of activities engaged in by the digital humanities community, and we have been refining it with each iteration.
3.4 Analyzing the Tagged Dataset
After the tagging, we analyzed the dataset using Voyeur Tools (http://voyeurtools.org) and by counting the distribution of tags. Here are the 20 highest frequency content words:
digital | 402 |
day | 398 |
work | 362 |
I'm | 228 |
humanities | 256 |
time | 253 |
project | 243 |
students | 189 |
research | 177 |
today | 165 |
working | 154 |
meeting | 128 |
class | 121 |
people | 119 |
DH | 116 |
home | 101 |
good | 98 |
text | 97 |
data | 96 |
Table 2.
List of 20 high-frequency content words
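A list of high-frequency content words of this kind can be approximated by counting words after removing common function words. The following is a minimal sketch of that general technique, not a description of how Voyeur Tools computes its lists; the stop list, the tokenizing pattern, the function name `content_word_frequencies`, and the sample sentence are all assumptions made for illustration.

```python
import re
from collections import Counter

# A tiny illustrative stop list (an assumption); a real analysis would use a longer one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "on", "my", "is", "for", "with"}


def content_word_frequencies(text, top_n=20):
    """Return up to top_n (word, count) pairs for words not in STOP_WORDS."""
    # Keep apostrophes inside words so that contractions such as "I'm" stay whole.
    words = re.findall(r"[A-Za-z]+(?:['’][A-Za-z]+)*", text)
    # Lowercase every word so that "The" and "the" are counted together.
    counts = Counter(w.lower() for w in words if w.lower() not in STOP_WORDS)
    return counts.most_common(top_n)


# Invented sample text, not drawn from the Day of DH posts.
sample = (
    "I'm working on the project today. "
    "The project meeting is in the digital humanities lab."
)
top = content_word_frequencies(sample, top_n=3)
print(top)
```

Note that this sketch lowercases everything, so forms such as “I'm” and “DH” would appear as “i'm” and “dh”; whether to preserve case is a design choice that affects the resulting counts.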
It shouldn’t surprise us that “day”, “digital”, and “humanities” are among the most frequently used words. Likewise, we would expect entries describing what people are doing to use “I’m” often (as in “I’m doing X or Y”). Similarly, the frequency of “time” can be ascribed to the role of time management in describing a day of work, especially if you have to take time to blog about what you are doing in addition to doing it.
From a disciplinary perspective, what is interesting is the importance of the “project”. The frequency of “project” suggests we conceive of our work as organized around projects. This is supported by the tagging. “Project Work” (DDH-ProjectWork) was the most popular tag, being applied to 209 entries. Project work and discourse around projects are among the distinguishing features of digital humanities work; we doubt philosophers would use the word as frequently to describe their work.
DDH-EarlyMorning | 6 tags |
DDH-Morning | 189 tags |
DDH-Afternoon | 167 tags |
DDH-Evening | 62 tags |
DDH-Night | 38 tags |
DDH-AllDay | 11 tags |
Table 3.
Time of Day Tags
Returning to time, we were surprised by how many entries were tagged Evening or Night. Based on the data collected, a large number of participants work at home in the evening.
DDH-Home | 112 tags |
DDH-Office | 98 tags |
DDH-Outside | 25 tags |
DDH-Class | 22 tags |
DDH-CoffeeHouse | 19 tags |
DDH-Lab | 18 tags |
This correlates with the high incidence of the Home tag. Note also the importance of the Coffee House which surpasses the lab. We would expect the computer lab to be more important; perhaps coffee is the most important technology in the digital humanities.
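To put the evening and home figures in proportion, the tallies can be converted into shares. The sketch below is illustrative only: the counts are copied from Table 3 and from the location tallies above, while the function name `tag_shares` is hypothetical. The shares are proportions of tag applications, which need not equal proportions of posts or participants.

```python
# Time-of-day tag counts, copied from Table 3.
time_of_day = {
    "DDH-EarlyMorning": 6,
    "DDH-Morning": 189,
    "DDH-Afternoon": 167,
    "DDH-Evening": 62,
    "DDH-Night": 38,
    "DDH-AllDay": 11,
}

# Location tag counts, copied from the location tallies above.
location = {
    "DDH-Home": 112,
    "DDH-Office": 98,
    "DDH-Outside": 25,
    "DDH-Class": 22,
    "DDH-CoffeeHouse": 19,
    "DDH-Lab": 18,
}


def tag_shares(counts):
    """Convert raw tag counts into each tag's share of the group total."""
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.items()}


time_shares = tag_shares(time_of_day)
location_shares = tag_shares(location)

# Evening and Night together account for 100 of the 473 time-of-day tags.
after_hours = time_shares["DDH-Evening"] + time_shares["DDH-Night"]
print(f"Evening and Night: {after_hours:.1%} of time-of-day tags")
print(f"Home: {location_shares['DDH-Home']:.1%} of location tags")
```

On these figures, roughly a fifth of the time-of-day tags fall in the evening or at night, and Home is the single most frequent location tag.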
DDH-ProjectWork | 209 tags |
DDH-AdminService | 84 tags |
DDH-Email | 81 tags |
Table 5.
Administration and Project Tags
As for what we do with our time, project work is important, as mentioned, but our time is also taken up by other administrative tasks including email and service tasks. This is not to devalue administration and service, but to acknowledge their importance.
DDH-Meeting | 91 tags |
DDH-Conference | 32 tags |
DDH-Class | 22 tags |
DDH-Lab | 18 tags |
We think it is also important to note how social digital humanists are. Contrary to the image of the solitary humanist, digital humanists spend a lot of time with other people in meetings, in conferences (or planning them), in class, and in labs. That is not to say that we don’t also work alone (see Office and Home above), but we spend a significant amount of time with others. This could be connected to project and administrative work, as digital projects typically involve multiple people with different skills who need to communicate and meet. Here’s a typical description, from one participant: “this a regular and generally argumentative internal review meeting at which people from across the department report on projects underway.” Attitudes towards meetings are, however, mixed. As another puts it, after reading the posts of others, “People go to way, way too many meetings.”
DDH-Reflecting | 113 tags |
DDH-Teaching | 100 tags |
DDH-Research | 84 tags |
DDH-Reading | 65 tags |
DDH-Programming | 55 tags |
DDH-Writing | 54 tags |
DDH-Blogging | 31 tags |
DDH-DataCollection | 19 tags |
DDH-Learning | 16 tags |
DDH-Editing | 15 tags |
DDH-Gaming | 5 tags |
If we look at tags for types of academic activity we see that digital humanists do what we would expect humanists to do including Reflecting, Teaching, and Research. The high count for Reflecting is, as we discuss later in this paper, due in part to the nature of the Day of DH project, but it is also a paradigmatically humanist response. Humanists will be relieved to know that Reading, Writing, and Editing are still done in the digital humanities, but Programming, Blogging, Data Collection, and Gaming are new activities for the humanities. It is surprising that programming would show up more often than writing, even given the role of computing.
Lastly, we share some anecdotal thoughts coming from the Day of DH and the correspondence around the project. Coming from a university that has a number of DH projects, faculty, and graduate students (the University of Alberta has an MA in the field), we forget how lonely it can be to do digital humanities elsewhere. Many who do computing in the humanities are alone in their university and feel isolated both in their department and from the field. We were struck by how many people told us in correspondence how they welcomed the Day of DH because it let them be part of a larger research community for one day. It also gave many a feeling of visibility in and belonging to a field that can increasingly be seen as exclusive. While there was an application to participate, we didn’t turn down anyone who understood what they were getting into. This meant that many people who felt outside the discipline now felt part of it for a day and part of building the discipline’s self-understanding. This is good, as the digital humanities is a field that has always thought of itself as inclusive but can fail to live up to its self-image.
[12]