The glamour of print
There is no doubt that print sources, as opposed to digital ones, still carry
immense cultural cachet. As this article will explore in detail, many scholars
prefer to cite the print version of a work, even when they have only seen a
digital surrogate, in part because of this perceived prestige. We shall see that
such practice is problematic in a number of ways, and causes difficulties even
with resources which are widely used and respected. For example, reference
works such as the Oxford English Dictionary (OED) and Encyclopedia Britannica
(particularly the 11th edition of 1910-11) are viewed as paragons of
scholarship. Those who have worked
in reference publishing (such as the authors of this paper) may have a slightly
more nuanced view: mistakes and misjudgements inevitably occur and a large pool
of contributors creates problems of consistency. The larger the project (the OED
runs to 20 volumes plus supplements as of last printing [2]), the more difficult
amendments, at least in the print copies, become.
An example of this unevenness is the OED entry for hubris, which falls far below
its usual standards. The definition is “Presumption, orig. towards the gods;
pride, excessive self-confidence” [OED 1933]. This is flatly contradicted by the
first quotations, which have nothing to do with gods. The etymology, “<Greek
ὕβρις”, is short by OED standards and inadequate: which parts of the definition
equate to Greek ὕβρις? All of them? Arguably none of them do, in fact. Liddell
and Scott’s Greek Lexicon, still the standard dictionary of classical Greek for
English speakers, gives a number of meanings of ὕβρις: “wanton violence;
insolence; lust, lewdness; (of animals) violence; an outrage; violation, rape;
serious injury to the person; a loss by sea; an overbearing man”. The sense
development from Greek to English has been ignored by the OED, except insofar as
it has confused the definer.
A much better treatment of hubris is to be found in Wikipedia [Wikipedia 2016].
Here the complexities of the term are addressed much more fully [3]. The OED
entry was written in 1933 and, it seems, has not been revised since, other than
to add some newer citations. There is no doubt that when the OED team comes
around to revising hubris they will produce a superior entry [4]. However, it is
likely that there will never be another print edition of the OED [Daily
Telegraph 2014], so any improved hubris will never be citable in a print format.
Advocates of
print citation over digital citation would be condemned by their stated
principles to forever cite a piece of work that is flawed.
It is clear that we are at a critical juncture in the culture of citation.
Reference sources and journals are increasingly becoming online only, or exist
online in versions that expand on their print counterparts. Yet, for reasons
this paper will explore,
there is still a widespread aversion to the citing (as opposed to the using) of
digital resources. It remains a brave researcher who will cite Wikipedia in
preference to the OED.
In this article we will explore the resistance to citing digital sources that is
still widespread in the humanities. We have found that this resistance is
particularly prevalent where a print version of the source also exists; for this
reason we have focused on two digital projects which can be consulted online but
whose citations can be silently “converted” to a print-only
reference. As well as presenting the results of several surveys of users, we
will suggest some solutions to the culture of non-citation which we
uncovered.
A neglected topic
Despite our discovery of this pervasive culture of non-citation, we have been
able to find little mention of it as a research topic. It is noticeable, in
reviewing the literature on digital citation, that cross-disciplinary studies of
the subject tend to exclude humanities subjects [5], preferring to measure
disciplines from the sciences and social sciences. One such study says,
“Humanities papers were excluded from the analysis because of their long
citation windows and high uncitedness rates” [Lozano et al. n.d.]. Effectively,
digital citation rates in the humanities are so low as not to register in
comparison with science and social sciences: “The very low percentage of
articles cited at least once may be a reflection of the tendency of humanities
researchers to cite books instead of articles” [Larivière and Gingras 2009].
Note that here we are talking about citation of articles. The focus of this
article, the citation of digital resources, appears to be the unloved son of the
unloved son. This impression is borne out by the present authors’ experience
when surveying journal editors about their digital citation practices. Many
editors assumed, in an understandable but illuminating example of
déformation professionnelle, that by digital citation we meant
simply citation of online articles. This is not primarily what we mean here, but
it is important to note that citation of articles is more familiar to humanities
scholars than citation of digital resources such as digital libraries. This is
in part because tools such as Crossref automatically offer digital citations for
articles alongside print citations and in part because citation of online
articles has a certain congruity and familiarity with their print siblings.
Dalbello et al. have studied the digital citations in five journals in Classical
Studies and English and come to the same broad conclusions as the present
article: “supporting argument by means of electronic document [sic]…is not
considered to be evidentiary… and… [t]hese citation practices point to the still
invisible nature of the electronic document that is now ubiquitous in supporting
the actual research practice” [Dalbello et al. 2006, 4].
A small-scale study (with 16 participants) examined the use of e-texts, defined
as “any textual material in electronic form, used as a primary source” [Sukovic
2009, 1002], and found that e-texts used in research were largely excluded from
citation in the written output: “references clearly did not represent the extent
of e-text used in the research” [Sukovic 2009, 1007]. Sukovic’s study, although
not statistically significant, also found a greater likelihood of citing e-texts
where no corresponding print version exists [Sukovic 2009, 1009]. What is unique
about the present study
is that we have concentrated on digital resources where a print copy does exist
and where a digital citation is very easily converted to a print citation. This
situation provides the sternest test of a scholar’s attitude and citation
ethics, which in itself reflects the prevailing attitude to the scholarliness of
each type of citation.
British History Online and EEBO-TCP as case studies
This paper studies the citation of two digital resources: British History
Online (BHO) and The Early English Books Online Text Creation Partnership
(EEBO-TCP). These are highly instructive case studies for two reasons: they both
offer digital versions of print originals (which means that users can choose to
cite print, even when they have used digital) and both resources proved amenable
to being analysed using the Oxford Internet Institute’s Toolkit for the Impact
of Digitised Scholarly Resources (TIDSR), which uses quantitative (analytics,
bibliometrics, log file analysis, surveys, webometrics) and qualitative (focus
groups, workshops, interviews, user feedback) methods to evaluate the impact of
scholarly digital resources [TIDSR n.d.].
British History Online [BHO 2015] and the Early English Books Online Text
Creation Partnership [EEBO-TCP 2015] have both become
key resources in their fields. BHO, begun in 2003, is a digital library of core
sources for the history of the British Isles; it is run by the Institute of
Historical Research (IHR) and the History of Parliament Trust (HoP). It contains
about 1250 transcribed texts to date. EEBO-TCP is a collaboration between the
Universities of Oxford and Michigan and the commercial publisher ProQuest, which
creates searchable XML-encoded editions of early English books based on
ProQuest’s image resource. EEBO-TCP began in 1999 and will ultimately create
around 70,000 full texts which are accessed chiefly via ProQuest’s interface,
although there are also other, local implementations. EEBO-TCP is currently
available only through academic subscription at institutional level. However,
all EEBO-TCP texts will ultimately be freely available in the public domain, and
restrictions were lifted on the first batch of materials on 1 January 2015
[6]. By contrast, in BHO,
while some of the texts are behind a subscription paywall, the majority of the
site is freely available.
EEBO-TCP aims to create an XML-encoded edition of every monographic text printed
in English or in England in the period 1473-1700. If there is more than one
edition of a particular work, then the first edition is selected unless there is
a compelling reason to choose a later one (if the first edition is badly damaged
or incomplete, for example). The texts for BHO are chosen by an academic
advisory group from the IHR and HoP, with a view to meeting the needs of its
core users; BHO is aimed at research-level historians (one university, after a
trial of BHO's subscription content, declined on the basis that it was too
advanced for its undergraduates). Similarly, EEBO-TCP’s core users are generally
scholars at postgraduate level and beyond, although many undergraduates do use
the resource too
[7].
BHO receives about 10 million page views per year. The TIDSR analysis, carried
out in 2010-11, made it clear that, despite this high usage, the site is very
little cited in academic literature [Webster and Blaney 2011]. The
bibliometrics (citation analysis) that allowed this conclusion are corroborated
by qualitative data: feedback sent to BHO from scholars who are irritated by the
site’s practice of not displaying inline page numbers. These scholars complain
that it makes the process of converting what they are reading on BHO to a print
citation more difficult.
As set out in Webster and Blaney (2011), a search for citations of British
History Online in journals (carried out in 2010) found 14 results using the
Scopus service and 17 results using Google Scholar. By contrast, blog posts for
just the period June to November 2010 showed that British History Online was
referred to 84 times.
Qualitative data suggested the reason for under-citation: analysis of site
feedback over the period 2003-2010 found a number of complaints about the
non-display of the print page numbers from the digitized books. For example,
this feedback message from 2007 is representative:
With great pleasure, I have been going through your most excellent online version
of the ‘Thurloe State Papers’ for a scholarly paper which I am writing.
After consulting a particular transcribed document, I would then click on to
its approximate ‘Page’ number in the heading above, so as to search and find
the exact citation for my Endnotes. However, I now see that your current
online version does not display any original page images, thus preventing me
from determining a precise citation. How can I access the images, so as to
find the correct page-number for any given document?
The TIDSR analysis of EEBO-TCP was carried out in 2012-13, as part of the
Jisc-funded project SECT: Sustaining the EEBO-TCP Corpus in Transition [SECT
2013], a study of the impact and sustainability of the EEBO-TCP corpus. The
TIDSR analysis was used to inform a set of recommendations and practical
implementations for the future curation and development of the corpus [8].
Bibliometric analysis suggested that EEBO-TCP is having a steadily increasing
impact on scholarship in relevant fields. The analysis surveyed EEBO and
EEBO-TCP-related publications in databases such as JSTOR and Scopus,
demonstrating a steady growth in such publications over the decade 2002 to 2012.
The Scopus data allows us to see the country of the authors of the publications,
indicating that authors from the USA and the UK are most likely to mention their use
of EEBO, followed by two other English-speaking countries, Canada and Australia.
If we look at the journals that these articles were published in, we find a
range of journals chiefly in the fields of English Literature, Language and
History of the medieval and early modern periods. The bibliographic data,
therefore, supports the assertion that EEBO has had an increasing positive
impact on scholarship, particularly in English-speaking countries
[Siefring and Meyer 2013, 29–31].
However, user feedback (particularly via a user survey discussed in detail
below), as with BHO, indicated that many scholars, particularly in the
humanities, fail to cite or otherwise acknowledge their use of digital
resources. The quantitative data accumulated during bibliometric analysis can
therefore only be partial; if users are not citing their use of EEBO and
EEBO-TCP then any numbers-based demonstration of their impact could be
significantly lower than the true impact on scholarship. This disparity raises
the issue of citation practice – what are users citing if they are not pointing
to their use of digital collections? The image sets in EEBO, and the EEBO-TCP
full texts derived from them, are based on actual printed books from libraries
around the world: do users cite these original print copies even though they
have never actually seen them?
This raises questions for all creators of digital resources: Why do users avoid
citing the digital copies? What implications does this have for creators of
digital resources, particularly when they need to demonstrate impact? And what
measures can content creators introduce to combat the problem?
Survey of citation practice
As part of the TIDSR analysis, the SECT project conducted an online survey of
EEBO-TCP users. The survey ran for around four months from the summer into
the autumn of 2012. It was advertised on the project website and via
Twitter, and was highlighted at the EEBO-TCP Oxford conference in 2012. Details
were also sent to faculty administrators at units specialising in the early
modern period at institutions across the UK, for circulation to their students
and staff. In total, 220 people started the EEBO-TCP survey, 208 completed at
least part of it, and 185 completed it in full. The survey asked participants
for a wide range of information about their use of EEBO and EEBO-TCP, and some
of the questions pertained to citation practice [Siefring and Meyer 2013,
7–26].
The survey sought to establish the impact of EEBO-TCP in both teaching and
research, and revealed interesting attitudes to citation in both areas. First we
asked users who identified themselves as spending at least one-fifth of their
time teaching [Siefring and Meyer 2013, 7]:
Most respondents use online resources in their teaching either daily (20%) or
several times a week (40%). Teaching academics not only use online resources for
teaching themselves but actively encourage their students to use them for their
own work. As we would expect from a survey that set out to reach EEBO-TCP users,
a high number of respondents encourage their students to use EEBO in particular
[Siefring and Meyer 2013, 8]:
Almost all of these teachers encourage students to access online materials (97%),
and none of the remaining 3% of respondents actively discourage their use. The
survey also revealed that use of online resources in research is similarly
ubiquitous [Siefring and Meyer 2013, 8]:
It is clear that online resources are now heavily used by most teaching academics
and researchers. EEBO-TCP in particular is very widely used in early modern
studies. But is this enormous weight of use reflected in citation practice? In
order to explore this question further, researchers were asked how they
themselves cite materials from EEBO-TCP. Those who teach were asked how they
would instruct their students to acknowledge resources that they have consulted
online [Siefring and Meyer 2013, 17].
The use of EEBO is now commonplace in research and teaching and yet 34% of
respondents fail to acknowledge that they have used an EEBO text and instead
cite the print version only. A quarter of respondents actively teach their
students to cite only print. These students are being taught to ignore or
disguise their use of digital resources. The responses to this question suggest
an additional problem: many (and in the case of the EEBO-TCP-aware audience for
the survey, most) researchers want to acknowledge their use of online material
but, as there is no single established way of doing so, there is considerable
variation in practice. Some cite both print and online sources, some online
only, some simply place “[Online]” after their citation.
Further illustrative examples can be found in the free-text responses of those
who answered “other” to the question above.
Examples of “other” answers to the question of how to teach students to cite:
- “Cite original print version, using originating library's citation
system (e.g., Huntington Library) and then indicate that they read the
copy on EEBO.”
- “Like in the EEBO guidelines - cite the print version as seen in
EEBO, including ref to EEBO as well as the library the digital version
is scanned from.”
- “Cite the original and the source they've used (e.g. EEBO,
Online)”
- “Cite the original where there is one or the online one where there
is not”
- “It really depends on the specific situation”
- “Good question!”
Examples of “other” answers to the question of how researchers
do or would cite EEBO-TCP:
- “(In the case of my recent book, I had in fact seen physical copies
of all the EEBO resources I used.)”
- “Cite the resource, cite the online source name but not
URL”
- “Cross-check against image and then cite print version
only”
- “Cite fully both electronic and print versions”
- “Due to recent discussions with colleagues, I am in transition as to
how I would cite documents obtained from EEBO.”
- “Don't know”
These responses suggest the uncertainty that many feel when confronted with the
issue of how to cite their online sources. But the deeper problem remains that
many apparently don’t want to reveal that they used online sources at all.
The culture of non-citation
The EEBO-TCP user survey suggested that around a third of researchers fail to
indicate their use of digital resources at all. Is this indicative of wider
practice? What are the reasons for users failing to cite digital material? Why
are (some) authors reluctant to cite digital if they can change to a print
citation?
In order to try to find out more about the culture of citation, in April and May
2013 the authors sent via email a short survey to 60 UK-based print journals
covering the fields of literature and history. Representatives from each journal
were asked to reply by email as well. An attempt was made to
balance the selection by surveying journals covering different time periods,
geographical focus, and thematic approach. In order to maximise the likelihood
of reply, the survey asked three simple questions:
- If you receive mss with digital citations do you change them to print if
possible or leave them as digital?
- If you receive mss with print citations do you change them to digital if
possible or leave them as print?
- Do you have a policy on digital versus print citation in your authors'
guidelines?
We received 37 replies (a response rate of 62%). Of these, 97% said they would
not change a print citation to digital, and 78% would not change a digital
citation to print. Nine asked for clarification on what was meant by “digital
citation.” Some responses treated the question as pertaining only to the DOI of
a journal article, and did not appear to have in mind citations to online text
resources.
More interesting than the bald figures are the comments included by some editors
with their replies. For a number of journals digital citation by authors was
mentioned as a rare or non-existent occurrence, for example, “[our journal]
doesn't receive submissions with digital citations, so these questions are
not relevant to us.”
This raises the question of cause and effect. Do authors eschew digital citation
because they think it would be frowned upon? If a journal never prints digital
citations then potential authors reading it may think this is a deliberate
policy (although the editor mentioned above did not say that they would not
include digital citations, only that they do not receive them).
So, it may be that journal editors receive few digital citations and researchers
rarely see such citations in articles they are using for their research. What
assumptions are fuelling this cycle? Why do many shy away from citing digital,
whether they be authors or editors?
[9]
Practically, including URLs is seen to be a problem. Many fear that a particular
URL will no longer be active in five or ten years’ time (see, for example,
[European Science Foundation 2011]). While it is true that some URLs might not
remain the same, this problem is increasingly being addressed by libraries and
archives.
A change in legislation in 2013 required the UK’s copyright libraries to capture
the UK’s public web domain as part of their remit [UK Web archive 2016].
Additionally, The National Archives has addressed this problem in its UK
Government Web Archive [UKGWA 2016] (prompted by complaints that Hansard
references were suffering from link rot) by providing a bridging page that takes
users from the defunct URL to the archived web page.
The Internet Archive has released a Firefox extension, “No More 404s”. If the
extension is enabled, a browser that hits a broken link will offer to take the
user to an archived version of that page in the Wayback Machine [Firefox 2016].
This cannot be comprehensive
(because the Internet Archive is not comprehensive) but it offers a broad
solution to link rot that only this particular archive is equipped to
provide.
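The same kind of lookup can be made outside the browser. The following is a minimal sketch, not a description of how the extension itself works: it assumes the Internet Archive's public availability endpoint and the JSON shape it returns, and the function names are ours, chosen for illustration.

```python
import json
import urllib.parse
import urllib.request
from typing import Optional

# Public Wayback Machine availability endpoint (returns JSON).
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def closest_snapshot(payload: dict) -> Optional[str]:
    """Return the URL of the closest archived snapshot, or None if there is none."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest.get("url")


def find_archived_copy(url: str, timestamp: Optional[str] = None) -> Optional[str]:
    """Ask the Wayback Machine for an archived copy of `url` (needs network access).

    `timestamp` is an optional YYYYMMDD string; the service returns the
    snapshot closest to that date.
    """
    params = {"url": url}
    if timestamp:
        params["timestamp"] = timestamp
    query = urllib.parse.urlencode(params)
    with urllib.request.urlopen(f"{AVAILABILITY_ENDPOINT}?{query}", timeout=10) as resp:
        return closest_snapshot(json.load(resp))
```

A scholar checking an old footnote could call find_archived_copy with the cited URL and the date of access given in the citation. A result of None means that no snapshot is held, which reflects the lack of comprehensiveness noted above.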
Harvard Law School Library leads a group of libraries and others in maintaining
Perma.cc, a free service which allows anyone with an account to create a link to
an archival version of a web page [Perma.cc 2016]. In principle
this makes it possible for scholars to create digital citations which will never
break.
A more pertinent barrier to digital citation might be URLs that are too long and
cumbersome. Journal editors and print publishers dislike them because they look
ugly and are hard to typeset or format. Academics and students too dislike their
appearance, and the fact that, together with a full citation, they can affect
the word or page count of a piece of work. They are unattractive for readers
more generally. They have often been generated for technical reasons by the
content creators with little thought for the needs of eventual users.
Philosophically, some researchers may feel that there is little difference
between the database where one accesses a text and the library where one reads a
book. Such researchers wouldn’t cite the library, so why cite a database? Many
scholars, especially early modernists, do of course cite the source library.
Those who give this as a reason for failure to cite have perhaps never
considered the question of whether it is honest to hide their use of digital
material, or thought about how such material is funded. We would further argue
that there is a clear difference in the reading experience between manuscript,
print and digital, and to elide this in citation does an injustice to each
[Grafton 2009, 310f] [Edwards 2013].
This “hiding” of the use of digital resources gets to the
heart of the problem of non-citation: many believe digital resources are, or are
perceived to be, less reliable or not as good as “real” (i.e.
conventional print) resources. Some seem to believe that they need to hide their
use of digital surrogates in order to appear to be better scholars (see
[Sukovic 2009] for some instructive examples from interviews with researchers).
It may be that leading scholars, secure in their jobs and reputations, would be
less nervous about citing digital resources, if they think it appropriate. There
is some anecdotal evidence that this is the case. Historians Paul Kennedy
(internationally famous for The Rise and Fall of the Great Powers) and Norman
Davies (internationally famous for The Isles) have both approvingly cited
Wikipedia, sometimes in explicit preference to print sources, in more recent
books [Kennedy 2013] [Davies 2012, 143–149]. Michael Screech’s Laughter at the
Foot of the Cross, written when he was an emeritus professor, mentions that he
was unable to find a reference despite reading the complete works of St Jerome,
but that a “young friend” found the passage easily by using a database [Screech
1997, 70]. But these are senior academics with nothing left to prove.
In a trenchant article, the historian Tim Hitchcock points to a deeper scholarly
problem that lack of transparency over digital sources is obscuring. Researchers
are trained in traditional source materials: libraries, archives, conventional
reference works. They are, usually, inexpert in using digital resources: “We
have not established the necessary new systems of reference and validation that
would make our use of these resources transparent and repeatable.”
[Hitchcock 2013, 18]
Hitchcock is making a broader point than this article seeks to address: digital
resources are problematic, often deeply so, and in claiming to use print where
they have used digital, researchers are seriously misrepresenting their
methodology. In order that scholarly work receive due scrutiny, it is essential
that scholars be clear, open and honest about their use of digital
resources.
Solutions? Changing perceptions and practice
What, then, can be done to improve the reputation of digital resources and to
encourage users to acknowledge their use? Such change must start with the actual
content creators themselves.
Fundamentally, content creators need to make it easy for their users to be open
and to properly acknowledge their use of a particular resource. If it is easy to
cite a digital resource, more users will do so. Digital resources should make
URLs as short as possible and, if possible, human-decodable, and should include
a clear link to an automatically-generated citation from the main page of a
text, image, or entry. In this way, content creators can make it as easy as
possible for their users to cite (or as difficult as possible for them not to),
with the result that citation rates should improve. A number of high profile
sites, such as the OED, Oxford DNB, Wikipedia, and indeed British History
Online, do allow users to automatically generate citations in multiple formats,
although even these excellent resources use URLs that are not obviously
decodable for readers.
[10]
Some scholars are concerned about how to cite something that may change or be
updated. Digital editors must, therefore, make it clear how to date content
accessed via their resource. Release information and/or editorial updates should
be made as obvious as possible. By dating digital items in this way, online
resource managers can help users feel comfortable about how to clearly refer to
the evidence that they are citing and when they are citing it. Sites should
encourage or guide users always to give a date of access whenever they cite a
digital resource, and should include such a date in automatically generated
citations.
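As an illustration of the recommendation above, here is a minimal sketch of a citation generator that records both the version date published by the resource and the date of access. The field names, the citation layout and the example values are our assumptions, made for illustration only; they do not reproduce the output of any of the sites mentioned.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class DigitalSource:
    title: str            # title of the text or entry
    resource: str         # name of the hosting digital resource
    url: str              # stable URL of the page consulted
    last_updated: date    # release or editorial-update date shown by the resource


def format_citation(source: DigitalSource, accessed: date) -> str:
    """Build a citation that records both the version date and the access date."""
    return (
        f"'{source.title}', {source.resource} "
        f"(version of {source.last_updated.isoformat()}), "
        f"{source.url} [accessed {accessed.isoformat()}]"
    )


# Hypothetical example values, not taken from any real resource.
example = DigitalSource(
    title="Example text",
    resource="Example Digital Library",
    url="https://example.org/series/vol1/pp1-10",
    last_updated=date(2014, 12, 1),
)
print(format_citation(example, accessed=date(2016, 3, 15)))
```

Because the access date is a required argument, a generated citation can never omit it, which is the behaviour the recommendation asks of automatically generated citations.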
Indeed, while this article was being researched and written, British History
Online was relaunched with a new citation generator and using a completely new
format for its URLs, to make them more immediately meaningful to the user.
(British History Online had for a number of years offered citation help,
allowing the automatic generation of a citation for any page of content: this
has not had a discernible effect on citation habits). As a secondary benefit, if
British History Online completely disappeared the new URLs would enable a
researcher to trace the URL to a portion of a print book. In practice it might
be easier to locate the URL in the UK Web Archive maintained by the British
Library on behalf of all copyright libraries in the UK, although this is not
currently publicly available; the Internet Archive’s crawls of the site are, it
seems, not comprehensive.
What was previously a database-generated number has now been converted to a
human-readable series and book. Further specificity is provided by the page
range of the book; where this is not possible, for born-digital content and
dictionary-like material, a meaningful subsection name has been chosen. For
example the old URL for the Survey of London, Volume 46, pages 280 to 293
was:
This has now become:
An additional advantage here is that the user who only wants the volume level, or
the series level, can strip off parts of the URL intuitively:
This process was carried out semi-automatically for about 100,000 URLs, using the
original database and simply concatenating database fields. Although the results
are surely a great improvement, the process was not onerous. There was
consultation in the team about the best forms of abbreviations to use for the
best trade-off between shortness of URL and clarity. The decisions here were,
first, to use standard abbreviations where they exist. For example, for the
Victoria County History (known to historians as the VCH) standard county
abbreviations were used:
The process could only be semi-automatic because some fields of the database
inevitably contained characters which are not allowed in URLs (such as quotation
marks in the titles of Acts of Parliament), which had to be located with a
regular expression and treated on a case-by-case basis (a side-benefit was that
this process exposed some metadata errors which could be fixed). Further, the
URLs had to be tested for uniqueness. Some of the new URLs were not unique
because more than one of them pointed to the same single page of the same book. This was most
frequently the case with the folio volumes of the journals of the House of Lords
and House of Commons, where several days’ sittings might occur on the same page,
but had been separated for digitisation.
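The steps described above (concatenating database fields, flagging characters that are not wanted in URLs, and testing for uniqueness) can be sketched as follows. This is an illustrative reconstruction under our own assumptions, not the code used for the site: the field names, the series/volume/pages path pattern, the "example-" slugs and the set of permitted characters are all hypothetical.

```python
import re
from collections import Counter

# Anything outside lowercase letters, digits and hyphens is flagged for manual review.
UNSAFE = re.compile(r"[^a-z0-9-]")


def build_path(series: str, volume: int, first_page: int, last_page: int) -> str:
    """Concatenate database fields into a readable series/volume/pages path."""
    pages = f"pp{first_page}" if first_page == last_page else f"pp{first_page}-{last_page}"
    return f"/{series}/vol{volume}/{pages}"


def needs_review(series: str) -> bool:
    """True if the series slug contains a character we do not want in a URL."""
    return UNSAFE.search(series) is not None


def duplicates(paths: list[str]) -> list[str]:
    """Return every path generated more than once, in first-seen order."""
    counts = Counter(paths)
    return [path for path, n in counts.items() if n > 1]


# Hypothetical rows: (series slug, volume, first page, last page).
rows = [
    ("example-series", 46, 280, 293),
    ("example-journal", 1, 14, 14),
    ("example-journal", 1, 14, 14),  # a second file that falls on the same print page
]
paths = [build_path(*row) for row in rows]
print(duplicates(paths))
```

In this sketch the two rows for the same page of the hypothetical journal yield identical paths, so duplicates reports that path once. A slug containing quotation marks would be caught by needs_review and left for case-by-case treatment.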
For example, page 14 of the print version of the Journal of
the House of Lords, volume 1, contains very short summaries of
sittings on various days in January 1550. Each sitting is a different file on
BHO, with a different URL. These have been given a suffixed letter to
disambiguate what would otherwise be the same URL:
As this example shows, although page ranges are easily added automatically to
URLs, they are not necessarily best practice. Better still would be a meaningful
string chosen by an editor: in the Lords and Commons journals this would be the
date of sitting; in the VCH example above, the parish or other unit under
discussion. This could not be done retrospectively in this case, but can be done
conveniently as content is created. Future material digitised on the site will
use an editorial decision to create each URL. For example, since relaunch the
site has published Proceedings in Parliament 1624, with each day’s proceedings
given a bespoke URL encoding the date of sitting:
The new version of the site went live in December 2014 and so it is still too
early to say if this change of URL convention will make a difference to the
under-citation of BHO discussed above. But one of the stated impediments to
digital citation has, for this resource, been definitively removed. Indeed, a
further refinement was added in October 2015, in response to the objection that
a page range was too imprecise for an academic reference. When the user mouses
over a paragraph of text, a pilcrow appears in the left-hand margin (very
faintly, so as not to be distracting); clicking on the pilcrow inserts the
relevant paragraph number into the URL bar, allowing a more precise reference
than a traditional print one.
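Two properties of such URLs lend themselves to a short sketch: a paragraph-level reference can be added to a page URL, and a hierarchical path can be trimmed back to the volume or series level. The example below assumes that the paragraph number is carried as a URL fragment of the form "#p12" and uses a hypothetical example.org address; neither detail is taken from the site itself.

```python
from urllib.parse import urlsplit, urlunsplit


def with_paragraph(url: str, paragraph: int) -> str:
    """Return `url` with a fragment pointing at one paragraph (assumed '#pN' form)."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(fragment=f"p{paragraph}"))


def parent_levels(url: str) -> list[str]:
    """List the URLs obtained by stripping trailing path segments one at a time."""
    parts = urlsplit(url)
    segments = [s for s in parts.path.split("/") if s]
    levels = []
    while len(segments) > 1:
        segments.pop()
        trimmed = parts._replace(path="/" + "/".join(segments), query="", fragment="")
        levels.append(urlunsplit(trimmed))
    return levels


# Hypothetical page-range URL following a series/volume/pages pattern.
page_url = "https://example.org/example-series/vol46/pp280-293"
print(with_paragraph(page_url, 12))
print(parent_levels(page_url))
```

Both functions use only standard URL parsing, which is the practical benefit of a predictable path: a reader, or a script, can move between paragraph, page range, volume and series without consulting the site's database.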
These practical measures could help users feel more comfortable with the
mechanics of digital citation. However, the more philosophical discomfort
of according digital materials the same weight as print must also be addressed.
Those working in the field of digital content creation are doubtless aware that
some digital resources seem to be held in particularly high esteem. Despite the
nuances of editorial practice discussed at the beginning of this article, the
OED and the Oxford Dictionary of National Biography, for example, are understood
to be built on sound scholarship and their medium is considered unimportant.
With such examples in mind, other digital content creators could usefully
consider how best to promote the scholarly rigour and importance of their own
resource as a way of gradually eroding residual beliefs in the lack of
respectability of the digital. As a first step, web resources should provide
easily accessible editorial documentation at the point of accessing texts and
images (rather than solely on project-descriptive websites), enabling users to
fully understand the nature of the material they are accessing and the
assumptions that they can make about it. By making clear the nature of their
materials, content creators encourage researchers to use them in appropriate
ways and thereby enhance their own scholarly reputations.
All of these mechanisms may help change individuals’ scholarly citation
practices. In turn, these scholars, should they become teaching academics, will
pass on their habits and expectations to their students. The EEBO-TCP user
survey asked participants how they prefer to learn about digital resources [
Siefring and Meyer 2013, 12]:
Overwhelmingly participants prefer to explore digital resources themselves or
learn about them from peers. Uptake for library training sessions tends to be
low. Uptake for web tutorials seems rather low too – but this may be due to lack
of publicity or planning for dissemination. However, participants indicated in
the free text responses “other” ways of learning about
digital resources. Of the 17 free text responses to this question, 11 of them
describe recommendations by a tutor, professor or lecturer, at both
undergraduate and graduate level. This reveals a particularly important means of
learning about online collections: from those who teach [
Siefring and Meyer 2013, 12].
Teaching academics play a vital role in disseminating scholarly practices. Online
resources could usefully prepare citation guidelines and editorial documentation
that could be circulated to academic departments and subject administrators for
inclusion in local documentation given to students as they begin their studies.
Although beyond the scope of this article, a survey of what guidelines are
currently used would provide useful data on the current generation of students
and how they are expected to cite. Making such teaching and training materials
easily available on project websites would also be helpful. One-off project-led
training sessions could be worthwhile, provided that they are properly promoted
to encourage good attendance. By integrating awareness of the nature of digital
resources and the importance of citing them directly into teaching, digital
content creators can work together with academics to help shape the practices of
the next generation of scholars.
A growing problem for contributors and editors who prefer print citation for
journal articles is that publishers themselves are steadily moving towards
online-only publication (for obvious economic reasons). Alice Meadows, of Wiley
Publishing, argues in a blog post that one sticking point (the provision of
print journals to members as a key benefit of belonging to a learned society)
is becoming less important [
Meadows 2012]. Another driver will be the moves towards
open access in journal publishing: open access presupposes a digital format.
Likewise the changes to the UK Research Excellence Framework (in operation after
1 April 2016) will require scholars to deposit all scholarly outputs in an
institutional repository (by their nature, primarily digital) [
HEFCE 2015].
As mentioned above, most survey respondents conceived our questions about digital
citation generally as being about DOIs specifically. As also mentioned earlier,
this may be a form of déformation professionnelle: they were
thinking as journal editors, so they thought only of journal citation. But it
seems plausible too that the ubiquity of DOIs has acted as a forcing function:
editors have had to engage with them and make policy decisions about them. There
has been no such forcing function for other digital sources, as is evidenced by
survey responses such as: “I would change them [digital citations] to
print, but have actually not had the need to, since authors seem to choose
print anyway.”
Online-only journals can, obviously, only be cited digitally. Scholars
increasingly draw on three types of source: items which exist only in print,
digital versions or surrogates of originally print items, and items which are
born-digital. It will become more and more confusing if researchers cite some
online items and not others. As citing online-only material becomes commonplace, it
should become equally commonplace to cite all material accessed online,
regardless of whether a print version exists or not.
While active solutions can be undertaken by individual projects and content
creators, a gradual shift in culture and practice may already be under way, led
by respected institutions. But as Patrick Dunleavy argues in a thought-provoking
blog post, change cannot be effected unilaterally but will be “a long process of driving out legacy citation
systems”
[
Dunleavy 2014]. Contra Dunleavy, however, we would argue that the principle should be:
cite what you used. This does not preclude giving an additional
digital or print citation to aid discovery.
Recently the Royal Society announced their move to continual publication, whereby
they will give a DOI but no page numbers [
Royal Society n.d.]. Such
moves will change the focus within digital scholarship. It might help, too, to
bear in mind historical precedents. For example, we might consider when it
became standard practice to include colophon data or title page printing
information during the move from manuscript to print culture. We should also
bear in mind that there is an element of serendipity in which citation methods
take off and which don’t. For example, in classical studies, Stephanus
pagination has become a standard way of referencing the work of Plato. The
system is based on a 1578 edition of Plato’s works by Henri Estienne [
Wikipedia 2016]. His father, Robert Estienne, is responsible for the division of
Bible chapters into verses, now so conventional that many do not realise that
verses, and indeed chapters, of the Bible are bibliographic conventions that
have nothing to do with the earliest known manuscripts. It was
pure chance that these arbitrary systems, eschewing page numbers and even
editions, became the standard. It is easy to imagine other, equally arbitrary
but effective, citation norms arising for the citation of digital texts.
Conclusion
Change of practice at the level of the individual scholar reflects and promotes change at
a wider cultural level. As more and more established academics are open about
their use of online resources, the belief that digital content is less scholarly
should lessen, and citation should improve. Open access and changes to the REF
in the UK will, as mentioned above, sharpen the imperative to cite digital
versions of content. Like the OED, many journals are abandoning print. Young
researchers, not just luminaries like Kennedy and Davies, should find it
increasingly easy to cite what they used in research without apology; their
critics will find their position increasingly untenable.
In the meantime, it falls primarily to digital content creators to heighten
awareness of the issues involved. Simply raising the issue is often enough: in
conversation with scholars, the authors of this article have found that many
have never thought carefully about their digital behaviour and simply
require some prompting to reconsider their citation habits. We must continue to
talk about it, formally and informally. Conference papers, blog posts, articles
and presentations which focus on the problem of digital citation will keep the
issue current and will encourage users to consider their own practices
[11]. We must take the opportunity to look
beyond our own specialisms to interdisciplinary knowledge exchange. Text and
image-based resources, for example, should look for input from other areas of
study where there is philosophical overlap – such as citation for audiovisual
material [
BUFVC 2013] or for music. Raising awareness of digital
citation should become part of the public engagement strategy for all digital
resources.
Digital citation is important because it is a reflection of how digital resources
are valued. It is important because it helps build cases for further funding and
enhancement based on evidence of use and impact. It is important because it
allows readers of published research to trace and discover sources, both known
and new to them, as accurately as possible. It is also honest.
We hope and expect that, in time, the currently too widespread practice of citing
a print work which has neither been seen nor used will come to seem an
unfortunate historical interlude, one in which the practice of scholarly
transparency was briefly and lamentably abandoned. Digital resources are here to
stay – it is time that they received the credit that is their due.