Last fall I suggested that I would investigate how well digital cultural heritage collections were being utilized by researchers. Turns out this was harder than I expected. But from my very initial research it seems that scholarly writing does not cite a lot of cultural material available online. This has led me to some questions for the group – some you’ve probably already considered, but maybe a few new ones worth thinking through together.
I started by searching for the use of the terms “digital archive,” “digital collection,” “online,” “http” and “www” in American History and American Studies dissertations and journal articles published between 2002 and 2011. But I was surprised by the small number of results – fewer than 10% using “digital archive” or “digital collection,” and fewer than 30% using “online,” “http” or “www.” American Studies led the way in usage or discussion of digital material, with almost double the number of references for each term. But either there were very few citations, or the terminology used varied enough that they required more detailed searches than ProQuest’s interface (which requires a PDF download in order to review the full text) allows.
For example, out of roughly 21,000 dissertations from this period with the subject heading of American History or American Studies (unfortunately, there are duplicates here since many authors select more than one major subject heading), only 270 used “digital archive” and 240 used “digital collection” somewhere in the full text. A proximity search resulted in approximately 1,400 in American History and 2,500 in American Studies. Even the larger of these numbers is less than 10%, which is surprisingly low. For comparison, about 6,000 used “museum” and 1,800 used “material culture.”
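As a rough check on these proportions, here is a minimal sketch in Python that turns the counts quoted above into percentages of the roughly 21,000 dissertations. The variable and function names are illustrative only, and because the total includes duplicates across subject headings, the shares should be read as approximations rather than exact rates.

```python
# Counts as reported in the paragraph above. The total is approximate and
# includes duplicates, since many authors select more than one subject heading.
TOTAL_DISSERTATIONS = 21_000

term_hits = {
    "digital archive": 270,
    "digital collection": 240,
    "museum": 6_000,
    "material culture": 1_800,
}


def share(hits: int, total: int = TOTAL_DISSERTATIONS) -> float:
    """Return hits as a percentage of total."""
    return 100.0 * hits / total


# Percentage of dissertations whose full text contains each term.
shares = {term: share(hits) for term, hits in term_hits.items()}

# Print the terms from least to most frequent.
for term, pct in sorted(shares.items(), key=lambda kv: kv[1]):
    print(f"{term:20s} {pct:5.1f}%")
```

On these figures, “digital archive” and “digital collection” each appear in just over 1% of the dissertations, compared with roughly 29% for “museum” and 9% for “material culture.”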
Of course not all institutional materials available online are titled “digital archive” or “digital collection.” But broadening the terminology to “online,” or “http” or “www” also had surprisingly few results: about 2,700 and 4,700 for American History; 4,700 and 7,800 for American Studies. Do only about half of the American History/Studies dissertations in the previous 10 years cite materials found online? [NB – published material that is also available on the web, in ProQuest Historical Newspapers for example, does not necessarily need to include a URL, and that would undoubtedly increase the number of citations for material encountered digitally.]
A search in history and affiliated subjects in Project Muse and JSTOR also returned a small number of uses of “digital archive” or “digital collection” – only 30 in Muse and 240 in JSTOR. Even when searching for these terms in Library Science journals, there were only about 60 results.
This lack of citation of material encountered digitally was very surprising to me, and I wonder if you all have the same reaction?
We know that researchers encounter primary source material digitally and explore collections – or at least skim finding aids – online. So I have some questions, about institutions and researchers both.
- Do institutions need or want to keep track of citations of their digital collections? If so, do they just search for their URL or DOI in these databases or use another method? Or are they more interested in general number of hits than number of references?
- Would an increased number of citations in scholarship help justify the effort and expense of digitizing collections? Are the textual finding aids prepared cheaply and mainly for researchers, and the visual interface to the collections for the public and educators?
- Overall, do cultural heritage organizations want or need to identify or cater to their scholarly users, or stay focused on a broader public? How does scarcity of resources influence this decision? And what would researchers want from digital cultural heritage collections?
- Certainly researchers explore collections online, but do they also need to go to an institution to see the physical object? Would a researcher be comfortable citing a primary source object that they have seen only virtually, or will they need to see it in person as well?
- And a related question – already raised by Sheila – would the ability to encounter digital versions of material culture objects increase the usage of this kind of evidence in scholarship? Most history dissertations and research articles privilege documents, photographs, and sometimes paintings and fine art objects. Is the lack of citation due to the difficulty of accessing physical collections and the only relatively recent availability of material culture online? Or because of larger biases in the discipline? Will some version of the Smithsonian Commons encourage historians to use the holdings of the institution because it offers an easy-to-use interface to the institution’s APIs? And will the professional gatekeepers allow it?
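On the first question above, one way an institution might “just search for their URL” can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `count_citing_documents` is hypothetical, `example.org` stands in for an institution’s real domain, and the full texts are assumed to be already available as plain strings (which the databases discussed here do not make easy).

```python
import re


def count_citing_documents(texts, domain):
    """Count documents that contain at least one URL on `domain`.

    Matches URLs starting with http://, https:// or www., including
    subdomains such as collections.example.org. This is a rough filter,
    not a full URL parser.
    """
    pattern = re.compile(
        r"(?:https?://|www\.)(?:[\w-]+\.)*" + re.escape(domain) + r"\b",
        re.IGNORECASE,
    )
    return sum(1 for text in texts if pattern.search(text))


# Hypothetical sample texts; only the first and third mention the domain.
sample_texts = [
    "See http://collections.example.org/item/42 (accessed 2011).",
    "Box 3, Folder 2, Example Historical Society.",
    "Finding aid at www.example.org/findingaid",
]

citing = count_citing_documents(sample_texts, "example.org")
print(citing)
```

A count like this only catches explicit URLs, so it would miss any citation that names the holding institution or the original item without a link.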
So in the end, I am bringing more questions than answers to the working group on Sunday. But I look forward to discussing with you these wide-ranging questions as well as the ones raised in each of your posts.
It’s still pretty much standard practice for scholars to cite the original item, not its digitized version–which is considered a “surrogate” by many libraries and used as such by many scholars–leading to dramatic under-reporting of use of digital resources in print. Of course, there are important differences between the digital version and its original, and since that may influence the way it’s interpreted or used, it would be better if more scholars would fess up. This practice especially applies to discovery: we all at least sometimes search Amazon.com or Google Books looking for likely resources and only pull the books from the library shelves after we know we need pages that are hidden from full view. But the same thing can happen with unique materials digitized as part of online special collections: the “item record” provided is for the original resource, not the surrogate. (What would a digital surrogate’s item record look like? Maybe, if an image, it would at minimum list the original dimensions, pixel depth and file format?) You raise a terrific point by suggesting that institutions might do better to encourage the direct citation of their digital materials, as such. That would, of course, require a big culture change in two entrenched communities of practice: academia and cultural repositories.
Which raises the material culture issue: no digital representations that presently exist are adequate substitutes for the objects themselves. Wow, though, what a difference easy discoverability across institutions would make! Also, there really is no consistent format for citing artifacts in scholarly writing, either in the body of the text or in the notes, even in the material culture world, beyond including the name of the holding institution and the accession number. It might be interesting to look more closely at a publication like Winterthur Portfolio to get a sense of whether an automated search of journals or dissertations for use of artifacts in scholarship is even viable.
Interesting results! I guess one thing for us to keep in mind is that completed dissertations in the humanities are likely to be a lagging indicator of change. It takes years to get from a proposal to a finished project, so the availability of digital materials at the outset of a project is likely the best shot at shaping the planning for a project.
Beyond this, I would be curious to know how many folks are obfuscating the digital sources they work from. For example, look up anybody who relies on historical newspapers. I bet there are a lot of folks out there who are using the digitized ProQuest Historical Newspapers but not necessarily providing URLs for those pages. (Many of those URLs don’t really look like persistent URLs.)
In any event, new projects like NGA Images and the Library of Congress’s relatively new cross-library search system for digital materials are going to make it much easier to get to digital materials.
Oh, one last thought. In my experience, a lot of the game of historical research is grounded in working on things that haven’t been worked on before. That is, there are plenty of folks who decide on their dissertation research by identifying an archive or collection that few people have looked at. In practice, this ends up being really valuable because it keeps historians from all looking at the same sources, but it means that digitizing a collection can actually deter some historians from using it.
Joan and I were talking about what you mentioned above about not citing digitized newspapers, periodicals, et al. I’m one of those folks: even though I included digital repositories in my citations, I didn’t include the URLs for digitized newspapers and periodicals in my dissertation citations. I cited the articles, though I discovered and read most of them in online databases. Interestingly, part of the reason I didn’t include URLs was that I started my research pre-Zotero, and so I printed out the PDFs of the articles and entered them in as sources while I was writing (egad, I know). I have URLs for some additional newspaper and magazine articles that I found later, post-Zotero, but I wanted to be consistent. Additionally, since ProQuest is gated, it didn’t seem like a good way to cite something if it wasn’t truly available to all readers. In fact, GMU no longer buys access to one database it used to subscribe to. For some articles that I never printed and saved directly to Zotero, I don’t have them any more. Silly me. I have URLs that I can’t even access.
I wasn’t trying to hide my use of digital sources; I cited hundreds of periodicals, but at the time I didn’t think of the digitized version as something different from the physical version. And that is also something that I think would be great to address, if we have time.
This is so interesting! Becoming more self-conscious, as scholars, about how/whether we signal the “digitalness” of any sources we use is undoubtedly kind of important, although I do on the other hand see why, with conventional sources that are digitized (e.g. newspapers), more people don’t do that. I mean, when citing a book, in the past, we didn’t signal “book” except by the use of italics vs. quotation marks in the citation. And on a related note, that makes me realize that the citation style, more generally, in the past has been the way of signaling type of source, and we are still using those conventions when the sources are books, newspapers, letters, etc., in digital form.
I think we are probably doing a better job with “born digital” resources like blog posts, etc, that don’t have a ready analogue in the print/non-digital world. On that score, however, being able to track citations in some good way clearly IS important to giving such writings legitimacy, especially in the academic world of tenure/promotion.
Additionally, there is a related question about tracking citation of sources within digital communication venues (e.g. Twitter, etc.). If my research article, digital collection, or whatever else is repeatedly cited/linked/referenced on Twitter, how do I follow that and what does that say about its impact? Jason Priem, a doctoral candidate at UNC-Chapel Hill’s School of Information and Library Science, is doing research on this: http://jasonpriem.org/cv/