Today, it is increasingly difficult for scholars to do historical, primary source research without institutional access to fulltext databases like Early English Books Online, Eighteenth-Century Collections Online, and the 17th and 18th Century Newspaper Collections, among others. These resources are often prohibitively expensive for smaller colleges and universities, and travel funding for archival research is down, especially for the kind of long-term archival research that would enable scholars to do some kinds of large-scale research.
These web-accessible resources have been, indeed, a boon to scholars across the United States and beyond; they enable the kind of granular reading traditional scholarship in the humanities has been built on, but they also enable research on macro levels, with vast bodies of content over time. As Franco Moretti puts it in Graphs, Maps, Trees, “within that old territory [is] a new object of study: instead of concrete, individual works, a trio of artificial constructs…in which the reality of the text undergoes a process of deliberate reduction and abstraction.” He calls this “distant reading,” where distance—in time, in space—is “not an obstacle, but a specific form of knowledge: fewer elements, hence a sharper sense of their overall interconnection.”
Resources like ECCO and EEBO can give scholars just that kind of knowledge; if one wants to know how many times, in what frequencies, and in what contexts a particular search term or string was used in English volumes published over the course of a century, one can use ECCO to find out. One can enter the search term, view a list of results, limit them in various ways, and examine the facsimile page images—even download the entire (non-searchable) PDF of a single text to a flash drive, especially helpful if one is using such tools at public libraries like the Library of Congress.
Yet there is to date no simple, user-friendly tool that allows the raw data of the results stream to be used for data mining or graphing; the metadata of the results cannot be downloaded to, for instance, a spreadsheet, for visual manipulation in tools like IBM's Many Eyes or pivot charts. A browser plugin to capture the results stream, or an additional export feature built into the fulltext database itself, would give scholars the ability to explore, practically, Moretti’s theory of distant reading.
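To make the idea concrete, here is a minimal sketch in Python of what scholars could do with such an export if it existed. Everything in it is an assumption made for illustration: the field names (`title`, `year`), the placeholder records, and the function names are hypothetical, and none of it reflects an actual ECCO or EEBO interface. It writes result metadata to a spreadsheet-readable CSV and tallies the hits by decade, the kind of reduction and abstraction Moretti describes.

```python
import csv
import io
from collections import Counter

# Hypothetical result records. A real export feature would supply these;
# the titles and years here are placeholders, not actual search results.
SAMPLE_RESULTS = [
    {"title": "Placeholder Title A", "year": 1701},
    {"title": "Placeholder Title B", "year": 1705},
    {"title": "Placeholder Title C", "year": 1712},
    {"title": "Placeholder Title D", "year": 1748},
]


def write_results_csv(results, handle):
    """Write result metadata as CSV rows that a spreadsheet can open."""
    writer = csv.DictWriter(handle, fieldnames=["title", "year"])
    writer.writeheader()
    writer.writerows(results)


def hits_per_decade(results):
    """Count results per decade, keyed by the decade's first year."""
    counts = Counter((int(record["year"]) // 10) * 10 for record in results)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    buffer = io.StringIO()
    write_results_csv(SAMPLE_RESULTS, buffer)
    print(buffer.getvalue())
    print(hits_per_decade(SAMPLE_RESULTS))
```

The decade counts could then be pasted into a pivot chart or any graphing tool. The point is only that once the metadata leaves the database in a plain, tabular form, the rest is routine.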