The Long-Term Preservation of Open Access Journals

The long-term preservation of open access journals is one of the 7 criteria for the DOAJ Seal because DOAJ believes that it is an extremely important business process to which a publisher of academic content should commit. This is nowhere more applicable than in the Global South, where financial support and rigorous standards around journal publishing aren’t always available and, sadly, journals tend to just disappear from the Internet without warning. This is a huge problem for the academic footprint of the Global South, not to mention the hundreds of authors whose published papers just aren’t online any more and cannot be retrieved or ever cited.

When DOAJ established its criteria for the Seal in 2014, we were conscious that anything with a cost associated with it effectively put up a barrier to low-income or financially unstable journals getting the Seal. DOAJ is committed to smoothing that path as much as possible. In 2013, DOAJ announced a working agreement with CLOCKSS, one of the archives included on our application form, to seek out funding for a joint project which would get as many of DOAJ’s long tail of single journals archived and preserved as possible. Unfortunately, those plans didn’t come to fruition and, since then, the archiving and digital preservation landscape has changed somewhat.

What remains to be done is clear, however: we must help ALL journals get into an archiving and digital preservation program. I am therefore delighted to welcome this guest post by Craig Van Dyck, the Executive Director of the CLOCKSS Archive.

Thanks for reading.

Dom, DOAJ Operations Manager


Users of scholarly content rely upon long-term access to that content. Scholarly research is long-lived, and users need to be able to re-access content repeatedly.

One concern about digital scholarly journals is that they could disappear from the web, which would undermine scholars’ ability to access the materials that they need.

In response to this concern, several preservation services are available. These services work somewhat differently, but they all aim to ensure the long-term availability of scholarly content on behalf of end-users. Prominent services are CLOCKSS and Portico in the US, Scholars Portal and the Public Knowledge Project Preservation Network (PKP PN) in Canada, and CINES in Europe. Publishers are welcome to participate in any or all of these services. Some national libraries also have archival collections.

In today’s environment, it is considered best practice for a scholarly publisher to include its content in a preservation service. To receive the DOAJ Seal, journals must be included in a preservation system.

In this post, we will focus on CLOCKSS, with some reference to PKP PN, because both services use the LOCKSS technology, which is arguably at the high end of the spectrum of preservation solutions.

LOCKSS Technology

LOCKSS stands for Lots of Copies Keep Stuff Safe. The technology was invented at the Stanford University Library about 20 years ago. It relies upon multiple copies of the digital content being hosted at geographically distributed nodes. The software (which is open source) includes a unique polling-and-repair mechanism. The multiple nodes are constantly exchanging information about the content that they hold. If one node reports a difference from the other nodes, that node is out-voted by the others, and its variant piece of content is replaced by the correct content from one of the other nodes. The archive is “dark”, meaning that end-users do not access the content, but the polling-and-repair process ensures that the data stays in good repair.
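To make the out-voting idea concrete, here is a minimal Python sketch of a poll-and-repair round. It is an illustration only and not the actual LOCKSS protocol or code: the function names (`digest`, `poll_and_repair`), the in-memory dictionary standing in for distributed nodes, the use of SHA-256, and the simple majority rule are all assumptions made for this example.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return a SHA-256 fingerprint used to compare copies (assumed hash)."""
    return hashlib.sha256(content).hexdigest()


def poll_and_repair(nodes: dict[str, bytes]) -> list[str]:
    """Run one simplified poll over all nodes and repair out-voted copies.

    `nodes` maps a hypothetical node name to the copy it holds.
    Returns the names of the nodes whose copy was replaced.
    """
    # Each node "votes" with the fingerprint of its own copy.
    votes = Counter(digest(content) for content in nodes.values())
    winning_digest, count = votes.most_common(1)[0]

    # Require a strict majority; otherwise no copy can be trusted.
    if count <= len(nodes) // 2:
        raise ValueError("no majority agreement among nodes")

    # Take the agreed content from any node in the majority.
    good_copy = next(c for c in nodes.values() if digest(c) == winning_digest)

    repaired = []
    for name, content in list(nodes.items()):
        if digest(content) != winning_digest:
            nodes[name] = good_copy  # overwrite the variant copy
            repaired.append(name)
    return repaired


if __name__ == "__main__":
    archive = {
        "node-a": b"article text",
        "node-b": b"article text",
        "node-c": b"article t3xt",  # a damaged copy
    }
    print(poll_and_repair(archive))
```

In this sketch the damaged copy on `node-c` is out-voted by the two matching copies and overwritten. A real deployment works across a network of independent servers and has to guard against many failure modes that this example ignores.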

The CLOCKSS Archive

  • The C in CLOCKSS stands for Controlled. This is because CLOCKSS uses twelve servers located at blue-chip libraries around the world, all with first-rate infrastructure and security. CLOCKSS is a free-standing 501(c)(3) charitable non-profit organization, using the LOCKSS technology and working with the LOCKSS technical and operational teams at Stanford, to preserve scholarly content for the long term. CLOCKSS is certified as a Trusted Digital Repository. In its Trustworthy Repositories Audit & Certification report by the Center for Research Libraries, CLOCKSS received the only perfect score for technology.
  • CLOCKSS includes many Open Access publishers. For example, 24 publishers using the open source OJS publishing system are preserved in CLOCKSS. In total, CLOCKSS is preserving over 20,000 journal titles, with over 30 million journal articles and 75,000 books, growing rapidly each year.
  • One unique aspect of CLOCKSS is that when content is “triggered” for access, CLOCKSS makes the content freely available to all, under a Creative Commons license, which is a sign of its commitment to the concept of Open Access. A “trigger” occurs if a journal has disappeared, or will soon disappear, from the web. To date, CLOCKSS has triggered 53 journals.
  • CLOCKSS can access publishers’ journals in two different ways: by harvesting the content from the publishing platform; or by the publisher providing the content to CLOCKSS by FTP.
  • Another unique aspect of CLOCKSS is the governance structure. The Board of Directors is composed half of libraries and half of publishers. The scholarly community itself is thus responsible for the policies and practices of CLOCKSS.
  • Publishers sign an Agreement with CLOCKSS, which governs rights and responsibilities. There is a small annual cost for participating in CLOCKSS. CLOCKSS is financially sustainable, which is an important element for a long-term preservation archive. 350 libraries around the world, as well as 250 publishers, contribute to CLOCKSS’s sustainability.

Public Knowledge Project Preservation Network (PKP PN)

  • The Public Knowledge Project is a multi-university initiative developing (free) open source software and conducting research to improve the quality and reach of scholarly publishing. PKP is based at Simon Fraser University in Canada, which is where the Open Journal Systems (OJS) software was originally developed.
  • The PKP Preservation Network is an additional capability that enables easy long-term preservation of journals using OJS version 2.4.8 or higher. PKP PN uses the LOCKSS preservation software.
  • There are currently 800 journals preserving their content in PKP PN.
  • There are no fees for participating in the PKP PN. A journal manager must agree to Terms of Use.

Conclusion

It is strongly recommended that scholarly journals and books be preserved for the long term in a preservation system. Content that is not preserved is at risk of being lost. And publishers who do not contribute their content to a preservation system are at risk of not being considered serious publishers. The value of long-term preservation is well worth a small cost.

Craig Van Dyck
Executive Director, CLOCKSS Archive
cvandyck@clockss.org