Researcher perspective: DOAJ as a source of research data

This is a guest blog post by Mikael Laakso.

About the author

Mikael Laakso works as an Associate Professor (tenure track) in Information Studies at Tampere University, Finland where he specializes in scholarly communication and open science. Since 2009 he has been researching the changing landscape towards openness in scholarly publishing by studying combinations of bibliometrics, web metrics, business models, science policy, and author behavior.

Did you know that DOAJ is not just an excellent resource for discovering interesting journals to read from, or perhaps submit article manuscripts to yourself, but also a unique source of research data for anyone doing research that concerns the scholarly publishing landscape?

Ever since I got properly introduced to the intricacies of scholarly communication some 15 years ago I have been consistently fascinated by the lack of knowledge we have about what is essentially the key knowledge building system we have in our society. What do you mean we don’t really know how many journals and articles are published every year, let alone how that splits into finer grained details about the publishing landscape (e.g. what share of content is available open access)? Surely that can’t be the case?

Back when I first got involved in research within this domain, it was through a research project that built on data from the DOAJ to estimate the share of scholarly journal articles that are available open access. Already at that point in time there was no data comparable to what DOAJ was offering – standardized data describing peer-reviewed journals that make their content available for anyone to read for free. All the data concerning the included journals could be downloaded as a CSV file, just as you still can today (https://doaj.org/terms/ -> journal CSV file), which makes it very easy to browse and filter the data as such using e.g. LibreOffice, Google Sheets or Excel. Always up to date and re-shareable with a clear CC BY-SA license – perfect!
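For those who prefer a script over a spreadsheet, the same kind of filtering can be sketched in a few lines of Python. This is only a minimal illustration: the inline sample stands in for the downloaded file, and the column names (such as "Country of publisher") are assumptions, so check the header row of the actual CSV before relying on them.

```python
import csv
import io

# Tiny stand-in for the downloaded journal CSV. The column names are
# assumptions for illustration; check the header row of the real file.
SAMPLE_CSV = """Journal title,Journal ISSN (print version),Journal EISSN (online version),Country of publisher
Journal of Examples,1234-5678,,Finland
Open Studies,,2345-6789,Norway
Nordic Review of Testing,3456-7890,4567-8901,Finland
"""


def filter_journals(csv_text, column, value):
    """Return the rows whose `column` equals `value`, ignoring case."""
    reader = csv.DictReader(io.StringIO(csv_text))
    wanted = value.strip().lower()
    return [
        row for row in reader
        if (row.get(column) or "").strip().lower() == wanted
    ]


# Keep only the journals whose publisher is listed as being in Finland.
finnish = filter_journals(SAMPLE_CSV, "Country of publisher", "Finland")
for row in finnish:
    print(row["Journal title"])
```

With the real file, one would read its contents from disk instead of using `SAMPLE_CSV`; the filtering step stays the same.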

In that study and many of the others I've been involved in, there is usually a need to integrate some other journal information as well, e.g. to check in which other indexing services DOAJ journals are also included. Because each publication has its own unique identifier (ISSN/E-ISSN), it is possible to keep track of everything even when doing some quite complex stuff involving DOAJ data and journal-level data coming from elsewhere. So checking for duplicates/matches across two or more journal lists is super easy with the provided CSV file, as one can check for ISSN/E-ISSN matches with a simple formula in one's favorite spreadsheet program (e.g. INDEX MATCH). Perhaps the most exciting research I've been involved in that built upon this core principle of cross-checking several journal datasets was a study on open access journals that have vanished from the internet, where the analysis included comparing older DOAJ CSV files to more recent ones to detect which journals might have vanished from the web permanently. Journal preservation is a problem bigger than any single actor and has been a hard area to cover comprehensively for the long tail of individual journals scattered around the web, but we as authors of the study were happy that DOAJ took an active role in doing what is within their reach by initiating wider collaborative action to improve preservation coverage for journals.
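The ISSN/E-ISSN cross-checking described above can also be sketched outside a spreadsheet. The Python below is a rough illustration and not the method used in the study: the record layout (`title`, `issn`, `eissn`), the function names, and the sample journals are all made up for the example. It treats two records as the same journal when they share at least one ISSN after normalizing the formatting.

```python
def normalize_issn(raw):
    """Uppercase, trim, and drop the hyphen so differently formatted ISSNs compare equal."""
    return (raw or "").strip().upper().replace("-", "")


def issn_keys(record):
    """Set of non-empty normalized ISSNs (print and electronic) for one journal."""
    candidates = (normalize_issn(record.get("issn")), normalize_issn(record.get("eissn")))
    return {key for key in candidates if key}


def all_keys(records):
    """Every normalized ISSN that appears anywhere in a journal list."""
    known = set()
    for record in records:
        known |= issn_keys(record)
    return known


def match_journals(list_a, list_b):
    """Records in list_a that share at least one ISSN with some record in list_b."""
    known = all_keys(list_b)
    return [rec for rec in list_a if issn_keys(rec) & known]


def missing_from_newer(older, newer):
    """Records in the older snapshot with no ISSN match in the newer one.

    A record without any ISSN is always reported, since it cannot be matched.
    """
    known = all_keys(newer)
    return [rec for rec in older if not (issn_keys(rec) & known)]


# Hypothetical sample data standing in for two DOAJ snapshots and another index.
older = [
    {"title": "Journal A", "issn": "1111-1111", "eissn": ""},
    {"title": "Journal B", "issn": "", "eissn": "2222-222X"},
    {"title": "Journal C", "issn": "3333-3333", "eissn": "4444-4444"},
]
newer = [
    {"title": "Journal A", "issn": "11111111", "eissn": ""},
    {"title": "Journal C (renamed)", "issn": "", "eissn": "4444-4444"},
]
other_index = [{"title": "J. B", "issn": "2222-222x", "eissn": ""}]

also_indexed = match_journals(older, other_index)
candidates = missing_from_newer(older, newer)
print([rec["title"] for rec in also_indexed])
print([rec["title"] for rec in candidates])
```

A journal that is absent from a newer snapshot is only a candidate for having vanished. It may have been removed from the listing for other reasons, so each hit still needs a manual check.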

Where the true strength of the DOAJ journal data really shines is in how transparent the inclusion criteria for journals are. When a journal is included in the DOAJ you can already assume a lot about its processes, and thus about the whole dataset. Without naming names, there are journal indexing services that are much less transparent and seemingly subjective in what they include, and others that are much more inclusive but that then include a lot of titles that clearly do not belong in a listing of peer-reviewed scholarly journals (e.g. report series and similar grey literature), since there has clearly been no manual curation like DOAJ has. And did I mention that these other services most often 1) have high fees for accessing the data and 2) do not allow re-sharing of exported data in any way, which makes them a bad fit for building increasingly open and inclusive science on?

Both open access journal publishing and the data available to describe the publications have come a long way since 2009, when I downloaded my first data from the DOAJ. No longer is data limited to journal-level descriptors: an increasing number of journals now also provide the metadata for their individual published articles, which can be queried through the DOAJ API. This further expands the research possibilities of the service now and in the future.

Over the years I have often thought that it would be great if all scholarly journals had data available about them in a similar way to what DOAJ provides for open access journals – openly, transparently, and with consistent curated metadata. As the years have passed that thought has become less pressing, as the growth in the number of open access journals has been so strong. From covering a small corner of scholarly publishing in 2009, DOAJ has kept pace with the growth in the sector and now captures and tracks a substantial share of all active journals globally. As open access journal publishing continues its growth trajectory, the value of DOAJ will keep increasing too, both for readers and authors and for those of us doing research on scholarly communication.

DISCLAIMER: This post has not been sponsored or edited by the DOAJ. I just love the service and want to support it in any way I can because it has been so useful to me over the years. DOAJ is a key infrastructure for open access journal publishing and brings unique value to the scholarly publishing landscape in many different ways.

2 Comments

Jan Erik Frantsvåg says:

09/10/2024 at 10:27

Like Mikael, I have used DOAJ journal data for much research since 2010. This is a unique source of quality information on an important part of the scholarly publishing landscape. It should be used more and better by more scholars.
I wholeheartedly support what Mikael writes here!

1. DOAJ says:
  
  10/10/2024 at 06:40
  
  Thank you, Jan Erik, for your support!
