We are getting better at ingesting metadata. Until then: give us your metadata!

I just saw a tweet about an article published on PeerJ called ‘For 481 biomedical open access journals, articles are not searchable in the Directory of Open Access Journals nor in conventional biomedical databases’. The article concludes:

DOAJ is the most complete registry of biomedical OA journals compared with five conventional biomedical databases. However, DOAJ only indexes articles for half of the biomedical journals listed, making it an incomplete source for biomedical research papers in general.

This made me very happy, especially when I read that the authors compare DOAJ with Medline, PubMed Central, EMBASE and SCOPUS to reach their conclusion. Those are some impressive databases to be ranked alongside.

Of course, the authors’ conclusion has a very large BUT in it: DOAJ contains article metadata for only half of the journals it indexes. Why is this and what are we doing about this?

The biggest issue for us is that, since our launch in 2003, providing DOAJ with article metadata has been optional. DOAJ never made it compulsory for an indexed journal to upload its metadata. DOAJ encouraged it and made it easier for publishers to provide it, but it has never been a criterion for being indexed. One of the reasons for this is the huge range of publishers in DOAJ: some of them simply don’t have the resources to create and provide metadata, so making this compulsory would have excluded a very large number of small open access journals.

Another issue is that, like PMC, DOAJ requires XML to be delivered to its own DTD. For journals that are not on Open Journal Systems (OJS journals have a plugin that generates DOAJ XML automatically), or for those journals that have a huge content archive, processing and converting existing content into DOAJ XML is no small task. There are cost barriers, there is a lack of understanding around XML and there is a general reluctance, particularly among the larger publishers, to convert content to yet another DTD. DOAJ has toyed with the idea of accepting more XML flavours, the most obvious being JATS, since it is a standard DTD within STM publishing. But what about SSH journals, or those journals that produce XML using their own proprietary DTD? As much as DOAJ wants to facilitate article upload, with limited funding there simply isn’t one solution that will work for everyone.
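
To give a feel for what "converting existing content into XML" involves at its simplest, here is a minimal Python sketch that turns a list of article metadata into one XML document. Everything in it is an assumption made for illustration: the function name `build_records_xml`, the dictionary keys and the element names are placeholders and are not the DOAJ DTD, so a real export would need to follow the DTD that DOAJ publishes.

```python
import xml.etree.ElementTree as ET

# Hypothetical field names, chosen for illustration only; they are NOT the DOAJ DTD.
SIMPLE_FIELDS = ("journalTitle", "issn", "title", "publicationDate")


def build_records_xml(articles):
    """Serialise a list of article dicts into a single XML string."""
    root = ET.Element("records")
    for article in articles:
        record = ET.SubElement(root, "record")
        for field in SIMPLE_FIELDS:
            value = article.get(field)
            if value:  # skip missing fields rather than emit empty elements
                ET.SubElement(record, field).text = value
        names = article.get("authors", [])
        if names:
            authors = ET.SubElement(record, "authors")
            for name in names:
                ET.SubElement(authors, "author").text = name
    # ElementTree escapes characters such as & and < in the text for us
    return ET.tostring(root, encoding="unicode")
```

The easy part is the serialisation shown here. The costly part for a journal with a large archive is usually getting clean, consistent metadata into a structure like `articles` in the first place.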

However, there is good news, which I highlighted in an earlier post. This year, DOAJ will launch an API and a metadata harvester. Among other things, the API will enable the bulk upload of article metadata. The metadata harvester will seek out article metadata for journals that are indexed in DOAJ.

Until we get those two features on their feet, I would absolutely URGE publishers to give DOAJ their metadata. Those who already do so have said that being in DOAJ increases traffic to their sites and increases the visibility of their content, and we know from our Google Analytics that DOAJ is the starting place for many searches for open access content. If you need help generating XML, contact me and I will point you to some resources that I have used to troubleshoot XML problems in the past.

Who knows? At some point in the near future Messrs Liljekvist, Andresen, Pommergaard and Rosenberg might be able to publish an update to their paper with the conclusion that: ‘DOAJ is the most complete registry of biomedical OA journals compared with five conventional biomedical databases.’ Full stop. Or even better:

DOAJ is the most complete registry of OA journals.


