Since 2012, DOAJ has been on a path of data quality improvement. DOAJ metadata is used all over the world and all over the Web. Improving and fixing the quality of our metadata can be painstaking work but the effort goes a long way as changes propagate across the Web via search engines, aggregator databases, library portals and other databases.
Along these lines, the largest publisher (in terms of the number of journals) in DOAJ recently added missing abstracts to over 100,000 articles and fixed broken special characters in approximately 2000 more. This was a huge effort on their part and DOAJ is grateful for the work that has gone into this project. It is an achievement that will be welcomed by DOAJ metadata consumers.
To date, Hindawi has 161,334 articles loaded to DOAJ and until recently was the largest contributor of metadata to our index. That title was taken from them recently when DOAJ ingested the entire PLOS archive from Europe PMC.
Very soon, we will be releasing new functionality on DOAJ that will improve the detail in the journal record, increase DOAJ’s accuracy even more and encourage greater input from the community. We have also abandoned our policy of removing journals that have ceased publication.
Ever since we migrated to our new platform, we have been trying to find a slot in our development schedule to correctly display journal continuations on our site. It was important for us to get this correct, both in terms of the metadata and the display, so that users could easily understand how one, or several, journals transformed into a journal with a new title and ISSN. This new development will allow us to correctly display a journal’s timeline as it goes through name and ISSN changes.
Send us feedback on a journal
A decreasing amount of the data in DOAJ is old and DOAJ is always grateful to receive updates from users who get in touch to alert us to broken URLs or a change of title. With this in mind, and to make feeding back as simple as possible, we will add a ‘Tell us about this journal’ button to every journal entry in DOAJ. We hope this will encourage greater input from our users and lead to more updates to journal entries. This development is perfectly aligned with DOAJ being curated by the community and really facilitates an up-to-date resource.
Greater granularity for APC information
We know. The APC information on DOAJ was misleading. It lumped together 2 distinct groups of journals into one big ‘No’ group. We have broken that group out into: No, where we have up-to-date information from journals that they DO NOT charge APCs; and No information. ‘No information’ is the group of journals that have submitted a reapplication to us which we have yet to review. The number of ‘No information’ journals will diminish as we work our way through the reapplications. I wrote about this inaccuracy previously.
Retaining the records – journals that have ceased publication
As long as they fulfil some basic criteria, we will no longer remove journals that cease publishing and we will even add back into DOAJ journals that were removed only because they stopped publishing. This change of policy could potentially add hundreds of articles back into DOAJ along with tens of archived journals.
Access from China
Oh, and watch this space… In early June, we’ll be adding ~175 000 PLoS articles to DOAJ… More on that later.
UPDATE (11th May): the list of removed journals is here on the 3rd tab.
Today DOAJ will remove approximately 3300 journals for failure to submit a valid reapplication before the communicated deadline, a deadline which was extended twice to allow more time for reapplications. This batch removal is another step in DOAJ’s two-year-long project to increase the value and accuracy of the information it provides.
Here are some details about the reapplication project from its launch in January 2015 to today:
- The reapplication process is a necessary step towards ensuring that all journals in DOAJ (of which there were about 10000) meet the higher criteria for indexing that DOAJ launched in March 2014. The criteria were produced in response to the maturing open access arena and the greater demands made on open access publishing by questionable journals and publishers, and to retain DOAJ’s relevancy and importance in open access publishing.
- Some journals have been in DOAJ since 2003 and have never refreshed their information with us.
- As of today over 5000 journals have already submitted their reapplication to us and we are busy assessing those. Many reapplications have been accepted back into DOAJ.
- The contact for every journal to be removed from DOAJ was emailed at least 4 times, informing them of our intention to remove their journals if they failed to submit a reapplication by the agreed deadline.
- We sent email via Mailchimp and took all the necessary precautions to ensure that our emails didn’t end up in Spam, get trapped in institutional firewalls, or fail to deliver for other reasons. We used the Mailchimp authentication options to “verify” that our emails were from a genuine source.
- The first email, announcing the reapplication project and inviting people to reapply, was sent out in January 2015 and went to publishers with 11 or more journals in DOAJ. The second email went out to publishers with 10 or fewer journals in DOAJ in June 2015.
- Reminders were sent out regularly, once a month, and the deadline was also announced to our largest communities via this blog, Twitter and Facebook.
- To ensure that our emails ended up with the correct contact, we spent a considerable amount of time tidying up our contacts database: we updated at least 1000 records.
Removed journals are welcome to submit a new application to DOAJ at any time. They will be placed in the queue along with other applications. We will add a third tab to our spreadsheet ‘DOAJ: journals added and removed‘ that will list all of the journals removed.
When a journal is removed from DOAJ, any article metadata will also become unavailable. This is standard functionality. We are confident that the majority of the journals removed have never supplied article metadata to us, or have done once but haven’t sent us anything for at least 2 years.
If you use DOAJ as a data source and would like to do your own analysis of the journals indexed, download our journals CSV (https://doaj.org/csv) today before 11am BST, 12pm CEST. A copy of that spreadsheet is also available here.
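For anyone doing that kind of analysis, here is a minimal sketch of tallying journals by one column of the CSV. It is an illustration only: the column name "Country" and the sample rows are made up for the example and are not taken from the real file, so check the header row of your downloaded copy first.

```python
import csv
import io
from collections import Counter


def tally_column(csv_file, column):
    """Count how many rows carry each value in the given column.

    csv_file is any open text file (or file-like object) with a header row.
    Missing or empty values are grouped under "(blank)".
    """
    reader = csv.DictReader(csv_file)
    return Counter(
        (row.get(column) or "").strip() or "(blank)" for row in reader
    )


# Made-up sample standing in for a downloaded copy of the journals CSV.
# The header names here are hypothetical.
sample = io.StringIO(
    "Journal title,Country\n"
    "Journal A,Brazil\n"
    "Journal B,Egypt\n"
    "Journal C,Brazil\n"
    "Journal D,\n"
)

counts = tally_column(sample, "Country")
for value, number in counts.most_common():
    print(value, number)
```

To run it against a real download, open your saved file with `open(path, newline="", encoding="utf-8")` and pass that in place of `sample`, using whichever column header the file actually contains.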
UPDATED: 25th May 2016
We have now broken out the APC facet on the web site to show 3 categories:
No information: journals for which we have received a reapplication which has yet to be processed. This is the majority of the journals in DOAJ. As we process these reapplications and accept them, this figure will go down over time. We have the APC information in the Admin system; the records just need to be reviewed. For every reapplication accepted, one will drop from the ‘No information’ total and be added to the Yes or No total. For every reapplication rejected, one will drop from the ‘No information’ total only.
No: these journals have submitted updated APC information to us and DO NOT charge APCs.
Yes: these journals have submitted updated APC information to us and DO charge APCs.
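The way the three totals move as reapplications are processed can be sketched in a few lines of Python. This is only an illustration of the counting rule described above, not DOAJ's actual system: the function name `apply_decision` is hypothetical, and the starting figures are simply borrowed from the April 2015 numbers quoted further down.

```python
from collections import Counter


def apply_decision(counts, decision, charges_apc=None):
    """Return new facet totals after one reapplication has been processed.

    decision is "accepted" or "rejected". For an accepted journal,
    charges_apc must be True (goes to "Yes") or False (goes to "No").
    A rejected journal only leaves the "No information" group.
    """
    updated = Counter(counts)
    if updated["No information"] <= 0:
        raise ValueError("no reapplications left to process")
    updated["No information"] -= 1
    if decision == "accepted":
        if charges_apc is None:
            raise ValueError("an accepted journal needs APC information")
        updated["Yes" if charges_apc else "No"] += 1
    elif decision != "rejected":
        raise ValueError("decision must be 'accepted' or 'rejected'")
    return updated


# Illustrative starting point only.
facet = Counter({"No information": 9291, "No": 853, "Yes": 364})

after_accept = apply_decision(facet, "accepted", charges_apc=False)
after_reject = apply_decision(after_accept, "rejected")

print(after_accept)
print(after_reject)
```

An accepted journal moves from one group to another, so the overall total stays the same; a rejected journal reduces the overall total by one.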
The numbers in the text below are now historical but were true as of the site upgrade in April 2015.
In my post the other day, I promised to provide the APC information from the old site. Here it is as of today (11th May 2015):
APC? Number of journals
N 6283 (67.6%)
Y 2999 (32.3%)
No info 9 (0.1%)
Today there are 10,508 journals in DOAJ which leaves 1217 journals unaccounted for in the old APC data above (10,508 minus the 9291 counted there). These are all journals that have been accepted into DOAJ under the new criteria. (We have accepted 1217 journals into DOAJ since March 2014.) We know from the new data that 364 of them do have APCs. Therefore the remaining 853 journals (1217 minus 364) have NO APCs. Then we can work out the following TOTALS for ALL journals in DOAJ:
APC? Number of journals
N 7136 (67.9%)
Y 3363 (32%)
No info 9 (0.1%)
This also means that the APC facet on the new site should display:
APC? Number of journals
N 853 (8.1%)
Y 364 (3.5%)
No info 9291 (88.4%)
88.4% of all the journals in DOAJ have yet to reapply.
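For readers who want to check the arithmetic, the three tables above can be reproduced with a short Python sketch. The input numbers are the ones quoted in this post; the variable and function names are just labels chosen for the example.

```python
# Figures quoted in the post.
old_site = {"N": 6283, "Y": 2999, "No info": 9}
total_journals = 10508
new_with_apc = 364


def shares(counts):
    """Percentage of the total for each category, to one decimal place."""
    total = sum(counts.values())
    return {key: round(100 * value / total, 1) for key, value in counts.items()}


# Journals accepted under the new criteria are those not covered by the old data.
accepted_new = total_journals - sum(old_site.values())
new_without_apc = accepted_new - new_with_apc

# Totals for all journals: old data plus the newly accepted journals.
all_journals = {
    "N": old_site["N"] + new_without_apc,
    "Y": old_site["Y"] + new_with_apc,
    "No info": old_site["No info"],
}

# What the facet on the new site shows: only newly accepted journals have
# APC data, so every journal from the old data counts as "No info".
new_site_facet = {
    "N": new_without_apc,
    "Y": new_with_apc,
    "No info": sum(old_site.values()),
}

for label, table in [
    ("Old site", old_site),
    ("All journals", all_journals),
    ("New site facet", new_site_facet),
]:
    print(label, table, shares(table))
```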
There has been a lot of focus in research on article processing charges (APCs) and submission charges, particularly in the last 16 months or so, and DOAJ data is often used as a basis for that research. Heather Morrison’s recent article in Publications and Walt Crawford’s research published in Cites and Insights are two very recent examples.
DOAJ wants to raise the visibility of charges information even further to facilitate future research and to make it easier for authors, researchers and funders to make informed decisions on where to publish. As part of our commitment to raising the level of quality of data in DOAJ, we released yesterday a small but important change to the display of charging information. All journals accepted into DOAJ after March 2014, or back into DOAJ after a successful reapplication, will have the following information displayed against them:
- Does the journal have APCs or Submission charges?
- If so, how much and what is the currency of those charges?
- What is the URL where that information is clearly displayed and stated on the journal web site?
- If there are no charges, what is the URL where that information is clearly displayed and stated on the journal web site?
During our review of applications we request that ‘no charges’ is stated explicitly on the journal’s site and we will ask publishers to add that information if they have not already done so.
You will find the new information on each journal’s table of contents page; that is to say the long, detailed view of all the information and metadata that we hold for a journal accessible by clicking a journal’s title in search results. Two examples would be here where the journal has no charges, or here where the journal has APCs.
There are further improvements in the pipeline: we will move the information above the [more detail] link on these pages; we will add charge information to all records in search results; we will include amount and currency in our downloadable CSV file; and we will point the Publication Charges facet in search to the new data. These changes are scheduled for completion in April.
We regularly receive notification that DOAJ data has been used for analysis; analysis done by publishers, librarians, students, technologists, bloggers and many others. That the data is central to so many studies continues to reinforce the importance of the DOAJ in the open access movement. We are confident that, once our current upgrade is complete and all the existing journals have been re-evaluated, DOAJ will provide data of an even higher quality, incomparable to that of the “old” DOAJ: updated more frequently and at a previously unseen level of granularity. It will be a dataset monitored by a large, international network of Associate Editors and Editors, consistently checking and reviewing.
That the data is so regularly used places a responsibility at DOAJ’s door to ensure that the level of data quality is high. This is a responsibility that we take seriously and so I thought it worth clarifying a few points about the DOAJ data.
“This site is undergoing maintenance”
The data in the DOAJ database has been collected over a 10-year period and, in those 10 years, not only has the data been through several migrations and transformations but we have seen the size of DOAJ grow from 300 to just short of 10 000 journals. That rate of growth is increasing year on year. This means that we have a large amount of legacy data.
In 2013, we announced that, in response to the changing nature of open access, we would significantly change the inclusion criteria for journals to be listed in the DOAJ, developing our back-end systems to cope with the dramatic increase in the resulting workload. The Community was involved as we sought opinion on the changes we should make. The changes needed amounted to a huge piece of development which required certain activities to be put on hold. Additionally, DOAJ migrated to a new platform in December 2013 so the routine activities of adding journals, removing journals and adding article metadata had to be placed on hold. (Adding new journals was eventually on hold for just over 4 months; removing journals for just under 3.) It wasn’t without good reason though. DOAJ was migrated to an open source, standards-based, stable database, hosted by our technical partners Cottage Labs. This also necessitated a substantial clean-up of the legacy data.
Such a huge project meant that the usual level of data maintenance by our Editorial Team decreased. It didn’t stop – during the first quarter of 2014, 92 journals were earmarked for removal – but for a few months the public view of DOAJ remained relatively static because the usual weeding and refining was on hold. There are still areas of the DOAJ data that we know need to be reviewed and corrected.
Previously, not all information was compulsory
A publisher applying for a journal to be included in the old DOAJ was only required to provide 6 initial pieces of information. Once accepted, the publisher was encouraged to return to the site to provide further information about the journal. One such piece of information was the article processing charge (APC). This is clearly an important piece of information that authors, in particular, like to know up front. Therefore we took the opportunity, when we were designing the new application form, to raise the visibility of this information and make it a compulsory question on application. Naturally, this means that our legacy data has holes in it which need to be filled. All the new journals applying for inclusion after March 19th 2014 have already answered this question. Once we start the re-application process, the ~9700 existing journals in DOAJ will have to answer this question. This process is scheduled to begin in the 3rd quarter of 2014.
All DOAJ data is publisher-provided
While we can force an answer to a question in a web form, we cannot force someone to return to DOAJ and update their information. Of course, information changes over time and often we find that publishers have forgotten to update us. Our system of regular review – our aim is that every journal be reviewed at least once a year – hopes to catch these changes and correct them as quickly as possible. Our new network of Associate Editors will do this more efficiently and pro-actively than ever before but we still encourage the community to get in touch when it spots things that seem to differ. The community can play an important role as our eyes and ears and we encourage that.
We first announced the start of our transition period back in October 2013. Since then we have been very open, not only about our progress and development plans but also about the effects that the work would have on the DOAJ itself. Hopefully, with this post I have given you a little more detail as to why there may be inaccuracies in the DOAJ, what we have done to address those and what we will continue to do as our database develops.
We appreciate the patience and support from the community! As always, do please get in touch should you have questions or comments.