Guest post: a technical update from our development team

This is a guest post by Richard Jones, founding partner of Cottage Labs and member of the DOAJ team. Cottage Labs has hosted, developed and managed the DOAJ platform since December 2013 and is responsible for keeping DOAJ available to the vast number of individuals using DOAJ every day.


To the public, it may seem that not a lot has changed at doaj.org over the past year or so, but in the background a lot of work has been going on to prepare for some major improvements.

January – August 2019

During this period, our technical focus has been on 3 major areas: the Application Form; the editorial workflow system, which underpins the application process; and the User Interface (UI). In addition, we have been carrying out the final bits of work to improve the stability and scalability of the system, with the net result that in those 8 months there was only 3 minutes of unscheduled down-time.

The team measures its throughput via the number of issues that are successfully dealt with per month in our GitHub issue tracker. On average we’re handling 20-25 issues per month, some of which are support questions. These questions come from a variety of sources including DOAJ team members, end users, or from technical users of the API and other machine interfaces.

We’ve also been working with a new performance monitor to identify bugs, and for the first time we are able to detect issues with the system that go unreported or even unnoticed by end users.

Here are some of the minor improvements we’ve made:

  • Improved API documentation
  • Further GDPR compliance: cookie consent banner; marketing opt in/out preferences for users; and anonymisation of data used in testing and development
  • Data about articles in the Journal CSV file
  • A preliminary overhaul of the site’s layout template and CSS, in preparation for a much larger UI upgrade next year.

Here are some major bits of work that we have carried out:

  • Enhancements to our historical data management system. We track all changes to the body of publicly available objects (Journals and Articles) and we have a better process for handling that.
  • Introduced a more advanced testing framework for the source code. As DOAJ gains more features, the code becomes larger and more complex. To ensure that it is properly tested before going into production, we have started to use parameterised testing on the core components. This allows us to carry out broader and deeper testing and so reduce the risk of defects reaching the live system.
  • A weekly data dump of the entire public dataset (Journals and Articles) which is freely downloadable.
  • A major data cleanup on articles: a few tens of thousands of duplicates, either inherited from historical data or introduced through validation loopholes, were identified and removed, and the loopholes have been closed.
  • A completely new hardware infrastructure, using Cloudflare. This resulted in the significant increase in stability mentioned above and allows us to cope with our growing data set (increasing at a rate of around 750,000 records per year at this point).
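
To give a flavour of what parameterised testing means in practice, here is a minimal sketch in Python. It is an illustration only and not DOAJ's actual code: the helpers normalise_doi and is_duplicate, the DOI-based matching rule, and the sample DOIs are all assumptions made up for this example. The idea is that a single test body is driven by a table of cases, so covering another scenario means adding a row rather than writing another test.

```python
import unittest


def normalise_doi(doi):
    """Hypothetical helper: reduce a DOI string to a comparable form."""
    doi = doi.strip().lower()
    prefixes = (
        "https://doi.org/",
        "http://doi.org/",
        "https://dx.doi.org/",
        "http://dx.doi.org/",
        "doi:",
    )
    for prefix in prefixes:
        if doi.startswith(prefix):
            return doi[len(prefix):]
    return doi


def is_duplicate(article_a, article_b):
    """Hypothetical rule: two articles are duplicates if their DOIs match."""
    return normalise_doi(article_a["doi"]) == normalise_doi(article_b["doi"])


# Each row is one scenario: (label, first DOI, second DOI, expected result).
CASES = [
    ("identical", "10.1234/abc", "10.1234/abc", True),
    ("case differs", "10.1234/ABC", "10.1234/abc", True),
    ("url prefix", "https://doi.org/10.1234/abc", "10.1234/abc", True),
    ("doi: prefix", "doi:10.1234/abc", "10.1234/abc", True),
    ("different article", "10.1234/abc", "10.1234/abd", False),
]


class TestIsDuplicate(unittest.TestCase):
    def test_cases(self):
        # subTest reports every failing row separately instead of
        # stopping at the first failure.
        for label, doi_a, doi_b, expected in CASES:
            with self.subTest(case=label):
                result = is_duplicate({"doi": doi_a}, {"doi": doi_b})
                self.assertEqual(result, expected)


def run_suite():
    """Run the parameterised test and return the unittest result object."""
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(TestIsDuplicate)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling run_suite() runs all the rows and returns a result whose wasSuccessful() method reports whether every case passed. The same table-driven pattern is also available through test runners such as pytest, via its parametrize decorator.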

And here are some projects we have been working on which you will see come into effect over the next few weeks:

  • A completely new search front-end. It looks very similar to the old one, but has some major improvements under the hood (more powerful, more responsive, more accessible) and gives us the capability to build better, cooler interfaces in the future.
  • Support for Crossref XML as an article upload format. In the future this may also be extended to the API and we may also integrate directly with Crossref to harvest articles for you. We support the current Crossref schema (4.7) and we will be supporting new versions as they come along.

Finally, we welcomed a new developer to our team, Aga, who joined Cottage Labs and the DOAJ team in July of this year.

Taking a longer view, developments coming down the pipe in the next 6-8 months or so are:

  • A major overhaul to the UI, following extensive design and user experience work by DOAJ’s UX consultant.
  • A lot of work on the editorial back-end (so you might not notice much change on the public side) to improve the throughput and usability of the system for the editors and administrators.
  • A new, revamped application form, which will be easier to use and offer you better support in applying to DOAJ or updating your existing Journals.

If you have any questions or would like more detail on anything you have read here, do please contact us or leave a comment here.

Myth-busting: DOAJ takes too long to reach a decision

This is a myth.

From about 2012 until 2017, DOAJ was struggling to keep on top of the number of applications being received.

Implementing new acceptance criteria and making 9900+ journals reapply exacerbated the problem: suddenly we had many reapplications and new applications coming in at the same time.

Triage
All applications go through an initial review to filter out incomplete or substandard applications. We call this process Triage. (From March 2015 to November 2017, Triage rejected 3112 sub-quality, incomplete or duplicate applications.) Today, the average turnaround on an application from submission to initial review is a few days at the most.

From submission to decision
To improve the time taken to review an application and reach a decision to accept or reject, a revised and improved editorial workflow was implemented. You can read a full explanation on each of the 7 points in our progress report for 2018. The effects of those changes, which we have been monitoring carefully since 2018, are significant. 

Today we have no outstanding applications that were submitted in 2018, and only a small number dating from the first quarter of 2019 remain to be completed. We aim to reach a decision on every application within 6 months of submission (and are still working hard to reduce that time too), although many are now completed in 3 months or less.

Even so, why does reviewing an application take time?
There are over 50 questions in our application form. Much of the work involved in reviewing an application is correspondence with the applicant and manually checking each answer. Each answer is checked for 3 things: that the answer in the form is correct; that the URL provided contains the information required in the question; that the information on the site is complete and correct.

Incommunicado
A contributing factor to the myth that DOAJ takes too long to reach a decision is the perception that DOAJ never responds. In the past, we did reject some applications without contacting the publisher about this. Since 2018, we have sent emails out for all rejected applications.

One of the most common reasons that an application is rejected is that we do not hear back from the applicant. There can be technical issues at play here: we suspect that sometimes our system-generated alerts, informing the applicant of the progress of their application, don’t reach their recipient. This is often due to particularly sensitive institutional firewalls, messages ending up in Spam folders, or email addresses no longer being valid. But it is also true that long delays in responding to DOAJ’s queries, or in making requested changes, can mean that an application is rejected.