If the SPARC piece we recently featured in didn’t give you a good idea of the amount of work that we’re undertaking to improve DOAJ (and, ultimately, the overall quality of open access publishing), then this [incomplete] list of our developments for 2015 should provide you with a fuller picture.
We’re coming to the end of a fairly small but significant and, in my opinion, very exciting project that will do the following:
- improve the visibility of individual articles, both locally in DOAJ and outside in external search engines
- improve overall discoverability in, and linking from, external databases
- improve metadata extraction for re-use
- give visibility to new datasets from the information we are collecting in the new application form with a level of granularity we’ve never had before
- update the UI and add better browsing
- add a much needed visualisation for existing continuations and ultimately, the management of new ones
- reduce DOAJ’s overall response times
As I have mentioned in previous posts, DOAJ’s front end is currently straddling two datasets and this project will start to reveal all the lovely, new data that we have been collecting since March 2014, in all its glory. As publishers reapply (1072 out of ~9700 journals have already submitted their reapplications), the old dataset will drop away and the new, updated data will be displayed on the site.
This will have a few implications, the most important being that our extractable (CSV*, OAI-PMH etc) and reportable (statistics, figures) datasets will vary wildly over the coming 6 months as we bed down into the new dataset and flush out the old stuff. For example, you will notice a big shift in our APC numbers, the total numbers for the countries of publishers, the numbers of journal languages and the accuracy of journal classifications. Get in touch if you’d like more detailed information on this part and, to use a cliché, pardon our dust as we clear away the old and build out the new.
Here’s a partial list of everything we will do this year. It’s only partial because we haven’t yet finalised the list for the latter part of 2015. I’ll be able to update you on that at a later date. I’ve broken it into two sections: Behind The Scenes, all features that are effectively invisible to the site user but will make a huge difference for the publishers and any service using DOAJ data; and The User Interface, where site users will notice improved site interaction and site response times.
* Our CSV file will be updated after this project completes.
Behind The Scenes
I am very pleased to report that our OpenURL service will be reinstated AND upgraded. When we migrated from our old platform in 2014, we were unable to move the OpenURL 0.1 service. We know that many of our institutional and business partners rely on OpenURL to connect out to the content in our database and this has led to some understandable frustration as the links have failed to resolve. We’ll be relaunching with OpenURL Version 1.0.
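To give a feel for what an OpenURL 1.0 link looks like, here is a minimal Python sketch that builds one in the key/encoded-value (KEV) form defined by Z39.88-2004. The resolver address and the `build_openurl` helper are assumptions made for illustration; they are not DOAJ’s actual endpoint or code.

```python
from urllib.parse import urlencode, parse_qs, urlsplit

# Hypothetical resolver address, used only for illustration.
RESOLVER_BASE = "https://resolver.example.org/openurl"


def build_openurl(issn, journal_title, article_title=None, volume=None, start_page=None):
    """Build an OpenURL 1.0 (Z39.88-2004) link in key/encoded-value form."""
    params = [
        ("url_ver", "Z39.88-2004"),                        # marks the link as OpenURL 1.0
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:journal"),   # journal metadata format
        ("rft.genre", "article" if article_title else "journal"),
        ("rft.issn", issn),
        ("rft.jtitle", journal_title),
    ]
    optional = [
        ("rft.atitle", article_title),
        ("rft.volume", volume),
        ("rft.spage", start_page),
    ]
    # Only include the optional keys that were actually supplied.
    params += [(key, str(value)) for key, value in optional if value is not None]
    return RESOLVER_BASE + "?" + urlencode(params)


if __name__ == "__main__":
    # Made-up ISSN and titles.
    print(build_openurl("1234-5678", "Example Journal", article_title="An example article", volume=3))
```

A link resolver at a library or partner service would read these keys to locate the matching journal or article record.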
A landing page for every article
Another ‘feature’ that we are reinstating: every single article entry in DOAJ will, once again, have its own landing page. This will increase the visibility of open access content in search engines and, in turn, traffic to journal sites. Individual articles will appear in search engine results, and site visitors will be able to go directly to an article from an external search engine. We will also be able to display more concise and specific information, within DOAJ, at the article level.
We will be structuring the metadata of article content to give it fixed and recognisable fields, tagged in a way that will make it easier to index, ingest, link to and find.
Google Scholar compatibility
The structuring of the metadata will, among other things, allow Google Scholar to crawl and index every single article item in DOAJ. Although some article content appears in Google Scholar currently, our unstructured metadata means that the Google Scholar indexers are left to guess a lot of the information and, when they can’t do that successfully, they skip it. My contact at Google Scholar provided us with the metadata specifications needed to ensure indexing in Google Scholar and we are confident that this will have a great impact on traffic to the original articles on publishers’ sites. It can take Google Scholar up to 8 weeks to crawl new content (and up to 9 months to come back and re-crawl additional content!) so we will have to be patient before we see any noticeable results.
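As an illustration of structured article metadata, Google Scholar’s public inclusion guidelines describe `citation_*` meta tags placed in the head of an article page. The Python sketch below renders such tags for one article. It assumes these publicly documented tag names; the specification we actually received may differ, and the `scholar_meta_tags` helper and sample values are hypothetical.

```python
from html import escape


def scholar_meta_tags(title, authors, publication_date, journal_title, pdf_url=None):
    """Render citation_* meta tags for a single article landing page."""
    tags = [("citation_title", title)]
    # One citation_author tag per author, in author order.
    tags += [("citation_author", author) for author in authors]
    tags.append(("citation_publication_date", publication_date))
    tags.append(("citation_journal_title", journal_title))
    if pdf_url:
        tags.append(("citation_pdf_url", pdf_url))
    # Escape values so quotes or angle brackets in titles cannot break the HTML.
    return "\n".join(
        f'<meta name="{name}" content="{escape(value, quote=True)}">'
        for name, value in tags
    )


if __name__ == "__main__":
    # Made-up article details.
    print(scholar_meta_tags(
        "An example article",
        ["Doe, Jane", "Roe, Richard"],
        "2015/03/01",
        "Example Journal",
    ))
```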
As the last of the ‘behind-the-scenes’ improvements, we’ll be adding more granularity to our OAI-PMH service. At the moment, we’re generating OAI metadata using standard Dublin Core which, while meeting basic needs, isn’t specific enough for certain DOAJ-specific information such as citation.
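For readers who have not harvested via OAI-PMH before, here is a small Python sketch of how a `ListRecords` request is assembled under the OAI-PMH 2.0 protocol, using the standard `oai_dc` (Dublin Core) metadata prefix. The endpoint address and the `list_records_url` helper are assumptions for illustration only.

```python
from urllib.parse import urlencode

# Hypothetical endpoint, used only for illustration.
OAI_ENDPOINT = "https://example.org/oai"


def list_records_url(metadata_prefix="oai_dc", from_date=None, set_spec=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL."""
    if resumption_token:
        # In the protocol, resumptionToken is an exclusive argument:
        # it is sent on its own to fetch the next page of results.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date   # harvest only records changed since this date
        if set_spec:
            params["set"] = set_spec     # restrict the harvest to one set
    return OAI_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    print(list_records_url(from_date="2015-01-01"))
```

A richer metadata format would be requested the same way, by passing a different `metadataPrefix` once the service advertises one.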
The User Interface
An upgrade to our facets
We have upgraded the software that enables our facet searching. We have done this both to facilitate future developments around searching and to make the process of selecting and filtering using facets more intuitive. We will also be introducing new facets that will reveal the new information we are collecting about journals, such as type of peer review, full text format, and The DOAJ Seal. We’d be interested to know what you think about the new interface.
Probably the most exciting feature we are adding is a fully functional subject browser. We have received a fair bit of feedback asking for better browse-by-subject capabilities. In DOAJ, all journals and their articles are classified according to the Library of Congress Classification. Every journal has one or two subjects assigned to it when it is accepted into DOAJ. Users will now be able to expand and collapse a tree to reveal the subject categories, or start by typing a keyword to reveal relevant categories. The number of records assigned to each subject will be shown, along with a button that takes users to the results.
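The behaviour described above (a collapsible tree, keyword lookup and a record count per subject) can be sketched in Python roughly as follows. The `SubjectNode` structure, the choice to roll child counts up into the parent, and the sample counts are all assumptions made for illustration, not a description of our implementation.

```python
from dataclasses import dataclass, field


@dataclass
class SubjectNode:
    name: str
    direct_count: int = 0                       # records assigned directly to this subject
    children: list = field(default_factory=list)

    def total_count(self):
        """Records under this subject, including all of its sub-subjects."""
        return self.direct_count + sum(child.total_count() for child in self.children)


def find_subjects(node, keyword, path=()):
    """Return (path, count) pairs for every subject whose name contains the keyword."""
    here = path + (node.name,)
    matches = []
    if keyword.lower() in node.name.lower():
        matches.append((" > ".join(here), node.total_count()))
    for child in node.children:
        matches.extend(find_subjects(child, keyword, here))
    return matches


# Invented counts, for illustration only.
tree = SubjectNode("Science", 0, [
    SubjectNode("Chemistry", 12),
    SubjectNode("Physics", 7, [SubjectNode("Optics", 3)]),
])

if __name__ == "__main__":
    for subject_path, count in find_subjects(tree, "phys"):
        print(f"{subject_path}: {count} records")
```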
Improved search results
We’ve cleaned up the search results for both journal and article entries. We’ve standardised the display of search results for articles to the Vancouver citation style and where relevant, we’re adhering to the NISO PIE-J standard that details recommendations for the presentation and identification of e-journals. This means a simpler, cleaner display with standardised information clearly identified in search results for both journals and articles. Importantly, we’ve surfaced the APC information in search results: not simply whether a journal charges APCs or not, but how much they charge and in which currency.
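To show what a Vancouver-style article entry looks like, here is a simplified Python sketch of a formatter. It covers only the common journal-article pattern (authors, title, journal, then year;volume(issue):pages) and the convention of listing the first six authors followed by “et al”. The `vancouver_citation` helper is hypothetical and is not the code behind our search results.

```python
def vancouver_citation(authors, title, journal, year, volume=None, issue=None, pages=None):
    """Format a journal article in a simplified Vancouver style."""
    # List at most six authors, then "et al".
    listed = authors[:6]
    author_part = ", ".join(listed) + (", et al" if len(authors) > 6 else "")

    # Build the "year;volume(issue):pages" locator from the parts available.
    locator = str(year)
    if volume is not None:
        locator += f";{volume}"
        if issue is not None:
            locator += f"({issue})"
    if pages:
        locator += f":{pages}"

    # Strip any trailing full stop from the title so it is not doubled.
    return f"{author_part}. {title.rstrip('.')}. {journal}. {locator}."


if __name__ == "__main__":
    # Made-up article details.
    print(vancouver_citation(["Doe J", "Roe R"], "An example title", "Ex J", 2015, 4, 2, "10-19"))
```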
Revealing the new application form data
All of the data that we are capturing in the new application form will now be displayed against the journals and their articles. We are redesigning the journal landing pages so that the most pertinent information is higher up the page (such as APC data), long URLs are hidden and we’ve adopted a two-column treatment to give greater visibility to all of the new data. New data includes: permanent article identifier, waiver policy, archiving policy, plagiarism policy, machine-readable license and deposit policy. We’ve decided to simply display this information on the journal landing pages, rather than create facets for all of them: a facet for every new category would make the search interface clunkier and slow down response times. We’d be interested to hear from you if you feel that some of the new categories definitely need a facet in search.
Shareable, stable URLs
A regular user of DOAJ will tell you how the URLs are incredibly long and don’t “share” very well. They were slightly unstable: if a piece of metadata in the URL included a special character, such as an accent, then when that URL was shared, it often broke. We’ve stabilised the platform so that users can share URLs and the links will never break. The URLs will still be long so we’re plugging in Bit.ly, right next to the search results, so that users can generate a shortened but stable URL to share.
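The kind of breakage described above usually comes down to unencoded special characters in the query part of a URL. As a general illustration (not our actual fix), this Python sketch percent-encodes a search term so that accented characters survive copying and sharing. The base address and the `q` parameter name are assumptions.

```python
from urllib.parse import quote, unquote, urlencode


def search_url(base, query):
    """Return a search URL whose query term is safely percent-encoded."""
    # quote_via=quote encodes spaces as %20 and non-ASCII characters as UTF-8 bytes.
    return base + "?" + urlencode({"q": query}, quote_via=quote)


if __name__ == "__main__":
    url = search_url("https://example.org/search", "Revista de Educación")
    print(url)                            # contains only ASCII characters
    print(unquote(url.split("=", 1)[1]))  # decodes back to the original term
```

Because the encoded URL is plain ASCII, it can pass through email clients and chat tools that would otherwise mangle the accent.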
Display of continuations
DOAJ has been studying the NISO PIE-J recommendations in the context of continuations, of which we have several hundred in DOAJ. (A continuation is when a journal changes its title, ISSN(s) or both and these have, until now, been invisible in DOAJ.) We will be adding the correct visualisation of continuations to the journal landing pages. Note that this won’t allow us to create new continuations: that piece of work will be done later in the year.
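One way to picture a continuation is as a chain of journal records, each pointing to the title it continues and the title that continues it. The Python sketch below walks such a chain to produce a title history. The record layout, the `continues` and `continued_by` field names, and the sample ISSNs are all hypothetical.

```python
def title_history(records, issn):
    """Return the titles in a continuation chain, oldest first."""
    # Step 1: walk backwards to the earliest title in the chain.
    current = issn
    seen = {current}
    while records[current].get("continues"):
        current = records[current]["continues"]
        if current in seen:      # guard against accidental loops in the data
            break
        seen.add(current)

    # Step 2: walk forwards, collecting each title in order.
    history = []
    visited = set()
    while current and current not in visited:
        visited.add(current)
        history.append(records[current]["title"])
        current = records[current].get("continued_by")
    return history


# Made-up records keyed by ISSN, for illustration only.
records = {
    "1111-1111": {"title": "Old Title", "continued_by": "2222-2222"},
    "2222-2222": {"title": "New Title", "continues": "1111-1111"},
}

if __name__ == "__main__":
    print(" -> ".join(title_history(records, "2222-2222")))
```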
We’ve been publishing our list of journals that have received the DOAJ Seal since January 2014 but, so far, we’ve been unable to display that information on the site. We will shortly start adding the DOAJ Seal logo to journal landing pages and journal entries in search results.