“We’re going to a hackathon and would love to work with DOAJ’s data! Do you have a dump of all DOAJ data?”
“Er, no. Sorry. I mean, we could get it for you but it will take a while. You could probably get it yourself if you know how to extract the JSON… When do you need it for?”
“Tomorrow.”
“Oh.”
We’ve had this conversation a few times, and we’ve had similar requests from eager individuals and organisations who want to use the rich DOAJ metadata. They’ve told us of the wonderful things they could do with the data (slicing, reporting, analysing, apps, databases, software…) and we’ve never been able to help them in good time. But now we can.
Introducing full data dumps of ALL the metadata in DOAJ, both journals and articles: https://doaj.org/public-data-dump
So what? Well, let me suggest to you why these are a good thing:
- For the journal metadata, the CSV is really the only easy-to-use format. The journal data dump provides another way to get all of the journal metadata.
- The data dumps are updated weekly, so they can keep you up to date with a reasonably short delay. (There is no change feed, just a full dump.)
- When you want all of the DOAJ data for any reason, you can just take it!
- Deep paging on the search API is no longer permitted – search is for search, not harvesting. The data dump allows you to harvest.
- Whenever you want a subset of the DOAJ data, you can just download the data dump, then filter it locally for your needs. For example, if you are a publisher and you want to see all of your metadata in DOAJ, that is all in this data dump, and you can then filter by ISSN.
- You can use it to enhance any local data in your own system or database: you may have basic article metadata in your system, and you want to extend it with DOAJ metadata.
- If you want to aggregate publications data from multiple sources, this is one way of quickly getting that information from DOAJ (versus using OAI-PMH).
- The data dumps are richer in metadata than OAI-PMH.
- You may want to use the data for analysis or data mining or other forms of research, or hackathons.
- The data dumps are also useful as a test dataset.
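To make the "filter by ISSN" idea concrete, here is a rough Python sketch. It rests on assumptions that you should check against the files you actually download: that the dump unpacks into JSON files each holding a list of records, that journal records keep ISSNs under `bibjson.pissn` and `bibjson.eissn`, and that article records keep them in a `bibjson.identifier` list. The function names, the directory layout and the sample ISSNs are all made up for illustration.

```python
import json
from pathlib import Path


def record_issns(record):
    """Collect the ISSNs on one record (field names are assumptions)."""
    bibjson = record.get("bibjson", {})
    # Journal-style records: ISSNs assumed to sit directly on bibjson.
    issns = {bibjson.get("pissn"), bibjson.get("eissn")}
    # Article-style records: ISSNs assumed to sit in an identifier list.
    for ident in bibjson.get("identifier", []):
        if ident.get("type") in ("pissn", "eissn"):
            issns.add(ident.get("id"))
    issns.discard(None)
    return issns


def filter_by_issn(records, wanted):
    """Keep only the records that carry at least one of the wanted ISSNs."""
    wanted = set(wanted)
    return [r for r in records if record_issns(r) & wanted]


def load_records(directory):
    """Yield records from every *.json file under an unpacked dump directory.

    Assumes each file contains a JSON list of records.
    """
    for path in sorted(Path(directory).glob("**/*.json")):
        with open(path, encoding="utf-8") as handle:
            yield from json.load(handle)


# Tiny in-memory sample with invented ISSNs, standing in for real dump files.
sample = [
    {"id": "j1", "bibjson": {"title": "Journal A", "pissn": "1234-5678"}},
    {"id": "j2", "bibjson": {"title": "Journal B", "eissn": "8765-4321"}},
    {
        "id": "a1",
        "bibjson": {
            "title": "Article in Journal A",
            "identifier": [{"type": "pissn", "id": "1234-5678"}],
        },
    },
]

mine = filter_by_issn(sample, {"1234-5678"})
```

With a real download you would unpack the archive first and then call something like `filter_by_issn(load_records("path/to/unpacked/dump"), your_issns)`, where the path and the ISSN set are your own.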
So there you have it. Despite a rather awkward name, data dumps are A Good Thing.
We’d love to know what you think, so do please leave a comment here or send us feedback: feedback@doaj.org