{"id":2172,"date":"2019-04-05T09:32:32","date_gmt":"2019-04-05T08:32:32","guid":{"rendered":"http:\/\/blog.doaj.org\/?p=2172"},"modified":"2019-04-05T09:32:32","modified_gmt":"2019-04-05T08:32:32","slug":"full-doaj-data-dump-now-available","status":"publish","type":"post","link":"https:\/\/blog.doaj.org\/de\/2019\/04\/05\/full-doaj-data-dump-now-available\/","title":{"rendered":"Full DOAJ data dump now available"},"content":{"rendered":"<p>&#8220;We&#8217;re going to a hackathon and would love to work with DOAJ&#8217;s data! Do you have a dump of all DOAJ data?&#8221;<\/p>\n<p>&#8220;Er, no. Sorry. I mean, we could get it for you but it will take a while. You could probably get it yourself if you know how to extract the JSON&#8230; When do you need it for?&#8221;<\/p>\n<p>&#8220;Tomorrow.&#8221;<\/p>\n<p>&#8220;Oh.&#8221;<\/p>\n<p>We&#8217;ve had this conversation a few times, and we&#8217;ve had requests from eager individuals and organisations who want to use the rich offerings of the DOAJ metadata. They&#8217;ve told us of the wonderful things they could do with the data (slicing, reporting, analysing, apps, databases, software&#8230;) and we&#8217;ve never been able to help them in good time. But now we can&#8230;<\/p>\n<p>Introducing full data dumps of ALL the metadata in DOAJ, both journals and articles: https:\/\/doaj.org\/public-data-dump<\/p>\n<p>So what? Well, let me suggest why these are a good thing:<\/p>\n<ul>\n<li>For the journal metadata, the CSV has really been the only easy-to-use format. The journal data dump provides another way to get that metadata.<\/li>\n<li>The data dumps are updated weekly, so they can keep you up to date with a reasonably short delay. (There is no change feed, just a full dump.)<\/li>\n<li>When you want all of the DOAJ data for any reason, you can just take it!<\/li>\n<li>Deep paging on the search API is no longer permitted &#8211; search is for search, not harvesting. 
The data dump allows you to harvest.<\/li>\n<li>Whenever you want a subset of the DOAJ data, you can just download the data dump, then filter it locally for your needs. For example, if you are a publisher and you want to see all of your metadata in DOAJ, that is all in this data dump, and you can then filter by ISSN.<\/li>\n<li>You can use it to enhance any local data in your own system or database: for example, you may have basic article metadata in your system that you want to extend with DOAJ metadata.<\/li>\n<li>If you want to aggregate publication data from multiple sources, this is one way of quickly getting that information from DOAJ (versus using OAI-PMH).<\/li>\n<li>These data dumps are richer in metadata than OAI-PMH.<\/li>\n<li>You may want to use the data for analysis, data mining or other forms of research, or for hackathons.<\/li>\n<li>The data dumps are also useful as a test dataset.<\/li>\n<\/ul>\n<p>So there you have it. Despite a rather awkward name, data dumps are A Good Thing.<\/p>\n<p>We&#8217;d love to know what you think, so do please leave a comment here or send us feedback: feedback@doaj.org<\/p>","protected":false},"excerpt":{"rendered":"<p>&#8220;We&#8217;re going to a hackathon and would love to work with DOAJ&#8217;s data! Do you have a dump of all DOAJ data?&#8221; &#8220;Er, no. Sorry. I mean, we could get it for you but it will take a while. 
You could probably get it yourself if you know how to extract the JSON&#8230; When do&#8230;<\/p>","protected":false},"author":378,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_s2mail":"","_kadence_starter_templates_imported_post":false,"_kad_post_transparent":"","_kad_post_title":"","_kad_post_layout":"","_kad_post_sidebar_id":"","_kad_post_content_style":"","_kad_post_vertical_padding":"","_kad_post_feature":"","_kad_post_feature_position":"","_kad_post_header":false,"_kad_post_footer":false,"_kad_post_classname":"","footnotes":""},"categories":[616,617,618],"tags":[81,161,199,298,334],"class_list":["post-2172","post","type-post","status-publish","format-standard","hentry","category-metadata","category-new-feature","category-news-update","tag-article-metadata","tag-database","tag-downloads","tag-journal-metadata","tag-metadata"],"_links":{"self":[{"href":"https:\/\/blog.doaj.org\/de\/wp-json\/wp\/v2\/posts\/2172","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.doaj.org\/de\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.doaj.org\/de\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.doaj.org\/de\/wp-json\/wp\/v2\/users\/378"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.doaj.org\/de\/wp-json\/wp\/v2\/comments?post=2172"}],"version-history":[{"count":0,"href":"https:\/\/blog.doaj.org\/de\/wp-json\/wp\/v2\/posts\/2172\/revisions"}],"wp:attachment":[{"href":"https:\/\/blog.doaj.org\/de\/wp-json\/wp\/v2\/media?parent=2172"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.doaj.org\/de\/wp-json\/wp\/v2\/categories?post=2172"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.doaj.org\/de\/wp-json\/wp\/v2\/tags?post=2172"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}