This year, metadata development is one of our key priorities and we’re making a start with the release of version 5.4.0 of our input schema with some long-awaited changes. This is the first in what will be a series of metadata schema updates.
What is in this update?
Publication typing for citations
This is fairly simple; we’ve added a ‘type’ attribute to the citations members supply. This means you can identify a journal article citation as a journal article, but more importantly, you can identify a dataset, software, blog post, or other citation that may not have an identifier assigned to it. This makes it easier for the many thousands of metadata users to connect these citations to identifiers. We know many publishers, particularly journal publishers, already collect this information, and we hope they will consider depositing citation types with their records.
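As a rough illustration of what a typed citation might look like in a deposit, here is a minimal Python sketch that builds a single citation element. The ‘type’ attribute is the addition described above; the helper name `make_citation`, the citation key, the reference text, and the attribute value `dataset` are assumptions made for this example, so please check the schema documentation for the values that are actually allowed.

```python
import xml.etree.ElementTree as ET


def make_citation(key, citation_type, text):
    """Build a <citation> element carrying a 'type' attribute.

    Hypothetical helper for illustration only. The value passed as
    citation_type is an assumption, not a confirmed schema value.
    """
    citation = ET.Element("citation", {"key": key, "type": citation_type})
    unstructured = ET.SubElement(citation, "unstructured_citation")
    unstructured.text = text
    return citation


# A dataset reference with no identifier assigned to it (made-up example text).
example = make_citation("ref2", "dataset", "Example Survey Dataset, 2022.")
xml_snippet = ET.tostring(example, encoding="unicode")
```

The point of the sketch is that the type travels with the citation itself, so a metadata user can tell what kind of object is being cited even when the citation carries no identifier.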
Every year we release metadata for the full corpus of records registered with us, which can be downloaded for free in a single compressed file. This is one way in which we fulfil our mission to make metadata freely and widely available. By including the metadata of over 165 million research outputs from over 20,000 members worldwide and making them available in a standard format, we streamline access to metadata about scholarly objects such as journal articles, books, conference papers, preprints, research grants, standards, datasets, reports, blogs, and more.
Today, we’re delighted to let you know that Crossref members can now use ROR IDs to identify funders in any place where you currently use Funder IDs in your metadata. Funder IDs remain available, but this change allows publishers, service providers, and funders to streamline workflows and introduce efficiencies by using a single open identifier for both researcher affiliations and funding organizations.
As you probably know, the Research Organization Registry (ROR) is a global, community-led, carefully curated registry of open persistent identifiers for research organisations, including funding organisations. It’s a joint initiative led by the California Digital Library, DataCite, and Crossref, launched in 2019, that fulfills the long-standing need for an open organisation identifier.
We began our Global Equitable Membership (GEM) Program to provide greater membership equitability and accessibility to organizations in the world’s least economically advantaged countries. Eligibility for the program is based on a member’s country; our list of countries is predominantly based on the International Development Association (IDA). Eligible members pay no membership or content registration fees. The list undergoes periodic reviews, as countries may be added or removed over time as economic situations change.
We have some exciting news for fans of big batches of metadata: this year’s public data file is now available. Like in years past, we’ve wrapped up all of our metadata records into a single download for those who want to get started using all Crossref metadata records.
We’ve once again made this year’s public data file available via Academic Torrents, and in response to some feedback we’ve received from public data file users, we’ve taken a few additional steps to make accessing this 185 GB file a little easier.
First, we’re proactively hosting seeds in a few locations around the world to improve torrent download performance in terms of both speed and reliability.
And second, we’ve added an option to download this year’s public data file directly from Amazon S3 for a small transaction fee paid by the recipient, bypassing the need to use the torrent altogether. The fee just covers the AWS cost of the download. Instructions for downloading the public data file via the “Requester Pays” method are available on the “Tips for working with Crossref public data files and Plus snapshots” page.
The 2023 public data file features over 140 million metadata records deposited with Crossref through the end of March 2023, including over 76,000 grant records. Because Crossref metadata is always openly available, you can use our API to keep your local copy of our metadata corpus up to date with new and updated records.
In previous years, closed and limited references were removed from the public data file. Since we updated our membership terms to make all deposited references open in 2022, the 2023 public data file for the first time includes all references deposited with us.
We hope you find this public data file useful. Should you have any questions about how to access or use the file, please see the tips below, or bring your questions to our community forum.
Tips for using the torrent and retrieving incremental updates
Use the public data file if you want all Crossref metadata records. Everyone is welcome to the metadata, but it will be much faster for you and much easier on our APIs to get so many records in one file. Here are some tips on how to work with the file.
Use the REST API to incrementally add new and updated records once you have the initial file. Here is how to get started (and avoid getting blocked in your enthusiasm to use all this great metadata!).
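One way to approach incremental updates is sketched below in Python. It assumes the `/works` endpoint of the REST API with the `from-index-date` filter, cursor-based paging (`cursor=*`, then the `next-cursor` value from each response), and an optional `mailto` parameter to identify yourself. The function names, the page size, and the cutoff date are illustrative choices, not an official client.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_BASE = "https://api.crossref.org/works"


def build_update_url(since_date, cursor="*", rows=1000, mailto=None):
    """Build a /works URL for records indexed on or after since_date (YYYY-MM-DD)."""
    params = {
        "filter": f"from-index-date:{since_date}",
        "rows": rows,
        "cursor": cursor,
    }
    if mailto:
        # Assumption: passing a contact address identifies your requests.
        params["mailto"] = mailto
    return f"{API_BASE}?{urlencode(params)}"


def fetch_json(url):
    """Fetch one page from the API and decode the JSON body."""
    with urlopen(url) as response:
        return json.load(response)


def iter_updated_records(since_date, fetch=fetch_json, mailto=None):
    """Yield records page by page, following next-cursor until a page is empty."""
    cursor = "*"
    while True:
        page = fetch(build_update_url(since_date, cursor, mailto=mailto))
        message = page["message"]
        if not message["items"]:
            break
        yield from message["items"]
        cursor = message["next-cursor"]
```

For example, `for record in iter_updated_records("2023-04-01", mailto="you@example.org")` would walk through everything indexed since the start of April 2023 (the address is a placeholder). Requesting pages one after another with a cursor, rather than firing many requests in parallel, is the gentler pattern and makes it less likely that you will be blocked.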
Bibliographic metadata is generally required, but because much of the other metadata is optional, records will vary in quality and completeness.