This year, metadata development is one of our key priorities and we’re making a start with the release of version 5.4.0 of our input schema with some long-awaited changes. This is the first in what will be a series of metadata schema updates.
What is in this update?
Publication typing for citations
This is fairly simple; we’ve added a ‘type’ attribute to the citations members supply. This means you can identify a journal article citation as a journal article, but more importantly, you can identify a dataset, software, blog post, or other citation that may not have an identifier assigned to it. This makes it easier for the many thousands of metadata users to connect these citations to identifiers. We know many publishers, particularly journal publishers, do collect this information already and will consider making this change to deposit citation types with their records.
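As a rough illustration of what a typed citation might look like in a deposit, here is a minimal Python sketch that builds two citations with a ‘type’ attribute. The element names (`citation_list`, `citation`, `unstructured_citation`), the `key` attribute, and the type values `journal_article` and `dataset` are assumptions made for this example, and namespaces are left out for brevity. Please check the 5.4.0 schema itself for the exact names and permitted values.

```python
import xml.etree.ElementTree as ET


def build_citation(key, citation_type, text):
    """Return a citation element carrying a 'type' attribute.

    Element and attribute names here are illustrative assumptions,
    not copied from the schema.
    """
    citation = ET.Element("citation", {"key": key, "type": citation_type})
    unstructured = ET.SubElement(citation, "unstructured_citation")
    unstructured.text = text
    return citation


# A small citation list: one typed as a journal article, one as a dataset.
citation_list = ET.Element("citation_list")
citation_list.append(
    build_citation(
        "ref1",
        "journal_article",
        "Author A. (2020). An example article. Example Journal, 1(1).",
    )
)
citation_list.append(
    build_citation(
        "ref2",
        "dataset",
        "Author B. (2021). An example dataset. Example Repository.",
    )
)

xml_text = ET.tostring(citation_list, encoding="unicode")
print(xml_text)
```

Note that the dataset citation carries no identifier; the type attribute alone tells a metadata user what kind of object to look for.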
Every year we release metadata for the full corpus of records registered with us, which can be downloaded for free in a single compressed file. This is one way in which we fulfil our mission to make metadata freely and widely available. By including the metadata of over 165 million research outputs from over 20,000 members worldwide and making them available in a standard format, we streamline access to metadata about scholarly objects such as journal articles, books, conference papers, preprints, research grants, standards, datasets, reports, blogs, and more.
Today, we’re delighted to let you know that Crossref members can now use ROR IDs to identify funders in any place where you currently use Funder IDs in your metadata. Funder IDs remain available, but this change allows publishers, service providers, and funders to streamline workflows and introduce efficiencies by using a single open identifier for both researcher affiliations and funding organizations.
As you probably know, the Research Organization Registry (ROR) is a global, community-led, carefully curated registry of open persistent identifiers for research organisations, including funding organisations. It’s a joint initiative led by the California Digital Library, DataCite, and Crossref, launched in 2019, that fulfills the long-standing need for an open organisation identifier.
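If your workflow will handle both kinds of funder identifier side by side, a small helper that tells them apart may be useful. The Python sketch below is one possible approach, not part of any Crossref or ROR tooling. The function name and return labels are hypothetical. It assumes ROR IDs are expressed as `https://ror.org/` URLs whose final part is a leading zero, six base32 characters, and two digits, and that Funder IDs are DOIs under the `10.13039` prefix. It checks only the shape of the string, not the ROR checksum or whether the record exists.

```python
import re

# Shape of a ROR ID URL: leading 0, six base32 characters, two digits.
# This is a format check only; the checksum is not verified here.
ROR_PATTERN = re.compile(r"^https://ror\.org/0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}$")

# Funder IDs are DOIs registered under this prefix.
FUNDER_DOI_PREFIX = "10.13039/"


def classify_funder_identifier(identifier):
    """Return 'ror', 'funder_id', or 'unknown' for a funder identifier string.

    The function name and the return labels are hypothetical.
    """
    value = identifier.strip()
    if ROR_PATTERN.match(value.lower()):
        return "ror"
    if FUNDER_DOI_PREFIX in value:
        return "funder_id"
    return "unknown"


# Sample values, used only to show the two formats.
for sample in (
    "https://ror.org/02twcfp32",
    "https://doi.org/10.13039/100000001",
    "Example Funding Agency",
):
    print(sample, "->", classify_funder_identifier(sample))
```

A check like this could sit in front of a deposit step so that either identifier is routed to the right place in the metadata.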
We began our Global Equitable Membership (GEM) Program to provide greater membership equitability and accessibility to organizations in the world’s least economically advantaged countries. Eligibility for the program is based on a member’s country; our list of countries is predominantly based on the International Development Association (IDA). Eligible members pay no membership or content registration fees. The list is reviewed periodically, and countries may be added or removed over time as economic situations change.
Data citation is seen as one of the most important ways to establish data as a first-class scientific output. At Crossref and DataCite we are seeing growth in journal articles and other record types citing data, and datasets making the link the other way. Our organizations are committed to working together to help realize the data citation community’s ambition, so we’re embarking on a dedicated effort to get things moving.
Efforts regarding data citation are not a new thing. One of the first large-scale initiatives to establish data citation as a standard academic practice was the FORCE11 Joint Declaration of Data Citation Principles (JDDCP) in 2014. This declaration was endorsed by over 100 organizations in the scholarly community as well as many individuals.
Following this agreement on how data citation should be done, many projects followed. Within FORCE11, the Data Citation Implementation Pilot brought together publishers and repositories to put data citation into practice and work on the implementation of the JDDCP. Within the context of the Research Data Alliance, a data-literature linking group started under the name of Scholix to establish a framework for exchanging information about the relationships between articles and datasets. The infrastructure building blocks now feed into projects such as Make Data Count and Enabling FAIR Data.
Projects aside, if datasets are cited consistently and in a standard way, it will make it much easier for the research community to see links between different research outputs and work with these outputs. It also makes it much easier to count these citations, so that researchers can get credit for their data and the sharing of that data.
The underlying work has been done to create an infrastructure that will effectively support and disseminate information on data citation. Data citation is here today!
Different organizations know how to handle data citations, and are starting to count these and make that information available in turn. This means that the only thing that’s needed is for people to actually cite data, and for this information to be captured and passed on. Some Crossref and DataCite members have already made great progress on this (see Melissa Harrison’s blog on what eLife is doing).
The goals of all the data citation projects can only be realized if you start doing data citation, and we know you’ll have questions about it…
In the coming months, we’ll be posting several blogs and organizing sessions to tell you how you can start doing data citation. If you’re attending FORCE2018, you can catch our joint workshop there. So stay tuned, and please get in touch if you can’t wait; we’d love to help you get started!