This year, metadata development is one of our key priorities and we’re making a start with the release of version 5.4.0 of our input schema with some long-awaited changes. This is the first in what will be a series of metadata schema updates.
What is in this update?
Publication typing for citations
This is fairly simple; we’ve added a ‘type’ attribute to the citations members supply. This means you can identify a journal article citation as a journal article, but more importantly, you can identify a dataset, software, blog post, or other citation that may not have an identifier assigned to it. This makes it easier for the many thousands of metadata users to connect these citations to identifiers. We know many publishers, particularly journal publishers, do collect this information already and will consider making this change to deposit citation types with their records.
Every year we release metadata for the full corpus of records registered with us, which can be downloaded for free in a single compressed file. This is one way in which we fulfil our mission to make metadata freely and widely available. By including the metadata of over 165 million research outputs from over 20,000 members worldwide and making them available in a standard format, we streamline access to metadata about scholarly objects such as journal articles, books, conference papers, preprints, research grants, standards, datasets, reports, blogs, and more.
Today, we’re delighted to let you know that Crossref members can now use ROR IDs to identify funders in any place where you currently use Funder IDs in your metadata. Funder IDs remain available, but this change allows publishers, service providers, and funders to streamline workflows and introduce efficiencies by using a single open identifier for both researcher affiliations and funding organizations.
As you probably know, the Research Organization Registry (ROR) is a global, community-led, carefully curated registry of open persistent identifiers for research organisations, including funding organisations. It’s a joint initiative led by the California Digital Library, DataCite, and Crossref, launched in 2019, that fulfills the long-standing need for an open organisation identifier.
We began our Global Equitable Membership (GEM) Program to provide greater membership equitability and accessibility to organizations in the world’s least economically advantaged countries. Eligibility for the program is based on a member’s country; our list of countries is predominantly based on the International Development Association (IDA). Eligible members pay no membership or content registration fees. The list undergoes periodic reviews, as countries may be added or removed over time as economic situations change.
As the range of public services (e.g. RSS) offered by publishers has matured, a question arises: how can publishers expose their public data so that users can discover it? In particular, the DOI now provides a persistent link infrastructure for accessing primary content. How can publishers use that infrastructure to their advantage?
Anyway, I offer this figure to show how I see the current lie of the land for DOI services and data.
Legend - Current DOI service architecture showing data repositories, service access points, and open/closed data domains.
The figure above shows the three data repositories and service access points in the current DOI services architecture. At the right and bottom of the figure are the two types of service (public services and private services) that together get a user from a DOI-based link (on a third-party site) to the correct page of content (from the primary content provider). (Note that a fourth, private data repository – the institutional repository – comes into play when OpenURL user context-sensitive linking is added.)
At the left of the figure are services operated by Crossref on its own metadata database, which support a) publisher lookups of DOIs, and b) third-party metadata services (DOI-to-metadata and metadata-to-DOI conversions). These might best be labelled protected services since they are not freely available: the first is open to members at a cost, while the second is free but only to associated organizations – members, affiliates, etc.
The term open data is used here in the sense implied by the current W3C SWEO LOD (Linking Open Data) Project. Open data is public data unencumbered by any access restrictions. By contrast, closed data is data that has some access restrictions placed on it – even data that is open to affiliates. (This is not an issue that LOD addresses directly, although it is implied that data is globally ‘open’, i.e. public.)
The current DOI service architecture thus breaks down as:
Native DOI services – resolving the DOI token
    Public – DOI Proxy Server (‘dx.doi.org’)
Related DOI services – using the DOI token
    Protected – Crossref
    Private – Publisher
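For readers who think better in code, the breakdown above can be sketched as a small data model. This is only an illustration of the taxonomy: the names `Access`, `DoiService`, `SERVICES` and `is_open` are made up for this sketch and do not correspond to any Crossref or DOI API.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    PUBLIC = "public"        # open data: no access restrictions
    PROTECTED = "protected"  # limited to members, affiliates, etc.
    PRIVATE = "private"      # held by the publisher


@dataclass(frozen=True)
class DoiService:
    kind: str       # "native" resolves the DOI token, "related" uses it
    access: Access
    operator: str


# The three entries from the breakdown above (hypothetical structure).
SERVICES = [
    DoiService("native", Access.PUBLIC, "DOI Proxy Server (dx.doi.org)"),
    DoiService("related", Access.PROTECTED, "Crossref"),
    DoiService("related", Access.PRIVATE, "Publisher"),
]


def is_open(service: DoiService) -> bool:
    """Open data, in the sense used here, carries no access restrictions."""
    return service.access is Access.PUBLIC


open_operators = [s.operator for s in SERVICES if is_open(s)]
print(open_operators)
```

Seen this way, only the proxy server counts as open: protected data is still closed data, even though it is available to affiliates.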
Note that a DOI is ‘resolved’ into state data registered with it, or as ISO CD 26324 puts it: “Resolution is the process of submitting a specific DOI name to the DOI system and receiving in return the associated values held in the DOI resolution record for one or more types of data relating to the object identified by that DOI name.”
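As a rough sketch of the public side of this, the snippet below turns a DOI name into the link a third-party site would point at the proxy server. It assumes the proxy accepts the DOI name appended to its base URL; the function name `doi_to_proxy_url` and the example DOIs in the usage note are illustrative, and the code only builds the link, leaving the actual resolution to the proxy.

```python
from urllib.parse import quote

# Proxy host as named in this post (assumed base URL for the sketch).
PROXY_BASE = "http://dx.doi.org/"


def doi_to_proxy_url(doi: str) -> str:
    """Build a proxy link for a DOI name such as '10.1000/182'."""
    doi = doi.strip()
    # Accept the common 'doi:' label in front of the name.
    if doi.lower().startswith("doi:"):
        doi = doi[4:]
    # A DOI name is a prefix starting with '10.', a slash, then a suffix.
    prefix, slash, suffix = doi.partition("/")
    if not (prefix.startswith("10.") and slash and suffix):
        raise ValueError(f"not a DOI name: {doi!r}")
    # Percent-encode characters that are unsafe in a URL path, keeping '/'.
    return PROXY_BASE + quote(doi, safe="/")
```

Calling `doi_to_proxy_url("doi:10.1000/182")` gives `http://dx.doi.org/10.1000/182`, and a suffix containing `#` is escaped so that it is not mistaken for a URL fragment.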
So, how might publishers best leverage this DOI service architecture to expose their public data?