As part of our Event Data work we’ve been investigating where DOI resolutions come from. A resolution could be someone clicking a DOI hyperlink, a search engine spider gathering data, or a publisher’s system performing its duties. Our server logs record every DOI resolution and, when it was made by someone using a web browser, which website they were on when they clicked the DOI. This is called a referral.
This information is interesting because it shows not only where DOI hyperlinks are found across the web, but also when they are actually followed. This data allows us a glimpse into scholarly citation beyond references in traditional literature.
Last year Crossref Labs announced Chronograph, an experimental system for browsing some of this data. We’re working toward a new version, but in the meantime I’d like to share the results for 2015 and some of 2016. We have filtered out domains that belong to Crossref member publishers to highlight citations beyond traditional publications.
Top 10 DOI referrals from websites in 2015
This chart shows the top 10 referring non-primary-publisher domains of DOIs per month. Note that if the browser doesn’t send the referrer (as typically happens when a link is followed from an HTTPS page to an HTTP one), we have no way of knowing where the click came from. Because the top 10 can differ from month to month, the total number of domains mentioned can be more than 10. Subdomains are combined, which means that, for example, the wikipedia.org entry covers all Wikipedia languages. This chart covers all of 2015 and the first two months of 2016.
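To make the counting concrete, here is a minimal Python sketch of how a monthly top 10 of this kind could be computed. It is not the code behind the chart. The input shape (pairs of month and referrer URL), the function names, and the tiny hard-coded suffix list are all assumptions made for illustration; a real system would more likely consult the full Public Suffix List to decide where a registered domain begins.

```python
from collections import Counter, defaultdict
from urllib.parse import urlsplit

# Hypothetical, deliberately tiny list of two-label public suffixes.
# A real system would consult the full Public Suffix List instead.
TWO_LABEL_SUFFIXES = {"co.uk", "ac.uk", "com.au", "co.jp"}


def registered_domain(referrer_url):
    """Collapse a referrer URL to its registered domain, or None if unknown."""
    host = urlsplit(referrer_url).hostname
    if not host:
        return None  # no referrer was sent, or it was not a usable URL
    labels = host.lower().split(".")
    if len(labels) < 2:
        return None  # a bare host name such as "localhost"
    suffix_len = 2 if ".".join(labels[-2:]) in TWO_LABEL_SUFFIXES else 1
    # Keep the suffix plus one more label: en.wikipedia.org -> wikipedia.org
    return ".".join(labels[-(suffix_len + 1):])


def top_referrers_by_month(records, member_domains=frozenset(), n=10):
    """Count referrals per combined domain per month and keep the top n.

    records: iterable of (month, referrer_url) pairs, e.g. ("2015-01", "https://...").
    member_domains: domains to filter out (here, member publishers).
    """
    counts = defaultdict(Counter)
    for month, referrer_url in records:
        domain = registered_domain(referrer_url)
        if domain is None or domain in member_domains:
            continue
        counts[month][domain] += 1
    return {month: c.most_common(n) for month, c in counts.items()}


if __name__ == "__main__":
    sample = [
        ("2015-01", "https://en.wikipedia.org/wiki/Digital_object_identifier"),
        ("2015-01", "https://de.wikipedia.org/wiki/Digital_Object_Identifier"),
        ("2015-01", "http://www.google.co.uk/search"),
        ("2015-02", "http://dblp.uni-trier.de/"),
    ]
    for month, ranking in sorted(top_referrers_by_month(sample).items()):
        print(month, ranking)
```

Taking the union of the domains across all the monthly rankings gives the full set of domains that appear in the chart, which is why that set can contain more than 10 entries.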
The top 10 referring domains for the period:
webofknowledge.com
baidu.com
serialssolutions.com
scopus.com
exlibrisgroup.com
wikipedia.org
google.com
uni-trier.de
ebsco.com
google.co.uk
It’s not surprising to see some of these domains here: serialssolutions.com and exlibrisgroup.com are effectively proxies for link resolvers, and Baidu and Google are hugely popular search engines that would show up anywhere. But it is exciting to see Wikipedia ranked amongst them. For more detail, look out for the new Chronograph.
HTTP vs HTTPS in 2015
We’ve also seen a steady increase in HTTPS referral traffic, i.e. people clicking on DOIs from sites that use HTTPS. It is still dwarfed by HTTP, but it grew throughout 2015.
This chart shows HTTP vs HTTPS referrals per day, which makes the weekly spikes visible. It doesn’t include resolutions where we don’t know the referrer.
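A daily split like this can be sketched in a few lines of Python. Again, this is an illustration rather than the code behind the chart: the input shape (pairs of date and referrer URL, with a missing referrer given as None or an empty string) and the function name are assumptions.

```python
from collections import Counter
from urllib.parse import urlsplit


def referrals_by_scheme_per_day(records):
    """Count HTTP and HTTPS referrals per day.

    records: iterable of (day, referrer_url) pairs, e.g. ("2015-03-02", "https://...").
    Resolutions with no known referrer are left out, as in the chart.
    """
    counts = {}
    for day, referrer_url in records:
        if not referrer_url:
            continue  # referrer unknown
        scheme = urlsplit(referrer_url).scheme.lower()
        if scheme not in ("http", "https"):
            continue  # placeholder values such as "-" or other schemes
        counts.setdefault(day, Counter())[scheme] += 1
    return counts


if __name__ == "__main__":
    sample = [
        ("2015-03-02", "http://a.example/page"),
        ("2015-03-02", "https://b.example/page"),
        ("2015-03-02", None),
    ]
    for day, by_scheme in sorted(referrals_by_scheme_per_day(sample).items()):
        print(day, by_scheme["http"], by_scheme["https"])
```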
Increasing numbers of people are moving to HTTPS for reasons of security, privacy and protection from tampering. Google has announced plans to take HTTPS into account when ranking search results. Wikipedia has moved exclusively to HTTPS, and I’ll be telling the story of how Crossref and Wikipedia collaborated in an upcoming blog post.
Chronograph
Another version of Chronograph will be available soon. It will contain full data for all non-primary-publisher referring domains. Stay tuned!