Blog

With a little help from your Crossref friends: Better metadata

We talk so much about more and better metadata that a reasonable question might be: what is Crossref doing to help?

Members and their service partners do the heavy lifting to provide Crossref with metadata and we don’t change what is supplied to us. One reason we don’t is because members can and often do change their records (important note: updated records do not incur fees!). However, we do a fair amount of behind the scenes work to check and report on the metadata as well as to add context and relationships. As a result, some of what you see in the metadata (and some of what you don’t) is facilitated, added or updated by Crossref.

A Registry of Editorial Boards - a new trust signal for scholarly communications?

Background

Perhaps, like us, you’ve noticed that it is not always easy to find information on who is on a journal’s editorial board and, when you do, it is often unclear when it was last updated. The editorial board details might be displayed in multiple places (such as the publisher’s website and the platform where the content is hosted) which may or may not be in sync and retrieving this information for any kind of analysis always requires manually checking and exporting the data from a website (as illustrated by the Open Editors research and its dataset).

A ROR-some update to our API

Earlier this year, Ginny posted an exciting update on Crossref’s progress with adopting ROR, the Research Organization Registry for affiliations, announcing that we’d started the collection of ROR identifiers in our metadata input schema. 🦁

The capacity to accept ROR IDs to help reliably identify institutions is really important but the real value comes from their open availability alongside the other metadata registered with us, such as for publications like journal articles, book chapters, preprints, and for other objects such as grants. So today’s news is that ROR IDs are now connected in Crossref metadata and openly available via our APIs. 🎉

Come and get your grant metadata!

Tl;dr: Metadata for the (currently 26,000) grants that have been registered by our funder members is now available via the REST API. This is quite a milestone in our program to include funding in Crossref infrastructure and a step forward in our mission to connect all.the.things. This post gives you all the queries you might need to satisfy your curiosity and start to see what’s possible with deeper analysis. So have the look and see what useful things you can discover.

Lesson learned, the hard way: Let’s not do that again!

TL;DR

We missed an error that led to resource resolution URLs of some 500,000+ records to be incorrectly updated. We have reverted the incorrect resolution URLs affected by this problem. And, we’re putting in place checks and changes in our processes to ensure this does not happen again.

How we got here

Our technical support team was contacted in late June by Wiley about updating resolution URLs for their content. It’s a common request of our technical support team, one meant to make the URL update process more efficient, but this was a particularly large request. Shortly thereafter, we were provided with nearly 1,200 separate files by Atypon on behalf of Wiley in order to update the resolution URLs of ~9 million records. We manually spot checked over 50 of these files, because, prior to this issue, our technical support team did not have a mechanism to automatically check for errors. That labor intensive review did not turn up any problems. That is, those 50 samples had no errors with the headers, like were found later.

Some rip-RORing news for affiliation metadata

We’ve just added to our input schema the ability to include affiliation information using ROR identifiers. Members who register content using XML can now include ROR IDs, and we’ll add the capability to our manual content registration form, participation reports, and metadata retrieval APIs in the near future. And we are inviting members to a Crossref/ROR webinar on 29th September at 3pm UTC.

The background

We’ve been working on the Research Organization Registry (ROR) as a community initiative for the last few years. Along with the California Digital Library and DataCite, our staff has been involved in setting the strategy, planning governance and sustainability, developing technical infrastructure, hiring/loaning staff, and engaging with people in person and online. In our view, it’s the best current model of a collaborative initiative between like-minded open scholarly infrastructure (OSI) organizations.

RFP: Help evaluate the reach and effects of metadata

Jennifer Kemp

Jennifer Kemp – 2021 July 21

In MetadataCommunity

UPDATE, 14 October 2021:

We received several excellent proposals in response to this RFP and we’d like to thank everyone involved for their time and enthusiasm.

We are excited to announce the two projects that have been selected, to run through early 2023. Stay tuned!

With or Without: Measuring Impacts of Books Metadata
This project will test the premise that academic books metadata improves discoverability and usage by assessing the impact of book chapter records with DOIs (unique from metadata associated with the entire book) with associated chapter and book attributes. The study aims to prove or disprove its hypothesis and rank metadata attributes by their association with successful content discovery and access. The findings will be considered alongside similar metadata research in order to develop a metadata efficacy framework, which can be used to determine the return on metadata investments by publishers and service providers.

An Advisory Group for Preprints

We are delighted to announce the formation of a new Advisory Group to support us in improving preprint metadata. Preprints have grown in popularity over the last few years, with increasing focus brought by the need to rapidly disseminate knowledge in the midst of a global pandemic. We have supported metadata deposits for preprints under the record type ‘posted content’ since 2016, and members currently register a total of around 17,000 new preprints metadata records each month.

Service Provider perspectives: A few minutes with our publisher hosting platforms

Service Providers work on behalf of our members by creating, registering, querying and/or displaying metadata. We rely on this group to support our schema as it evolves, to roll out new and updated services to members and to work closely with us on a variety of matters of mutual interest. Many of our Service Providers have been with us since the early days of Crossref. Others have joined as scholarly communications has grown and services have evolved. Though fewer than 20 in number, their impact far outweighs the size of the group.

Doing more with relationships - via Event Data

Crossref aims to link research together, making related items more findable, increasing transparency, and showing how ideas spread and develop. There are a number of moving parts in this effort: some related to capturing and storing linking information, others to making it available.

By including relationship metadata in Event Data, we are taking a big step to improve the visibility of a large number of links between metadata. We know this is long-promised and we’re pleased that making this valuable metadata available supports a number of important initiatives. We will also be backfilling, so all previously deposited relationships will eventually become available as events. The first step will be to add relationships between items that have DOIs, such as between a research article and a related review report or dataset.