A Schematron report tells you if there’s a metadata quality issue with your records.
Schematron is a pattern-based XML validation language. We try to stop the deposit of metadata with obvious issues, but we can’t catch everything because publication practices are so varied. For example, most family names in our database that end with ‘jr’ are the result of a publisher including a suffix (Jr) in the family name, but there are of course genuine surnames ending with ‘jr’, so we can’t simply reject them at deposit.
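To make the ‘jr’ example concrete, here is a minimal Python sketch of a pattern-style check. It is not the rule we actually run: the function name, the regular expression, and the sample records are all made up for illustration, and the `person_name`, `given_name`, and `surname` element names are assumed. It flags a family name only when “Jr” appears as a separate trailing token, which is a narrower test than “ends with jr”.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical rule: the surname ends with "jr" as its own token
# (any case, optional trailing period), e.g. "King Jr" or "Silva, Jr.".
SUFFIX_IN_SURNAME = re.compile(r"(?:^|[\s,])jr\.?$", re.IGNORECASE)


def flag_suspect_surnames(xml_text):
    """Return surnames that look like they contain a 'Jr' suffix."""
    root = ET.fromstring(xml_text)
    flagged = []
    for element in root.iter("surname"):
        name = (element.text or "").strip()
        if SUFFIX_IN_SURNAME.search(name):
            flagged.append(name)
    return flagged


# Made-up sample data; element names are assumptions for this sketch.
SAMPLE = """
<contributors>
  <person_name><given_name>Martin</given_name><surname>King Jr</surname></person_name>
  <person_name><given_name>Ana</given_name><surname>Silva, Jr.</surname></person_name>
  <person_name><given_name>Lee</given_name><surname>Bajr</surname></person_name>
  <person_name><given_name>Josiah</given_name><surname>Carberry</surname></person_name>
</contributors>
"""

if __name__ == "__main__":
    for surname in flag_suspect_surnames(SAMPLE):
        print("Possible suffix in surname:", surname)
```

A looser test that flags every surname ending in “jr” would also catch the invented name “Bajr” above, which is exactly the kind of false positive that makes it hard to block these records outright.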
We run a weekly post-registration metadata quality check on all journal, book, and conference proceedings submissions, and record the results in the Schematron report. If we spot a problem, the errors are aggregated and emailed to you in that week’s report. Any identified errors may lower overall metadata quality and negatively affect queries for your content.
What should I do with my Schematron report?
The report contains links (organized by title) to .xml files containing error details. The XML files can be downloaded and processed programmatically, or viewed in a web browser: