In the first half of this year we’ve been talking to our community about post-publication changes and Crossmark. When a piece of research is published it isn’t the end of the journey—it is read, reused, and sometimes modified. That’s why we run Crossmark, as a way to provide notifications of important changes to research made after publication. Readers can see if the resesarch they are looking at has updates by clicking the Crossmark logo.
We’re happy to note that this month, we are marking five years since Crossref launched its Grant Linking System. The Grant Linking System (GLS) started life as a joint community effort to create ‘grant identifiers’ and support the needs of funders in the scholarly communications infrastructure.
The system includes a funder-designed metadata schema and a unique link for each award which enables connections with millions of research outputs, better reporting on the research and outcomes of funding, and a contribution to open science infrastructure.
In our previous blog post about metadata matching, we discussed what it is and why we need it (tl;dr: to discover more relationships within the scholarly record). Here, we will describe some basic matching-related terminology and the components of a matching process. We will also pose some typical product questions to consider when developing or integrating matching solutions.
Basic terminology Metadata matching is a high-level concept, with many different problems falling into this category.
Update 2024-07-01: This post is based on an interview with Euan Adie, founder and director of Overton._
What is Overton? Overton is a big database of government policy documents, also including sources like intergovernmental organizations, think tanks, and big NGOs and in general anyone who’s trying to influence a government policy maker. What we’re interested in is basically, taking all the good parts of the scholarly record and applying some of that to the policy world.
Crossref’s Similarity Check service is used by our members to detect text overlap with previously published work that may indicate plagiarism of scholarly or professional works. Manuscripts can be checked against millions of publications from other participating Crossref members and general web content using the iThenticate text comparison software from Turnitin.
The 2000 members who already make use of Similarity Check upload almost 2,000,000 documents each month to look for matching text in other publications.
We have some great news for those 2000 members –– a completely new version of iThenticate is on its way, and will start to roll out to users in the coming months.
New functionality has been developed based on your feedback over the past few years and includes:
An improved Document Viewer that makes PDFs searchable and accessible, with responsive design for ease of use on different screen sizes. All of the functionality of the Viewer and the Text-only reports in the previous version have been streamlined into just two views: Sources Overview and All Sources.
Improved exclusion options to make refining matches even easier. Smarter citation detection now identifies probable citations both inline and in reference sections.
A new “Content Portal” where you can see what percentage of your own content has been successfully indexed for the iThenticate comparison database, and download reports of indexing errors that need to be fixed.
A new API for integration with manuscript submission systems allows display of the largest matching word count and the top 5 source matches alongside the Similarity Score.
The maximum number of pages and file size per document has been doubled to 800 pages/200 MB.
The new document viewer in iThenticate v2.0
Improved reference exclusion
Crossref members can use Similarity Check directly by logging in, or via an integration with a submission/peer review system. We are working with many system providers to bring v2.0 to you as soon as possible. In the meantime, we are looking for members to help us test the new system directly in the iThenticate user interface. If you are interested and can spare a few hours some time in the next month please let me know.
And if your organization is not yet using Similarity Check to assess the originality of the manuscripts you receive do take a look at the many benefits the service has to offer.