This year, metadata development is one of our key priorities and we’re making a start with the release of version 5.4.0 of our input schema with some long-awaited changes. This is the first in what will be a series of metadata schema updates.
What is in this update?
Publication typing for citations
This is fairly simple; we’ve added a ‘type’ attribute to the citations members supply. This means you can identify a journal article citation as a journal article, but more importantly, you can identify a dataset, software, blog post, or other citation that may not have an identifier assigned to it. This makes it easier for the many thousands of metadata users to connect these citations to identifiers. We know many publishers, particularly journal publishers, already collect this information, and we hope they will consider depositing citation types with their records.
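As a rough illustration of what a typed citation might look like, here is a minimal Python sketch that builds a citation element with a ‘type’ attribute. The element names (citation, unstructured_citation), the key attribute, and the value "dataset" are assumptions made for illustration and should be checked against the 5.4.0 schema documentation; the citation text itself is a made-up placeholder.

```python
import xml.etree.ElementTree as ET


def build_citation(key: str, text: str, citation_type: str) -> ET.Element:
    """Build a citation element that carries a 'type' attribute.

    Element and attribute names here are assumptions for illustration,
    not a verified copy of the 5.4.0 input schema.
    """
    citation = ET.Element("citation", {"key": key, "type": citation_type})
    unstructured = ET.SubElement(citation, "unstructured_citation")
    unstructured.text = text
    return citation


# Placeholder reference with no identifier, marked as a dataset.
citation = build_citation(
    key="ref1",
    text="Example Author (2024). Example dataset title. Example Repository.",
    citation_type="dataset",  # assumed value; consult the schema for allowed types
)
xml_text = ET.tostring(citation, encoding="unicode")
print(xml_text)
```

Even without an identifier in the reference, a metadata user reading this record could tell from the type that the citation points to a dataset and try to match it accordingly.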
Every year we release metadata for the full corpus of records registered with us, which can be downloaded for free in a single compressed file. This is one way in which we fulfil our mission to make metadata freely and widely available. By including the metadata of over 165 million research outputs from over 20,000 members worldwide and making them available in a standard format, we streamline access to metadata about scholarly objects such as journal articles, books, conference papers, preprints, research grants, standards, datasets, reports, blogs, and more.
Today, we’re delighted to let you know that Crossref members can now use ROR IDs to identify funders in any place where you currently use Funder IDs in your metadata. Funder IDs remain available, but this change allows publishers, service providers, and funders to streamline workflows and introduce efficiencies by using a single open identifier for both researcher affiliations and funding organizations.
As you probably know, the Research Organization Registry (ROR) is a global, community-led, carefully curated registry of open persistent identifiers for research organizations, including funding organizations. Launched in 2019, it is a joint initiative led by the California Digital Library, DataCite, and Crossref that fulfills the long-standing need for an open organization identifier.
We began our Global Equitable Membership (GEM) Program to provide greater membership equitability and accessibility to organizations in the world’s least economically advantaged countries. Eligibility for the program is based on a member’s country; our list of countries is predominantly based on the International Development Association (IDA). Eligible members pay no membership or content registration fees. The list undergoes periodic reviews, as countries may be added or removed over time as economic situations change.
The information below will help you understand how to interpret your Similarity Report, whether you’re using iThenticate v1 or v2.
To calculate the Similarity Score, iThenticate scans the text of your submitted document and checks it against each of the repositories you’ve chosen. The system takes the number of matching words found within the document and divides it by the document’s total word count to produce the Similarity Score percentage for the report.
If you apply exclusion options to the document, the system removes all matches covered by those exclusion options and recalculates the Similarity Score percentage.
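The calculation described above can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not iThenticate’s actual implementation. The Match class, its fields, and the category labels are hypothetical, and the sketch assumes that matches do not overlap and that the denominator remains the document’s total word count after exclusions are applied.

```python
from dataclasses import dataclass


@dataclass
class Match:
    source: str      # hypothetical label for the matched source
    word_count: int  # matching words attributed to this source
    category: str    # hypothetical category, e.g. "body" or "bibliography"


def similarity_score(matches, total_words, excluded_categories=frozenset()):
    """Return matching words as a percentage of the total word count.

    Matches whose category is excluded are dropped before the score is
    recalculated. Assumes matches do not overlap one another.
    """
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    matching_words = sum(
        m.word_count for m in matches if m.category not in excluded_categories
    )
    return 100.0 * matching_words / total_words


# Made-up numbers: a 2,000-word submission with two matched sources.
matches = [
    Match(source="Source A", word_count=300, category="body"),
    Match(source="Source B", word_count=200, category="bibliography"),
]

score_all = similarity_score(matches, total_words=2000)
score_excluding_bibliography = similarity_score(
    matches, total_words=2000, excluded_categories={"bibliography"}
)
print(score_all)                     # 25.0
print(score_excluding_bibliography)  # 15.0
```

With these made-up numbers, excluding the bibliography matches lowers the score from 25% to 15%, which shows why the same document can produce different percentages under different exclusion settings.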
iThenticate does not check for plagiarism; it checks for similarity. Where a section of the submission’s content is similar or identical to one or more sources, it will be flagged for review. A flag does not automatically mean plagiarism, only similarity.
It’s perfectly natural for a submission to match against some sources in the database. A high degree of overlap may indicate a well-researched document with many references to existing work, and as long as these sources are quoted and referenced correctly, this is perfectly acceptable. A high degree of overlap may also be present where an author has already shared their work on a preprint repository. If the author(s) are the same, this is not a problem.
It’s important that you don’t set a Similarity Score threshold above which you automatically reject manuscripts. Where there’s a high degree of overlap, your editors and reviewers should decide whether the match is acceptable as part of their general review process.
Similarity Reports and preprints
It is entirely possible (and acceptable) for an author to submit an article to a journal even though they’ve previously made the article available as a preprint. In this case, we expect a high degree of similarity between the preprint and the author’s submitted manuscript.
Therefore, if you find a high degree of similarity between a manuscript you’re checking in iThenticate and a preprint by the same author(s), this is likely to be because the manuscript is a match with its own preprint. However, if the manuscript and preprint do not have the same author(s), this may indicate a problem, and you should investigate further.
Some preprints can be found in iThenticate’s Crossref Posted Content repository, so take this into account if you are checking against this repository.
But even if you have excluded the Crossref Posted Content repository in your settings (v1 or v2), it is still possible for preprints to appear as matches to a submission, because iThenticate also crawls preprint repositories on the web.
We recommend including preprints in your results so that you can check that preprints haven’t been plagiarised by a different author. If you see a preprint match for the same author, this isn’t plagiarism.