This year, metadata development is one of our key priorities and we’re making a start with the release of version 5.4.0 of our input schema with some long-awaited changes. This is the first in what will be a series of metadata schema updates.
What is in this update?
Publication typing for citations
This is fairly simple; we’ve added a ‘type’ attribute to the citations members supply. This means you can identify a journal article citation as a journal article, but more importantly, you can identify a dataset, software, blog post, or other citation that may not have an identifier assigned to it. This makes it easier for the many thousands of metadata users to connect these citations to identifiers. We know many publishers, particularly journal publishers, already collect this information, and we hope they will consider depositing citation types with their records.
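As a rough illustration of what a typed citation could look like, here is a minimal Python sketch that builds a citation element carrying a ‘type’ attribute. The value "dataset", the key "ref1", and the citation text are made-up placeholders; the exact list of allowed type values is an assumption here and should be checked against the 5.4.0 schema itself.

```python
import xml.etree.ElementTree as ET


def build_citation(key: str, text: str, citation_type: str) -> ET.Element:
    """Build a <citation> element that carries a 'type' attribute."""
    citation = ET.Element("citation", {"key": key, "type": citation_type})
    unstructured = ET.SubElement(citation, "unstructured_citation")
    unstructured.text = text
    return citation


# "dataset" is an illustrative value (assumption); consult the schema
# for the values that are actually permitted.
element = build_citation("ref1", "Doe, J. (2023). Example survey data.", "dataset")
xml_text = ET.tostring(element, encoding="unicode")
print(xml_text)
```

The point is only that the type travels with the citation, so a metadata user can tell a dataset from a journal article even when no identifier is present.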
Every year we release metadata for the full corpus of records registered with us, which can be downloaded for free in a single compressed file. This is one way in which we fulfil our mission to make metadata freely and widely available. By including the metadata of over 165 million research outputs from over 20,000 members worldwide and making them available in a standard format, we streamline access to metadata about scholarly objects such as journal articles, books, conference papers, preprints, research grants, standards, datasets, reports, blogs, and more.
Today, we’re delighted to let you know that Crossref members can now use ROR IDs to identify funders in any place where you currently use Funder IDs in your metadata. Funder IDs remain available, but this change allows publishers, service providers, and funders to streamline workflows and introduce efficiencies by using a single open identifier for both researcher affiliations and funding organizations.
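For workflows that will now see both kinds of funder identifier, a small helper that tells them apart may be useful. The sketch below is an assumption-laden illustration, not part of any Crossref tooling: the function name is hypothetical, the ROR pattern checks only the shape of the identifier (not its checksum), and Funder IDs are recognised by their 10.13039 DOI prefix.

```python
import re

# ROR IDs: a leading "0", six characters from a restricted alphabet,
# then two check digits. Only the shape is checked here, not the checksum.
ROR_PATTERN = re.compile(
    r"^(?:https?://ror\.org/)?0[a-hj-km-np-tv-z0-9]{6}[0-9]{2}$"
)
# Funder IDs are DOIs under the Funder Registry prefix 10.13039.
FUNDER_ID_PATTERN = re.compile(
    r"^(?:https?://(?:dx\.)?doi\.org/)?10\.13039/\S+$"
)


def classify_funder_identifier(value: str) -> str:
    """Return 'ror', 'funder_id', or 'unknown' for an identifier string."""
    value = value.strip()
    if ROR_PATTERN.match(value):
        return "ror"
    if FUNDER_ID_PATTERN.match(value):
        return "funder_id"
    return "unknown"


# Both inputs are syntactically valid examples used for illustration only.
print(classify_funder_identifier("https://ror.org/02twcfp32"))
print(classify_funder_identifier("https://doi.org/10.13039/100000001"))
```

A check like this could sit in front of a deposit pipeline so that either identifier is accepted wherever a funder is named.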
As you probably know, the Research Organization Registry (ROR) is a global, community-led, carefully curated registry of open persistent identifiers for research organisations, including funding organisations. It’s a joint initiative led by the California Digital Library, DataCite, and Crossref, launched in 2019, that fulfills the long-standing need for an open organisation identifier.
We began our Global Equitable Membership (GEM) Program to provide greater membership equitability and accessibility to organizations in the world’s least economically advantaged countries. Eligibility for the program is based on a member’s country; our list of countries is predominantly based on the International Development Association (IDA). Eligible members pay no membership or content registration fees. The list undergoes periodic reviews, as countries may be added or removed over time as economic situations change.
Following on from the missing XMP Specification version number discussed in the previous post, here are some miscellaneous gripes I’ve got with XMP (which is otherwise a very promising technology). I would be more than happy to be proved wrong on any of these points.
1. XMP version history and archive
There doesn’t appear to be any XMP version history or archive hosted by Adobe as far as I can tell.
2. Unpublished schemas
Also there is nothing published - outside the XMP Spec itself - on the core schemas used by XMP. There’s nothing to be gleaned from the namespace URIs used. The Adobe namespaces, e.g.
So, that can leave us with undocumented terms (e.g. ‘xmpMM:Manifest’ used by Adobe InDesign CS2 4.0.5) from documented schemas and also undocumented schemas (e.g. ‘pdfx’).
3. UUID
Note also that many Adobe apps do not use the URN syntax for ‘uuid:’. The XMP Spec also has this to say:
“There is no formal standard for URIs that are based on an abstract UUID. The following proposal may be relevant:
(see: 3 XMP Storage Model / Serializing XMP / rdf:Description elements / rdf:about attribute)”
I guess the XMP Spec (Sept. ’05) had just been bedded down more or less when the URN namespace for ‘uuid:’ was published as RFC 4122 in July ’05.
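To make the difference concrete, here is a minimal Python sketch that normalises a bare ‘uuid:’ identifier to the RFC 4122 URN form. The function name is my own invention, and the UUID used is simply an example value.

```python
import uuid


def to_uuid_urn(identifier: str) -> str:
    """Normalise 'uuid:...', 'urn:uuid:...' or a bare UUID to the URN form."""
    value = identifier.strip()
    # Strip whichever prefix is present before parsing the UUID itself.
    for prefix in ("urn:uuid:", "uuid:"):
        if value.lower().startswith(prefix):
            value = value[len(prefix):]
            break
    # uuid.UUID validates the string; .urn renders it as 'urn:uuid:<uuid>'.
    return uuid.UUID(value).urn


print(to_uuid_urn("uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6"))
# urn:uuid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6
```

Anything that is not a well-formed UUID raises a ValueError, so the same helper doubles as a sanity check on identifiers lifted from XMP packets.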
4. RDF/XML serialization
The biggie.
XMP schemas specify fixed property value types in RDF/XML, i.e. they specify a fixed profile of RDF/XML instead of generic RDF/XML. I commented on this recently on the semantic-web list, as did Bruce D’Arcus here, speaking about OpenDocument, and Mike Linksvayer here, speaking for CC.
This profiling of RDF/XML leads to real problems. For example, Adobe have defined a Dublin Core (DC) schema which lists the property value types that DC values can assume in an XMP serialization. Meantime, the PRISM 2.0 draft spec defines an incompatible mapping of DC terms to XMP property values. Since both schemas would make use of the same DC namespace (though PRISM haven’t actually specified a DC namespace for use with XMP but do use elsewhere the regular DC namespace) this isn’t going to work. I did supply some feedback on this to the PRISM WG but have heard nothing back from them. So, PRISM XMP looks uncertain at this time. Which, for us, must be a shame.
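A small Python sketch may help show why a fixed profile bites. It assumes, to the best of my understanding of Adobe's DC schema, that dc:creator is typed as an ordered array (rdf:Seq), and it contrasts that with a plain literal, which is equally valid generic RDF/XML. The function names are hypothetical, and nothing here is meant to represent the PRISM mapping itself.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC = "http://purl.org/dc/elements/1.1/"


def creator_as_seq(names):
    """dc:creator as an ordered array (rdf:Seq), the profiled form."""
    desc = ET.Element(f"{{{RDF}}}Description")
    creator = ET.SubElement(desc, f"{{{DC}}}creator")
    seq = ET.SubElement(creator, f"{{{RDF}}}Seq")
    for name in names:
        ET.SubElement(seq, f"{{{RDF}}}li").text = name
    return desc


def creator_as_literal(name):
    """dc:creator as a plain literal, also valid generic RDF/XML."""
    desc = ET.Element(f"{{{RDF}}}Description")
    ET.SubElement(desc, f"{{{DC}}}creator").text = name
    return desc


def read_creators_strict(desc):
    """A reader that only understands the rdf:Seq form."""
    items = desc.findall(f"{{{DC}}}creator/{{{RDF}}}Seq/{{{RDF}}}li")
    return [li.text for li in items]


print(read_creators_strict(creator_as_seq(["A. Author"])))
print(read_creators_strict(creator_as_literal("A. Author")))
```

The strict reader finds the name in the first description and nothing in the second, even though both say the same thing in RDF terms. Two schemas that pick different value types for the same DC property in the same namespace put every reader in that position.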