Blog

Tony Hammond

Tony worked alongside Crossref at nature.com between 2006 and 2010.

Using ISO URNs

Tony Hammond – 2007 October 01

In Identifiers

(Update - 2007.10.02: Just realized that there were some serious flaws in the post below regarding publication and form of namespace URIs which I’ve now addressed in a subsequent post here.)

By way of experimenting with a use case for ISO URNs, below is a listing of the document metadata for an arbitrary PDF. (You can judge for yourselves whether the metadata disclosed here is sufficient to describe the document.) Here, the metadata is taken from the information dictionary and from the document metadata stream (XMP packet).

The metadata is expressed in RDF/N3. That may not be a surprise for the XMP packet, which is serialized in RDF/XML, as it’s just a hop, skip and a jump to render it as RDF/N3 with properties taken from schemas whose namespaces are identified by URI. What may be more unusual is to see the document information dictionary metadata (the “normal” metadata in a PDF) rendered as RDF/N3, since the information dictionary is not modelled on RDF, not expressed in XML, and not namespaced. Here, in addition to the trusty HTTP URI scheme, I’ve made use of two particular URI schemes: “iso:” URN namespaces, and “data:” URIs.
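For readers who have not met “data:” URIs, here is a minimal Python sketch of how a short literal value can be carried inside one. The helper names `text_to_data_uri` and `data_uri_to_text` are hypothetical, and the choice of `text/plain;charset=utf-8` is an assumption for illustration; how the listing in the full post actually uses the scheme is not shown in this excerpt.

```python
from urllib.parse import quote, unquote


def text_to_data_uri(text: str) -> str:
    """Wrap a short string in a data: URI, percent-encoding it as UTF-8."""
    # Assumption: plain text with an explicit UTF-8 charset parameter.
    return "data:text/plain;charset=utf-8," + quote(text, safe="")


def data_uri_to_text(uri: str) -> str:
    """Recover the string from a URI produced by text_to_data_uri."""
    header, _, payload = uri.partition(",")
    if not header.startswith("data:"):
        raise ValueError(f"not a data: URI: {uri!r}")
    return unquote(payload)


uri = text_to_data_uri("Hello, world")
print(uri)
print(data_uri_to_text(uri))
```

The value travels inside the identifier itself, so nothing has to be dereferenced over the network to read it.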

(Continues.)

Whole Lotta ID

Tony Hammond – 2007 October 01

In Identifiers

ISO has registered with the IANA a URN namespace identifier (“iso:”) for ISO persistent resources. From the Internet-Draft: “This URN NID is intended for use for the identification of persistent resources published by the ISO standards body (including documents, document metadata, extracted resources such as standard schemata and standard value sets, and other resources).” The top-level grammar rules (ABNF) give some indication of scope:

    NSS     = std-nss
    std-nss = "std:" docidentifier *supplement *docelement [addition]
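As a rough illustration of those two top-level rules, the Python sketch below splits a URN into its NID and NSS and checks that the NSS begins with “std:”. The function name `split_iso_urn` and the example URN are assumptions made for illustration; the sub-rules (docidentifier, supplement, docelement, addition) are not quoted in this excerpt, so the sketch does not attempt to parse them.

```python
def split_iso_urn(urn: str) -> dict:
    """Split a URN of the form urn:iso:std:... into its top-level parts.

    Only the two rules quoted above are checked; everything after
    "std:" is returned unparsed.
    """
    scheme, nid, nss = urn.split(":", 2)
    if scheme.lower() != "urn" or nid.lower() != "iso":
        raise ValueError(f"not an ISO URN: {urn!r}")
    if not nss.lower().startswith("std:"):
        raise ValueError(f"NSS does not start with 'std:': {nss!r}")
    return {"nid": nid.lower(), "nss": nss, "std_body": nss[len("std:"):]}


# Illustrative value only (an assumption, not taken from this post).
example = "urn:iso:std:iso:9999:-1:ed-1:en"
parts = split_iso_urn(example)
print(parts["nid"])
print(parts["std_body"])
```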

Authors in Context?

Tony Hammond – 2007 September 30

In ORCID

On the subject of author IDs (a subject Crossref is interested in, and on which it held a meeting earlier this year, as blogged about here), this post by Karen Coyle, “Name authority control, aka name identification”, may be worth a read. She starts off with this: “Libraries do something they call “name authority control”. For most people in IT, this would be called “assigning unique identifiers to names.” Identifying authors is considered one of the essential aspects of library cataloging, and it isn’t done in any other bibliographic environment, as far as I know.

XMP-Ville

Tony Hammond – 2007 September 25

In XMP

Been so busy looking into the technical details of XMP that I almost forgot to check out the current landscape. Luckily I chanced on these articles by Ron Roszkiewicz for The Seybold Report (and apologies for lifting the title of this post from his last). The articles about XMP are well worth reading and chart the painful progress made to date:

  • The Brief Tortured Life of XMP (July ’05)
  • Thought Leaders Hammer out Metadata Standard (April ’07)
  • Metadata Persistence and “Save for Web…” (July ’07)

From the earlier characterization of XMP as “underachieving teenager” Roszkiewicz is cautiously optimistic that IDEAlliance’s XMP Open initiative (an initiative to advance XMP as an open industry specification) will help outreach and foster adoption of this fledgling technology.

(Continues.)

The Name’s The Thing

Tony Hammond – 2007 September 20

In XMP

I’m always curious about names and where they come from and what they mean. Hence, my interest was aroused with the constant references to “XAP” in XMP. As the XMP Specification (Sept. 2005) says:

“NOTE: The string “XAP” or “xap” appears in some namespaces, keywords, and related names in this document and in stored XMP data. It reflects an early internal code name for XMP; the names have been preserved for compatibility purposes.”

Actually, it occurs in most of the core namespaces: XAP, rather than XMP.
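A quick way to see this is to list a few namespace URIs and check which contain “xap”. The table below is a partial one written from memory of the XMP Specification and should be treated as an assumption to verify against the spec; `uses_xap` is a hypothetical helper, and Dublin Core is included only as a contrast.

```python
# Partial table of schema prefixes and namespace URIs (an assumption;
# check against the XMP Specification before relying on it).
XMP_NAMESPACES = {
    "xmp": "http://ns.adobe.com/xap/1.0/",
    "xmpRights": "http://ns.adobe.com/xap/1.0/rights/",
    "xmpMM": "http://ns.adobe.com/xap/1.0/mm/",
    "xmpBJ": "http://ns.adobe.com/xap/1.0/bj/",
    "xmpTPg": "http://ns.adobe.com/xap/1.0/t/pg/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def uses_xap(uri: str) -> bool:
    """True if the namespace URI still carries the old 'xap' code name."""
    return "/xap/" in uri


xap_prefixes = [prefix for prefix, uri in XMP_NAMESPACES.items() if uses_xap(uri)]
print(xap_prefixes)
```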

(Continues.)

Chapter 9 - The Closed Book

Tony Hammond – 2007 September 15

In Discussion

Hadn’t really noticed before but was fairly gobsmacked by this notice I just saw on the DOI® Handbook:

**Please note that Chapter 9, Operating Procedures is for Registration Agency personnel only.**

DOI® Handbook
doi:10.1000/182
http://www.doi.org/hb.html

And, indeed, the Handbook’s TOC only reconfirms this:

9 Operating procedures*
*The RA password is required for viewing Chapter 9.
9.1 Registering a DOI name with associated metadata
9.2 Prefix assignment
9.3 Transferring DOI names from one Registrant to another

Custom Panel for CC

Tony Hammond – 2007 September 15

In Metadata

Creative Commons now have a custom panel for adding CC licenses using Adobe apps - see here. Interesting on two counts:

  • Machine readable licenses
  • XMP metadata

But I still think that batch solutions for adding XMP metadata are really required for publishing workflows. And ideally there should be support for adding arbitrary XMP packets if we’re going to have truly rich metadata. I rather fear the constraints that custom panels place upon the publisher.

Last Orders Please!

Tony Hammond – 2007 September 13

In Metadata

Public comment period on the PRISM 2.0 draft ends Saturday (Sept. 15) ahead of next week’s WG meeting to review feedback and finalize the spec. (I put in some comments about XMP already. Hope they got that.)

Marking up DOI

Tony Hammond – 2007 September 11

In XMP

(Update - 2007.09.15: Clean forgot to add in the rdf: namespace to the examples for xmp:Identifier in this post. I’ve now added in that namespace to the markup fragments listed. Also added in a comment here which shows the example in RDF/XML for those who may prefer that over RDF/N3.)

So, as a preliminary to reviewing how a fuller metadata description of a Crossref resource may best be fitted into an XMP packet for embedding into a PDF, let’s just consider how a DOI can be embedded into XMP. And since it’s so much clearer to read let’s just conduct this analysis using RDF/N3. (Life is too short to be spent reading RDF/XML or C++ code. :~)

(And further to Chris Shillum’s comment on my earlier post Metadata in PDF: 2. Use Cases, where he notes that Elsevier are looking to upgrade their markup of DOI in PDF to use XMP, I’m really hoping that Elsevier may have something to bring to the party and share with us. A consensus rendering of DOI within XMP is going to be of benefit to all.)
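To give a rough idea of what a DOI inside xmp:Identifier might look like in RDF/N3, here is a minimal Python sketch that prints one possible fragment. The `doi:` form of the identifier, the use of an rdf:Bag, and the helper name `doi_as_n3` are assumptions for illustration, not the rendering worked out in the full post; the DOI used is the DOI® Handbook’s.

```python
XMP_NS = "http://ns.adobe.com/xap/1.0/"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def doi_as_n3(doi: str) -> str:
    """Render a DOI as an xmp:Identifier bag with one member, in RDF/N3."""
    identifier = f"doi:{doi}"  # assumed identifier form
    # Escape backslashes and double quotes so the literal stays valid N3.
    literal = identifier.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join([
        f"@prefix xmp: <{XMP_NS}> .",
        f"@prefix rdf: <{RDF_NS}> .",
        "",
        f'<> xmp:Identifier [ a rdf:Bag ; rdf:_1 "{literal}" ] .',
    ])


n3 = doi_as_n3("10.1000/182")
print(n3)
```

Both prefixes are declared up front because the bag and its member come from the rdf: namespace, while the property itself comes from xmp:.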

(Continues.)

The Second Wave

Tony Hammond – 2007 September 11

In Metadata

You might have been wondering why I’ve been banging on about XMP here. Why the emphasis on one vendor technology on a blog focussed on an industry linking solution? Well, this post is an attempt to answer that.

Four years ago we at Nature Publishing Group, along with a select few early adopters, started up our RSS news feeds. We chose RSS 1.0 as our platform, which allowed us to embed a rich metadata term set using multiple schemas - especially Dublin Core and PRISM. We evangelized this a good deal at the time and published documents on XML.com (Jul. ’03) and in D-Lib Magazine (Dec. ’04), as well as speaking about it at various meetings and blogging about it. Since then many more publishers have come on board and now provide RSS routinely, many of them choosing to enrich their feeds with metadata.

Well, RSS can be seen in hindsight as being the First Wave of projecting a web presence beyond the content platform using standard markup formats. With this embedded metadata a publisher can expand their web footprint and allow users to link back to their content server.

Now, XMP with its potential for embedding metadata in rich media can be seen as a Second Wave. Media assets distributed over the network can now carry along their own metadata and identity which can be leveraged by third-party applications to provide interesting new functionalities and link-back capability. Again a projection of web presence.

(Continues.)