Deposit harvester
The deposit harvester allows you to retrieve metadata records for content that you’ve registered. The metadata retrieved is in our UNIXSD output format, which delivers the exact metadata submitted in a deposit, including any citations registered. Members (or their designated third parties) may only retrieve their own metadata.
The harvester uses Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) to deliver the metadata. The verbs Identify, ListMetadataFormats, ListSets, ListIdentifiers, ListRecords, and GetRecord are supported.
Ownership and retrieval restrictions - who can retrieve records?
The deposit harvester will only retrieve records for the authorized owner of the metadata records. Metadata ownership is established by the DOI prefix(es) associated with a user’s account (learn more about transferring responsibility for DOIs). Many members have one prefix and one account, but some members have multiple prefixes. For example, Member A has been assigned account abcd, which is associated with prefixes 10.xxxx, 10.yyyy, and 10.zzzz. Member A can retrieve metadata owned by prefixes 10.xxxx, 10.yyyy, and 10.zzzz using their abcd account.
Ownership of DOIs and titles often moves from member to member, so a title-owning prefix will not always match the prefix of the DOIs attached to the title. Retrieval permission is granted to the current owner, not the original depositor. For example, Member B registers identifier 10.5555/jfo.33425. Ownership of the journal and all identifiers is transferred to Member A with prefix 10.50505. The DOI is now “owned” by prefix 10.50505, and only Member A may harvest the metadata record for that identifier.
Sets
The deposit harvester supports a hierarchy of sets. The hierarchy is in three parts: <work-type>:<prefix>:<publication-id>. For example, the set J:10.12345:6789 will return metadata for a journal (J) with prefix 10.12345 and publication ID 6789. The set B will return all book metadata. The set S:10.12345 will return all the series metadata associated with the 10.12345 prefix.
The work-type designators are:
- J for journals
- B for books and book-like works (reports, conference proceedings, standards, dissertations)
- S for non-journal series and series-like works.
If no set is specified, the set “J” is used.
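The three-part hierarchy above can be sketched as a small helper that assembles a set value from its parts. This is a minimal illustration, not part of the harvester itself: the function name build_set_spec and its validation rules are assumptions made for this example, and the default of J mirrors the behaviour described above when no set is given.

```python
# Work-type designators as listed in this section.
WORK_TYPES = {
    "J": "journals",
    "B": "books and book-like works",
    "S": "non-journal series and series-like works",
}


def build_set_spec(work_type="J", prefix=None, publication_id=None):
    """Join <work-type>:<prefix>:<publication-id>, omitting trailing parts.

    Hypothetical helper: the name and the checks are illustrative only.
    """
    if work_type not in WORK_TYPES:
        raise ValueError(f"unknown work type: {work_type!r}")
    if publication_id is not None and prefix is None:
        # The hierarchy is positional, so a publication id needs a prefix.
        raise ValueError("publication_id requires a prefix")

    parts = [work_type]
    if prefix is not None:
        parts.append(prefix)
    if publication_id is not None:
        parts.append(str(publication_id))
    return ":".join(parts)


print(build_set_spec("J", "10.12345", 6789))
print(build_set_spec("S", "10.12345"))
print(build_set_spec("B"))
```

The three calls reproduce the example sets used in this section: one journal title, all series for a prefix, and all book metadata.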
Example requests
ListSets
Retrieve a list of the titles owned by the prefixes assigned to your account:
https://oai.crossref.org/DepositHarvester?verb=ListSets&usr=username&pwd=password
ListRecords
Retrieve data for a prefix:
https://oai.crossref.org/DepositHarvester?verb=ListRecords&metadataPrefix=cr_unixsd&set=work-type:prefix&usr=username&pwd=password
Retrieve data for a single title:
https://oai.crossref.org/DepositHarvester?verb=ListRecords&metadataPrefix=cr_unixsd&set=work-type:prefix:title ID&usr=username&pwd=password
GetRecord
Retrieve data for a single DOI:
https://oai.crossref.org/DepositHarvester?verb=GetRecord&metadataPrefix=cr_unixsd&identifier=info:doi/DOI&usr=username&pwd=password
When using GetRecord, the <DOI> value should be URL encoded.
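As a sketch of the encoding step, the example below percent-encodes a DOI with Python's standard urllib.parse.quote and places it in a GetRecord request. The function name get_record_url is hypothetical, and encoding every reserved character (including the slash in the DOI) is an assumption about what "URL encoded" requires here; the username and password values are placeholders.

```python
from urllib.parse import quote

BASE_URL = "https://oai.crossref.org/DepositHarvester"


def get_record_url(doi, usr, pwd):
    """Build a GetRecord request URL with a percent-encoded DOI.

    Hypothetical helper. safe="" makes quote() encode every reserved
    character, including "/", which is an assumption of this sketch.
    """
    identifier = "info:doi/" + quote(doi, safe="")
    return (
        f"{BASE_URL}?verb=GetRecord&metadataPrefix=cr_unixsd"
        f"&identifier={identifier}"
        f"&usr={quote(usr, safe='')}&pwd={quote(pwd, safe='')}"
    )


print(get_record_url("10.5555/jfo.33425", "username", "password"))
```

The URL is assembled by hand rather than with urlencode so that the already-encoded DOI is not encoded a second time.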
Identify
Check the status of the deposit harvester (no account needed):
https://oai.crossref.org/DepositHarvester?verb=Identify
List the available metadata formats (currently UNIXREF):
https://oai.crossref.org/DepositHarvester?verb=ListMetadataFormats
Request parameters
- work-type: J for journals, B for book or conference proceeding titles, S for series
- prefix: the owning prefix of the title being retrieved
- title ID: the title identification number assigned by us. Title IDs are included in the ListSets response described above.
- username and password: account details for the prefix/title being retrieved
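To show how these parameters combine, here is a minimal sketch that builds a ListRecords request URL from a set value and account details. The function name list_records_url is an assumption for this example; the endpoint, verb, and cr_unixsd metadata prefix are taken from the example requests above, and the credentials are placeholders.

```python
from urllib.parse import urlencode

BASE_URL = "https://oai.crossref.org/DepositHarvester"


def list_records_url(set_spec, usr, pwd):
    """Build a ListRecords request URL (hypothetical helper).

    set_spec is work-type, work-type:prefix, or work-type:prefix:title ID.
    """
    params = {
        "verb": "ListRecords",
        "metadataPrefix": "cr_unixsd",
        "set": set_spec,
        "usr": usr,
        "pwd": pwd,
    }
    # safe=":" keeps the colons in the set value readable instead of %3A.
    return f"{BASE_URL}?{urlencode(params, safe=':')}"


print(list_records_url("J:10.4102:83986", "username", "password"))
```

Using urlencode means any special characters in the account details are escaped automatically.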
Results
Results conform to Crossref’s UNIXREF format and may contain the following root elements:
- journal
- book
- conference
- dissertation
- report-paper
- standard
- sa_component
- database
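One way to get a quick overview of a harvested response is to count how many of these root elements it contains. The sketch below assumes only that the element names listed above appear as XML local names somewhere in the response; it ignores namespaces entirely, and the sample document is invented for illustration and is not a real harvester response.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Root element names as listed in this section.
ROOT_ELEMENTS = {
    "journal", "book", "conference", "dissertation",
    "report-paper", "standard", "sa_component", "database",
}


def local_name(tag):
    """Strip an ElementTree '{namespace}' prefix from a tag, if present."""
    return tag.rsplit("}", 1)[-1]


def tally_root_elements(xml_text):
    """Count occurrences of each listed root element in an XML document."""
    counts = Counter()
    for element in ET.fromstring(xml_text).iter():
        name = local_name(element.tag)
        if name in ROOT_ELEMENTS:
            counts[name] += 1
    return counts


# Invented sample; the wrapper element and namespace are placeholders.
sample = (
    '<records xmlns="urn:example">'
    "<journal><journal_metadata/></journal><book/><journal/>"
    "</records>"
)
print(tally_root_elements(sample))
```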
Using resumption tokens with the deposit harvester
Some OAI-PMH responses are too large to be returned in a single transaction. If a response contains a resumption token, you must make an additional request to retrieve the rest of the data. You must provide the account name and password with both the initial request and every subsequent resumption request; a resumption request without authentication details will fail. Learn more about resumption tokens.
Initial request
https://oai.crossref.org/DepositHarvester?verb=ListRecords&metadataPrefix=cr_unixsd&set=J:10.4102:83986&usr=username&pwd=password
Request with resumption token
https://oai.crossref.org/DepositHarvester?verb=ListRecords&metadataPrefix=cr_unixsd&set=J:10.4102:83986&usr=username&pwd=password&resumptionToken=01f7f30e-f692-4cc4-97b2-1eaf88b3f17f
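The request pattern above can be sketched as a loop that keeps requesting pages until a response no longer carries a resumption token. This is an illustration under stated assumptions, not a reference client: it assumes the response uses the standard OAI-PMH 2.0 namespace and resumptionToken element, that an empty or missing token marks the last page, and it takes a caller-supplied fetch function (a hypothetical parameter) so the HTTP library is left open. The names harvest and next_token are also invented for this example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "https://oai.crossref.org/DepositHarvester"
OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"  # standard OAI-PMH namespace


def next_token(xml_text):
    """Return the resumption token in a response, or None if absent or empty."""
    node = ET.fromstring(xml_text).find(f".//{OAI_NS}resumptionToken")
    if node is None or not (node.text or "").strip():
        return None
    return node.text.strip()


def harvest(params, fetch):
    """Yield each response page, following resumption tokens.

    params must include usr and pwd; they are resent with every request,
    as this section requires. fetch(url) must return the response body
    as text.
    """
    token = None
    while True:
        query = dict(params)
        if token:
            query["resumptionToken"] = token
        page = fetch(f"{BASE_URL}?{urlencode(query, safe=':')}")
        yield page
        token = next_token(page)
        if token is None:
            break
```

A fetch function could be as simple as a wrapper around urllib.request.urlopen that decodes the response body; injecting it also makes the loop easy to exercise with canned responses.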