Community next steps for making globally unique identifiers work for biocollections data.

Guralnick RP, Cellinese N, Deck J, Pyle RL, Kunze J, Penev L, Walls R, Hagedorn G, Agosti D, Wieczorek J, Catapano T, Page RD - Zookeys (2015)

Bottom Line: Biodiversity data is being digitized and made available online at a rapidly increasing rate but current practices typically do not preserve linkages between these data, which impedes interoperation, provenance tracking, and assembly of larger datasets. There has neither been coalescence towards one single identifier solution (as in some other domains), nor even a set of recommended best practices and standards to support multiple identifier schemes sharing consistent responses. Current identifier characteristics are also summarized, and an overview of available schemes and practices is provided.


Affiliation: Florida Museum of Natural History, University of Florida, Gainesville, FL 32611-2710 USA.

ABSTRACT
Biodiversity data is being digitized and made available online at a rapidly increasing rate but current practices typically do not preserve linkages between these data, which impedes interoperation, provenance tracking, and assembly of larger datasets. For data associated with biocollections, the biodiversity community has long recognized that an essential part of establishing and preserving linkages is to apply globally unique identifiers at the point when data are generated in the field and to persist these identifiers downstream, but this is seldom implemented in practice. There has neither been coalescence towards one single identifier solution (as in some other domains), nor even a set of recommended best practices and standards to support multiple identifier schemes sharing consistent responses. In order to further progress towards a broader community consensus, a group of biocollections and informatics experts assembled in Stockholm in October 2014 to discuss community next steps to overcome current roadblocks. The workshop participants divided into four groups focusing on: identifier practice in current field biocollections; identifier application for legacy biocollections; identifiers as applied to biodiversity data records as they are published and made available in semantically marked-up publications; and cross-cutting identifier solutions that bridge across these domains. The main outcome was consensus on key issues, including recognition of differences between legacy and new biocollections processes, the need for identifier metadata profiles that can report information on identifier persistence missions, and the unambiguous indication of the type of object associated with the identifier. Current identifier characteristics are also summarized, and an overview of available schemes and practices is provided.


Copyright policy: Creative Commons Attribution

Figure 3: Identifier schemes differ in whether redirections and mappings to ensure stability are centrally managed or not. Top: a DOI dereferencing service like CrossRef or Datacite redirects to the actual content provider; the URIs of content data and RDF metadata are publicly visible and can be used as independent (albeit often unstable) identifiers. Bottom: A linked open data pattern, where each content provider assumes the responsibility for maintaining a stable mapping; the content negotiation is internal. Modified after Hagedorn 2013.
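
The linked open data pattern in the bottom panel can be pictured with a small sketch in which a provider keeps one stable identifier and chooses the representation internally from the request's Accept header. Everything named here is an assumption made for illustration: the `collection.example.org` URLs, the `REPRESENTATIONS` table and the `negotiate` function are hypothetical, and q-values in the Accept header are deliberately ignored to keep the example short.

```python
# Hypothetical provider-side table: one stable identifier, several representations.
REPRESENTATIONS = {
    "https://collection.example.org/id/specimen-1": {
        "text/html": "https://collection.example.org/page/specimen-1.html",
        "application/rdf+xml": "https://collection.example.org/data/specimen-1.rdf",
    }
}


def negotiate(identifier: str, accept: str, default: str = "text/html") -> str:
    """Pick the representation URL for a stable identifier from an Accept header."""
    available = REPRESENTATIONS[identifier]
    # Walk the requested media types in the order listed (q-values are ignored here).
    for item in accept.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type in available:
            return available[media_type]
    # Nothing matched (for example "*/*"): fall back to the default representation.
    return available[default]
```

In this sketch only the stable identifier needs to be cited; the page and RDF URLs stay an internal detail of the provider, which is the contrast the figure draws with a central redirection service.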

Mentions: It is not feasible (or, at this stage, even desirable) for the entire biodiversity community to adopt a single implementation for identifiers. However, evaluating the available technical solutions is a high priority; candidates include IGSNs, DOIs, EZID ARKs, LOD-URIs and UUIDs (Table 2 compares many of these options, and Figure 3 compares more and less centrally managed mapping and redirection services). The group explored several different viewpoints promoting the use of HTTP URIs for all identifiers and did not reach a consensus. HTTP URIs have the advantage that they provide a semantic-web-compatible default dereferencing method through the standard HTTP protocol and can be flexibly constructed (Hagedorn et al. 2013). The advantage of identifiers that are not HTTP URIs is that omitting a default dereferencing method avoids potential confusion and may allow for even greater flexibility. However, we recommend that all identifiers be dereferenceable through at least one HTTP-based service, even if the HTTP form is not preferred.
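
As a rough illustration of the recommendation that every identifier be dereferenceable through at least one HTTP-based service, the sketch below maps a few identifier forms to an HTTP form. The prefixes `https://doi.org/` and `https://n2t.net/` are existing public resolvers for DOIs and ARKs; the function name `to_http_form`, the sample identifiers in the usage note, and the `uuid_base` parameter (a provider-specific URL, needed because a bare UUID has no default resolver) are assumptions introduced here and are not part of the workshop's recommendations.

```python
from typing import Optional
import uuid


def to_http_form(identifier: str, uuid_base: Optional[str] = None) -> str:
    """Return an HTTP URI through which the identifier could be dereferenced."""
    ident = identifier.strip()
    lower = ident.lower()
    if lower.startswith(("http://", "https://")):
        return ident  # already in HTTP form
    if lower.startswith("doi:"):
        return "https://doi.org/" + ident[4:]
    if lower.startswith("ark:"):
        return "https://n2t.net/" + ident
    if lower.startswith("urn:uuid:"):
        ident = ident[len("urn:uuid:"):]
    try:
        parsed = uuid.UUID(ident)
    except ValueError:
        raise ValueError(f"unrecognized identifier form: {identifier!r}")
    if uuid_base is None:
        # Assumption: a UUID is only dereferenceable via a service its provider runs.
        raise ValueError("a UUID needs a provider-specific base URL")
    return uuid_base.rstrip("/") + "/" + str(parsed)
```

For example, `to_http_form("doi:10.1000/182")` yields `https://doi.org/10.1000/182`, while a UUID is only given an HTTP form when the caller supplies a base URL such as the hypothetical `https://collection.example.org/id`. The non-HTTP form can remain the preferred one for citation; the mapping only guarantees that an HTTP route exists.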

