NeXML: rich, extensible, and verifiable representation of comparative data and metadata.

Vos RA, Balhoff JP, Caravas JA, Holder MT, Lapp H, Maddison WP, Midford PE, Priyam A, Sukumaran J, Xia X, Stoltzfus A - Syst. Biol. (2012)

Bottom Line: We describe how the use of NeXML by the TreeBASE and Phenoscape projects satisfies user needs that cannot be satisfied with other available file formats. The adoption of NeXML for practical use is facilitated by the availability of (1) an online manual with code samples and a reference to all defined elements and attributes, (2) programming toolkits in most of the languages used commonly in evolutionary informatics, and (3) input-output support in several widely used software applications. An active, open, community-based development process enables future revision and expansion of NeXML.


Affiliation: NCB Naturalis, Leiden, The Netherlands. rutger.vos@ncbnaturalis.nl

ABSTRACT
In scientific research, integration and synthesis require a common understanding of where data come from, how much they can be trusted, and what they may be used for. To make such an understanding computer-accessible requires standards for exchanging richly annotated data. The challenges of conveying reusable data are particularly acute in regard to evolutionary comparative analysis, which comprises an ever-expanding list of data types, methods, research aims, and subdisciplines. To facilitate interoperability in evolutionary comparative analysis, we present NeXML, an XML standard (inspired by the current standard, NEXUS) that supports exchange of richly annotated comparative data. NeXML defines syntax for operational taxonomic units, character-state matrices, and phylogenetic trees and networks. Documents can be validated unambiguously. Importantly, any data element can be annotated, to an arbitrary degree of richness, using a system that is both flexible and rigorous. We describe how the use of NeXML by the TreeBASE and Phenoscape projects satisfies user needs that cannot be satisfied with other available file formats. By relying on XML Schema Definition, the design of NeXML facilitates the development and deployment of software for processing, transforming, and querying documents. The adoption of NeXML for practical use is facilitated by the availability of (1) an online manual with code samples and a reference to all defined elements and attributes, (2) programming toolkits in most of the languages used commonly in evolutionary informatics, and (3) input-output support in several widely used software applications. An active, open, community-based development process enables future revision and expansion of NeXML.


Figure 3. NeXML syntax example: Phenoscape character states. This code fragment shows how the Phenoscape project uses the NeXML-compatible application Phenex to annotate character states. A character, identified by “char01,” is defined as able to occupy any of the states from state set “states01.” Within that state set, in this instance, there is only the state “state0102.” That state is annotated with an EQ statement (here expressed in a Phenex-specific XML dialect) that identifies a morphological feature called the “antorbital” and qualifies it as being absent. (In a complete NeXML document, the format element occurs within a characters element, which is preceded by a container of OTUs, i.e., an otus element, here omitted for clarity.)

The Phenoscape project (Dahdul et al. 2010a) links evolution to genomics using phenotype ontologies. By using the “Entity–Quality” (EQ; see Mungall et al. 2010) formalism of the OBO consortium, the Phenoscape project can combine organism-specific terms from ontologies such as the Teleost Anatomy Ontology (Dahdul et al. 2010b) with quality terms from the generic Phenotype And Trait Ontology (Gkoutos et al. 2004) to create intersections that describe character states in a specific group of organisms. The Phenex software (Balhoff et al. 2010) allows such descriptions to be attached to NeXML character states. Figure 3 shows an example code fragment. The format element contains enumerations of character-state definitions, each containing one or more state elements. The annotation clarifies that the state with identifier state0102 describes a phenotype, which is indicated by the use of the describesPhenotype predicate from the Phenoscape vocabulary; the description itself is expressed in PhenoXML, a Phenex-specific syntax for constructing EQ statements (Balhoff et al. 2010). In this case, the entity TAO:0000127 has the quality PATO:0000462, which means that state state0102 describes the absence of the antorbital (a bone). For each subsequently defined character (char), the applicable state set is defined by referencing its identifier. Adopting NeXML has allowed the Phenoscape project to describe character states in great detail in an interoperable, machine-readable way.
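
The referencing scheme described above (a char element pointing to a state set by identifier, and a state carrying an annotation) can be sketched in Python. This is a minimal illustration under stated assumptions, not the actual Figure 3 markup: the fragment is a simplified stand-in, the symbol value is arbitrary, and the meta element uses hypothetical entity and quality attributes in place of the real PhenoXML syntax, which is not reproduced here. The helper names states_for_char and annotations are likewise invented for this example.

```python
import xml.etree.ElementTree as ET

NEX = "http://www.nexml.org/2009"  # NeXML namespace URI
NS = {"nex": NEX}

# Simplified, hypothetical fragment modelled on the prose description.
# The <meta> attributes "entity" and "quality" are placeholders for the
# PhenoXML content that Phenex would actually write.
FRAGMENT = f"""
<format xmlns="{NEX}">
  <states id="states01">
    <state id="state0102" symbol="1">
      <meta rel="describesPhenotype"
            entity="TAO:0000127" quality="PATO:0000462"/>
    </state>
  </states>
  <char id="char01" states="states01"/>
</format>
"""


def states_for_char(fmt, char_id):
    """Return the ids of the states a character may occupy.

    The char element names its state set through the 'states' attribute;
    the state set is then looked up by that identifier.
    """
    char = fmt.find(f"nex:char[@id='{char_id}']", NS)
    if char is None:
        raise KeyError(f"no char with id {char_id!r}")
    set_id = char.get("states")
    state_set = fmt.find(f"nex:states[@id='{set_id}']", NS)
    if state_set is None:
        raise KeyError(f"no state set with id {set_id!r}")
    return [state.get("id") for state in state_set.findall("nex:state", NS)]


def annotations(fmt, state_id):
    """Return (predicate, entity, quality) tuples attached to a state."""
    state = fmt.find(f"nex:states/nex:state[@id='{state_id}']", NS)
    if state is None:
        raise KeyError(f"no state with id {state_id!r}")
    return [
        (meta.get("rel"), meta.get("entity"), meta.get("quality"))
        for meta in state.findall("nex:meta", NS)
    ]


if __name__ == "__main__":
    fmt = ET.fromstring(FRAGMENT)
    for state_id in states_for_char(fmt, "char01"):
        print(state_id, annotations(fmt, state_id))
```

Because every element is addressed by identifier rather than by position, a consumer can resolve a character to its permitted states without knowing anything about the annotation vocabulary, and can ignore annotations it does not understand.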

