Treelink: data integration, clustering and visualization of phylogenetic trees.

Allende C, Sohn E, Little C - BMC Bioinformatics (2015)

Bottom Line: In many of these studies, tree nodes need to be associated with a variety of attributes. For example, in studies concerned with viral relationships, tree nodes are associated with epidemiological information, such as location, age and subtype. Our software can successfully integrate phylogenetic trees with different data sources, and perform operations to differentiate and visualize those differences within a tree.


Affiliation: Faculty of Engineering and Sciences, Universidad Adolfo Ibañez, Diagonal las Torres 2640, Santiago, 7941169, Chile. christian.allende.cid@gmail.com.

ABSTRACT

Background: Phylogenetic trees are central to a wide range of biological studies. In many of these studies, tree nodes need to be associated with a variety of attributes. For example, in studies concerned with viral relationships, tree nodes are associated with epidemiological information, such as location, age and subtype. Gene trees used in comparative genomics are usually linked with taxonomic information, such as functional annotations and events. A wide variety of tree visualization and annotation tools have been developed in the past; however, none of them is intended for integrative and comparative analysis.

Results: Treelink is platform-independent software for linking datasets and sequence files to phylogenetic trees. The application automates the integration of datasets with trees for operations such as classifying a tree based on a field or showing the distribution of selected data attributes across branches and leaves. Genomic and proteomic sequences can also be linked to the tree and extracted from internal and external nodes. A novel clustering algorithm was also developed to simplify trees and display the most divergent clades; its output can be validated using the data integration and classification functions. Integrated geographical information allows ancestral character reconstruction for phylogeographic plotting based on parsimony and likelihood algorithms.
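
To make the parsimony-based ancestral character reconstruction mentioned above more concrete, here is a minimal sketch of the bottom-up pass of Fitch parsimony on a small binary tree. This is not Treelink's implementation: the nested-tuple tree representation, the function name fitch_sets and the example locations are all assumptions made for illustration.

```python
def fitch_sets(tree, leaf_states):
    """Bottom-up Fitch pass on a binary tree given as nested tuples.

    A leaf is a string (its name); an internal node is a (left, right) tuple.
    Returns (candidate_states_at_this_node, minimum_number_of_state_changes).
    """
    if isinstance(tree, str):
        # A leaf has exactly one observed state and needs no changes.
        return frozenset([leaf_states[tree]]), 0

    left, right = tree
    left_set, left_changes = fitch_sets(left, leaf_states)
    right_set, right_changes = fitch_sets(right, leaf_states)

    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change is needed.
        return common, left_changes + right_changes
    # Children disagree: keep the union and count one change.
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical four-leaf tree with a sampling location per leaf.
tree = (("A", "B"), ("C", "D"))
locations = {"A": "Chile", "B": "Chile", "C": "Peru", "D": "Chile"}

root_set, changes = fitch_sets(tree, locations)
print(sorted(root_set), changes)
```

A full reconstruction would add a top-down pass that picks one state per internal node from these candidate sets; likelihood-based methods replace the set operations with probabilities under a substitution model.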

Conclusion: Our software can successfully integrate phylogenetic trees with different data sources, and perform operations to differentiate and visualize those differences within a tree. File support includes the most popular formats such as Newick and CSV. Visualizations can be exported as images, along with cluster outputs and genomic sequences. Treelink is available as a web and desktop application at http://www.treelinkapp.com .
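
As a rough illustration of what linking a CSV dataset to a Newick tree involves, the sketch below extracts leaf names from a Newick string and joins them to CSV rows by a key column. It is an assumption-laden example, not Treelink's code: the column name "id", the helper names and the sample data are hypothetical, and the regular expression only handles simple unquoted leaf names.

```python
import csv
import io
import re


def newick_leaf_names(newick):
    """Return leaf names from a simple Newick string.

    Leaf names are the tokens that directly follow '(' or ','.
    Quoted names and comments are not handled in this sketch.
    """
    return re.findall(r"[(,]\s*([^():,;\s]+)", newick)


def link_rows_to_leaves(newick, csv_text, key="id"):
    """Map each leaf name to its CSV row (a dict), or None if no row matches."""
    rows = {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}
    return {name: rows.get(name) for name in newick_leaf_names(newick)}


# Hypothetical inputs: a three-leaf tree and a dataset covering two leaves.
newick = "((A:0.1,B:0.2):0.3,C:0.4);"
csv_text = "id,country,subtype\nA,Chile,B\nB,Peru,C\n"

linked = link_rows_to_leaves(newick, csv_text)
for leaf, row in linked.items():
    print(leaf, row)
```

Returning None for unmatched leaves keeps gaps in the dataset visible instead of silently dropping those leaves from the tree.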


Fig. 1: Phylogenetic visualizations. Left: HIV Subtype Consensus tree. Right: HIV Consensus tree classified by the country field of a dataset

Mentions: The application comes with two general-purpose functions based on data integration. The user can visualize and cross-reference specific attributes from the dataset by searching for them among the leaves or by selecting them in the table. The software then renders an annotated visualization showing the distribution of those fields on the tree (Fig. 1). Classification is accomplished by selecting a categorical trait or property loaded from the dataset, which is then distributed and displayed along the leaves of the tree.
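
The classification step described above can be pictured as grouping leaves by the value of one categorical field. The following is a minimal sketch under assumed names (classify_leaves, the records layout and the sample leaves are hypothetical), not a description of how Treelink stores or renders its data.

```python
from collections import defaultdict


def classify_leaves(records, field, missing="unknown"):
    """Group leaf names by the value of one categorical field.

    records maps a leaf name to a dict of attributes loaded from a dataset.
    Leaves lacking the field are collected under the `missing` label.
    """
    groups = defaultdict(list)
    for leaf, attrs in records.items():
        groups[attrs.get(field, missing)].append(leaf)
    return dict(groups)


# Hypothetical per-leaf attributes, as they might look after loading a dataset.
records = {
    "seq1": {"country": "Chile", "subtype": "B"},
    "seq2": {"country": "Peru", "subtype": "B"},
    "seq3": {"country": "Chile", "subtype": "C"},
    "seq4": {"subtype": "C"},
}

by_country = classify_leaves(records, "country")
distribution = {value: len(leaves) for value, leaves in by_country.items()}
print(by_country)
print(distribution)
```

Each group could then be given its own color on the tree, and the per-value counts give the distribution of the field across leaves.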

