Novel methodology for construction and pruning of quasi-median networks.

Ayling SC, Brown TA - BMC Bioinformatics (2008)

Bottom Line: Graph reduction or pruning methods have been used to reduce network complexity but some of these methods are inapplicable to datasets in which recombination has occurred and others are procedurally complex and/or result in disconnected networks. We address the problems inherent in construction and reduction of quasi-median networks. Application of this approach to 5S rDNA sequence data from sea beet produced a pruned network within which genetic isolation between populations by distance was evident, demonstrating the value of this approach for exploration of evolutionary relationships.


Affiliation: Manchester Interdisciplinary Biocentre, University of Manchester, 131 Princess Street, Manchester M1 7DN, UK. sarah.ayling@manchester.ac.uk

ABSTRACT

Background: Visualising the evolutionary history of a set of sequences is a challenge for molecular phylogenetics. One approach is to use undirected graphs, such as median networks, to visualise phylogenies where reticulate relationships such as recombination or homoplasy are displayed as cycles. Median networks contain binary representations of sequences as nodes, with edges connecting those sequences differing at one character; hypothetical ancestral nodes are invoked to generate a connected network which contains all most parsimonious trees. Quasi-median networks are a generalisation of median networks which are not restricted to binary data, although phylogenetic information contained within the multistate positions can be lost during the preprocessing of data. Where the history of a set of samples contains frequent homoplasies or recombination events, quasi-median networks will have a complex topology. Graph reduction or pruning methods have been used to reduce network complexity, but some of these methods are inapplicable to datasets in which recombination has occurred and others are procedurally complex and/or result in disconnected networks.

Results: We address the problems inherent in construction and reduction of quasi-median networks. We describe a novel method of generating quasi-median networks that uses all characters, both binary and multistate, without imposing an arbitrary ordering of the multistate partitions. We also describe a pruning mechanism which maintains at least one shortest path between observed sequences, displaying the underlying relations between all pairs of sequences while maintaining a connected graph.

Conclusion: Application of this approach to 5S rDNA sequence data from sea beet produced a pruned network within which genetic isolation between populations by distance was evident, demonstrating the value of this approach for exploration of evolutionary relationships.
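The pruning property stated in the Results (at least one shortest path retained between observed sequences, with the graph staying connected) can be illustrated with a minimal sketch. This is not the authors' pruning procedure, whose details are not given here. It assumes nodes are equal-length strings, edges join nodes differing at exactly one character, and one shortest path per observed pair is found by breadth-first search. The names `build_graph`, `shortest_path` and `prune` are hypothetical.

```python
from collections import deque
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def build_graph(nodes):
    """Adjacency sets: an edge joins nodes differing at exactly one character."""
    adj = {n: set() for n in nodes}
    for a, b in combinations(nodes, 2):
        if hamming(a, b) == 1:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def shortest_path(adj, src, dst):
    """One shortest path from src to dst by breadth-first search, or None."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for nxt in sorted(adj[node]):  # sorted for a deterministic choice
            if nxt not in parent:
                parent[nxt] = node
                queue.append(nxt)
    return None


def prune(adj, observed):
    """Assumed rule: keep only edges on one shortest path per observed pair."""
    kept = set()
    for a, b in combinations(sorted(observed), 2):
        path = shortest_path(adj, a, b)
        if path is not None:
            kept.update(frozenset(edge) for edge in zip(path, path[1:]))
    return kept
```

Because every observed pair keeps a full path, the retained edges form a connected subgraph on the observed sequences whenever the input network is connected. A real pruning scheme would also need a rule for choosing among equally short paths, which this sketch resolves arbitrarily by sort order.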

© Copyright Policy - open-access

Figure 1: Flow diagram illustrating quasi-median network construction for a set of hypothetical sequences. (1) Sequence 'a' is chosen as the reference sequence, constant columns are shown in red. (2) Shows only variable positions. (3) Arrows indicate identical columns to be collapsed together. (4) In the set of semi-processed sequences the 4th and 5th characters contain the same partition between (0/1) and (A/BC) shown in (red/black). (5) Binary tuples representing multistate characters are shown in red. Arrows indicate identical columns to be collapsed. (6) Set of processed sequences from which the median closure (7) can be built. (8) Shows the conversion of binary tuples to multistate character states. Numbering from left to right, positions 3, 4 and 5 encode the first multistate character, 6, 7 and 3 encode the second; '*' represents the virtual median: sequences containing the virtual median are shown in red. (9) The virtual medians are expanded to form a set of sequences with each possible multistate character state. Grey sequences are those which have already been generated. (10) Shows the quasi-median closure.
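Steps (1) to (3) of the caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the first sequence is taken as the reference, each variable column is recoded by order of first appearance (reference state = 0), and columns inducing the same partition of the sequences are collapsed into one character with a weight. The function name `preprocess` and the example sequences are hypothetical, and the later conversion of multistate characters to binary tuples is not shown.

```python
from collections import Counter


def preprocess(seqs):
    """Drop constant columns, recode states, and collapse identical columns.

    seqs: dict mapping name -> aligned sequence; the first entry is the
    reference. Returns (processed sequences, weight of each kept column).
    """
    names = list(seqs)
    weights = Counter()
    order = []  # distinct recoded columns, in order of first appearance
    for col in zip(*(seqs[n] for n in names)):
        if len(set(col)) == 1:
            continue  # constant column: carries no information
        codes = {}
        # The reference state becomes 0, each new state the next integer.
        recoded = tuple(codes.setdefault(state, len(codes)) for state in col)
        if recoded not in weights:
            order.append(recoded)
        weights[recoded] += 1  # identical columns are merged and counted
    processed = {n: tuple(col[i] for col in order) for i, n in enumerate(names)}
    return processed, [weights[col] for col in order]


example = {"a": "ACGTA", "b": "ACGTC", "c": "ATGAC", "d": "ATGAG"}
processed, weights = preprocess(example)
```

In this example, columns 1 and 3 are constant, columns 2 and 4 induce the same split and collapse into one binary character of weight 2, and column 5 remains as a three-state character.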

Mentions: Here we present a novel approach to generate quasi-median networks for a set of aligned DNA sequences. This method incorporates multistate characters by inferring virtual medians to connect them. The median closure of the sequences with virtual medians is determined, after which the virtual medians are converted to multistate characters, generating the quasi-median closure (see Figure 1 for an overview). The process is outlined in detail below.
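The median closure step can be sketched for binary data using the standard definition: the median of three binary sequences is their position-wise majority, and the closure is reached by adding medians until no new sequence appears. This is a brute-force illustration only; the function names are hypothetical, and the handling of virtual medians and multistate characters described above is not covered.

```python
from itertools import combinations


def median(u, v, w):
    """Position-wise majority of three binary sequences (tuples of 0/1)."""
    return tuple(1 if a + b + c >= 2 else 0 for a, b, c in zip(u, v, w))


def median_closure(seqs):
    """Repeatedly add medians of all triples until the set stops growing."""
    closure = set(seqs)
    while True:
        new = {
            median(u, v, w) for u, v, w in combinations(closure, 3)
        } - closure
        if not new:
            return closure
        closure |= new
```

The loop terminates because only finitely many binary sequences of a given length exist. Each pass examines every triple, so the cost grows quickly with the number of sequences, and a practical implementation would need something more economical.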

