A mixture model and a hidden Markov model to simultaneously detect recombination breakpoints and reconstruct phylogenies.

Boussau B, Guéguen L, Gouy M - Evol. Bioinform. Online (2009)

Bottom Line: In this article, we propose and implement a Mixture Model on trees and a phylogenetic Hidden Markov Model to reveal recombination breakpoints while searching for the various evolutionary histories that are present in an alignment known to have undergone homologous recombination. These models are sufficiently efficient to be applied to dozens of sequences on a single desktop computer, and can handle equivalently nucleotide or protein sequences. We estimate their accuracy on simulated sequences and test them on real data.


Affiliation: Université de Lyon, université Lyon 1, CNRS, UMR 5558, Laboratoire de Biométrie et Biologie Evolutive, 43 boulevard du 11 novembre 1918, Villeurbanne F-69622, France. boussau@biomserv.univ-lyon1.fr

ABSTRACT
Homologous recombination is a pervasive biological process that affects sequences in all living organisms and viruses. In the presence of recombination, the evolutionary history of an alignment of homologous sequences cannot be properly depicted by a single bifurcating tree: some sites have evolved along a specific phylogenetic tree, others have followed another path. Methods available to analyse recombination in sequences usually involve an analysis of the alignment through sliding-windows, or are particularly demanding in computational resources, and are often limited to nucleotide sequences. In this article, we propose and implement a Mixture Model on trees and a phylogenetic Hidden Markov Model to reveal recombination breakpoints while searching for the various evolutionary histories that are present in an alignment known to have undergone homologous recombination. These models are sufficiently efficient to be applied to dozens of sequences on a single desktop computer, and can handle equivalently nucleotide or protein sequences. We estimate their accuracy on simulated sequences and test them on real data.
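The idea of a phylogenetic HMM whose hidden states are candidate trees can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the article: the per-site log-likelihoods under each candidate tree are supplied precomputed, the transition matrix uses a single autocorrelation parameter (stay on the current tree with probability `autocorr`, otherwise redraw a tree from `tree_freqs`), and decoding uses the Viterbi algorithm. The function names and the toy numbers are hypothetical.

```python
import math


def viterbi_tree_path(site_loglik, autocorr, tree_freqs):
    """Most probable sequence of tree indices along the alignment.

    site_loglik[i][k] is the log-likelihood of site i under tree k
    (assumed to be precomputed elsewhere). Requires 0 <= autocorr < 1
    and strictly positive tree_freqs.
    """
    n_trees = len(tree_freqs)

    def log_trans(j, k):
        # Assumed transition model: keep tree j with probability autocorr,
        # otherwise pick tree k according to its frequency.
        p = (1.0 - autocorr) * tree_freqs[k]
        if j == k:
            p += autocorr
        return math.log(p)

    score = [math.log(tree_freqs[k]) + site_loglik[0][k] for k in range(n_trees)]
    back = []
    for row in site_loglik[1:]:
        new_score, pointers = [], []
        for k in range(n_trees):
            best_j = max(range(n_trees), key=lambda j: score[j] + log_trans(j, k))
            new_score.append(score[best_j] + log_trans(best_j, k) + row[k])
            pointers.append(best_j)
        score = new_score
        back.append(pointers)

    # Trace back from the best final state.
    state = max(range(n_trees), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def breakpoints(path):
    """Positions where the decoded tree changes between adjacent sites."""
    return [i for i in range(1, len(path)) if path[i] != path[i - 1]]


# Toy input (made up): six sites favour tree 0, then four sites favour tree 1.
toy_loglik = [[-1.0, -3.0]] * 6 + [[-3.0, -1.0]] * 4
toy_path = viterbi_tree_path(toy_loglik, autocorr=0.9, tree_freqs=[0.5, 0.5])
toy_breaks = breakpoints(toy_path)
```

With a high autocorrelation value, neighbouring sites tend to share a tree, so the decoded path changes tree only where the site likelihoods clearly favour a different topology. A mixture model on trees would correspond to dropping this dependence between neighbouring sites.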


Figure 4 (f4-ebo-2009-067): Ability of the Phylo-HMM (left) and Mixture Model (right) to detect the breakpoint position in simulated alignments. The dashed grey line corresponds to values that would be obtained with an ideal method, whose reconstructions are identical to simulations.

Mentions: Both the MM and the Phylo-HMM most often detect two segments in the alignment. In such cases, Figure 4 shows that the precision of the predicted breakpoint depends on the length of the smaller segment in the same way as the ability of the models to detect the number of segments. The Phylo-HMM seems slightly better than the MM at detecting the precise breakpoint position when the smallest partition is ≥200 bases long. Although the Phylo-HMM seems not as good as the MM when the smallest partition is 100 bases long, the difference between the two methods is not significant (Student's t-test and Wilcoxon test on the absolute differences between expected position and predicted position). This suggests that using more than a single autocorrelation parameter in the HMM method may not be useful, even when segment lengths are very dissimilar.
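The comparison described above, on absolute differences between expected and predicted breakpoint positions, can be sketched as follows. This is an illustration under assumptions: the text does not say whether the tests were paired, so the sketch assumes both methods were run on the same simulated alignments and computes a paired t statistic and a Wilcoxon signed-rank statistic. All numbers are toy values, not results from the article, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(a, b):
    """t statistic for the mean of the paired differences a[i] - b[i]."""
    diffs = [x - y for x, y in zip(a, b)]
    return mean(diffs) / (stdev(diffs) / math.sqrt(len(diffs)))


def wilcoxon_signed_rank(a, b):
    """Signed-rank statistic min(W+, W-); zero differences are dropped
    and tied absolute differences receive their average rank."""
    diffs = [x - y for x, y in zip(a, b) if x != y]
    ordered = sorted(abs(d) for d in diffs)
    rank_of = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and ordered[j] == ordered[i]:
            j += 1
        rank_of[ordered[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        i = j
    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    w_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    return min(w_plus, w_minus)


# Toy absolute errors |expected - predicted| in bases (made up).
errors_hmm = [12, 5, 30, 8, 20]
errors_mm = [15, 9, 28, 14, 26]

t_stat = paired_t_statistic(errors_hmm, errors_mm)
w_stat = wilcoxon_signed_rank(errors_hmm, errors_mm)
```

Turning these statistics into p-values requires the corresponding reference distributions. In practice a statistics library would handle this, for example `scipy.stats.ttest_rel` and `scipy.stats.wilcoxon` in SciPy.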

