Phylogenetic support values are not necessarily informative: the case of the Serialia hypothesis (a mollusk phylogeny).

Wägele JW, Letsch H, Klussmann-Kolb A, Mayer C, Misof B, Wägele H - Front. Zool. (2009)

Bottom Line: However, different phylogenetic trees often contain conflicting results and contradict significant background data. We show that signal-like patterns in the data set are conflicting and partly not distinct and that the reported strong support for a "rather surprising result" (monoplacophorans and chitons form a monophylum Serialia) does not exist at the level of primary homologies. Even though currently a majority of molecular phylogenies are being justified with reference to the 'statistical' support of clades in tree topologies, this confidence seems to be unfounded.


Affiliation: Zoologisches Forschungsmuseum Alexander Koenig, Adenauerallee 160, 53313 Bonn, Germany. w.waegele.zfmk@uni-bonn.de.

ABSTRACT

Background: Molecular phylogenies are being published increasingly and many biologists rely on the most recent topologies. However, different phylogenetic trees often contain conflicting results and contradict significant background data. Not knowing how reliable traditional knowledge is, a crucial question concerns the quality of newly produced molecular data. The information content of DNA alignments is rarely discussed, as quality statements are mostly restricted to the statistical support of clades. Here we present a case study of a recently published mollusk phylogeny that contains surprising groupings, based on five genes and 108 species, and we apply new or rarely used tools for the analysis of the information content of alignments and for the filtering of noise (masking of random-like alignment regions, split decomposition, phylogenetic networks, quartet mapping).

Results: The data are very fragmentary and contain contaminations. We show that signal-like patterns in the data set are conflicting and partly not distinct and that the reported strong support for a "rather surprising result" (monoplacophorans and chitons form a monophylum Serialia) does not exist at the level of primary homologies. Split-decomposition, quartet mapping and neighbornet analyses reveal conflicting nucleotide patterns and lack of distinct phylogenetic signal for the deeper phylogeny of mollusks.

Conclusion: Even though currently a majority of molecular phylogenies are being justified with reference to the 'statistical' support of clades in tree topologies, this confidence seems to be unfounded. Contradictions between phylogenies based on different analyses are already a strong indication of unnoticed pitfalls. The use of tree-independent tools for exploratory analyses of data quality is highly recommended. Concerning the new mollusk phylogeny, more convincing evidence is needed.




Figure 1: Neighbornet graph estimated from p-distances with SplitsTree and using the complete alignment from Giribet et al. (2006). Color code: Cephalopods are shown in orange, Caudofoveata mauve, Scaphopoda brown, Gastropoda blue, Polyplacophora green. Laevipilina is nested within a subclade of the Bivalvia (red). Note long branches leading to cephalopods and to the gastropods Cellana and Eulepetopsis, which together form a weak clade probably supported by parallel substitutions. Polyphyly of gastropods and lack of distinct treeness indicate that, in this alignment, there is little conserved phylogenetic signal which is stronger than noise.

Mentions: For exploratory data analyses we first used the original, complete alignment [4]. Neighbornet graphs constructed from uncorrected distances (Fig. 1: all 9378 positions, 108 taxa, fit value = 93.08) had only a few splits supported by distinct edges. The clade Serialia as proposed by Giribet et al. (2006) does not exist in this inference. The monoplacophoran sequence (Laevipilina antarctica) is found amidst a cluster of bivalves. The most prominent split separates all cephalopods except the Nautilus sequences, which branch off more basally; the cephalopod clade as a whole is also supported by a set of parallel edges (Fig. 1: taxa and separating edges in orange). The remaining network is dominated by parallelograms, which shows that the alignment contains many conflicting nucleotide patterns. The signal for monophyly of the Mollusca was not distinct. The Caudofoveata (Chaetoderma nitidulum and Scutopus ventrolineatus, in mauve) are clearly separated from the remaining sequences, and there are short parallel edges for the two clades Scaphopoda and Polyplacophora (Fig. 1, brown and green, respectively). The Gastropoda are scattered over the graph (blue). Two long-branched gastropod sequences (Cellana sp., Eulepetopsis vitrea) are attracted to the long cephalopod branch. Non-monophyly of Gastropoda and Bivalvia together with a lack of jackknife-support values for the deeper nodes were also attributes of the tree published by Giribet et al. [4]. The lack of support for deeper clades in Figure 1 indicates the absence of a distinct phylogenetic signal for most of the larger species groups.
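
The two ideas behind this paragraph, uncorrected (p-) distances and conflicting splits, can be illustrated with a minimal Python sketch. It is not the SplitsTree procedure used by the authors. The function names (`p_distance`, `distance_matrix`, `splits_compatible`) are hypothetical, and skipping every site where either sequence has a gap or an ambiguity code (pairwise deletion) is an assumption made here for simplicity. Two splits that fail the compatibility test cannot both be edges of one tree, which is the situation a network draws as a parallelogram.

```python
from itertools import combinations

# Assumption: only unambiguous nucleotides are compared; gaps, N and
# other ambiguity codes are skipped (pairwise deletion).
VALID = set("ACGT")


def p_distance(seq_a, seq_b):
    """Proportion of differing sites among the comparable sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in VALID and y in VALID:
            compared += 1
            if x != y:
                mismatches += 1
    if compared == 0:
        return float("nan")  # no overlapping data for this pair
    return mismatches / compared


def distance_matrix(alignment):
    """Symmetric p-distance table for a dict {taxon name: aligned sequence}."""
    names = list(alignment)
    dist = {(n, n): 0.0 for n in names}
    for a, b in combinations(names, 2):
        d = p_distance(alignment[a], alignment[b])
        dist[(a, b)] = dist[(b, a)] = d
    return dist


def splits_compatible(side_a, side_b, taxa):
    """Two splits A|A' and B|B' fit on one tree iff at least one of the
    four intersections A&B, A&B', A'&B, A'&B' is empty."""
    a, b = set(side_a), set(side_b)
    all_taxa = set(taxa)
    intersections = (a & b, a - b, b - a, all_taxa - (a | b))
    return any(len(part) == 0 for part in intersections)


if __name__ == "__main__":
    # Toy sequences invented for illustration; not the Giribet et al. data.
    toy = {
        "t1": "ACGTACGT",
        "t2": "ACGTACGA",
        "t3": "ACG-TCGA",
        "t4": "TCGAACGA",
    }
    dist = distance_matrix(toy)
    for a, b in combinations(toy, 2):
        print(a, b, round(dist[(a, b)], 3))
    # {t1, t2} and {t1, t3} overlap in every way, so they conflict.
    print(splits_compatible({"t1", "t2"}, {"t1", "t3"}, toy))
```

A distance table of this kind is the sort of input a neighbornet analysis starts from. Because p-distances are not corrected for multiple substitutions at a site, long branches such as those of the cephalopods can appear closer to each other than they should.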

