Molecular taxonomy of phytopathogenic fungi: a case study in Peronospora.

Göker M, García-Blázquez G, Voglmayr H, Tellería MT, Martín MP - PLoS ONE (2009)

Bottom Line: The method determines the distance function and clustering setting that result in an optimal agreement with selected reference data. Clustering optimization appears to be broadly applicable in automated, sequence-based taxonomy. The method connects traditional and modern taxonomic disciplines by specifically addressing the issue of how to optimally account for both traditional species concepts and genetic divergence.


Affiliation: Organismic Botany, Eberhard Karls University of Tübingen, Tübingen, Germany. peronospora@goeker.org

ABSTRACT

Background: Inappropriate taxon definitions may have severe consequences in many areas. For instance, biologically sensible species delimitation of plant pathogens is crucial for measures such as plant protection or biological control and for comparative studies involving model organisms. However, delimiting species is challenging in the case of organisms for which often only molecular data are available, such as prokaryotes, fungi, and many unicellular eukaryotes. Even in the case of organisms with well-established morphological characteristics, molecular taxonomy is often necessary to emend current taxonomic concepts and to analyze DNA sequences directly sampled from the environment. Typically, clustering approaches have been applied for this purpose to delineate molecular operational taxonomic units, with arbitrary choices of distance threshold values and clustering algorithms.

Methodology: Here, we report on a clustering optimization method to establish a molecular taxonomy of Peronospora based on ITS nrDNA sequences. Peronospora is the largest genus within the downy mildews, which are obligate parasites of higher plants, and includes various economically important pathogens. The method determines the distance function and clustering setting that result in an optimal agreement with selected reference data. Optimization was based on both taxonomy-based and host-based reference information, yielding the same outcome. Resampling and permutation methods indicate that the method is robust regarding taxon sampling and errors in the reference data. Tests with newly obtained ITS sequences demonstrate the use of the re-classified dataset in molecular identification of downy mildews.

Conclusions: A corrected taxonomy is provided for all Peronospora ITS sequences contained in public databases. Clustering optimization appears to be broadly applicable in automated, sequence-based taxonomy. The method connects traditional and modern taxonomic disciplines by specifically addressing the issue of how to optimally account for both traditional species concepts and genetic divergence.
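The idea of choosing the distance function and clustering setting by agreement with reference data can be illustrated with a minimal sketch. It is not the authors' implementation: the toy sequences, the two candidate distances (uncorrected p-distance and Jukes-Cantor), single-linkage clustering at a threshold, and the plain Rand index as agreement measure are all assumptions chosen for brevity.

```python
from itertools import combinations
from math import log

def p_distance(a, b):
    """Proportion of aligned sites that differ (equal-length sequences)."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def jc_distance(a, b):
    """Jukes-Cantor corrected distance; infinite once p reaches 0.75."""
    p = p_distance(a, b)
    return float("inf") if p >= 0.75 else -0.75 * log(1 - 4 * p / 3)

def single_linkage(seqs, dist, threshold):
    """Cluster labels from merging every pair with dist <= threshold."""
    parent = list(range(len(seqs)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if dist(seqs[i], seqs[j]) <= threshold:
            parent[find(i)] = find(j)
    return [find(i) for i in range(len(seqs))]

def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which the two partitions agree."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)

def optimize(seqs, reference, dists, thresholds):
    """Grid search; returns (score, distance name, threshold) of the best setting."""
    best = None
    for name, dist in dists.items():
        for t in thresholds:
            score = rand_index(single_linkage(seqs, dist, t), reference)
            if best is None or score > best[0]:
                best = (score, name, t)
    return best

# Hypothetical toy alignment and reference names (not data from the study).
seqs = ["AAAAAAAAAA", "AAAAAAAAAT", "CCCCCCCCCC", "CCCCCCCCCG", "GGGGGGGGGG"]
reference = ["sp1", "sp1", "sp2", "sp2", "sp3"]
dists = {"p-distance": p_distance, "jukes-cantor": jc_distance}
best = optimize(seqs, reference, dists, [0.0, 0.15, 0.3, 0.95, 1.0])
print(best)
```

In this toy case several settings reach full agreement with the reference, and the search simply keeps the first one it encounters. A real analysis would need a tie-breaking rule and, as the abstract notes, robustness checks such as resampling and permutation of the reference data.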


Figure 3. Maximum-likelihood tree, bottom part. Phylogram as inferred with RAxML and rooted with the Pseudoperonospora sequences present in the dataset. Branches are scaled in terms of the number of substitutions per site. Numbers above/below the branches are maximum likelihood and maximum parsimony bootstrap support values from 100 replicates. The sequence labels contain the “organism” entry and the accession number from the GenBank files; for the validity of these entries, the corrected “organism” names and the revised taxonomy, see supporting file S2. Taxonomic unit (TU) numbers from optimal clustering settings are provided in square brackets. These numbers are only used to circumscribe the TU; they do not indicate relationships between the TU (e.g. TU 16 is not closer to TU 15 than to TU 91). Red labels denote accessions affected by type I conflicts, blue labels by type II conflicts, mauve labels by both type I and II conflicts and green labels by database errors due to incorrect data submission. The red (type I) or blue (type II) lines connect the accessions affected by the respective conflict, with the conflict subtype given to the right. Type I conflicts concern the presence of the same taxon in different clusters (TU), type II conflicts the presence of several taxa within the same cluster (TU). Subtypes: Ia, different TU correspond to different hosts; Ib-Ic, different TU correspond to the same host; Ib, different TU are effected by sequencing/alignment artefacts; Ic, different TU are effected by high genetic variability; IIa, different taxa within a TU occur on the same host species/genus; (IIa), different taxa within a TU occur on different host genera within the same family; IIb, different taxa within a TU occur on different host families. The tree is continued in Fig. 4.
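The two conflict types defined in the caption can be detected mechanically from a table of accessions. The following is a small sketch under stated assumptions: the record layout `(accession, taxon, tu)`, the function name `find_conflicts`, and the placeholder accessions and taxon names are hypothetical, and the sketch does not assign subtypes (Ia to IIb), which require host information.

```python
from collections import defaultdict

def find_conflicts(records):
    """Return (type_i, type_ii) conflicts from (accession, taxon, tu) tuples.

    type_i:  taxon -> sorted TUs, for taxa found in more than one TU
    type_ii: TU -> sorted taxa, for TUs containing more than one taxon
    """
    tus_by_taxon = defaultdict(set)
    taxa_by_tu = defaultdict(set)
    for _accession, taxon, tu in records:
        tus_by_taxon[taxon].add(tu)
        taxa_by_tu[tu].add(taxon)
    type_i = {t: sorted(tus) for t, tus in tus_by_taxon.items() if len(tus) > 1}
    type_ii = {tu: sorted(taxa) for tu, taxa in taxa_by_tu.items() if len(taxa) > 1}
    return type_i, type_ii

# Placeholder records for illustration only (not taken from the dataset).
records = [
    ("acc1", "taxon A", 15),
    ("acc2", "taxon A", 16),
    ("acc3", "taxon B", 16),
    ("acc4", "taxon C", 91),
]
type_i, type_ii = find_conflicts(records)
print(type_i)
print(type_ii)
```

Here "taxon A" is split across two TUs (a type I conflict), while TU 16 holds two taxa (a type II conflict); an accession such as `acc2` is affected by both, which corresponds to the mauve labels in the figure.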

Mentions: The maximum-likelihood tree inferred from the poa alignment had a log likelihood of -16392.00 and is shown in Figs. 3, 4, 5, together with the numbers of the taxonomic units obtained by clustering the 427 sequences using the optimal parameter settings. In a previous comprehensive study on Peronospora phylogeny [41], backbone resolution of the phylogenetic trees was relatively low. The poa maximum-likelihood tree showed the same pattern, even though the separation of Peronospora and Pseudoperonospora was well supported (Fig. 3). However, strong (93% under maximum likelihood, 68% under maximum parsimony) support was present for a large clade comprising mainly parasites of Caryophyllales and Ranunculales; a subclade of it comprising the same species except Peronospora arborescens was supported with 97% and 95%, respectively (Fig. 4). Some smaller groups with uniform host relationships are also well supported, e.g. a clade comprising four accessions of Rubiaceae parasites (98/99% bootstrap; Fig. 3). In contrast, a large monophylum of exclusively Fabaceae pathogens is present in the tree, but without support (Fig. 5). On the other hand, the tree contains a large number of near-terminal nodes that receive high support, most of which are equivalent to a taxonomic unit (Figs. 3–5). Even though not all taxonomic units are monophyletic in the tree, no taxonomic unit was found that conflicted with a well supported branch.

