Using targeted enrichment of nuclear genes to increase phylogenetic resolution in the neotropical rain forest genus Inga (Leguminosae: Mimosoideae).

Nicholls JA, Pennington RT, Koenen EJ, Hughes CE, Hearn J, Bunnefeld L, Dexter KG, Stone GN, Kidner CA - Front Plant Sci (2015)

Bottom Line: Bayesian phylogenies reconstructed using either all loci concatenated or a gene-tree/species-tree approach yielded highly resolved phylogenies. We used coalescent approaches to show that the same targeted enrichment data also have significant power to discriminate among alternative within-species population histories within the widespread species I. umbellifera. In either application, targeted enrichment simplifies the informatics challenge of identifying orthologous loci associated with de novo genome sequencing.


Affiliation: Ashworth Labs, Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK; Royal Botanic Garden Edinburgh, Edinburgh, UK.

ABSTRACT
Evolutionary radiations are prominent and pervasive across many plant lineages in diverse geographical and ecological settings; in neotropical rainforests there is growing evidence suggesting that a significant fraction of species richness is the result of recent radiations. Understanding the evolutionary trajectories and mechanisms underlying these radiations demands much greater phylogenetic resolution than is currently available for these groups. The neotropical tree genus Inga (Leguminosae) is a good example, with ~300 extant species and a crown age of 2-10 MY, yet over 6 kb of plastid and nuclear DNA sequence data gives only poor phylogenetic resolution among species. Here we explore the use of larger-scale nuclear gene data obtained through targeted enrichment to increase phylogenetic resolution within Inga. Transcriptome data from three Inga species were used to select 264 nuclear loci for targeted enrichment and sequencing. Following quality control to remove probable paralogs from these sequence data, the final dataset comprised 259,313 bases from 194 loci for 24 accessions representing 22 Inga species and an outgroup (Zygia). Bayesian phylogenies reconstructed using either all loci concatenated or a gene-tree/species-tree approach yielded highly resolved phylogenies. We used coalescent approaches to show that the same targeted enrichment data also have significant power to discriminate among alternative within-species population histories within the widespread species I. umbellifera. In either application, targeted enrichment simplifies the informatics challenge of identifying orthologous loci associated with de novo genome sequencing. We conclude that targeted enrichment provides the large volumes of phylogenetically-informative sequence data required to resolve relationships within recent plant species radiations, both at the species level and for within-species phylogeographic studies.




Figure 4: Proportion of variable (gray bars) and parsimony informative (white bars) sites at the 251 Inga target loci enriched through hybrid capture for which data were obtained for at least two of the 19 I. umbellifera accessions.

Mentions: After trimming sites that had missing data for more than half the accessions, the concatenated alignment of 168 loci for the 22 accessions of population-level sampling within I. umbellifera and its close relative I. brevipes (including the technical replicate) plus the 24 comparison accessions contained a total of 224,786 bases, with 16,972 (7.6%) variable positions and 6,345 (2.8%) parsimony informative sites. Analysis of these data using phylogenetic methods supported intra-specific divergence between I. umbellifera populations from different geographic areas, and also showed I. brevipes to be nested within I. umbellifera (Figure 3). Results also show that the two French Guianan I. umbellifera populations with distinct leaf chemistries, one with high levels of tyrosine, one without any tyrosine (Figure 3; Kursar pers. comm.), form separate, robustly supported clades which are not sister to each other. Both the no-clock and relaxed clock models gave identical topologies with almost identical support values. This topology was similar to that of the 24 accession analysis (Figure 2A), with a few minor though well-supported differences, which may reflect the extended taxon sampling. Variation per locus for the 19 I. umbellifera samples ranged from zero to 5.2% (mean 1.6%) and the proportion of parsimony informative sites ranged from zero to 3.5% (mean 0.9%; Figure 4). Variation between the two technical replicates of I. umbellifera sample KD882 was minimal, with only two sites differing out of the 209,239 sites in the final screened set of loci where data were present for both replicates, equivalent to an error rate of 0.00096%. This very low error rate suggests that the branch lengths we see for within-population parts of the phylogeny reflect true levels of variation, and that our approach limits the noise present in such large datasets.
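The quantities reported in the passage above (variable sites, parsimony informative sites, and the replicate error rate) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which the text does not describe at this level. The function names, the toy alignment, and the choice to treat N, ? and - as missing data are assumptions made for illustration. A site is counted as variable if at least two character states are observed, and as parsimony informative if at least two states each occur in at least two sequences.

```python
from collections import Counter

# Assumption: these characters are treated as missing data or gaps and ignored.
MISSING = {"N", "?", "-"}


def classify_site(column):
    """Return (is_variable, is_parsimony_informative) for one alignment column."""
    counts = Counter(b.upper() for b in column if b.upper() not in MISSING)
    is_variable = len(counts) >= 2
    # Parsimony informative: at least two states, each seen in >= 2 sequences.
    is_informative = sum(1 for n in counts.values() if n >= 2) >= 2
    return is_variable, is_informative


def site_proportions(sequences):
    """Proportions of variable and parsimony informative sites in an alignment."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    variable = informative = 0
    for column in zip(*sequences):  # iterate over alignment columns
        v, i = classify_site(column)
        variable += v
        informative += i
    return variable / length, informative / length


def percent(part, whole):
    """Express part/whole as a percentage."""
    return 100.0 * part / whole


# Hypothetical toy alignment: 4 sequences, 8 sites.
# Sites 3 and 8 are parsimony informative; site 5 is variable but a singleton.
toy = [
    "ACGTACGT",
    "ACGTACGA",
    "ACCTACGA",
    "ACCTTCGT",
]

print(site_proportions(toy))  # (0.375, 0.25)

# Checking the percentages reported in the text from the raw counts.
print(round(percent(16972, 224786), 1))  # variable positions, about 7.6
print(round(percent(6345, 224786), 1))   # parsimony informative, about 2.8
print(round(percent(2, 209239), 5))      # replicate error rate, about 0.00096
```

Applied per locus rather than to the concatenated alignment, the same calculation would give the kind of per-locus proportions summarised in Figure 4.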

