Using targeted enrichment of nuclear genes to increase phylogenetic resolution in the neotropical rain forest genus Inga (Leguminosae: Mimosoideae).

Nicholls JA, Pennington RT, Koenen EJ, Hughes CE, Hearn J, Bunnefeld L, Dexter KG, Stone GN, Kidner CA - Front Plant Sci (2015)

Bottom Line: Bayesian phylogenies reconstructed using either all loci concatenated or a gene-tree/species-tree approach yielded highly resolved phylogenies. We used coalescent approaches to show that the same targeted enrichment data also have significant power to discriminate among alternative within-species population histories within the widespread species I. umbellifera. In either application, targeted enrichment simplifies the informatics challenge of identifying orthologous loci associated with de novo genome sequencing.


Affiliation: Ashworth Labs, Institute of Evolutionary Biology, School of Biological Sciences, University of Edinburgh, Edinburgh, UK; Royal Botanic Garden Edinburgh, Edinburgh, UK.

ABSTRACT
Evolutionary radiations are prominent and pervasive across many plant lineages in diverse geographical and ecological settings; in neotropical rainforests there is growing evidence suggesting that a significant fraction of species richness is the result of recent radiations. Understanding the evolutionary trajectories and mechanisms underlying these radiations demands much greater phylogenetic resolution than is currently available for these groups. The neotropical tree genus Inga (Leguminosae) is a good example, with ~300 extant species and a crown age of 2-10 MY, yet over 6 kb of plastid and nuclear DNA sequence data gives only poor phylogenetic resolution among species. Here we explore the use of larger-scale nuclear gene data obtained through targeted enrichment to increase phylogenetic resolution within Inga. Transcriptome data from three Inga species were used to select 264 nuclear loci for targeted enrichment and sequencing. Following quality control to remove probable paralogs from these sequence data, the final dataset comprised 259,313 bases from 194 loci for 24 accessions representing 22 Inga species and an outgroup (Zygia). Bayesian phylogenies reconstructed using either all loci concatenated or a gene-tree/species-tree approach yielded highly resolved phylogenies. We used coalescent approaches to show that the same targeted enrichment data also have significant power to discriminate among alternative within-species population histories within the widespread species I. umbellifera. In either application, targeted enrichment simplifies the informatics challenge of identifying orthologous loci associated with de novo genome sequencing. We conclude that targeted enrichment provides the large volumes of phylogenetically-informative sequence data required to resolve relationships within recent plant species radiations, both at the species level and for within-species phylogeographic studies.



Figure 1: Proportion of variable (gray bars) and parsimony informative (white bars) sites across the 183 Inga target loci enriched through hybrid capture that were selected for the phylogenetic analysis of 24 test accessions and that have data for at least three accessions. Solid arrows below the x-axis indicate the percentage of variable sites within the Sanger-sequenced ITS and concatenated plastid loci; dashed arrows indicate the respective percentages of parsimony informative sites.

Mentions: Within the comparison set of 24 phylogenetically-representative accessions, sequence data were recovered for every accession for the majority (87.1%) of target loci, with a further 4.2% of loci missing data from only a single accession (Supplementary Figure 5). The stringent mapping process and screening of loci resulted in 194 target loci in the final dataset. In 11 of these, data were only recovered for two or fewer accessions and so were not useful for phylogenetic reconstruction, giving 183 phylogenetically informative loci. We had an average of 180.5 (sd 1.3) loci per accession and 22.8 (sd 4.9) accessions per locus. The lengths of these informative alignments ranged between 194 and 4469 bp (mean 1707 bp, Supplementary Figure 6). The percentage of variable sites (including Zygia) ranged from 0 to 12.2% (mean 5.5%); the percentage of parsimony informative sites ranged from 0 to 5.0% (mean 1.7%; Figure 1). In comparison, the percentages of variable and parsimony informative sites in the Sanger sequence data were 4.9 and 1.2% for plastid DNA and 12.1 and 4.3% for ITS respectively (see Figure 1, Supplementary Table 3). This combination of variation levels and lengths provides enough data at each target locus to carry out multi-locus analyses that explore the data on a locus-by-locus basis (such as species tree reconstruction).
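The two per-locus statistics reported above (percentage of variable sites and percentage of parsimony informative sites) can be illustrated with a minimal sketch. The source does not say which software produced these figures, so the function below is a hypothetical helper, not the authors' method. It assumes an alignment held as a dictionary of equal-length strings, treats `-`, `?` and `N` as missing data, counts any other character (including IUPAC ambiguity codes) as a distinct state, and uses the full alignment length as the denominator. The accession names and sequences are invented toy data.

```python
from collections import Counter

# Characters treated as missing data (an assumption of this sketch).
MISSING = set("-?N")


def site_stats(alignment):
    """Return (% variable sites, % parsimony informative sites).

    `alignment` maps accession name -> aligned sequence; all sequences
    must have the same non-zero length.
    """
    seqs = [s.upper() for s in alignment.values()]
    if not seqs or len(seqs[0]) == 0:
        raise ValueError("alignment is empty")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences differ in length")

    variable = 0
    informative = 0
    for column in zip(*seqs):
        counts = Counter(base for base in column if base not in MISSING)
        # Variable: at least two different states observed at the site.
        if len(counts) >= 2:
            variable += 1
            # Parsimony informative: at least two states, each present
            # in at least two accessions.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return 100.0 * variable / length, 100.0 * informative / length


# Toy alignment (invented): 10 sites, 4 accessions.
toy = {
    "acc1": "ACGTACGTAC",
    "acc2": "ACGTACGTAC",
    "acc3": "ACGAACGTTC",
    "acc4": "ACGAACCT-C",
}
pct_variable, pct_informative = site_stats(toy)
print(pct_variable, pct_informative)  # 30.0 10.0

# Locus bookkeeping from the text: 194 loci in the final dataset,
# 11 of them with data for two or fewer accessions.
usable_loci = 194 - 11
print(usable_loci)  # 183
```

In the toy alignment, three sites vary but only one has two states that each occur in two accessions, which is why the parsimony informative percentage is lower than the variable percentage, as it is for the Inga loci.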

