Exploring the utility of cross-laboratory RAD-sequencing datasets for phylogenetic analysis.

Gonen S, Bishop SC, Houston RD - BMC Res Notes (2015)

Bottom Line: The number of orthologous SbfI RAD loci identified decreased with increasing evolutionary distance between the species, with several thousand loci conserved across five salmonid species (divergence ~50 MY), and several hundred conserved across the more distantly related teleost species (divergence ~100-360 MY). This has positive implications for the repeatability of SbfI RAD-Seq and its potential to address research questions beyond the scope of the original studies. Furthermore, the concordance in tree topologies and relationships estimated in this study with published teleost phylogenies suggests that similar meta-datasets could be utilised in the prediction of evolutionary relationships across populations and species with readily available RAD-Seq datasets, but for which relationships remain uncharacterised.


Affiliation: The Roslin Institute, University of Edinburgh, Midlothian, EH25 9RG, Scotland, UK. Serap.gonen@roslin.ed.ac.uk.

ABSTRACT

Background: Restriction site-Associated DNA sequencing (RAD-Seq) is widely applied to generate genome-wide sequence and genetic marker datasets. RAD-Seq has been extensively utilised, both at the population level and across species, for example in the construction of phylogenetic trees. However, the consistency of RAD-Seq data generated in different laboratories, and the potential use of cross-species orthologous RAD loci in the estimation of genetic relationships, have not been widely investigated. This study describes the use of SbfI RAD-Seq data for the estimation of evolutionary relationships amongst ten teleost fish species, using previously established phylogenies as a benchmark.

Results: The number of orthologous SbfI RAD loci identified decreased with increasing evolutionary distance between the species, with several thousand loci conserved across five salmonid species (divergence ~50 MY), and several hundred conserved across the more distantly related teleost species (divergence ~100-360 MY). The majority (>70%) of loci identified between the more distantly related species were genic in origin, suggesting that the bias of SbfI towards genic regions is useful for identifying distant orthologs. Interspecific single nucleotide variants at each orthologous RAD locus were identified. Evolutionary relationships estimated using concatenated sequences of interspecific variants were congruent with previously published phylogenies, even for distantly related species (divergence up to ~360 MY).

Conclusion: Overall, this study has demonstrated that orthologous SbfI RAD loci can be identified across closely and distantly related species. This has positive implications for the repeatability of SbfI RAD-Seq and its potential to address research questions beyond the scope of the original studies. Furthermore, the concordance in tree topologies and relationships estimated in this study with published teleost phylogenies suggests that similar meta-datasets could be utilised in the prediction of evolutionary relationships across populations and species with readily available RAD-Seq datasets, but for which relationships remain uncharacterised.



Figure 2: Example tree of all ten fish species obtained in this study using RAxML. Evolutionary relationships obtained using RAD data in this study were congruent with those of Near et al. [49] (teleost species) and Shedko et al. [48] (salmonid species) (Figure 1). Parameters: RAD loci present in at least five of ten species; 452 loci, 4,094 between-species variants. Branch lengths (given as percentages) estimated in RAxML are shown along individual branches (in blue), and node bootstrap support values (1,000 bootstrap replicates) are shown at individual nodes (in red). Branch lengths are not drawn to scale.


Mentions: Likewise, across the ten teleost fish species, evolutionary relationships were estimated using variants derived from RAD loci common to at least seven of the ten species (137 loci, 1,440 variants; Table 2) and compared to the estimates using orthologous RAD clusters common to at least five of the ten species (452 loci, 4,094 variants; Table 2). Overall, tree topologies were consistent with previously published literature (Figures 1, 2; Additional file 5, trees 3 and 4). Monophyly of the Salmonidae and monophyly of the three Oncorhynchus species were predicted with 100% bootstrap support. Across both the salmonid and the teleost datasets, relaxing the threshold for inclusion of RAD loci in the analysis did not change estimated relationships or tree topology. Improvements in node support were also observed; for example, all salmonid species nodes were estimated with 100% support (vs. 98–100%) when the minimum taxon coverage at a RAD locus was reduced from seven to five of the ten species included (e.g. Additional file 5, trees 3 and 4). However, improvements in node support were not seen in all cases; for example, the node placing spotted gar as outgroup was not as strongly supported when the minimum taxon coverage was reduced (48–80%; Additional file 5, trees 3 and 4). Although bootstrap support is generally accepted as a reliable indicator of node accuracy, recent in silico studies suggest that this may not always be the case with RAD-Seq data [18]. Since true node support values are unknown for empirical datasets, the accuracy of the reported bootstrap values cannot be quantified in this study.
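The minimum taxon coverage threshold discussed above (a RAD locus is retained only if it is present in at least five, or at least seven, of the ten species) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the toy taxa and sequences, and the choice of "N" for a species missing at a retained locus are all assumptions made for illustration. The sketch keeps loci that meet the threshold, extracts the columns that vary between species, and concatenates them into one sequence per species, which is the general shape of input a tree-building program expects.

```python
from typing import Dict, List

# Hypothetical representation: locus name -> {species name: aligned sequence}.
Locus = Dict[str, str]


def filter_by_taxon_coverage(loci: Dict[str, Locus], min_taxa: int) -> Dict[str, Locus]:
    """Keep only loci that were recovered in at least `min_taxa` species."""
    return {name: seqs for name, seqs in loci.items() if len(seqs) >= min_taxa}


def variable_columns(locus: Locus) -> List[int]:
    """Return alignment columns where the species present differ (A/C/G/T only)."""
    seqs = list(locus.values())
    length = min(len(s) for s in seqs)
    columns = []
    for i in range(length):
        bases = {s[i] for s in seqs if s[i] in "ACGT"}
        if len(bases) > 1:
            columns.append(i)
    return columns


def concatenate_variants(loci: Dict[str, Locus], species: List[str], min_taxa: int) -> Dict[str, str]:
    """Concatenate between-species variant sites from all retained loci."""
    kept = filter_by_taxon_coverage(loci, min_taxa)
    matrix: Dict[str, List[str]] = {sp: [] for sp in species}
    for name in sorted(kept):  # fixed locus order keeps the output reproducible
        locus = kept[name]
        for col in variable_columns(locus):
            for sp in species:
                # Assumption: "N" marks a species absent from this locus.
                matrix[sp].append(locus[sp][col] if sp in locus else "N")
    return {sp: "".join(chars) for sp, chars in matrix.items()}


# Toy data (invented, not from the study): three taxa, three loci.
toy_species = ["taxon_a", "taxon_b", "taxon_c"]
toy_loci = {
    "locus1": {"taxon_a": "ACGT", "taxon_b": "ACGA", "taxon_c": "TCGA"},
    "locus2": {"taxon_a": "GGCC", "taxon_b": "GGCT"},
    "locus3": {"taxon_a": "TTTT"},
}

relaxed = concatenate_variants(toy_loci, toy_species, min_taxa=2)
strict = concatenate_variants(toy_loci, toy_species, min_taxa=3)
print(relaxed)
print(strict)
```

In the toy data, lowering `min_taxa` from 3 to 2 admits an extra locus and therefore more variant sites, at the cost of missing data for one taxon. This mirrors the trade-off described above, where relaxing the threshold from seven to five species increased the dataset from 137 to 452 loci.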

