Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing.

Straub SC, Fishbein M, Livshultz T, Foster Z, Parks M, Weitemier K, Cronn RC, Liston A - BMC Genomics (2011)

Bottom Line: The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives. This study represents a first step in the development of a community resource for further study of plant-insect co-evolution, anti-herbivore defense, floral developmental genetics, reproductive biology, chemical evolution, population genetics, and comparative genomics using milkweeds, and A. syriaca in particular, as ecological and evolutionary models.


Affiliation: Department of Botany and Plant Pathology, Oregon State University, Corvallis, Oregon 97331, USA. straubs@science.oregonstate.edu

ABSTRACT

Background: Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution.

Results: A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed.

Conclusions: The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species and its relatives. This study represents a first step in the development of a community resource for further study of plant-insect co-evolution, anti-herbivore defense, floral developmental genetics, reproductive biology, chemical evolution, population genetics, and comparative genomics using milkweeds, and A. syriaca in particular, as ecological and evolutionary models.


Figure 3: The rDNA cistron of Asclepias syriaca. Blue boxes represent genes. Thick black lines represent additional transcribed sequence. The gray and dashed lines represent non-transcribed and unassembled sequence, respectively. Only partial sequences of the non-transcribed spacer (NTS) and external transcribed spacer (ETS) could be assembled due to repeats, so the length of the intergenic spacer (IGS) remains unknown.

Mentions: A 385 bp contig containing a 120 bp 5S rDNA sequence [GenBank:JF312047] and an rDNA cistron of 7,541 bp [GenBank:JF312046] were assembled for A. syriaca (Figure 3). Repeat structure in the external transcribed spacer (ETS) and intergenic spacer (IGS) regions of the rDNA cistron prevented further extension of that contig and produced an error in the assembly where reads corresponding to similar repeats were incorrectly piled up, highlighting one of the primary difficulties of genome assembly using short reads [32,91]. Consequently, the first 280 bp of the contig sequence were removed prior to downstream analyses. The median read depths for the final assemblies were 406× for 5S and 738× for the 18S-5.8S-26S cistron. Using the median read depth as an estimate of the number of rDNA copies sequenced and the nuclear genome coverage of approximately 0.4×, rough approximations of the number of 5S rDNA repeats and of rDNA cistron copies are 1,015 and 1,845, respectively. These estimates are in line with values observed for other plants [92,93].
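
The copy-number approximation described above amounts to dividing the median read depth of a repeat by the nuclear genome coverage. The following is a minimal Python sketch of that arithmetic, not the authors' code; the function name estimate_copy_number and the rounding to the nearest integer are assumptions made for illustration, while the depth and coverage values are the ones quoted in the text.

```python
def estimate_copy_number(median_read_depth: float, genome_coverage: float) -> int:
    """Rough copy-number estimate for a repeated sequence.

    Assumes a single-copy nuclear locus is sequenced at `genome_coverage`
    on average, so a locus sequenced at `median_read_depth` is present in
    roughly depth / coverage copies. Rounding is an illustrative choice.
    """
    if genome_coverage <= 0:
        raise ValueError("genome_coverage must be positive")
    return round(median_read_depth / genome_coverage)


# Values quoted in the text: nuclear genome coverage of approximately 0.4x.
NUCLEAR_COVERAGE = 0.4

five_s_copies = estimate_copy_number(406, NUCLEAR_COVERAGE)   # 5S rDNA
cistron_copies = estimate_copy_number(738, NUCLEAR_COVERAGE)  # 18S-5.8S-26S

print(five_s_copies)   # 1015
print(cistron_copies)  # 1845
```

Because the coverage value is itself approximate, these figures are order-of-magnitude estimates rather than exact counts.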

