Independent, rapid and targeted loss of highly repetitive DNA in natural and synthetic allopolyploids of Nicotiana tabacum.

Renny-Byfield S, Kovařík A, Chester M, Nichols RA, Macas J, Novák P, Leitch AR - PLoS ONE (2012)

Bottom Line: Allopolyploidy (interspecific hybridisation and polyploidy) has played a significant role in the evolutionary history of angiosperms and can result in genomic, epigenetic and transcriptomic perturbations. We examine the immediate effects of allopolyploidy on repetitive DNA by comparing the genomes of synthetic and natural Nicotiana tabacum with diploid progenitors N. tomentosiformis (paternal progenitor) and N. sylvestris (maternal progenitor). Abundance estimates, based on sequencing depth, indicate NicCL3 is almost absent in N. sylvestris and has been dramatically reduced in copy number in the allopolyploid N. tabacum.

Affiliation: School of Biological and Chemical Sciences, Queen Mary University of London, London, United Kingdom.

ABSTRACT
Allopolyploidy (interspecific hybridisation and polyploidy) has played a significant role in the evolutionary history of angiosperms and can result in genomic, epigenetic and transcriptomic perturbations. We examine the immediate effects of allopolyploidy on repetitive DNA by comparing the genomes of synthetic and natural Nicotiana tabacum with diploid progenitors N. tomentosiformis (paternal progenitor) and N. sylvestris (maternal progenitor). Using next-generation sequencing, a recently developed graph-based repeat identification pipeline, Southern blot and fluorescence in situ hybridisation (FISH), we characterise two highly repetitive DNA sequences (NicCL3 and NicCL7/30). Analysis of two independent high-throughput DNA sequencing datasets indicates that NicCL3 forms 1.6-1.9% of the genome in N. tomentosiformis, where it occurs in multiple, discontinuous tandem arrays scattered over several chromosomes. Abundance estimates, based on sequencing depth, indicate that NicCL3 is almost absent in N. sylvestris and has been dramatically reduced in copy number in the allopolyploid N. tabacum. Surprisingly, elimination of NicCL3 is repeated in some synthetic lines of N. tabacum in their fourth generation. The retroelement NicCL7/30, which occurs interspersed with NicCL3, is also under-represented but to a much lesser degree, revealing targeted elimination of the latter. Analysis of paired-end sequencing data indicates that the tandem component of NicCL3 has been preferentially removed in natural N. tabacum, increasing the proportion of the dispersed component. This loss occurs across multiple blocks of discontinuous repeats and, based on the distribution of nucleotide similarity among NicCL3 units, was concurrent with rounds of sequence homogenisation.
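
The abundance estimates in the abstract rest on simple arithmetic: if reads sample the genome roughly uniformly, the fraction of reads assigned to a repeat approximates the fraction of the genome that repeat occupies. The Python sketch below illustrates that reasoning only. The function names, read counts, genome size and repeat unit length are hypothetical and are not taken from the paper, and the copy-number conversion is an assumed back-of-the-envelope formula rather than the authors' procedure.

```python
def genome_proportion(reads_in_cluster: int, total_reads: int) -> float:
    """Estimate the genome fraction of a repeat as the fraction of reads assigned to it.

    Assumes reads are a roughly uniform random sample of the genome.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return reads_in_cluster / total_reads


def approximate_copy_number(proportion: float, genome_size_bp: int, unit_length_bp: int) -> float:
    """Convert a genome fraction to an approximate copy number (assumed formula).

    copies ~= (fraction of genome * genome size) / length of one repeat unit
    """
    if unit_length_bp <= 0:
        raise ValueError("unit_length_bp must be positive")
    return proportion * genome_size_bp / unit_length_bp


# Hypothetical numbers, for illustration only.
fraction = genome_proportion(reads_in_cluster=1_700, total_reads=100_000)
copies = approximate_copy_number(fraction, genome_size_bp=2_000_000_000, unit_length_bp=2_000)
print(f"{fraction:.1%} of the genome, roughly {copies:,.0f} copies")
```

Comparing the same fraction across species, each computed from that species' own reads, is what allows a repeat to be called under-represented in one genome relative to another.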

pone-0036963-g002: The cluster NicCL7/30. (a) Cluster NicCL7/30 shown as a graph. Individual sequence reads are represented as nodes and, for simplicity, edges representing similarity hits are not shown. Node positions were calculated using the Fruchterman-Reingold algorithm. (b) The same graph with sequences highlighted according to the progenitor species from which they derive. (c) Another representation of NicCL7/30, indicating sequence similarity to conserved coding domains (CCD) including protease (PROT), reverse transcriptase (RT), RNaseH (RH), integrase (INT), chromovirus chromo-domain (CHDII) and gag-pol (GAG). (d) Estimated copy numbers from 454 read-depth analysis along the length of the most abundant contig in the merged cluster NicCL7/30. A region between ∼500 and 3200 bp is more abundant than the rest of the contig and likely represents the LTR region of this retroelement, where higher abundance may be due to the presence of solo-LTRs. The positions of the PCR primers used to make probes for this sequence are indicated with arrows.
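
Panel (d) relies on read depth along a contig. A minimal Python sketch of that kind of profile follows, assuming read alignments are available as half-open, 0-based (start, end) intervals on the contig. The function names, the median-based threshold and the toy alignments are all hypothetical. The sketch shows the general idea and is not the authors' analysis.

```python
from statistics import median


def per_base_depth(contig_length, alignments):
    """Count how many aligned reads cover each position of a contig.

    alignments: iterable of (start, end) half-open, 0-based intervals.
    Uses a difference array so the cost is linear in reads plus contig length.
    """
    diff = [0] * (contig_length + 1)
    for start, end in alignments:
        start, end = max(0, start), min(contig_length, end)
        if start < end:
            diff[start] += 1
            diff[end] -= 1
    depth, running = [], 0
    for delta in diff[:contig_length]:
        running += delta
        depth.append(running)
    return depth


def elevated_regions(depth, factor=2.0):
    """Return half-open (start, end) runs where depth exceeds factor * median depth.

    The median-based threshold is an assumption made for this illustration.
    """
    if not depth:
        return []
    threshold = factor * median(depth)
    regions, run_start = [], None
    for pos, d in enumerate(depth):
        if d > threshold and run_start is None:
            run_start = pos
        elif d <= threshold and run_start is not None:
            regions.append((run_start, pos))
            run_start = None
    if run_start is not None:
        regions.append((run_start, len(depth)))
    return regions


# Toy example: one read spans the contig, three more pile up in the middle.
depth = per_base_depth(10, [(0, 10), (2, 6), (2, 6), (3, 5)])
print(depth)
print(elevated_regions(depth))
```

A run of elevated depth such as the one reported between ∼500 and 3200 bp would appear here as a single interval. Interpreting it as an LTR region requires the additional evidence described in the caption.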

Mentions: A graph-based clustering approach was used to identify and reconstruct, in silico, the major repeat types present in the genomes of N. tabacum, N. sylvestris and N. tomentosiformis, as described in Renny-Byfield et al. [15]. A combined dataset of 454 sequence reads from all three species was used to generate clusters and contigs representing repetitive DNA sequences. Mutual similarities can then be visualised in graph form (Fig. 1a and Fig. 2b), in which nodes correspond to sequence reads and a Fruchterman-Reingold algorithm is used to position the nodes. Reads that are most similar are placed closest together, whilst those that are less closely related lie further apart (described in detail in Novák et al. [31]). Contig assembly is performed with the reads from each cluster, and contigs are named according to the number of the cluster from which they derive (X), with Nic designating Nicotiana, i.e. NicCLX. Each cluster typically generates multiple contigs, each of which is assigned a number (Y), giving the format NicCLX contigY. All contigs assembled in this work are available via our websites: http://webspace.qmul.ac.uk/sbyfield/Simon_Renny-Byfield/Data.html and http://webspace.qmul.ac.uk/arleitch/Site/Home.html.
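
The workflow described above (reads as nodes, similarity hits as edges, a force-directed layout, and NicCLX contigY names) can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the published pipeline. Connected components stand in for the pipeline's actual clustering step, which the text does not detail. Numbering clusters by decreasing size is assumed. The layout is a bare-bones Fruchterman-Reingold loop with arbitrarily chosen cooling parameters. All function names and read identifiers are hypothetical.

```python
import math
import random
from collections import defaultdict


def cluster_reads(read_ids, similarity_hits):
    """Group reads into connected components of the similarity graph (union-find)."""
    parent = {read: read for read in read_ids}

    def find(read):
        while parent[read] != read:
            parent[read] = parent[parent[read]]  # path halving
            read = parent[read]
        return read

    for a, b in similarity_hits:
        parent[find(a)] = find(b)

    groups = defaultdict(list)
    for read in read_ids:
        groups[find(read)].append(read)
    # Largest cluster first (assumed numbering convention).
    return sorted(groups.values(), key=len, reverse=True)


def contig_name(cluster_number, contig_number):
    """Format a contig name following the NicCLX contigY convention."""
    return f"NicCL{cluster_number} contig{contig_number}"


def fruchterman_reingold(nodes, edges, iterations=50, seed=0):
    """Position nodes in 2D: all pairs repel, connected pairs attract."""
    nodes = list(nodes)
    rng = random.Random(seed)
    pos = {n: [rng.random(), rng.random()] for n in nodes}
    k = math.sqrt(1.0 / max(len(nodes), 1))  # ideal edge length for unit area
    temperature = 0.1  # caps the per-iteration movement; cooled each round
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        for i, u in enumerate(nodes):  # repulsion: k^2 / distance
            for v in nodes[i + 1:]:
                dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
                dist = max(math.hypot(dx, dy), 1e-9)
                push = k * k / dist
                disp[u][0] += dx / dist * push
                disp[u][1] += dy / dist * push
                disp[v][0] -= dx / dist * push
                disp[v][1] -= dy / dist * push
        for u, v in edges:  # attraction: distance^2 / k
            dx, dy = pos[u][0] - pos[v][0], pos[u][1] - pos[v][1]
            dist = max(math.hypot(dx, dy), 1e-9)
            pull = dist * dist / k
            disp[u][0] -= dx / dist * pull
            disp[u][1] -= dy / dist * pull
            disp[v][0] += dx / dist * pull
            disp[v][1] += dy / dist * pull
        for n in nodes:
            length = max(math.hypot(disp[n][0], disp[n][1]), 1e-9)
            step = min(length, temperature)
            pos[n][0] += disp[n][0] / length * step
            pos[n][1] += disp[n][1] / length * step
        temperature *= 0.95
    return pos


# Toy example with hypothetical read identifiers.
reads = ["r1", "r2", "r3", "r4", "r5"]
hits = [("r1", "r2"), ("r2", "r3"), ("r4", "r5")]
clusters = cluster_reads(reads, hits)
layout = fruchterman_reingold(clusters[0], hits[:2])
print(clusters, contig_name(3, 1))
```

In a layout of this kind, reads joined by similarity hits are pulled together while all reads push each other apart, which is why closely related reads end up as neighbours in graphs such as Fig. 2.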

