The repetitive component of the sunflower genome as shown by different procedures for assembling next generation sequencing reads.

Natali L, Cossu RM, Barghini E, Giordani T, Buti M, Mascagni F, Morgante M, Gill N, Kane NC, Rieseberg L, Cavallini A - BMC Genomics (2013)

Bottom Line: Many families of non-autonomous retrotransposons and DNA transposons (especially of the Helitron superfamily) were also identified. The results substantially matched those previously obtained by using a Sanger-sequenced shotgun library and a standard 454 whole-genome-shotgun approach, indicating the reliability of the proposed procedures also for other species. The repetitive sequences were collected to produce a database, SUNREP, that will be useful for the annotation of the sunflower genome sequence and for studying the genome evolution in dicotyledons.


Affiliation: Department of Agricultural, Food, and Environmental Sciences, University of Pisa, Via del Borghetto 80, I-56124 Pisa, Italy. lnatali@agr.unipi.it.

ABSTRACT

Background: Next generation sequencing provides a powerful tool to study genome structure in species whose genomes are far from being completely sequenced. In this work we describe and compare different computational approaches to evaluate the repetitive component of the genome of sunflower, by using medium/low coverage Illumina or 454 libraries.

Results: By varying sequencing technology (Illumina or 454), coverage (0.55x to 1.25x), assemblers and assembly procedures, six different genomic databases were produced. Annotation of these databases showed that they were composed of different proportions of repetitive DNA families. The final assembly of the sequences belonging to the six databases produced a whole genome set of 283,800 contigs. The redundancy of each contig was estimated by mapping a large Illumina read set to the whole genome set and counting the Illumina reads that matched each contig. The repetitive component amounted to 81% of the sunflower genome, which is composed mainly of numerous families of Gypsy and Copia retrotransposons. Many families of non-autonomous retrotransposons and DNA transposons (especially of the Helitron superfamily) were also identified.

Conclusions: The results substantially matched those previously obtained by using a Sanger-sequenced shotgun library and a standard 454 whole-genome-shotgun approach, indicating that the proposed procedures are reliable and can also be applied to other species. The repetitive sequences were collected to produce a database, SUNREP, that will be useful for annotating the sunflower genome sequence and for studying genome evolution in dicotyledons.


Figure 5: Number of sequences composing the 30 most numerous families of LTR-REs (above) and DNA transposons (below).

Mentions: The most redundant family, belonging to the Gypsy repeat superfamily, included only 96 of the 47,924 sequences of SUNREP (0.20%). Only four Gypsy families were composed of more than 50 SUNREP sequences. Among the 30 most numerous LTR-RE families, the vast majority belonged to the Gypsy superfamily (Figure 5). Among the 30 most numerous DNA transposons, the most common families belonged to the Helitron class, followed by putative MITEs (Figure 5). Note that the number of sequences belonging to a family in SUNREP does not reflect the redundancy of that family in the genome.
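The distinction drawn above, between how many database sequences a family contains and how redundant that family is in the genome, can be illustrated with a minimal Python sketch. Only the figures 96 and 47,924 come from the text; the function names, contig names, family labels and read counts are hypothetical, and the read-share calculation is a simplified stand-in for read-mapping-based redundancy estimation, not the authors' actual pipeline.

```python
from collections import defaultdict


def family_share(n_family: int, n_total: int) -> float:
    """Percentage of database sequences assigned to one family."""
    return 100.0 * n_family / n_total


def read_share_by_family(read_counts: dict, families: dict) -> dict:
    """Percentage of mapped reads attributed to each family.

    read_counts maps contig name -> number of reads mapped to it;
    families maps contig name -> family label (both hypothetical inputs).
    """
    totals = defaultdict(int)
    for contig, n_reads in read_counts.items():
        totals[families[contig]] += n_reads
    mapped = sum(totals.values())
    return {fam: 100.0 * n / mapped for fam, n in totals.items()}


# Figures quoted in the text: 96 of 47,924 SUNREP sequences.
sunrep_share = round(family_share(96, 47_924), 2)

# Hypothetical toy data: famA has one sequence but most of the reads,
# famB has two sequences but few reads.
read_counts = {"c1": 900, "c2": 50, "c3": 50}
families = {"c1": "famA", "c2": "famB", "c3": "famB"}
shares = read_share_by_family(read_counts, families)
```

In this toy case famB contributes more sequences to the database than famA, yet famA accounts for most of the mapped reads, which is why sequence counts per family should not be read as genome abundance.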

