An EST resource for tilapia based on 17 normalized libraries and assembly of 116,899 sequence tags.

Lee BY, Howe AE, Conte MA, D'Cotta H, Pepey E, Baroiller JF, di Palma F, Carleton KL, Kocher TD - BMC Genomics (2010)

Bottom Line: The ESTs were assembled into 20,190 contigs and 36,028 singletons for a total of 56,218 unique sequences and a total assembled length of 35,168,415 bp. Over the whole project, a unique sequence was discovered for every 2.079 sequence reads. 17,722 (31.5%) of these unique sequences had significant BLAST hits (e-value < 10^-10) to the UniProt database. These sequences are an important resource for studies of gene expression, comparative mapping and annotation of the forthcoming tilapia genome sequence.


Affiliation: Department of Biology, University of Maryland, College Park, Maryland 20742, USA.

ABSTRACT

Background: Large collections of expressed sequence tags (ESTs) are a fundamental resource for analysis of gene expression and annotation of genome sequences. We generated 116,899 ESTs from 17 normalized and two non-normalized cDNA libraries representing 16 tissues from tilapia, a cichlid fish widely used in aquaculture and biological research.

Results: The ESTs were assembled into 20,190 contigs and 36,028 singletons for a total of 56,218 unique sequences and a total assembled length of 35,168,415 bp. Over the whole project, a unique sequence was discovered for every 2.079 sequence reads. 17,722 (31.5%) of these unique sequences had significant BLAST hits (e-value < 10^-10) to the UniProt database.

Conclusion: Normalization of the cDNA pools with double-stranded nuclease allowed us to efficiently sequence a large collection of ESTs. These sequences are an important resource for studies of gene expression, comparative mapping and annotation of the forthcoming tilapia genome sequence.
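The summary statistics in the Results can be checked with simple arithmetic. The sketch below is only an illustration: the counts are taken from the abstract, while the variable names are assumptions chosen for readability and do not come from the authors' pipeline.

```python
# Counts reported in the abstract.
total_reads = 116_899   # ESTs generated
contigs = 20_190        # assembled contigs
singletons = 36_028     # reads left unassembled
blast_hits = 17_722     # unique sequences with a significant UniProt hit

# A "unique sequence" is either a contig or a singleton.
unique_sequences = contigs + singletons

# Reads needed, on average, to discover one unique sequence.
reads_per_unique = total_reads / unique_sequences

# Share of unique sequences with a significant BLAST hit, in percent.
hit_percent = 100 * blast_hits / unique_sequences

print(unique_sequences)            # 56218
print(round(reads_per_unique, 3))  # 2.079
print(round(hit_percent, 1))       # 31.5
```

The three printed values agree with the totals stated in the Results: 56,218 unique sequences, 2.079 reads per unique sequence, and 31.5% with BLAST hits.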



Figure 2: The redundancy of each library at different depths of sequencing. The x-axis is the number of sequence reads from each library. The y-axis indicates the number of reads required to discover a sequence that does not cluster with the existing sequences for that library. Results after each round of sequencing are shown for Br3 (squares) and Ret4 (diamonds). Other libraries are shown with circles. The point in the upper left is the non-normalized Ret3 library.

Mentions: We performed separate Cap3 assemblies to assess the rate of sequence discovery for each library (Table 2). After sequencing 5,000 clones from each library, the rate of discovery ranged from 1.1 to 1.6 reads/discovery (Figure 2). This quantification allowed us to select the least redundant libraries for further sequencing.
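The reads/discovery measure used to rank libraries can be sketched as the number of reads divided by the number of unique sequences (contigs plus singletons) from a per-library assembly. This is a minimal illustration under that assumption, not the authors' code: the library names and counts below are hypothetical placeholders, and the function name is invented for this example.

```python
def reads_per_discovery(n_reads: int, n_unique: int) -> float:
    """Reads sequenced per unique sequence found; lower means less redundant."""
    if n_unique <= 0:
        raise ValueError("n_unique must be positive")
    return n_reads / n_unique

# Hypothetical (reads, unique sequences) per library; NOT data from the paper.
library_counts = {
    "LibA": (5000, 4500),
    "LibB": (5000, 3200),
    "LibC": (5000, 4000),
}

rates = {
    name: reads_per_discovery(reads, unique)
    for name, (reads, unique) in library_counts.items()
}

# Least redundant libraries first: these would be candidates for deeper sequencing.
ranked = sorted(rates, key=rates.get)

for name in ranked:
    print(f"{name}: {rates[name]:.2f} reads/discovery")
```

With these placeholder counts the ranking is LibA, LibC, LibB, so LibA would be the first choice for further sequencing.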

