De novo assembly, characterization and functional annotation of Senegalese sole (Solea senegalensis) and common sole (Solea solea) transcriptomes: integration in a database and design of a microarray.

Benzekri H, Armesto P, Cousin X, Rovira M, Crespo D, Merlo MA, Mazurais D, Bautista R, Guerrero-Fernández D, Fernandez-Pozo N, Ponce M, Infante C, Zambonino JL, Nidelet S, Gut M, Rebordinos L, Planas JV, Bégout ML, Claros MG, Manchado M - BMC Genomics (2014)

Bottom Line: Transcriptome information was applied to the design of a microarray tool in S. senegalensis that was successfully tested and validated by qPCR. The design of a microarray and establishment of a reference transcriptome will be useful for large-scale gene expression studies. Moreover, the integration of transcriptomic data in the SoleaDB will facilitate the management of genomic information in these important species.


Affiliation: IFAPA Centro El Toruño, IFAPA, Consejeria de Agricultura y Pesca, 11500 El Puerto de Santa María, Cádiz, Spain. manuel.manchado@juntadeandalucia.es.

ABSTRACT

Background: Senegalese sole (Solea senegalensis) and common sole (S. solea) are two economically and evolutionarily important flatfish species in both fisheries and aquaculture. Although some genomic resources and tools were recently described in these species, further sequencing efforts are required to establish a complete transcriptome and to identify new molecular markers. Moreover, the comparative analysis of transcriptomes will be useful to understand flatfish evolution.

Results: A comprehensive characterization of the transcriptome for each species was carried out using a large set of Illumina data (more than 1,800 million reads for each sole species) and 454 reads (more than 5 million reads, in S. senegalensis only), providing coverage ranging from 1,384x to 2,543x. After de novo assembly, 45,063 and 38,402 different transcripts were obtained, comprising 18,738 and 22,683 full-length cDNAs in S. senegalensis and S. solea, respectively. A reference transcriptome with the longest unique transcripts and putative non-redundant new transcripts was established for each species. A subset of 11,953 reference transcripts was qualified as highly reliable orthologs (>97% identity) between both species. A small subset of putative species-specific, lineage-specific and flatfish-specific transcripts was also identified. Furthermore, transcriptome data permitted the identification of single nucleotide polymorphisms and simple-sequence repeats confirmed by FISH to be used in further genetic and expression studies. Moreover, evidence for the retention of crystallins crybb1, crybb1-like and crybb3 in the two sole species is also presented. Transcriptome information was applied to the design of a microarray tool in S. senegalensis that was successfully tested and validated by qPCR. Finally, transcriptomic data were hosted and structured at SoleaDB.

Conclusions: Transcriptomes and molecular markers identified in this study represent a valuable source for future genomic studies in these economically important species. Orthology analysis provided new clues regarding sole genome evolution indicating a divergent evolution of crystallins in flatfish. The design of a microarray and establishment of a reference transcriptome will be useful for large-scale gene expression studies. Moreover, the integration of transcriptomic data in the SoleaDB will facilitate the management of genomic information in these important species.


Figure 2: Representation of transcript abundance with respect to their lengths in the S. senegalensis (dark) and S. solea (grey) transcriptomes.

Mentions: The S. senegalensis transcriptome was initially assembled using only the Roche/454 reads, following the workflow depicted in Figure 1A (Additional file 2, S. senegalensis v3). Interestingly, the assembly resulted in a large number of transcripts longer than 500 bp, slightly higher than expected for a teleost [24, 27]. The addition of Illumina libraries to improve transcriptome assembly (S. senegalensis v4 final transcriptome) required the workflow depicted in Figure 1B, where the long read strategy was slightly modified (essentially, all MIRA3 debris was discarded). The S. solea transcriptome (named S. solea v1) was assembled using the short read strategy depicted in Figure 1B, except that the k-mers used were 25 and 69 because of the longer raw reads (Table 1). The S. solea v1 and S. senegalensis v4 transcriptomes were comparable with respect to (i) the frequency distribution of transcript length (Figure 2), (ii) the total number of transcripts (Additional file 2) and (iii) the number of transcripts longer than 500 bp (Additional file 2). However, mean length and N50 were clearly longer in S. solea (Additional file 2), which may be explained by the longer input reads (89 vs 66 nt in S. solea and S. senegalensis, respectively, Table 1) and a low relative contribution of Roche/454 reads in S. senegalensis v4. This low contribution may be explained by the fact that Roche/454 libraries were normalized to reduce highly abundant transcripts, which might have led to more fragmented assemblies, limiting the contribution of Roche/454 reads to the final transcriptome [28].
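The comparison above relies on mean transcript length and N50. As a minimal sketch of how these two summary statistics are commonly computed from a list of transcript lengths, the Python below uses the usual definition of N50 (the length of the shortest transcript in the set of longest transcripts that together cover at least half of the total assembled bases). The function names and the toy length lists are hypothetical illustrations and are not taken from the paper's data or pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of transcript lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one transcript length")
    ordered = sorted(lengths, reverse=True)  # longest transcripts first
    half_total = sum(ordered) / 2            # half of all assembled bases
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


def mean_length(lengths):
    """Return the arithmetic mean of the transcript lengths."""
    if not lengths:
        raise ValueError("need at least one transcript length")
    return sum(lengths) / len(lengths)


# Toy, invented length lists: the same number of transcripts,
# but the second assembly contains longer transcripts.
assembly_a = [200, 300, 400, 500, 600]
assembly_b = [200, 300, 400, 900, 1200]

for name, lengths in (("assembly_a", assembly_a), ("assembly_b", assembly_b)):
    print(name, "transcripts:", len(lengths),
          "mean:", mean_length(lengths), "N50:", n50(lengths))
```

In this toy case both assemblies contain five transcripts, yet the second has a higher mean length and N50, which mirrors the kind of difference reported between the two sole transcriptomes without reproducing their actual values.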

