De novo assembly, characterization and functional annotation of Senegalese sole (Solea senegalensis) and common sole (Solea solea) transcriptomes: integration in a database and design of a microarray.

Benzekri H, Armesto P, Cousin X, Rovira M, Crespo D, Merlo MA, Mazurais D, Bautista R, Guerrero-Fernández D, Fernandez-Pozo N, Ponce M, Infante C, Zambonino JL, Nidelet S, Gut M, Rebordinos L, Planas JV, Bégout ML, Claros MG, Manchado M - BMC Genomics (2014)

Bottom Line: Transcriptome information was applied to the design of a microarray tool in S. senegalensis that was successfully tested and validated by qPCR. The design of a microarray and establishment of a reference transcriptome will be useful for large-scale gene expression studies. Moreover, the integration of transcriptomic data in the SoleaDB will facilitate the management of genomic information in these important species.


Affiliation: IFAPA Centro El Toruño, IFAPA, Consejeria de Agricultura y Pesca, 11500 El Puerto de Santa María, Cádiz, Spain. manuel.manchado@juntadeandalucia.es.

ABSTRACT

Background: Senegalese sole (Solea senegalensis) and common sole (S. solea) are two flatfish species of economic and evolutionary importance in both fisheries and aquaculture. Although some genomic resources and tools were recently described in these species, further sequencing efforts are required to establish a complete transcriptome and to identify new molecular markers. Moreover, the comparative analysis of transcriptomes will be useful to understand flatfish evolution.

Results: A comprehensive characterization of the transcriptome for each species was carried out using a large set of Illumina data (more than 1,800 million reads for each sole species) and 454 reads (more than 5 million reads, only in S. senegalensis), providing coverages ranging from 1,384x to 2,543x. After a de novo assembly, 45,063 and 38,402 different transcripts were obtained, comprising 18,738 and 22,683 full-length cDNAs in S. senegalensis and S. solea, respectively. A reference transcriptome with the longest unique transcripts and putative non-redundant new transcripts was established for each species. A subset of 11,953 reference transcripts was qualified as highly reliable orthologs (>97% identity) between both species. A small subset of putative species-specific, lineage-specific and flatfish-specific transcripts was also identified. Furthermore, transcriptome data permitted the identification of single nucleotide polymorphisms and simple-sequence repeats confirmed by FISH to be used in further genetic and expression studies. Moreover, evidence for the retention of crystallins crybb1, crybb1-like and crybb3 in the two species of soles is also presented. Transcriptome information was applied to the design of a microarray tool in S. senegalensis that was successfully tested and validated by qPCR. Finally, transcriptomic data were hosted and structured at SoleaDB.

Conclusions: Transcriptomes and molecular markers identified in this study represent a valuable source for future genomic studies in these economically important species. Orthology analysis provided new clues regarding sole genome evolution indicating a divergent evolution of crystallins in flatfish. The design of a microarray and establishment of a reference transcriptome will be useful for large-scale gene expression studies. Moreover, the integration of transcriptomic data in the SoleaDB will facilitate the management of genomic information in these important species.

Figure 5: Venn diagrams reflecting coincidences by Solea species among sole Blast-based orthologs and transcripts with a RefSeq/ENSEMBL ortholog for zebrafish. The diagrams compare the 11,743 Blast-based orthologs with the unique zebrafish RefSeq identifiers in SoleaDB for S. senegalensis (39,851) and S. solea (34,949) and with the unique zebrafish ENSEMBL identifiers in SoleaDB for S. senegalensis (39,270) and S. solea (34,389). Within the Venn diagrams, the numbers refer to the number of transcripts in SoleaDB for S. senegalensis (Sse) and S. solea (Sso), and the number of transcripts in SoleaDB with a zebrafish RefSeq identifier (R) or with a zebrafish ENSEMBL identifier (E).

Mentions: The 11,743 sole Blast-based orthologs with annotation (excluding the 210 unannotated transcripts, see above) were investigated based on their RefSeq and/or ENSEMBL ortholog for zebrafish. As shown in Figure 5, most Blast-based orthologs (93.8%) had a zebrafish ortholog. However, the most interesting finding is the small subset of sole orthologs lacking zebrafish similarity (Figure 5; 701 in S. senegalensis and 492 in S. solea, with 351 transcripts present in both species). Some of these transcripts without a zebrafish ortholog were related to the immune system, such as hepcidin antimicrobial peptides and some interleukins (e.g. IL11b, IL17A/F-1, IL8, IL22, IL7). Hepcidins appear as a highly diversified family in acanthopterygians (HAMP2-like group) that favoured the radiation of teleosts in marine and brackish environments [39, 40]. Similarly, the IL11b duplication appeared later during evolution and did not occur in zebrafish [41]. These data suggest that this subset of sole orthologs without zebrafish orthology might represent lineage-specific genes that have appeared, subfunctionalized or neofunctionalized later during teleost evolution. To check the presence of these transcripts in other teleosts, proteins deduced from reference transcripts were compared (Additional file 6, “Annotated transcripts”), observing that most of them were also present in the C. semilaevis genome (287; 81.8%). Only 18 transcripts (5.1%) lacked any orthology in the teleosts analyzed, confirming that this collection of transcripts could correspond to genes acquired or fixed during fish evolution (Additional file 5, “lineage-specific genes” tab).
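The counts quoted above can be cross-checked with a few lines of arithmetic. The Python sketch below is only an illustration and is not part of the authors' pipeline. It assumes that the two percentages (81.8% and 5.1%) are taken relative to the 351 transcripts lacking a zebrafish ortholog in both species; the variable names and the `pct` helper are hypothetical.

```python
# Counts quoted in the text for sole orthologs lacking a zebrafish ortholog.
no_zebrafish_sse = 701  # S. senegalensis
no_zebrafish_sso = 492  # S. solea
shared = 351            # lacking a zebrafish ortholog in both species

# Counts quoted for the comparison against other teleosts.
in_c_semilaevis = 287   # also present in the C. semilaevis genome
no_teleost_match = 18   # no orthology in any teleost analyzed


def pct(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Transcripts lacking a zebrafish ortholog in one species only.
only_sse = no_zebrafish_sse - shared
only_sso = no_zebrafish_sso - shared

# Assumption: both percentages use the shared set (351) as denominator.
pct_c_semilaevis = pct(in_c_semilaevis, shared)
pct_no_teleost = pct(no_teleost_match, shared)

print(only_sse, only_sso, pct_c_semilaevis, pct_no_teleost)
```

Under that assumption the computed shares reproduce the reported 81.8% and 5.1%, which supports reading the shared set of 351 transcripts as the denominator.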

