The Prediction and Validation of Small CDSs Expand the Gene Repertoire of the Smallest Known Eukaryotic Genomes.

Belkorchia A, Gasc C, Polonais V, Parisot N, Gallois N, Ribière C, Lerat E, Gaspin C, Pombert JF, Peyret P, Peyretaillade E - PLoS ONE (2015)

Bottom Line: To date, sequencing and annotation of microsporidian genomes have revealed a poor gene complement with highly reduced gene sizes. Most of the newly found genes are present in other distantly related microsporidian species, suggesting their biological relevance. The present study provides a better framework for annotating microsporidian genomes and for training and evaluating new computational methods dedicated to detecting ultra-small genes in various organisms.

Affiliation: Clermont Université, Université d'Auvergne, Laboratoire "Microorganismes: Génome et Environnement", BP 10448, F-63000, Clermont-Ferrand, France; CNRS, UMR 6023, LMGE, F-63171, Aubière, France.

ABSTRACT
The proper prediction of the gene catalogue of an organism is essential to obtain a representative snapshot of its overall lifestyle, especially when it is not amenable to culturing. Microsporidia are obligate intracellular, sometimes hard to culture, eukaryotic parasites known to infect members of every animal phylum. To date, sequencing and annotation of microsporidian genomes have revealed a poor gene complement with highly reduced gene sizes. In the present paper, we investigated whether such gene sizes may have introduced biases into the methodologies used for genome annotation, with an emphasis on small coding sequence (CDS) gene prediction. Using better delineated intergenic regions from four Encephalitozoon genomes, we predicted de novo new small CDSs with sizes ranging from 78 to 255 bp (median 168) and corroborated these predictions by RACE-PCR experiments in Encephalitozoon cuniculi. Most of the newly found genes are present in other distantly related microsporidian species, suggesting their biological relevance. The present study provides a better framework for annotating microsporidian genomes and for training and evaluating new computational methods dedicated to detecting ultra-small genes in various organisms.
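To make the idea of scanning intergenic regions for small CDSs concrete, here is a minimal sketch of a six-frame ORF scan restricted to the 78 to 255 bp range quoted in the abstract. It is not the prediction pipeline used in the study, which is not described on this page. The function names, the use of ATG as the only start codon, the standard stop codons, and the choice to count the stop codon in the length are all assumptions made for illustration.

```python
# Illustrative sketch only; NOT the method used by the authors.
# Assumptions: ATG start codon, standard stop codons, ORF length includes the stop codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_LEN, MAX_LEN = 78, 255  # size range (bp) reported in the abstract


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_small_orfs(seq, min_len=MIN_LEN, max_len=MAX_LEN):
    """Scan all six reading frames for ATG..stop ORFs within the size range.

    Returns a list of (strand, start, end, orf_sequence) tuples. Coordinates
    are 0-based, end-exclusive, and relative to the strand being scanned.
    """
    hits = []
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None  # position of the first in-frame ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    length = i + 3 - start
                    if min_len <= length <= max_len:
                        hits.append((strand, start, i + 3, s[start:i + 3]))
                    start = None
    return hits


# Hypothetical 96 bp ORF embedded in a toy "intergenic" sequence.
toy_orf = "ATG" + "GCT" * 30 + "TAA"
toy_region = "CC" + toy_orf + "GG"
candidates = find_small_orfs(toy_region)
```

A real annotation effort would add evidence beyond ORF length, such as conservation across species and transcript data, which is what the RACE-PCR validation described above provides.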

Fig 3 (pone.0139075.g003): Identification of the 5' and 3' maturation sites of the newly predicted small CDSs. Translation initiation codons and stop codons are highlighted in light grey for all genes. Putative polyadenylation signals are underlined and highlighted in bold characters. Distances between putative polyadenylation signals and polyadenylation sites are indicated in parentheses. Putative microsporidian promoter-specific signals, located upstream of the transcription start sites, are highlighted in dark grey. For brevity, the complete CDS sequences are not included and are represented instead by the corresponding gene names. ND: not defined.
Mentions: All 32 predicted sCDSs were confirmed to be transcribed in E. cuniculi by 5' and/or 3' RACE-PCR experiments followed by Sanger sequencing of the resulting RACE-PCR products (Fig 3). The 5' transcription start sites and polyadenylation sites were identified for most genes, including those (ECU02_0425, ECU04_1635, ECU05_0115, ECU07_0862, ECU07_1645, ECU07_1775, ECU09_0465 and ECU09_1255) for which no ortholog could be detected in non-Encephalitozoon microsporidian species (Fig 2 and Table 1). Four genes (ECU02_1495, ECU05_0087, ECU07_1775 and ECU11_1725) were also found to harbor, upstream of their CCC-like motif, adenine/thymine-rich AAATTT-like or adenine-rich sequences that are positively correlated with high gene expression levels in Microsporidia [37]. Integrating all of these results, we propose that E. cuniculi, E. intestinalis, E. romaleae and E. hellem contain 2126, 1927, 1904 and 1955 CDSs, respectively.
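As a rough illustration of the promoter-signal check described above, the sketch below looks in a sequence upstream of a transcription start site for a CCC motif and then, further upstream, for an AAATTT motif or an adenine-rich stretch. This is a simplification under stated assumptions: the source speaks of "CCC-like" and "AAATTT-like" signals, whereas this code uses exact string matches, and the function name, the window size and the adenine threshold are hypothetical values chosen for the example, not parameters from the study.

```python
# Illustrative sketch only; exact matching stands in for the "-like" motifs in the text.
# Assumptions: window size (6) and adenine threshold (5) are hypothetical.
def promoter_signals(upstream, a_rich_window=6, a_rich_min=5):
    """Report simple signals in a sequence lying upstream of a transcription start.

    `upstream` is read 5'->3' and is assumed to end at the transcription start
    site. Positions are 0-based indices into `upstream`.
    """
    s = upstream.upper()
    ccc = s.rfind("CCC")  # CCC occurrence closest to the transcription start
    if ccc == -1:
        return {"ccc_pos": None, "aaattt_pos": None, "a_rich": False}

    region = s[:ccc]  # only look upstream of the CCC motif
    aaattt = region.rfind("AAATTT")
    # Adenine-rich if any window holds at least `a_rich_min` adenines.
    a_rich = any(
        region[i:i + a_rich_window].count("A") >= a_rich_min
        for i in range(max(0, len(region) - a_rich_window + 1))
    )
    return {
        "ccc_pos": ccc,
        "aaattt_pos": aaattt if aaattt != -1 else None,
        "a_rich": a_rich,
    }


# Toy sequences, not taken from the E. cuniculi genome.
with_at_motif = promoter_signals("GGAAATTTGGCCCGT")
with_a_stretch = promoter_signals("TAAAAAAGTCCC")
without_ccc = promoter_signals("GATTACA")
```

In practice, degenerate motifs are usually modelled with position weight matrices or regular expressions rather than exact substrings; the exact match here only keeps the example short.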

