The Prediction and Validation of Small CDSs Expand the Gene Repertoire of the Smallest Known Eukaryotic Genomes.

Belkorchia A, Gasc C, Polonais V, Parisot N, Gallois N, Ribière C, Lerat E, Gaspin C, Pombert JF, Peyret P, Peyretaillade E - PLoS ONE (2015)

Bottom Line: To date, sequencing and annotation of microsporidian genomes have revealed a poor gene complement with highly reduced gene sizes. Most of the newly found genes are present in other distantly related microsporidian species, suggesting their biological relevance. The present study provides a better framework for annotating microsporidian genomes and for training and evaluating new computational methods dedicated to detecting ultra-small genes in various organisms.


Affiliation: Clermont Université, Université d'Auvergne, Laboratoire "Microorganismes: Génome et Environnement", BP 10448, F-63000, Clermont-Ferrand, France; CNRS, UMR 6023, LMGE, F-63171, Aubière, France.

ABSTRACT
The proper prediction of the gene catalogue of an organism is essential to obtain a representative snapshot of its overall lifestyle, especially when it is not amenable to culturing. Microsporidia are obligate intracellular, sometimes hard to culture, eukaryotic parasites known to infect members of every animal phylum. To date, sequencing and annotation of microsporidian genomes have revealed a poor gene complement with highly reduced gene sizes. In the present paper, we investigated whether such gene sizes may have biased the methodologies used for genome annotation, with an emphasis on the prediction of small coding sequences (CDSs). Using better delineated intergenic regions from four Encephalitozoon genomes, we predicted de novo new small CDSs with sizes ranging from 78 to 255 bp (median 168) and corroborated these predictions by RACE-PCR experiments in Encephalitozoon cuniculi. Most of the newly found genes are present in other distantly related microsporidian species, suggesting their biological relevance. The present study provides a better framework for annotating microsporidian genomes and for training and evaluating new computational methods dedicated to detecting ultra-small genes in various organisms.
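The CDS sizes quoted in the abstract (78 to 255 bp, median 168) can be related to protein lengths with simple codon arithmetic. The sketch below is illustrative only: the function name `cds_bp_to_aa` is hypothetical, and it assumes that the reported CDS length counts the stop codon, which the source does not state explicitly. Under that assumption the three values convert to 25, 55 and 84 amino acids, which agrees with the protein lengths reported for E. cuniculi in the body of the paper (25 to 84 amino acids, median 55).

```python
def cds_bp_to_aa(cds_bp: int, includes_stop: bool = True) -> int:
    """Convert a CDS length in base pairs to a protein length in amino acids.

    Assumption (not stated in the source): the CDS length counts the
    terminal stop codon, so one codon is subtracted.
    """
    if cds_bp <= 0 or cds_bp % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of 3")
    codons = cds_bp // 3
    return codons - 1 if includes_stop else codons


# Minimum, median and maximum CDS sizes quoted in the abstract.
for bp in (78, 168, 255):
    print(f"{bp} bp -> {cds_bp_to_aa(bp)} aa")
```

If the stop codon were not counted, the same lengths would correspond to 26, 56 and 85 amino acids, so the agreement with the reported protein sizes is what motivates the assumption here.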



pone.0139075.g002: Validation example of the newly predicted orthologs using both protein and nucleotide sequence alignments. Protein and nucleotide alignments were performed using MUSCLE and Clustal Omega, respectively.

Mentions: Thereafter, using the curated annotations described above, we searched for the presence of short protein-coding gene candidates. Specifically, we searched for transcriptional and/or translational signals in intergenic regions that flanked small open reading frames (ORFs), with the condition that both signals and ORFs were conserved across the Encephalitozoon genomes. Using this approach, a total of 31 small but highly conserved CDSs were identified in the four Encephalitozoon species (Fig 1, Table 1 and S3 Table). Another small CDS (sCDS) was also found to be shared between E. cuniculi (ECU04_1635) and E. romaleae (EROM_041665). However, its presence could not be ascertained in E. hellem and E. intestinalis because its location, based on syntenic information, falls within unsequenced regions. The proteins encoded by the newly identified small CDSs range from 25 to 84 amino acids in E. cuniculi (median 55; Table 1) and generally show a high level of similarity across the four Encephalitozoon species, averaging 72% (min 46%, max 96%; Fig 2 and S1 Fig).
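To make the idea of scanning an intergenic region for small ORFs concrete, here is a minimal sketch in Python. It is not the authors' pipeline: it only enumerates ATG-to-stop ORFs on both strands within a size window, and it does not test for the transcriptional or translational signals, or for the cross-species conservation, that the search described above requires. The function names are hypothetical, the default size bounds are borrowed from the 78 to 255 bp range given in the abstract, and the use of the standard stop codons is an assumption.

```python
from typing import List, Tuple

# Assumption: standard stop codons.
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_small_orfs(
    seq: str, min_bp: int = 78, max_bp: int = 255
) -> List[Tuple[str, int, int, str]]:
    """List ATG..stop ORFs whose length (stop included) is within [min_bp, max_bp].

    Returns (strand, start, end, orf_sequence) tuples. Coordinates are
    0-based, end-exclusive, and relative to the scanned strand, so "-"
    hits are positioned on the reverse complement of the input.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG since the last stop
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    length = i + 3 - start
                    if min_bp <= length <= max_bp:
                        hits.append((strand, start, i + 3, s[start:i + 3]))
                    start = None
    return hits


# Usage on a made-up sequence (not real Encephalitozoon data).
toy_region = "CC" + "ATG" + "GCT" * 25 + "TAA" + "GG"
for strand, start, end, orf in find_small_orfs(toy_region):
    print(strand, start, end, len(orf))
```

Because each reading frame keeps only the first ATG after a stop, the sketch reports the longest ORF per stop codon; candidates found this way would still need the signal and conservation checks before being treated as gene predictions.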

