Non-coding RNA gene families in the genomes of anopheline mosquitoes.

Dritsou V, Deligianni E, Dialynas E, Allen J, Poulakakis N, Louis C, Lawson D, Topalis P - BMC Genomics (2014)

Bottom Line: Our analysis was carried out using, exclusively, computational approaches, and evaluating both the primary NGS reads as well as the respective genome assemblies produced by the consortium and stored in VectorBase; moreover, the results of RNAseq surveys in cases in which these were available and meaningful were also accessed in order to obtain supplementary data, as were "pre-genomic era" sequence data stored in nucleic acid databases. Our study led to the identification of members of these gene families in the majority of twenty different anopheline taxa. A set of tools for the study of the evolution and molecular biology of important disease vectors has, thus, been obtained.


Affiliation: Institute of Molecular Biology and Biotechnology, FORTH, Heraklion, Greece. topalis@imbb.forth.gr.

ABSTRACT

Background: Only a small fraction of the mosquito species of the genus Anopheles can transmit malaria, one of the deadliest diseases of poverty, which is most prevalent in the tropics. This variation in the ability to transmit malaria has genetic causes that remain unknown. To help elucidate these differences, the international "Anopheles Genomes Cluster Consortium" project (a.k.a. "16 Anopheles genomes project") was established, aiming at a comprehensive genomic analysis of several anopheline species, most of which are malaria vectors. Within this consortium, our team studied the genes encoding families of non-coding RNAs (ncRNAs), concentrating on four classes: microRNA (miRNA), ribosomal RNA (rRNA), small nuclear RNA (snRNA; in particular small nucleolar RNA, snoRNA) and transfer RNA (tRNA).

Results: Our analysis used computational approaches exclusively, evaluating both the primary NGS reads and the respective genome assemblies produced by the consortium and stored in VectorBase. Where available and meaningful, the results of RNAseq surveys were also consulted to obtain supplementary data, as were "pre-genomic era" sequence data stored in nucleic acid databases. The investigation included the identification and analysis, in most species studied, of ncRNA genes belonging to several families, as well as the analysis of the evolutionary relations of some of those genes in cross-comparisons to other members of the genus Anopheles.

Conclusions: Our study led to the identification of members of these gene families in the majority of twenty different anopheline taxa. A set of tools for the study of the evolution and molecular biology of important disease vectors has, thus, been obtained.


Fig1: Alignments of genes encoding the 5.8S ribosomal RNA. The species examined are shown to the left of the sequences. Nucleotides highlighted in green differ from those found at the corresponding position in the consensus sequence. Dashes indicate gaps introduced to improve the alignment; dots indicate sequence that was not identified. Capitalized letters in the consensus sequence indicate the extent of the 5.8S RNA in D. melanogaster. The three underlined nucleotides highlighted in yellow in the Anopheles consensus sequence mark the terminal nucleotides of the three RNA species identified through the analysis of RNAseq experiments. The base numbering refers to the consensus sequence.

Mentions: BLAST searches of both the assembled genomes and the SRA collections of reads generated in this project allowed us to identify the 5.8S rRNA gene homologues in all species examined; in 16 out of 19 we were able to assemble a segment coding for the full 5.8S species. Initially we used the D. melanogaster sequence as a query, then switched to that of A. gambiae once this was unambiguously identified. Figure 1 shows an alignment of the consensus sequences of the individual species; an overall consensus sequence for all anophelines was assembled from those of the 19 species examined. We stress that the consensus of each individual taxon is based on BLAST searches of the primary sequence reads. The output was necessarily biased towards sequences more similar to the BLAST query, so these consensus sequences should not be considered “statistical representatives” of all reads present in the SRA database. As expected, Figure 1 shows a very high degree of sequence conservation, ranging from 100% in comparisons between members of the A. gambiae s.l. species complex to ~89% when the sequence of A. darlingi is compared to the consensus sequence determined from all species examined (average >96%). Most of the polymorphisms seen in A. darlingi are clustered towards the 3′ end of the mature 5.8S molecule, following the pattern detected for the overall comparison: we determined a total of 42 polymorphic sites, of which 7 (17%) are found in the 5′-most 60 nucleotides, 13 (31%) in the next 60 bases, and the remaining 22 (52%) over the last segment of the gene.
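The quantities reported in this passage (an overall consensus, per-species identity to that consensus, and the distribution of polymorphic sites along the gene) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the majority-rule consensus, the decision to count a gap as a character state, and the toy sequences are all assumptions made for illustration. Only the counts 7, 13 and 42 and the 60-nucleotide windows come from the text.

```python
from collections import Counter

GAP = "-"      # alignment gap, as in the figure legend
MISSING = "."  # sequence that was not identified, as in the figure legend


def consensus(aligned):
    """Majority-rule consensus of equal-length aligned sequences (assumed rule)."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")
    out = []
    for column in zip(*aligned):
        counts = Counter(c.upper() for c in column if c != MISSING)
        out.append(counts.most_common(1)[0][0] if counts else MISSING)
    return "".join(out)


def percent_identity(seq, cons):
    """Percent of positions matching the consensus, skipping unidentified positions."""
    pairs = [(a.upper(), b.upper()) for a, b in zip(seq, cons)
             if a != MISSING and b != MISSING]
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def polymorphic_sites(aligned):
    """1-based alignment positions with more than one state (gaps count as a state)."""
    sites = []
    for pos, column in enumerate(zip(*aligned), start=1):
        states = {c.upper() for c in column if c != MISSING}
        if len(states) > 1:
            sites.append(pos)
    return sites


def bin_sites(sites, width=60):
    """Count polymorphic sites per consecutive window of `width` positions."""
    return dict(Counter((pos - 1) // width for pos in sites))


def segment_shares(counts):
    """Rounded percentage of the total contributed by each segment count."""
    total = sum(counts)
    return [round(100 * n / total) for n in counts]


# Invented toy alignment, NOT data from the paper.
toy = {
    "sp_A": "ACGTACGTAC",
    "sp_B": "ACGTACGTAC",
    "sp_C": "ACGAACGTTC",
    "sp_D": "ACGTAC-T..",
}
cons = consensus(list(toy.values()))
for name, seq in toy.items():
    print(name, round(percent_identity(seq, cons), 1))
print(bin_sites(polymorphic_sites(list(toy.values())), width=5))

# Segment counts from the text: 7 and 13 of 42 sites, leaving 22 in the last segment.
print(segment_shares([7, 13, 42 - 7 - 13]))
```

Because the per-species consensus sequences in the paper come from BLAST hits that are biased towards the query, identity values computed this way describe the recovered sequences and not the full read population.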

