Improved orthologous databases to ease protozoan targets inference.

Kotowski N, Jardim R, Dávila AM - Parasit Vectors (2015)

Bottom Line: Our methodology allows OrthoSearch to confront two orthologous databases and to generate an improved new one. We also provide the number of species-specific, pair-to-pair and core groups from such analyses, depicted in Venn diagrams. The orthologous databases generated by our HMM-based methodology provide a broader dataset, with larger amounts of orthologous groups when compared to the original databases used as input.


Affiliation: Computational and Systems Biology Laboratory, Oswaldo Cruz Institute, FIOCRUZ, Avenida Brasil, 4365, 21040-360, Rio de Janeiro, RJ, Brazil. nelson.peixoto@fiocruz.br.

ABSTRACT

Background: Homology inference helps identify similarities and differences among organisms, which provides better insight into how closely related one organism is to another. In addition, comparative genomics pipelines are widely adopted tools built from different bioinformatics applications and algorithms. In this article, we propose a methodology to build improved orthologous databases with the potential to aid protozoan target identification, one of the many tasks that benefit from comparative genomics tools.

Methods: Our analyses are based on OrthoSearch, a comparative genomics pipeline originally designed to infer orthologs through protein-profile comparison, supported by an approach based on HMMs and reciprocal best hits. Our methodology allows OrthoSearch to compare two orthologous databases against each other and generate an improved new one. The resulting database can later be used to infer potential protozoan targets through a similarity analysis against the human genome.
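The reciprocal best hits criterion mentioned in the Methods can be illustrated with a minimal sketch. This is not OrthoSearch's implementation: the score dictionaries, the identifiers (`p1`, `og1`, ...) and the function names are hypothetical, and in a real run the scores would come from HMM profile searches rather than hand-written values.

```python
from typing import Dict, List, Tuple

# (query, target) -> similarity score; higher means a better hit.
Scores = Dict[Tuple[str, str], float]


def best_hits(scores: Scores) -> Dict[str, str]:
    """Map each query to its single highest-scoring target."""
    best: Dict[str, Tuple[str, float]] = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(forward: Scores, reverse: Scores) -> List[Tuple[str, str]]:
    """Keep (a, b) only when b is a's best hit and a is b's best hit."""
    fwd = best_hits(forward)
    rev = best_hits(reverse)
    return sorted((a, b) for a, b in fwd.items() if rev.get(b) == a)


# Hypothetical toy scores: proteins p1..p3 searched against group profiles,
# and the same profiles searched back against the proteins.
forward = {
    ("p1", "og1"): 120.0, ("p1", "og2"): 40.0,
    ("p2", "og1"): 90.0, ("p2", "og2"): 95.0,
    ("p3", "og2"): 60.0,
}
reverse = {
    ("og1", "p1"): 118.0, ("og1", "p2"): 85.0,
    ("og2", "p2"): 97.0, ("og2", "p3"): 55.0,
}
pairs = reciprocal_best_hits(forward, reverse)
print(pairs)
```

In this toy input, `p3` has `og2` as its best hit, but `og2` prefers `p2`, so `p3` is left unpaired; only mutually best pairs survive.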

Results: The protein sequences of the Cryptosporidium hominis, Entamoeba histolytica and Leishmania infantum genomes were comparatively analyzed against three orthologous databases: (i) EggNOG KOG, (ii) ProtozoaDB and (iii) KEGG Orthology (KO). This allowed us to create two new orthologous databases, "KO + EggNOG KOG" and "KO + EggNOG KOG + ProtozoaDB", with 16,938 and 27,701 orthologous groups, respectively. These new databases were then used in a regular OrthoSearch run. By comparing the "KO + EggNOG KOG" and "KO + EggNOG KOG + ProtozoaDB" databases against the protozoan species, we detected the following totals of orthologous groups and coverage (relation between the inferred orthologous groups and the species total number of proteins): Cryptosporidium hominis: 1,821 (11%) and 3,254 (12%); Entamoeba histolytica: 2,245 (13%) and 5,305 (19%); Leishmania infantum: 2,702 (16%) and 4,760 (17%). Using our HMM-based methodology and the largest database created, it was possible to infer 13 orthologous groups that represent potential protozoan targets; these were found because of our distant homology approach. We also provide the number of species-specific, pair-to-pair and core groups from such analyses, depicted in Venn diagrams.
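The species-specific, pair-to-pair and core categories shown in the Venn diagrams can be counted with plain set membership. The sketch below assumes three species and uses made-up group identifiers (`OG1` to `OG6`); it does not reproduce the article's actual counts.

```python
from typing import Dict, Set


def venn_counts(groups: Dict[str, Set[str]]) -> Dict[str, int]:
    """Classify each orthologous group by the number of species it occurs in.

    With three species: 1 -> species-specific, 2 -> pair-to-pair, 3 -> core.
    """
    all_groups: Set[str] = set().union(*groups.values())
    counts = {"species_specific": 0, "pair_to_pair": 0, "core": 0}
    for og in all_groups:
        n_species = sum(og in members for members in groups.values())
        if n_species == 1:
            counts["species_specific"] += 1
        elif n_species == len(groups):
            counts["core"] += 1
        else:
            counts["pair_to_pair"] += 1
    return counts


# Hypothetical group assignments, for illustration only.
groups = {
    "Cryptosporidium hominis": {"OG1", "OG2", "OG3"},
    "Entamoeba histolytica": {"OG1", "OG2", "OG4"},
    "Leishmania infantum": {"OG1", "OG5", "OG6"},
}
counts = venn_counts(groups)
print(counts)
```

Here `OG1` is shared by all three species (core), `OG2` by two of them (pair-to-pair), and the remaining four groups each occur in a single species.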

Conclusions: The orthologous databases generated by our HMM-based methodology provide a broader dataset, with more orthologous groups than the original databases used as input. They may be used for several homology inference analyses, annotation tasks and protozoan target identification.


Fig. 4: OrthoSearch inferred orthologous groups and coverage, per organism, with the databases created by the methodology itself. A detailed view of how many orthologous groups were inferred with the (i) “KO + EggNOG KOG” and (ii) “KO + EggNOG KOG + ProtozoaDB” databases, and what those numbers represent relative to each organism's total number of proteins.

Mentions: With these recently created n-ODs, in our second scenario we executed OrthoSearch using the n-ODs and the same three protozoan species as input, then compared the results against the previous KO analysis. Figure 4 depicts coverage percentages for each of the OG databases created by the methodology itself, for each organism adopted.
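As a worked check of the coverage figures quoted in the abstract, the sketch below recomputes the percentages from the reported counts. One assumption is made and should be read with care: the denominator used here is the number of orthologous groups in each database (16,938 and 27,701). The abstract describes coverage relative to each species' total number of proteins, but those totals are not given in this text, and the reported percentages happen to be reproduced with the database sizes.

```python
# Counts taken from the abstract; the choice of denominator is an assumption.
DB_SIZES = {
    "KO + EggNOG KOG": 16938,
    "KO + EggNOG KOG + ProtozoaDB": 27701,
}

# Inferred orthologous groups per species, in the same database order.
INFERRED = {
    "Cryptosporidium hominis": (1821, 3254),
    "Entamoeba histolytica": (2245, 5305),
    "Leishmania infantum": (2702, 4760),
}


def coverage_percent(inferred: int, total: int) -> int:
    """Coverage as a whole-number percentage of `total`."""
    return round(100 * inferred / total)


coverage = {
    species: tuple(
        coverage_percent(n, size) for n, size in zip(counts, DB_SIZES.values())
    )
    for species, counts in INFERRED.items()
}

for species, (small_db, large_db) in coverage.items():
    print(f"{species}: {small_db}% and {large_db}%")
```

For example, 1,821 / 16,938 is about 10.75%, which rounds to the 11% reported for Cryptosporidium hominis with the "KO + EggNOG KOG" database.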

