Evidence-based annotation of the malaria parasite's genome using comparative expression profiling.

Zhou Y, Ramachandran V, Kumar KA, Westenberger S, Refour P, Zhou B, Li F, Young JA, Chen K, Plouffe D, Henson K, Nussenzweig V, Carlton J, Vinetz JM, Duraisingh MT, Winzeler EA - PLoS ONE (2008)

Bottom Line: Here we perform comparative expression analysis on Plasmodium parasite life cycle data derived from P. falciparum blood, sporozoite, zygote and ookinete stages, and P. yoelii mosquito oocyst and salivary gland sporozoites, blood and liver stages and show that type II fatty acid biosynthesis genes are upregulated in liver and insect stages relative to asexual blood stages. We also show that some universally uncharacterized genes with orthologs in Plasmodium species, Saccharomyces cerevisiae and humans show coordinated transcription patterns in large collections of human and yeast expression data and that the function of the uncharacterized genes can sometimes be predicted based on the expression patterns across these diverse organisms. We also use a comprehensive and unbiased literature mining method to predict which uncharacterized parasite-specific genes are likely to have roles in processes such as gliding motility, host-cell interactions, sporozoite stage, or rhoptry function.


Affiliation: Genomics Institute of the Novartis Research Foundation, San Diego, California, USA.

ABSTRACT
A fundamental problem in systems biology and whole genome sequence analysis is how to infer functions for the many uncharacterized proteins that are identified, whether they are conserved across organisms of different phyla or are phylum-specific. This problem is especially acute in pathogens, such as malaria parasites, where genetic and biochemical investigations are likely to be more difficult. Here we perform comparative expression analysis on Plasmodium parasite life cycle data derived from P. falciparum blood, sporozoite, zygote and ookinete stages, and P. yoelii mosquito oocyst and salivary gland sporozoites, blood and liver stages and show that type II fatty acid biosynthesis genes are upregulated in liver and insect stages relative to asexual blood stages. We also show that some universally uncharacterized genes with orthologs in Plasmodium species, Saccharomyces cerevisiae and humans show coordinated transcription patterns in large collections of human and yeast expression data and that the function of the uncharacterized genes can sometimes be predicted based on the expression patterns across these diverse organisms. We also use a comprehensive and unbiased literature mining method to predict which uncharacterized parasite-specific genes are likely to have roles in processes such as gliding motility, host-cell interactions, sporozoite stage, or rhoptry function. These analyses, together with protein-protein interaction data, provide probabilistic models that predict the function of 926 uncharacterized malaria genes and also suggest that malaria parasites may provide a simple model system for the study of some human processes. These data also provide a foundation for further studies of transcriptional regulation in malaria parasites.


Figure 1 (pone-0001570-g001): Temporal expression patterns were constructed from 54 P. falciparum and P. yoelii life cycle samples. A total of 156 statistically enriched gene clusters identified by OPI analysis illustrate the transcriptional regulation characteristics of all key biological processes in Plasmodium species. Their yeast and human ortholog content is represented by the white-blue heatmap, indicating that parasite-specific processes generally have fewer orthologs in model organisms. The percentage of proteins that form statistically significant within-cluster networks is also white-blue color coded; most networks occur in blood stage processes. Altogether 33 manuscripts were identified with significant overlap to the clusters, nine of which [13], [27], [28], [36], [42], [60], [68], [69], [70] are referenced in the figure. Two clusters were enriched for proteins predicted to have a parasite export signal [68] and were labeled “Exported proteins (sporozoite)” and “Exported proteins (trophozoite)”; one peaks in the trophozoite stage and the other peaks in sporozoite stages (see GO:PM15591202_Trp and GO:PM15591202_Spo in Table S1-S2). S & T indicate that the P. falciparum parasites were synchronized within the asexual cycle by the thermocycling or sorbitol method [9]. The figure does not comprehensively describe all gene expression patterns contained within the data: ∼1,026 genes are not found in any of the groups depicted here because they do not share expression patterns with a sufficient number of previously characterized genes.

Mentions: Applying OPI resulted in 98 non-redundant clusters derived from gene ontologies that were highly enriched for processes such as glycolysis or protein synthesis, or for cellular components such as the proteasome core complex (Figure 1, Table S1). While we had previously performed a similar analysis on a limited set of sexual development and erythrocytic stage P. falciparum data, the addition of the new data from oocyst and salivary gland sporozoite stages and the creation of combined expression vectors with data from both human and rodent parasites substantially improved the quality of the predictions and allowed the separation of genes which had previously been grouped together. In a previous analysis of sexual development and asexual cycles we identified 246 genes associated with gametocytogenesis, which included the genes involved in type II fatty acid biosynthesis such as PF11_0256, the pyruvate dehydrogenase E1 component. Here we can show that type II fatty acid biosynthesis genes, which are upregulated during sexual development, are also upregulated in liver stage development, whereas other genes from that set are not. For this study the p-values for functional enrichment, calculated with the accumulated hypergeometric distribution, ranged from 10^−69.0 to 10^−8.1. An example of one of the clusters is shown in Table 2, which lists the P. falciparum genes in a group enriched for the cellular component nucleolus (GO:0005730); the other clusters can be downloaded as additional data files from the companion website, http://carrier.gnf.org/publications/Py. This group contains eight of the fourteen annotated P. falciparum or P. yoelii nucleolus genes in a group of 28. Given that 6,592 genes were considered in this analysis, the probability of enrichment by chance is very low (p = 10^−15.3).
Not only is the p-value low, but it is likely much higher (that is, less significant) than it should be, because evidence compiled independently indicates that almost every “hypothetical” gene in the cluster has a yeast ortholog, most with likely roles in RNA polymerase I processing and transcription (Table 2). There are numerous other similar examples of the quality of the expression data as evidenced by the functional enrichments. For example, of the twelve genes in GO group GO:0005663 (DNA replication factor C complex), twelve are found in a cluster of 27 (p = 10^−28.9), with most of the other genes having a role in DNA replication. Of the 12 components of the chaperonin-containing T complex (GO:0005832), ten are contained in a group of 12 genes (p = 10^−27.2). The gluconeogenesis cluster contains 9 of the 13 annotated genes in a group of 14 (p = 10^−21.9), with lactate dehydrogenase considered a miss. The patterns of gene regulation for the different functionally enriched categories can be seen by following the “OPI Web Portal” link on the companion website, and the representative profile for each cluster is shown in Figure 1.
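The enrichment p-values quoted above come from an accumulated (upper-tail) hypergeometric distribution. The following is a minimal sketch of that calculation, not the authors' OPI implementation: the function name and argument names are hypothetical, and it assumes a plain one-sided tail over a universe of 6,592 genes. Because OPI also optimizes the cluster threshold, and its exact gene universe and counting may differ, the results should be of similar magnitude to the reported values without matching them exactly.

```python
from math import comb, log10


def log10_enrichment_p(universe, annotated, cluster, hits):
    """Return log10 of P(X >= hits) for a hypergeometric draw.

    universe  -- total genes considered (6,592 in the text)
    annotated -- genes carrying the GO annotation
    cluster   -- size of the expression cluster
    hits      -- annotated genes found inside the cluster
    (All parameter names are illustrative, not taken from the paper.)
    """
    upper = min(annotated, cluster)
    # Count the ways to draw at least `hits` annotated genes, using exact
    # integer arithmetic so very small probabilities do not underflow.
    tail = sum(
        comb(annotated, i) * comb(universe - annotated, cluster - i)
        for i in range(hits, upper + 1)
    )
    return log10(tail) - log10(comb(universe, cluster))


# Counts as quoted in the text: (annotated, cluster size, hits).
examples = {
    "nucleolus (GO:0005730)": (14, 28, 8),
    "DNA replication factor C (GO:0005663)": (12, 27, 12),
    "chaperonin-containing T complex (GO:0005832)": (12, 12, 10),
    "gluconeogenesis": (13, 14, 9),
}

for name, (annotated, cluster, hits) in examples.items():
    value = log10_enrichment_p(6592, annotated, cluster, hits)
    print(f"{name}: p = 10^{value:.1f}")
```

Working in log10 with exact binomial coefficients keeps the arithmetic stable for probabilities this small; a floating-point survival function would serve equally well for less extreme values.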

