targetTB: a target identification pipeline for Mycobacterium tuberculosis through an interactome, reactome and genome-scale structural analysis.

Raman K, Yeturu K, Chandra N - BMC Syst Biol (2008)

Bottom Line: The pipeline incorporates a network analysis of the protein-protein interactome, a flux balance analysis of the reactome, experimentally derived phenotype essentiality data, sequence analyses and a structural assessment of targetability, using novel algorithms recently developed by us. Further analyses include correlation with expression data and non-similarity to gut flora proteins as well as 'anti-targets' in the host, leading to the identification of 451 high-confidence targets. The method has the potential to be used as a general strategy for target identification and validation and hence significantly impact most drug discovery programmes.


Affiliation: Supercomputer Education and Research Centre and Bioinformatics Centre, Indian Institute of Science, Bangalore 560 012, India. karthik@rishi.serc.iisc.ernet.in

ABSTRACT

Background: Tuberculosis remains one of the deadliest infectious diseases, warranting the identification of newer targets and drugs. Identification and validation of appropriate targets for designing drugs are critical steps in drug discovery, and are at present major bottlenecks. A majority of drugs in current clinical use for many diseases have been designed without knowledge of their targets, perhaps because standard methodologies to identify such targets in a high-throughput fashion do not really exist. With the different kinds of 'omics' data that are now available, computational approaches can be a powerful means of obtaining short-lists of possible targets for further experimental validation.

Results: We report a comprehensive in silico target identification pipeline, targetTB, for Mycobacterium tuberculosis. The pipeline incorporates a network analysis of the protein-protein interactome, a flux balance analysis of the reactome, experimentally derived phenotype essentiality data, sequence analyses and a structural assessment of targetability, using novel algorithms recently developed by us. Using flux balance analysis and network analysis, proteins critical for survival of M. tuberculosis are first identified, followed by comparative genomics with the host, finally incorporating a novel structural analysis of the binding sites to assess the feasibility of a protein as a target. Further analyses include correlation with expression data and non-similarity to gut flora proteins as well as 'anti-targets' in the host, leading to the identification of 451 high-confidence targets. Through phylogenetic profiling against 228 pathogen genomes, shortlisted targets have been further explored to identify broad-spectrum antibiotic targets, while also identifying those specific to tuberculosis. Targets that address mycobacterial persistence and drug resistance mechanisms are also analysed.

Conclusion: The pipeline developed provides a rational schema for drug target identification that is likely to have a high rate of success and is expected to save enormous amounts of money, resources and time in the drug discovery process. A thorough comparison with previously suggested targets in the literature demonstrates the usefulness of the integrated approach used in our study, highlighting the importance of systems-level analyses in particular. The method has the potential to be used as a general strategy for target identification and validation and hence significantly impact most drug discovery programmes.



Figure 1: The targetTB Target Identification Pipeline. The funnel depicts the order in which the entire proteome of Mtb is considered and analysed at different layers. 'A' refers to the systems level studies, which includes A1, for network analysis of the interactome; A2, for flux balance analyses of the reactome; and A3, for genome-scale essentiality data determined experimentally as reported by Sassetti et al [23]. Those proteins that passed these filters are indicated as 'A', and combined with the results of sequence analysis (B), to derive those that passed both filters (depicted as 'A&B'). These were then taken through Filter C, referring to the structural assessment filter, yielding the list of 622 proteins as the D-List (A&B&C). Further steps of filtering are indicated in the smaller funnel as E (expression under various conditions), F (non-similarity to anti-targets) and G (non-similarity to gut flora proteins). Those proteins that pass all the six levels of filtering (indicated as D&E&F&G) form the H-List comprising 451 targets. Additional filters I, J and K used for analysing the H-List are also indicated. Lists A', C' and E' refer to the set of proteins at A, C and E levels, respectively, that could not be analysed for lack of appropriate data. Lists AX, BX, CX, EX, FX and GX refer to sets of proteins that failed in that particular filter, but may have passed at other levels.
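The funnel in the caption can be read as a sequence of set operations on protein identifiers. The following is a minimal sketch under stated assumptions, not the authors' implementation: the protein IDs are made up, the function name `run_funnel` is hypothetical, and combining A1, A2 and A3 by union is an assumption (the caption does not say how the three systems-level lists are merged). The lists A', C' and E' (proteins lacking data) are not modelled.

```python
def run_funnel(a1, a2, a3, b, c, e, f, g):
    """Apply the filters named in the Figure 1 caption to sets of protein IDs.

    Each argument is the set of proteins that PASSED the corresponding filter.
    """
    # ASSUMPTION: a protein passes Filter A if any systems-level analysis
    # (A1 network, A2 flux balance, A3 experimental essentiality) flags it.
    a = a1 | a2 | a3
    a_and_b = a & b          # stated in the caption as 'A&B'
    d_list = a_and_b & c     # stated in the caption as D-List (A&B&C)
    h_list = d_list & e & f & g  # stated in the caption as H-List (D&E&F&G)
    return {"A": a, "A&B": a_and_b, "D": d_list, "H": h_list}


# Toy input with hypothetical protein IDs p1..p5.
toy = run_funnel(
    a1={"p1", "p2"},
    a2={"p2", "p3"},
    a3={"p4"},
    b={"p1", "p2", "p3", "p5"},
    c={"p1", "p2", "p3"},
    e={"p1", "p2"},
    f={"p2", "p3"},
    g={"p1", "p2"},
)
print(sorted(toy["D"]), sorted(toy["H"]))
```

Keeping every intermediate set makes it easy to report how many proteins survive each stage, which is the kind of per-stage count the caption quotes (622 in the D-List, 451 in the H-List).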

Mentions: A range of analyses spanning multiple levels of abstraction has been carried out to identify plausible drug targets. The methodology can also be used more generally as a target identification pipeline applicable to many drug discovery programmes. Starting from the entire proteome of Mtb H37Rv, comprising 3,989 proteins, we have shortlisted 451 proteins as potential drug targets using a variety of filters, as depicted in Figs. 1 and 2. Fig. 1 gives a pictorial view of the targetTB pipeline, while Fig. 2 shows a simplified view of the pipeline as a flowchart, illustrating the flow of this study. We first carry out a network analysis, in which a full genome-scale interactome encoding several types of protein-protein interactions and protein-protein influences from metabolic pathways is reconstructed. Gene deletions that would significantly disrupt the network are then identified (List-A1). Next, we study the reactome through flux balance analysis (FBA) to identify lethal gene deletions (List-A2). This is further augmented with high-throughput gene essentiality data (List-A3). These system-level analyses together comprise Filter A. This is then integrated with sequence-level (Filter B) and structural (Filter C) analyses as described below (see Fig. 1). Expression of the gene encoding the target is highly desirable (Filter E), and the list is further pruned by eliminating targets with high similarity to known 'anti-targets' in the human proteome (Filter F) and to proteins in gut flora (Filter G). Those targets known to contribute to drug resistance in the pathogen are then prioritised. By analysis of similarity against several pathogenic proteomes, broad-spectrum targets as well as those unique to Mtb have also been identified. The various filters, lists and the numbers of proteins passed and eliminated at each stage of the pipeline are given in Table 2.
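To illustrate the idea behind List-A1 (gene deletions that significantly disrupt the network), here is a small sketch. It is an assumption-laden toy, not the method used in the study: the disruption measure (drop in the size of the largest connected component when one node is removed), the threshold `min_drop`, the function names and the five-node graph are all chosen here for illustration.

```python
from collections import deque


def largest_component(adj, removed=None):
    """Size of the largest connected component, skipping the `removed` node."""
    seen = set() if removed is None else {removed}
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        # Breadth-first search over the component containing `start`.
        while queue:
            node = queue.popleft()
            size += 1
            for neighbour in adj[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        best = max(best, size)
    return best


def disruptive_deletions(adj, min_drop):
    """Nodes whose removal shrinks the largest component by at least min_drop."""
    base = largest_component(adj)
    hits = []
    for node in adj:
        drop = base - largest_component(adj, removed=node)
        if drop >= min_drop:
            hits.append(node)
    return sorted(hits)


# Hypothetical undirected interaction graph: a-b, b-c, c-d, c-e.
toy_adj = {
    "a": {"b"},
    "b": {"a", "c"},
    "c": {"b", "d", "e"},
    "d": {"c"},
    "e": {"c"},
}
critical = disruptive_deletions(toy_adj, min_drop=2)
print(critical)
```

Removing any node reduces the component size by at least one (the node itself), so a threshold above one is needed to single out nodes whose loss actually fragments the network. Other disruption measures, such as changes in shortest path lengths, would slot into the same loop.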

