TargetMine, an integrated data warehouse for candidate gene prioritisation and target discovery.

Chen YA, Tripathi LP, Mizuguchi K - PLoS ONE (2011)

Bottom Line: An integrated approach that combines results from multiple data types is best suited for optimal target selection. TargetMine enables complicated searches that are difficult to perform with existing tools and also offers integration of custom annotations and in-house experimental data. The results show that the protocol can identify known disease-associated genes with high precision and coverage.


Affiliation: National Institute of Biomedical Innovation, Saito-Asagi, Ibaraki, Osaka, Japan.

ABSTRACT
Prioritising candidate genes for further experimental characterisation is a non-trivial challenge in drug discovery and biomedical research in general. An integrated approach that combines results from multiple data types is best suited for optimal target selection. We developed TargetMine, a data warehouse for efficient target prioritisation. TargetMine utilises the InterMine framework, with new data models such as protein-DNA interactions integrated in a novel way. It enables complicated searches that are difficult to perform with existing tools and it also offers integration of custom annotations and in-house experimental data. We proposed an objective protocol for target prioritisation using TargetMine and set up a benchmarking procedure to evaluate its performance. The results show that the protocol can identify known disease-associated genes with high precision and coverage. A demonstration version of TargetMine is available at http://targetmine.nibio.go.jp/.


Figure 2 (pone-0017844-g002): A schematic representation of the suggested objective protocol for candidate gene prioritisation with TargetMine.

Mentions: Our general protocol for target prioritisation using TargetMine is shown in Figure 2. First, we upload a list of initial candidate genes or proteins (e.g., a set of differentially expressed genes or a set of proteins that interact with a given protein) to TargetMine to create a TargetMine gene list. Enrichment of specific biological themes (including, but not limited to, KEGG pathways, Gene Ontology (GO) terms [49] and OMIM phenotypes) associated with the initial list is estimated using the hypergeometric distribution, and the inferred p-values are adjusted for multiple testing to control the false discovery rate using the Benjamini and Hochberg procedure [50]. The significantly enriched biological associations (those that satisfied, in this instance, p ≤ 0.05 after the Benjamini and Hochberg correction) can be visualised in the individual enrichment widgets. We gather the genes mapped to the top N significant associations (where N = 1, 2, 3, …, an adjustable value reflecting incrementally relaxed thresholds) retrieved from the KEGG (A), GO Biological Process (B) and OMIM (C) databases into separate lists and merge them (for example, by taking the union A∪B∪C of the retrieved genes) to infer the corresponding sets of prioritised genes, although no ranking is currently provided. (We assume that an initial candidate list is from a single species and the enrichment calculation is performed using the data for this species only.)
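The protocol described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not TargetMine's implementation: the function names, the toy gene identifiers (`g1`, `g2`, ...), the term names and the background size of 20 genes are all hypothetical, and the sketch assumes that only genes from the initial candidate list are gathered from each enriched term. It computes a one-sided hypergeometric p-value per term, applies the Benjamini and Hochberg adjustment, keeps the top N terms with adjusted p ≤ 0.05 from each source, and takes the union A∪B∪C.

```python
from math import comb

def hypergeom_pvalue(k, n, K, N):
    """P(X >= k): chance of seeing at least k annotated genes when drawing
    n genes without replacement from N background genes, K of them annotated."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total

def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def top_n_genes(candidates, annotations, background_size, n_top, alpha=0.05):
    """Candidate genes mapped to the top `n_top` significantly enriched terms."""
    terms = sorted(annotations)
    pvals = [
        hypergeom_pvalue(
            len(candidates & annotations[t]),  # annotated genes in the list
            len(candidates),                   # size of the candidate list
            len(annotations[t]),               # annotated genes in background
            background_size,
        )
        for t in terms
    ]
    adjusted = benjamini_hochberg(pvals)
    significant = sorted((q, t) for q, t in zip(adjusted, terms) if q <= alpha)
    selected = set()
    for _, term in significant[:n_top]:
        selected |= candidates & annotations[term]
    return selected

# Hypothetical toy data: 5 candidate genes in a background of 20 genes.
candidates = {"g1", "g2", "g3", "g4", "g5"}
kegg = {
    "pathway_x": {"g1", "g2", "g3", "g4"},
    "pathway_y": {"g5", "g10", "g11", "g12", "g13", "g14"},
}
go_bp = {"process_z": {"g2", "g3", "g5", "g6"}}
omim = {"phenotype_w": {"g1", "g15", "g16"}}

list_a = top_n_genes(candidates, kegg, background_size=20, n_top=1)
list_b = top_n_genes(candidates, go_bp, background_size=20, n_top=1)
list_c = top_n_genes(candidates, omim, background_size=20, n_top=1)

# Merge the three lists by union; as in the protocol, no ranking is produced.
prioritised = list_a | list_b | list_c
print(sorted(prioritised))
```

Raising `n_top` corresponds to the incrementally relaxed thresholds mentioned in the protocol. Replacing the union with an intersection would give a stricter merged list, at the cost of coverage.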

