Network based integrated analysis of phenotype-genotype data for prioritization of candidate symptom genes.

Li X, Zhou X, Peng Y, Liu B, Zhang R, Hu J, Yu J, Jia C, Sun C - Biomed Res Int (2014)

Bottom Line: The proposed method produces a reliable ranked gene list, with an AUC (area under the curve) of 0.616 in classification. Some novel genes, such as CALCA, ESR1, and MTHFR, were predicted to be associated with headache symptoms; these associations are not recorded in the benchmark data set but have been reported in recently published literature. Our study demonstrates that integrating phenotype-genotype relationships into a complex network framework provides an effective approach to identifying candidate genes for symptoms.


Affiliation: School of Computer and Information Technology and Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, Beijing 100044, China.

ABSTRACT

Background: Symptoms and signs (hereafter, symptoms) are the essential clinical manifestations for individualized diagnosis and treatment in traditional Chinese medicine (TCM). To gain insight into the molecular mechanisms of symptoms, we developed a computational approach to identify candidate genes for symptoms.

Methods: This paper presents a network-based approach for the integrated analysis of multiple phenotype-genotype data sources and the prioritization of candidate genes for the associated symptoms. The method first calculates the similarities between symptoms and diseases based on symptom-disease relationships retrieved from the PubMed bibliographic database. Disease-gene associations and protein-protein interactions are then used to construct a phenotype-genotype network. Finally, the PRINCE algorithm is used to rank the potential genes for the associated symptoms.

Results: The proposed method produces a reliable ranked gene list, with an AUC (area under the curve) of 0.616 in classification. Some novel genes, such as CALCA, ESR1, and MTHFR, were predicted to be associated with headache symptoms; these associations are not recorded in the benchmark data set but have been reported in recently published literature.

Conclusions: Our study demonstrates that integrating phenotype-genotype relationships into a complex network framework provides an effective approach to identifying candidate genes for symptoms.


Figure 2: The approach for predicting symptom-associated genes using the PRINCE algorithm. A query symptom S is related, to varying degrees, to diseases d1–d5 (line thickness represents the degree of correlation between the symptom and each disease). Proteins p1–p9 form a protein-protein interaction network, in which line thickness denotes the confidence of each interaction. PRINCE uses iterative propagation to assign a score to each protein; proteins with higher scores are considered candidate causal genes for symptom S.

Mentions: The network-based disease gene prediction approach PRINCE is used to predict the genes associated with a symptom. The PRINCE algorithm is initialized with the symptom-disease correlations, disease-gene associations, and protein-protein interactions. It uses a propagation-based algorithm [26] to infer a scoring function that estimates the strength of an association. A score is defined for each gene, reflecting the prior information about that gene's relation to the associated diseases. This score is then combined with a PPI network to identify proteins involved in the given symptom, as shown in Figure 2.
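
The propagation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the commonly described PRINCE-style update F ← αW′F + (1 − α)Y, where W′ is the degree-normalized PPI weight matrix and Y is the prior score vector. The function names, the value α = 0.9, the rule for building the prior (the highest symptom-disease similarity among the diseases linked to a gene), and the toy network are all hypothetical choices made for illustration.

```python
import math

def normalize(weights):
    """Symmetric degree normalization: w'[i][j] = w[i][j] / sqrt(d_i * d_j)."""
    n = len(weights)
    degree = [sum(row) for row in weights]
    return [
        [
            weights[i][j] / math.sqrt(degree[i] * degree[j])
            if degree[i] > 0 and degree[j] > 0 else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]

def propagate(weights, prior, alpha=0.9, tol=1e-9, max_iter=1000):
    """Iterate F <- alpha * W' F + (1 - alpha) * Y until the scores stop changing."""
    w_norm = normalize(weights)
    n = len(prior)
    scores = list(prior)
    for _ in range(max_iter):
        updated = [
            alpha * sum(w_norm[i][j] * scores[j] for j in range(n))
            + (1 - alpha) * prior[i]
            for i in range(n)
        ]
        if max(abs(a - b) for a, b in zip(updated, scores)) < tol:
            return updated
        scores = updated
    return scores

def symptom_prior(symptom_disease_sim, disease_genes, proteins):
    """Assumed prior Y: a protein gets the highest symptom-disease similarity
    among the diseases associated with its gene, and 0 if there is none."""
    prior = {p: 0.0 for p in proteins}
    for disease, sim in symptom_disease_sim.items():
        for gene in disease_genes.get(disease, ()):
            if gene in prior:
                prior[gene] = max(prior[gene], sim)
    return [prior[p] for p in proteins]

# Hypothetical toy data: four proteins in a chain, two diseases related to symptom S.
proteins = ["p1", "p2", "p3", "p4"]
ppi = [  # symmetric interaction confidences
    [0.0, 0.9, 0.0, 0.0],
    [0.9, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.7],
    [0.0, 0.0, 0.7, 0.0],
]
symptom_disease_sim = {"d1": 0.8, "d2": 0.3}
disease_genes = {"d1": ["p1"], "d2": ["p3"]}

prior = symptom_prior(symptom_disease_sim, disease_genes, proteins)
scores = propagate(ppi, prior)
# Higher score = stronger candidate for the query symptom.
ranking = sorted(zip(proteins, scores), key=lambda item: item[1], reverse=True)
```

In this sketch, proteins with no direct disease association (p2 and p4) still receive nonzero scores through their neighbors, which is the property that lets a propagation-based method suggest genes absent from the benchmark associations. The parameter α controls how much weight the network carries relative to the prior.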

