Non-parametric change-point method for differential gene expression detection.

Wang Y, Wu C, Ji Z, Wang B, Liang Y - PLoS ONE (2011)

Bottom Line: NPCPS is based on change-point theory, which gives it effective differential gene expression (DGE) detection ability. An estimate of the change-point position generated by NPCPS enables identification of the samples containing DGE. Experimental results showed both good accuracy and reliability for NPCPS.


Affiliation: Key Laboratory for Symbol Computation and Knowledge Engineering of National Education Ministry, College of Computer Science and Technology, Jilin University, Jilin, China.

ABSTRACT

Background: We proposed a non-parametric method, the Non-Parametric Change Point Statistic (NPCPS), which uses a single equation to detect differential gene expression (DGE) in microarray data. NPCPS is based on change-point theory, which gives it effective DGE detection ability.

Methodology: NPCPS uses the data distribution of the normal samples as input and detects DGE in the cancer samples by locating the change point of the gene expression profile. An estimate of the change-point position generated by NPCPS enables identification of the samples containing DGE. Monte Carlo simulation and an ROC study were applied to examine the detection accuracy of NPCPS, and an experiment on real breast cancer microarray data was carried out to compare NPCPS with other methods.
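
The NPCPS equation itself is not reproduced in this abstract, so the sketch below is not NPCPS. It only illustrates the general idea of locating a change point without assuming a distribution, using a Pettitt-style sign statistic as a stand-in. The function name `sign_change_point` and the toy profile are assumptions made for illustration.

```python
import numpy as np

def sign_change_point(x):
    """Pettitt-style scan (a stand-in, not the NPCPS statistic).

    Returns (t_hat, stat): t_hat is the number of observations before the
    estimated change point, stat is max |U_t| over all candidate splits.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    # s[i, j] = sign(x[i] - x[j]); only the ordering of the values is used.
    s = np.sign(x[:, None] - x[None, :])
    # U_t sums the signs between the first t values and the remaining n - t.
    u = np.array([s[:t, t:].sum() for t in range(1, n)])
    t_hat = int(np.argmax(np.abs(u))) + 1
    return t_hat, float(np.abs(u).max())

# Toy profile (hypothetical): 10 unshifted values followed by 10 shifted ones.
profile = np.concatenate([np.zeros(10), np.full(10, 5.0)])
print(sign_change_point(profile))  # (10, 100.0)
```

For the toy profile the scan places the change after the tenth value, which is where the shift was introduced. Because only signs of pairwise differences enter the statistic, no distributional form is assumed.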

Conclusions: The simulation study indicated that NPCPS was more effective at detecting DGE in a subset of cancer samples than five parametric methods and one non-parametric method. When more than 8 cancer samples contained DGE, the type I error of NPCPS was below 0.01. The experimental results showed both good accuracy and reliability for NPCPS. Of the top 30 genes ranked by NPCPS, 16 have been reported as relevant to cancer. Correlations between the detection results of NPCPS and those of the compared methods were less than 0.05, while correlations among the other methods ranged from 0.20 to 0.84. This indicates that NPCPS works on different features and thus provides DGE identification from a perspective distinct from the other mean- or median-based methods.
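
The abstract does not say which correlation coefficient was used or exactly which quantities were correlated. As a hedged illustration only, the sketch below computes a Pearson correlation between per-gene scores from two hypothetical stand-in scorers (a difference of means and a difference of medians) on simulated data with no DGE added. It does not reproduce the reported values and does not involve NPCPS.

```python
import numpy as np

rng = np.random.default_rng(0)
normal = rng.normal(size=(500, 25))  # 500 hypothetical genes, 25 normal samples
cancer = rng.normal(size=(500, 25))  # 25 cancer samples, no DGE added

scores = np.vstack([
    cancer.mean(axis=1) - normal.mean(axis=1),              # mean-based score
    np.median(cancer, axis=1) - np.median(normal, axis=1),  # median-based score
])

# Rows are scorers, columns are genes; corr[0, 1] is the cross-scorer correlation.
corr = np.corrcoef(scores)
print(round(float(corr[0, 1]), 2))
```

Scores built from the mean and from the median respond to the same location shift, so a high correlation between them is expected; a method that keys on a different feature of the data would correlate less with both.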



pone-0020060-g003: Selected ROC curves for the normal dataset with μ = 1. (A) n1 = n2 = 25, k = 6. (B) n1 = n2 = 25, k = 9. (C) n1 = n2 = 25, k = 14. (D) n1 = n2 = 50, k = 6. (E) n1 = n2 = 50, k = 9. (F) n1 = n2 = 50, k = 15. The x-axis is the false positive rate (FPR) and the y-axis is the true positive rate (TPR). The significance level for NPCPS is α = 0.01. A larger area under the ROC curve indicates better sensitivity and specificity. An ROC curve along the diagonal indicates random guessing.
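
How an area under an ROC curve is obtained from (FPR, TPR) points can be sketched as follows. This is a generic illustration, not the evaluation code behind the figure; `roc_points` and `auc` are hypothetical helper names, and tied scores are not treated specially.

```python
import numpy as np

def roc_points(scores, labels):
    """Sweep a threshold from the highest score down; return (fpr, tpr)."""
    scores = np.asarray(scores, dtype=float)
    hits = np.asarray(labels, dtype=bool)[np.argsort(-scores, kind="stable")]
    tpr = np.concatenate(([0.0], np.cumsum(hits) / hits.sum()))
    fpr = np.concatenate(([0.0], np.cumsum(~hits) / (~hits).sum()))
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the curve by the trapezoidal rule."""
    fpr = np.asarray(fpr, dtype=float)
    tpr = np.asarray(tpr, dtype=float)
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# A scorer that ranks every true DGE gene above every non-DGE gene.
fpr, tpr = roc_points([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
print(auc(fpr, tpr))                          # 1.0
# A curve lying on the diagonal, i.e. random guessing.
print(auc([0.0, 0.5, 1.0], [0.0, 0.5, 1.0]))  # 0.5
```

The two printed values bracket the range of interest in the figure: 1.0 for a perfect ranking and 0.5 for a curve on the diagonal.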

Mentions: First, we tested NPCPS (α = 0.01) and seven other methods, namely COPA, ORT, OS, MOST, T, LRS, and PPST, on normally distributed datasets (mean = 0, sd = 1) with different μ, n and k. As k increased, all methods produced better ROC curves (Fig. 2 and Fig. 3). For μ = 2, when n = 50 (Fig. 2A–2C), NPCPS was slightly weaker than LRS and better than the other methods; when n = 100 (Fig. 2D–2E), NPCPS was very similar to LRS and better than the other methods. For μ = 1, NPCPS gave the best performance on both the n = 50 and n = 100 datasets for all values of k shown (Fig. 3A–3F). This indicated that NPCPS had better sensitivity to less pronounced DGE than the other seven methods. Among the non-parametric methods, PPST was not significantly better than the parametric methods, while LRS and NPCPS were consistently better than the other methods. This indicated that methods based on change points were more effective and robust than methods based on percentiles and MAD.
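
A simulation of this kind can be sketched as follows. The sketch assumes that n = n1 + n2 (consistent with the figure caption), that the k shifted samples are the first k cancer samples, and that half of 1000 simulated genes carry DGE; the number of genes and the DGE fraction are not stated in this excerpt. A plain Welch t-statistic stands in for the T method, and NPCPS and the other compared methods are not implemented.

```python
import numpy as np

def simulate(n_genes, n1, n2, k, mu, seed=0):
    """Normal and cancer samples from N(0, 1); for the first half of the
    genes (an assumption), the first k cancer samples are shifted by mu."""
    rng = np.random.default_rng(seed)
    normal = rng.normal(0.0, 1.0, size=(n_genes, n1))
    cancer = rng.normal(0.0, 1.0, size=(n_genes, n2))
    has_dge = np.arange(n_genes) < n_genes // 2
    cancer[has_dge, :k] += mu
    return normal, cancer, has_dge

def welch_t(normal, cancer):
    """Per-gene Welch t-statistic (cancer minus normal)."""
    se = np.sqrt(normal.var(axis=1, ddof=1) / normal.shape[1]
                 + cancer.var(axis=1, ddof=1) / cancer.shape[1])
    return (cancer.mean(axis=1) - normal.mean(axis=1)) / se

def rank_auc(scores, is_pos):
    """Area under the ROC curve via the Mann-Whitney rank formula
    (assumes no tied scores)."""
    ranks = np.empty(scores.size)
    ranks[np.argsort(scores)] = np.arange(1, scores.size + 1)
    n_pos = int(is_pos.sum())
    n_neg = scores.size - n_pos
    return (ranks[is_pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def auc_for(k, mu=1.0, n1=25, n2=25, n_genes=1000):
    """AUC of the t-statistic for one simulated setting."""
    normal, cancer, has_dge = simulate(n_genes, n1, n2, k, mu)
    return float(rank_auc(welch_t(normal, cancer), has_dge))

# Settings taken from panels A to C of the caption: mu = 1, n1 = n2 = 25.
for k in (6, 9, 14):
    print(k, round(auc_for(k), 3))
```

With the same random seed reused for every k, the only thing that changes between runs is how many cancer samples carry the shift, so the printed areas can be compared directly across k.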

