Non-parametric change-point method for differential gene expression detection.

Wang Y, Wu C, Ji Z, Wang B, Liang Y - PLoS ONE (2011)

Bottom Line: NPCPS is based on the change point theory to provide effective DGE detecting ability. An estimate of the change point position generated by NPCPS enables the identification of the samples containing DGE. Experiment results showed both good accuracy and reliability of NPCPS.


Affiliation: Key Laboratory for Symbol Computation and Knowledge Engineering of National Education Ministry, College of Computer Science and Technology, Jilin University, Jilin, China.

ABSTRACT

Background: We proposed a non-parametric method, the Non-Parametric Change Point Statistic (NPCPS), which uses a single equation to detect differential gene expression (DGE) in microarray data. NPCPS builds on change point theory to provide effective DGE detection.

Methodology: NPCPS uses the data distribution of the normal samples as input and detects DGE in the cancer samples by locating the change point of the gene expression profile. The estimate of the change point position generated by NPCPS identifies the samples containing DGE. Monte Carlo simulation and an ROC study were used to examine the detection accuracy of NPCPS, and an experiment on real breast cancer microarray data was carried out to compare NPCPS with other methods.

Conclusions: The simulation study indicated that NPCPS was more effective at detecting DGE in a cancer subset than five parametric methods and one non-parametric method. When more than 8 cancer samples contained DGE, the type I error of NPCPS was below 0.01. The experimental results showed both good accuracy and reliability of NPCPS. Of the top 30 genes ranked by NPCPS, 16 have been reported as relevant to cancer. Correlations between the detection results of NPCPS and those of the compared methods were less than 0.05, whereas the correlations among the other methods ranged from 0.20 to 0.84. This indicates that NPCPS works on different features and thus identifies DGE from a perspective distinct from that of the other mean- or median-based methods.
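The correlation comparison in the Conclusions can be illustrated with a small sketch. The abstract does not say which correlation measure was used, so Spearman rank correlation is an assumption here, and the method names and scores below are hypothetical toy values, not data from the paper.

```python
from itertools import combinations

def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical per-gene scores from three hypothetical detection methods.
scores = {
    "method_a": [2.1, 0.3, 1.5, 0.9, 3.0],
    "method_b": [1.9, 0.5, 1.1, 1.0, 2.7],
    "method_c": [0.2, 0.9, 0.1, 0.8, 0.5],
}

for (name_1, s_1), (name_2, s_2) in combinations(scores.items(), 2):
    print(name_1, name_2, round(spearman(s_1, s_2), 3))
```

A low pairwise value between two methods means they order the genes very differently, which is the sense in which the abstract describes NPCPS as working on different features.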


pone-0020060-g009: Data distributions of genes top-ranked by NPCPS. (A) I1GAP1: rank 19, positive Dn. (B) PIP5K1B: rank 20, positive Dn. (C) UBB: rank 21, negative Dn. (D) RFC1: rank 22, negative Dn. The genes top-ranked by NPCPS showed significant differences between the data distributions of the cancer and normal groups. Comparing the empirical distributions of cancer and normal samples, in (A) and (B) the distribution of the cancer group lay significantly to the left of that of the normal group, which demonstrated under-expression; in (C) and (D) the distribution of the cancer group lay significantly to the right of that of the normal group, which demonstrated over-expression. The distribution curves were consistent with the biological significance of the Dn value.

Mentions: NPCPS results showed that, among the 7219 genes, 3608 had negative Dn, while the remaining 3521 had positive Dn. NPCPS uses Dn to evaluate the change in distribution between normal and cancer samples, and directly indicates the DGE type as either over-expressed or under-expressed. This is supported by the expression values in Figs. 6 and 7, where Fig. 6 (positive Dn) shows typical under-expression and Fig. 7 (negative Dn) shows typical over-expression. Figs. 9 and 10 illustrate the relationship between Dn and DGE more intuitively by showing the cumulative data distributions of several typically ranked genes.
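The sign convention described above can be sketched with empirical cumulative distributions. The code below is not the NPCPS equation, which this page does not give. It is a minimal stand-in that assumes a signed Kolmogorov-Smirnov-style gap, F_cancer(x) - F_normal(x), taken where its magnitude is largest. The function names and sample values are hypothetical.

```python
from bisect import bisect_right

def ecdf(sample):
    """Return a function x -> fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n

def signed_ecdf_gap(normal, cancer):
    """Signed gap F_cancer(x) - F_normal(x) at the point of largest magnitude.

    Assumption: this stands in for Dn only to show the sign convention.
    """
    f_normal, f_cancer = ecdf(normal), ecdf(cancer)
    grid = sorted(set(normal) | set(cancer))
    gaps = [f_cancer(x) - f_normal(x) for x in grid]
    return max(gaps, key=abs)

def dge_type(gap):
    """Positive gap: cancer values sit to the left (lower expression)."""
    if gap > 0:
        return "under-expressed"
    if gap < 0:
        return "over-expressed"
    return "no shift"

# Hypothetical expression values for one gene in each scenario.
normal = [5.0, 5.2, 5.4, 5.6, 5.8]
under = [3.0, 3.3, 3.5, 5.1, 5.5]   # cancer distribution shifted left
over = [5.3, 5.7, 7.0, 7.4, 7.9]    # cancer distribution shifted right

for label, cancer in (("under", under), ("over", over)):
    gap = signed_ecdf_gap(normal, cancer)
    print(label, round(gap, 3), dge_type(gap))
```

When the cancer values are lower, the cancer curve rises earlier than the normal curve, so the gap is positive. This matches the reading in the text that positive Dn goes with under-expression and negative Dn with over-expression.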

