A unified framework for finding differentially expressed genes from microarray experiments.

Shaik JS, Yeasin M - BMC Bioinformatics (2007)

Bottom Line: The performance of the unified framework is compared with well-known ranking algorithms such as t-statistics, Significance Analysis of Microarrays (SAM), Adaptive Ranking, Combined Adaptive Ranking and Two-way Clustering. The performance curves obtained using 50 simulated microarray datasets each following two different distributions indicate the superiority of the unified framework over the other reported algorithms. Empirical analyses show that the unified framework outperformed other gene selection methods in selecting differentially expressed genes from microarray data.


Affiliation: Department of Electrical and Computer Engineering, CVPIA Lab, University of Memphis, Memphis, TN-38152, USA. jshaik@memphis.edu

ABSTRACT

Background: This paper presents a unified framework for finding differentially expressed genes (DEGs) from microarray data. The proposed framework has three interrelated modules: (i) gene ranking, (ii) significance analysis of genes and (iii) validation. The first module uses two gene selection algorithms, namely (a) two-way clustering and (b) combined adaptive ranking, to rank the genes. The second module converts the gene ranks into p-values using an R-test and fuses the two sets of p-values using Fisher's omnibus criterion. The DEGs are then selected using false discovery rate (FDR) analysis. The third module performs a three-fold validation of the obtained DEGs. First, the robustness of the unified framework in gene selection is illustrated using FDR analysis. Second, a clustering-based validation of the DEGs is performed by applying an adaptive subspace-based clustering algorithm to the training and test datasets. Finally, a projection-based visualization is performed to validate the DEGs obtained using the unified framework.

Results: The performance of the unified framework is compared with well-known ranking algorithms such as t-statistics, Significance Analysis of Microarrays (SAM), Adaptive Ranking, Combined Adaptive Ranking and Two-way Clustering. The performance curves obtained using 50 simulated microarray datasets, each following two different distributions, indicate the superiority of the unified framework over the other reported algorithms. Further analyses on 3 real cancer datasets and 3 Parkinson's datasets show a similar improvement in performance. First, a three-fold validation process is provided for the two-sample cancer datasets. In addition, the analysis of 3 sets of Parkinson's data demonstrates the scalability of the proposed method to multi-sample microarray datasets.

Conclusion: This paper presents a unified framework for the robust selection of genes from two-sample as well as multi-sample microarray experiments. The two different ranking methods used in module 1 bring diversity to the selection of genes. The conversion of ranks to p-values, the fusion of p-values and FDR analysis aid in identifying significant genes, which cannot be judged from gene ranking alone. The three-fold validation, namely robustness in selection of genes using FDR analysis, clustering, and visualization, demonstrates the relevance of the DEGs. Empirical analyses on 50 artificial datasets and 6 real microarray datasets illustrate the efficacy of the proposed approach. The analyses on 3 cancer datasets demonstrate the utility of the proposed approach on microarray datasets with two classes of samples. The scalability of the proposed unified approach to multi-sample (more than two sample classes) microarray datasets is addressed using three sets of Parkinson's data. Empirical analyses show that the unified framework outperformed other gene selection methods in selecting differentially expressed genes from microarray data.


Figure 1: Unified Framework to find DEGs from Microarray Data.

Mentions: The proposed unified framework, shown in Fig. 1, consists of three modules: (i) gene ranking, (ii) significance analysis of ranking and (iii) validation. The genes are first scored using the two-way clustering framework and combined adaptive ranking. For both methods, the gene with the highest score is given rank 1, the gene with the next highest score is given rank 2, and so on. The ranks are converted into p-values (P1 and P2) using the R-test, which is discussed later. The p-values P1 and P2 are combined using Fisher's omnibus procedure to obtain the unified p-value (U).
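
The ranking and fusion steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `rank_by_score` and `fisher_combine` are hypothetical, the R-test is not defined in this excerpt and is therefore left out, and the input p-values are assumed to be already available. Fisher's omnibus statistic for two p-values is X = -2(ln P1 + ln P2), which follows a chi-square distribution with 4 degrees of freedom under the null hypothesis when P1 and P2 are independent; for 4 degrees of freedom the tail probability has the closed form exp(-X/2)(1 + X/2), so no statistics library is needed.

```python
import math


def rank_by_score(scores):
    """Assign rank 1 to the highest score, rank 2 to the next highest, and so on.

    Ties are broken by input order (an assumption; the source does not say
    how ties are handled).
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return ranks


def fisher_combine(p1, p2):
    """Combine two p-values with Fisher's omnibus procedure.

    X = -2 * (ln p1 + ln p2) is chi-square with 4 degrees of freedom under
    the null hypothesis (assuming independent p-values). The chi-square(4)
    survival function is exp(-x/2) * (1 + x/2).
    """
    if not (0.0 < p1 <= 1.0 and 0.0 < p2 <= 1.0):
        raise ValueError("p-values must lie in (0, 1]")
    x = -2.0 * (math.log(p1) + math.log(p2))
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


# Hypothetical scores for three genes from one ranking method.
ranks = rank_by_score([0.2, 0.9, 0.5])

# Hypothetical p-values P1 and P2 for one gene, fused into U.
u = fisher_combine(0.05, 0.05)
```

Because the closed form simplifies to U = P1·P2·(1 − ln(P1·P2)), two moderately small p-values reinforce each other: the fused value is smaller than either input when both methods agree that a gene is significant.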

