Robust microarray meta-analysis identifies differentially expressed genes for clinical prediction.

Phan JH, Young AN, Wang MD - ScientificWorldJournal (2012)

Bottom Line: Combining multiple microarray datasets increases sample size and leads to improved reproducibility in identification of informative genes and subsequent clinical prediction. Although microarrays have increased the rate of genomic data collection, sample size is still a major issue when identifying informative genetic biomarkers. Results indicate that rank average meta-analysis consistently performs well compared to five other meta-analysis methods.


Affiliation: Department of Biomedical Engineering, Georgia Institute of Technology and Emory University, 313 Ferst Drive, Atlanta, GA 30332, USA.

ABSTRACT
Combining multiple microarray datasets increases sample size and leads to improved reproducibility in identification of informative genes and subsequent clinical prediction. Although microarrays have increased the rate of genomic data collection, sample size is still a major issue when identifying informative genetic biomarkers. Because of this, feature selection methods often suffer from false discoveries, resulting in poorly performing predictive models. We develop a simple meta-analysis-based feature selection method that captures the knowledge in each individual dataset and combines the results using a simple rank average. In a comprehensive study that measures robustness in terms of clinical application (i.e., breast, renal, and pancreatic cancer), microarray platform heterogeneity, and classifier (i.e., logistic regression, diagonal LDA, and linear SVM), we compare the rank average meta-analysis method to five other meta-analysis methods. Results indicate that rank average meta-analysis consistently performs well compared to five other meta-analysis methods.
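
The rank average idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes each dataset has already produced one informativeness score per gene (for example, an absolute test statistic, which is an assumption here), that higher scores mean more informative genes, and that ties are broken by gene order. The function names `rank_average` and `top_genes` are hypothetical.

```python
import numpy as np

def rank_average(scores_per_dataset):
    """Average per-dataset gene ranks (rank 1 = most informative gene).

    scores_per_dataset: array-like of shape (n_datasets, n_genes);
    higher score = more informative (an assumption of this sketch).
    """
    scores = np.asarray(scores_per_dataset, dtype=float)
    # Sort each dataset's genes from highest to lowest score.
    order = np.argsort(-scores, axis=1, kind="stable")
    # Convert the sort order into 1-based ranks for each gene.
    ranks = np.empty_like(order)
    rows = np.arange(scores.shape[0])[:, None]
    ranks[rows, order] = np.arange(1, scores.shape[1] + 1)
    # Combine datasets by averaging each gene's rank.
    return ranks.mean(axis=0)

def top_genes(scores_per_dataset, k):
    """Return indices of the k genes with the best (lowest) average rank."""
    avg = rank_average(scores_per_dataset)
    return np.argsort(avg, kind="stable")[:k]

# Two toy datasets scoring the same three genes (made-up values).
toy_scores = [[3.0, 1.0, 2.0],
              [2.5, 0.5, 3.0]]
print(rank_average(toy_scores))
print(top_genes(toy_scores, k=2))
```

Because only ranks are averaged, the per-dataset scores do not need to be on a common scale, which is one plausible reason a rank-based combination suits heterogeneous platforms.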


Figure 3: Rating meta-analysis methods by prediction performance when combining all available datasets. Each meta-analysis method (rank average, rank products, Wang, mDEDS, Choi, and naive) is rated relative to its peers. We assess performance rating across three factors: (1) clinical application (breast cancer: BC, renal cancer: RC, and pancreatic cancer: PC), (2) data platform heterogeneity (homogeneous: orange, heterogeneous: blue), and (3) classifier (logistic regression: LR, diagonal LDA: DLDA, and linear SVM). For each combination of factors, the rating of each meta-analysis method is represented by an additive bar. Methods with higher absolute prediction performance receive higher ratings (and longer bars). When considering absolute prediction performance, rank average, with a mean overall rating of 4.56, performs consistently well compared to its peers.

Mentions: We rate each meta-analysis method by absolute prediction performance (Figure 3). Based on this criterion, we find that rank average meta-analysis, with the highest overall mean rating of 4.56, performs consistently well compared to the five other meta-analysis methods: mDEDS, rank products, Choi, Wang, and naive. This analysis answers the question: which meta-analysis-based feature selection (FS) method consistently exhibits the highest prediction performance when combining all available datasets? We assign a rating to each meta-analysis method for every combination of three factors: (1) clinical application or dataset group, (2) data platform heterogeneity (combining similar or different microarray platforms), and (3) classifier (logistic regression: LR, diagonal linear discriminant analysis: DLDA, and linear SVM). Ratings for each meta-analysis method are relative to its peers, with higher ratings indicating better prediction performance under the same combination of factors. In Figure 3, bars are proportional to performance ratings. Using pancreatic cancer (PC) as an example, the rank average meta-analysis method has a rating of five (corresponding to a predictive performance AUC of 81.5; see Supplemental Table S1, available online at doi:10.1100/2012/989637) when analyzing homogeneous datasets with the logistic regression classifier. This means that its absolute prediction performance is higher than that of four other meta-analysis methods compared under the same conditions (i.e., homogeneous data, logistic regression classifier). The results illustrated in Figure 3, obtained through a comprehensive analysis of three factors, suggest that, relative to its peers, rank average meta-analysis is robust when considering absolute prediction performance.
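
The relative rating scheme described above can be illustrated with a short sketch. It assumes, consistent with the pancreatic cancer example, that a method's rating under one combination of factors is one plus the number of peers it outperforms, and that the overall rating is the mean across combinations. The function names, the placeholder method labels, and all AUC values below are hypothetical and are not taken from the paper or its supplement.

```python
import numpy as np

def rate_methods(auc_by_method):
    """Rate methods within one factor combination.

    Rating = 1 + number of peers with strictly lower performance,
    so the best of n distinct methods receives rating n.
    """
    names = list(auc_by_method)
    auc = np.array([auc_by_method[n] for n in names], dtype=float)
    ratings = 1 + (auc[:, None] > auc[None, :]).sum(axis=1)
    return dict(zip(names, ratings.tolist()))

def mean_ratings(conditions):
    """Average each method's rating over all factor combinations."""
    per_condition = [rate_methods(c) for c in conditions]
    return {m: float(np.mean([r[m] for r in per_condition]))
            for m in per_condition[0]}

# Made-up AUC values for three placeholder methods under two conditions.
conditions = [
    {"method_a": 70.0, "method_b": 81.5, "method_c": 75.0},
    {"method_a": 80.0, "method_b": 78.0, "method_c": 60.0},
]
print(rate_methods(conditions[0]))
print(mean_ratings(conditions))
```

Under this reading, a method with a high mean rating is one that ranks near the top across most combinations of application, platform heterogeneity, and classifier, rather than one that wins by a large margin in a single setting.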

