Identification of Subtype-Specific Prognostic Genes for Early-Stage Lung Adenocarcinoma and Squamous Cell Carcinoma Patients Using an Embedded Feature Selection Algorithm.

Tian S - PLoS ONE (2015)

Bottom Line: In this article, we continue our effort to identify subtype-specific prognostic genes for AC and SCC, and propose a novel embedded feature selection method that extends the Threshold Gradient Descent Regularization (TGDR) algorithm and minimizes the corresponding negative partial likelihood function. Using real-world and simulated datasets, we show that the new method and our previously proposed Cox-regression-based algorithm have comparable performance, whereas the new proposal is superior in terms of model parsimony. Our analysis provides some evidence for the existence of such subtype-specific prognostic genes, but more investigation is warranted.


Affiliation: Division of Clinical Epidemiology, The First Hospital of Jilin University, Changchun, Jilin, People's Republic of China.

ABSTRACT
Fundamental differences in the underlying mechanisms of lung adenocarcinoma (AC) and squamous cell carcinoma (SCC) motivated us to postulate that there might be genes relevant to prognosis that are specific to each histological subtype. To test this hypothesis, we previously proposed a feature selection algorithm based on a simple Cox regression model and, applying it to real-world data, successfully identified some subtype-specific prognostic genes. In this article, we continue our effort to identify subtype-specific prognostic genes for AC and SCC, and propose a novel embedded feature selection method that extends the Threshold Gradient Descent Regularization (TGDR) algorithm and minimizes the corresponding negative partial likelihood function. Using real-world and simulated datasets, we show that the two proposed methods have comparable performance, whereas the new proposal is superior in terms of model parsimony. Our analysis provides some evidence for the existence of such subtype-specific prognostic genes, but more investigation is warranted.
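To make the idea of threshold gradient descent on a negative partial likelihood concrete, the following is a minimal sketch of plain TGDR applied to a Cox model. It is not the subtype-specific extension proposed in the article. The function names, the parameter values (`tau`, `step`, `n_iter`), the scaling of the loss by the number of events, the assumption of no tied event times, and the simulated data are all illustrative assumptions.

```python
import numpy as np

def neg_partial_loglik(beta, X, time, event):
    """Cox negative partial log-likelihood and its gradient.

    Assumes no tied event times; the loss is divided by the number of
    events (an illustrative scaling choice, not taken from the article).
    """
    order = np.argsort(-time)          # descending time: risk set of row i is rows 0..i
    Xs, ev = X[order], event[order]
    eta = Xs @ beta
    eta = eta - eta.max()              # constant shift cancels in the partial likelihood
    w = np.exp(eta)
    cum_w = np.cumsum(w)                           # sum of exp(eta) over each risk set
    cum_wx = np.cumsum(w[:, None] * Xs, axis=0)    # weighted covariate sums over each risk set
    n_events = max(ev.sum(), 1.0)
    loss = -np.sum(ev * (eta - np.log(cum_w))) / n_events
    grad = -np.sum(ev[:, None] * (Xs - cum_wx / cum_w[:, None]), axis=0) / n_events
    return loss, grad

def cox_tgdr(X, time, event, tau=0.9, step=0.05, n_iter=200):
    """Threshold gradient descent: at each iteration, update only the
    coefficients whose gradient magnitude is at least tau times the largest one."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        _, g = neg_partial_loglik(beta, X, time, event)
        g_max = np.abs(g).max()
        if g_max == 0:
            break
        mask = np.abs(g) >= tau * g_max    # tau in [0, 1]; tau = 1 is the most selective
        beta -= step * mask * g
    return beta

def simulate_survival(n=200, p=20, seed=0):
    """Hypothetical simulated data: only the first two features affect the hazard."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    true_beta = np.zeros(p)
    true_beta[:2] = [1.0, -1.0]
    t_event = rng.exponential(1.0 / np.exp(X @ true_beta))
    t_cens = rng.exponential(2.0, size=n)
    time = np.minimum(t_event, t_cens)
    event = (t_event <= t_cens).astype(float)
    return X, time, event

X, time, event = simulate_survival()
beta_hat = cox_tgdr(X, time, event)
print("nonzero coefficients:", np.count_nonzero(beta_hat))
```

In this sketch the threshold `tau` governs parsimony: values near 1 update only the few coefficients with the steepest gradient, while `tau = 0` reduces to ordinary gradient descent. In practice `tau` and the number of iterations would be chosen by cross-validation.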



pone.0134630.g002: Venn diagrams of the 33- and 13-gene signatures. A) At the individual gene level. B) At the enriched pathway level. The 33-gene and 13-gene signatures were obtained using the Cox-TGDR-specific algorithm, with one trained on the microarray data and the other on the RNA-seq data. Here, ↓ and ↑ indicate a negative and a positive association with the hazard of death, respectively.

Mentions: We then examined whether the 33-gene signature (18 AC-specific/18 SCC-specific) trained on the microarray data and the 13-gene signature (10 AC-specific/5 SCC-specific) trained on the RNA-seq data were consistent and robust. We first compared them at the individual gene level, and then at the pathway level, evaluating how the pathways enriched in the two signatures overlap (Fig 2). Enriched pathways were searched using the web-based database STRING [25]. The two signatures do not overlap at either level, whereas keratin 5 (KRT5) was successfully identified as a discriminative gene between AC and SCC samples even when training on data from different platforms [13]. Together, these observations partially justify the claim that prognosis prediction using gene expression profiles is a more difficult task than membership/class prediction [20,26].
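The gene-level and pathway-level comparison described above amounts to a set overlap. A minimal sketch follows, assuming each signature is available as a list of identifiers. The function name and the placeholder identifiers are hypothetical and are not genes or pathways from the article.

```python
def compare_signatures(sig_a, sig_b):
    """Summarize the overlap between two signatures (genes or pathways)."""
    a, b = set(sig_a), set(sig_b)
    shared = a & b
    union = a | b
    # Jaccard index: 0.0 means no overlap, 1.0 means identical signatures
    jaccard = len(shared) / len(union) if union else 0.0
    return {"shared": shared, "only_a": a - b, "only_b": b - a, "jaccard": jaccard}

# Placeholder identifiers only; substitute the real signature members.
microarray_signature = ["GENE_M1", "GENE_M2", "GENE_M3"]
rnaseq_signature = ["GENE_R1", "GENE_R2"]

result = compare_signatures(microarray_signature, rnaseq_signature)
print(sorted(result["shared"]), result["jaccard"])
```

The same function can be applied to the lists of enriched pathways, so both panels of a Venn comparison come from one routine.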

