Large-scale RNA-Seq Transcriptome Analysis of 4043 Cancers and 548 Normal Tissue Controls across 12 TCGA Cancer Types.

Peng L, Bian XW, Li DK, Xu C, Wang GM, Xia QY, Xiong Q - Sci Rep (2015)

Bottom Line: A 14-gene signature extracted from these seven cross-cancer gene signatures differentiated between cancerous and normal samples; the predictive accuracies of leave-one-out cross-validation (LOOCV) were 92.04%, 96.23%, 91.76%, 90.05%, 88.17%, 94.29%, and 99.10% for BLCA, BRCA, COAD, HNSC, LIHC, LUAD, and LUSC, respectively. A lung cancer-specific gene signature, containing the SFTPA1 and SFTPA2 genes, accurately distinguished lung cancer from other cancer samples; the predictive accuracies of LOOCV for the TCGA and GSE5364 data were 95.68% and 100%, respectively. These gene signatures provide rich insights into the transcriptional programs that trigger tumorigenesis and metastasis, and many genes in the signature gene panels may be of significant value to the diagnosis and treatment of cancer.


Affiliation: State Key Laboratory of Silkworm Genome Biology, Southwest University, Chongqing 400715, China.

ABSTRACT
The Cancer Genome Atlas (TCGA) has accrued RNA-Seq-based transcriptome data for more than 4000 cancer tissue samples across 12 cancer types; translating these data into biological insights remains a major challenge. We analyzed and compared the transcriptomes of 4043 cancer and 548 normal tissue samples from 12 TCGA cancer types, and created a comprehensive catalog of gene expression alterations for each cancer type. By clustering genes into co-regulated gene sets, we identified seven cross-cancer gene signatures altered across a diverse panel of primary human cancer samples. A 14-gene signature extracted from these seven cross-cancer gene signatures differentiated between cancerous and normal samples; the predictive accuracies of leave-one-out cross-validation (LOOCV) were 92.04%, 96.23%, 91.76%, 90.05%, 88.17%, 94.29%, and 99.10% for BLCA, BRCA, COAD, HNSC, LIHC, LUAD, and LUSC, respectively. A lung cancer-specific gene signature, containing the SFTPA1 and SFTPA2 genes, accurately distinguished lung cancer from other cancer samples; the predictive accuracies of LOOCV for the TCGA and GSE5364 data were 95.68% and 100%, respectively. These gene signatures provide rich insights into the transcriptional programs that trigger tumorigenesis and metastasis, and many genes in the signature gene panels may be of significant value to the diagnosis and treatment of cancer.

Figure 5: The predictive accuracy and error rates of LOOCV for each cancer type using the 14-gene signature. Red indicates the predictive accuracy; blue represents the error rate.

Mentions: Seven gene sets, CLUSTER241, CLUSTER514, CLUSTER1011, CLUSTER932, CLUSTER574, CLUSTER3137, and CLUSTER184, were differentially expressed in at least four of the seven cancer types: BLCA, BRCA, COAD, HNSC, LIHC, LUAD, and LUSC. We extracted the top two most differentially expressed genes from each of these gene sets and created a 14-gene signature, comprising kinesin family member 4A (KIF4A), nucleolar and spindle associated protein 1 (NUSAP1), Holliday junction recognition protein (HJURP), NIMA-related kinase 2 (NEK2), Fanconi anemia, complementation group I (FANCI), denticleless E3 ubiquitin protein ligase homolog (Drosophila) (DTL), UHRF1, flap structure-specific endonuclease 1 (FEN1), IQ motif containing GTPase activating protein 3 (IQGAP3), kinesin family member 20A (KIF20A), tripartite motif containing 59 (TRIM59), centromere protein L (CENPL), chromosome 16 open reading frame 59 (C16orf59), and UBE2C. We employed leave-one-out cross-validation (LOOCV) to assess whether this 14-gene signature can differentiate between the normal and cancerous tissue samples of those seven cancer types. Machine learning techniques such as support vector machines play a vital role in sample classification [141–144]. LOOCV was performed using SVM-light [145] (http://svmlight.joachims.org/), an implementation of support vector machines. The predictive accuracies of LOOCV for each cancer type are shown in Fig. 5. The predictive accuracy is the proportion of all predictions that were correct. Most samples were correctly classified based on the expression levels of these 14 genes: the classification accuracies for BLCA, BRCA, COAD, HNSC, LIHC, LUAD, and LUSC were 92.04%, 96.23%, 91.76%, 90.05%, 88.17%, 94.29%, and 99.10%, respectively.
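
The LOOCV procedure and the accuracy definition above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: a simple nearest-centroid rule stands in for the SVM-light classifier so that the example needs no external libraries, and the function names (`loocv_accuracy`, `predict`) and the two-gene toy expression values are hypothetical. Each sample is held out once, the classifier is fitted on the remaining samples, and the accuracy is the fraction of held-out samples predicted correctly.

```python
from typing import List, Sequence, Tuple

# (expression values, label); label 1 = cancer, 0 = normal
Sample = Tuple[Sequence[float], int]


def centroid(rows: List[Sequence[float]]) -> List[float]:
    """Mean expression vector of a group of samples."""
    return [sum(col) / len(rows) for col in zip(*rows)]


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two expression vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def predict(train: List[Sample], x: Sequence[float]) -> int:
    """Nearest-centroid stand-in classifier (an assumption; the paper used an SVM).

    Requires at least one training sample of each class.
    """
    cancer = centroid([f for f, y in train if y == 1])
    normal = centroid([f for f, y in train if y == 0])
    return 1 if sq_dist(x, cancer) < sq_dist(x, normal) else 0


def loocv_accuracy(samples: List[Sample]) -> float:
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i, (x, y) in enumerate(samples):
        train = samples[:i] + samples[i + 1:]  # every sample except the i-th
        if predict(train, x) == y:
            correct += 1
    # Predictive accuracy = correct predictions / total predictions
    return correct / len(samples)


# Hypothetical toy data: two "genes" per sample, three cancer and three normal samples
toy = [
    ([8.1, 7.9], 1),
    ([7.6, 8.4], 1),
    ([8.3, 8.0], 1),
    ([2.0, 2.2], 0),
    ([1.8, 2.5], 0),
    ([2.4, 1.9], 0),
]

accuracy = loocv_accuracy(toy)
print(f"LOOCV accuracy: {accuracy:.2%}, error rate: {1 - accuracy:.2%}")
```

The error rate plotted alongside the accuracy in Fig. 5 is simply one minus the accuracy. With n samples, LOOCV fits the classifier n times, which is affordable for cohorts of a few hundred samples per cancer type.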

