Disease Biomarker Query from RNA-Seq Data.

Han H, Jiang X - Cancer Inform (2014)

Bottom Line: Although many biomarker discovery algorithms are available in traditional omics communities, they cannot be applied directly to RNA-Seq count data because of its special characteristics. In this work, we present SEQ-Marker, a biomarker discovery algorithm for RNA-Seq data. It is built on a novel data-driven feature selection algorithm, nonnegative singular value approximation (NSVA), which contributes to the robustness and sensitivity of the subsequent differential expression (DE) analysis by taking advantage of the built-in characteristics of RNA-Seq count data. As a biomarker discovery algorithm built on network marker topology, SEQ-Marker not only bridges transcriptomics and systems biology but also contributes to clinical diagnostics.


Affiliation: Department of Computer and Information Science, Fordham University, New York, NY, USA; Quantitative Proteomics Center, Columbia University, New York, NY, USA.

ABSTRACT
As a revolutionary way to unveil transcription, RNA-Seq technologies challenge bioinformatics with their large data volumes and complexity. A large number of computational models have been proposed for differential expression (DE) analysis and normalization from different standpoints. However, no studies have yet addressed disease biomarker discovery for this type of high-resolution digital gene expression data, which is essential to exploring its potential in clinical bioinformatics. Although many biomarker discovery algorithms are available in traditional omics communities, they cannot be applied directly to RNA-Seq count data because of its special characteristics. In this work, we present SEQ-Marker, a biomarker discovery algorithm for RNA-Seq data. It is built on a novel data-driven feature selection algorithm, nonnegative singular value approximation (NSVA), which contributes to the robustness and sensitivity of the subsequent DE analysis by taking advantage of the built-in characteristics of RNA-Seq count data. As a biomarker discovery algorithm built on network marker topology, SEQ-Marker not only bridges transcriptomics and systems biology but also contributes to clinical diagnostics.



Figure 7: The network marker with 203 genes and 730 interactions identified by the SEQ-Marker algorithm for the Prostate data. The core genes, those with the most interactions (highest degrees), are emphasized in the network topology.
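
The caption describes core genes as the nodes with the largest degrees. As a minimal sketch of that idea only (the paper's network inference itself is not reproduced here), the snippet below counts distinct interaction partners per gene in an undirected edge list and ranks genes by degree. The function names `node_degrees` and `core_genes` and the toy edge list are hypothetical and invented for illustration; the partner names G1 to G4 are placeholders, not genes from the paper.

```python
def node_degrees(edges):
    """Count distinct interaction partners per gene in an undirected network."""
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-interactions
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return {gene: len(partners) for gene, partners in neighbors.items()}


def core_genes(edges, k):
    """Return the k highest-degree genes as (gene, degree) pairs."""
    degrees = node_degrees(edges)
    # Sort by degree (descending), then by name so that ties are deterministic.
    ranked = sorted(degrees.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:k]


# Toy edge list, invented for illustration; NOT the paper's 730-edge network.
toy_edges = [
    ("APP", "G1"), ("APP", "G2"), ("APP", "G3"), ("APP", "G4"),
    ("NEDD8", "G1"), ("NEDD8", "G2"), ("NEDD8", "G3"),
]
top = core_genes(toy_edges, k=2)
print(top)
```

Using sets of partners means a repeated or reversed edge is counted once, which matches the usual definition of degree in a simple undirected graph.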

Mentions: Similarly, we applied the SEQ-Marker algorithm to the Prostate data and obtained a network marker with 203 genes and 730 edges, as shown in Figure 7. We identified five core genes from the inferred network marker: APP, HSP90AA1, NEDD8, HNRNPA1, and NPM1. Almost all core genes had strong P-value support; the exception was HSP90AA1. Although its P-value of 0.2051 means it was not statistically a DE gene, SEQ-Marker indicated it as a biomarker for prostate cancer, and previous studies have confirmed it as a real prostate cancer marker [33,37]. In addition, all five core genes were high-count genes whose average gene counts were much higher than the median DE gene count of 118 bp. For example, the average gene counts of APP and HSP90AA1 reached 630 bp and 397 bp, respectively. We also found that almost all of these genes were associated with or closely related to prostate cancer in previous studies [37,38]. For instance, APP was identified as a well-known gene marker that promotes prostate cancer growth in the work of Takayama et al. [39], and the NEDD8 conjugation pathway is essential for understanding prostate cancer and other complex cancers [38,40]. We identified the associative gene for each core gene, with the corresponding correlation values: FKBP5 (99.91%), SPTLC1 (98.57%), NEDD8-MDP1 (99.60%), DARS (99.53%), and MYO6 (99.54%). FKBP5, DARS, and MYO6 are well-known prostate cancer markers according to previous studies [41–43]. As with the Kidney–Liver data, we achieved 100% accuracy, with 100% sensitivity and specificity, by using the 10 biomarkers for diagnosis under a linear support vector machine with leave-one-out cross-validation (LOOCV).
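
To make the evaluation protocol concrete, here is a minimal sketch of leave-one-out cross-validation that reports accuracy, sensitivity, and specificity. It is an illustration under stated assumptions, not the authors' implementation. To stay dependency-free it uses a nearest-centroid classifier as a stand-in for the linear support vector machine named in the text; a linear SVM could be substituted through the same `fit`/`predict` pair. All function names and the toy data are hypothetical.

```python
def nearest_centroid_fit(X, y):
    """Stand-in classifier (NOT the paper's linear SVM): per-class mean vectors."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def nearest_centroid_predict(model, row):
    """Predict the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(sorted(model), key=lambda label: sq_dist(model[label]))


def ratio(numerator, denominator):
    """Safe division: NaN when a class is absent from the data."""
    return numerator / denominator if denominator else float("nan")


def loocv_metrics(X, y, fit, predict, positive=1):
    """Hold out each sample once, train on the rest, and tally the confusion counts."""
    tp = tn = fp = fn = 0
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        predicted = predict(model, X[i])
        if y[i] == positive:
            tp, fn = (tp + 1, fn) if predicted == positive else (tp, fn + 1)
        else:
            fp, tn = (fp + 1, tn) if predicted == positive else (fp, tn + 1)
    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }


# Toy, well-separated data invented for illustration; NOT the Prostate data.
X = [[1.0, 1.2], [0.9, 1.0], [1.1, 0.8], [5.0, 5.2], [4.8, 5.1], [5.2, 4.9]]
y = [0, 0, 0, 1, 1, 1]
metrics = loocv_metrics(X, y, nearest_centroid_fit, nearest_centroid_predict)
print(metrics)
```

On this toy data every held-out sample is classified correctly, so all three metrics equal 1.0. That mirrors the form of the reported result without reproducing it.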

