Disease Biomarker Query from RNA-Seq Data.

Han H, Jiang X - Cancer Inform (2014)

Bottom Line: Although many biomarker discovery algorithms are available in traditional omics communities, they cannot be applied directly to RNA-Seq count data because of its special characteristics. In this work, we present SEQ-Marker, a biomarker discovery algorithm for RNA-Seq data. It is built on a novel data-driven feature selection algorithm, nonnegative singular value approximation (NSVA), which contributes to the robustness and sensitivity of the subsequent DE analysis by taking advantage of the built-in characteristics of RNA-Seq count data. As a biomarker discovery algorithm built on network marker topology, SEQ-Marker not only bridges transcriptomics and systems biology but also contributes to clinical diagnostics.


Affiliation: Department of Computer and Information Science, Fordham University, New York, NY, USA. ; Quantitative Proteomics Center, Columbia University, New York, NY, USA.

ABSTRACT
As a revolutionary way to unveil transcription, RNA-Seq technologies challenge bioinformatics with their large data volumes and complexity. A large number of computational models have been proposed for differential expression (DE) analysis and normalization from different standpoints. However, no studies have yet addressed disease biomarker discovery for this type of high-resolution digital gene expression data, which is essential for exploring its potential in clinical bioinformatics. Although many biomarker discovery algorithms are available in traditional omics communities, they cannot be applied directly to RNA-Seq count data because of its special characteristics. In this work, we present SEQ-Marker, a biomarker discovery algorithm for RNA-Seq data. It is built on a novel data-driven feature selection algorithm, nonnegative singular value approximation (NSVA), which contributes to the robustness and sensitivity of the subsequent DE analysis by taking advantage of the built-in characteristics of RNA-Seq count data. As a biomarker discovery algorithm built on network marker topology, SEQ-Marker not only bridges transcriptomics and systems biology but also contributes to clinical diagnostics.



Figure 4. Comparison of the median gene lengths of the genes selected by the NSVA, PCA, and NFS methods and of the DE genes among the selected genes for the Kidney–Liver data. The DE genes are longer than the selected genes for each feature selection method. The DE genes among the NSVA-selected genes seem to be shorter than the NFS-selected genes but longer than the PCA-selected genes.
© Copyright Policy - open-access

Mentions: Third, we found that the DE genes among NSVA-selected genes tended to be relatively long, which was also true for NFS- and PCA-selected genes. Figure 4 compares the median gene lengths of NSVA-, NFS-, and PCA-selected genes and of the DE genes among them. Interestingly, PCA selected the shortest genes of the three methods. For example, the median gene lengths of its DE genes were quite low in the 3000-, 5000-, and 8000-gene selection cases. Considering the low DE ratios and low counts of PCA-selected genes, it is reasonable to say that PCA tends to select genes with low counts or short lengths, most of which are not DE genes.
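
The comparison described above, median gene length of the selected genes versus the DE genes among them, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `length_medians`, the gene identifiers, the lengths, and the selections are all made up for the example and are not taken from the paper or its Kidney–Liver data.

```python
from statistics import median


def length_medians(gene_lengths, selected, de_genes):
    """Return (median length of selected genes,
    median length of the DE genes among the selected genes).

    The second value is None when no selected gene is DE.
    """
    selected = list(selected)
    de_in_selected = [g for g in selected if g in de_genes]
    med_selected = median(gene_lengths[g] for g in selected)
    med_de = (
        median(gene_lengths[g] for g in de_in_selected)
        if de_in_selected
        else None
    )
    return med_selected, med_de


# Hypothetical toy inputs (not real data): gene lengths in base pairs,
# a set of DE genes, and the genes picked by each feature selection method.
gene_lengths = {"g1": 500, "g2": 1200, "g3": 3000,
                "g4": 4500, "g5": 800, "g6": 6000}
de_genes = {"g3", "g4", "g6"}
selections = {
    "NSVA": ["g2", "g3", "g4"],
    "NFS": ["g3", "g4", "g6"],
    "PCA": ["g1", "g2", "g5", "g3"],
}

results = {
    method: length_medians(gene_lengths, genes, de_genes)
    for method, genes in selections.items()
}

for method, (med_sel, med_de) in results.items():
    print(f"{method}: selected median = {med_sel}, DE median = {med_de}")
```

With real data, `gene_lengths` would come from a gene annotation and `de_genes` from whatever DE test is applied to the selected genes. Repeating the call for each selection size (for example 3000, 5000, and 8000 genes) would give the values plotted in a figure of this kind.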

