StickWRLD as an Interactive Visual Pre-Filter for Canceromics-Centric Expression Quantitative Trait Locus Data.

Rumpf RW, Wolock SL, Ray WC - Cancer Inform (2014)

Bottom Line: One of the significant impediments introduced by such burgeoning data is the difficulty in knowing what features to include or exclude from statistical models. By allowing the user to dynamically modify the retention parameters (both P and the residual, r), StickWRLD allows the user to identify significant correlations and disregard potential correlations that do not meet those same criteria - effectively filtering through all possible correlations quickly and identifying possible relationships of interest for further analysis. In addition to detecting high-probability correlations in this dataset, we were able to quickly identify gene-SNP correlations that would have gone undetected using more traditional approaches due to issues of low penetrance.


Affiliation: The Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA.

ABSTRACT
As datasets increase in complexity, the time required for analysis (both computational and human domain-expert) increases. One of the significant impediments introduced by such burgeoning data is the difficulty in knowing what features to include or exclude from statistical models. Simple tables of summary statistics rarely provide an adequate picture of the patterns and details of the dataset to enable researchers to make well-informed decisions about the adequacy of the models they are constructing. We have developed a tool, StickWRLD, which allows the user to visually browse through their data, displaying all possible correlations. By allowing the user to dynamically modify the retention parameters (both P and the residual, r), StickWRLD allows the user to identify significant correlations and disregard potential correlations that do not meet those same criteria - effectively filtering through all possible correlations quickly and identifying possible relationships of interest for further analysis. In this study, we applied StickWRLD to a semi-synthetic dataset constructed from two published human datasets. In addition to detecting high-probability correlations in this dataset, we were able to quickly identify gene-SNP correlations that would have gone undetected using more traditional approaches due to issues of low penetrance.
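The retention step described in the abstract can be sketched roughly as follows. This is a minimal illustration, not StickWRLD's actual code: it assumes the residual r for a pair of values is the observed joint frequency minus the frequency expected if the two variables were independent, and it assumes each candidate edge already carries a P value from some association test. The function names `residuals` and `retain` and the edge dictionary layout are hypothetical.

```python
from collections import Counter
from itertools import product


def residuals(col_a, col_b):
    """Return {(a, b): observed - expected} for every pair of values.

    Assumption: 'expected' is the product of the marginal frequencies,
    i.e. the joint frequency under independence.
    """
    n = len(col_a)
    joint = Counter(zip(col_a, col_b))
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    out = {}
    for a, b in product(freq_a, freq_b):
        observed = joint[(a, b)] / n
        expected = (freq_a[a] / n) * (freq_b[b] / n)
        out[(a, b)] = observed - expected
    return out


def retain(edges, max_p, min_abs_r):
    """Keep edges whose P is small enough and whose |r| is large enough.

    Each edge is a dict with hypothetical keys 'p' and 'r'.
    """
    return [e for e in edges if e["p"] <= max_p and abs(e["r"]) >= min_abs_r]
```

Lowering `min_abs_r` in such a filter admits weaker associations, which is the kind of adjustment the abstract describes for surfacing low-penetrance gene-SNP relationships.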


Figure 4: Reducing the residual further and eliminating edges that were not of interest revealed additional gene-SNP relationships of interest (A). Notably, there are several cases where the minor SNP allele is correlated with a change in expression (B), and one (C) where specific alleles of two SNPs differentially affect the expression of multiple genes. These effects were not seen at higher values of the residual because of low penetrance.

Mentions: At a residual of 0.015, many additional gene–SNP correlations are revealed (Fig. 4). To simplify the visualization, all SNP–SNP edges were removed programmatically so that only correlations of interest (gene–SNP) remain. Of significant interest is the discovery of several cases where the minor SNP allele is correlated to a change in expression (Fig. 4, panel B), and a case where two SNPs affect different genes depending on which allele is present (Fig. 4, panel C).
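The programmatic removal of SNP-SNP edges mentioned above might look something like the sketch below. The source does not show how this was done, so everything here is an assumption: edges are taken to be `(node_u, node_v, residual)` tuples, `node_kind` is a hypothetical lookup from node name to `"gene"` or `"snp"`, and the 0.015 default simply mirrors the residual quoted in the text.

```python
def gene_snp_edges(edges, node_kind, min_abs_r=0.015):
    """Keep only gene-SNP edges whose |residual| meets the threshold.

    edges     : iterable of (node_u, node_v, residual) tuples (assumed layout)
    node_kind : dict mapping node name -> "gene" or "snp" (hypothetical)
    """
    kept = []
    for u, v, r in edges:
        if abs(r) < min_abs_r:
            continue  # below the residual threshold
        # A set of the two kinds equals {"gene", "snp"} only for mixed pairs,
        # so SNP-SNP and gene-gene edges are dropped.
        if {node_kind[u], node_kind[v]} == {"gene", "snp"}:
            kept.append((u, v, r))
    return kept
```

Filtering by node type after thresholding keeps the view limited to the gene-SNP relationships of interest, regardless of how many SNP-SNP correlations pass the residual cutoff.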

