StickWRLD as an Interactive Visual Pre-Filter for Canceromics-Centric Expression Quantitative Trait Locus Data.

Rumpf RW, Wolock SL, Ray WC - Cancer Inform (2014)

Bottom Line: One of the significant impediments introduced by such burgeoning data is the difficulty in knowing what features to include or exclude from statistical models. By allowing the user to dynamically modify the retention parameters (both P and the residual, r), StickWRLD allows the user to identify significant correlations and disregard potential correlations that do not meet those same criteria - effectively filtering through all possible correlations quickly and identifying possible relationships of interest for further analysis. In addition to detecting high-probability correlations in this dataset, we were able to quickly identify gene-SNP correlations that would have gone undetected using more traditional approaches due to issues of low penetrance.


Affiliation: The Battelle Center for Mathematical Medicine, The Research Institute at Nationwide Children's Hospital, Columbus, OH, USA.

ABSTRACT
As datasets increase in complexity, the time required for analysis (both computational and human domain-expert) increases. One of the significant impediments introduced by such burgeoning data is the difficulty in knowing what features to include or exclude from statistical models. Simple tables of summary statistics rarely provide an adequate picture of the patterns and details of the dataset to enable researchers to make well-informed decisions about the adequacy of the models they are constructing. We have developed a tool, StickWRLD, which allows the user to visually browse through their data, displaying all possible correlations. By allowing the user to dynamically modify the retention parameters (both P and the residual, r), StickWRLD allows the user to identify significant correlations and disregard potential correlations that do not meet those same criteria - effectively filtering through all possible correlations quickly and identifying possible relationships of interest for further analysis. In this study, we applied StickWRLD to a semi-synthetic dataset constructed from two published human datasets. In addition to detecting high-probability correlations in this dataset, we were able to quickly identify gene-SNP correlations that would have gone undetected using more traditional approaches due to issues of low penetrance.



Figure 3: Additional relationships are revealed by further reducing the residual; note that the P value remained significant at 0.05 throughout the analysis. Several strong correlations between genes and SNPs can be seen as the bold dark connectors leading from the genes in the foreground to their corresponding SNPs in the background.



Mentions: Tuning the residual down by increments reveals additional correlations – again all SNP to SNP – until the residual is reduced to 0.05 (Fig. 2). Here, we see our first significant correlation (with P = 0.05) between expression levels of a gene and an SNP – specifically, CDH1 and rs35255374. Dialing the residual down to 0.025 reveals three additional gene to SNP relationships: CHD1 to 16:67369626; PCDH1 to 16:67374748; and CDH22 to rs35255374 (Fig. 3).
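The effect of dialing the residual threshold down can be illustrated with a minimal sketch. It assumes the residual r is the observed joint frequency of two values minus the frequency expected if the two columns were independent; the source does not spell out StickWRLD's exact computation, so this definition, the function names (`pairwise_residuals`, `retain`), and the toy rows are all hypothetical. Significance testing (the P parameter) is omitted to keep the example short.

```python
from collections import Counter
from itertools import combinations


def pairwise_residuals(rows):
    """Return (col_i, value_a, col_j, value_b, residual) for every column pair.

    ASSUMPTION: residual = observed joint frequency - product of marginal
    frequencies. This is an illustrative definition, not taken from the paper.
    """
    n = len(rows)
    n_cols = len(rows[0])
    # Count how often each value occurs in each column.
    marginals = [Counter(row[c] for row in rows) for c in range(n_cols)]
    results = []
    for i, j in combinations(range(n_cols), 2):
        # Count how often each pair of values co-occurs in columns i and j.
        joint = Counter((row[i], row[j]) for row in rows)
        for a in marginals[i]:
            for b in marginals[j]:
                observed = joint[(a, b)] / n
                expected = (marginals[i][a] / n) * (marginals[j][b] / n)
                results.append((i, a, j, b, observed - expected))
    return results


def retain(pairs, r_min):
    """Keep only pairs whose absolute residual meets the threshold."""
    return [p for p in pairs if abs(p[4]) >= r_min]


# Hypothetical toy data: column 0 is a binned expression level,
# columns 1 and 2 are alleles at two made-up SNP positions.
rows = [
    ("high", "A", "C"),
    ("high", "A", "C"),
    ("high", "A", "T"),
    ("low", "G", "C"),
    ("low", "G", "T"),
    ("low", "G", "T"),
    ("low", "A", "C"),
    ("high", "G", "T"),
]

pairs = pairwise_residuals(rows)
for threshold in (0.2, 0.1, 0.05):
    print(threshold, len(retain(pairs, threshold)))
```

Because every pair retained at a strict threshold is also retained at a looser one, lowering the threshold can only add connections to the view, which matches the behavior described above as relationships appear step by step.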

