In silico detection of sequence variations modifying transcriptional regulation.

Andersen MC, Engström PG, Lithwick S, Arenillas D, Eriksson P, Lenhard B, Wasserman WW, Odeberg J - PLoS Comput. Biol. (2007)

Bottom Line: Technological advances for measuring RNA abundance suggest that a significant number of undiscovered causal mutations may alter the regulation of gene transcription. The bioinformatics approach has been tested on a set of SNPs that are reported to have a regulatory function, as well as background SNPs. The bioinformatics software generated for the analysis has been implemented as a Web-based application system entitled RAVEN (regulatory analysis of variation in enhancers).

Affiliation: Department of Gene Technology, School of Biotechnology, AlbaNova University Center, Royal Institute of Technology (KTH), Stockholm, Sweden.

ABSTRACT
Identification of functional genetic variation associated with increased susceptibility to complex diseases can elucidate genes and underlying biochemical mechanisms linked to disease onset and progression. For genes linked to genetic diseases, most identified causal mutations alter an encoded protein sequence. Technological advances for measuring RNA abundance suggest that a significant number of undiscovered causal mutations may alter the regulation of gene transcription. However, it remains a challenge to separate causal genetic variations from linked neutral variations. Here we present an in silico driven approach to identify possible genetic variation in regulatory sequences. The approach combines phylogenetic footprinting and transcription factor binding site prediction to identify variation in candidate cis-regulatory elements. The bioinformatics approach has been tested on a set of SNPs that are reported to have a regulatory function, as well as background SNPs. In the absence of additional information about an analyzed gene, the poor specificity of binding site prediction is prohibitive to its application. However, when additional data is available that can give guidance on which transcription factor is involved in the regulation of the gene, the in silico binding site prediction improves the selection of candidate regulatory polymorphisms for further analyses. The bioinformatics software generated for the analysis has been implemented as a Web-based application system entitled RAVEN (regulatory analysis of variation in enhancers). The RAVEN system is available at http://www.cisreg.ca for all researchers interested in the detection and characterization of regulatory sequence variation.


Figure 6 (pcbi-0040005-g006): Distributions of TFBS Score Delta Values for Background SNPs, Regulatory SNPs, and Regulatory SNPs for Which the Affected TFBS Is Known. For the three leftmost boxes, the average score delta over all matches to any PWM in the JASPAR database was collected for every SNP. For the rightmost box, the score delta for the PWM corresponding to the verified TFBS was collected for every SNP.

Mentions: We also compared the score deltas obtained from analyzing the regulatory polymorphisms with known affected TFBS to the score deltas obtained when analyzing the larger data set used in Figure 2. When analyzing the regulatory polymorphisms with prior knowledge with all PWMs from the JASPAR database, the results were similar to those obtained for the larger set and the background. However, when the regulatory polymorphisms were analyzed with the PWMs for the respective verified TFBSs, the score delta was higher (Figure 6). In the analyses using all PWMs, we selected the mean score delta over all matches between the analyzed PWMs and the SNP, whereas in the analysis with the PWM of the verified transcription factor we used the PWM match giving the largest score delta for that particular PWM. This may have caused the lack of overlap between the interquartile ranges of score deltas in the “all” and “prior knowledge” analyses in Figure 6.
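The difference between the two aggregation choices described above can be illustrated with a minimal sketch. It is not the RAVEN implementation: the toy PWM, the uniform background, the log-odds scoring, and all function names are assumptions made for illustration, and no match threshold is applied (every window overlapping the SNP counts as a "match" here).

```python
import math

# Hypothetical toy PWM (one dict of base probabilities per position).
# It is NOT taken from JASPAR; it simply prefers the site "ACGT".
TOY_PWM = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform base frequency


def log_odds_score(pwm, site):
    """Sum of per-position log2(probability / background) for one site."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(pwm, site))


def score_deltas(pwm, ref_seq, snp_pos, alt_base):
    """Absolute score difference between alleles for each window covering the SNP."""
    alt_seq = ref_seq[:snp_pos] + alt_base + ref_seq[snp_pos + 1:]
    width = len(pwm)
    first = max(0, snp_pos - width + 1)
    last = min(snp_pos, len(ref_seq) - width)
    deltas = []
    for start in range(first, last + 1):
        ref = log_odds_score(pwm, ref_seq[start:start + width])
        alt = log_odds_score(pwm, alt_seq[start:start + width])
        deltas.append(abs(ref - alt))
    return deltas


def mean_delta_all_pwms(pwms, ref_seq, snp_pos, alt_base):
    """'All PWMs' style summary: mean delta over every window of every PWM."""
    pooled = [d for pwm in pwms for d in score_deltas(pwm, ref_seq, snp_pos, alt_base)]
    return sum(pooled) / len(pooled)


def max_delta_single_pwm(pwm, ref_seq, snp_pos, alt_base):
    """'Prior knowledge' style summary: largest delta for one chosen PWM."""
    return max(score_deltas(pwm, ref_seq, snp_pos, alt_base))


# Example: a C>T substitution at index 3 of a short made-up sequence.
mean_value = mean_delta_all_pwms([TOY_PWM], "TTACGTTT", 3, "T")
max_value = max_delta_single_pwm(TOY_PWM, "TTACGTTT", 3, "T")
```

Because a maximum over windows can never be smaller than the mean over the same windows, the second summary is biased upward relative to the first even for the same PWM, which is consistent with the caveat about the non-overlapping interquartile ranges in Figure 6.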

