A linear model for transcription factor binding affinity prediction in protein binding microarrays.

Annala M, Laurila K, Lähdesmäki H, Nykter M - PLoS ONE (2011)

Bottom Line: Our method was the best performer in the Dialogue for Reverse Engineering Assessments and Methods 5 (DREAM5) transcription factor/DNA motif recognition challenge. For the DREAM5 bonus challenge, we also developed an approach for the identification of transcription factors based on their PBM binding profiles. Our approach for TF identification achieved the best performance in the bonus challenge.


Affiliation: Department of Signal Processing, Tampere University of Technology, Tampere, Finland. matti.annala@tut.fi

ABSTRACT
Protein binding microarrays (PBM) are a high-throughput technology used to characterize protein-DNA binding. The arrays measure a protein's affinity toward thousands of double-stranded DNA sequences at once, producing a comprehensive binding specificity catalog. We present a linear model for predicting the binding affinity of a protein toward DNA sequences based on PBM data. Our model represents the measured intensity of an individual probe as a sum of the binding affinity contributions of the probe's subsequences. These subsequences characterize a DNA binding motif and can be used to predict the intensity of protein binding against arbitrary DNA sequences. Our method was the best performer in the Dialogue for Reverse Engineering Assessments and Methods 5 (DREAM5) transcription factor/DNA motif recognition challenge. For the DREAM5 bonus challenge, we also developed an approach for the identification of transcription factors based on their PBM binding profiles. Our approach for TF identification achieved the best performance in the bonus challenge.
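The idea of modeling a probe's intensity as a sum of subsequence contributions can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed subsequence length `k`, the ordinary least-squares fit, the function names, and the absence of any reverse-complement handling or subsequence selection are all assumptions made for illustration.

```python
import itertools

import numpy as np


def kmer_design_matrix(probes, k):
    """Count how often each length-k subsequence occurs in each probe."""
    kmers = ["".join(p) for p in itertools.product("ACGT", repeat=k)]
    index = {kmer: j for j, kmer in enumerate(kmers)}
    X = np.zeros((len(probes), len(kmers)))
    for i, probe in enumerate(probes):
        for start in range(len(probe) - k + 1):
            X[i, index[probe[start:start + k]]] += 1
    return X, kmers


def fit_affinities(probes, intensities, k):
    """Fit one affinity contribution per k-mer by least squares (assumed fit)."""
    X, kmers = kmer_design_matrix(probes, k)
    y = np.asarray(intensities, dtype=float)
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return dict(zip(kmers, coef))


def predict_intensity(sequence, affinities, k):
    """Predicted intensity = sum of the contributions of the k-mers present."""
    return sum(
        affinities[sequence[s:s + k]] for s in range(len(sequence) - k + 1)
    )
```

Once the per-subsequence contributions are fitted on measured probes, `predict_intensity` can score any DNA sequence of length at least `k`, which mirrors the abstract's claim that the learned subsequences apply to arbitrary sequences.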

Figure 6 (pone-0020059-g006): Quantile normalization recovers high intensity tails in saturated samples. The figure shows how quantile normalization changes the log-intensity histogram of the Foxo3 PBM sample, as an example of how the method can recover the high intensity tails in saturated PBM samples. The saturated probe intensities (highlighted in red) are recovered by fitting them to the consensus distribution.

Mentions: In the normalization step, the samples used in learning the motif models are quantile normalized. Quantile normalization assumes that the true intensity distributions (uncontaminated by experimental errors) of different transcription factors have roughly similar shapes. The validity of this assumption is subject to debate, but according to our tests, quantile normalization does improve the accuracy of our model's predictions. We suspect that this improvement is largely due to quantile normalization's ability to recover the high intensity tails in saturated PBM samples (Figure 6). Quantile normalization can also recover samples where an experimental error has resulted in a non-linear monotonic transformation of the probe intensities.
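The normalization step described above can be sketched as follows. This is a generic quantile normalization, not necessarily the authors' exact procedure: it assumes the consensus distribution is the mean of the sorted intensities across samples, and it breaks ties (such as a block of saturated probes sharing one maximum value) by probe order, which is arbitrary. The function name and array layout are hypothetical.

```python
import numpy as np


def quantile_normalize(samples):
    """Map every sample onto a shared consensus intensity distribution.

    samples: 2-D array, rows are probes and columns are PBM samples.
    """
    samples = np.asarray(samples, dtype=float)
    # Per-sample ordering of probes from lowest to highest intensity.
    order = np.argsort(samples, axis=0, kind="stable")
    sorted_vals = np.take_along_axis(samples, order, axis=0)
    # Assumed consensus: mean intensity at each rank across all samples.
    consensus = sorted_vals.mean(axis=1)
    # Give each probe the consensus value belonging to its rank.
    normalized = np.empty_like(samples)
    np.put_along_axis(normalized, order, consensus[:, None], axis=0)
    return normalized
```

Because only the rank of each probe within its sample is used, any strictly increasing distortion of a sample's intensities leaves the normalized values unchanged, which is why the method can undo a non-linear monotonic transformation.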

