Nonidentifiability of the source of intrinsic noise in gene expression from single-burst data.

Ingram PJ, Stumpf MP, Stark J - PLoS Comput. Biol. (2008)

Bottom Line: We show that there is a good fit between our theoretical distribution and that obtained from two different published experimental datasets. We then prove that, irrespective of the details of the model, the burst size distribution is always geometric and hence determined by a single parameter. Many different combinations of the biochemical rates for the constituent reactions of both transcription and translation will therefore lead to the same experimentally observed burst size distribution.


Affiliation: Department of Mathematics, Imperial College London, London, United Kingdom. piers.ingram@imperial.ac.uk

ABSTRACT
Over the last few years, experimental data on the fluctuations in gene activity between individual cells and within the same cell over time have confirmed that gene expression is a "noisy" process. This variation is in part due to the small number of molecules taking part in some of the key reactions that are involved in gene expression. One of the consequences of this is that protein production often occurs in bursts, each due to a single promoter or transcription factor binding event. Recently, the distribution of the number of proteins produced in such bursts has been experimentally measured, offering a unique opportunity to study the relative importance of different sources of noise in gene expression. Here, we provide a derivation of the theoretical probability distribution of these bursts for a wide variety of different models of gene expression. We show that there is a good fit between our theoretical distribution and that obtained from two different published experimental datasets. We then prove that, irrespective of the details of the model, the burst size distribution is always geometric and hence determined by a single parameter. Many different combinations of the biochemical rates for the constituent reactions of both transcription and translation will therefore lead to the same experimentally observed burst size distribution. It is thus impossible to identify different sources of fluctuations purely from protein burst size data or to use such data to estimate all of the model parameters. We explore methods of inferring these values when additional types of experimental data are available.
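The non-identifiability argument can be illustrated with a minimal sketch. It assumes a deliberately simplified two-reaction picture that is not taken from the paper: a single mRNA is either translated (hypothetical rate `k_translation`) or degraded (hypothetical rate `k_mrna_decay`), so the number of proteins made before decay is geometric with parameter p = k_translation / (k_translation + k_mrna_decay). All function names and rate values below are illustrative assumptions.

```python
import random


def burst_parameter(k_translation, k_mrna_decay):
    """Probability that the next event is translation rather than mRNA decay."""
    return k_translation / (k_translation + k_mrna_decay)


def geometric_pmf(n, p):
    """P(burst size = n) for a geometric distribution on n = 0, 1, 2, ..."""
    return (1.0 - p) * p ** n


def simulate_burst(p, rng):
    """Count proteins produced before the mRNA decays (competing events)."""
    n = 0
    while rng.random() < p:
        n += 1
    return n


def estimate_p(burst_sizes):
    """Maximum-likelihood estimate of p from observed burst sizes."""
    mean = sum(burst_sizes) / len(burst_sizes)
    return mean / (1.0 + mean)


# Two hypothetical rate pairs that differ tenfold but share the same ratio.
p_a = burst_parameter(0.5, 0.05)
p_b = burst_parameter(5.0, 0.5)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
bursts = [simulate_burst(p_a, rng) for _ in range(20000)]
p_hat = estimate_p(bursts)

print("p for rate pair A:", p_a)
print("p for rate pair B:", p_b)
print("estimate of p from simulated bursts:", p_hat)
```

Both rate pairs give the same p, so burst sizes simulated from either are statistically indistinguishable; the data constrain only the single parameter p, not the individual rates.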


Figure 6: Parameter estimation results with fixed protein degradation rate. The results of 10,000 runs of the Nelder-Mead maximisation of the log-likelihood for the parameters α0, α1, β1, and β2, with α2 determined by the relationship in Equation 6, and with the protein degradation rate, β3, set to 2.77×10−4, consistent with a half-life for β-gal of 60 mins. The panels in column A show the estimates of the values of the parameters and the percentage of times the Nelder-Mead algorithm converged to those values. The panels in column B are scattergrams of the values of the parameter estimates against the value of the log-likelihood. Each simulation is run 10,000 times to simulate a population of 10,000 cells, and each simulation is run for 5000 reaction steps. The starting values for the optimisation routine are: α0 = 0.01 s−1, α1 = 0.02 s−1, β1 = 0.1 s−1, and β2 = 0.007 s−1, and are based on previous simulation studies [16].

Mentions: We therefore fixed β3 to 1.92×10−4 s−1, corresponding to a protein half-life of one hour, and then used the same method as described above to estimate the other parameters α0, α1, α2, β1, and β2. We ran 10,000 simulations, as a relatively low number of runs converged (23.37%), with the others becoming trapped in a region with physically unrealistic (negative) reaction rates, and a log-likelihood of L≈−2100. Of the runs which converged, 2057 (88%) converged to a local maximum at L≈−9150, while 279 (12%) converged to the presumed global maximum at L≈−1100. The results for the runs which converged can be seen in Figure 6, whilst summary statistics for the runs which converged to the presumed global maximum are presented in Table 2.
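The arithmetic behind the fixed degradation rate and the reported convergence percentages can be checked with a short sketch. It assumes first-order protein decay, so that rate = ln 2 / half-life; the function and variable names are illustrative and do not come from the paper.

```python
import math


def degradation_rate_from_half_life(half_life_seconds):
    """First-order decay rate (per second) for a given half-life."""
    return math.log(2) / half_life_seconds


# A one-hour half-life, as stated in the text.
beta3 = degradation_rate_from_half_life(60 * 60)

# Counts of converged runs quoted in the text.
converged_local = 2057
converged_global = 279
total_converged = converged_local + converged_global

frac_local = converged_local / total_converged
frac_global = converged_global / total_converged

print(f"beta3 = {beta3:.3e} per second")
print(f"local maximum:  {frac_local:.1%} of converged runs")
print(f"global maximum: {frac_global:.1%} of converged runs")
```

Under this assumption, a one-hour half-life gives roughly 1.925×10−4 s−1, in line with the value fixed in the text, and the two counts reproduce the quoted 88% and 12% split. For comparison, 1/3600 ≈ 2.78×10−4 s−1 is the rate for which the mean lifetime, rather than the half-life, is one hour.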

