A scale invariant clustering of genes on human chromosome 7.
Bottom Line:
Over evolutionary timescales, tandem duplication, mutation, insertion, deletion and rearrangement could act at these gene sites through a stochastic birth, death and immigration process to yield a PG distribution. On the basis of the gene position data alone it was not possible to identify the biological model which best explained the observed clustering. However, the underlying PG statistical model implicated neutral evolutionary mechanisms as the basis for this clustering.
Affiliation: Department of Radiation Oncology, Ottawa Regional Cancer Centre, 503 Smyth, Ottawa, Ontario K1H 1C4, Canada. wayne.kendal@orcc.on.ca
ABSTRACT
Background: Vertebrate genes often appear to cluster within the background of nontranscribed genomic DNA. Here an analysis of the physical distribution of gene structures on human chromosome 7 was performed to confirm the presence of clustering, and to elucidate possible underlying statistical and biological mechanisms.
Results: Clustering of genes was confirmed by virtue of a variance of the number of genes per unit physical length that exceeded the respective mean. Further evidence for clustering came from a power function relationship between the variance and mean that possessed an exponent of 1.51. This power function implied that the spatial distribution of genes on chromosome 7 was scale invariant, and that the underlying statistical distribution had a Poisson-gamma (PG) form. A PG distribution for the spatial scattering of genes was validated by stringent comparisons of both the predicted variance to mean power function and its cumulative distribution function to data derived from chromosome 7.
Conclusion: The PG distribution was consistent with at least two different biological models. In the microrearrangement model, the number of genes per unit length of chromosome represented the contribution of a random number of smaller chromosomal segments that had originated by random breakage and reconstruction of more primitive chromosomes. Each of these smaller segments would have necessarily contained (on average) a gamma distributed number of genes. In the gene cluster model, genes would be scattered randomly to begin with. Over evolutionary timescales, tandem duplication, mutation, insertion, deletion and rearrangement could act at these gene sites through a stochastic birth, death and immigration process to yield a PG distribution. On the basis of the gene position data alone it was not possible to identify the biological model which best explained the observed clustering. However, the underlying PG statistical model implicated neutral evolutionary mechanisms as the basis for this clustering.
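The microrearrangement picture in the abstract (a random number of segments, each contributing a gamma distributed amount) can be illustrated with a short simulation. This is a minimal sketch, assuming the standard Tweedie parametrisation of the compound Poisson-gamma distribution, which this abstract does not spell out. The exponent p = 1.51 is taken from the abstract; the dispersion `phi`, the chosen means, the sample size and the function name `sample_poisson_gamma` are illustrative assumptions, not values or code from the article. Under that parametrisation the variance equals phi times the mean raised to the power p.

```python
import numpy as np

def sample_poisson_gamma(mu, p, phi, size, rng):
    """Draw compound Poisson-gamma variates with mean mu and variance phi * mu**p.

    Valid for 1 < p < 2. Each draw is the sum of N gamma variates,
    where N is Poisson distributed (N = 0 gives exactly 0).
    """
    lam = mu ** (2 - p) / (phi * (2 - p))   # Poisson rate: expected number of segments
    alpha = (2 - p) / (p - 1)               # gamma shape of each segment's contribution
    beta = phi * (p - 1) * mu ** (p - 1)    # gamma scale of each segment's contribution
    n = rng.poisson(lam, size=size)
    out = np.zeros(size)
    pos = n > 0
    # The sum of n independent Gamma(alpha, beta) variates is Gamma(n * alpha, beta).
    out[pos] = rng.gamma(n[pos] * alpha, beta)
    return out

rng = np.random.default_rng(1)
p, phi = 1.51, 1.0                 # p from the abstract; phi is an assumed value
target_means = [2.0, 8.0, 32.0]    # assumed, stands in for different bin sizes

sample_means, sample_vars = [], []
for mu in target_means:
    y = sample_poisson_gamma(mu, p, phi, size=200_000, rng=rng)
    sample_means.append(y.mean())
    sample_vars.append(y.var(ddof=1))

# Slope of log(variance) against log(mean) estimates the exponent p.
slope, intercept = np.polyfit(np.log(sample_means), np.log(sample_vars), 1)
print(f"estimated exponent: {slope:.3f}")
```

Note that these simulated values are continuous with a point mass at zero, whereas gene counts are integers, so the sketch illustrates the variance to mean behaviour of the PG form rather than reproducing the chromosome 7 analysis.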
To determine whether this clustering persisted at other measurement scales, the variance and mean number of gene structures per bin were estimated for a range of bin sizes. Figure 2 provides these data on a log-log plot of variance versus mean. The logarithmically transformed points appeared to describe a linear relationship. Indeed, the squared correlation coefficient between the transformed variance and mean estimates was r² = 0.997, substantiating a linear relationship. In addition, the residuals between the logarithmically transformed variables and a trial linear relationship were essentially negligible and normally distributed about zero (Fig. 2 inset). Note that the linear relationship tested here against the data in Fig. 2 was obtained not from a regression fit of the logarithmically transformed data, but from a statistical model that was fitted to the chromosome 7 data and that will be presented later in this article.
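The binning procedure described above can be sketched as follows. This is a hedged illustration only: the positions are synthetic and uniformly random (not chromosome 7 data), the chromosome length and bin sizes are assumed values, and the function names are hypothetical. It also fits the line by ordinary least squares on the log-transformed estimates, which is explicitly not how the article obtained its line (that came from a fitted statistical model).

```python
import numpy as np

def variance_mean_by_bin_size(positions, length, bin_sizes):
    """Return arrays of the mean and variance of counts per bin, one entry per bin size."""
    positions = np.asarray(positions, dtype=float)
    means, variances = [], []
    for size in bin_sizes:
        n_bins = int(length // size)            # any partial bin at the end is dropped
        edges = np.arange(n_bins + 1) * size    # equal-width, non-overlapping bins
        counts, _ = np.histogram(positions, bins=edges)
        means.append(counts.mean())
        variances.append(counts.var(ddof=1))    # sample variance across bins
    return np.array(means), np.array(variances)

def fit_log_log_line(means, variances):
    """Least-squares line through log10(variance) versus log10(mean)."""
    x, y = np.log10(means), np.log10(variances)
    slope, intercept = np.polyfit(x, y, 1)
    r_squared = np.corrcoef(x, y)[0, 1] ** 2
    return slope, intercept, r_squared

# Synthetic example: 20,000 positions scattered uniformly along an assumed length.
rng = np.random.default_rng(0)
length = 1_000_000
positions = rng.uniform(0, length, size=20_000)
bin_sizes = [1_000, 2_000, 5_000, 10_000, 20_000]

means, variances = variance_mean_by_bin_size(positions, length, bin_sizes)
slope, intercept, r2 = fit_log_log_line(means, variances)
print(f"slope = {slope:.2f}, r^2 = {r2:.3f}")
```

For uniformly scattered positions the counts are close to Poisson, so the variance tracks the mean and the fitted slope should be near 1. Clustered positions would be expected to give variances above the mean and a steeper slope, as with the exponent of 1.51 reported for chromosome 7.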