Preferential localization of human origins of DNA replication at the 5'-ends of expressed genes and at evolutionarily conserved DNA sequences.

Valenzuela MS, Chen Y, Davis S, Yang F, Walker RL, Bilke S, Lueders J, Martin MM, Aladjem MI, Massion PP, Meltzer PS - PLoS ONE (2011)

Bottom Line: Our results suggest that the program for origin activation is largely conserved among different cell types. Also, our work supports recent studies connecting transcription initiation with replication, and in addition suggests that evolutionarily conserved intergenic sequences have the potential to participate in origin selection. Overall, our observations suggest that replication origin selection is a stochastic process significantly dependent upon local accessibility to replication factors.


Affiliation: Genetics Branch, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, Maryland, United States of America. mvalenzuela@mmc.edu

ABSTRACT

Background: Replication of mammalian genomes requires the activation of thousands of origins which are both spatially and temporally regulated by as yet unknown mechanisms. At the most fundamental level, our knowledge about the distribution pattern of origins in each of the chromosomes, among different cell types, and whether the physiological state of the cells alters this distribution is at present very limited.

Methodology/principal findings: We have used standard λ-exonuclease-resistant nascent DNA preparations in the size range of 0.7-1.5 kb obtained from the breast cancer cell line MCF-7, hybridized to a custom tiling array containing 50-60 nt probes evenly distributed among genic and non-genic regions covering about 1% of the human genome. A similar DNA preparation was used for high-throughput DNA sequencing. Array experiments were also performed with DNA obtained from BT-474 and H520 cell lines. By determining the sites showing nascent DNA enrichment, we have localized several thousand origins of DNA replication. Our major findings are: (a) both array and DNA sequencing assay methods produced essentially the same origin distribution profile; (b) origin distribution is largely conserved (>70%) in all cell lines tested; (c) origins are enriched at the 5' ends of expressed genes and at evolutionarily conserved intergenic sequences; and (d) ChIP-on-chip experiments in MCF-7 showed an enrichment of H3K4Me3 and RNA Polymerase II chromatin binding sites at origins of DNA replication.

Conclusions/significance: Our results suggest that the program for origin activation is largely conserved among different cell types. Also, our work supports recent studies connecting transcription initiation with replication, and in addition suggests that evolutionarily conserved intergenic sequences have the potential to participate in origin selection. Overall, our observations suggest that replication origin selection is a stochastic process significantly dependent upon local accessibility to replication factors.


pone-0017308-g006: Association of origin peaks with evolutionarily conserved DNA sequences. (A) Composite average conservation score around the highest point of origin peaks (peak-to-trough ratio >1, log2 scale, total of 8281 initiation sites; or peak-to-trough ratio >1.5, total of 4695 sites). Red lines with standard error bars denote the position of an equivalent number of sites randomly selected to develop a reference line. The difference between the conservation score at the center of nascent DNA peaks with peak-to-trough ratio >1 and that at the randomly selected sites was highly significant (t-test, p = 1.1 × 10^-14). (B) Composite average conservation score of non-genic conserved sequences around the highest point of origin peaks (peak-to-trough ratio >1, log2 scale; total of 4763). Red lines with standard error bars denote the position of an equivalent number of sites randomly selected to develop a reference line. The difference between the conservation score of non-genic conserved sequences at the center of origin peaks with peak-to-trough ratio >1 and that at the randomly selected sites was statistically significant (t-test, p = 2.9 × 10^-4).

Mentions: Because origin peaks were not confined to genes or their 5′ ends, we sought to determine if other features of the genome were significantly related to their localization in intergenic regions. DNA sequence comparison of the human genome with other vertebrates has uncovered significant conservation of non-coding DNA sequences, suggesting a functional role for these sequences [22]. Visual inspection of the conserved sequences among the human, chimpanzee, mouse, rat, and chicken genomes (UCSC genome browser hg16 build, table mxPt1 Mm3RnGg_pHMM) along the regions covered by our array suggested a correlation of origin peaks with the position of conserved elements. We therefore developed a composite average conservation score around the highest point of the origin peaks (peak heights with at least 1 or 1.5 log2-fold changes). Fig. 6A (green lines) demonstrates an association between the average conservation score and the highest peak enrichment point (solid and dashed green lines for peak/trough ratios of >1.0 and >1.5 log2-fold, respectively). At peak height log-fold >1.0, the Pearson correlation coefficient was found to be 0.9524, p = 1.19 × 10^-30. To further assess the significance of this finding, we selected a similar number of locations at random and calculated the average conservation scores along these locations (Fig. 6A, red line). No significant correlation was found. In contrast, a t-test performed to compare the average conservation score at origin peaks versus random locations was found to be highly significant (p = 1.1 × 10^-14).

To ascertain if the correlation between origin peaks and conserved sequences also held for non-genic regions, we selected for analysis intergenic regions that were separated by at least 1000 bp from the nearest genes on either end of the gene-free segment (an illustration of such a region is shown in Supporting Figure S12). Randomly selected sites were subjected to the same criteria. The results shown in Fig. 6B indicate that a highly significant correlation still remains (Pearson correlation coefficient = 0.915, p = 9 × 10^-24) at these conserved non-genic regions. When compared to the randomly selected sites, the t-test p-value (p = 2.95 × 10^-4) was also found to be significant. Similar results were found in the other cell lines used in this study (Supporting Figure S7). An example of the association of origins with evolutionarily conserved regions is illustrated for a 50 kb intergenic segment on chr17 containing several highly conserved sequences (Supporting Figure S12). These results are consistent with the possibility that evolutionarily conserved elements define functionally active chromatin available as preferred sites of replication initiation.
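The composite analysis described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the conservation track, the peak positions, the window size, and the function names `composite_profile` and `welch_t` are all hypothetical, and a Welch t statistic stands in for the t-test whose exact variant the text does not state. The sketch averages a per-base conservation score at each offset from a set of peak summits, does the same for an equal number of randomly chosen control sites, and compares the scores at the centers.

```python
import math
import random
from statistics import mean, variance


def composite_profile(scores, centers, half_window):
    """Average conservation score at each offset from the given centers."""
    profile = []
    for off in range(-half_window, half_window + 1):
        # Collect the score at this offset for every center that stays in range.
        vals = [scores[c + off] for c in centers if 0 <= c + off < len(scores)]
        profile.append(mean(vals))
    return profile


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


# Synthetic data (assumption): low background conservation with a
# triangular bump of elevated conservation around each hypothetical peak.
random.seed(0)
n = 20000
scores = [random.random() * 0.2 for _ in range(n)]
peaks = list(range(500, n - 500, 1000))  # hypothetical origin peak summits
for p in peaks:
    for off in range(-50, 51):
        scores[p + off] += 0.5 * (1 - abs(off) / 50)

# Equal number of randomly selected control sites, as in the reference line.
controls = [random.randrange(500, n - 500) for _ in peaks]

half_window = 200
peak_profile = composite_profile(scores, peaks, half_window)
ctrl_profile = composite_profile(scores, controls, half_window)

# Compare conservation at the centers of peaks versus control sites.
t_stat = welch_t([scores[p] for p in peaks], [scores[c] for c in controls])

print("mean score at peak centers:   ", round(peak_profile[half_window], 3))
print("mean score at control centers:", round(ctrl_profile[half_window], 3))
print("Welch t statistic:            ", round(t_stat, 2))
```

With real data, `scores` would be replaced by a conservation track such as the one cited above and `peaks` by the called origin summits; for the non-genic analysis, both peak and control sites would first be restricted to intergenic segments at least 1000 bp away from the nearest gene.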

