Sequence features of yeast and human core promoters that are predictive of maximal promoter activity.

Lubliner S, Keren L, Segal E - Nucleic Acids Res. (2013)

Bottom Line: These features are mainly located in the region 75 bp upstream and 50 bp downstream of the main transcription start site, and their associations hold for both constitutively active promoters and promoters that are induced or repressed in specific conditions. Our results unravel several architectural features of yeast core promoters and suggest that the yeast core promoter sequence downstream of the TATA box (or of similar sequences involved in recruitment of the pre-initiation complex) is a major determinant of maximal promoter activity. We further show that human core promoters also contain features that are indicative of maximal promoter activity; thus, our results emphasize the important role of the core promoter sequence in transcriptional regulation.


Affiliation: Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel.

ABSTRACT
The core promoter is the region in which RNA polymerase II is recruited to the DNA and acts to initiate transcription, but the extent to which the core promoter sequence determines promoter activity levels is largely unknown. Here, we identified several base content and k-mer sequence features of the yeast core promoter sequence that are highly predictive of maximal promoter activity. These features are mainly located in the region 75 bp upstream and 50 bp downstream of the main transcription start site, and their associations hold for both constitutively active promoters and promoters that are induced or repressed in specific conditions. Our results unravel several architectural features of yeast core promoters and suggest that the yeast core promoter sequence downstream of the TATA box (or of similar sequences involved in recruitment of the pre-initiation complex) is a major determinant of maximal promoter activity. We further show that human core promoters also contain features that are indicative of maximal promoter activity; thus, our results emphasize the important role of the core promoter sequence in transcriptional regulation.

Figure 7. Human core promoter sequence signals differ between constitutive TSSs with different maximal expression. (A) Mean nucleotide, k-mer and TATA box content, computed using a sliding window (20 bp long, 10 bp step) over the [−200, 100] region around TSSs that are constitutively expressed in 10 different human cell lines (37). Plots are arranged in three columns: constitutive TSSs with high maximal expression on the left, constitutive TSSs with low maximal expression in the middle and all constitutive TSSs on the right. The vertical dashed lines mark the location of the TSS. The horizontal dotted lines assist with the comparison of plots between columns. SP1: the GGGCGG/CCGCCC 6-mers of the SP1 TF binding motif consensus. (B) Fifty of the above constitutive TSSs were of RPs, and they were divided into two subsets of 25 higher expressed and 25 lower expressed (based on their maximal expression). Mean nucleotide content around the TSSs is shown for the two subsets.
© Copyright Policy - creative-commons
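
The sliding-window computation described in the caption can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, and it assumes each TSS is represented by a 300-character sequence covering [−200, 100) with the TSS at index 200 (the exact coordinate convention is not stated in the source).

```python
def sliding_windows(start=-200, end=100, width=20, step=10):
    """Yield (window_start, window_end) pairs, end exclusive.

    Coordinates are relative to the TSS; the half-open convention
    is an assumption made for this sketch.
    """
    pos = start
    while pos + width <= end:
        yield pos, pos + width
        pos += step


def window_base_fraction(seq, bases, region_start=-200, width=20, step=10):
    """Per-window fraction of positions whose base is in `bases`."""
    seq = seq.upper()
    region_end = region_start + len(seq)
    fractions = []
    for w_start, w_end in sliding_windows(region_start, region_end, width, step):
        # Convert TSS-relative coordinates to string indices.
        window = seq[w_start - region_start:w_end - region_start]
        fractions.append(sum(base in bases for base in window) / width)
    return fractions


def mean_profile(seqs, bases, **kwargs):
    """Average the per-window fractions over a set of TSS sequences."""
    profiles = [window_base_fraction(s, bases, **kwargs) for s in seqs]
    return [sum(column) / len(column) for column in zip(*profiles)]
```

Calling `mean_profile(seqs, "GC")` on the high-expression and low-expression sets separately would give one G + C curve per column of panel A; passing `"A"`, `"C"`, `"G"` or `"T"` gives the mononucleotide curves.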

Mentions: Similar to our analysis in yeast (see above), we analyzed various sequence signals within the [−200, 100] region around the TSSs, including base content (mononucleotides and G + C), CpG and GpC content, as well as the percentage of TSSs with TATA box hits, or with hits of 6-mers of the SP1 transcription factor motif consensus (GGGCGG or its reverse complement CCGCCC). For all of these sequence signals (Figure 7A) there were significant differences (see rank-sum P-values in Supplementary Figure S4) between the set of high maximal expression TSSs (Figure 7A, left column) and the set of low maximal expression TSSs (Figure 7A, middle column).
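
Two of the signals named above, dinucleotide content and the percentage of TSSs with an SP1 6-mer hit, can be illustrated with a short sketch. The helper names are hypothetical, and the choices made here (counting overlapping dinucleotide occurrences, and treating any exact match to either 6-mer as a hit) are assumptions rather than the authors' documented procedure. The rank-sum comparison between the two TSS sets is not reproduced.

```python
# SP1 consensus 6-mer and its reverse complement, as given in the text.
SP1_6MERS = ("GGGCGG", "CCGCCC")


def count_dinucleotide(seq, dinuc):
    """Count occurrences of a dinucleotide (e.g. 'CG' for CpG, 'GC' for GpC)."""
    seq = seq.upper()
    return sum(seq[i:i + 2] == dinuc for i in range(len(seq) - 1))


def has_kmer_hit(seq, kmers=SP1_6MERS):
    """True if the sequence contains an exact match to any of the k-mers."""
    seq = seq.upper()
    return any(kmer in seq for kmer in kmers)


def percent_with_hit(seqs, kmers=SP1_6MERS):
    """Percentage of sequences that contain at least one k-mer hit."""
    return 100.0 * sum(has_kmer_hit(s, kmers) for s in seqs) / len(seqs)
```

Applied per sliding window rather than per whole region, the same helpers would yield position-resolved CpG, GpC and SP1 curves of the kind shown in Figure 7A.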

