The distribution of inverted repeat sequences in the Saccharomyces cerevisiae genome.

Strawbridge EM, Benson G, Gelfand Y, Benham CJ - Curr. Genet. (2010)

Bottom Line: We find that the S. cerevisiae genome is significantly enriched in IRs relative to random. However, the S. cerevisiae genome is not enriched in those IRs that would extrude cruciforms, suggesting that this is not a common event. Various explanations for these results are considered.


Affiliation: Department of Mathematics, University of Chicago, IL 60637, USA. emstrawb@math.uchicago.edu

ABSTRACT
Although a variety of possible functions have been proposed for inverted repeat sequences (IRs), it is not known which of them might occur in vivo. We investigate this question by assessing the distributions and properties of IRs in the Saccharomyces cerevisiae (SC) genome. Using the IRFinder algorithm we detect 100,514 IRs having copy length greater than 6 bp and spacer length less than 77 bp. To assess statistical significance we also determine the IR distributions in two types of randomization of the S. cerevisiae genome. We find that the S. cerevisiae genome is significantly enriched in IRs relative to random. The S. cerevisiae IRs are significantly longer and contain fewer imperfections than those from the randomized genomes, suggesting that processes to lengthen and/or correct errors in IRs may be operative in vivo. The S. cerevisiae IRs are highly clustered in intergenic regions, while their occurrence in coding sequences is consistent with random. Clustering is stronger in the 3' flanks of genes than in their 5' flanks. However, the S. cerevisiae genome is not enriched in those IRs that would extrude cruciforms, suggesting that this is not a common event. Various explanations for these results are considered.
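To make the comparison described in the abstract concrete, the following is a minimal sketch in Python of counting inverted repeats in a sequence and in a shuffled copy of it. It is not the IRFinder algorithm: it only reports perfect "seeds" of a fixed copy length, with no extension and no tolerance for imperfections, and the shuffle is a plain base-composition-preserving permutation, which may differ from the two randomizations used in the paper. The function names and the defaults `copy_len=7` and `max_spacer=76` (chosen to match "greater than 6 bp" and "less than 77 bp") are assumptions made for illustration.

```python
import random

# Translation table for complementing an upper-case DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_perfect_ir_seeds(seq, copy_len=7, max_spacer=76):
    """Naive scan for perfect inverted-repeat seeds (hypothetical helper).

    Returns (start, spacer) pairs such that seq[start:start + copy_len]
    is followed, after `spacer` bases, by its exact reverse complement.
    """
    hits = []
    n = len(seq)
    for start in range(n - 2 * copy_len + 1):
        target = reverse_complement(seq[start:start + copy_len])
        for spacer in range(max_spacer + 1):
            right_start = start + copy_len + spacer
            if right_start + copy_len > n:
                break  # the right copy would run past the end of the sequence
            if seq[right_start:right_start + copy_len] == target:
                hits.append((start, spacer))
    return hits


def shuffled(seq, rng):
    """Return a random permutation of seq, preserving its base composition."""
    bases = list(seq)
    rng.shuffle(bases)
    return "".join(bases)


if __name__ == "__main__":
    # Toy sequence: a 7-bp copy, a 3-bp spacer, then the reverse complement.
    toy = "ACGTTGC" + "AAA" + reverse_complement("ACGTTGC")
    print(find_perfect_ir_seeds(toy))

    # Compare seed counts in a sequence and in a shuffled version of it.
    rng = random.Random(0)
    seq = "".join(rng.choice("ACGT") for _ in range(2000))
    print(len(find_perfect_ir_seeds(seq)),
          len(find_perfect_ir_seeds(shuffled(seq, rng))))
```

Repeating the shuffle many times would give an empirical null distribution of counts against which the observed count could be judged; the scan is quadratic-ish in practice and is meant for short test sequences, not a whole genome.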

Fig. 10: The mean overlap number Nbp is shown for 3,832 genes aligned at their start (a), and stop positions (b). The results for the S. cerevisiae genome are shown in blue, while those for the RA- and R-genomes are in red and green, respectively. We compared the distributions at each position using the Kolmogorov–Smirnov test. The p values assessing statistical significance found this way at each position are plotted logarithmically in parts (c) and (d). The threshold for significance is shown as a horizontal line in each case. We note that significance can occur through either enrichment or paucity relative to random. This shows a significant enrichment of IRs in the 5′ and 3′ flanks of genes, with greater enrichment in the downstream, 3′ flanks. Within coding regions IRs occur at rates consistent with random, given their base composition. Within the first 80 bp after the gene start there are significantly fewer IRs than expected at random.
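The per-position comparison mentioned in the caption can be illustrated with a small two-sample Kolmogorov–Smirnov sketch in Python. This is written under assumptions: the caption does not say which variant of the test or which p-value approximation was used, so the code below uses the standard asymptotic Kolmogorov series, and the function names and toy data are hypothetical.

```python
import math


def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between the empirical CDFs."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Step both empirical CDFs past every value equal to x (handles ties).
        while i < n and a[i] == x:
            i += 1
        while j < m and b[j] == x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic p-value, Q(lam) = 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 lam^2)."""
    lam = d * math.sqrt(n * m / (n + m))
    if lam < 0.2:
        return 1.0  # Q(lam) is indistinguishable from 1 here; series converges slowly
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))


if __name__ == "__main__":
    # Hypothetical overlap counts at one aligned position, across genes,
    # in the real genome and in one randomized genome.
    real = [0, 1, 1, 2, 3, 3, 4, 5]
    rand = [0, 0, 0, 1, 1, 1, 2, 2]
    d = ks_statistic(real, rand)
    print(d, ks_pvalue(d, len(real), len(rand)))
```

Applying the two functions once per aligned position, and plotting the logarithm of the resulting p-values, would give curves of the kind described for panels (c) and (d). The asymptotic formula is only approximate for small samples, and because the test is two-sided a small p-value alone does not say whether the position is enriched or depleted.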


Mentions: The results of this analysis are shown in Fig. 10a for 5′ flanks, and in Fig. 10b for 3′ flanks. We see that in the S. cerevisiae genome the IR density substantially increases both just before gene starts and just after gene stops, the latter being the larger. Comparison with the RA randomizations shows that approximately half of the upstream enrichment and a quarter of the downstream enrichment can be attributed to the difference in base composition between intergenic and genic regions. This is consistent with the findings of others (Lillo et al. 2002; Lisnic et al. 2005; Lu et al. 2007).

