Genomic regions with distinct genomic distance conservation in vertebrate genomes.

Sun H, Skogerbø G, Zheng X, Liu W, Li Y - BMC Genomics (2009)

Bottom Line: Among HCE pairs, we found that some consistently preserve highly conserved interval distance among genomes while others have relatively low distance conservation. Both groups of IHRs are significantly enriched for CpG islands. The data suggest that subsets of HCE pairs may undergo different evolutionary paths in light of their genomic distance conservation, and that sets of genomic regions pertaining to HCEs, as well as the region in which HCEs reside, should be treated as integrated domains.


Affiliation: Key Laboratory of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, PR China. sunhong@scbit.org

ABSTRACT

Background: A number of vertebrate highly conserved elements (HCEs) have been detected, and their genomic interval distances have been reported to be more conserved than those of protein-coding genes among mammalian genomes. A characteristic of the human versus non-mammalian comparisons is a bimodal distribution of the relative distance difference (RDD) of conserved consecutive HCE pairs, and it is difficult to attribute such a profile to a random assortment. We therefore undertook an analysis of the human genomic regions confined by consecutive HCE pairs common to eight genomes (human, mouse, rat, chicken, frog, zebrafish, Tetraodon and fugu).

Results: Among HCE pairs, we found that some consistently preserve highly conserved interval distance among genomes while others have relatively low distance conservation. Using a partition method, we detected two groups of inter-HCE regions (IHRs) with distinct distance conservation patterns in vertebrate genomes: IHR1s, which are bordered by HCE pairs with relatively small distance variation, and IHR2s, with larger distance difference values. Compared to random background, annotated repeat sequences are significantly less frequent in IHR1s than in IHR2s, which reflects a correlation between repeat sequences and the length expansion of IHRs. Both groups of IHRs are unexpectedly more enriched in human indel (i.e. insertion and deletion) polymorphisms than random background. The correlation between the percentage of conserved sequence and human IHR length was stronger for IHR1s than for IHR2s. Both groups of IHRs are significantly enriched for CpG islands.

Conclusion: The data suggest that subsets of HCE pairs may undergo different evolutionary paths in light of their genomic distance conservation, and that sets of genomic regions pertaining to HCEs, as well as the region in which HCEs reside, should be treated as integrated domains.


Figure 2: Median /RDD/ for HCE pairs of IHRs. The median /RDD/ of IHR2s was much higher than that of IHR1s for every pairwise comparison of two genomes.

Mentions: In addition to the persistent nature of the two-peak distribution profiles, a remaining question is whether any other characteristics pertain to the regions confined by the HCE pairs common to the human and five non-mammalian genomes. To test this, we divided the 403 HCE pairs into two groups by using a partitioning clustering method based on the matrix of absolute RDD (/RDD/) values for the human versus non-mammalian comparisons (Methods). RDD values of group one HCE pairs are centered around zero (Figure 1B), whereas those of group two are more widely scattered around a more negative value (Figure 1C). The distances between group two pairs (mean 46 Kb) are significantly longer than the distances between group one pairs (mean 2.8 Kb) [see Additional file 9; Wilcoxon test p value = 2.2e-16]. The /RDD/ value of two consecutive HCEs has been reported to be positively correlated with the distance between the pair [5]; we see here a reflection of the same correlation. We call the inter-HCE regions IHRs and subsequently classify the IHRs into two types based on the above-mentioned partitioning result [see Additional file 10]. We obtained 188 IHRs (termed IHR1s, which are bordered by HCE pairs with relatively small /RDD/ values) and 215 IHRs (termed IHR2s, which are bordered by two consecutive HCEs with larger /RDD/ values). All 403 of these HCE pairs are also detected in the rodents. An intriguing observation is that for every pairwise comparison among the eight genomes, the median /RDD/ values for HCE pairs of IHR2s are consistently much higher than those of IHR1s [Figure 2, see Additional file 11]. Given the persistent nature of the distinct distance conservation of the two groups of IHRs, it is difficult to assume that such a profile was the result of a random assortment. Rather, it seems more likely that subsets of HCE pairs may undergo different evolutionary paths in the sense of genomic distance conservation.
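
The partitioning step described above can be illustrated with a minimal sketch. The excerpt does not name the clustering algorithm, so plain two-cluster k-means (Lloyd's algorithm) is swapped in here as a stand-in; it is not the authors' method. The function names `two_group_partition` and `column_medians` are hypothetical, and the small `abs_rdd` matrix is made-up toy data (rows are HCE pairs, columns are human versus non-mammalian comparisons), not values from the paper.

```python
from statistics import median


def two_group_partition(rows, max_iter=100):
    """Split rows of absolute RDD values into two groups with 2-means.

    Returns one label per row: 0 for the low-/RDD/ group (IHR1-like),
    1 for the high-/RDD/ group (IHR2-like).
    """
    # Deterministic start: the rows with the smallest and largest totals.
    totals = [sum(r) for r in rows]
    centers = [
        list(rows[totals.index(min(totals))]),
        list(rows[totals.index(max(totals))]),
    ]
    labels = None
    for _ in range(max_iter):
        # Assign each row to the nearest center (squared Euclidean distance).
        new_labels = [
            min((0, 1), key=lambda k: sum((x - c) ** 2
                                          for x, c in zip(row, centers[k])))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the column-wise mean of its members.
        for k in (0, 1):
            members = [r for r, lab in zip(rows, labels) if lab == k]
            if members:
                centers[k] = [sum(col) / len(members) for col in zip(*members)]
    # Make sure label 0 always denotes the group with the smaller center.
    if sum(centers[0]) > sum(centers[1]):
        labels = [1 - lab for lab in labels]
    return labels


def column_medians(rows, labels, group):
    """Median /RDD/ per comparison (column) for the rows in one group."""
    members = [r for r, lab in zip(rows, labels) if lab == group]
    return [median(col) for col in zip(*members)]


# Toy matrix (illustrative only): 6 HCE pairs x 5 comparisons.
abs_rdd = [
    [0.02, 0.05, 0.04, 0.03, 0.06],
    [0.01, 0.03, 0.02, 0.05, 0.04],
    [0.04, 0.02, 0.06, 0.01, 0.03],
    [0.55, 0.70, 0.62, 0.80, 0.75],
    [0.60, 0.65, 0.71, 0.77, 0.69],
    [0.48, 0.58, 0.66, 0.72, 0.81],
]

labels = two_group_partition(abs_rdd)
low_medians = column_medians(abs_rdd, labels, 0)   # IHR1-like group
high_medians = column_medians(abs_rdd, labels, 1)  # IHR2-like group
print(labels)
print(low_medians)
print(high_medians)
```

Comparing `low_medians` with `high_medians` column by column mirrors the per-comparison median /RDD/ contrast summarized in Figure 2. A medoid-based partitioning method could be substituted without changing the rest of the sketch.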

