Detecting evolutionary strata on the human x chromosome in the absence of gametologous y-linked sequences.

Pandey RS, Wilson Sayres MA, Azad RK - Genome Biol Evol (2013)

Bottom Line: Mammalian sex chromosomes arose from a pair of homologous autosomes that differentiated into the X and Y chromosomes following a series of recombination suppression events between the X and Y. We have developed an integrative method that combines a top-down, recursive segmentation algorithm with a bottom-up, agglomerative clustering algorithm to decipher compositionally distinct regions on the X, which reflect regions of unique X-Y divergence. The older strata, from the first up to the third stratum, have remained poorly resolved due to paucity of X-Y gametologs.


Affiliation: Department of Biological Sciences, University of North Texas.

ABSTRACT
Mammalian sex chromosomes arose from a pair of homologous autosomes that differentiated into the X and Y chromosomes following a series of recombination suppression events between the X and Y. The stepwise recombination suppressions from the distal long arm to the distal short arm of the chromosomes are reflected as regions with distinct X-Y divergence, referred to as evolutionary strata on the X. All current methods for stratum detection depend on X-Y comparisons but are severely limited by the paucity of X-Y gametologs. We have developed an integrative method that combines a top-down, recursive segmentation algorithm with a bottom-up, agglomerative clustering algorithm to decipher compositionally distinct regions on the X, which reflect regions of unique X-Y divergence. When applied to the human X chromosome, our method correctly classified a concatenated set of 35 previously assayed X-linked gene sequences by evolutionary strata. We then extended our analysis, applying this method to the entire sequence of the human X chromosome, in an effort to define stratum boundaries. The boundaries of the more recently formed strata on the X-added region, namely the fourth and fifth strata, have been defined by previous studies and are recapitulated with our method. The older strata, from the first up to the third stratum, have remained poorly resolved due to the paucity of X-Y gametologs. By analyzing the entire X sequence, our method identified seven evolutionary strata in these ancient regions, where only three could previously be assayed, thus demonstrating the robustness of our method in detecting the evolutionary strata.


Fig. 1. Strata identified using previously assayed X-linked genes. Here we apply the segmentation and clustering algorithm to a concatenated string of the X-linked genes that have been previously assayed using inversion, phylogenetic, and substitution rate analyses. Previous strata are colored: 5, Red; 4, Yellow; 3, Green; 2, Blue; 1, Violet. Genes in each cluster are labeled above the cluster, similarly color-coded. Genes that span cluster boundaries are marked with a star. We used a Markov model of order 2 to perform segmentation and clustering at significance thresholds of 0.3 and 0.04, respectively.
© Copyright Policy - creative-commons

Mentions: We first applied our segmentation and clustering algorithm to the concatenated sequence of the 35 X-linked genes that have been previously assayed using inversion (Ross et al. 2005; Lemaitre et al. 2009a), phylogenetic (Wilson and Makova 2009; Luo and Wilson Sayres, unpublished data), and substitution rate (Lahn and Page 1999; Skaletsky et al. 2003) analyses. Protein-coding gene sequences are constrained in their sequence evolution to maintain functional gene products and so accumulate nucleotide differences (substitutions, deletions, and insertions) more slowly than noncoding DNA. As such, an analysis of the coding regions should be a proxy for the divergence rate between gametologous X-Y sequences but is likely a conservative estimate of the sequence differences that have accumulated between the larger X and Y regions due to suppression of recombination between them. As expected, when analyzing just the coding regions, the segmentation and clustering algorithm produces a conservative strata structure (fig. 1 and supplementary fig. S2, Supplementary Material online). Our method correctly classifies genes by previously defined stratum boundaries, and consistent with recent suggestions (Lemaitre et al. 2009a; Wilson and Makova 2009), we find evidence of two strata within the previously described stratum 3 (fig. 1 and table 1). This analysis confirms that our method is able to recapitulate previous stratum definitions but also highlights the challenges of relying only on X-linked coding sequences with identifiable Y-linked gametologs, which are necessarily more conserved than noncoding regions.
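
The two-step idea described above (split top-down, then merge bottom-up) can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes a Jensen-Shannon divergence on plain nucleotide counts (order 0) instead of the order-2 Markov model, and it uses raw divergence cutoffs instead of the significance thresholds (0.3 and 0.04) quoted in the figure legend. The names `entropy`, `jsd`, `best_split`, `segment`, `cluster`, and `min_len` are hypothetical.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (bits) of a table of symbol counts."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values() if c)


def jsd(left, right):
    """Length-weighted Jensen-Shannon divergence between two count tables.

    Assumption: order-0 composition stands in for the order-2 Markov model.
    """
    n_l, n_r = sum(left.values()), sum(right.values())
    n = n_l + n_r
    return entropy(left + right) - n_l / n * entropy(left) - n_r / n * entropy(right)


def best_split(seq, min_len):
    """Return (position, divergence) of the most divergent split, or (None, 0.0)."""
    if len(seq) < 2 * min_len:
        return None, 0.0
    left, right = Counter(seq[:min_len]), Counter(seq[min_len:])
    best_pos, best_d = None, 0.0
    for pos in range(min_len, len(seq) - min_len + 1):
        d = jsd(left, right)  # left is seq[:pos], right is seq[pos:]
        if d > best_d:
            best_pos, best_d = pos, d
        left[seq[pos]] += 1   # slide the split point one symbol to the right
        right[seq[pos]] -= 1
    return best_pos, best_d


def segment(seq, threshold, min_len=50, offset=0):
    """Top-down step: split recursively while the best split reaches threshold."""
    pos, d = best_split(seq, min_len)
    if pos is None or d < threshold:
        return [(offset, offset + len(seq))]
    return (segment(seq[:pos], threshold, min_len, offset)
            + segment(seq[pos:], threshold, min_len, offset + pos))


def cluster(seq, segments, threshold):
    """Bottom-up step: merge the two most similar clusters until none is close."""
    clusters = [([seg], Counter(seq[seg[0]:seg[1]])) for seg in segments]
    while len(clusters) > 1:
        d, i, j = min((jsd(clusters[i][1], clusters[j][1]), i, j)
                      for i in range(len(clusters))
                      for j in range(i + 1, len(clusters)))
        if d >= threshold:
            break
        merged = (clusters[i][0] + clusters[j][0], clusters[i][1] + clusters[j][1])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return [sorted(members) for members, _ in clusters]
```

On a toy sequence such as `"AT" * 100 + "GC" * 100 + "AT" * 100`, calling `segment(seq, threshold=0.1)` followed by `cluster(seq, segs, threshold=0.05)` recovers three segments and places the two AT-rich blocks in one cluster, mirroring how non-adjacent regions of similar composition can be assigned to the same group. Real chromosome-scale data would need the significance testing and higher-order model that this sketch leaves out.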

