Characterization of the past and current duplication activities in the human 22q11.2 region.

Guo X, Freyer L, Morrow B, Zheng D - BMC Genomics (2011)

Bottom Line: Segmental duplications (SDs) on 22q11.2 (LCR22) serve as substrates for meiotic non-allelic homologous recombination (NAHR) events resulting in several clinically significant genomic disorders. Some subunits have expanded more actively than others, and young Alu SINEs are associated much more frequently with duplicated sequences that have undergone active expansion, confirming their role in mediating recombination events. Our study indicates that AluYs are implicated in the past and current duplication events, and moreover suggests that DNA rearrangements in 22q11.2 genomic disorders perhaps do not occur randomly but involve both actively expanded duplication subunits and Alu elements.


Affiliation: Department of Neurology, Albert Einstein College of Medicine, Bronx, NY 10461, USA.

ABSTRACT

Background: Segmental duplications (SDs) on 22q11.2 (LCR22) serve as substrates for meiotic non-allelic homologous recombination (NAHR) events resulting in several clinically significant genomic disorders.

Results: To understand the duplication activity leading to the complicated SD structure of this region, we have applied the A-Bruijn graph algorithm to decompose the 22q11.2 SDs into 523 fundamental duplication sequences, termed subunits. Cross-species syntenic analysis of primate genomes demonstrates that many of these LCR22 subunits emerged very recently, especially those implicated in human genomic disorders. Some subunits have expanded more actively than others, and young Alu SINEs are associated much more frequently with duplicated sequences that have undergone active expansion, confirming their role in mediating recombination events. Many copy number variations (CNVs) exist on 22q11.2, some flanked by SDs. Interestingly, two chromosome breakpoints for 13 CNVs (mean length 65 kb) are located in paralogous subunits, providing direct evidence that SD subunits could contribute to CNV formation. Sequence analysis of PACs or BACs identified extra CNVs, specifically 10 insertions and 18 deletions within 22q11.2; four were more than 10 kb in size and most contained young AluYs at their breakpoints.

Conclusions: Our study indicates that AluYs are implicated in the past and current duplication events, and moreover suggests that DNA rearrangements in 22q11.2 genomic disorders perhaps do not occur randomly but involve both actively expanded duplication subunits and Alu elements.



Figure 3: Synteny of SDs on 22q11.2. (A) The syntenic relationship of the subunits with chimpanzee, orangutan and macaque is shown as present (matching color boxes) or absent (white). This map was derived from our analysis of the multi-genome alignment data in the Ensembl database (see Methods). The boxed region in LCR22-5' was subsequently confirmed by PCR to be absent in the macaque genome (see Additional file 3, Figure S2). (B) Comparison of primate segmental duplications. The data were retrieved from a previous study using WSSD analysis for SD detection [29]. The depth of sequence read coverage (number of shotgun sequencing reads in 5-kb windows) is depicted for human (HSA), chimpanzee (PTR), orangutan (PPY) and macaque (MMU) based on alignment of reads against the human genome. Putative duplicated regions with excess read depth (more than three standard deviations above the mean) are shown in red, with unique regions in green. Human and chimp SDs derived from depth analysis are also shown below the human SDs derived from WGAC analysis (top). The data here suggest that most of the sequences in LCR22-2', -3a' and -4' are shared between human and chimpanzee and that their duplications likely occurred after the split of the African great apes from the Asian great apes. Interestingly, the human-specific SDs in LCR22-3a' and -4' show higher sequence identity (represented by light to dark orange color) than the rest of the SDs (light to dark grey). (C) Past duplication events that may have generated the homology between LCR22-3a' and LCR22-4'. Arrow lines represent putative duplication directions. The large cyan subunit in LCR22-3a' may have arisen from either the proximal or distal paralogous sequences in LCR22-4'.
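The read-depth criterion in panel B (windows whose read count exceeds the mean by more than three standard deviations) can be illustrated with a minimal Python sketch. This is not the WSSD implementation from the cited study [29]; the function name `flag_excess_depth`, the use of a separate set of baseline windows to estimate the mean and standard deviation, and the toy counts are all assumptions made for illustration.

```python
from statistics import mean, stdev


def flag_excess_depth(window_counts, baseline_counts, n_sd=3.0):
    """Flag windows whose read count exceeds mean + n_sd * SD.

    window_counts:   read counts per fixed-size window (e.g. 5 kb) to classify.
    baseline_counts: read counts from windows assumed to be unique (single-copy);
                     how such a baseline is chosen is an assumption of this sketch.
    Returns one boolean per window: True = putative duplicated region.
    """
    mu = mean(baseline_counts)
    sd = stdev(baseline_counts)  # sample standard deviation
    threshold = mu + n_sd * sd
    return [count > threshold for count in window_counts]


# Toy data only: hypothetical per-window read counts.
baseline = [10, 12, 11, 9, 10, 11, 10, 12]
windows = [10, 11, 30, 12, 25]
flags = flag_excess_depth(windows, baseline)
```

With these toy counts, only the third and fifth windows would be drawn in red in a plot like panel B; the rest would be green.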

Mentions: Using multiple genome alignment data from the Ensembl database [28], with extra filtering to improve the syntenic map for duplicated sequences (see Methods for details), we found that 70%, 61%, and 26% of the 22q11.2 SD subunits had unambiguous syntenic sequences in chimpanzee, orangutan, and macaque, respectively (Table 1), and altogether 81% of subunits had syntenic sequences detected in at least one of the three current assemblies of non-human primate genomes (Figure 3A). More specifically, 39%, 21%, and 26% of duplicated subunits in LCR22-3a', LCR22-2', and LCR22-4', respectively, appeared specific to the human genome (Table 1; Figure 3A). By comparison, only 3% to 19% of the subunits in LCR22-3b', LCR22-5', -6', -7', and -8' exhibited human specificity. We have also carried out a PCR assay to support our syntenic analysis (Additional file 3). Together with the results from our analysis of subunit distribution in 22q11.2 (above), our cross-species syntenic analysis demonstrates that most sequences in LCR22-2', LCR22-3a', and LCR22-4' were generated from more recent duplication events.
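The relationship between the per-species percentages and the "at least one species" percentage quoted above can be made concrete with a small Python sketch. It assumes a hypothetical presence/absence table (subunit identifier mapped to the set of species in which a syntenic sequence was detected); the function name `synteny_summary`, the subunit names, and the toy data are illustrative and are not taken from the paper's Table 1.

```python
def synteny_summary(presence, species):
    """Summarize a presence/absence table of syntenic sequences.

    presence: dict mapping subunit id -> set of species with a syntenic sequence.
    species:  list of non-human species considered.
    Returns (percent per species, percent in at least one species,
             percent with no synteny in any species).
    """
    n = len(presence)
    per_species = {
        sp: 100.0 * sum(sp in found for found in presence.values()) / n
        for sp in species
    }
    in_any = 100.0 * sum(bool(found) for found in presence.values()) / n
    # Subunits with no detected synteny appear human-specific in this table;
    # assembly gaps could produce the same pattern.
    return per_species, in_any, 100.0 - in_any


# Toy data only: five hypothetical subunits.
toy = {
    "s1": {"chimpanzee", "orangutan", "macaque"},
    "s2": {"chimpanzee", "orangutan"},
    "s3": {"chimpanzee"},
    "s4": {"orangutan"},
    "s5": set(),
}
per_species, in_any, apparent_human_specific = synteny_summary(
    toy, ["chimpanzee", "orangutan", "macaque"]
)
```

Because one subunit can have syntenic sequence in several species, the "at least one species" figure is a union over subunits and cannot be obtained by adding the per-species percentages.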


