Personalized copy number and segmental duplication maps using next-generation sequencing.

Alkan C, Kidd JM, Marques-Bonet T, Aksay G, Antonacci F, Hormozdiari F, Kitzman JO, Baker C, Malig M, Mutlu O, Sahinalp SC, Gibbs RA, Eichler EE - Nat. Genet. (2009)

Bottom Line: We examine three human genomes and experimentally validate genome-wide copy number differences. We estimate that, on average, 73-87 genes vary in copy number between any two individuals and find that these genic differences overwhelmingly correspond to segmental duplications (odds ratio = 135; P < 2.2 × 10^-16). Our method can distinguish between different copies of highly identical genes, providing a more accurate assessment of gene content and insight into functional constraint without the limitations of array-based technology.


Affiliation: Department of Genome Sciences, University of Washington School of Medicine, Seattle, Washington, USA.

ABSTRACT
Despite their importance in gene innovation and phenotypic variation, duplicated regions have remained largely intractable owing to difficulties in accurately resolving their structure, copy number and sequence content. We present an algorithm (mrFAST) to comprehensively map next-generation sequence reads, which allows for the prediction of absolute copy-number variation of duplicated segments and genes. We examine three human genomes and experimentally validate genome-wide copy number differences. We estimate that, on average, 73-87 genes vary in copy number between any two individuals and find that these genic differences overwhelmingly correspond to segmental duplications (odds ratio = 135; P < 2.2 × 10^-16). Our method can distinguish between different copies of highly identical genes, providing a more accurate assessment of gene content and insight into functional constraint without the limitations of array-based technology.


Figure 3: Validation of individual-specific segmental duplications. The number of duplicated base pairs predicted and validated in NA18507, JDW, and YH (autosomes only) is shown. The height of each bar represents the sum of computationally predicted interval lengths, and the blue portion of each bar corresponds to the experimentally validated portion. Only duplicated intervals >20 kbp were considered for validation.

Mentions: We constructed duplication maps for each of the three genomes and estimated the absolute copy number of each duplication interval larger than 20 kbp in length. We considered a given segment to be duplicated within an individual if the median estimated copy number for that individual was greater than 2.5 (diploid copy number; see Supplementary Note). We compared the extent of overlap among duplicated sequences (Figure 2, Methods) and reclassified duplicated sequences as shared or individual-specific based on the predicted copy numbers in the analysis of these three genomes (Supplementary Note). We defined a total of 725 non-overlapping duplication intervals across the three individuals, totaling 84.76 Mbp. Only 25 duplication intervals were not predicted in all three individuals, suggesting that the vast majority (97% of the intervals and 98% by base pair) of large segmental duplications are shared (Figure 3 and Supplementary Figure 3).
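
The classification rule described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the input layout (a mapping from sample name to per-window copy-number estimates), and the decision to label any interval duplicated in some but not all samples as individual-specific are assumptions made for illustration. Only the 2.5 threshold and the interval counts (725 total, 25 not predicted in all three individuals) come from the text.

```python
from statistics import median

# From the text: a segment counts as duplicated in an individual when the
# median estimated copy number exceeds 2.5 (diploid copy number).
DUPLICATION_THRESHOLD = 2.5


def is_duplicated(window_copy_numbers):
    """Return True if the median estimated copy number exceeds the threshold."""
    return median(window_copy_numbers) > DUPLICATION_THRESHOLD


def classify_interval(estimates_by_individual):
    """Label one interval as 'shared', 'individual-specific', or 'not duplicated'.

    estimates_by_individual is a hypothetical input layout: it maps a sample
    name to a list of per-window copy-number estimates for this interval.
    Simplification (assumption): an interval duplicated in some but not all
    samples is labeled 'individual-specific'.
    """
    duplicated_in = [
        name
        for name, estimates in estimates_by_individual.items()
        if is_duplicated(estimates)
    ]
    if not duplicated_in:
        return "not duplicated"
    if len(duplicated_in) == len(estimates_by_individual):
        return "shared"
    return "individual-specific"


def shared_fraction(total_intervals, not_in_all):
    """Fraction of intervals predicted in every individual."""
    return (total_intervals - not_in_all) / total_intervals


if __name__ == "__main__":
    # Made-up copy-number estimates, for illustration only.
    example = {
        "NA18507": [3.1, 3.4, 2.9],
        "JDW": [2.0, 2.1, 1.9],
        "YH": [3.0, 3.2, 3.3],
    }
    print(classify_interval(example))
    # Interval counts reported in the text: 725 total, 25 not in all three.
    print(f"{shared_fraction(725, 25):.1%}")
```

With the counts reported in the text, 700 of 725 intervals are predicted in all three individuals, which is consistent with the stated figure of about 97% of intervals being shared.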

