Whole genome sequencing of an ethnic Pathan (Pakhtun) from the north-west of Pakistan.

Ilyas M, Kim JS, Cooper J, Shin YA, Kim HM, Cho YS, Hwang S, Kim H, Moon J, Chung O, Jun J, Rastogi A, Song S, Ko J, Manica A, Rahman Z, Husnain T, Bhak J - BMC Genomics (2015)

Bottom Line: Among the SNVs, 129,441 were novel, and 10,315 nonsynonymous SNVs were found in 5,344 genes. Finally, we reconstruct the demographic history by PSMC, which highlights a recent increase in effective population size compatible with admixture between European and Asian lineages expected in this geographic region. It is a useful resource to understand genetic variation and human migration across the whole Asian continent.


Affiliation: National Centre of Excellence in Molecular Biology, University of the Punjab, Lahore, Pakistan. milyaskh@hotmail.com.

ABSTRACT

Background: Pakistan covers a key geographic area in human history, being both part of the Indus River region that acted as one of the cradles of civilization and as a link between Western Eurasia and Eastern Asia. This region is inhabited by a number of distinct ethnic groups, the largest being the Punjabi, Pathan (Pakhtuns), Sindhi, and Baloch.

Results: We analyzed the first ethnic male Pathan genome by sequencing it to 29.7-fold coverage using the Illumina HiSeq2000 platform. A total of 3.8 million single nucleotide variations (SNVs) and 0.5 million small indels were identified by comparing with the human reference genome. Among the SNVs, 129,441 were novel, and 10,315 nonsynonymous SNVs were found in 5,344 genes. SNVs were annotated for health consequences and high-risk diseases, as well as possible influences on drug efficacy. We confirmed that the Pathan genome presented here is representative of this ethnic group by comparing it to a panel of Central Asians from the HGDP-CEPH panels typed for ~650k SNPs. The mtDNA (H2) and Y haplogroup (L1) of this individual were also typical of his geographic region of origin. Finally, we reconstruct the demographic history by PSMC, which highlights a recent increase in effective population size compatible with admixture between European and Asian lineages expected in this geographic region.

Conclusions: We present a whole-genome sequence and analyses of an ethnic Pathan from the north-west province of Pakistan. It is a useful resource to understand genetic variation and human migration across the whole Asian continent.


Fig1: Copy number variation regions in the Pathan genome. Copy number variation counts distributed across each chromosome.

Mentions: A total of 504,276 short indels (up to ±20 bases) were observed, of which 306,128 were found in intergenic regions, 237 in CDS regions, and 193,308 in intron regions. Additionally, 1,503 CNVRs were found, 713 of which were classed as duplicated and 790 as deleted, affecting 2,364 overlapped genes (Additional file 3: Table S2). A total of 65 CNVRs had not previously been described in the database of genomic variants (DGV; http://projects.tcag.ca/variation/). Figure 1 shows the number of gained and lost CNVRs in each chromosome. ANNOVAR was used for detailed annotation analysis of CNVRs to identify genes associated with these regions (Additional file 4: Table S3).
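
The counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of the authors' pipeline; the variable names are hypothetical. It shows that the duplicated and deleted CNVRs sum to the stated total, and that the three itemized indel categories account for most but not all of the 504,276 indels. The label "other" for the remainder is an assumption, since the text does not say which regions those indels fall in.

```python
# Short-indel counts as quoted in the text (by genomic region).
indel_total = 504_276
indels_by_region = {"intergenic": 306_128, "CDS": 237, "intron": 193_308}

# Indels covered by the three itemized regions, and the remainder.
listed = sum(indels_by_region.values())
unlisted = indel_total - listed  # regions not itemized in the text (assumed "other")

# CNVR counts as quoted in the text.
cnvr_total = 1_503
cnvr_by_type = {"duplicated": 713, "deleted": 790}
cnvr_sum = sum(cnvr_by_type.values())

# Report each category with its share of all short indels.
for region, n in indels_by_region.items():
    print(f"{region:>10}: {n:>7,} ({n / indel_total:.1%})")
print(f"{'other':>10}: {unlisted:>7,} ({unlisted / indel_total:.1%})")
print("CNVR split matches total:", cnvr_sum == cnvr_total)
```

On these figures the itemized regions cover 499,673 indels, leaving 4,603 unassigned, while the CNVR split (713 + 790) matches the reported 1,503.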

