Identification of novel single nucleotide polymorphisms associated with acute respiratory distress syndrome by exome-seq.

Shortt K, Chaudhary S, Grigoryev D, Heruth DP, Venkitachalam L, Zhang LQ, Ye SQ - PLoS ONE (2014)

Bottom Line: Acute respiratory distress syndrome (ARDS) is a lung condition characterized by impaired gas exchange with systemic release of inflammatory mediators, causing pulmonary inflammation, vascular leak and hypoxemia. Existing biomarkers have limited effectiveness as diagnostic and therapeutic targets. Additional validation in larger patient populations and further exploration of underlying molecular mechanisms are warranted.


Affiliation: Department of Pediatrics, Division of Experimental and Translational Genetics, Children's Mercy Hospital, University of Missouri - Kansas City School of Medicine, Kansas City, Missouri, United States of America; Department of Biomedical and Health Informatics, University of Missouri - Kansas City School of Medicine, Kansas City, Missouri, United States of America.

ABSTRACT
Acute respiratory distress syndrome (ARDS) is a lung condition characterized by impaired gas exchange with systemic release of inflammatory mediators, causing pulmonary inflammation, vascular leak and hypoxemia. Existing biomarkers have limited effectiveness as diagnostic and therapeutic targets. To identify disease-associated variants in ARDS patients, whole-exome sequencing was performed on 96 ARDS patients, detecting 1,382,399 SNPs. By comparing these exome data to those of the 1000 Genomes Project, we identified a number of single nucleotide polymorphisms (SNPs) which are potentially associated with ARDS. 50,190 SNPs were found in all case subgroups and controls, of which 89 SNPs were associated with susceptibility. We validated three SNPs (rs78142040, rs9605146 and rs3848719) in additional ARDS patients to substantiate their associations with susceptibility, severity and outcome of ARDS. rs78142040 (C>T) occurs within a histone mark (intron 6) of the Arylsulfatase D gene. rs9605146 (G>A) causes a deleterious coding change (proline to leucine) in the XK, Kell blood group complex subunit-related family, member 3 gene. rs3848719 (G>A) is a synonymous SNP in the Zinc-Finger/Leucine-Zipper Co-Transducer NIF1 gene. rs78142040, rs9605146, and rs3848719 are significantly associated with susceptibility to ARDS. rs3848719 is associated with APACHE II score quartile. rs78142040 is associated with 60-day mortality in the overall ARDS patient population. Exome-seq is a powerful tool to identify potential new biomarkers for ARDS. We selectively validated three SNPs which have not been previously associated with ARDS and represent potential new genetic biomarkers for ARDS. Additional validation in larger patient populations and further exploration of underlying molecular mechanisms are warranted.


Quantile-quantile plots of Caucasian ARDS and EUR 1000 Genomes. In the example of our Caucasian cases and EUR controls, we observe that correction for principal components improves the fit of our data with the expected distribution. (A) QQ plot of expected χ² values versus the actual χ² values for the genotypic trend test of case-control status. The data are filtered on HWE, LD, and SNP call rate but not PCA corrected. (B) QQ plot of expected χ² values versus the actual χ² values for the genotypic trend test of case-control status. The data have been filtered and corrected for 6 PCs. (C) QQ plot of expected χ² values versus the actual χ² values for the genotypic trend test of case-control status. The data have been filtered and corrected for 6 PCs and undergone sample outlier removal.

Mentions: Although we applied the Bonferroni correction (p < 2.95×10⁻⁷) and several SNP filtering steps during our data analysis, and validated three selected candidate SNPs for ARDS, our data come with potential limitations. First, we performed exome-seq on only 96 ARDS samples. We would argue that this is a reasonable sample size given the high per-sample cost of exome-seq (which is nonetheless cheaper than whole-genome sequencing), but the sample size is not large. Our 76 SNPs which are strongly associated with susceptibility are all present in an age- and race-matched 48-sample control set, which will be used to validate our findings in further studies (117,35 out of the 169,376 SNPs which are in the ARDS cases and 1000 Genomes Project are found in this control set). Confirmation of our findings in larger patient populations is warranted. Second, during analysis of SNP associations with ARDS susceptibility, we used the healthy control subjects from the 1000 Genomes Project. The ARDS patients and the healthy control subjects therefore do not derive from the same population. Since population admixture is assumed in the African American cases, we elected to compare these cases with the ASW subset of the 1000 Genomes Project African Ancestry panel. We consider this the best-fitting control group because of the observed reduction in genomic inflation factor (inflation factor = 1.18 when compared with ASW, after filtering for informative markers based on HWE, call rate, number of alleles, and LD) relative to the total African Ancestry controls (inflation factor = 1.70), YRI alone (inflation factor = 1.59), or LWK alone (inflation factor = 1.98) [41]. An ancestry-informative SNP panel with good coverage of our dataset was not available.
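The two quantities used in this paragraph, the Bonferroni threshold and the genomic inflation factor, can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes a family-wise alpha of 0.05 divided by the 169,376 SNPs shared between cases and the 1000 Genomes Project (an inference that happens to reproduce the stated threshold), and it uses the common definition of the inflation factor as the median observed 1-df χ² statistic divided by the median of the χ² distribution with 1 degree of freedom. The function names are hypothetical.

```python
from statistics import NormalDist, median


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


def genomic_inflation_factor(chi2_stats) -> float:
    """Median observed chi-square divided by the expected median (1 df).

    The median of a 1-df chi-square equals the square of the 75th
    percentile of the standard normal distribution (about 0.4549).
    """
    expected_median = NormalDist().inv_cdf(0.75) ** 2
    return median(chi2_stats) / expected_median


# Assumed inputs: alpha = 0.05 and 169,376 tested SNPs.
threshold = bonferroni_threshold(0.05, 169_376)
print(f"Bonferroni threshold: {threshold:.2e}")

# Toy chi-square statistics, for illustration only (not study data).
toy_stats = [0.10, 0.30, 0.60, 0.90, 2.50]
print(f"Inflation factor: {genomic_inflation_factor(toy_stats):.2f}")
```

A value near 1.0 indicates test statistics that follow the expected distribution, which is why the lower value obtained with the ASW controls is taken as evidence of a better-matched control group.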
Although we applied HWE filtering, PCA, Q–Q plots, and race-specific comparisons to filter the identified SNPs, these steps may not fully correct for population admixture (Figure 3, Figure S1, Figure S2, Table S8). Two of the SNPs (rs9605146 and rs78142040, with control MAFs of 4.0% and 0%, respectively) have extremely low minor allele frequencies, which inflates the type I error of the goodness-of-fit test for HWE [26]. In this study, we explicitly searched for SNPs in which the MAF differed between cases and controls, so some deviation is expected where the minor alleles are rare in healthy controls. The observed associations with other disease phenotypes within our case cohort support our conclusion that variations at these loci contribute to disease. Replication of our findings in larger and different populations may strengthen the case for the candidate SNPs identified here as true genetic biomarkers of ARDS.
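To illustrate why rare minor alleles make the HWE goodness-of-fit test unreliable, the following Python sketch computes the standard 1-df χ² test from genotype counts. It is a textbook formulation rather than the authors' implementation, the function name is hypothetical, and the genotype counts are invented for illustration.

```python
import math


def hwe_chi_square(n_aa: int, n_ab: int, n_bb: int):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Returns (chi2, p_value) using 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    # Cells with an expected count of zero (monomorphic site) are skipped.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a 1-df chi-square: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Invented counts with a rare minor allele (MAF = 5.5%).
chi2, p_value = hwe_chi_square(90, 9, 1)
print(f"chi2 = {chi2:.3f}, p = {p_value:.3f}")
```

With these counts the expected number of minor-allele homozygotes is only about 0.3, so a single observed homozygote contributes most of the statistic. This small-expected-count behavior is the reason an exact test is often preferred for low-MAF variants.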

