Statistical estimation of correlated genome associations to a quantitative trait network.

Kim S, Xing EP - PLoS Genet. (2009)

Bottom Line: Using simulated datasets based on the HapMap consortium and an asthma dataset, we compared the performance of our method with other methods based on single-marker analysis and regression-based methods that do not use any of the relational information in the traits. We found that our method showed an increased power in detecting causal variants affecting correlated traits. Our results showed that, when correlation patterns among traits in a QTN are considered explicitly and directly during a structured multivariate genome association analysis using our proposed methods, the power of detecting true causal SNPs with possibly pleiotropic effects increased significantly without compromising performance on non-pleiotropic SNPs.


Affiliation: School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA.

ABSTRACT
Many complex disease syndromes, such as asthma, consist of a large number of highly related, rather than independent, clinical or molecular phenotypes. This raises a new technical challenge in identifying genetic variations associated simultaneously with correlated traits. In this study, we propose a new statistical framework called graph-guided fused lasso (GFlasso) to directly and effectively incorporate the correlation structure of multiple quantitative traits such as clinical metrics and gene expressions in association analysis. Our approach represents correlation information explicitly among the quantitative traits as a quantitative trait network (QTN) and then leverages this network to encode structured regularization functions in a multivariate regression model over the genotypes and traits. The result is that the genetic markers that jointly influence subgroups of highly correlated traits can be detected jointly with high sensitivity and specificity. While most of the traditional methods examined each phenotype independently and combined the results afterwards, our approach analyzes all of the traits jointly in a single statistical framework. This allows our method to borrow information across correlated phenotypes to discover the genetic markers that perturb a subset of the correlated traits synergistically. Using simulated datasets based on the HapMap consortium and an asthma dataset, we compared the performance of our method with other methods based on single-marker analysis and regression-based methods that do not use any of the relational information in the traits. We found that our method showed an increased power in detecting causal variants affecting correlated traits. 
Our results showed that, when correlation patterns among traits in a QTN are considered explicitly and directly during a structured multivariate genome association analysis using our proposed methods, the power of detecting true causal SNPs with possibly pleiotropic effects increased significantly without compromising performance on non-pleiotropic SNPs.
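
To make the idea of network-encoded regularization more concrete, the following is a minimal sketch in Python with NumPy of an objective of this general kind: a squared-error loss for a multivariate regression of traits on genotypes, a lasso penalty on the coefficients, and a fusion penalty that pulls together the coefficients of each SNP across traits joined by an edge in the QTN. The abstract does not state the objective, so the exact form, the function name `gflasso_objective`, the parameters `lam` and `gamma`, and the choice to weight each edge by the absolute correlation are all assumptions made for illustration.

```python
import numpy as np

def gflasso_objective(X, Y, B, edges, lam, gamma):
    """Evaluate a graph-guided fused-lasso style objective (illustrative form).

    X: (n, p) genotype matrix; Y: (n, k) trait matrix; B: (p, k) coefficients.
    edges: list of (m, l, r), with r the correlation between traits m and l.
    lam, gamma: hypothetical weights for the lasso and fusion penalties.
    """
    # Squared-error loss of the multivariate regression Y ~ X B.
    loss = np.sum((Y - X @ B) ** 2)
    # Lasso penalty: encourages most SNP-trait coefficients to be zero.
    lasso = lam * np.sum(np.abs(B))
    # Fusion penalty: for each QTN edge, penalize differences between the
    # coefficient columns of the two traits. The sign of r flips the
    # comparison for negatively correlated traits (assumed edge weight: |r|).
    fusion = 0.0
    for m, l, r in edges:
        fusion += abs(r) * np.sum(np.abs(B[:, m] - np.sign(r) * B[:, l]))
    return loss + lasso + gamma * fusion

# Toy data: two SNPs, two positively correlated traits joined by one edge.
X = np.eye(2)
Y = np.array([[1.0, 1.0], [0.0, 0.0]])
edges = [(0, 1, 0.9)]

# SNP 0 affects both traits (pleiotropic) versus only trait 0.
B_shared = np.array([[1.0, 1.0], [0.0, 0.0]])
B_split = np.array([[1.0, 0.0], [0.0, 0.0]])

obj_shared = gflasso_objective(X, Y, B_shared, edges, lam=0.1, gamma=1.0)
obj_split = gflasso_objective(X, Y, B_split, edges, lam=0.1, gamma=1.0)
```

On this toy input the coefficient matrix that assigns the same effect to both linked traits scores lower than the one that assigns an effect to only one of them, which is the behavior the fusion term is meant to encourage.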


Figure 1: An illustration of association analysis using the QTN for the asthma dataset. Nodes in the QTN represent clinical traits related to asthma. Each pair of nodes is connected with an edge if the corresponding two traits are highly correlated. The thickness of each edge indicates the strength of the correlation. We are interested in identifying SNPs that are associated with a subnetwork of clinical traits.

Mentions: In several recent attempts at expression quantitative trait locus (eQTL) mapping, a significant focus has been placed on identifying modules of co-expressed genes and the genotype markers that perturb the whole module rather than a single gene. For example, a genotype variation in a putative transcription factor is likely to affect the expression levels of all of the genes regulated by this common transcription factor. Under this scenario, once a group of genes is mapped to a common locus in the genome, it is possible to examine whether the locus harbors a transcription factor that targets the group of genes jointly, in order to understand the functional relationship between the genotype marker and the gene module (e.g., [11]). Another example, which will be explored in this paper, involves the study of complex heterogeneous diseases such as asthma that cannot be characterized by a single phenotype, but are influenced by multiple factors. In Figure 1, the correlation structure of 53 clinical traits in an asthma dataset collected as a part of the Severe Asthma Research Program (SARP) [14] is represented as a quantitative trait network (QTN). From a visual inspection of this network, it is apparent that it contains several groups of inter-correlated traits that are connected by weighted edges. Further investigation reveals that the subnetworks in this QTN correspond to different clinical aspects of asthma, such as quality of life (the nodes for QLEnvironment, QLSymptom, QLEmotion, and QLActivity), asthma symptoms (the nodes for Wheezy, Sputum, ChestTight), and lung physiology (the nodes for BaseFEV1, PreFEFPred, PostbroPred, PredrugFEV1P, MaxFEV1P, etc.). It is natural to suspect that such highly correlated traits in a subnetwork may share some common genetic causes, and that analyzing the traits in each subnetwork jointly, rather than each trait independently, may help to better uncover such causes.
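
As a rough illustration of how a QTN like the one described above might be assembled, the sketch below connects two traits with an edge when the absolute value of their Pearson correlation exceeds a cutoff, and then groups the traits into subnetworks as connected components. The text only says that edges join "highly correlated" traits, so the use of Pearson correlation, the cutoff of 0.7, the function names `build_qtn` and `subnetworks`, and the toy data are all assumptions for illustration.

```python
import numpy as np

def build_qtn(Y, threshold=0.7):
    """Return QTN edges (m, l, r) for trait pairs with |correlation| > threshold.

    Y: (n, k) matrix with one column per trait. The threshold is a
    hypothetical cutoff for "highly correlated".
    """
    R = np.corrcoef(Y, rowvar=False)  # k x k correlation matrix of the traits
    k = R.shape[0]
    return [(m, l, float(R[m, l]))
            for m in range(k) for l in range(m + 1, k)
            if abs(R[m, l]) > threshold]

def subnetworks(k, edges):
    """Group k traits into connected components of the QTN (union-find)."""
    parent = list(range(k))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for m, l, _ in edges:
        parent[find(m)] = find(l)
    groups = {}
    for i in range(k):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Toy traits built from three mutually uncorrelated patterns.
a = np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
b = np.array([1, 1, -1, -1, 1, 1, -1, -1], dtype=float)
c = np.array([1, 1, 1, 1, -1, -1, -1, -1], dtype=float)
# Traits 0 and 1 move together, traits 2 and 3 move in opposite directions,
# and trait 4 is unrelated to the rest.
Y = np.column_stack([a, 2 * a, b, -b, c])

qtn_edges = build_qtn(Y)
groups = subnetworks(Y.shape[1], qtn_edges)
```

Each edge keeps the signed correlation, so its magnitude can play the role of the edge thickness in Figure 1 and its sign records whether the two traits move together or in opposite directions.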

