Family-based association analysis: a fast and efficient method of multivariate association analysis with multiple variants.

Won S, Kim W, Lee S, Lee Y, Sung J, Park T - BMC Bioinformatics (2015)

Bottom Line: The proposed test can be applied for both quantitative and dichotomous phenotypes, and it is robust under the presence of population substructure, as long as large-scale genomic data is available. Using simulated data, we showed that our method is statistically more efficient than the existing methods, and the practical relevance is illustrated by application of the approach to obesity-related phenotypes. The proposed method may be more statistically efficient than the existing methods.


Affiliation: Department of Public Health Science, Seoul National University, Seoul, Korea. won1@snu.ac.kr.

ABSTRACT

Background: Many disease phenotypes are outcomes of the complicated interplay between multiple genes, and multiple phenotypes can be affected by one or more genotypes. Joint analysis of multiple phenotypes and multiple markers has therefore been considered an efficient strategy for genome-wide association analysis. In this work we propose an omnibus family-based association test for the joint analysis of multiple genotypes and multiple phenotypes.

Results: The proposed test can be applied for both quantitative and dichotomous phenotypes, and it is robust under the presence of population substructure, as long as large-scale genomic data is available. Using simulated data, we showed that our method is statistically more efficient than the existing methods, and the practical relevance is illustrated by application of the approach to obesity-related phenotypes.

Conclusions: The proposed method may be more statistically efficient than the existing methods. The application was developed in C++ and is available at the following URL: http://healthstat.snu.ac.kr/software/mfqls/ .



Figure 5: QQ-plots for dichotomous phenotypes in the presence of population substructure. QQ-plots were generated from results of 10,000 replicates for dichotomous phenotypes. We assumed that the number of markers was 2, and that their minor allele frequencies were generated from U(0.1, 0.5). ρ was assumed to be 0.2, and Wright’s FST was assumed to be 0.01. The panels assume (a) Q=2 and D'=0, (b) Q=2 and D'=0.5, (c) Q=5 and D'=0, and (d) Q=5 and D'=0.5, respectively.
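As a rough illustration of how the points of such a QQ-plot can be obtained, the sketch below pairs sorted p-values with the expected order statistics of a uniform distribution on the -log10 scale. The function name `qq_points` and the simulated null p-values are hypothetical and are not taken from the MFQLS software; the plotting position i/(n+1) is one common choice, not necessarily the one used for this figure.

```python
import math
import random


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10 p-values for a QQ-plot.

    Under the null hypothesis p-values are U(0, 1), so the i-th smallest of
    n p-values has expectation i / (n + 1).
    """
    n = len(p_values)
    observed = sorted(p_values)
    points = []
    for rank, p in enumerate(observed, start=1):
        expected = rank / (n + 1)
        points.append((-math.log10(expected), -math.log10(p)))
    return points


# Hypothetical stand-in for 10,000 replicate p-values generated under the null.
random.seed(1)
# 1 - random.random() lies in (0, 1], so log10 never receives zero.
null_p = [1.0 - random.random() for _ in range(10_000)]
points = qq_points(null_p)
```

If the test is valid, the points should lie close to the diagonal; systematic departure above the diagonal would indicate inflation.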

Mentions: The proposed methods for both dichotomous and quantitative phenotypes were evaluated in the presence of population substructure. Wright’s FST indicates the level of population substructure, and we assumed FST = 0.01 and 0.05. The proposed method is robust to population substructure if the genetic relationship matrix is estimated from large-scale genetic data and used in place of the kinship coefficient matrix [27]. In our simulation studies, we generated 100,000 common variants with minor allele frequencies larger than 0.1 that were unrelated to the phenotypes. With these large-scale genotypes, we empirically estimated the genetic relationship matrix [27], which was then used as Φ in the proposed methods. The empirical type-1 error rates were calculated from 10,000 replicates at the 0.005, 0.01, and 0.05 significance levels. Table 3 shows that the empirical type-1 error rates for MFQLS are approximately equal to the nominal significance levels in the presence of population substructure. Figures 4 and 5 show QQ plots for quantitative and dichotomous phenotypes, respectively, when FST was assumed to be 0.01 and ρ was 0.2. The QQ plots show that statistical validity was preserved for both dichotomous and quantitative phenotypes at various significance levels.
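The two computational steps in this paragraph can be sketched as follows, under stated assumptions. The relationship matrix is estimated with the widely used standardized-genotype estimator, which may differ from the estimator of reference [27]; with this scaling the entries estimate twice the kinship coefficient, so a rescaling may be needed before use as Φ. The function names and the toy genotypes are hypothetical and are not part of the MFQLS software.

```python
import math


def estimate_grm(genotypes):
    """Estimate a genetic relationship matrix from genotype counts.

    genotypes[i][j] is the minor-allele count (0, 1 or 2) of individual i
    at marker j. Each marker is centered by 2p and scaled by
    sqrt(2p(1 - p)), where p is the sample allele frequency; the matrix is
    the average cross-product over markers. This is an assumed estimator.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    freqs = [sum(row[j] for row in genotypes) / (2 * n) for j in range(m)]
    # Monomorphic markers carry no information and would divide by zero.
    used = [j for j in range(m) if 0.0 < freqs[j] < 1.0]
    z = [
        [
            (row[j] - 2 * freqs[j]) / math.sqrt(2 * freqs[j] * (1 - freqs[j]))
            for j in used
        ]
        for row in genotypes
    ]
    return [
        [sum(a * b for a, b in zip(z[i], z[k])) / len(used) for k in range(n)]
        for i in range(n)
    ]


def empirical_type1_error(p_values, alpha):
    """Proportion of null replicates whose p-value is at most alpha."""
    return sum(p <= alpha for p in p_values) / len(p_values)


# Toy data: 4 individuals by 5 markers (the simulation used 100,000 markers).
toy_genotypes = [
    [0, 1, 2, 1, 0],
    [1, 1, 0, 2, 0],
    [2, 0, 1, 0, 1],
    [1, 2, 1, 1, 1],
]
grm = estimate_grm(toy_genotypes)

# Toy p-values standing in for the results of null replicates.
rate = empirical_type1_error([0.001, 0.02, 0.5, 0.9], alpha=0.05)
```

In a simulation like the one described, `empirical_type1_error` would be evaluated on the 10,000 replicate p-values at each of the 0.005, 0.01, and 0.05 levels and compared with the nominal level.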
