MetaQTL: a package of new computational methods for the meta-analysis of QTL mapping experiments.

Veyrieras JB, Goffinet B, Charcosset A - BMC Bioinformatics (2007)

Bottom Line: However, studying the congruency between these results still remains a complex task. As expected, simulations also show that this new clustering algorithm leads to a reduction in the length of the confidence interval of QTL location provided that across studies there are enough observed QTL for each underlying true QTL location. The usefulness of our approach is illustrated on published QTL detection results of flowering time in maize.


Affiliation: UMR, INRA UPS-XI INAPG CNRS Génétique Végétale, Ferme du Moulon, 91190 Gif-sur-Yvette, France. veyrieras@moulon.inra.fr

ABSTRACT

Background: Integration of multiple results from Quantitative Trait Loci (QTL) studies is key to understanding the genetic determinism of complex traits. To date, public database developers have made many efforts to facilitate the storage, compilation and visualization of results from multiple QTL mapping experiments. However, studying the congruency between these results remains a complex task. At present, the few computational and statistical frameworks for doing so are mainly based on empirical methods (e.g. consensus genetic maps are generally built by iterative projection).

Results: In this article, we present a new computational and statistical package, called MetaQTL, for carrying out whole-genome meta-analysis of QTL mapping experiments. Contrary to existing methods, MetaQTL offers a complete statistical process to establish a consensus model for both the marker and the QTL positions on the whole genome. First, MetaQTL implements a new statistical approach to merge multiple distinct genetic maps into a single consensus map which is optimal in terms of weighted least squares and can be used to investigate recombination rate heterogeneity between studies. Second, assuming that QTL can be projected on the consensus map, MetaQTL offers a new clustering approach based on a Gaussian mixture model to decide how many QTL underlie the distribution of the observed QTL.

Conclusion: We demonstrate using simulations that the usual model choice criteria from the mixture model literature perform relatively well in this context. As expected, simulations also show that this new clustering algorithm leads to a reduction in the length of the confidence interval of QTL location provided that across studies there are enough observed QTL for each underlying true QTL location. The usefulness of our approach is illustrated on published QTL detection results of flowering time in maize. Finally, MetaQTL is freely available at http://bioinformatics.org/mqtl.
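
The clustering idea summarized in the abstract can be illustrated with a minimal sketch. This is not MetaQTL's implementation: the function names (`fit_mixture`, `select_k`), the toy positions, and the assumption that each observed QTL carries a known standard deviation are all hypothetical choices made for illustration. The sketch fits one-dimensional Gaussian mixtures by EM for K = 1, 2, ... and keeps the K that minimizes a penalized criterion of the form -2 log L + penalty × (number of free parameters), where penalty = 2 gives AIC and penalty = 3 gives AIC3.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def fit_mixture(positions, sds, k, n_iter=200):
    """Fit a k-component mixture by EM; return (means, weights, log-likelihood).

    Assumption (not from the source): the standard deviation of each
    observed QTL position is known and held fixed during the fit.
    """
    xs = sorted(positions)
    n = len(xs)
    # Start the component means at evenly spaced order statistics.
    means = [xs[int((i + 0.5) * n / k)] for i in range(k)]
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        # E-step: membership probability of each observation in each component.
        resp = []
        for x, sd in zip(positions, sds):
            dens = [w * normal_pdf(x, m, sd) for w, m in zip(weights, means)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: precision-weighted means and updated mixing proportions.
        for c in range(k):
            num = sum(r[c] * x / sd ** 2 for r, x, sd in zip(resp, positions, sds))
            den = sum(r[c] / sd ** 2 for r, sd in zip(resp, sds))
            if den > 0.0:
                means[c] = num / den
            weights[c] = sum(r[c] for r in resp) / n
    loglik = 0.0
    for x, sd in zip(positions, sds):
        dens = sum(w * normal_pdf(x, m, sd) for w, m in zip(weights, means))
        loglik += math.log(dens or 1e-300)
    return means, weights, loglik


def select_k(positions, sds, k_max, penalty=2.0):
    """Return (k, sorted means) minimizing -2 log L + penalty * n_params."""
    best = None
    for k in range(1, k_max + 1):
        means, _, loglik = fit_mixture(positions, sds, k)
        n_params = 2 * k - 1  # k means plus k - 1 free mixing proportions
        criterion = -2.0 * loglik + penalty * n_params
        if best is None or criterion < best[0]:
            best = (criterion, k, sorted(means))
    return best[1], best[2]


# Made-up observed QTL positions (cM) from two well-separated locations.
positions = [18.0, 21.5, 19.0, 22.0, 20.5, 58.0, 61.0, 60.5, 62.0, 59.0]
sds = [3.0] * len(positions)
best_k, best_means = select_k(positions, sds, k_max=4)
print(best_k, best_means)
```

Passing `penalty=3.0` applies the heavier AIC3 penalty to the same fits, which makes the criterion more reluctant to add components when few QTL are observed.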


Figure 2: Performance of AIC: a simulation study. Behavior of the AIC criterion for the 4 distance constraints, δmin = 1, 2, 3, 4, and for different values of the true number of QTL, K, and the number of observed QTL, q. The vertical bars indicate the probability that the AIC criterion has selected the true model. The open circles and the dotted lines represent, respectively, the mean and the 0.1% and 0.9% quantiles of the ratios MSEP(2)/MSEP(1).

Mentions: In Figure 1 we summarize the results of the simulations for several values of K and q, averaged over the distance constraint configurations (δmin = 1, 2, 3 and 4). At first sight the 5 criteria behave similarly whatever the configuration, except for AIC3, which markedly underperforms for small values of q (this can be explained by the higher penalty of AIC3 compared with the other criteria for small values of q). For a reasonable sample size relative to the true number of components, the meta-analysis appears to be more efficient than strategy 1. Since the AIC criterion performs relatively well in these simulations, we assume that there is no need for a specific theory to deal with this kind of mixture model and that this criterion can be used to carry out model selection in this context. So, in Figure 2 we focus on the AIC criterion for the different distance configurations δmin = 1, 2, 3 and 4. This clearly shows that, for configurations with reasonable separation between the true positions of the QTL, the meta-analysis performs relatively well. It is worth noting that the higher the probability of choosing the true model, the better the quality of the QTL position estimates. To evaluate the ability of the meta-analysis to improve the precision on the "true" QTL locations, we computed the quantities |x_i − x̂_i(s)|, the absolute deviation between the true position of QTL i and its estimate under strategy s, and calculated the 95% and 90% quantiles of their empirical distribution over all the QTL for the two strategies. The smaller this confidence interval, the better the estimated position x̂_i(s). We report in Supplementary Table 1 (see Additional File 4) the average ratios of these quantities between the two strategies. Hence, if there are actually one, two, three or four different QTL locations with a reasonable separation (δmin ≥ 2), the meta-analysis gives better estimates of the QTL locations and makes it possible to reduce the length of the 95% CI (in most situations this length is halved). According to [29], to halve a CI in a QTL experiment, one needs at least twice the initial number of individuals.
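
The precision comparison described above can be made concrete with a small sketch: take the absolute deviation between each true QTL position and its estimate, compute an empirical quantile of those deviations for each strategy, and form the ratio. The helper names and the numbers are made up for illustration and are not taken from the paper or its supplementary table.

```python
import math


def empirical_quantile(values, p):
    """Smallest observed value v such that at least a fraction p of values are <= v."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered)))
    return ordered[rank - 1]


def deviation_quantile(true_positions, estimates, p):
    """Empirical p-quantile of |true position - estimated position| over all QTL."""
    deviations = [abs(t - e) for t, e in zip(true_positions, estimates)]
    return empirical_quantile(deviations, p)


# Hypothetical true positions and estimates (cM) under two strategies.
true_positions = [10.0, 25.0, 40.0, 55.0, 70.0]
estimates_s1 = [14.0, 21.0, 46.0, 52.0, 75.0]  # e.g. a single-study strategy
estimates_s2 = [12.0, 23.0, 43.0, 54.0, 72.0]  # e.g. a meta-analysis strategy

q1 = deviation_quantile(true_positions, estimates_s1, 0.95)
q2 = deviation_quantile(true_positions, estimates_s2, 0.95)
ratio = q2 / q1  # a ratio below 1 means strategy 2 locates the QTL more precisely
print(q1, q2, ratio)
```

With these toy values the ratio is 0.5, which corresponds to the "halved" case mentioned in the text; real ratios would be averaged over many simulated data sets.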

