Comparative methods for association studies: a case study on metabolite variation in a Brassica rapa core collection.

Pino Del Carpio D, Basnet RK, De Vos RC, Maliepaard C, Paulo MJ, Bonnema G - PLoS ONE (2011)

Bottom Line: It is essential to separate the true effect of genetic variation from other confounding factors, such as adaptation to different uses and geographical locations. This set of markers associated with the metabolites can potentially be applied for the selection of genotypes with elevated levels of these metabolites. The incorporation of the kinship correction into the association model did not reduce the number of significantly associated markers.


Affiliation: Laboratory of Plant Breeding, Wageningen University, Wageningen, The Netherlands.

ABSTRACT

Background: Association mapping is a statistical approach combining phenotypic traits and genetic diversity in natural populations with the goal of correlating the variation present at phenotypic and allelic levels. It is essential to separate the true effect of genetic variation from other confounding factors, such as adaptation to different uses and geographical locations. The rapid availability of large datasets makes it necessary to explore statistical methods that can be computationally less intensive and more flexible for data exploration.

Methodology/principal findings: A core collection of 168 Brassica rapa accessions of different morphotypes and origins was explored to find genetic associations between markers and metabolites: tocopherols, carotenoids, chlorophylls and folate. A widely used linear model, modified to account for population structure and kinship, was used for association mapping. In addition, a machine learning algorithm called Random Forest (RF) was used as a comparison. Comparing results across methods led to the selection of a set of significant markers as promising candidates for further work. This set of markers associated with the metabolites can potentially be applied for the selection of genotypes with elevated levels of these metabolites.

Conclusions/significance: The incorporation of the kinship correction into the association model did not reduce the number of significantly associated markers. However, incorporation of the STRUCTURE correction (Q matrix) in the linear regression model greatly reduced the number of significantly associated markers. Additionally, our results demonstrate that RF is an interesting complementary method with added value in association studies in plants, as illustrated by the overlap in markers identified using RF and a linear mixed model with correction for kinship and population structure. Several markers that were selected in RF and in the models with correction for kinship, but not for population structure, were also identified as QTLs in two bi-parental DH populations.
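Why a population-structure correction can remove many marker associations is easiest to see on synthetic data. The sketch below is not the authors' model: it assumes hard assignment of each accession to one subpopulation (STRUCTURE's Q matrix holds fractional memberships), and it replaces the regression on Q with within-group centering, which gives the same marker-trait correlation only in that hard-assignment case. All names, allele frequencies and trait means are hypothetical.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def center_within_groups(values, groups):
    """Subtract each group's mean from its members (hard-assignment stand-in for Q)."""
    sums, counts = {}, {}
    for v, g in zip(values, groups):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - sums[g] / counts[g] for v, g in zip(values, groups)]


# Hypothetical data: 4 subpopulations x 42 accessions = 168 accessions.
# Allele frequency and trait mean both differ by subpopulation, but the
# marker has NO effect on the trait within any subpopulation.
rng = random.Random(0)
allele_freq = {0: 0.1, 1: 0.3, 2: 0.7, 3: 0.9}
trait_mean = {0: 0.0, 1: 1.0, 2: 2.0, 3: 3.0}

groups, marker, trait = [], [], []
for g in range(4):
    for _ in range(42):
        groups.append(g)
        marker.append(1.0 if rng.random() < allele_freq[g] else 0.0)
        trait.append(trait_mean[g] + rng.gauss(0.0, 0.5))

# Naive association: confounded by subpopulation membership.
r_naive = pearson(marker, trait)

# Structure-corrected association: remove subpopulation means first.
r_corrected = pearson(
    center_within_groups(marker, groups),
    center_within_groups(trait, groups),
)

print(f"naive r = {r_naive:.3f}, structure-corrected r = {r_corrected:.3f}")
```

In this toy setting the naive correlation is driven entirely by differences between subpopulations, so it shrinks toward zero once those differences are removed. A kinship correction would need a random effect with a relatedness matrix and is not shown here.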



pone-0019624-g001: Principal coordinate analysis (A) and STRUCTURE (B) results. Colors define subpopulations: red (oil: population 1), green (PC+T: population 2), blue (CC: population 3) and yellow (VT+FT: population 4).

Mentions: The Bayesian clustering method implemented in STRUCTURE revealed 4 subpopulations. Subpopulation 1 included oil types of Indian origin, spring oil (SO), yellow sarson (YS) and rapid cycling (RC) (SO, YS and RC); subpopulation 2 included several types of Asian origin: pak choi (PC), winter oil, mizuna, mibuna, komatsuna, turnip green, oil rape and Asian turnip (PC+T); subpopulation 3 included mainly accessions of Chinese cabbage (CC); and subpopulation 4 included mostly vegetable turnip (VT), fodder turnip (FT) and broccoletto accessions of European origin (VT+FT) (Figure 1B).

