Towards systems genetic analyses in barley: Integration of phenotypic, expression and genotype data into GeneNetwork.

Druka A, Druka I, Centeno AG, Li H, Sun Z, Thomas WT, Bonar N, Steffenson BJ, Ullrich SE, Kleinhofs A, Wise RP, Close TJ, Potokina E, Luo Z, Wagner C, Schweizer GF, Marshall DF, Kearsey MJ, Williams RW, Waugh R - BMC Genet. (2008)

Bottom Line: By integrating barley genotypic, phenotypic and mRNA abundance data sets directly within GeneNetwork's analytical environment we provide simple web access to the data for the research community. In this environment, a combination of correlation analysis and linkage mapping provides the potential to identify and substantiate gene targets for saturation mapping and positional cloning. By integrating datasets from an unsequenced crop plant (barley) in a database that has been designed for an animal model species (mouse) with a well-established genome sequence, we prove the importance of the concept and practice of modular development and interoperability of software engineering for biological data sets.


Affiliation: Scottish Crop Research Institute, Invergowrie, Dundee, UK. Arnis.Druka@scri.ac.uk

ABSTRACT

Background: A typical genetical genomics experiment results in four separate data sets: genotype, gene expression, higher-order phenotypic data, and metadata that describe the protocols, processing and the array platform. Used in concert, these data sets provide the opportunity to perform genetic analysis at a systems level. Their predictive power is largely determined by the gene expression dataset, where tens of millions of data points can be generated using currently available mRNA profiling technologies. Such large, multidimensional data sets often have value beyond that extracted during their initial analysis and interpretation, particularly if the experiments were conducted on widely distributed reference genetic materials. Besides quality and scale, access to the data is of primary importance, as accessibility potentially allows the wider research community to extract considerable added value from the same primary dataset. Although the number of genetical genomics experiments in different plant species is rapidly increasing, none to date has been presented in a form that allows quick and efficient on-line testing for possible associations between genes, loci and traits of interest by an entire research community.

Description: Using a reference population of 150 recombinant doubled haploid barley lines, we generated novel phenotypic, mRNA abundance and SNP-based genotyping data sets, added them to a considerable volume of legacy trait data and entered them into GeneNetwork (http://www.genenetwork.org). GeneNetwork is a unified on-line analytical environment that enables the user to test genetic hypotheses about how component traits, such as mRNA abundance, may interact to condition more complex biological phenotypes (higher-order traits). Here we describe these barley data sets and demonstrate some of the functionalities GeneNetwork provides as an easily accessible and integrated analytical environment for exploring them.

Conclusion: By integrating barley genotypic, phenotypic and mRNA abundance data sets directly within GeneNetwork's analytical environment we provide simple web access to the data for the research community. In this environment, a combination of correlation analysis and linkage mapping provides the potential to identify and substantiate gene targets for saturation mapping and positional cloning. By integrating datasets from an unsequenced crop plant (barley) in a database that has been designed for an animal model species (mouse) with a well-established genome sequence, we prove the importance of the concept and practice of modular development and interoperability of software engineering for biological data sets.


Figure 3: Results of principal component analysis (A) and association network (B) show the relationships between the major barley phenotypic traits integrated into GeneNetwork. The network was built using scores of the first four principal components (c1–c4), calculated by combining data from a single trait measured in different locations and years, or from related (component) traits underlying a higher-order trait (e.g. malt quality data). For the latter, principal component scores for malting quality traits were calculated from combined alpha amylase, diastatic power, grain protein and malt extract trait values. Principal component node colouring: c1, black background; c2, grey; c3 and c4, white. Double-lined links: positive correlations; bold, thick links: negative correlations. For clarity, the network was re-drawn using GeneNetwork's output.

Mentions: An association network built from principal component scores, calculated using a selected set of malting quality and yield-related trait data as variables, provides an overview of the key barley traits that segregate in the St/Mx population (Figure 3, Additional File 3). The cumulative variation explained by the first four principal components ranged from around 90% for heading date to 40% for grain size (Figure 3A), suggesting a strong genetic component for the former and a more complex situation for the latter. The derived association network (Figure 3B) revealed some known and obvious relationships. For example, the main yield component 'yield-c1' (c1 = principal component 1) is negatively correlated with 'plant height-c1', 'lodging-c1' and 'lodging-c2'. In contrast, 'lodging-c1' and 'lodging-c2' are positively correlated with 'height-c1'. This is entirely consistent with taller plants lodging more, which results in grain loss during harvest. The St/Mx population was originally designed to dissect two contrasting barley traits, yield and malting quality [21]. The trait association network in Figure 3B shows links only between the minor components of these traits (malting-c1 to yield-c3 and malting-c2 to yield-c2), suggesting complex underlying genetics.
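
The two quantities discussed above, the cumulative variation explained by the leading principal components and the signed correlations that form the links of the association network, can be illustrated with a minimal Python sketch. It is not GeneNetwork's implementation. The function names, the 0.5 correlation threshold and the score values are all assumptions made for illustration, and the numbers are synthetic rather than taken from the St/Mx data.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def cumulative_variance(eigenvalues, k=4):
    """Fraction of total variance captured by the k largest eigenvalues."""
    ordered = sorted(eigenvalues, reverse=True)
    return sum(ordered[:k]) / sum(ordered)

def association_network(scores, threshold=0.5):
    """List (node_a, node_b, r, sign) for every pair with |r| >= threshold.

    The threshold is a hypothetical cut-off chosen for this sketch.
    """
    edges = []
    for a, b in combinations(sorted(scores), 2):
        r = pearson(scores[a], scores[b])
        if abs(r) >= threshold:
            sign = "positive" if r > 0 else "negative"
            edges.append((a, b, round(r, 3), sign))
    return edges

# Synthetic principal component scores for six hypothetical lines
# (illustrative values only, NOT data from the paper).
scores = {
    "yield-c1":   [2.0, 1.0, 0.5, -0.5, -1.0, -2.0],
    "height-c1":  [-1.8, -1.1, -0.2, 0.4, 0.9, 1.8],
    "lodging-c1": [-2.1, -0.8, -0.6, 0.7, 1.0, 1.8],
}

for edge in association_network(scores):
    print(edge)
```

With these made-up scores, the height and lodging nodes are joined by a positive link and both are joined to the yield node by negative links, mirroring the pattern described for Figure 3B. In a real analysis the scores would come from a principal component analysis of each trait across locations and years.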

