PhenoMeter: A Metabolome Database Search Tool Using Statistical Similarity Matching of Metabolic Phenotypes for High-Confidence Detection of Functional Links.

Carroll AJ, Zhang P, Whitehead L, Kaines S, Tcherkez G, Badger MR - Front Bioeng Biotechnol (2015)

Bottom Line: To identify a similarity measure that would detect functional links as reliably as possible, we compared the performance of four statistics in correctly top-matching metabolic phenotypes of Arabidopsis thaliana metabolism mutants affected in different steps of the photorespiration metabolic pathway to reference phenotypes of mutants affected in the same enzymes by independent mutations. Attempts to match metabolic phenotypes between independent studies were met with varying success and possible reasons for this are discussed. Overall, our results suggest that integration of pattern-based search tools into metabolomics databases will aid functional annotation of newly recorded metabolic phenotypes analogously to the way sequence similarity search algorithms have aided the functional annotation of genes and proteins.


Affiliation: College of Medicine, Biology and Environment, Research School of Biology, The Australian National University, Canberra, ACT, Australia.

ABSTRACT
This article describes PhenoMeter (PM), a new type of metabolomics database search that accepts metabolite response patterns as queries and searches the MetaPhen database of reference patterns for responses that are statistically significantly similar or inverse for the purposes of detecting functional links. To identify a similarity measure that would detect functional links as reliably as possible, we compared the performance of four statistics in correctly top-matching metabolic phenotypes of Arabidopsis thaliana metabolism mutants affected in different steps of the photorespiration metabolic pathway to reference phenotypes of mutants affected in the same enzymes by independent mutations. The best performing statistic, the PM score, was a function of both Pearson correlation and Fisher's Exact Test of directional overlap. This statistic outperformed Pearson correlation, biweight midcorrelation and Fisher's Exact Test used alone. To demonstrate general applicability, we show that the PM reliably retrieved the most closely functionally linked response in the database when queried with responses to a wide variety of environmental and genetic perturbations. Attempts to match metabolic phenotypes between independent studies were met with varying success and possible reasons for this are discussed. Overall, our results suggest that integration of pattern-based search tools into metabolomics databases will aid functional annotation of newly recorded metabolic phenotypes analogously to the way sequence similarity search algorithms have aided the functional annotation of genes and proteins. PM is freely available at MetabolomeExpress (https://www.metabolome-express.org/phenometer.php).



Figure 2: Calculation of PhenoMeter (PM) score. The procedure for calculating the PhenoMeter similarity score consists of several stages. First, metabolites that are not represented or do not increase or decrease by at least the minimum threshold (1.5-fold by default) in both bait and prey are discarded. Then, signal intensity ratios (SIRs) associated with each metabolite are transformed to ResponseValues (RV = SIR − 1 where SIR > 1 and RV = (−1/SIR) + 1 where SIR < 1). The correlation (R) between the RVs of the bait and prey phenotypes is then calculated. The PhenoMeter then counts the number of metabolites that are (1) increased above threshold in both phenotypes; (2) decreased below threshold in both phenotypes; (3) decreased below threshold in the bait but increased above threshold in the reference; and (4) increased above threshold in the bait but decreased below threshold in the reference. These four counts are used as input to a two-tailed Fisher’s Exact Test to calculate the statistical significance of the qualitative overlap of the two phenotypes (FET2p). The PM score is then calculated using the formula PM score = sgn(R) × R² × (−log₁₀(FET2p)).
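The stages described in the Figure 2 caption can be sketched in Python roughly as follows. This is an illustrative reading of the caption, not the PhenoMeter source code. The function names, the dictionary input format (metabolite name mapped to SIR), the use of Pearson correlation for R, the handling of undefined correlations (scored as 0) and the pure-Python Fisher's Exact Test (summing all tables no more probable than the observed one) are assumptions made for this example.

```python
import math

def response_value(sir):
    """Transform a signal intensity ratio (SIR) into a ResponseValue (RV)."""
    if sir > 1:
        return sir - 1.0
    if sir < 1:
        return -1.0 / sir + 1.0
    return 0.0

def is_changed(sir, threshold):
    """True if the metabolite rises or falls by at least `threshold`-fold."""
    return sir >= threshold or sir <= 1.0 / threshold

def pearson_r(xs, ys):
    """Pearson correlation; returns 0.0 when undefined (assumption of this sketch)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def fisher_exact_two_tailed(a, b, c, d):
    """Two-tailed Fisher's Exact Test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(total, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_value = sum(prob(x) for x in range(lowest, highest + 1)
                  if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p_value)

def pm_score(bait, prey, threshold=1.5):
    """PM score = sgn(R) * R^2 * (-log10(FET2p)) for two {metabolite: SIR} dicts."""
    # Stage 1: keep metabolites present and changed beyond threshold in both.
    shared = [m for m in bait
              if m in prey
              and is_changed(bait[m], threshold)
              and is_changed(prey[m], threshold)]
    if len(shared) < 2:
        return 0.0  # too few metabolites to correlate (assumption of this sketch)
    # Stages 2 and 3: transform SIRs to RVs and correlate them.
    r = pearson_r([response_value(bait[m]) for m in shared],
                  [response_value(prey[m]) for m in shared])
    # Stage 4: count the four directional-overlap categories.
    up_up = sum(bait[m] > 1 and prey[m] > 1 for m in shared)
    down_down = sum(bait[m] < 1 and prey[m] < 1 for m in shared)
    down_up = sum(bait[m] < 1 and prey[m] > 1 for m in shared)
    up_down = sum(bait[m] > 1 and prey[m] < 1 for m in shared)
    fet2p = fisher_exact_two_tailed(up_up, up_down, down_up, down_down)
    # Stage 5: combine sign, squared correlation and significance.
    return math.copysign(1.0, r) * r ** 2 * -math.log10(fet2p)

# Hypothetical SIR values, invented purely to exercise the functions.
bait = {"glycine": 4.0, "serine": 2.0, "alanine": 3.0, "glycerate": 0.5,
        "glutamate": 0.25, "malate": 0.4, "sucrose": 1.1}
prey = {"glycine": 3.0, "serine": 2.5, "alanine": 2.0, "glycerate": 0.4,
        "glutamate": 0.5, "malate": 0.5, "sucrose": 0.9, "proline": 5.0}
inverse_prey = {name: 1.0 / sir for name, sir in prey.items()}

similar_score = pm_score(bait, prey)
inverse_score = pm_score(bait, inverse_prey)
print(similar_score, inverse_score)
```

Because the RV transformation is symmetric about SIR = 1, inverting every ratio in the prey flips the sign of R while leaving the Fisher's Exact Test p-value unchanged, so the inverse phenotype receives a score of equal magnitude and opposite sign. This is consistent with the tool reporting both similar and inverse responses.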

While simply replacing R² with R would have achieved the same sign-changing effect, we observed lower matching performance when R² was replaced with R (data not shown due to limited space). Because the magnitude of the PM score is derived from two readily understood statistics (R² and FET2p), its “strength” is readily interpretable in familiar statistical terms. For example, a match with a marginally significant FET2p of 0.05 and an R² of 0.8 (an above-average R² for genuine matches between functionally equivalent phenotypes; see Table 2) would have a PM score of ~1 (1.040824). This makes a score of 1 a useful benchmark: scores <1 must have either a low R² or an insignificant FET2p, while scores >>1 must have at least a highly significant FET2p, if not a high R² as well. Thus, as a “rule of thumb,” scores <1 may be considered “weak” while scores >>1 may be considered “strong.” An example PM score calculation is illustrated in Figure 2.
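The benchmark arithmetic quoted above can be checked with a few lines of Python. The helper names `pm_magnitude` and `fet2p_needed_for_unit_score` are hypothetical and introduced only for this illustration. The second helper simply rearranges the PM score formula to show how significant FET2p would need to be for a given R² to reach the benchmark score of 1.

```python
import math

def pm_magnitude(r_squared, fet2p):
    """Magnitude of the PM score: R^2 * (-log10(FET2p))."""
    return r_squared * -math.log10(fet2p)

def fet2p_needed_for_unit_score(r_squared):
    """FET2p at which the PM score magnitude equals exactly 1 for a given R^2."""
    return 10 ** (-1.0 / r_squared)

# Benchmark from the text: R^2 = 0.8 and FET2p = 0.05.
benchmark = pm_magnitude(0.8, 0.05)
print(round(benchmark, 6))  # 1.040824

# A weaker correlation (R^2 = 0.5) needs FET2p = 0.01 to reach a score of 1.
needed = fet2p_needed_for_unit_score(0.5)
print(needed)
```

The second calculation illustrates the trade-off behind the rule of thumb: as R² falls, the directional overlap must be correspondingly more significant for a match to reach a score of 1.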

