Joint Analysis of Dependent Features within Compound Spectra Can Improve Detection of Differential Features.

Trutschel D, Schmidt S, Grosse I, Neumann S - Front Bioeng Biotechnol (2015)

Bottom Line: After the initial feature detection and alignment steps, the raw data processing results in a high-dimensional data matrix of mass spectral features, which is then subjected to further statistical analysis. For a quantitative evaluation, data sets with a simulated known effect between two sample classes were analyzed. The spectra-wise analysis showed better detection results for all simulated effects.


Affiliation: Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, Halle, Germany; Institute of Computer Science, Martin Luther University Halle-Wittenberg, Halle, Germany.

ABSTRACT
Mass spectrometry is an important analytical technology in metabolomics. After the initial feature detection and alignment steps, the raw data processing results in a high-dimensional data matrix of mass spectral features, which is then subjected to further statistical analysis. Univariate tests such as Student's t-test and analysis of variance (ANOVA) are hypothesis tests that aim to detect differences between two or more sample classes, e.g., wild-type versus mutant, or different doses of a treatment. In both cases, one of the underlying assumptions is independence between metabolic features. However, in mass spectrometry, a single metabolite usually gives rise to several mass spectral features, which are observed together and show a common behavior. This paper proposes grouping the related features of metabolites into compound spectra with CAMERA, and then using a multivariate statistical method to test whether a compound spectrum (and thus the actual metabolite) is differential between two sample classes. The multivariate method is first demonstrated with an analysis between wild-type and an over-expression line of the model plant Arabidopsis thaliana. For a quantitative evaluation, data sets with a simulated known effect between two sample classes were analyzed. The spectra-wise analysis showed better detection results for all simulated effects.
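
To make the idea of a spectrum-wise multivariate test concrete, the following is a minimal sketch, assuming that the multivariate method is a two-sample Hotelling T2 test (the statistic named in the Figure 3 discussion) and that the "diagonal" variant simply ignores the off-diagonal entries of the pooled covariance matrix. The function names hotelling_t2 and hotelling_pvalue and the synthetic data are illustrative assumptions, not the authors' implementation. The F-based p-value applies only to the full-covariance statistic and requires more samples than features in a compound spectrum.

```python
import numpy as np
from scipy import stats


def hotelling_t2(x, y, diagonal=False):
    """Two-sample Hotelling T^2 statistic for one compound spectrum.

    x, y: arrays of shape (n_samples, n_features), one per sample class.
    diagonal=True keeps only the variances of the pooled covariance
    (assumed here to correspond to the "diagonal T2").
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n1, n2 = x.shape[0], y.shape[0]
    diff = x.mean(axis=0) - y.mean(axis=0)
    # Pooled sample covariance of the two classes
    pooled = ((n1 - 1) * np.cov(x, rowvar=False)
              + (n2 - 1) * np.cov(y, rowvar=False)) / (n1 + n2 - 2)
    pooled = np.atleast_2d(pooled)
    if diagonal:
        pooled = np.diag(np.diag(pooled))
    return float(n1 * n2 / (n1 + n2) * diff @ np.linalg.solve(pooled, diff))


def hotelling_pvalue(t2, n1, n2, p):
    """p-value of the full-covariance T^2 via its F transformation.

    Valid only when n1 + n2 - p - 1 > 0.
    """
    dof2 = n1 + n2 - p - 1
    f_stat = t2 * dof2 / (p * (n1 + n2 - 2))
    return float(stats.f.sf(f_stat, p, dof2))


# Synthetic example (not data from the paper): 3 features, 10 samples per class
rng = np.random.default_rng(1)
wild_type = rng.normal(size=(10, 3))
over_expr = rng.normal(loc=0.5, size=(10, 3))
t2 = hotelling_t2(wild_type, over_expr)
print(t2, hotelling_pvalue(t2, 10, 10, 3))
print(hotelling_t2(wild_type, over_expr, diagonal=True))
```

For a spectrum with a single feature, the statistic reduces to the squared pooled-variance t statistic, so the univariate test is a special case of this sketch.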



Figure 3: Venn diagram of differential features and compound spectra in the simulation experiment for the simulated effect 0.5 and significance level of α = 0.05. Left: number of features detected by univariate and multivariate method. Right: number of compound spectra detected by the multivariate method, compared to the number of compound spectra where at least one feature was detected univariately.

Mentions: The Venn diagram in Figure 3 (left) shows that 242 features are detected as differential by all three tests, 243 by both the univariate test and the T² test, and 258 by both the univariate test and the diagonal T² test. Comparing the univariate test with the original T² shows that 16 features are found only by the univariate test and 328 only by the multivariate method. The same comparison for the diagonal T² shows that 1 feature is found only by the univariate test and 253 only by the multivariate method. Furthermore, 200 features are found by both multivariate methods. Feature detection therefore overlaps more between the two multivariate methods than between either of them and the univariate approach. We are especially interested in cases where the multivariate methods identify a compound spectrum as differential while the univariate method detects none of its features, and in cases where the univariate method detects features whose compound spectrum is missed by the multivariate methods (Figure 3 right). Here, 7 compound spectra are detected by both multivariate methods, 29 by the original T², and 25 by the diagonal T², although none of their features is detected by the univariate method. In contrast, 5 compound spectra have at least one feature detected by the univariate test but are not themselves identified by the original T²; for the diagonal T², this is the case for 1 compound spectrum. Finally, 83 groups are detected by all three tests, 84 by the univariate test and the T², and 98 by the univariate test and the diagonal T² (Figure 3 right).
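
The overlap counts in such a Venn diagram can be derived from the sets of features (or compound spectra) that each test flags. Below is a minimal sketch, assuming each test's detections are available as a collection of identifiers; the function name venn3_regions and the region labels are hypothetical. The final lines check only one piece of arithmetic implied by the reported feature counts: both pairwise comparisons must give the same total number of univariate detections.

```python
def venn3_regions(uni, t2, diag_t2):
    """Count the seven disjoint regions of a three-set Venn diagram.

    uni, t2, diag_t2: iterables of identifiers detected by the
    univariate test, the T2 test, and the diagonal T2 test.
    """
    u, t, d = set(uni), set(t2), set(diag_t2)
    return {
        "all_three": len(u & t & d),
        "uni_and_t2_only": len((u & t) - d),
        "uni_and_diag_only": len((u & d) - t),
        "t2_and_diag_only": len((t & d) - u),
        "uni_only": len(u - t - d),
        "t2_only": len(t - u - d),
        "diag_only": len(d - u - t),
    }


# Consistency check on the feature counts quoted in the text:
# univariate detections = shared with T2 + univariate-only (vs. T2)
uni_total_vs_t2 = 243 + 16
# univariate detections = shared with diagonal T2 + univariate-only (vs. diagonal T2)
uni_total_vs_diag = 258 + 1
assert uni_total_vs_t2 == uni_total_vs_diag == 259
```

Working with disjoint regions rather than pairwise overlaps avoids counting the same feature twice when the totals are summed.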

