Identification of metabolites from 2D (1)H-(13)C HSQC NMR using peak correlation plots.

Öman T, Tessem MB, Bathen TF, Bertilsson H, Angelsen A, Hedenström M, Andreassen T - BMC Bioinformatics (2014)

Bottom Line: For the identification of individual metabolites in metabolomics, correlation or covariance between peaks in (1)H NMR spectra has previously been successfully employed. The identities of these metabolites were confirmed by comparing the correlation plots with reported NMR data, mostly from the Human Metabolome Database. The correlation plots highlight cross-peaks belonging to each individual compound and, unlike conventional NMR experiments, are not limited by long-range magnetization transfer.


Affiliation: Department of Chemistry, Umeå University, Umeå, Sweden. tommy.oman@ltu.se.

ABSTRACT

Background: Identification of individual components in complex mixtures is an important and sometimes daunting task in several research areas, such as metabolomics and natural product studies. NMR spectroscopy is an excellent technique for analysis of mixtures of organic compounds and gives a detailed chemical fingerprint of most individual components above the detection limit. For the identification of individual metabolites in metabolomics, correlation or covariance between peaks in (1)H NMR spectra has previously been successfully employed. Similar correlation of 2D (1)H-(13)C Heteronuclear Single Quantum Correlation spectra was recently applied to investigate the structure of heparin. In this paper, we demonstrate how a similar approach can be used to identify metabolites in human biofluids (post-prostatic palpation urine).

Results: From 50 (1)H-(13)C Heteronuclear Single Quantum Correlation spectra, 23 correlation plots resembling pure metabolites were constructed. The identities of these metabolites were confirmed by comparing the correlation plots with reported NMR data, mostly from the Human Metabolome Database.

Conclusions: Correlation plots prepared by statistically correlating (1)H-(13)C Heteronuclear Single Quantum Correlation spectra from human biofluids provide unambiguous identification of metabolites. The correlation plots highlight cross-peaks belonging to each individual compound and, unlike conventional NMR experiments, are not limited by long-range magnetization transfer.


Figure 1: Procedure for generating correlation plots. Each spectrum is transformed to a row vector where the chemical shifts for both 1H and 13C are encoded, forming a matrix with dimensions n × K (step 1). By plotting one of these vectors, real signals are easily discerned from noise and an appropriate noise threshold may be selected. Data points are removed from the matrix only when all values in the column (from all HSQC spectra) are lower than the selected threshold. This noise exclusion step results in a final matrix X of a more manageable size that still contains all relevant information (step 2). Any of the rows in X can be transformed to a matrix of the original format and plotted as a noise-free HSQC spectrum. From this plot, a cross-peak (coordinate) of interest may be selected, corresponding to the column vector vpeak (step 3) in X. At this point, X (and vpeak) are auto-scaled and a correlation vector cpeak is calculated according to equation 2. This vector will contain values between −1 and 1, i.e. correlation coefficients, and can be visualized as a 2D spectrum after re-introducing zeros to the data points omitted in the noise exclusion step, followed by transformation to a matrix with the same dimensions as the original data (step 4). A cutoff for the correlation is then chosen for the visualization, for example 0.9, to only show peaks highly correlated (>0.9) with the chosen peak.
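The four steps in the caption can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the authors' Matlab implementation: the function name `correlation_plot`, its parameters, and the array layout (n spectra stacked as an n × rows × columns array) are hypothetical, and "equation 2" is assumed here to be the ordinary Pearson correlation coefficient computed on auto-scaled (mean-centered, unit-variance) columns.

```python
import numpy as np

def correlation_plot(spectra, peak_rc, noise_threshold, cutoff=0.9):
    """Hypothetical sketch of the correlation-plot procedure.

    spectra: array of shape (n, rows, cols) holding n aligned HSQC spectra.
    peak_rc: (row, col) index of the selected cross-peak data point.
    """
    n, n_rows, n_cols = spectra.shape

    # Step 1: unfold every 2D spectrum into a row vector (n x K matrix).
    flat = spectra.reshape(n, n_rows * n_cols)

    # Step 2: drop a column only when ALL spectra are below the threshold.
    keep = (flat >= noise_threshold).any(axis=0)
    X = flat[:, keep]

    # Step 3: find the selected data point in the reduced matrix.
    k = np.ravel_multi_index(peak_rc, (n_rows, n_cols))
    if not keep[k]:
        raise ValueError("selected data point was removed as noise")
    j = np.count_nonzero(keep[:k])

    # Auto-scale: mean-center each column and scale to unit variance.
    Xc = X - X.mean(axis=0)
    std = Xc.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # constant columns end up with correlation 0
    Z = Xc / std

    # Assumed form of the correlation: Pearson r against the peak column.
    c_peak = Z.T @ Z[:, j] / (n - 1)

    # Step 4: re-introduce zeros for excluded points and fold back to 2D.
    full = np.zeros(n_rows * n_cols)
    full[keep] = c_peak
    corr_map = full.reshape(n_rows, n_cols)

    # Show only points whose correlation exceeds the chosen cutoff.
    return np.where(corr_map > cutoff, corr_map, 0.0)
```

Calling `correlation_plot(spectra, (row, col), noise_threshold=...)` returns an array with the same shape as one spectrum, in which only data points correlated above the cutoff with the selected point are non-zero. Negative correlations are discarded by this sketch because only values above the cutoff are kept.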

Mentions: This approach is similar to the one used by Rudd et al. [17]. The peaks of interest were selected in a point-and-click fashion from a plot of a representative HSQC spectrum. Each HSQC cross-peak encompasses a number of data points, and to compensate for small changes in chemical shift, the most central data point within each cross-peak was selected. This usually coincided with the local maximum. The calculated correlation coefficients range from −1 to 1, with 1 meaning perfect positive correlation. By plotting only the most highly correlated data points, i.e. setting a high cutoff for the correlation coefficient, HSQC spectra of seemingly pure compounds could be produced. A pictorial overview of the procedure is presented in Figure 1, starting from aligned and (optionally) normalized 1H-13C HSQC spectra. All steps, including alignment and normalization, have been implemented in Matlab (MathWorks, Natick, MA) scripts together with a graphical user interface developed in-house. The scripts import 1H-13C HSQC spectra in Bruker format (2rr files) and can also export the resulting correlation plots in Bruker format for visualization in TopSpin. All functions are activated from an intuitive graphical interface, making them easily accessible to inexperienced Matlab users. Matlab scripts are available upon request.
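The peak selection step might be approximated as follows. The text says the most central data point of a cross-peak was chosen and that it usually coincided with the local maximum, so this sketch substitutes a simpler heuristic: it snaps a clicked coordinate to the largest value in a small window around it. The function name `snap_to_local_max` and the `half_width` parameter are hypothetical and not taken from the authors' scripts.

```python
import numpy as np

def snap_to_local_max(spectrum, clicked_rc, half_width=2):
    """Return the (row, col) of the largest value near a clicked point.

    spectrum: 2D array holding one representative HSQC spectrum.
    clicked_rc: (row, col) index the user clicked on.
    half_width: assumed search radius in data points (hypothetical).
    """
    r, c = clicked_rc
    # Clip the search window to the spectrum borders.
    r0, r1 = max(r - half_width, 0), min(r + half_width + 1, spectrum.shape[0])
    c0, c1 = max(c - half_width, 0), min(c + half_width + 1, spectrum.shape[1])
    window = spectrum[r0:r1, c0:c1]

    # Position of the maximum inside the window, mapped back to the spectrum.
    dr, dc = np.unravel_index(np.argmax(window), window.shape)
    return int(r0 + dr), int(c0 + dc)
```

The returned index can then be used as the data point of interest when computing a correlation plot. For broad or overlapping cross-peaks the local maximum may differ from the central data point, in which case the selection would need manual adjustment.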

