Self-organising maps and correlation analysis as a tool to explore patterns in excitation-emission matrix data sets and to discriminate dissolved organic matter fluorescence components.

Ejarque-Gonzalez E, Butturini A - PLoS ONE (2014)

Bottom Line: SOM is a pattern recognition method which clusters input EEMs and reduces their dimensionality without relying on any assumption about the data structure. According to our results, chemical industry effluents appeared to have unique and distinctive spectral characteristics. We conclude that SOM coupled with a correlation analysis procedure is a promising tool for studying large and heterogeneous EEM data sets.


Affiliation: Departament d'Ecologia, Facultat de Biologia, Universitat de Barcelona, Barcelona, Catalunya, Spain.

ABSTRACT
Dissolved organic matter (DOM) is a complex mixture of organic compounds, ubiquitous in marine and freshwater systems. Fluorescence spectroscopy, by means of Excitation-Emission Matrices (EEM), has become an indispensable tool to study DOM sources, transport and fate in aquatic ecosystems. However, the statistical treatment of large and heterogeneous EEM data sets still represents an important challenge for biogeochemists. Recently, Self-Organising Maps (SOM) have been proposed as a tool to explore patterns in large EEM data sets. SOM is a pattern recognition method which clusters input EEMs and reduces their dimensionality without relying on any assumption about the data structure. In this paper, we show how SOM, coupled with a correlation analysis of the component planes, can be used both to explore patterns among samples and to identify individual fluorescence components. We analysed a large and heterogeneous EEM data set, including samples from a river catchment collected under a range of hydrological conditions, along a 60-km downstream gradient, and under the influence of different degrees of anthropogenic impact. According to our results, chemical industry effluents appeared to have unique and distinctive spectral characteristics. On the other hand, river samples collected under flash flood conditions showed homogeneous EEM shapes. The correlation analysis of the component planes suggested the presence of four fluorescence components, consistent with DOM components previously described in the literature. A remarkable strength of this methodology was that outlier samples appeared naturally integrated in the analysis. We conclude that SOM coupled with a correlation analysis procedure is a promising tool for studying large and heterogeneous EEM data sets.
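
The workflow summarised in the abstract (train a SOM on unfolded EEMs, then correlate its component planes) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the synthetic two-fluorophore data, the 6 x 8 wavelength grid, the 4 x 4 map, the learning-rate and neighbourhood schedules, and the function names `train_som` and `component_plane_correlation`. It is a plain online SOM written with NumPy, not the authors' implementation.

```python
import numpy as np

def train_som(data, rows=4, cols=4, n_iter=500, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small online SOM; returns a codebook (rows*cols, n_features)."""
    rng = np.random.default_rng(seed)
    n_samples = data.shape[0]
    # Initialise each unit with a randomly chosen input sample.
    codebook = data[rng.integers(0, n_samples, rows * cols)].astype(float)
    # (row, col) position of every unit on the map, for neighbourhood distances.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                      # learning rate decays to 0
        sigma = sigma0 * (1.0 - frac) + 0.5 * frac   # neighbourhood shrinks to 0.5
        x = data[rng.integers(0, n_samples)]
        # Best-matching unit: the unit closest to the sample (Euclidean).
        bmu = np.argmin(((codebook - x) ** 2).sum(axis=1))
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2.0 * sigma ** 2))         # Gaussian neighbourhood
        codebook += lr * h[:, None] * (x - codebook)
    return codebook

def component_plane_correlation(codebook):
    """Pearson correlation between component planes (codebook columns)."""
    return np.corrcoef(codebook, rowvar=False)

# Synthetic EEMs (illustrative only): two Gaussian-shaped fluorophores.
rng = np.random.default_rng(1)
ex = np.linspace(230, 450, 6)   # hypothetical excitation grid (nm)
em = np.linspace(300, 550, 8)   # hypothetical emission grid (nm)
EX, EM = np.meshgrid(ex, em, indexing="ij")

def peak(ex0, em0, width=40.0):
    return np.exp(-((EX - ex0) ** 2 + (EM - em0) ** 2) / (2 * width ** 2))

shapes = np.stack([peak(270, 310).ravel(), peak(350, 450).ravel()])
conc = rng.uniform(0, 1, size=(60, 2))               # 60 synthetic samples
data = conc @ shapes + 0.01 * rng.normal(size=(60, shapes.shape[1]))

codebook = train_som(data)                 # one row per SOM unit
corr = component_plane_correlation(codebook)   # one row/column per wavelength pair
```

Each column of `codebook` is one component plane, that is, the value of a single excitation-emission coordinate across all map units. Wavelength pairs whose planes are strongly correlated vary together over the map, which is the property the correlation analysis uses to delimit fluorescence components.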

pone-0099618-g007: Localisation of the fluorescence components. Representation of the four groups of wavelength coordinates determined by correlation analysis on the excitation-emission space.

Mentions: The four groups of wavelength coordinates (hereafter referred to as C1 to C4) are represented on the excitation-emission space in Figure 7. They appear spatially grouped in the optical plane and, moreover, overlap regions previously related to specific DOM fluorophores in the literature (Table 2). C4 corresponds to region V of Chen et al. [51] and broadly to peak C of Coble [54], both of which were associated with humic-like substances. This component has been detected in a wide range of aquatic environments, but mainly in waters draining forested catchments [2], and hence represents an indicator of terrestrially derived DOM [54]. In the same emission range, but at the lowest excitation wavelengths, component C3 is apparent. Like C4, it has been associated with humic-like components of terrestrial origin, but with a higher molecular weight and a more freshly released character [2], [55]. In the region of the EEM with the lowest emission wavelengths there are two spots centred at λex/λem = 230/330 nm and 270/310 nm (C1), close to the coordinates of maximal fluorescence of tyrosine [56]. Hence, components appearing at these wavelengths have been attributed to peptide material resembling or containing tyrosine, indicating the presence of autochthonous, microbially derived DOM [57]. Finally, C2 covers an area surrounding the previous protein-like spots, overlapping the region occupied by tryptophan [56]. This component has also been reported to reflect microbial activity, and has been used as an indicator of anthropogenic DOM inputs [58]–[60].
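
How groups of wavelength coordinates such as C1 to C4 might be obtained from a correlation matrix and drawn back on the excitation-emission plane can be sketched as follows. The grouping rule is not given in this excerpt, so the one used here is an assumption: two coordinates are linked when their component planes correlate above a threshold, and each connected set of linked coordinates forms a group. The synthetic codebook, the 4 x 6 grid, the threshold of 0.8 and the function name `group_by_correlation` are likewise hypothetical.

```python
import numpy as np

def group_by_correlation(codebook, threshold=0.8):
    """Label codebook columns; columns linked by r > threshold share a label."""
    corr = np.corrcoef(codebook, rowvar=False)
    n = corr.shape[0]
    labels = np.full(n, -1)
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Flood-fill over the graph whose edges are strong correlations.
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(corr[i] > threshold):
                if labels[j] == -1:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels

# Synthetic codebook (illustrative only): 25 SOM units, 4 x 6 wavelength grid.
rng = np.random.default_rng(0)
n_units, n_ex, n_em = 25, 4, 6
ex_idx, em_idx = np.meshgrid(np.arange(n_ex), np.arange(n_em), indexing="ij")
# Four rectangular blocks of the grid play the role of four components.
true_group = (2 * (ex_idx >= 2) + (em_idx >= 3)).ravel()

# Four zero-mean, mutually orthogonal profiles across the map units.
basis = np.column_stack([np.ones(n_units), rng.normal(size=(n_units, 4))])
profiles = np.linalg.qr(basis)[0][:, 1:]
# Every wavelength coordinate follows the profile of its block, plus noise.
codebook = profiles[:, true_group] + 0.01 * rng.normal(size=(n_units, n_ex * n_em))

labels = group_by_correlation(codebook)
label_map = labels.reshape(n_ex, n_em)   # groups laid out on the ex/em grid
```

Reshaping the labels to the excitation-by-emission grid gives a map in the spirit of Figure 7, where each group occupies a contiguous region of the optical plane. With real data the threshold would need tuning, and a hierarchical clustering of the correlation matrix would be a reasonable alternative to the thresholded linking assumed here.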

