Evaluating the utility of mid-infrared spectral subspaces for predicting soil properties.

Sila AM, Shepherd KD, Pokhariyal GP - Chemometr Intell Lab Syst (2016)

Bottom Line: The root mean square error of prediction was computed using a one-third holdout validation set. In summary, the results show that global models outperformed the subspace models. We therefore conclude that global models are more accurate than the local models except in a few cases.


Affiliation: World Agroforestry Centre (ICRAF), P.O. Box 30677-00100 GPO, Nairobi, Kenya; School of Mathematics, University of Nairobi, P.O. Box 30196-00100 GPO, Nairobi, Kenya.

ABSTRACT

We propose four methods for finding local subspaces in large spectral libraries: (a) cosine angle spectral matching; (b) hit quality index spectral matching; (c) self-organizing maps; and (d) archetypal analysis. We then evaluate the prediction accuracies of global and subspace calibration models. These methods were tested on a mid-infrared spectral library containing 1907 soil samples collected from 19 different countries under the Africa Soil Information Service project. Calibration models for pH, Mehlich-3 Ca, Mehlich-3 Al, total carbon and clay were developed for the whole library and for the subspaces. Predictive performance of the subspace and global models was evaluated using the root mean square error of prediction, computed on a one-third holdout validation set. The effect of pretreating spectra was tested for 1st and 2nd derivative Savitzky-Golay filtering, multiplicative scatter correction, standard normal variate, and standard normal variate followed by detrending. In summary, the results show that global models outperformed the subspace models. We therefore conclude that global models are more accurate than the local models except in a few cases. For instance, sand and clay root mean square error values from local models obtained with the archetypal analysis method were 50% poorer than those of the global models, except for subspace models built on multiplicative scatter corrected spectra, which were 12% better. However, the subspace approach provides novel methods for discovering data patterns that may exist in large spectral libraries.
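The two spectral matching measures and the error metric named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the hit quality index is written here as the squared cosine between two spectra, which is one commonly used definition and is assumed rather than taken from the article, and the function names and toy arrays are hypothetical.

```python
import numpy as np

def cosine_similarity(x, y):
    """Cosine of the angle between two spectra treated as vectors."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))

def hit_quality_index(x, y):
    """Assumed HQI definition: (x.y)^2 / ((x.x)(y.y)), i.e. squared cosine."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    return float(np.dot(x, y) ** 2 / (np.dot(x, x) * np.dot(y, y)))

def assign_subspace(spectra, references):
    """Assign each spectrum (row) to the reference spectrum it matches best.

    Rows are scaled to unit length so the matrix product holds cosines;
    the index of the largest cosine is the assigned subspace.
    """
    s = np.asarray(spectra, dtype=float)
    r = np.asarray(references, dtype=float)
    s = s / np.linalg.norm(s, axis=1, keepdims=True)
    r = r / np.linalg.norm(r, axis=1, keepdims=True)
    return np.argmax(s @ r.T, axis=1)

def rmsep(y_true, y_pred):
    """Root mean square error of prediction on a holdout set."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

# Toy data (hypothetical): two reference spectra and three sample spectra.
references = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 1.0]])
samples = np.array([[0.9, 0.1, 0.0],
                    [0.1, 1.0, 0.8],
                    [2.0, 0.2, 0.1]])
subspace_labels = assign_subspace(samples, references)
```

Because both measures depend only on the angle between spectra, a sample and a scaled copy of it receive the same subspace label; in practice the references would be the pure-mineral or archetype spectra rather than toy vectors.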


1st derivative preprocessed MIR spectra PCA scores' sample points labeled in each sample space.
© Copyright Policy - CC BY

Mentions: The distribution of the samples within their local spaces is shown in Fig. 3 using a score plot of the first two principal components (PCs) for all the 1906 samples used in this study. The first two PCs explain up to 74.4% of the original mid-infrared spectral variation, which comprises both physical and chemical soil information. By coloring and labeling sample points according to their local subspaces, we showed how well some of the subspace methods discovered hidden structure in the global spectral library. For instance, SOMSS gave well-separated clusters, labeled SOM1, SOM2, SOM3 and SOM4. When the points were projected into a PC score plot and read side by side with the subspaces from HQISS, it was easy to relate SOM1 samples to the soil samples identified as close to the pure quartz sample. Samples in SOM2 can be said to belong to the sample class associated with the pure Montmorillonite mineral. SOM3 gave mixed samples associated with the Halloysite, Montmorillonite and Illite pure minerals identified in HQISS. SOM4 was also a mixed bag when related to samples identified in both ArchetypeSS and HQISS: in ArchetypeSS it is dominated by archetype-1, interspersed with the few samples assigned to archetype-3, and in HQISS it is a mixture of samples associated with Montmorillonite and Illite. Using Tukey's test, we found that mean total carbon differed significantly between the subspaces obtained using SOMSS and ArchetypeSS.
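A score plot of this kind rests on two quantities: the PC scores of each sample and the share of variance each PC explains. The sketch below computes both with a singular value decomposition, assuming mean-centered spectra and random stand-in data; the array shapes, label vector and function name are hypothetical and do not reproduce the library or the 74.4% figure reported above.

```python
import numpy as np

def pca_scores(X, n_components=2):
    """Return PC scores and explained-variance ratios via SVD.

    X is (n_samples, n_wavenumbers); columns are mean-centered first.
    """
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * s[:n_components]   # sample coordinates
    explained = s ** 2 / np.sum(s ** 2)               # variance share per PC
    return scores, explained[:n_components]

# Stand-in data (hypothetical): 30 "spectra" with 50 variables each,
# and a subspace label (0..3) per sample, e.g. from a clustering step.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 50))
labels = np.arange(30) % 4

scores, explained = pca_scores(X)

# Centroid of each subspace in the PC1-PC2 plane; plotting the scores
# colored by label gives the kind of figure described in the text.
centroids = np.array([scores[labels == k].mean(axis=0) for k in range(4)])
```

Summing `explained` gives the fraction of total spectral variance captured by the plotted components, which is the quantity the text quotes for the first two PCs.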

