Informatics derived materials databases for multifunctional properties

View Article: PubMed Central - PubMed

ABSTRACT

In this review, we provide an overview of the development of quantitative structure–property relationships that incorporate the impact of data uncertainty from small, limited-knowledge data sets, from which we rapidly develop new and larger databases. Unlike traditional database development, this informatics-based approach is concurrent with the identification and discovery of the key metrics controlling structure–property relationships; even more importantly, we are now in a position to build materials databases based on design ‘intent’ and not just design parameters. This makes it possible, for example, to establish materials databases that can be used for targeted multifunctional properties and not just one characteristic at a time, as is presently done. This review summarizes the computational logic of building such virtual databases and gives some examples in the field of complex inorganic solids for scintillator applications.

© Copyright Policy - open-access

Figure 5: The description of data used in this analysis and the logic for developing a QSPR. The regression coefficients are defined as $W^{*}C^{T}$, the product of the weights that convert the predictor and predicted variables, respectively, into latent variable space. PLS operates by performing separate PCA-like analyses on the predictor matrix and the predicted matrix, outputting the weights needed to convert each matrix to latent variable space and the values of the materials in latent variable space (scores). The PLS mathematics performs matrix transformations so that the final predictive model calculates the properties as a function of the reduced descriptor set, the weights (importance) of the individual descriptors, and the weights of the predicted variables for the training data. This results in a computationally efficient model for rapid prediction of properties.
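
The relations summarized in the caption can be written out in one common PLS notation. This is a sketch under assumptions: only the weights W* and C and the scores are named in the caption, while the symbols T (scores), P (predictor loadings), W (raw weights), E and F (residuals), and B (regression coefficients) are borrowed from standard PLS usage and may not match the review's own notation. X and Y are taken to be mean-centered.

```latex
\begin{aligned}
X &= T P^{T} + E, & Y &= T C^{T} + F, \\
T &= X W^{*},     & W^{*} &= W \,(P^{T} W)^{-1}, \\
\hat{Y} &= X B,   & B &= W^{*} C^{T}.
\end{aligned}
```

Read this way, a prediction for a new material needs only its descriptor vector and the stored coefficient matrix B, which is consistent with the caption's remark that the resulting model is computationally efficient.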

In PLS the training data are converted to a data matrix with orthogonalized axes, which are chosen to capture the maximum amount of information in fewer dimensions, thus building on the mathematics of PCA [34–38]. The relationships discovered in the training data can be applied to a test dataset by projecting those data onto a high-dimensional hyperplane within the orthogonalized axis system. With PLS, the properties of the training data are modeled as a function of controllable parameters such as chemistry and processing. Typical linear regression models do not properly account for the collinearity between the descriptors, and as a result the isolated impact of each descriptor on the property cannot be accurately known. However, by projecting the data onto a high-dimensional space defined by axes that are orthogonalized linear combinations of the composite descriptors, the impact of each descriptor on the property can be identified independently of all other descriptors. Therefore, PLS is used here to identify structure–property relationships; the logic of PLS is provided in figure 5.
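
As an illustration of the paragraph above, here is a minimal sketch of PLS regression on collinear descriptors. It assumes a NIPALS-style algorithm and NumPy. The function names `pls_fit` and `pls_predict`, the synthetic descriptors, and the choice of two latent variables are all hypothetical and are not taken from the review. The coefficient matrix is formed as W*Cᵀ, following the figure 5 caption.

```python
import numpy as np


def pls_fit(X, Y, n_components, max_iter=500, tol=1e-10):
    """NIPALS-style PLS sketch. Returns (x_mean, y_mean, B) with B = W* C^T."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    if Y.ndim == 1:
        Y = Y[:, None]
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    E, F = X - x_mean, Y - y_mean          # centered predictor / predicted blocks
    W, P, C = [], [], []
    for _ in range(n_components):
        # Start from the property column with the largest remaining variance.
        u = F[:, [int(np.argmax(F.var(axis=0)))]]
        t_old = np.zeros((E.shape[0], 1))
        for _ in range(max_iter):
            w = E.T @ u                    # descriptor weights
            w /= np.linalg.norm(w)
            t = E @ w                      # scores in latent variable space
            c = F.T @ t / (t.T @ t)        # weights of the predicted variables
            u = F @ c / (c.T @ c)
            if np.linalg.norm(t - t_old) <= tol * np.linalg.norm(t):
                break
            t_old = t
        p = E.T @ t / (t.T @ t)            # predictor loadings
        E = E - t @ p.T                    # deflation keeps later scores orthogonal
        F = F - t @ c.T
        W.append(w)
        P.append(p)
        C.append(c)
    W, P, C = np.hstack(W), np.hstack(P), np.hstack(C)
    W_star = W @ np.linalg.inv(P.T @ W)    # maps centered X directly to scores
    B = W_star @ C.T                       # regression coefficients
    return x_mean, y_mean, B


def pls_predict(X_new, x_mean, y_mean, B):
    """Predict properties for new materials from their descriptors."""
    return (np.asarray(X_new, dtype=float) - x_mean) @ B + y_mean


# Synthetic, purely illustrative data: x2 is nearly collinear with x1.
rng = np.random.default_rng(0)
n = 40
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = x1 + x2 + 0.5 * x3 + 0.1 * rng.normal(size=n)

x_mean, y_mean, B = pls_fit(X, y, n_components=2)
print("PLS coefficients per descriptor:", B.ravel())
```

Because the two latent variables are orthogonal combinations of the descriptors, the fit does not depend on separating x1 from x2, which ordinary least squares can only do poorly when the two are nearly collinear. How many latent variables to keep is a modeling choice that would normally be settled by cross-validation.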

