The identification of complex interactions in epidemiology and toxicology: a simulation study of boosted regression trees.

Lampa E, Lind L, Lind PM, Bornefalk-Hermansson A - Environ Health (2014)

Bottom Line: There is a need to evaluate complex interaction effects on human health, such as those induced by mixtures of environmental contaminants. The simulated outcome contains one four-way interaction, one non-linear effect and one interaction between a continuous variable and a binary variable. Some spurious interactions were also found, however.


Affiliation: Department of Medical Sciences, Occupational and Environmental Medicine, Uppsala University, 75185 Uppsala, Sweden. erik.lampa@medsci.uu.se.

ABSTRACT

Background: There is a need to evaluate complex interaction effects on human health, such as those induced by mixtures of environmental contaminants. The usual approach is to formulate an additive statistical model and check for departures using product terms between the variables of interest. In this paper, we present an approach to search for interaction effects among several variables using boosted regression trees.

Methods: We simulate a continuous outcome from real data on 27 environmental contaminants, some of which are correlated, and test the method's ability to uncover the simulated interactions. The simulated outcome contains one four-way interaction, one non-linear effect and one interaction between a continuous variable and a binary variable. Four scenarios reflecting different strengths of association are simulated. We illustrate the method using real data.
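The simulation design described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it draws synthetic standard-normal predictors rather than the real contaminant data, the coefficients are all set to 1, and it assumes the signal-to-noise ratio is the variance of the noise-free signal divided by the variance of the noise. The function name `simulate_outcome` and its parameters are hypothetical.

```python
import random
import statistics


def simulate_outcome(n=2000, snr=0.5, seed=0):
    """Simulate a continuous outcome with a known interaction structure.

    Assumptions (not taken from the paper): predictors are independent
    standard-normal draws, all coefficients equal 1, and
    SNR = var(signal) / var(noise).
    """
    rng = random.Random(seed)
    rows, signal = [], []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(6)]  # six continuous predictors
        b = rng.randint(0, 1)                        # one binary predictor
        s = (
            x[0] * x[1] * x[2] * x[3]  # four-way interaction
            + x[4] ** 2                # non-linear effect
            + x[5] * b                 # continuous-by-binary interaction
        )
        rows.append(x + [b])
        signal.append(s)

    # Scale the noise so that var(signal) / var(noise) is approximately snr.
    noise_sd = (statistics.pvariance(signal) / snr) ** 0.5
    y = [s + rng.gauss(0.0, noise_sd) for s in signal]
    return rows, signal, y
```

Calling `simulate_outcome` with different `snr` values gives scenarios of varying association strength, in the spirit of the four scenarios mentioned above.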

Results: The method succeeded in identifying the true interactions in all scenarios except where the association was weakest. Some spurious interactions were also found, however. The method was also able to identify interactions in the real data set.

Conclusions: We conclude that boosted regression trees can be used to uncover complex interaction effects in epidemiological studies.


Figure 9: Visualization of the four-way interaction. The x- and y-axes of each panel represent p-p’-DDE and PCB 170 respectively. Levels of Cd increase with panels going left to right, and levels of MMP increase with panels going bottom to top. The plotted ranges are from the 10th to the 90th percentiles of each variable’s distribution to ease interpretation.

Mentions: The four-way interaction between p-p’-DDE, PCB 170, Cd and MMP for SNR = 0.5 is seen in Figure 9. The x- and y-axes of each panel represent p-p’-DDE and PCB 170 levels respectively. Cd and MMP are represented as shingles [38], which are overlapping intervals used to represent continuous variables in a high-dimensional setting. Panels going left to right represent increasing levels of Cd while panels going bottom to top represent increasing levels of MMP. The bar to the right of the figure provides the color codes for the predicted outcome.

The bottom left panel of Figure 9 shows the joint effect of p-p’-DDE and PCB 170 while Cd and MMP are both at low levels. The synergistic effect is hardly discernible. Following the panels right or up from the bottom left panel shows the joint effect when Cd or MMP increases. The synergistic effect becomes clearer, although it is still small. Following the diagonal from the bottom left panel shows the joint effect of p-p’-DDE and PCB 170 as Cd and MMP both increase, and the synergistic effect is obvious in the top right panel.
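To make the idea of shingles concrete, the sketch below cuts a continuous variable into overlapping intervals that each hold roughly the same number of observations. It is an illustration only and not the routine used to draw Figure 9: the function names, the default of three intervals, and the 50% overlap are assumptions chosen for the example.

```python
def shingle_intervals(values, number=3, overlap=0.5):
    """Return `number` overlapping (low, high) intervals over `values`.

    Each interval covers about the same count of sorted observations, and
    adjacent intervals share roughly the fraction `overlap` of their points.
    Hypothetical helper for illustration.
    """
    xs = sorted(values)
    n = len(xs)
    # Points per interval, chosen so the last interval ends at the last point.
    width = n / (number * (1 - overlap) + overlap)
    step = width * (1 - overlap)  # shift between consecutive interval starts
    intervals = []
    for i in range(number):
        lo = int(round(i * step))
        hi = min(n, int(round(i * step + width))) - 1
        intervals.append((xs[lo], xs[hi]))
    return intervals


def shingle_membership(value, intervals):
    """Indices of every interval containing `value` (possibly more than one)."""
    return [i for i, (lo, hi) in enumerate(intervals) if lo <= value <= hi]
```

Because the intervals overlap, a single observation can fall into two neighbouring panels, which is what distinguishes a shingle from an ordinary binned factor.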

