Range bagging: a new method for ecological niche modelling from presence-only data.

Drake JM - J R Soc Interface (2015)

Bottom Line: This paper extends the concept of environmental range to multiple dimensions and shows that range bagging is computationally feasible even when the number of environmental dimensions is large. The target of the range bagging base learner is an environmental tolerance of the species in a projection of its niche and is therefore an ecologically interpretable property of a species' biological requirements. The computational complexity of range bagging is linear in the number of examples, which compares favourably with the main alternative, Qhull.


Affiliation: Odum School of Ecology, University of Georgia, 140 E Green Street, Athens, GA 30602-2202, USA john@drakeresearchlab.com.

ABSTRACT
The ecological niche is the set of environments in which a population of a species can persist without introduction of individuals from other locations. A good mathematical or computational representation of the niche is a prerequisite to addressing many questions in ecology, biogeography, evolutionary biology and conservation. A particularly challenging question for ecological niche modelling is the problem of presence-only modelling. That is, can an ecological niche be identified from records drawn only from the set of niche environments without records from non-niche environments for comparison? Here, I introduce a new method for ecological niche modelling from presence-only data called range bagging. Range bagging draws on the concept of a species' environmental range, but was inspired by the empirical performance of ensemble learning algorithms in other areas of ecological research. This paper extends the concept of environmental range to multiple dimensions and shows that range bagging is computationally feasible even when the number of environmental dimensions is large. The target of the range bagging base learner is an environmental tolerance of the species in a projection of its niche and is therefore an ecologically interpretable property of a species' biological requirements. The computational complexity of range bagging is linear in the number of examples, which compares favourably with the main alternative, Qhull. In conclusion, range bagging appears to be a reasonable choice for niche modelling in applications in which a presence-only method is desired and may provide a solution to problems in other disciplines where one-class classification is required, such as outlier detection and concept learning.
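The ensemble idea described in the abstract can be illustrated with a minimal sketch, which is not the author's implementation. It assumes that each base learner is the axis-aligned range (minimum to maximum) of a bootstrap sample along a randomly chosen subset of variables, and that the ensemble score is the fraction of base learners whose range contains a query environment. The names `fit_range_bag` and `niche_score` and the parameters `n_votes`, `dims_per_vote` and `sample_frac` are hypothetical.

```python
import random


def fit_range_bag(points, n_votes=100, dims_per_vote=1, sample_frac=0.5, seed=0):
    """Fit an ensemble of axis-aligned ranges on random variable subsets.

    points: list of equal-length tuples of environmental covariates
            recorded at presence locations.
    Returns a list of (dims, lows, highs) base learners.
    Assumption: an axis-aligned interval stands in for the species'
    range in each low-dimensional projection.
    """
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    k = max(1, int(sample_frac * n))
    learners = []
    for _ in range(n_votes):
        # Choose which environmental variables this base learner sees.
        dims = rng.sample(range(d), dims_per_vote)
        # Bootstrap: draw k presence records with replacement.
        sample = [points[rng.randrange(n)] for _ in range(k)]
        # One pass over the sample per variable gives the observed range.
        lows = [min(p[j] for p in sample) for j in dims]
        highs = [max(p[j] for p in sample) for j in dims]
        learners.append((dims, lows, highs))
    return learners


def niche_score(learners, z):
    """Fraction of base learners whose range contains environment z."""
    hits = 0
    for dims, lows, highs in learners:
        if all(lo <= z[j] <= hi for j, lo, hi in zip(dims, lows, highs)):
            hits += 1
    return hits / len(learners)
```

Each base learner in this sketch needs only one pass over its sample to find the minima and maxima, which is consistent with the linear scaling in the number of examples stated in the abstract.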



Figure 2. If the density of environments p(z) is far from uniform, the distribution of occupied environments in nature, f(z), may bear little resemblance to the habitat selection function q(z). This plot shows the two-dimensional habitat selection function, q(z), and the joint density of occupied environments, f(z), ‘marginalized’ over variable z2 (a). Importantly, the maxima of these functions are displaced from each other by approximately half the habitable range. Nonetheless, hF(z), the boundary of the support of f(z), may be a very good approximation to hN(z), the zero net growth isocline (b). Note, particularly, that even though the maxima of p(z) and q(z) belong to different modes, the supports of these functions are nearly identical.

Mentions: I suggest that we think of niche identification as the estimation of hN(z). Obviously, the distribution of occupied environments in nature, f(z), depends on both the density of environments from which species can select and the habitat selection function (figure 1c). We designate the support of f(z) as the set F and denote its boundary by hF(z). The key insight is that if the set P is ‘large’ compared with N, then hN(z) ≈ hF(z) and a model of hF(z) may be substituted for hN(z) in practice. Figure 2 presents this idea graphically. What it means for P to be large is somewhat ambiguous. The intuition is that information is required mainly near the boundary of N, the zero net growth isocline, and is relatively unimportant elsewhere. Possibly, this criterion could be made more precise by stating additional conditions ensuring that the species had the opportunity to explore its environmental space, for instance, that every point on hN(z) has points of P within a local neighbourhood. Importantly, the approximation of hN(z) by hF(z) may be good even where f(z) and q(z) have very different shapes (figures 1 and 2). This is useful because one typically has data drawn from f(z) but not q(z). For this reason, we may wish to speak of ‘estimating the support of f’, by which we mean estimating the parameters of a model, or a trained algorithm. The estimation of hF may be construed as a classification problem, but does not have to be. Further, this picture makes no explicit assumptions concerning the prevalence of a species in nature (i.e. whether q(z) is large or small in places where it is positive).
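The claim that f(z) and q(z) can have very different shapes yet nearly the same support can be checked numerically. The following one-dimensional sketch uses made-up functions: a density of available environments `p` peaked near z = 0.2 but positive everywhere, and a habitat selection function `q` that is positive only on (0.1, 0.9) with its maximum at z = 0.5. It also assumes that f(z) is proportional to the product p(z)·q(z); the text says only that f depends on both, so the product form is an assumption of this example.

```python
import math


def p(z):
    """Hypothetical density of available environments, peaked near z = 0.2.

    The small constant keeps p positive everywhere, so every environment
    in the niche is available to some degree (P is 'large' relative to N).
    """
    return math.exp(-((z - 0.2) / 0.15) ** 2) + 0.02


def q(z):
    """Hypothetical habitat selection function, positive only on (0.1, 0.9)."""
    if 0.1 < z < 0.9:
        return 1.0 - ((z - 0.5) / 0.4) ** 2
    return 0.0


# Evaluate on a regular grid over the environmental axis.
grid = [i / 1000 for i in range(1001)]
q_vals = [q(z) for z in grid]
# Assumption: occupied environments follow f(z) proportional to p(z) * q(z).
f_vals = [p(z) * q(z) for z in grid]


def mode(values):
    """Grid location of the maximum of a function tabulated on `grid`."""
    return grid[max(range(len(grid)), key=values.__getitem__)]


def support(values):
    """Smallest and largest grid points at which the function is positive."""
    positive = [z for z, v in zip(grid, values) if v > 0]
    return positive[0], positive[-1]


mode_q, mode_f = mode(q_vals), mode(f_vals)
support_q, support_f = support(q_vals), support(f_vals)

print("mode of q:", mode_q, " mode of f:", mode_f)
print("support of q:", support_q, " support of f:", support_f)
```

In this toy setting the mode of f is pulled towards the peak of p and away from the mode of q, while the two supports coincide on the grid, because p is positive wherever q is. If p were zero over part of the niche, the support of f would shrink accordingly and hF(z) would no longer approximate hN(z) there.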

