Multi-step polynomial regression method to model and forecast malaria incidence.

Chatterjee C, Sarkar RR - PLoS ONE (2009)

Bottom Line: We selected variables by a simple correlation study, identified the initial relationship between variables through non-linear curve fitting, used multi-step methods to induct variables into the non-linear regression analysis, and applied Gauss-Markov models and ANOVA to test the predictions and their validity and to construct confidence intervals. The results demonstrate the applicability of our method to different types of data and the autoregressive nature of forecasting, and show high predictive power for both SPR and P. vivax deaths; the one-lag SPR values play an influential role and prove useful for better prediction. Different climatic factors are identified as playing a crucial role in shaping the disease curve.


Affiliation: Centre for Cellular and Molecular Biology, CSIR, Hyderabad, India.

ABSTRACT
Malaria remains one of the most severe problems faced by the world today. Understanding the causative factors (such as age, sex, social factors and environmental variability) as well as the underlying transmission dynamics of the disease is important for epidemiological research on malaria and its eradication. Developing a suitable modeling approach and methodology, based on the available data on disease incidence and other related factors, is therefore of utmost importance. In this study, we developed a simple non-linear regression methodology for modeling and forecasting malaria incidence in Chennai city, India, and predicted future disease incidence with a high confidence level. We considered three types of data to develop the regression methodology: a longer time series of Slide Positivity Rates (SPR) of malaria; a shorter, one-year time series of deaths due to Plasmodium vivax; and spatial data (the zonal distribution of P. vivax deaths) for the city, along with the climatic factors, population and previous incidence of the disease. We selected variables by a simple correlation study, identified the initial relationship between variables through non-linear curve fitting, used multi-step methods to induct variables into the non-linear regression analysis, and applied Gauss-Markov models and ANOVA to test the predictions and their validity and to construct confidence intervals. The results demonstrate the applicability of our method to different types of data and the autoregressive nature of forecasting, and show high predictive power for both SPR and P. vivax deaths; the one-lag SPR values play an influential role and prove useful for better prediction. Different climatic factors are identified as playing a crucial role in shaping the disease curve. Further, disease incidence at the zonal level and the effect of causative factors on different zonal clusters indicate the pattern of malaria prevalence in the city. The study also demonstrates that, with excellent climatic forecast models readily available, this method can predict disease incidence at long forecasting horizons with a high degree of efficiency, and that a useful early warning system based on this technique can be developed region-wise or nation-wise for disease prevention and control activities.


pone-0004726-g001: Flowchart for the multi-step regression model with step-wise induction of variables. Xi is the starting variable, which exhibits the highest coefficient of determination (R²) with the dependent variable and whose initial pairwise relation is an rth-order polynomial; Xj is the subsequent variable (with a pth-order relation), which has the second-highest R².

Mentions: Once the pairwise relationships between the selected variables and the dependent variable are identified, we refine the model according to a ‘step wise induction of functional forms and subsequent improvement of residual sum of squares’. The flowchart (Fig. 1) outlines the procedure, and Table S1 (in Supporting Information, Section E of Text S1) describes the process in detail for the study of SPR values. We start with the variable exhibiting the best pairwise relation with the dependent variable and induct its functional forms one at a time, beginning with the lowest-order form (in the case of a polynomial function) and moving up to higher orders, depending on the coefficient of determination of the model at each step. When all forms of the first variable have been inducted, we repeat the procedure with the functional forms of the second variable, again from lower to higher order, and continue with the other variables until all the selected variables are exhausted.
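
The step-wise induction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the fixed maximum polynomial order (`max_order`), the use of a full-order pairwise polynomial fit to rank variables, and the rule of keeping a term only when R² rises by more than `min_gain` are all assumptions made for illustration, since the excerpt only says that induction depends on the coefficient of determination at each step.

```python
import numpy as np


def fit_r2(columns, y):
    """Ordinary least squares of y on an intercept plus `columns`; returns R^2."""
    X = np.column_stack([np.ones_like(y)] + list(columns))
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residual_ss = np.sum((y - X @ beta) ** 2)
    total_ss = np.sum((y - y.mean()) ** 2)
    return 1.0 - residual_ss / total_ss


def stepwise_polynomial(predictors, y, max_order=3, min_gain=1e-3):
    """Induct polynomial terms variable by variable, lowest order first.

    predictors: dict mapping a variable name to a 1-D float array.
    Returns (terms, r2), where terms is a list of (name, power) pairs.
    max_order and min_gain are illustrative assumptions, not values
    taken from the paper.
    """
    # Rank variables by the R^2 of their pairwise polynomial fit with y.
    def pairwise_r2(name):
        x = predictors[name]
        return fit_r2([x ** k for k in range(1, max_order + 1)], y)

    ranked = sorted(predictors, key=pairwise_r2, reverse=True)

    terms, columns, best_r2 = [], [], 0.0
    for name in ranked:
        for power in range(1, max_order + 1):
            trial = columns + [predictors[name] ** power]
            r2 = fit_r2(trial, y)
            # Assumed acceptance rule: keep the term only if R^2 improves.
            if r2 - best_r2 > min_gain:
                terms.append((name, power))
                columns, best_r2 = trial, r2
    return terms, best_r2
```

Calling `stepwise_polynomial({"rainfall": rain, "lag1_spr": lag}, spr)` with hypothetical arrays `rain`, `lag` and `spr` would return the inducted `(variable, power)` terms in order, together with the final R². Because R² never decreases when a term is added to a nested least-squares model, the `min_gain` threshold is what keeps uninformative higher-order terms out.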

