Risk Prediction of One-Year Mortality in Patients with Cardiac Arrhythmias Using Random Survival Forest.

Miao F, Cai YP, Zhang YX, Li Y, Zhang YT - Comput Math Methods Med (2015)

Bottom Line: The simplified risk model also achieved a good accuracy of 0.799. Both results outperformed traditional CPH (which achieved a c-statistic of 0.733 for the comprehensive model and 0.718 for the simplified model). As a result, the RSF-based model, which took nonlinearity into account, significantly outperformed the traditional Cox proportional hazard model and has great potential to be a more effective approach for survival analysis.


Affiliation: Key Laboratory for Health Informatics of the Chinese Academy of Sciences (HICAS), Shenzhen Institutes of Advanced Technology, Shenzhen 518055, China ; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences, Beijing 100049, China.

ABSTRACT
Existing models for predicting mortality based on the traditional Cox proportional hazard (CPH) approach often have low prediction accuracy. This paper aims to develop a clinical risk model with good accuracy for predicting 1-year mortality in patients with cardiac arrhythmias using random survival forest (RSF), a robust approach for survival analysis. A total of 10,488 patients with cardiac arrhythmias available in the public MIMIC II clinical database were investigated, with 3,452 deaths occurring within 1-year follow-up. Forty risk factors, including demographics, clinical and laboratory information, and antiarrhythmic agents, were analyzed as potential predictors of all-cause mortality. RSF was adopted to build a comprehensive survival model and a simplified risk model composed of the 14 top risk factors. The comprehensive model achieved a prediction accuracy of 0.81, measured by c-statistic with 10-fold cross validation. The simplified risk model also achieved a good accuracy of 0.799. Both results outperformed traditional CPH (which achieved a c-statistic of 0.733 for the comprehensive model and 0.718 for the simplified model). Moreover, various factors were observed to have a nonlinear impact on cardiac arrhythmia prognosis. As a result, the RSF-based model, which took nonlinearity into account, significantly outperformed the traditional Cox proportional hazard model and has great potential to be a more effective approach for survival analysis.
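
The accuracy figures in the abstract are c-statistics, a rank-based measure of how often a model assigns the higher risk to the patient who dies first. The following is a minimal sketch of one common definition (Harrell's concordance index for right-censored data), written in plain Python. The function name `concordance_index`, the handling of ties, and the toy data are assumptions made for illustration; the abstract does not describe the authors' implementation.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell-style c-statistic for right-censored data (illustrative sketch).

    times       -- observed follow-up time for each subject
    events      -- 1 if death was observed, 0 if the subject was censored
    risk_scores -- model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that subject `a` has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b]:
            continue  # tied times are skipped in this simplified sketch
        if not events[a]:
            continue  # earlier subject was censored, so the pair is not comparable
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # the subject who died first had the higher risk
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four subjects, the second one censored.
toy_times = [5, 10, 15, 20]
toy_events = [1, 0, 1, 1]
toy_risk = [0.9, 0.4, 0.1, 0.6]
toy_c = concordance_index(toy_times, toy_events, toy_risk)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.81 (RSF) and 0.733 (CPH) should be read.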


Figure 1: Minimal depth from RSF analysis. The horizontal line is the threshold that separates the predictive variables, which lie below the line, from the remaining variables. The diameter of each circle is proportional to the forest-averaged number of maximal subtrees for that variable: 1: cardiac arrest, 2: log of BUN, 3: log of BMI, 4: log of AST, 5: log of age, 6: log of SCR, 7: log of BR, 8: log of K, 9: log of WBC, 10: log of ALT, 11: log of NA, 12: log of CKPK, 13: class II agents, 14: log of glucose, 15: log of INR, 16: CHF, 17: renal failure, 18: log of RBC, 19: log of PTT, 20: class V agents, 21: log of PT, 22: stroke, 23: sex, 24: AF, 25: class IV agents, 26: myocardial infarction, 27: hypertension, 28: uncomplicated diabetes, 29: valvular heart disease, 30: slow arrhythmias, 31: VT, 32: VF, 33: hypothyroidism, 34: complicated diabetes, 35: class III agents, 36: liver disease, 37: chronic pulmonary heart disease, 38: acute pulmonary heart disease, 39: class I agents, 40: bundle branch block.
© Copyright Policy - open-access

Mentions: From the comprehensive RSF analysis with all 40 variables, 14 variables were selected as predictive of 1-year mortality: cardiac arrest, BUN, BMI, AST, age, SCR, BR, K, WBC, ALT, NA, CKPK, class II agents, and glucose (the minimal depths of all variables are shown in Figure 1, in which the 14 predictive variables are separated from the remaining nonpredictive variables by the horizontal line). The six variables on the extreme left (cardiac arrest, BUN, BMI, AST, age, and SCR) are easily seen to be the most predictive.
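
The selection rule described above can be sketched as follows: a variable's minimal depth in one tree is the depth of its split closest to the root, this is averaged over the forest, and variables whose average falls below a threshold are kept. This is a toy illustration only. The nested-dict tree format, the function names, the convention of assigning a tree's full depth to variables it never splits on, and the threshold value are all assumptions of this sketch; the excerpt does not say how the authors' threshold was computed.

```python
def minimal_depths(tree, depth=0, found=None):
    """Return {variable: depth of its split closest to the root} for one tree.

    A tree is a nested dict {"var": name, "left": subtree, "right": subtree}
    and a leaf is None (hypothetical representation for this sketch).
    """
    if found is None:
        found = {}
    if tree is None:
        return found
    var = tree["var"]
    found[var] = min(found.get(var, depth), depth)
    minimal_depths(tree["left"], depth + 1, found)
    minimal_depths(tree["right"], depth + 1, found)
    return found


def tree_depth(tree):
    """Number of split levels in the tree (a leaf has depth 0)."""
    if tree is None:
        return 0
    return 1 + max(tree_depth(tree["left"]), tree_depth(tree["right"]))


def forest_minimal_depth(forest, variables):
    """Average each variable's minimal depth over all trees in the forest."""
    totals = {v: 0.0 for v in variables}
    for tree in forest:
        depths = minimal_depths(tree)
        fallback = tree_depth(tree)  # assumed value for variables the tree never uses
        for v in variables:
            totals[v] += depths.get(v, fallback)
    return {v: totals[v] / len(forest) for v in variables}


def select_predictive(avg_depth, threshold):
    """Keep variables whose averaged minimal depth is below the threshold."""
    ranked = sorted(avg_depth, key=avg_depth.get)
    return [v for v in ranked if avg_depth[v] < threshold]


# Hypothetical two-tree forest over four of the paper's variable names.
leaf = None
toy_forest = [
    {"var": "cardiac_arrest",
     "left": {"var": "BUN", "left": leaf, "right": leaf},
     "right": {"var": "age", "left": leaf, "right": leaf}},
    {"var": "BUN",
     "left": {"var": "cardiac_arrest", "left": leaf,
              "right": {"var": "sex", "left": leaf, "right": leaf}},
     "right": leaf},
]
toy_variables = ["cardiac_arrest", "BUN", "age", "sex"]
toy_avg = forest_minimal_depth(toy_forest, toy_variables)
toy_selected = select_predictive(toy_avg, threshold=1.0)  # threshold chosen arbitrarily
```

Variables that split near the root in most trees receive a small averaged depth, which is why the leftmost points in Figure 1 are read as the most predictive.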

