Limits...
From data to optimal decision making: a data-driven, probabilistic machine learning approach to decision support for patients with sepsis.

Tsoukalas A, Albertson T, Tagkopoulos I - JMIR Med Inform (2015)

Bottom Line: Furthermore, the percentage of transitions within a trajectory that lead to a better or better/same state are significantly higher by following the policy than for non-policy cases (605 vs 344 patients, P=8.6e-25).Mortality was predicted with an AUC of 0.7 and 0.82 accuracy in the general case and similar performance was obtained for the inference of the length-of-stay (AUC of 0.69 to 0.73 with accuracies from 0.69 to 0.82).A data-driven model was able to suggest favorable actions, predict mortality and length of stay with high accuracy.

View Article: PubMed Central - HTML - PubMed

Affiliation: Department of Computer Science and Genome Center, University of California, Davis, Davis, CA, United States.

ABSTRACT

Background: A tantalizing question in medical informatics is how to construct knowledge from heterogeneous datasets, and as an extension, inform clinical decisions. The emergence of large-scale data integration in electronic health records (EHR) presents tremendous opportunities. However, our ability to efficiently extract informed decision support is limited due to the complexity of the clinical states and decision process, missing data and lack of analytical tools to advice based on statistical relationships.

Objective: Development and assessment of a data-driven method that infers the probability distribution of the current state of patients with sepsis, likely trajectories, optimal actions related to antibiotic administration, prediction of mortality and length-of-stay.

Methods: We present a data-driven, probabilistic framework for clinical decision support in sepsis-related cases. We first define states, actions, observations and rewards based on clinical practice, expert knowledge and data representations in an EHR dataset of 1492 patients. We then use Partially Observable Markov Decision Process (POMDP) model to derive the optimal policy based on individual patient trajectories and we evaluate the performance of the model-derived policies in a separate test set. Policy decisions were focused on the type of antibiotic combinations to administer. Multi-class and discriminative classifiers were used to predict mortality and length of stay.

Results: Data-derived antibiotic administration policies led to a favorable patient outcome in 49% of the cases, versus 37% when the alternative policies were followed (P=1.3e-13). Sensitivity analysis on the model parameters and missing data argue for a highly robust decision support tool that withstands parameter variation and data uncertainty. When the optimal policy was followed, 387 patients (25.9%) have 90% of their transitions to better states and 503 patients (33.7%) patients had 90% of their transitions to worse states (P=4.0e-06), while in the non-policy cases, these numbers are 192 (12.9%) and 764 (51.2%) patients (P=4.6e-117), respectively. Furthermore, the percentage of transitions within a trajectory that lead to a better or better/same state are significantly higher by following the policy than for non-policy cases (605 vs 344 patients, P=8.6e-25). Mortality was predicted with an AUC of 0.7 and 0.82 accuracy in the general case and similar performance was obtained for the inference of the length-of-stay (AUC of 0.69 to 0.73 with accuracies from 0.69 to 0.82).

Conclusions: A data-driven model was able to suggest favorable actions, predict mortality and length of stay with high accuracy. This work provides a solid basis for a scalable probabilistic clinical decision support framework for sepsis treatment that can be expanded to other clinically relevant states and actions, as well as a data-driven model that can be adopted in other clinical areas with sufficient training data.

No MeSH data available.


Related in: MedlinePlus

Predicting a patient’s Length of Stay (LOS). Histogram of the LOS for the 745 patients in the database that have a complete record (median length of stay is 10.4 days). ROC curves for binary classifiers when 4, 8 or 12 days was used as the boundary between the two classes.
© Copyright Policy - open-access
Related In: Results  -  Collection

License 1 - License 2
getmorefigures.php?uid=PMC4376114&req=5

figure7: Predicting a patient’s Length of Stay (LOS). Histogram of the LOS for the 745 patients in the database that have a complete record (median length of stay is 10.4 days). ROC curves for binary classifiers when 4, 8 or 12 days was used as the boundary between the two classes.

Mentions: Next, we evaluated the robustness of the POMDP framework to decreasing sets of training data. To perform this analysis, we iteratively reduced the training set and we evaluated in the same testing data set (see Methods). Results demonstrate the method is robust to decreasing amount of data in the set, Figure 6 (c) as well as Figures 3 and 7, mainly due to the significant overlap of the various antibiotics in each combination proposed by the optimal policy. To gain more insight on how the policy changes by decreasing the training set we constructed a comprehensive map of the optimal policy for each state, Figure 6 (d). The resulting map provides the CDSS-derived drug combination that led to more favorable outcomes in each state, Table 3. It is important to note, that these policies correspond to a definite knowledge that the patient is that specific state (belief/probability of 1), which is almost never the case as his/her prior history (previous states, clinical information, etc) shapes the belief distribution across all states at any given time. Additionally, the depicted drug combinations are associated with a better outcome overall, and not that are necessarily the optimal combination under any condition when a patient is in that specific state, since potent drug combinations are used to more severe cases, which have a higher probability to transition to a worse state. The optimal decision for any state will ultimately be a function of all observations (vitals, blood results, etc). Their associations will depend on the structure of the data that were used for the CDSS training.


From data to optimal decision making: a data-driven, probabilistic machine learning approach to decision support for patients with sepsis.

Tsoukalas A, Albertson T, Tagkopoulos I - JMIR Med Inform (2015)

Predicting a patient’s Length of Stay (LOS). Histogram of the LOS for the 745 patients in the database that have a complete record (median length of stay is 10.4 days). ROC curves for binary classifiers when 4, 8 or 12 days was used as the boundary between the two classes.
© Copyright Policy - open-access
Related In: Results  -  Collection

License 1 - License 2
Show All Figures
getmorefigures.php?uid=PMC4376114&req=5

figure7: Predicting a patient’s Length of Stay (LOS). Histogram of the LOS for the 745 patients in the database that have a complete record (median length of stay is 10.4 days). ROC curves for binary classifiers when 4, 8 or 12 days was used as the boundary between the two classes.
Mentions: Next, we evaluated the robustness of the POMDP framework to decreasing sets of training data. To perform this analysis, we iteratively reduced the training set and we evaluated in the same testing data set (see Methods). Results demonstrate the method is robust to decreasing amount of data in the set, Figure 6 (c) as well as Figures 3 and 7, mainly due to the significant overlap of the various antibiotics in each combination proposed by the optimal policy. To gain more insight on how the policy changes by decreasing the training set we constructed a comprehensive map of the optimal policy for each state, Figure 6 (d). The resulting map provides the CDSS-derived drug combination that led to more favorable outcomes in each state, Table 3. It is important to note, that these policies correspond to a definite knowledge that the patient is that specific state (belief/probability of 1), which is almost never the case as his/her prior history (previous states, clinical information, etc) shapes the belief distribution across all states at any given time. Additionally, the depicted drug combinations are associated with a better outcome overall, and not that are necessarily the optimal combination under any condition when a patient is in that specific state, since potent drug combinations are used to more severe cases, which have a higher probability to transition to a worse state. The optimal decision for any state will ultimately be a function of all observations (vitals, blood results, etc). Their associations will depend on the structure of the data that were used for the CDSS training.

Bottom Line: Furthermore, the percentage of transitions within a trajectory that lead to a better or better/same state are significantly higher by following the policy than for non-policy cases (605 vs 344 patients, P=8.6e-25).Mortality was predicted with an AUC of 0.7 and 0.82 accuracy in the general case and similar performance was obtained for the inference of the length-of-stay (AUC of 0.69 to 0.73 with accuracies from 0.69 to 0.82).A data-driven model was able to suggest favorable actions, predict mortality and length of stay with high accuracy.

View Article: PubMed Central - HTML - PubMed

Affiliation: Department of Computer Science and Genome Center, University of California, Davis, Davis, CA, United States.

ABSTRACT

Background: A tantalizing question in medical informatics is how to construct knowledge from heterogeneous datasets, and as an extension, inform clinical decisions. The emergence of large-scale data integration in electronic health records (EHR) presents tremendous opportunities. However, our ability to efficiently extract informed decision support is limited due to the complexity of the clinical states and decision process, missing data and lack of analytical tools to advice based on statistical relationships.

Objective: Development and assessment of a data-driven method that infers the probability distribution of the current state of patients with sepsis, likely trajectories, optimal actions related to antibiotic administration, prediction of mortality and length-of-stay.

Methods: We present a data-driven, probabilistic framework for clinical decision support in sepsis-related cases. We first define states, actions, observations and rewards based on clinical practice, expert knowledge and data representations in an EHR dataset of 1492 patients. We then use Partially Observable Markov Decision Process (POMDP) model to derive the optimal policy based on individual patient trajectories and we evaluate the performance of the model-derived policies in a separate test set. Policy decisions were focused on the type of antibiotic combinations to administer. Multi-class and discriminative classifiers were used to predict mortality and length of stay.

Results: Data-derived antibiotic administration policies led to a favorable patient outcome in 49% of the cases, versus 37% when the alternative policies were followed (P=1.3e-13). Sensitivity analysis on the model parameters and missing data argue for a highly robust decision support tool that withstands parameter variation and data uncertainty. When the optimal policy was followed, 387 patients (25.9%) have 90% of their transitions to better states and 503 patients (33.7%) patients had 90% of their transitions to worse states (P=4.0e-06), while in the non-policy cases, these numbers are 192 (12.9%) and 764 (51.2%) patients (P=4.6e-117), respectively. Furthermore, the percentage of transitions within a trajectory that lead to a better or better/same state are significantly higher by following the policy than for non-policy cases (605 vs 344 patients, P=8.6e-25). Mortality was predicted with an AUC of 0.7 and 0.82 accuracy in the general case and similar performance was obtained for the inference of the length-of-stay (AUC of 0.69 to 0.73 with accuracies from 0.69 to 0.82).

Conclusions: A data-driven model was able to suggest favorable actions, predict mortality and length of stay with high accuracy. This work provides a solid basis for a scalable probabilistic clinical decision support framework for sepsis treatment that can be expanded to other clinically relevant states and actions, as well as a data-driven model that can be adopted in other clinical areas with sufficient training data.

No MeSH data available.


Related in: MedlinePlus