Pruning the Way to Reliable Policies: A Multi-Objective Deep Q-Learning Approach to Critical Care.

Abstract

Medical treatment decisions inherently involve a series of sequential choices, each informed by the outcomes of preceding decisions. This process closely aligns with the principles of reinforcement learning (RL), which also focuses on sequential decisions aimed at maximizing cumulative rewards. Consequently, RL holds significant promise for developing data-driven treatment plans. However, a major challenge in applying RL within medical contexts lies in the sparse nature of the rewards, which are primarily based on mortality outcomes. This sparsity can reduce the stability of offline estimates, posing a significant hurdle in fully utilizing RL for medical decision-making. In this work, we introduce a deep Q-learning approach able to obtain more reliable critical care policies. This method integrates relevant but noisy intermediate biomarker signals into the reward specification without compromising the optimization of the main outcome of interest (e.g., patient survival). We achieve this by first pruning the action space based on all available rewards, and then training a final model based on the (sparse) main reward, while only choosing actions available within the pruned action space. By disentangling sparse rewards and frequently measured reward proxies through action pruning, potential distortions of the main objective are minimized, all while enabling the extraction of valuable information from intermediate signals that can guide the learning process. We evaluate our method in both off-policy and offline settings using simulated environments and real health records of patients in intensive care units. Our empirical results indicate that our method outperforms common offline RL methods such as conservative Q-learning and batch-constrained deep Q-learning. Our work is a step towards developing reliable policies by effectively harnessing the wealth of available information in data-intensive critical care environments.
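
The two-stage idea described above (prune the action space using all available rewards, then learn on the sparse main reward while choosing only among the surviving actions) can be illustrated with a small sketch. This is not the authors' implementation: it swaps the deep Q-network for tabular Q-learning over a fixed batch, and the toy transitions, the equal-weight sum used to combine rewards, the top-k pruning rule, and the helper names `q_iteration` and `prune_actions` are all assumptions made for illustration.

```python
from collections import defaultdict

def q_iteration(transitions, n_actions, reward_fn, allowed=None,
                gamma=0.9, sweeps=200, lr=0.5):
    """Tabular Q-learning over a fixed batch (a stand-in for a deep Q-network).

    Each transition is (state, action, rewards, next_state, done), where
    rewards = (main_reward, proxy_reward). If `allowed` is given, the
    bootstrap maximum only considers actions that survived pruning.
    """
    q = defaultdict(lambda: [0.0] * n_actions)
    for _ in range(sweeps):
        for s, a, rewards, s_next, done in transitions:
            target = reward_fn(rewards)
            if not done:
                if allowed is None:
                    candidates = range(n_actions)
                else:
                    candidates = allowed.get(s_next, range(n_actions))
                target += gamma * max(q[s_next][b] for b in candidates)
            q[s][a] += lr * (target - q[s][a])
    return q

def prune_actions(q, states, n_actions, keep=2):
    """Hypothetical pruning rule: keep the `keep` highest-valued actions per state."""
    allowed = {}
    for s in sorted(states):
        ranked = sorted(range(n_actions), key=lambda a: q[s][a], reverse=True)
        allowed[s] = ranked[:keep]
    return allowed

# Toy batch (made up): rewards are (sparse main reward, noisy proxy reward).
N_ACTIONS = 3
transitions = [
    (0, 0, (0.0, 0.5), 1, False),
    (0, 1, (0.1, 0.2), 1, False),
    (0, 2, (0.0, -0.5), 2, False),
    (1, 0, (1.0, 0.0), None, True),
    (1, 1, (0.8, 0.5), None, True),   # proxy favours this action
    (1, 2, (-1.0, 0.0), None, True),
    (2, 0, (-1.0, 0.0), None, True),
    (2, 1, (-1.0, 0.0), None, True),
    (2, 2, (-1.0, 0.0), None, True),
]
states = {t[0] for t in transitions}

# Phase 1: learn from all rewards (assumed equal-weight sum), then prune.
q_all = q_iteration(transitions, N_ACTIONS, reward_fn=lambda r: r[0] + r[1])
allowed = prune_actions(q_all, states, N_ACTIONS, keep=2)

# Phase 2: learn from the main reward only, restricted to the pruned actions.
q_main = q_iteration(transitions, N_ACTIONS, reward_fn=lambda r: r[0],
                     allowed=allowed)
policy = {s: max(allowed[s], key=lambda a: q_main[s][a]) for s in allowed}
print(allowed)
print(policy)
```

In this toy batch the proxy reward makes action 1 look best in state 1 during the first phase, so it survives pruning, but the final policy still selects action 0 there because the second phase ranks the surviving actions by the main reward alone. The proxy signal only decides which actions are removed (here, the clearly harmful action 2); it never enters the final objective.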
