Offline Reinforcement Learning for Optimization of Therapy Towards a Clinical Endpoint

University essay from KTH/Medicinteknik och hälsosystem

Author: Simon Jenner; [2022]

Keywords: Offline; Reinforcement learning; Double Deep Q-Network; Cognitive behavior therapy; Digital therapeutics; Optimization; Förstärkningsinlärning; Dubbelt djupt Q-nätverk; Kognitiv beteendeterapi; Digital terapeutika; Optimering;

Abstract: The improvement of data acquisition and computer heavy methods in recentyears has paved the way for completely digital healthcare solutions. Digitaltherapeutics (DTx) are such solutions and are often provided as mobileapplications that must undergo clinical trials. A common method for suchapplications is to utilize cognitive behavioral-therapy (CBT), in order toprovide their patients with tools for self-improvement. The Swedish-basedcompany Alex Therapeutics is such a provider. They develop state-of-theartapplications that utilize CBT to help patients. Among their applications,they have one that aims to help users quit smoking. From this app, they havecollected user data with the goal of continuously improving their servicesthrough machine learning (ML). In their current application, they utilizemultiple ML methods to personalize the care, but have opened up possibilitiesfor the usage of reinforcement learning (RL). Often the wanted behavior isknown, such as to quitting smoking, but the optimal path, within the app, forhow to reach such a goal is not. By formalizing the problem as a Markovdecision process, where the transition probabilities have to be inferred fromuser data, such an optimal policy can be found. Standard methods of RL arereliant on direct access of an environment for sampling of data, whereas theuser data sampled from the application are to be treated as such. This thesisthus explores the possibilities of using RL on a static dataset in order to inferan optimal policy. A double deep Q-network (DDQN) was chosen as the reinforcement learningagent. The agent was trained on two different datasets and showed goodconvergence for both, using a custom metric for the task. Using SHAPvaluesthe strategy of the agent is visualized and discussed, together with themethodological challenges. Lastly, future work for the proposed methods arediscussed.

AT THIS PAGE YOU CAN DOWNLOAD THE WHOLE ESSAY. (follow the link to the next page)

Offline Reinforcement Learning for Optimization of Therapy Towards a Clinical Endpoint

Searchphrases right now

Popular searches

popular essays yesterday (2024-04-26)