Deep Reinforcement Learning for Controlled Piecewise Deterministic Markov Process in Cancer Treatment Follow-up
Install
Run the following lines to clone the Git repository and install the package:
git clone git@forgemia.inra.fr:orlane.le-quellennec/controlled_pdmp_po.git
cd controlled_pdmp_po
pip install -r requirements.txt
pip install -e .
Description
Environment folder
The env folder contains all the environments used in the paper.
The full_pdmp.py file implements a piecewise deterministic Markov process (PDMP) simulator.
It simulates patient trajectories, which are fully observable.
To create a PDMP trajectory instance:
import gymnasium
from gymnasium.envs.registration import register
# Import your environment
from env.full_pdmp import Patient
# Register your environment
register(
id="env/Patient",
entry_point="env.full_pdmp:Patient",
)
# Load an instance of Patient PDMP model
env = gymnasium.make('env/Patient', render_mode="human")
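For readers unfamiliar with PDMPs, the sketch below illustrates the general mechanism: the state follows a deterministic flow between random jump times. It is not the Patient model from full_pdmp.py. The two modes, the rates, and the function name simulate_toy_pdmp are all hypothetical and chosen only for illustration.

```python
import math
import random


def simulate_toy_pdmp(horizon, rng, x0=1.0, growth=0.1, decay=-0.3, jump_rate=0.2):
    """Simulate a toy two-mode PDMP up to time `horizon` (illustrative only).

    Between jumps the level evolves as x * exp(rate * dt), where the rate
    depends on the current mode. Waiting times between jumps are exponential,
    and each jump switches the mode.
    """
    rates = {0: growth, 1: decay}  # hypothetical flow rate for each mode
    t, mode, x = 0.0, 0, x0
    trajectory = [(t, mode, x)]
    while True:
        remaining = horizon - t
        wait = rng.expovariate(jump_rate)  # time until the next random jump
        if wait >= remaining:
            # No further jump before the horizon: flow until the end and stop.
            x *= math.exp(rates[mode] * remaining)
            trajectory.append((horizon, mode, x))
            break
        # Deterministic flow until the jump, then switch mode.
        x *= math.exp(rates[mode] * wait)
        t += wait
        mode = 1 - mode
        trajectory.append((t, mode, x))
    return trajectory


if __name__ == "__main__":
    for time, mode, level in simulate_toy_pdmp(10.0, random.Random(0)):
        print(f"t={time:6.3f}  mode={mode}  level={level:.4f}")
```

Each tuple in the returned list records the time, the mode after any jump at that time, and the level reached by the flow.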
The partially_observable.py wrapper transforms the PDMP into the Partially Observable Markov Decision Process (POMDP) model detailed in the paper.
To create a partially observable patient trajectory instance:
import gymnasium
from gymnasium.envs.registration import register
# Import your environment
from env.full_pdmp import Patient
from env.wrappers.partially_observable import POWrapper
# Register your environment
register(
id="env/Patient",
entry_point="env.full_pdmp:Patient",
)
# Load an instance of partially observable patient (POMDP model)
env = gymnasium.make('env/Patient', render_mode="human")
env_po = POWrapper(env)
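The general idea of wrapping a fully observable environment to make it partially observable can be sketched as follows. This is a generic pattern, not the POWrapper implementation: the ToyEnv class, the choice to hide the mode, and the Gaussian measurement noise are assumptions made for illustration. The sketch mimics the shape of the Gymnasium reset/step return values without depending on the library.

```python
import random


class ToyEnv:
    """Hypothetical fully observable environment whose state is (mode, level)."""

    def reset(self):
        self.state = (0, 1.0)
        return self.state, {}

    def step(self, action):
        mode, level = self.state
        # Action 1 halves the level; action 0 lets it grow by 10%.
        level = level * 0.5 if action == 1 else level * 1.1
        self.state = (mode, level)
        reward = -level
        return self.state, reward, False, False, {}


class NoisyPartialObservation:
    """Hide the mode and expose only a noisy measurement of the level."""

    def __init__(self, env, noise_std, rng):
        self.env = env
        self.noise_std = noise_std
        self.rng = rng

    def _observe(self, state):
        _, level = state  # the mode is dropped from the observation
        return level + self.rng.gauss(0.0, self.noise_std)

    def reset(self):
        state, info = self.env.reset()
        return self._observe(state), info

    def step(self, action):
        state, reward, terminated, truncated, info = self.env.step(action)
        return self._observe(state), reward, terminated, truncated, info


if __name__ == "__main__":
    env_po = NoisyPartialObservation(ToyEnv(), noise_std=0.05, rng=random.Random(0))
    print(env_po.reset())
    print(env_po.step(1))
```

The agent only sees the noisy level, so it has to infer the hidden mode from the history of observations.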
Simulations folder
This folder contains scripts that simulate trajectories according to a specific model and a chosen policy. All trajectory costs are stored in the data folder.
Example commands:
cd simulations
python generate_data.py --env pdmp --policy alea --num-samples 100000
python generate_data.py --env pomdp --policy dqn --num-samples 100000
To compare the costs of all policies, run the compare_cost.py script:
cd simulations
python compare_cost.py --logdir ./data/pdmp_alea.csv ./data/pomdp_thresh.csv ./data/pdmp_inactive.csv ./data/pomdp_dqn.csv
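As a rough illustration of what a cost comparison involves, the sketch below computes the mean cost and its standard error for each policy. It does not reproduce compare_cost.py. The CSV layout (a header row with a single column named cost) and the function names read_costs and summarize are assumptions, since the actual file format is not described here.

```python
import csv
import statistics


def read_costs(lines, column="cost"):
    """Parse costs from CSV lines (an open file or a list of strings).

    Assumes a header row containing `column`; this layout is hypothetical.
    """
    return [float(row[column]) for row in csv.DictReader(lines)]


def summarize(costs_by_policy):
    """Return {policy: (mean cost, standard error of the mean)}."""
    summary = {}
    for policy, costs in costs_by_policy.items():
        mean = statistics.fmean(costs)
        # The sample standard deviation needs at least two values.
        sem = statistics.stdev(costs) / len(costs) ** 0.5 if len(costs) > 1 else 0.0
        summary[policy] = (mean, sem)
    return summary


if __name__ == "__main__":
    # Inline data stands in for files such as ./data/pdmp_alea.csv.
    data = {
        "policy_a": read_costs(["cost", "10.0", "12.0", "11.0"]),
        "policy_b": read_costs(["cost", "7.5", "8.5"]),
    }
    for policy, (mean, sem) in summarize(data).items():
        print(f"{policy}: mean={mean:.2f} +/- {sem:.2f}")
```

To read a real file, pass an open file handle to read_costs, for example open(path, newline="").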
Tests folder
The tests folder contains test functions for each environment and wrapper.
Training
The training folder contains all the scripts needed to train, run, and exploit the neural networks.
This folder contains:
- [Hyperparameter tuning] This script performs hyperparameter tuning with the RLlib Tuner to find good hyperparameters for the algorithm. It outputs a YAML file listing the hyperparameter combinations tested and the best hyperparameters found.
python ./training/tune.py --config-file ./env/experiment/pomdp_v2_dqn.py --stop-timesteps 100000 --num-samples 1000 --stop-iters 1000 --output-file ./env/experiment/tuned_hyperparams_dqn_v2.yaml
- [Neural network creation] This script performs multiple training and evaluation cycles using the tuned hyperparameters.
python ./training/evaluate.py --config-file ./env/experiment/tuned_hyperparams_dqn_v2.yaml --stop-timesteps 100000 --evaluation-interval 5 --stop-iters 1000 --num-samples 3 --output-folder ./env/results/pomdp_xp2_DQN
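The tuning step above amounts to trying many hyperparameter combinations, recording each one, and keeping the best. The sketch below shows that loop with plain random search. It is a stand-in, not what tune.py does: the search algorithm used by the script is not stated here, and the objective, the search space, and the function name random_search are hypothetical.

```python
import random


def random_search(objective, search_space, num_samples, rng):
    """Sample configurations at random; return all trials and the best one.

    `search_space` maps each hyperparameter name to a list of candidate values.
    `objective` takes a configuration dict and returns a score to maximize.
    """
    trials = []
    for _ in range(num_samples):
        config = {name: rng.choice(values) for name, values in search_space.items()}
        trials.append((config, objective(config)))
    best_config, best_score = max(trials, key=lambda trial: trial[1])
    return trials, best_config, best_score


if __name__ == "__main__":
    # Toy objective standing in for the mean evaluation return of a trained agent.
    def toy_objective(config):
        return -abs(config["lr"] - 0.001) - abs(config["gamma"] - 0.99)

    space = {"lr": [0.01, 0.001, 0.0001], "gamma": [0.9, 0.95, 0.99]}
    trials, best_config, best_score = random_search(
        toy_objective, space, num_samples=20, rng=random.Random(0)
    )
    print(len(trials), best_config, best_score)
```

In a real run each objective call is a full training job, which is why the number of samples and the stopping criteria dominate the cost of tuning.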
Training with action masking
- [Neural network creation with action masking] This script performs multiple training and evaluation cycles using the tuned hyperparameters, with action masking enabled.
python ./training/evaluate.py --masking --config-file ./env/experiment/tuned_hyperparams_dqn_v3_with_action_mask.yaml --stop-timesteps 100000 --evaluation-interval 5 --stop-iters 1000 --num-samples 3 --output-folder ./env/results/pomdp_xp_DQN_with_action_masking
python ./training/evaluate.py --masking --config-file ./env/experiment/tuned_hyperparams_r2d2_v3_with_action_mask.yaml --stop-timesteps 100000 --evaluation-interval 5 --stop-iters 1000 --num-samples 3 --output-folder ./env/results/pomdp_xp_R2D2_with_action_masking
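Action masking usually means excluding forbidden actions before the agent picks one, for example by setting their Q-values to minus infinity. The sketch below shows that idea for greedy action selection. How the --masking option is implemented in this repository is not described here, so the function name masked_greedy_action and the 0/1 mask convention are assumptions.

```python
import math


def masked_greedy_action(q_values, action_mask):
    """Return the index of the highest-valued allowed action.

    `action_mask[i]` is 1 (or True) if action i is allowed, 0 (or False) otherwise.
    """
    if len(q_values) != len(action_mask):
        raise ValueError("q_values and action_mask must have the same length")
    if not any(action_mask):
        raise ValueError("at least one action must be allowed")
    # Forbidden actions get -inf so they can never win the argmax.
    masked = [q if allowed else -math.inf for q, allowed in zip(q_values, action_mask)]
    return max(range(len(masked)), key=masked.__getitem__)


if __name__ == "__main__":
    q = [1.0, 5.0, 3.0]
    print(masked_greedy_action(q, [1, 1, 1]))  # all actions allowed
    print(masked_greedy_action(q, [1, 0, 1]))  # action 1 is forbidden
```

Masking the values rather than penalizing forbidden actions through the reward guarantees that a constraint is never violated, even early in training.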
Roadmap
Still in progress on this repository:
- Action-masking to deal with constraints (?)
- R2D2 simulations
- Graph comparison / script
Authors and acknowledgment
Alice Cleynen, Benoite de Saporta, Orlane Rossini, Régis Sabbadin and Meritxell Vinyals