Callbacks
- class rl_zoo3.callbacks.ParallelTrainCallback(gradient_steps=100, verbose=0, sleep_time=0.0)[source]
Callback to explore (collect experience) and train (do gradient steps) at the same time using two separate threads. Normally used with off-policy algorithms and train_freq=(1, "episode").
TODO:
  - blocking mode: wait for the model to finish updating the policy before collecting new experience at the end of a rollout
  - force sync mode: stop training to update to the latest policy for collecting new experience
- Parameters:
gradient_steps (int) – Number of gradient steps to do before sending the new policy
verbose (int) – Verbosity level
sleep_time (float) – Limit the fps in the thread collecting experience.
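A minimal usage sketch, assuming SAC from Stable-Baselines3 and the Pendulum-v1 environment; both are picked only for illustration and are not prescribed by the callback. The callback arguments are the defaults from the signature above, and train_freq follows the recommendation in the description. The helper name train_in_parallel is hypothetical.

```python
# Defaults copied from the ParallelTrainCallback signature above.
CALLBACK_KWARGS = {"gradient_steps": 100, "verbose": 0, "sleep_time": 0.0}

# Train once per episode, as the class description recommends.
ALGO_KWARGS = {"train_freq": (1, "episode")}


def train_in_parallel(env_id="Pendulum-v1", total_timesteps=5_000):
    """Hypothetical helper: collect experience and train in two threads."""
    # Imported lazily so this snippet loads without the RL stack installed.
    from rl_zoo3.callbacks import ParallelTrainCallback
    from stable_baselines3 import SAC  # assumption: any supported off-policy algorithm

    model = SAC("MlpPolicy", env_id, **ALGO_KWARGS)
    callback = ParallelTrainCallback(**CALLBACK_KWARGS)
    model.learn(total_timesteps=total_timesteps, callback=callback)
    return model
```

Calling train_in_parallel() requires rl_zoo3, stable_baselines3 and a Gymnasium environment to be installed.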
- class rl_zoo3.callbacks.RawStatisticsCallback(verbose=0)[source]
Callback used for logging raw episode data (return and episode length).
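The raw episode data comes from the info dictionaries returned by the environment: the Stable-Baselines3 Monitor wrapper adds an "episode" entry with the return ("r") and length ("l") when an episode ends. The sketch below only illustrates that extraction step, assuming the callback reads the same entry; the function name extract_raw_episode_stats is hypothetical and this is not the library's own code.

```python
def extract_raw_episode_stats(infos):
    """Return (episode_return, episode_length) for each finished episode."""
    stats = []
    for info in infos:
        # Monitor only adds the "episode" key on the step that ends an episode.
        episode = info.get("episode")
        if episode is not None:
            stats.append((episode["r"], episode["l"]))
    return stats


# Two parallel environments: only the second one just finished an episode.
infos = [{}, {"episode": {"r": 12.5, "l": 200, "t": 3.1}}]
print(extract_raw_episode_stats(infos))  # [(12.5, 200)]
```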
- class rl_zoo3.callbacks.SaveVecNormalizeCallback(save_freq, save_path, name_prefix=None, verbose=0)[source]
Callback for saving a VecNormalize wrapper every save_freq steps.
- Parameters:
save_freq (int) – Number of steps between two saves
save_path (str) – Path to the folder where VecNormalize will be saved, as vecnormalize.pkl
name_prefix (str | None) – Common prefix to the saved VecNormalize; if None (default), only one file will be kept.
verbose (int)
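The effect of name_prefix on the saved file can be sketched as follows. The file name vecnormalize.pkl comes from the parameter description above; the `<prefix>_<timesteps>_steps.pkl` pattern is an assumption about the naming scheme, and the function vecnormalize_save_path is hypothetical.

```python
import os


def vecnormalize_save_path(save_path, num_timesteps, name_prefix=None):
    """Hypothetical helper: where the wrapper statistics would be written."""
    if name_prefix is None:
        # Same name on every save, so only one file is kept.
        return os.path.join(save_path, "vecnormalize.pkl")
    # Assumed pattern: one file per save, tagged with the timestep count.
    return os.path.join(save_path, f"{name_prefix}_{num_timesteps}_steps.pkl")


print(os.path.basename(vecnormalize_save_path("logs", 10_000)))
print(os.path.basename(vecnormalize_save_path("logs", 10_000, "vecnormalize")))
```

With no prefix, each save overwrites the previous file; with a prefix, earlier snapshots are preserved alongside the matching model checkpoints.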
- class rl_zoo3.callbacks.TrialEvalCallback(eval_env, trial, n_eval_episodes=5, eval_freq=10000, deterministic=True, verbose=0, best_model_save_path=None, log_path=None)[source]
Callback used for evaluating and reporting a trial.
- Parameters:
eval_env (VecEnv)
trial (Trial)
n_eval_episodes (int)
eval_freq (int)
deterministic (bool)
verbose (int)
best_model_save_path (str | None)
log_path (str | None)
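A sketch of how this callback might be used inside an Optuna objective, assuming SAC on Pendulum-v1 and a single tuned hyperparameter (all illustrative choices). It relies on the callback exposing is_pruned and on last_mean_reward inherited from the Stable-Baselines3 EvalCallback; treat both as assumptions to verify against the installed version. The names objective, EVAL_KWARGS and TOTAL_TIMESTEPS are hypothetical.

```python
# Defaults copied from the TrialEvalCallback signature above.
EVAL_KWARGS = {"n_eval_episodes": 5, "eval_freq": 10000, "deterministic": True}
TOTAL_TIMESTEPS = 20_000  # illustrative budget: two evaluations per trial


def objective(trial):
    """Hypothetical Optuna objective that reports evaluation results per trial."""
    # Imported lazily so this snippet loads without the RL stack installed.
    import optuna
    from rl_zoo3.callbacks import TrialEvalCallback
    from stable_baselines3 import SAC
    from stable_baselines3.common.env_util import make_vec_env

    learning_rate = trial.suggest_float("learning_rate", 1e-5, 1e-3, log=True)
    model = SAC("MlpPolicy", "Pendulum-v1", learning_rate=learning_rate)
    eval_env = make_vec_env("Pendulum-v1", n_envs=1)

    eval_callback = TrialEvalCallback(eval_env, trial, **EVAL_KWARGS)
    model.learn(total_timesteps=TOTAL_TIMESTEPS, callback=eval_callback)
    eval_env.close()

    # Assumption: the callback sets is_pruned when the pruner stops the trial.
    if eval_callback.is_pruned:
        raise optuna.exceptions.TrialPruned()
    return eval_callback.last_mean_reward
```

The objective would then be passed to a study created with optuna.create_study(direction="maximize") and run via study.optimize(objective, n_trials=...).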