Callbacks

class rl_zoo3.callbacks.ParallelTrainCallback(gradient_steps=100, verbose=0, sleep_time=0.0)

Callback to explore (collect experience) and train (do gradient steps) at the same time using two separate threads. Normally used with off-policy algorithms and train_freq=(1, "episode").

TODO:

- blocking mode: wait for the model to finish updating the policy before collecting new experience at the end of a rollout
- force sync mode: stop training to update to the latest policy for collecting new experience

Parameters:

gradient_steps (int) – Number of gradient steps to do before sending the new policy
verbose (int) – Verbosity level
sleep_time (float) – Limit the fps in the thread collecting experience.
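
The explore/train split described above can be pictured with a small standard-library sketch. It is not the rl_zoo3 implementation: run_parallel, the list standing in for a replay buffer, and the integer "policy version" are hypothetical stand-ins. They only show one thread doing gradient_steps updates and then publishing a new policy while the other thread keeps collecting. Joining the trainer at the end of each rollout is a simplification that keeps the toy deterministic.

```python
import threading
import time


def run_parallel(n_rollouts=3, steps_per_rollout=5, gradient_steps=100, sleep_time=0.0):
    """Toy illustration: collect experience while a second thread trains."""
    buffer = []  # stands in for a replay buffer (hypothetical)
    state = {"policy_version": 0, "updates": 0}
    lock = threading.Lock()

    def train():
        # Do `gradient_steps` updates, then publish the new policy once.
        for _ in range(gradient_steps):
            with lock:
                state["updates"] += 1
        with lock:
            state["policy_version"] += 1

    for _ in range(n_rollouts):
        trainer = threading.Thread(target=train)
        trainer.start()
        # Collect one rollout in the main thread while training runs.
        for step in range(steps_per_rollout):
            with lock:
                version = state["policy_version"]
            buffer.append((version, step))
            if sleep_time > 0:
                time.sleep(sleep_time)  # throttles the collection loop
        trainer.join()  # simplification: wait for the update before the next rollout
    return len(buffer), state["updates"], state["policy_version"]


transitions, updates, version = run_parallel()
```

With the defaults used here, three rollouts of five steps are collected while three rounds of 100 updates run, so the policy is published three times.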

class rl_zoo3.callbacks.RawStatisticsCallback(verbose=0)

Callback used for logging raw episode data (return and episode length).
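
As a rough illustration of what "raw episode data" could look like, the sketch below pulls the return and length of finished episodes out of a list of per-environment info dictionaries. The helper raw_episode_stats is hypothetical, and the "episode" entry with keys "r" and "l" is an assumed Monitor-style layout, not something this page specifies.

```python
def raw_episode_stats(infos):
    """Return (episode_return, episode_length) for every finished episode."""
    stats = []
    for info in infos:
        episode = info.get("episode")  # assumed Monitor-style key
        if episode is not None:
            stats.append((episode["r"], episode["l"]))
    return stats


# Two parallel environments; only the first one finished an episode this step.
infos = [{"episode": {"r": 12.5, "l": 200}}, {}]
finished = raw_episode_stats(infos)
```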

class rl_zoo3.callbacks.SaveVecNormalizeCallback(save_freq, save_path, name_prefix=None, verbose=0)

Callback for saving a VecNormalize wrapper every save_freq steps.

Parameters:

save_freq (int) – Number of steps between two saves
save_path (str) – Path to the folder where VecNormalize will be saved, as vecnormalize.pkl
name_prefix (str | None) – Common prefix for the saved VecNormalize files; if None (default), only one file is kept.
verbose (int) – Verbosity level
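
The interaction between save_freq and name_prefix can be sketched as follows. Both helpers are hypothetical, and the "<prefix>_<steps>_steps.pkl" naming used when a prefix is given is an assumption made for illustration. Only the default vecnormalize.pkl name comes from the parameter description above.

```python
import os


def vecnormalize_path(save_path, name_prefix=None, num_timesteps=0):
    """Hypothetical helper: choose the file a save would be written to."""
    if name_prefix is not None:
        # Assumed naming scheme: one file per save, tagged with the step count.
        filename = f"{name_prefix}_{num_timesteps}_steps.pkl"
    else:
        # Documented default: a single file, overwritten on each save.
        filename = "vecnormalize.pkl"
    return os.path.join(save_path, filename)


def should_save(n_calls, save_freq):
    """True on every `save_freq`-th call."""
    return n_calls % save_freq == 0


default_path = vecnormalize_path("logs")
prefixed_path = vecnormalize_path("logs", name_prefix="rl_model", num_timesteps=5000)
save_calls = [n for n in range(1, 31) if should_save(n, save_freq=10)]
```

Without a prefix every save targets the same path, which is why only one file is kept; with a prefix each save gets its own file.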

class rl_zoo3.callbacks.TrialEvalCallback(eval_env, trial, n_eval_episodes=5, eval_freq=10000, deterministic=True, verbose=0, best_model_save_path=None, log_path=None)

Callback used for evaluating and reporting a trial.

Parameters:

eval_env (VecEnv) –
trial (Trial) –
n_eval_episodes (int) –
eval_freq (int) –
deterministic (bool) –
verbose (int) –
best_model_save_path (str | None) –
log_path (str | None) –
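
A minimal sketch of "evaluating and reporting a trial" follows, under stated assumptions. FakeTrial is a hypothetical stand-in for the trial argument, assumed to expose Optuna-style report(value, step) and should_prune() methods. The evaluation itself is replaced by a precomputed list of mean rewards, and run_trial is not part of rl_zoo3.

```python
class FakeTrial:
    """Hypothetical stand-in for the `trial` argument; not a real tuning API."""

    def __init__(self, prune_below):
        self.prune_below = prune_below
        self.reports = []

    def report(self, value, step):
        self.reports.append((step, value))

    def should_prune(self):
        # Prune as soon as the latest reported value is too low.
        return self.reports[-1][1] < self.prune_below


def run_trial(trial, mean_rewards, eval_freq=10000):
    """Report one evaluation every `eval_freq` calls; stop early if pruned."""
    eval_idx = 0
    for n_calls in range(1, eval_freq * len(mean_rewards) + 1):
        if n_calls % eval_freq != 0:
            continue
        trial.report(mean_rewards[eval_idx], eval_idx + 1)
        eval_idx += 1
        if trial.should_prune():
            return True  # pruned: training would stop here
    return False


trial = FakeTrial(prune_below=0.0)
pruned = run_trial(trial, mean_rewards=[1.0, 0.5, -2.0, 3.0], eval_freq=100)
```

In this toy run the third evaluation falls below the threshold, so the trial is pruned before the fourth evaluation is ever reported.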