Experiment Manager

class rl_zoo3.exp_manager.ExperimentManager(args, algo, env_id, log_folder, tensorboard_log='', n_timesteps=0, eval_freq=10000, n_eval_episodes=5, save_freq=-1, hyperparams=None, env_kwargs=None, eval_env_kwargs=None, trained_agent='', optimize_hyperparameters=False, storage=None, study_name=None, n_trials=1, max_total_trials=None, n_jobs=1, sampler='tpe', pruner='median', optimization_log_path=None, n_startup_trials=0, n_evaluations=1, truncate_last_trajectory=False, uuid_str='', seed=0, log_interval=0, save_replay_buffer=False, verbose=1, vec_env_type='dummy', n_eval_envs=1, no_optim_plots=False, device='auto', config=None, show_progress=False, trial_id=None)[source]

Experiment manager: read the hyperparameters, preprocess them, create the environment and the RL model.

Please refer to train.py for the details of each argument. A minimal usage sketch is shown after the parameter list below.

Parameters:
  • args (Namespace)

  • algo (str)

  • env_id (str)

  • log_folder (str)

  • tensorboard_log (str)

  • n_timesteps (int)

  • eval_freq (int)

  • n_eval_episodes (int)

  • save_freq (int)

  • hyperparams (dict[str, Any] | None)

  • env_kwargs (dict[str, Any] | None)

  • eval_env_kwargs (dict[str, Any] | None)

  • trained_agent (str)

  • optimize_hyperparameters (bool)

  • storage (str | None)

  • study_name (str | None)

  • n_trials (int)

  • max_total_trials (int | None)

  • n_jobs (int)

  • sampler (str)

  • pruner (str)

  • optimization_log_path (str | None)

  • n_startup_trials (int)

  • n_evaluations (int)

  • truncate_last_trajectory (bool)

  • uuid_str (str)

  • seed (int)

  • log_interval (int)

  • save_replay_buffer (bool)

  • verbose (int)

  • vec_env_type (str)

  • n_eval_envs (int)

  • no_optim_plots (bool)

  • device (device | str)

  • config (str | None)

  • show_progress (bool)

  • trial_id (int | None)
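
The sketch below is a minimal, illustrative way to construct an ExperimentManager directly in Python instead of going through train.py. The argument values (algorithm, environment, folders, budget, seed) are assumptions chosen for the example, and the args Namespace is normally filled by the command-line parser in train.py.

    import argparse

    from rl_zoo3.exp_manager import ExperimentManager

    # train.py normally fills this Namespace from its command-line parser;
    # an empty one is used here purely for illustration.
    args = argparse.Namespace()

    exp_manager = ExperimentManager(
        args,
        algo="ppo",              # illustrative algorithm choice
        env_id="CartPole-v1",    # illustrative environment
        log_folder="logs",
        n_timesteps=100_000,
        eval_freq=10_000,
        n_eval_episodes=5,
        seed=42,
        verbose=1,
    )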

create_envs(n_envs, eval_env=False, no_log=False)[source]

Create the environment and wrap it if necessary.

Parameters:
  • n_envs (int)

  • eval_env (bool) – Whether it is an environment used for evaluation or not

  • no_log (bool) – Do not log training when doing hyperparameter optimization (avoids an issue with writing to the same file)

Returns:

the vectorized environment, with appropriate wrappers

Return type:

VecEnv
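
Although create_envs() is normally invoked for you inside setup_experiment(), the following sketch shows how the vectorized training and evaluation environments could be requested directly, reusing the hypothetical exp_manager instance from the sketch above (and assuming the hyperparameters have already been read):

    # Vectorized training environment (4 parallel envs is an arbitrary choice).
    env = exp_manager.create_envs(n_envs=4)

    # Separate evaluation environment, created without logging.
    eval_env = exp_manager.create_envs(n_envs=1, eval_env=True, no_log=True)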

learn(model)[source]

Parameters:

model (BaseAlgorithm) – an initialized RL model

Return type:

None

save_trained_model(model)[source]

Save the trained model, optionally together with its replay buffer and VecNormalize statistics.

Parameters:

model (BaseAlgorithm)

Return type:

None

setup_experiment()[source]

Read hyperparameters, pre-process them (create schedules, wrappers, callbacks, action noise objects), create the environment and possibly the model.

Returns:

the initialized RL model together with the saved hyperparameters, or None when only hyperparameter optimization is performed

Return type:

tuple[BaseAlgorithm, dict[str, Any]] | None
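
Putting the methods above together, the typical flow followed by train.py can be sketched as follows (continuing with the hypothetical exp_manager from the earlier sketches):

    results = exp_manager.setup_experiment()

    # setup_experiment() returns None when only hyperparameter optimization is
    # performed; otherwise it yields the model and the saved hyperparameters.
    if results is not None:
        model, saved_hyperparams = results
        exp_manager.learn(model)
        exp_manager.save_trained_model(model)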