Experiment Manager

class rl_zoo3.exp_manager.ExperimentManager(args, algo, env_id, log_folder, tensorboard_log='', n_timesteps=0, eval_freq=10000, n_eval_episodes=5, save_freq=-1, hyperparams=None, env_kwargs=None, eval_env_kwargs=None, trained_agent='', optimize_hyperparameters=False, storage=None, study_name=None, n_trials=1, max_total_trials=None, n_jobs=1, sampler='tpe', pruner='median', optimization_log_path=None, n_startup_trials=0, n_evaluations=1, truncate_last_trajectory=False, uuid_str='', seed=0, log_interval=0, save_replay_buffer=False, verbose=1, vec_env_type='dummy', n_eval_envs=1, no_optim_plots=False, device='auto', config=None, show_progress=False)[source]

Experiment manager: read the hyperparameters, preprocess them, create the environment and the RL model.

Please refer to train.py for the details of each argument; a usage sketch follows the parameter list below.

Parameters:
  • args (Namespace) –

  • algo (str) –

  • env_id (str) –

  • log_folder (str) –

  • tensorboard_log (str) –

  • n_timesteps (int) –

  • eval_freq (int) –

  • n_eval_episodes (int) –

  • save_freq (int) –

  • hyperparams (Dict[str, Any] | None) –

  • env_kwargs (Dict[str, Any] | None) –

  • eval_env_kwargs (Dict[str, Any] | None) –

  • trained_agent (str) –

  • optimize_hyperparameters (bool) –

  • storage (str | None) –

  • study_name (str | None) –

  • n_trials (int) –

  • max_total_trials (int | None) –

  • n_jobs (int) –

  • sampler (str) –

  • pruner (str) –

  • optimization_log_path (str | None) –

  • n_startup_trials (int) –

  • n_evaluations (int) –

  • truncate_last_trajectory (bool) –

  • uuid_str (str) –

  • seed (int) –

  • log_interval (int) –

  • save_replay_buffer (bool) –

  • verbose (int) –

  • vec_env_type (str) –

  • n_eval_envs (int) –

  • no_optim_plots (bool) –

  • device (device | str) –

  • config (str | None) –

  • show_progress (bool) –
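The constructor mirrors the command-line options of train.py. Below is a minimal sketch of creating a manager programmatically; passing an empty argparse.Namespace for args is an assumption made for this sketch (train.py normally supplies the parsed CLI arguments), and it assumes hyperparameters for the chosen algo/env pair exist in the bundled YAML files.

```python
import argparse

from rl_zoo3.exp_manager import ExperimentManager

# train.py normally passes the parsed command-line arguments here;
# using an empty Namespace is an assumption made for this sketch.
args = argparse.Namespace()

exp_manager = ExperimentManager(
    args,
    algo="ppo",            # algorithm key used to look up hyperparameters
    env_id="CartPole-v1",  # Gym/Gymnasium environment id
    log_folder="logs",     # where models, stats and configs are written
    n_timesteps=100_000,   # overrides the value from the hyperparameter file
    eval_freq=10_000,
    n_eval_episodes=5,
    seed=42,
)
```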

create_envs(n_envs, eval_env=False, no_log=False)[source]

Create the environment and wrap it if necessary.

Parameters:
  • n_envs (int) –

  • eval_env (bool) – Whether the environment is used for evaluation or not

  • no_log (bool) – Do not log training when doing hyperparameter optimization (to avoid writing to the same log file)

Returns:

the vectorized environment, with appropriate wrappers

Return type:

VecEnv
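A short sketch of requesting a separate evaluation environment. It assumes exp_manager was created as in the sketch above and that setup_experiment() has already been called, so the processed hyperparameters and wrappers are available on the manager.

```python
# Assumes `exp_manager` was created as in the sketch above and that
# setup_experiment() has already been called, so the processed
# hyperparameters and wrappers are available on the manager.
eval_env = exp_manager.create_envs(n_envs=1, eval_env=True)

obs = eval_env.reset()
print(type(eval_env).__name__, eval_env.num_envs)  # a VecEnv with 1 env
```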

learn(model)[source]

Parameters:

model (BaseAlgorithm) – an initialized RL model

Return type:

None

save_trained_model(model)[source]

Save the trained model, optionally together with its replay buffer and VecNormalize statistics.

Parameters:

model (BaseAlgorithm) –

Return type:

None

setup_experiment()[source]

Read hyperparameters, pre-process them (create schedules, wrappers, callbacks, action noise objects), then create the environment and possibly the model.

Returns:

the initialized RL model and the saved hyperparameters, or None if only hyperparameter optimization is performed

Return type:

Tuple[BaseAlgorithm, Dict[str, Any]] | None
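
Putting the pieces together, the typical flow used by train.py looks like the sketch below (same assumptions as above): setup_experiment() returns None when only hyperparameter optimization is requested, otherwise the model and the saved hyperparameters.

```python
# Assumes `exp_manager` was created as in the sketch above.
results = exp_manager.setup_experiment()

if results is not None:
    model, saved_hyperparams = results
    exp_manager.learn(model)               # train for the configured number of timesteps
    exp_manager.save_trained_model(model)  # optionally also saves replay buffer / VecNormalize stats
# When results is None (hyperparameter optimization only), train.py runs the
# optimization loop instead of training a single model.
```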