Commit 3cc13a9: more doc updates
TheEimer committed May 30, 2024
1 parent f580646 commit 3cc13a9
Showing 9 changed files with 65 additions and 7 deletions.
4 changes: 3 additions & 1 deletion docs/advanced_usage/algorithm_states.rst
@@ -1,2 +1,4 @@
Using the ARLBench States
==========================

In addition to providing different objectives, ARLBench also provides insights into the target algorithms' internal states.
4 changes: 3 additions & 1 deletion docs/advanced_usage/autorl_paradigms.rst
@@ -1,2 +1,4 @@
ARLBench and Different AutoRL Paradigms
=======================================

TODO: relationship to other AutoRL paradigms
4 changes: 3 additions & 1 deletion docs/advanced_usage/dynamic_configuration.rst
@@ -1,2 +1,4 @@
Dynamic Configuration in ARLBench
==================================

TODO: how to use dynamic configuration
12 changes: 11 additions & 1 deletion docs/basic_usage/env_subsets.rst
@@ -1,2 +1,12 @@
The ARLBench Subsets
====================

We analyzed the hyperparameter landscapes of PPO, DQN and SAC on 20 environments to select subsets that allow for efficient benchmarking of AutoRL algorithms. These are the resulting subsets:

.. image:: ../images/subsets.png
   :width: 800
   :alt: The ARLBench environment subsets

We strongly recommend you focus your benchmarking on these exact environments to ensure you cover the total landscape of RL behaviors well.
The data generated for selecting these environments is available on `HuggingFace <https://huggingface.co/datasets/autorl-org/arlbench>`_ for you to use in your experiments.
For more information on how the subset selection was done, please refer to our paper.
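
If you'd like to work with this data directly, here is a minimal sketch for downloading it, assuming you have the ``huggingface_hub`` CLI installed (the target directory name is an arbitrary example):

.. code-block:: bash

    # Fetch the landscape data used for subset selection
    huggingface-cli download autorl-org/arlbench --repo-type dataset --local-dir arlbench_data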
1 change: 1 addition & 0 deletions docs/basic_usage/index.rst
@@ -9,6 +9,7 @@ Benchmarking AutoRL Methods
seeding



ARLBench provides a basis for benchmarking different AutoRL methods. This section of the documentation focuses on the prominent aspect of black-box hyperparameter optimization, since it is the simplest use case of ARLBench.
We discuss the structure of ARLBench, the currently supported objectives, the environment subsets and search spaces we provide, and the seeding of the experiments in their own subpages.
The most important question, however, is how to actually use ARLBench in your experiments. This is the workflow we propose:
15 changes: 14 additions & 1 deletion docs/basic_usage/objectives.rst
@@ -1,2 +1,15 @@
Objectives in ARLBench
======================

ARLBench lets you configure the objectives you'd like to use for your AutoRL methods.
These are selected as a list of keywords in the configuration of the AutoRL Environment, e.g. like this:

.. code-block:: bash

    python arlbench.py autorl.objectives=["reward_mean"]

The following objectives are available at the moment:
- reward_mean: the mean evaluation reward across a number of evaluation episodes
- reward_std: the standard deviation of the evaluation rewards across a number of evaluation episodes
- runtime: the runtime of the training process
- emissions: the CO2 emissions of the training process
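
Multiple objectives can be listed at once. As a sketch, assuming the same override syntax as above, the following would track both the mean evaluation reward and the runtime:

.. code-block:: bash

    # Optimize for mean evaluation reward and runtime simultaneously
    python arlbench.py autorl.objectives=["reward_mean","runtime"]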
28 changes: 27 additions & 1 deletion docs/basic_usage/options.rst
@@ -1,2 +1,28 @@
ARLBench Options
================

A given training run in ARLBench can be configured on two levels: the lower level is the configuration set via the AutoRL tool being benchmarked, while the upper level defines the setting in which we test the AutoRL tool.
The high-level configuration takes place via the 'autorl' keys in the configuration file. These are the available options:

- **seed**: The seed for the random number generator
- **env_framework**: Environment framework to use. Currently supported: gymnax, envpool, brax, xland
- **env_name**: The name of the environment to use
- **env_kwargs**: Additional keyword arguments for the environment
- **eval_env_kwargs**: Additional keyword arguments for the evaluation environment
- **n_envs**: Number of environments to use in parallel
- **algorithm**: The algorithm to use. Currently supported: dqn, ppo, sac
- **cnn_policy**: Whether to use a CNN policy
- **deterministic_eval**: Whether to use deterministic evaluation. This disables exploration behaviors in evaluation.
- **nas_config**: Configuration for the network architecture
- **checkpoint**: A list of elements the checkpoint should contain
- **checkpoint_name**: The name of the checkpoint
- **checkpoint_dir**: The directory to save the checkpoint in
- **objectives**: The objectives to optimize for. Currently supported: reward_mean, reward_std, runtime, emissions
- **optimize_objectives**: Whether to maximize or minimize the objectives
- **state_features**: The features of the RL algorithm's state to return
- **n_steps**: The number of steps in the configuration schedule. Using 1 will result in a static configuration
- **n_total_timesteps**: The total number of timesteps to train in each schedule interval
- **n_eval_steps**: The number of steps to evaluate the agent for
- **n_eval_episodes**: The number of episodes to evaluate the agent for

The low-level configuration options can be found in the 'hp_config' key set, which contains the configurable hyperparameters and architecture of each algorithm. Please refer to the search space overview for more information.
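
As an illustration, several of these options can be combined on the command line. This is a sketch assuming the same Hydra-style override syntax as in the objectives example; all values are arbitrary:

.. code-block:: bash

    # Run DQN on a gymnax environment with 16 parallel environments and a fixed seed
    python arlbench.py autorl.algorithm=dqn autorl.env_framework=gymnax \
        autorl.env_name=CartPole-v1 autorl.n_envs=16 autorl.seed=42 \
        autorl.objectives=["reward_mean"]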
4 changes: 3 additions & 1 deletion docs/basic_usage/seeding.rst
@@ -1,2 +1,4 @@
Considerations for Seeding
============================

Seeding is important both on the RL algorithm level and on the AutoRL level.
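
In practice, this means running each configuration across several seeds. A sketch, assuming the override syntax shown in the other sections:

.. code-block:: bash

    # Repeat the same run across multiple seeds to account for variance in RL training
    for seed in 0 1 2 3 4; do
        python arlbench.py autorl.seed=$seed
    done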
Binary file added docs/images/subsets.png
