[docs] Callbacks edits (#49599)
I mistakenly approved #49513
before making copy edits. This PR is a mulligan.

## Checks

- [x] I've signed off every commit (by using the -s flag, i.e., `git
commit -s`) in this PR.
- [x] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for
https://docs.ray.io/en/master/.
- [ ] I've added any new APIs to the API Reference. For example, if I
added a method in Tune, I've added it in `doc/source/tune/api/` under the
corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a
few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
   - [ ] Unit tests
   - [ ] Release tests
   - [ ] This PR is not tested :(

---------

Signed-off-by: angelinalg <[email protected]>
Signed-off-by: Sven Mika <[email protected]>
Signed-off-by: angelinalg <[email protected]>
Co-authored-by: Sven Mika <[email protected]>
angelinalg and sven1977 authored Jan 9, 2025
1 parent 2ac04cc commit a3b1265
Showing 7 changed files with 182 additions and 193 deletions.
2 changes: 1 addition & 1 deletion .vale/styles/config/vocabularies/RLlib/accept.txt
@@ -27,7 +27,7 @@ pragmas?
(RL|rl)lib
RNN
RLModules?
[Rr]ollout
(SAC|sac)
SGD
[Ss]ubcomponents?
26 changes: 12 additions & 14 deletions doc/source/rllib/package_ref/callback.rst
@@ -1,20 +1,18 @@
.. include:: /_includes/rllib/we_are_hiring.rst

.. include:: /_includes/rllib/new_api_stack.rst

.. _rllib-callback-reference-docs:

Callback APIs
=============

Callback APIs enable you to inject code into an experiment, an Algorithm,
and the subcomponents of an Algorithm.

You can either subclass :py:class:`~ray.rllib.callbacks.callbacks.RLlibCallback` and implement
one or more of its methods, like :py:meth:`~ray.rllib.callbacks.callbacks.RLlibCallback.on_algorithm_init`,
or pass respective arguments to the :py:meth:`~ray.rllib.algorithms.algorithm_config.AlgorithmConfig.callbacks`
method of an Algorithm's config, like
``config.callbacks(on_algorithm_init=lambda algorithm, **kw: print('algo initialized!'))``.

.. tab-set::

@@ -60,15 +58,15 @@ method of your algorithm's config, for example
config.validate()


See :ref:`Callbacks <rllib-callback-docs>` for more details on how to write and configure callbacks.


Methods to implement for custom behavior
----------------------------------------

.. note::

    RLlib only invokes callbacks in :py:class:`~ray.rllib.algorithms.algorithm.Algorithm`
and :py:class:`~ray.rllib.env.env_runner.EnvRunner` actors.
The Ray team is considering expanding callbacks onto :py:class:`~ray.rllib.core.learner.learner.Learner`
actors and possibly :py:class:`~ray.rllib.core.rl_module.rl_module.RLModule` instances as well.
@@ -91,7 +89,7 @@ RLlibCallback
Callbacks invoked in Algorithm
------------------------------

The main Algorithm process always executes the following callback methods:

.. autosummary::
:nosignatures:
@@ -111,7 +109,7 @@ The following callback methods are always executed on the main Algorithm process
Callbacks invoked in EnvRunner
------------------------------

The EnvRunner actors always execute the following callback methods:

.. autosummary::
:nosignatures:
209 changes: 104 additions & 105 deletions doc/source/rllib/rllib-advanced-api.rst

Large diffs are not rendered by default.

2 changes: 1 addition & 1 deletion doc/source/rllib/rllib-algorithms.rst
@@ -7,7 +7,7 @@
Algorithms
==========

The following table is an overview of all available algorithms in RLlib. Note that all algorithms support
multi-GPU training on a single (GPU) node in `Ray (open-source) <https://docs.ray.io/en/latest/index.html>`__ (|multi_gpu|)
as well as multi-GPU training on multi-node (GPU) clusters when using the `Anyscale platform <https://www.anyscale.com/platform>`__
(|multi_node_multi_gpu|).
106 changes: 49 additions & 57 deletions doc/source/rllib/rllib-callback.rst
@@ -1,19 +1,15 @@
.. include:: /_includes/rllib/we_are_hiring.rst

.. include:: /_includes/rllib/new_api_stack.rst

.. _rllib-callback-docs:

Callbacks
=========

Callbacks are the most straightforward way to inject code into experiments. You can define the code to execute at certain events and pass it to your
:py:class:`~ray.rllib.algorithms.algorithm_config.AlgorithmConfig`.

The following is an example of defining a simple lambda that prints out an episode's return after
the episode terminates:

.. testcode::

@@ -37,30 +33,30 @@ the episode has terminated:
ppo.stop()


Callback lambdas versus stateful RLlibCallback
----------------------------------------------

There are two ways to define custom code for various callback events to execute.

Callback lambdas
~~~~~~~~~~~~~~~~
If the injected code is rather simple and doesn't need to store temporary information
for reuse in succeeding event calls, you can use a lambda
and pass it to the :py:meth:`~ray.rllib.algorithms.algorithm_config.AlgorithmConfig.callbacks`
method as previously shown.

See :ref:`Callback events <rllib-callback-event-overview>` for a complete list.
The names of the events always match the argument names for the
:py:meth:`~ray.rllib.algorithms.algorithm_config.AlgorithmConfig.callbacks` method.
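To make the name-matching concrete, the following is a small, self-contained sketch of how event-named keyword arguments can map to stored callables. This is a toy stand-in for illustration only, not RLlib's actual implementation; the class ``ToyCallbackConfig`` is made up for this example.

```python
# Toy sketch of event-named callback arguments; not RLlib's real code.
class ToyCallbackConfig:
    """Stores one callable per event name, keyed by keyword-argument name."""

    def __init__(self):
        self._handlers = {}

    def callbacks(self, **event_handlers):
        # Each keyword argument name plays the role of an event name,
        # for example `on_algorithm_init` or `on_episode_end`.
        self._handlers.update(event_handlers)
        return self

    def fire(self, event_name, **kwargs):
        handler = self._handlers.get(event_name)
        return handler(**kwargs) if handler else None


config = ToyCallbackConfig().callbacks(
    on_episode_end=lambda episode_return, **kw: f"return={episode_return}"
)
print(config.fire("on_episode_end", episode_return=42.0))  # return=42.0
```

The point of the sketch is only that the keyword-argument name and the event name are the same string, which is why the supported event names always match the argument names of the real ``callbacks()`` method.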


Stateful RLlibCallback
~~~~~~~~~~~~~~~~~~~~~~
If the injected code is stateful and temporarily stores results for reuse
in succeeding calls triggered by the same or a different event, you
need to subclass the :py:class:`~ray.rllib.callbacks.callbacks.RLlibCallback` API and then implement
one or more methods, for example :py:meth:`~ray.rllib.callbacks.callbacks.RLlibCallback.on_algorithm_init`:

The following is the same example that prints out a terminated episode's return, but uses
a subclass of :py:class:`~ray.rllib.callbacks.callbacks.RLlibCallback`.

.. testcode::
@@ -94,24 +90,11 @@ a subclass of :py:class:`~ray.rllib.callbacks.callbacks.RLlibCallback`.

.. _rllib-callback-event-overview:

Callback events
---------------

During a training iteration, the Algorithm normally walks through the following event tree,
which gives a high-level overview of all supported events in RLlib's callbacks system:

.. code-block:: text
@@ -136,10 +119,21 @@ Here is a high-level overview of all supported events in RLlib's callbacks system
`on_episode_end` - After an episode object has terminated (or truncated).
`on_sample_end` - At the end of the ``EnvRunner.sample()`` call.
Note that some of the events in the tree happen simultaneously, on different processes through Ray actors.
For example, an EnvRunner actor may trigger its ``on_episode_start`` event while at the same time another
EnvRunner actor may trigger its ``on_sample_end`` event and the main Algorithm process triggers
``on_train_result``.

.. note::

RLlib only invokes callbacks in :py:class:`~ray.rllib.algorithms.algorithm.Algorithm`
and :py:class:`~ray.rllib.env.env_runner.EnvRunner` actors.
The Ray team is considering expanding callbacks onto :py:class:`~ray.rllib.core.learner.learner.Learner`
actors and possibly :py:class:`~ray.rllib.core.rl_module.rl_module.RLModule` instances as well.

.. currentmodule:: ray.rllib.callbacks.callbacks

.. dropdown:: Algorithm-bound methods of ``RLlibCallback``

.. autosummary::

@@ -149,7 +143,7 @@ Here is a high-level overview of all supported events in RLlib's callbacks system
RLlibCallback.on_env_runners_recreated
RLlibCallback.on_checkpoint_loaded

.. dropdown:: EnvRunner-bound methods of ``RLlibCallback``

.. autosummary::

@@ -169,9 +163,9 @@ You can define more than one :py:class:`~ray.rllib.callbacks.callbacks.RLlibCall
You can also send lists of callables, instead of a single callable, to the different
arguments of that method.

For example, you may have already written a subclass of :py:class:`~ray.rllib.callbacks.callbacks.RLlibCallback`
and want to reuse it in different experiments. Because one of your experiments
requires some debug callback code, you want to inject it only temporarily for a couple of runs.
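Chaining can be pictured with plain Python. The following is an illustrative sketch only, not RLlib internals, and the handler names (``reusable_handler``, ``debug_handler``) are made up for the example: for a given event, each configured callable is invoked in order.

```python
# Illustrative sketch: firing a chained list of callables for one event.
calls = []

def reusable_handler(**kwargs):
    # Stands in for the callback logic you reuse across experiments.
    calls.append("reusable")

def debug_handler(**kwargs):
    # Stands in for a temporary debug hook added for a couple of runs only.
    calls.append("debug")

def fire_event(handlers, **kwargs):
    # Invoke every handler registered for this event, in the given order.
    for handler in handlers:
        handler(**kwargs)

fire_event([reusable_handler, debug_handler])
print(calls)  # ['reusable', 'debug']
```

Dropping the temporary hook is then just a matter of removing it from the list, which mirrors why passing lists of callables is convenient for short-lived debug code.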

Resolution order of chained callbacks
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -180,16 +174,16 @@ RLlib resolves all available callback methods and callables for a given event
as follows:

Subclasses of :py:class:`~ray.rllib.callbacks.callbacks.RLlibCallback` take precedence
over individual or lists of callables that you provide through the various arguments of
the :py:meth:`~ray.rllib.algorithms.algorithm_config.AlgorithmConfig.callbacks` method.

For example, assume the callback event is ``on_train_result``, which fires at the end of
a training iteration and inside the algorithm's process:

- RLlib loops through the list of all given :py:class:`~ray.rllib.callbacks.callbacks.RLlibCallback`
  subclasses and calls their ``on_train_result`` method, keeping the exact order the user
  provided in the list.
- RLlib then loops through the list of all defined ``on_train_result`` callables. You configured these
by calling the :py:meth:`~ray.rllib.algorithms.algorithm_config.AlgorithmConfig.callbacks` method
and defining the ``on_train_result`` argument in this call.
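The two-phase resolution order above can be sketched in a few lines of plain Python. This is a toy illustration under the assumptions just described, not RLlib source code; the class and function names are invented for the example.

```python
# Toy sketch of the resolution order: subclass instances first, then callables.
class MyCallbackA:
    def on_train_result(self, log):
        log.append("A")

class MyCallbackB:
    def on_train_result(self, log):
        log.append("B")

def resolve_on_train_result(callback_objects, event_callables, log):
    # 1) All RLlibCallback-style subclass instances, in user-provided order.
    for cb in callback_objects:
        cb.on_train_result(log)
    # 2) Then the plain callables configured for the `on_train_result` event.
    for fn in event_callables:
        fn(log)

order = []
resolve_on_train_result(
    [MyCallbackA(), MyCallbackB()],
    [lambda log: log.append("lambda")],
    order,
)
print(order)  # ['A', 'B', 'lambda']
```

The sketch makes the precedence concrete: subclass methods always run before any callables configured through keyword arguments, and each group preserves its own list order.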

@@ -224,7 +218,7 @@ a training iteration and inside the algorithm's process.
Examples
--------

The following are two examples showing you how to set up custom callbacks on the
:ref:`Algorithm <rllib-key-concepts-algorithms>` process as well as on the
:ref:`EnvRunner <rllib-key-concepts-env-runners>` processes.

@@ -236,7 +230,7 @@ The following example demonstrates how to implement a simple custom function wri
contents to disk from time to time.

You normally don't want to write the contents of buffers along with your
:ref:`Algorithm checkpoints <rllib-checkpoints-docs>`, so writing less often, in a more
controlled fashion through a custom callback could be a good compromise.

.. testcode::
@@ -269,41 +263,39 @@ controlled fashion through a custom callback could be a good compromise.
)
dqn = config.build()

# Train n times. Expect RLlib to write buffer every ith iteration.
for _ in range(4):
print(dqn.train())

.. tip::

    See :ref:`Callbacks invoked in Algorithm <rllib-callback-reference-algorithm-bound>` for the exact call signatures of all available callbacks and the argument types that they expect.

Example 2: ``on_episode_step`` and ``on_episode_end``
-----------------------------------------------------

The following example demonstrates how to implement a custom :py:class:`~ray.rllib.callbacks.callbacks.RLlibCallback` class
computing the average "first-joint angle" of the
`Acrobot-v1 RL environment <https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/envs/classic_control/acrobot.py>`__:

.. figure:: images/acrobot-v1.png
:width: 150
:align: left

**The Acrobot-v1 environment**: The env code describes the angle you are about to
compute and log through your custom callback as:

.. code-block:: text

    `theta1` is the angle of the first joint, where an angle of 0.0 indicates that the first
    link is pointing directly downwards.
This example utilizes RLlib's :py:class:`~ray.rllib.utils.metrics.metrics_logger.MetricsLogger`
API to log the custom computations of the injected code to the Algorithm's main results system.
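As a rough sketch of the idea behind windowed metric logging, consider the following toy stand-in. The class ``ToyMetricsLogger`` and its methods are invented for illustration and differ from the real MetricsLogger API; the sketch only shows why logging a value under a key with a window yields a smoothed mean like ``theta1_mean``.

```python
from collections import deque

# Toy windowed-mean logger; a stand-in for the idea, not the MetricsLogger API.
class ToyMetricsLogger:
    def __init__(self):
        self._buffers = {}

    def log_value(self, key, value, window=100):
        # Keep only the most recent `window` values per key.
        buf = self._buffers.setdefault(key, deque(maxlen=window))
        buf.append(value)

    def peek(self, key):
        # Report the mean over the values currently in the window.
        buf = self._buffers[key]
        return sum(buf) / len(buf)


logger = ToyMetricsLogger()
for deg in (10.0, 20.0, 30.0):
    logger.log_value("theta1_mean", deg, window=2)
print(logger.peek("theta1_mean"))  # 25.0: the window keeps only 20.0 and 30.0
```

The windowed mean is what ultimately shows up in the training results, which is why the custom metric in the example below appears under a stable key across iterations.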

.. todo: uncomment this once metrics-logger.rst page is online.
Read :ref:`more about the MetricsLogger API here <rllib-metric-logger-doc>`__ or also ...
See this more complex example that `generates and logs a PacMan heatmap (image) to WandB <https://github.com/ray-project/ray/blob/master/rllib/examples/metrics/custom_metrics_in_env_runners.py>`__.

.. testcode::

@@ -316,7 +308,7 @@ Also take a look at this more complex example on `how to generate and log a PacM
def on_episode_step(self, *, episode, env, **kwargs):
# First get the angle from the env (note that `env` is a VectorEnv).
# See https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/envs/classic_control/acrobot.py
# for the env source code.
cos_theta1, sin_theta1 = env.envs[0].unwrapped.state[0], env.envs[0].unwrapped.state[1]
# Convert cos/sin/tan into degree.
deg_theta1 = math.degrees(math.atan2(sin_theta1, cos_theta1))
@@ -342,7 +334,7 @@ Also take a look at this more complex example on `how to generate and log a PacM
)
ppo = config.build()

# Train n times. Expect to find `theta1_mean` in the results under:
# `env_runners/theta1_mean`
for i in range(10):
results = ppo.train()
@@ -360,6 +352,6 @@ Also take a look at this more complex example on `how to generate and log a PacM
evaluation ``EnvRunner`` actors and False on ``EnvRunner`` actors used to collect
training data.

.. tip::

    See :ref:`Callbacks invoked in Algorithm <rllib-callback-reference-algorithm-bound>`
    for the exact call signatures of all available callbacks and the argument types they expect.
