
Commit
Polish configurations and documents.
zjowowen committed Jun 18, 2024
1 parent 31eaa6a commit b8d1233
Showing 18 changed files with 9 additions and 2,502 deletions.
6 changes: 0 additions & 6 deletions docs/source/tutorials/installation/index.rst
@@ -9,12 +9,6 @@ GenerativeRL can be installed using pip:
You can also install the latest development version from GitHub:

.. code-block:: console

   $ pip install git+https://github.com/OpenDILab/GenerativeRL.git

-If you want to try a preview of the latest features, you can install the latest development version from GitHub:
-
-.. code-block:: console
-
-   $ pip install git+https://github.com/opendilab/GenerativeRL.git
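As a quick sanity check after either installation path, you can try importing the package (a hedged example; it assumes the package imports as ``grl``, which matches the repository layout):

.. code-block:: console

   $ python -c "import grl"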
18 changes: 9 additions & 9 deletions docs/source/tutorials/quick_start/index.rst
@@ -5,7 +5,7 @@ Generative model in GenerativeRL
---------

GenerativeRL supports easy-to-use APIs for training and deploying generative models.
-We provide a simple example of how to train a diffusion model on the swiss roll dataset in [Colab](https://colab.research.google.com/drive/18yHUAmcMh_7xq2U6TBCtcLKX2y4YvNyk?usp=drive_link).
+We provide a simple example of how to train a diffusion model on the swiss roll dataset in `Colab <https://colab.research.google.com/drive/18yHUAmcMh_7xq2U6TBCtcLKX2y4YvNyk?usp=drive_link>`_.

More usage examples can be found in the folder `grl_pipelines/tutorials/`.

@@ -41,16 +41,16 @@ Explanation

1. First, we import the necessary components from the GenerativeRL library, including the configuration for the HalfCheetah environment and the QGPO algorithm, as well as the logging utility and the OpenAI Gym environment.

-2. The `qgpo_pipeline` function encapsulates the training and deployment process:
+2. The ``qgpo_pipeline`` function encapsulates the training and deployment process:

-- An instance of the `QGPOAlgorithm` is created with the provided configuration.
-- The `qgpo.train()` method is called to train the QGPO agent on the HalfCheetah environment.
-- After training, the `qgpo.deploy()` method is called to obtain the trained agent for deployment.
-- A new instance of the HalfCheetah environment is created using `gym.make`.
-- The environment is reset to its initial state with `env.reset()`.
-- A loop is executed for the specified number of steps (`config.deploy.num_deploy_steps`), rendering the environment and stepping through it using the agent's `act` method.
+- An instance of the ``QGPOAlgorithm`` is created with the provided configuration.
+- The ``qgpo.train()`` method is called to train the QGPO agent on the HalfCheetah environment.
+- After training, the ``qgpo.deploy()`` method is called to obtain the trained agent for deployment.
+- A new instance of the HalfCheetah environment is created using ``gym.make``.
+- The environment is reset to its initial state with ``env.reset()``.
+- A loop is executed for the specified number of steps (``config.deploy.num_deploy_steps``), rendering the environment and stepping through it using the agent's ``act`` method.

-3. In the `if __name__ == '__main__'` block, the configuration is printed to the console using the logging utility, and the `qgpo_pipeline` function is called with the provided configuration.
+3. In the ``if __name__ == '__main__'`` block, the configuration is printed to the console using the logging utility, and the ``qgpo_pipeline`` function is called with the provided configuration.

This example demonstrates how to utilize the GenerativeRL library to train a QGPO agent on the HalfCheetah environment and then deploy the trained agent for evaluation within the environment. You can modify the configuration and algorithm as needed to suit your specific use case.
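Putting these steps together, a minimal sketch of such a pipeline could look like the following (a sketch only: the import paths, the environment id, and the exact ``env.step`` return signature are assumptions based on the explanation above and the classic Gym API, not verbatim library code):

.. code-block:: python

   import gym

   # The import paths below are assumptions based on the GenerativeRL repository layout.
   from grl.algorithms.qgpo import QGPOAlgorithm
   from grl.utils.log import log
   from grl_pipelines.diffusion_model.configurations.halfcheetah_qgpo import config

   def qgpo_pipeline(config):
       # Create the QGPO agent from the provided configuration and train it.
       qgpo = QGPOAlgorithm(config)
       qgpo.train()

       # Obtain the trained agent for deployment.
       agent = qgpo.deploy()

       # Create a fresh HalfCheetah environment; the env id is an assumption.
       env = gym.make("HalfCheetah-v3")
       observation = env.reset()
       for _ in range(config.deploy.num_deploy_steps):
           env.render()
           # Classic Gym step API: (observation, reward, done, info).
           observation, reward, done, info = env.step(agent.act(observation))

   if __name__ == '__main__':
       log.info("config: \n{}".format(config))
       qgpo_pipeline(config)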


This file was deleted.

This file was deleted.

