New Features:

Better video rendering
- [Feature] A PixelRenderTransform by @vmoens in #2099
- [Feature] Video recording in SOTA examples by @vmoens in #2070
- [Feature] VideoRecorder for datasets and replay buffers by @vmoens in #2069
Replay buffer: sampling trajectories is now much easier, cleaner and faster
- [Benchmark] Benchmark slice sampler by @vmoens in #1992
- [Feature] Add PrioritizedSliceSampler by @Cadene in #1875
- [Feature] Span slice indices on the left and on the right by @vmoens in #2107
- [Feature] batched trajectories - SliceSampler compatibility by @vmoens in #1775
- [Performance] Faster slice sampler by @vmoens in #2031
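
The idea behind slice sampling can be sketched in plain Python: given a flat buffer in which every transition carries a trajectory id, pick a trajectory, then take a random contiguous window of fixed length inside it. This is an illustration of the concept only, not TorchRL's SliceSampler implementation; `sample_slices` and its arguments are hypothetical names, and the sketch assumes each trajectory occupies consecutive buffer positions.

```python
import random
from collections import defaultdict


def sample_slices(traj_ids, num_slices, slice_len, rng=None):
    """Return `num_slices` lists of buffer indices, each a window inside one trajectory.

    Hypothetical helper for illustration; assumes every trajectory is stored
    in consecutive buffer positions.
    """
    rng = rng or random.Random()
    # Group buffer positions by trajectory id, preserving storage order.
    positions = defaultdict(list)
    for idx, tid in enumerate(traj_ids):
        positions[tid].append(idx)
    # Only trajectories long enough to hold a full slice are eligible.
    eligible = [p for p in positions.values() if len(p) >= slice_len]
    if not eligible:
        raise ValueError("no trajectory is at least slice_len steps long")
    slices = []
    for _ in range(num_slices):
        traj = rng.choice(eligible)
        # Pick a start so that the window never crosses the trajectory end.
        start = rng.randrange(len(traj) - slice_len + 1)
        slices.append(traj[start:start + slice_len])
    return slices


# Three trajectories of lengths 4, 2 and 5 stored back to back.
traj_ids = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2, 2]
batch = sample_slices(traj_ids, num_slices=4, slice_len=3, rng=random.Random(0))
```

In this sketch the trajectory of length 2 is never sampled, because it cannot hold a window of 3 steps. That mirrors a strict-length policy; a non-strict variant would return shorter slices instead.
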
Datasets: allow preprocessing datasets after download
- [Feature] Preproc for datasets by @vmoens in #1989
Losses: reduction parameters and non-functional execution
- [Feature] Add reduction parameter to On-Policy losses. by @albertbou92 in #1890
- [Feature] Adds value clipping in ClipPPOLoss loss by @albertbou92 in #2005
- [Feature] Offline objectives reduction parameter by @albertbou92 in #1984
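
A reduction parameter usually controls whether per-sample loss values are returned as they are or aggregated into a scalar. The convention can be sketched in plain Python, assuming the common "none" / "mean" / "sum" options; the helper name `reduce_loss` is hypothetical and this is not the code from the pull requests above.

```python
def reduce_loss(values, reduction="mean"):
    """Aggregate per-sample loss values according to `reduction`.

    Hypothetical helper for illustration only.
    """
    values = list(values)
    if reduction == "none":
        # Keep one loss value per sample, e.g. for custom weighting.
        return values
    if reduction == "sum":
        return sum(values)
    if reduction == "mean":
        return sum(values) / len(values)
    raise ValueError(f"unknown reduction: {reduction!r}")


per_sample = [0.5, 1.5, 4.0]
unreduced = reduce_loss(per_sample, "none")
total = reduce_loss(per_sample, "sum")
average = reduce_loss(per_sample, "mean")
```

Keeping the unreduced values is useful when the caller wants to weight samples individually, for instance with importance weights from a prioritized replay buffer, before aggregating.
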
Environment API: support "fork" start method in ParallelEnv, better handling of auto-resetting envs
- [Feature] Use non-default mp start method in ParallelEnv by @vmoens in #1966
- [Feature] Auto-resetting envs by @vmoens in #2073
Transforms
- [Feature] Allow any callable to be used as transform by @vmoens in #2027
- [Feature] invert transforms appended to a RB by @vmoens in #2111
- [Feature] Extend TensorDictPrimer default_value options by @albertbou92 in #2071
- [Feature] Fine grained DeviceCastTransform by @vmoens in #2041
- [Feature] BatchSizeTransform by @vmoens in #2030
- [Feature] Allow non-sorted keys in CatFrames by @vmoens in #1913
- [Feature] env.append_transform by @vmoens in #2040
New environment and improvements:
- [Environment] Meltingpot by @matteobettini in #2054
- [Feature] Return depth from RoboHiveEnv by @sriramsk1999 in #2058
- [Feature] PettingZoo possibility to choose reset strategy by @matteobettini in #2048

Other features

[Feature] Add time_dim arg in value modules by @vmoens in #1946
[Feature] Batched actions wrapper by @vmoens in #2018
[Feature] Better repr of RBs by @vmoens in #1991
[Feature] Execute rollouts with regular nn.Module instances by @vmoens in #1947
[Feature] Logger by @vmoens in #1858
[Feature] Passing lists of keyword arguments in reset for batched envs by @vmoens in #2076
[Feature] RB MultiStep transform by @vmoens in #2008
[Feature] Replace RewardClipping with SignTransform in Atari examples by @albertbou92 in #1870
[Feature] reset_parameters for multiagent nets by @matteobettini in #1970
[Feature] optionally set truncated = True at the end of rollouts by @vmoens in #2042

Miscellaneous

Fix onw typo by @kit1980 in #1917
Rename SOTA-IMPLEMENTATIONS.md to README.md by @matteobettini in #2093
Revert "[BugFix] Fix Isaac" by @vmoens in #2118
Update getting-started-5.py by @vmoens in #1894
[BugFix, Performance] Fewer imports at root by @vmoens in #1930
[BugFix,CI] Fix Windows CI by @vmoens in #1983
[BugFix,CI] Fix sporadically failing tests in CI by @vmoens in #2098
[BugFix,Refactor] Dreamer refactor by @BY571 in #1918
[BugFix] Adaptable non-blocking for mps and non cuda device in batched-envs by @vmoens in #1900
[BugFix] Call contiguous on rollout results in TestMultiStepTransform by @vmoens in #2025
[BugFix] Dedicated tests for on policy losses reduction parameter by @albertbou92 in #1974
[BugFix] Extend with a list of tensordicts by @vmoens in #2032
[BugFix] Fix Atari DQN ensembling by @vmoens in #1981
[BugFix] Fix CQL/IQL pbar update by @vmoens in #2020
[BugFix] Fix Exclude / Double2Float transforms by @vmoens in #2101
[BugFix] Fix Isaac by @vmoens in #2072
[BugFix] Fix KLPENPPOLoss KL computation by @vmoens in #1922
[BugFix] Fix MPS sync in device transform by @vmoens in #2061
[BugFix] Fix OOB TruncatedNormal LP by @vmoens in #1924
[BugFix] Fix R2Go once more by @vmoens in #2089
[BugFix] Fix Ray collector example error by @albertbou92 in #1908
[BugFix] Fix Ray collector on Python > 3.8 by @albertbou92 in #2015
[BugFix] Fix RoboHiveEnv tests by @sriramsk1999 in #2062
[BugFix] Fix _reset data passing in parallel env by @vmoens in #1880
[BugFix] Fix a bug in SliceSampler, indexes outside sampler lengths were produced by @vladisai in #1874
[BugFix] Fix args/kwargs passing in advantages by @vmoens in #2001
[BugFix] Fix batch-size expansion in functionalization by @vmoens in #1959
[BugFix] Fix broken gym tests by @vmoens in #1980
[BugFix] Fix clip_fraction in PO losses by @vmoens in #2021
[BugFix] Fix colab in tutos by @vmoens in #2113
[BugFix] Fix env.shape regex matches by @vmoens in #1940
[BugFix] Fix examples by @vmoens in #1945
[BugFix] Fix exploration in losses by @vmoens in #1898
[BugFix] Fix flaky rb tests by @vmoens in #1901
[BugFix] Fix habitat by @vmoens in #1941
[BugFix] Fix jumanji by @vmoens in #2064
[BugFix] Fix load_state_dict and is_empty td bugfix impact by @vmoens in #1869
[BugFix] Fix mp_start_method for ParallelEnv with single_for_serial by @vmoens in #2007
[BugFix] Fix multiple context syntax in multiagent examples by @matteobettini in #1943
[BugFix] Fix offline CatFrames by @vmoens in #1953
[BugFix] Fix offline CatFrames for pixels by @vmoens in #1964
[BugFix] Fix prints of size error when no file is associated with memmap by @vmoens in #2090
[BugFix] Fix replay buffer extension with lists by @vmoens in #1937
[BugFix] Fix reward2go for nd tensors by @vmoens in #2087
[BugFix] Fix robohive by @vmoens in #2080
[BugFix] Fix sampling without replacement with ndim storages by @vmoens in #1999
[BugFix] Fix slice sampler compatibility with split_trajs and MultiStep by @vmoens in #1961
[BugFix] Fix slicesampler terminated/truncated signaling by @vmoens in #2044
[BugFix] Fix strict-length for spanning trajectories by @vmoens in #1982
[BugFix] Fix strict_length=True in SliceSampler by @vmoens in #2037
[BugFix] Fix unwanted lazy stacks by @vmoens in #2102
[BugFix] Fix update in serial / parallel env by @vmoens in #1866
[BugFix] Fix vmas stacks by @vmoens in #2105
[BugFix] Fixed import for importlib by @DanilBaibak in #1914
[BugFix] Make KL-controllers independent of the model by @vmoens in #1903
[BugFix] Make sure ParallelEnv does not overflow mem when policy requires grad by @vmoens in #1909
[BugFix] More robust _StepMDP and multi-purpose envs by @vmoens in #2038
[BugFix] No grad on collector reset by @matteobettini in #1927
[BugFix] Non exclusive terminated and truncated by @vmoens in #1911
[BugFix] Refactor reductions by @vmoens in #1968
[BugFix] Remove split_trajectories's reference to ("next", "done"). by @initmaks in #2094
[BugFix] Remove reset on last step of a rollout by @matteobettini in #1936
[BugFix] Robust sync for non_blocking=True by @vmoens in #2034
[BugFix] Set default value for normalize_advantage to False. by @DobromirM in #2050
[BugFix] Set strict=False in tensordict.select() calls for objective classes by @albertbou92 in #2004
[BugFix] SliceSampler device and index mesh by @vmoens in #1996
[BugFix] Solve recursion issue in losses hook by @vmoens in #1897
[BugFix] Update cql docstring example by @BY571 in #1951
[BugFix] Update iql docstring example by @BY571 in #1950
[BugFix] Use same signature for append_transform in all cases by @vmoens in #2091
[BugFix] Use setdefault in _cache_values by @vmoens in #1910
[BugFix] Use traj_terminated in SliceSampler by @Cadene in #1884
[BugFix] Vmap randomness for value estimator by @BY571 in #1942
[BugFix] better device consistency in EGreedy by @vmoens in #1867
[BugFix] check_env_specs seeding logic by @vmoens in #1872
[BugFix] fix formatting for VideoRecorder docstring by @sriramsk1999 in #1985
[BugFix] fix trunc normal device by @vmoens in #1931
[BugFix] missing annotations import by @vmoens in #2074
[BugFix] state typo in RNG control module by @vmoens in #1878
[BugFix] to_observation_norm now works with keys which are not strings by @maxweissenbacher in #2045
[BugFix] union -> intersection in _StepMDP check by @vmoens in #2039
[CI,Doc] Sanitize version by @vmoens in #2120
[CI] Doc on release tag by @vmoens in #2116
[CI] Fix CI issues by @vmoens in #2084
[CI] Fix Doc CI by @matteobettini in #2106
[CI] Fixes sympy error by fixing mpmath version by @vmoens in #1988
[CI] Install ffmpeg in Robohive tests by @vmoens in #2063
[CI] Install stable torch and tensordict for release tests by @vmoens in #1978
[CI] Remove all macos x86 jobs by @vmoens in #2117
[CI] Remove x86 OSX jobs by @vmoens in #2112
[CI] Schedule workflows for releases by @vmoens in #2114
[CI] Temporarily remove snapshot from CI by @vmoens in #2000
[CI] Unpin mpmath by @vmoens in #1997
[CI] Upgrade 3.8 to 3.10 GPU jobs by @vmoens in #2013
[Deprecation] Deprecate in prep for release by @vmoens in #1820
[Doc,Feature] Better doc for modules and list of kwargs when possible by @vmoens in #1990
[Doc] Fix tutos by @vmoens in #1863
[Doc] Getting started tutos by @vmoens in #1886
[Doc] Improve PrioritizedSampler doc and get rid of np dependency as much as possible by @vmoens in #1881
[Doc] Installation instructions in API ref by @vmoens in #1871
[Doc] Per-release doc by @vmoens in #2108
[Documentation] Correct MaskedEnv Example in ActionMask Transform Documentation by @Jonathanace in #2060
[Examples] Move examples to sota-implementations by @vmoens in #2016
[Minor] Add env.shape attribute by @vmoens in #1938
[Minor] Lint by @vmoens in #2096
[Minor] Move distributed examples to examples by @vmoens in #2097
[Minor] Remove duplicate if statement in storages by @vmoens in #2066
[Minor] Remove warnings in test_cost by @vmoens in #1902
[Minor] Support init lazy storages with add by @vmoens in #2028
[Minor] Use the main branch for the M1 build wheels by @DanilBaibak in #1965
[Performance] Faster DMC by @vmoens in #2002
[Quality] Capture errors in specs transforms by @vmoens in #2092
[Quality] Make sure deprec warnings are displayed by @vmoens in #2088
[Refactor,Feature] Refactor collector shapes and stack_result in sync collector by @vmoens in #1994
[Refactor] Clearer separation between single_task and share_individual_td by @vmoens in #2026
[Refactor] Faster and more generic multi-agent nets by @vmoens in #1921
[Refactor] Refactor split_trajectories by @vmoens in #1955
[Refactor] Remove remnant legacy functional calls by @vmoens in #1973
[Refactor] Use filter_empty=False in apply for params by @vmoens in #1882
[Refactor] Use filter_empty=True in apply by @vmoens in #1879
[Tutorial] PettingZoo Parallel competitive tutorial by @matteobettini in #2047
[Versioning] Deprecations for 0.4 by @vmoens in #2109
[Versioning] New torch version by @vmoens in #2110
[Versioning] v0.4.0 by @vmoens in #1860

New Contributors

@vladisai made their first contribution in #1874
@Cadene made their first contribution in #1884
@sriramsk1999 made their first contribution in #1985
@DobromirM made their first contribution in #2050
@Jonathanace made their first contribution in #2060
@maxweissenbacher made their first contribution in #2045
@initmaks made their first contribution in #2094

A big thanks to our dear contributors as well as the entire user base for helping with this lib!

Full Changelog: v0.3.0...v0.4.0
