v0.4.0
New Features:
- Better video rendering
- Replay buffer: sampling trajectories is now much easier, cleaner and faster
- [Benchmark] Benchmark slice sampler by @vmoens in #1992
- [Feature] Add PrioritizedSliceSampler by @Cadene in #1875
- [Feature] Span slice indices on the left and on the right by @vmoens in #2107
- [Feature] batched trajectories - SliceSampler compatibility by @vmoens in #1775
- [Performance] Faster slice sampler by @vmoens in #2031
- Datasets: allow preprocessing datasets after download
- Losses: reduction parameters and non-functional execution
- [Feature] Add reduction parameter to On-Policy losses. by @albertbou92 in #1890
- [Feature] Adds value clipping in ClipPPOLoss loss by @albertbou92 in #2005
- [Feature] Offline objectives reduction parameter by @albertbou92 in #1984
- Environment API: support "fork" start method in ParallelEnv, better handling of auto-resetting envs.
- Transforms
- [Feature] Allow any callable to be used as transform by @vmoens in #2027
- [Feature] invert transforms appended to a RB by @vmoens in #2111
- [Feature] Extend TensorDictPrimer default_value options by @albertbou92 in #2071
- [Feature] Fine grained DeviceCastTransform by @vmoens in #2041
- [Feature] BatchSizeTransform by @vmoens in #2030
- [Feature] Allow non-sorted keys in CatFrames by @vmoens in #1913
- [Feature] env.append_transform by @vmoens in #2040
- New environment and improvements:
- [Environment] Meltingpot by @matteobettini in #2054
- [Feature] Return depth from RoboHiveEnv by @sriramsk1999 in #2058
- [Feature] PettingZoo possibility to choose reset strategy by @matteobettini in #2048
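
As a rough illustration of the trajectory slice sampling listed under the replay buffer features above, the sketch below draws fixed-length slices that never cross a trajectory boundary. It is a plain-Python approximation of the idea, not TorchRL's SliceSampler implementation: the function name `sample_slices`, its parameters, and the flat `traj_ids` list are hypothetical and chosen only for this example.

```python
import random
from collections import defaultdict


def sample_slices(traj_ids, num_slices, slice_len, rng=None):
    """Sample `num_slices` index lists of length `slice_len`, each within one trajectory.

    `traj_ids[i]` is the (hypothetical) trajectory id of the step stored at index i.
    """
    rng = rng or random.Random()

    # Group storage indices by trajectory, preserving time order.
    steps_by_traj = defaultdict(list)
    for index, traj_id in enumerate(traj_ids):
        steps_by_traj[traj_id].append(index)

    # Only trajectories with at least `slice_len` steps can yield a full slice.
    eligible = [steps for steps in steps_by_traj.values() if len(steps) >= slice_len]
    if not eligible:
        raise ValueError("no trajectory is long enough for the requested slice length")

    slices = []
    for _ in range(num_slices):
        steps = rng.choice(eligible)
        start = rng.randrange(len(steps) - slice_len + 1)
        slices.append(steps[start:start + slice_len])
    return slices


# Three trajectories of lengths 3, 4 and 1; the last is too short to be sampled.
traj_ids = [0, 0, 0, 1, 1, 1, 1, 2]
batch = sample_slices(traj_ids, num_slices=4, slice_len=3, rng=random.Random(0))
```

Each element of `batch` is a list of three storage indices that all belong to the same trajectory. This sketch picks among eligible trajectories uniformly, which is an assumption made here for simplicity rather than a description of the library's behavior.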
Other features
- [Feature] Add time_dim arg in value modules by @vmoens in #1946
- [Feature] Batched actions wrapper by @vmoens in #2018
- [Feature] Better repr of RBs by @vmoens in #1991
- [Feature] Execute rollouts with regular nn.Module instances by @vmoens in #1947
- [Feature] Logger by @vmoens in #1858
- [Feature] Passing lists of keyword arguments in reset for batched envs by @vmoens in #2076
- [Feature] RB MultiStep transform by @vmoens in #2008
- [Feature] Replace RewardClipping with SignTransform in Atari examples by @albertbou92 in #1870
- [Feature] reset_parameters for multiagent nets by @matteobettini in #1970
- [Feature] optionally set truncated = True at the end of rollouts by @vmoens in #2042
Miscellaneous
- Fix onw typo by @kit1980 in #1917
- Rename SOTA-IMPLEMENTATIONS.md to README.md by @matteobettini in #2093
- Revert "[BugFix] Fix Isaac" by @vmoens in #2118
- Update getting-started-5.py by @vmoens in #1894
- [BugFix, Performance] Fewer imports at root by @vmoens in #1930
- [BugFix,CI] Fix Windows CI by @vmoens in #1983
- [BugFix,CI] Fix sporadically failing tests in CI by @vmoens in #2098
- [BugFix,Refactor] Dreamer refactor by @BY571 in #1918
- [BugFix] Adaptable non-blocking for mps and non cuda device in batched-envs by @vmoens in #1900
- [BugFix] Call contiguous on rollout results in TestMultiStepTransform by @vmoens in #2025
- [BugFix] Dedicated tests for on policy losses reduction parameter by @albertbou92 in #1974
- [BugFix] Extend with a list of tensordicts by @vmoens in #2032
- [BugFix] Fix Atari DQN ensembling by @vmoens in #1981
- [BugFix] Fix CQL/IQL pbar update by @vmoens in #2020
- [BugFix] Fix Exclude / Double2Float transforms by @vmoens in #2101
- [BugFix] Fix Isaac by @vmoens in #2072
- [BugFix] Fix KLPENPPOLoss KL computation by @vmoens in #1922
- [BugFix] Fix MPS sync in device transform by @vmoens in #2061
- [BugFix] Fix OOB TruncatedNormal LP by @vmoens in #1924
- [BugFix] Fix R2Go once more by @vmoens in #2089
- [BugFix] Fix Ray collector example error by @albertbou92 in #1908
- [BugFix] Fix Ray collector on Python > 3.8 by @albertbou92 in #2015
- [BugFix] Fix RoboHiveEnv tests by @sriramsk1999 in #2062
- [BugFix] Fix _reset data passing in parallel env by @vmoens in #1880
- [BugFix] Fix a bug in SliceSampler, indexes outside sampler lengths were produced by @vladisai in #1874
- [BugFix] Fix args/kwargs passing in advantages by @vmoens in #2001
- [BugFix] Fix batch-size expansion in functionalization by @vmoens in #1959
- [BugFix] Fix broken gym tests by @vmoens in #1980
- [BugFix] Fix clip_fraction in PO losses by @vmoens in #2021
- [BugFix] Fix colab in tutos by @vmoens in #2113
- [BugFix] Fix env.shape regex matches by @vmoens in #1940
- [BugFix] Fix examples by @vmoens in #1945
- [BugFix] Fix exploration in losses by @vmoens in #1898
- [BugFix] Fix flaky rb tests by @vmoens in #1901
- [BugFix] Fix habitat by @vmoens in #1941
- [BugFix] Fix jumanji by @vmoens in #2064
- [BugFix] Fix load_state_dict and is_empty td bugfix impact by @vmoens in #1869
- [BugFix] Fix mp_start_method for ParallelEnv with single_for_serial by @vmoens in #2007
- [BugFix] Fix multiple context syntax in multiagent examples by @matteobettini in #1943
- [BugFix] Fix offline CatFrames by @vmoens in #1953
- [BugFix] Fix offline CatFrames for pixels by @vmoens in #1964
- [BugFix] Fix prints of size error when no file is associated with memmap by @vmoens in #2090
- [BugFix] Fix replay buffer extension with lists by @vmoens in #1937
- [BugFix] Fix reward2go for nd tensors by @vmoens in #2087
- [BugFix] Fix robohive by @vmoens in #2080
- [BugFix] Fix sampling without replacement with ndim storages by @vmoens in #1999
- [BugFix] Fix slice sampler compatibility with split_trajs and MultiStep by @vmoens in #1961
- [BugFix] Fix slicesampler terminated/truncated signaling by @vmoens in #2044
- [BugFix] Fix strict-length for spanning trajectories by @vmoens in #1982
- [BugFix] Fix strict_length=True in SliceSampler by @vmoens in #2037
- [BugFix] Fix unwanted lazy stacks by @vmoens in #2102
- [BugFix] Fix update in serial / parallel env by @vmoens in #1866
- [BugFix] Fix vmas stacks by @vmoens in #2105
- [BugFix] Fixed import for importlib by @DanilBaibak in #1914
- [BugFix] Make KL-controllers independent of the model by @vmoens in #1903
- [BugFix] Make sure ParallelEnv does not overflow mem when policy requires grad by @vmoens in #1909
- [BugFix] More robust _StepMDP and multi-purpose envs by @vmoens in #2038
- [BugFix] No grad on collector reset by @matteobettini in #1927
- [BugFix] Non exclusive terminated and truncated by @vmoens in #1911
- [BugFix] Refactor reductions by @vmoens in #1968
- [BugFix] Remove split_trajectories's reference to ("next", "done"). by @initmaks in #2094
- [BugFix] Remove reset on last step of a rollout by @matteobettini in #1936
- [BugFix] Robust sync for non_blocking=True by @vmoens in #2034
- [BugFix] Set default value for normalize_advantage to False. by @DobromirM in #2050
- [BugFix] Set strict=False in tensordict.select() calls for objective classes by @albertbou92 in #2004
- [BugFix] SliceSampler device and index mesh by @vmoens in #1996
- [BugFix] Solve recursion issue in losses hook by @vmoens in #1897
- [BugFix] Update cql docstring example by @BY571 in #1951
- [BugFix] Update iql docstring example by @BY571 in #1950
- [BugFix] Use same signature for append_transform in all cases by @vmoens in #2091
- [BugFix] Use setdefault in _cache_values by @vmoens in #1910
- [BugFix] Use traj_terminated in SliceSampler by @Cadene in #1884
- [BugFix] Vmap randomness for value estimator by @BY571 in #1942
- [BugFix] better device consistency in EGreedy by @vmoens in #1867
- [BugFix] check_env_specs seeding logic by @vmoens in #1872
- [BugFix] fix formatting for VideoRecorder docstring by @sriramsk1999 in #1985
- [BugFix] fix trunc normal device by @vmoens in #1931
- [BugFix] missing annotations import by @vmoens in #2074
- [BugFix] state typo in RNG control module by @vmoens in #1878
- [BugFix] to_observation_norm now works with keys which are not strings by @maxweissenbacher in #2045
- [BugFix] union -> intersection in _StepMDP check by @vmoens in #2039
- [CI,Doc] Sanitize version by @vmoens in #2120
- [CI] Doc on release tag by @vmoens in #2116
- [CI] Fix CI issues by @vmoens in #2084
- [CI] Fix Doc CI by @matteobettini in #2106
- [CI] Fixes sympy error by fixing mpmath version by @vmoens in #1988
- [CI] Install ffmpeg in Robohive tests by @vmoens in #2063
- [CI] Install stable torch and tensordict for release tests by @vmoens in #1978
- [CI] Remove all macos x86 jobs by @vmoens in #2117
- [CI] Remove x86 OSX jobs by @vmoens in #2112
- [CI] Schedule workflows for releases by @vmoens in #2114
- [CI] Temporarily remove snapshot from CI by @vmoens in #2000
- [CI] Unpin mpmath by @vmoens in #1997
- [CI] Upgrade 3.8 to 3.10 GPU jobs by @vmoens in #2013
- [Deprecation] Deprecate in prep for release by @vmoens in #1820
- [Doc,Feature] Better doc for modules and list of kwargs when possible by @vmoens in #1990
- [Doc] Fix tutos by @vmoens in #1863
- [Doc] Getting started tutos by @vmoens in #1886
- [Doc] Improve PrioritizedSampler doc and get rid of np dependency as much as possible by @vmoens in #1881
- [Doc] Installation instructions in API ref by @vmoens in #1871
- [Doc] Per-release doc by @vmoens in #2108
- [Documentation] Correct MaskedEnv Example in ActionMask Transform Documentation by @Jonathanace in #2060
- [Examples] Move examples to sota-implementations by @vmoens in #2016
- [Minor] Add env.shape attribute by @vmoens in #1938
- [Minor] Lint by @vmoens in #2096
- [Minor] Move distributed examples to examples by @vmoens in #2097
- [Minor] Remove duplicate if statement in storages by @vmoens in #2066
- [Minor] Remove warnings in test_cost by @vmoens in #1902
- [Minor] Support init lazy storages with add by @vmoens in #2028
- [Minor] Use the main branch for the M1 build wheels by @DanilBaibak in #1965
- [Performance] Faster DMC by @vmoens in #2002
- [Quality] Capture errors in specs transforms by @vmoens in #2092
- [Quality] Make sure deprec warnings are displayed by @vmoens in #2088
- [Refactor,Feature] Refactor collector shapes and stack_result in sync collector by @vmoens in #1994
- [Refactor] Clearer separation between single_task and share_individual_td by @vmoens in #2026
- [Refactor] Faster and more generic multi-agent nets by @vmoens in #1921
- [Refactor] Refactor split_trajectories by @vmoens in #1955
- [Refactor] Remove remnant legacy functional calls by @vmoens in #1973
- [Refactor] Use filter_empty=False in apply for params by @vmoens in #1882
- [Refactor] Use filter_empty=True in apply by @vmoens in #1879
- [Tutorial] PettingZoo Parallel competitive tutorial by @matteobettini in #2047
- [Versioning] Deprecations for 0.4 by @vmoens in #2109
- [Versioning] New torch version by @vmoens in #2110
- [Versioning] v0.4.0 by @vmoens in #1860
New Contributors
- @vladisai made their first contribution in #1874
- @Cadene made their first contribution in #1884
- @sriramsk1999 made their first contribution in #1985
- @DobromirM made their first contribution in #2050
- @Jonathanace made their first contribution in #2060
- @maxweissenbacher made their first contribution in #2045
- @initmaks made their first contribution in #2094
A big thanks to our dear contributors as well as the entire user base for helping with this lib!
Full Changelog: v0.3.0...v0.4.0