[CI] Fix benchmark workflows #2488

Merged: 24 commits, Oct 14, 2024

vmoens (Contributor) commented Oct 11, 2024

No description provided.

pytorch-bot bot commented Oct 11, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2488

Note: Links to docs will display an error until the docs builds have been completed.

❌ 18 New Failures, 4 Unrelated Failures

As of commit b957a40 with merge base c1c2e84:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the "CLA Signed" label Oct 11, 2024. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)

github-actions bot commented Oct 11, 2024

$\color{#35bf28}\textsf{\Large✔\kern{0.2cm}\normalsize OK}$ Result of CPU Benchmark Tests

Total Benchmarks: 143. Improved: $\large\color{#35bf28}20$. Worsened: $\large\color{#d91a1a}0$.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_simple 0.4097s 0.4078s 2.4524 Ops/s 2.3276 Ops/s $\textbf{\color{#35bf28}+5.36\%}$
test_transformed 0.6828s 0.6043s 1.6548 Ops/s 1.6342 Ops/s $\color{#35bf28}+1.26\%$
test_serial 1.4099s 1.3413s 0.7456 Ops/s 0.7296 Ops/s $\color{#35bf28}+2.18\%$
test_parallel 1.1889s 1.1839s 0.8447 Ops/s 0.8069 Ops/s $\color{#35bf28}+4.68\%$
test_step_mdp_speed[True-True-True-True-True] 0.1301ms 28.6479μs 34.9066 KOps/s 34.5696 KOps/s $\color{#35bf28}+0.97\%$
test_step_mdp_speed[True-True-True-True-False] 48.2210μs 17.4500μs 57.3067 KOps/s 57.5664 KOps/s $\color{#d91a1a}-0.45\%$
test_step_mdp_speed[True-True-True-False-True] 67.2760μs 16.0700μs 62.2276 KOps/s 61.3848 KOps/s $\color{#35bf28}+1.37\%$
test_step_mdp_speed[True-True-True-False-False] 55.6740μs 9.4708μs 105.5882 KOps/s 105.4852 KOps/s $\color{#35bf28}+0.10\%$
test_step_mdp_speed[True-True-False-True-True] 80.3710μs 30.9130μs 32.3488 KOps/s 32.0167 KOps/s $\color{#35bf28}+1.04\%$
test_step_mdp_speed[True-True-False-True-False] 58.7400μs 19.4656μs 51.3727 KOps/s 51.1823 KOps/s $\color{#35bf28}+0.37\%$
test_step_mdp_speed[True-True-False-False-True] 48.2200μs 18.2921μs 54.6685 KOps/s 54.5140 KOps/s $\color{#35bf28}+0.28\%$
test_step_mdp_speed[True-True-False-False-False] 51.0660μs 11.6998μs 85.4714 KOps/s 86.4253 KOps/s $\color{#d91a1a}-1.10\%$
test_step_mdp_speed[True-False-True-True-True] 82.5650μs 33.0256μs 30.2796 KOps/s 30.0058 KOps/s $\color{#35bf28}+0.91\%$
test_step_mdp_speed[True-False-True-True-False] 54.7020μs 21.9908μs 45.4735 KOps/s 46.3981 KOps/s $\color{#d91a1a}-1.99\%$
test_step_mdp_speed[True-False-True-False-True] 48.6110μs 18.2794μs 54.7065 KOps/s 54.7591 KOps/s $\color{#d91a1a}-0.10\%$
test_step_mdp_speed[True-False-True-False-False] 35.5870μs 11.7635μs 85.0087 KOps/s 85.9394 KOps/s $\color{#d91a1a}-1.08\%$
test_step_mdp_speed[True-False-False-True-True] 87.4940μs 35.2274μs 28.3870 KOps/s 28.3134 KOps/s $\color{#35bf28}+0.26\%$
test_step_mdp_speed[True-False-False-True-False] 60.4240μs 23.6405μs 42.3003 KOps/s 42.0603 KOps/s $\color{#35bf28}+0.57\%$
test_step_mdp_speed[True-False-False-False-True] 66.5240μs 19.8827μs 50.2951 KOps/s 49.5530 KOps/s $\color{#35bf28}+1.50\%$
test_step_mdp_speed[True-False-False-False-False] 41.2070μs 13.5663μs 73.7123 KOps/s 73.4592 KOps/s $\color{#35bf28}+0.34\%$
test_step_mdp_speed[False-True-True-True-True] 83.9570μs 33.0659μs 30.2426 KOps/s 30.1260 KOps/s $\color{#35bf28}+0.39\%$
test_step_mdp_speed[False-True-True-True-False] 63.6700μs 21.6956μs 46.0923 KOps/s 46.0982 KOps/s $\color{#d91a1a}-0.01\%$
test_step_mdp_speed[False-True-True-False-True] 55.5440μs 21.2158μs 47.1347 KOps/s 46.0371 KOps/s $\color{#35bf28}+2.38\%$
test_step_mdp_speed[False-True-True-False-False] 2.3900ms 13.5939μs 73.5627 KOps/s 73.9349 KOps/s $\color{#d91a1a}-0.50\%$
test_step_mdp_speed[False-True-False-True-True] 78.8280μs 34.9846μs 28.5840 KOps/s 28.1067 KOps/s $\color{#35bf28}+1.70\%$
test_step_mdp_speed[False-True-False-True-False] 52.6190μs 23.7737μs 42.0633 KOps/s 42.2730 KOps/s $\color{#d91a1a}-0.50\%$
test_step_mdp_speed[False-True-False-False-True] 67.8070μs 23.7043μs 42.1864 KOps/s 41.4499 KOps/s $\color{#35bf28}+1.78\%$
test_step_mdp_speed[False-True-False-False-False] 43.0700μs 15.7162μs 63.6288 KOps/s 64.6175 KOps/s $\color{#d91a1a}-1.53\%$
test_step_mdp_speed[False-False-True-True-True] 86.4630μs 37.2548μs 26.8421 KOps/s 26.5702 KOps/s $\color{#35bf28}+1.02\%$
test_step_mdp_speed[False-False-True-True-False] 59.0600μs 25.8210μs 38.7282 KOps/s 38.7191 KOps/s $\color{#35bf28}+0.02\%$
test_step_mdp_speed[False-False-True-False-True] 58.4000μs 22.9999μs 43.4784 KOps/s 42.8351 KOps/s $\color{#35bf28}+1.50\%$
test_step_mdp_speed[False-False-True-False-False] 85.8430μs 15.0075μs 66.6331 KOps/s 64.9593 KOps/s $\color{#35bf28}+2.58\%$
test_step_mdp_speed[False-False-False-True-True] 93.4350μs 38.9670μs 25.6628 KOps/s 25.5793 KOps/s $\color{#35bf28}+0.33\%$
test_step_mdp_speed[False-False-False-True-False] 66.8150μs 27.5306μs 36.3232 KOps/s 35.5590 KOps/s $\color{#35bf28}+2.15\%$
test_step_mdp_speed[False-False-False-False-True] 63.9090μs 25.0976μs 39.8444 KOps/s 39.1127 KOps/s $\color{#35bf28}+1.87\%$
test_step_mdp_speed[False-False-False-False-False] 51.1660μs 17.4787μs 57.2124 KOps/s 57.4415 KOps/s $\color{#d91a1a}-0.40\%$
test_values[generalized_advantage_estimate-True-True] 9.8211ms 9.3477ms 106.9785 Ops/s 102.1410 Ops/s $\color{#35bf28}+4.74\%$
test_values[vec_generalized_advantage_estimate-True-True] 35.8476ms 33.3906ms 29.9486 Ops/s 27.5663 Ops/s $\textbf{\color{#35bf28}+8.64\%}$
test_values[td0_return_estimate-False-False] 0.2311ms 0.1699ms 5.8854 KOps/s 5.1063 KOps/s $\textbf{\color{#35bf28}+15.26\%}$
test_values[td1_return_estimate-False-False] 27.3331ms 23.4014ms 42.7325 Ops/s 41.1124 Ops/s $\color{#35bf28}+3.94\%$
test_values[vec_td1_return_estimate-False-False] 47.7853ms 34.3135ms 29.1430 Ops/s 27.4986 Ops/s $\textbf{\color{#35bf28}+5.98\%}$
test_values[td_lambda_return_estimate-True-False] 37.6075ms 33.9991ms 29.4126 Ops/s 28.5654 Ops/s $\color{#35bf28}+2.97\%$
test_values[vec_td_lambda_return_estimate-True-False] 35.6493ms 33.4668ms 29.8803 Ops/s 27.5133 Ops/s $\textbf{\color{#35bf28}+8.60\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.5175ms 8.3102ms 120.3336 Ops/s 120.9084 Ops/s $\color{#d91a1a}-0.48\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.2819ms 1.9451ms 514.1153 Ops/s 483.9548 Ops/s $\textbf{\color{#35bf28}+6.23\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.9758ms 0.3626ms 2.7575 KOps/s 2.6503 KOps/s $\color{#35bf28}+4.04\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 45.7466ms 44.0786ms 22.6867 Ops/s 20.5502 Ops/s $\textbf{\color{#35bf28}+10.40\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.7069ms 3.0472ms 328.1750 Ops/s 325.3217 Ops/s $\color{#35bf28}+0.88\%$
test_dqn_speed[False-None] 1.8258ms 1.3559ms 737.4938 Ops/s 721.3344 Ops/s $\color{#35bf28}+2.24\%$
test_dqn_speed[False-backward] 1.8791ms 1.8402ms 543.4191 Ops/s 524.8993 Ops/s $\color{#35bf28}+3.53\%$
test_dqn_speed[True-None] 1.2332ms 0.4641ms 2.1549 KOps/s 2.1383 KOps/s $\color{#35bf28}+0.77\%$
test_dqn_speed[True-backward] 0.9562ms 0.8861ms 1.1286 KOps/s 1.0941 KOps/s $\color{#35bf28}+3.15\%$
test_dqn_speed[reduce-overhead-None] 0.6613ms 0.4670ms 2.1414 KOps/s 2.1186 KOps/s $\color{#35bf28}+1.07\%$
test_dqn_speed[reduce-overhead-backward] 0.9405ms 0.8883ms 1.1258 KOps/s 1.1158 KOps/s $\color{#35bf28}+0.90\%$
test_ddpg_speed[False-None] 3.5198ms 2.8367ms 352.5255 Ops/s 345.6983 Ops/s $\color{#35bf28}+1.97\%$
test_ddpg_speed[False-backward] 4.2594ms 3.9681ms 252.0083 Ops/s 245.8478 Ops/s $\color{#35bf28}+2.51\%$
test_ddpg_speed[True-None] 1.3042ms 1.0009ms 999.0628 Ops/s 973.6931 Ops/s $\color{#35bf28}+2.61\%$
test_ddpg_speed[True-backward] 1.9542ms 1.8883ms 529.5751 Ops/s 459.0091 Ops/s $\textbf{\color{#35bf28}+15.37\%}$
test_ddpg_speed[reduce-overhead-None] 1.4468ms 1.0086ms 991.4407 Ops/s 981.4898 Ops/s $\color{#35bf28}+1.01\%$
test_ddpg_speed[reduce-overhead-backward] 1.9529ms 1.8876ms 529.7689 Ops/s 516.7210 Ops/s $\color{#35bf28}+2.53\%$
test_sac_speed[False-None] 8.8596ms 7.9312ms 126.0842 Ops/s 120.0838 Ops/s $\color{#35bf28}+5.00\%$
test_sac_speed[False-backward] 10.8487ms 10.5920ms 94.4105 Ops/s 89.3187 Ops/s $\textbf{\color{#35bf28}+5.70\%}$
test_sac_speed[True-None] 2.0996ms 1.8587ms 538.0035 Ops/s 526.0045 Ops/s $\color{#35bf28}+2.28\%$
test_sac_speed[True-backward] 3.8548ms 3.5675ms 280.3089 Ops/s 278.4686 Ops/s $\color{#35bf28}+0.66\%$
test_sac_speed[reduce-overhead-None] 2.3116ms 1.8550ms 539.0952 Ops/s 515.1846 Ops/s $\color{#35bf28}+4.64\%$
test_sac_speed[reduce-overhead-backward] 4.3336ms 3.5747ms 279.7448 Ops/s 269.2427 Ops/s $\color{#35bf28}+3.90\%$
test_redq_speed[False-None] 15.2403ms 13.0233ms 76.7852 Ops/s 74.0726 Ops/s $\color{#35bf28}+3.66\%$
test_redq_speed[False-backward] 23.7799ms 22.0748ms 45.3006 Ops/s 42.7497 Ops/s $\textbf{\color{#35bf28}+5.97\%}$
test_redq_speed[True-None] 5.3662ms 4.6699ms 214.1391 Ops/s 183.7105 Ops/s $\textbf{\color{#35bf28}+16.56\%}$
test_redq_speed[True-backward] 12.4590ms 12.2511ms 81.6254 Ops/s 72.9019 Ops/s $\textbf{\color{#35bf28}+11.97\%}$
test_redq_speed[reduce-overhead-None] 5.5226ms 4.6131ms 216.7754 Ops/s 190.4207 Ops/s $\textbf{\color{#35bf28}+13.84\%}$
test_redq_speed[reduce-overhead-backward] 13.2568ms 12.3634ms 80.8837 Ops/s 73.0342 Ops/s $\textbf{\color{#35bf28}+10.75\%}$
test_redq_deprec_speed[False-None] 15.1502ms 12.6932ms 78.7823 Ops/s 71.4092 Ops/s $\textbf{\color{#35bf28}+10.33\%}$
test_redq_deprec_speed[False-backward] 20.7645ms 18.4817ms 54.1075 Ops/s 49.9265 Ops/s $\textbf{\color{#35bf28}+8.37\%}$
test_redq_deprec_speed[True-None] 4.0409ms 3.5792ms 279.3890 Ops/s 260.8216 Ops/s $\textbf{\color{#35bf28}+7.12\%}$
test_redq_deprec_speed[True-backward] 8.9423ms 8.0960ms 123.5175 Ops/s 118.9402 Ops/s $\color{#35bf28}+3.85\%$
test_redq_deprec_speed[reduce-overhead-None] 4.6784ms 3.6441ms 274.4199 Ops/s 269.6781 Ops/s $\color{#35bf28}+1.76\%$
test_redq_deprec_speed[reduce-overhead-backward] 8.9416ms 8.4972ms 117.6863 Ops/s 111.7686 Ops/s $\textbf{\color{#35bf28}+5.29\%}$
test_td3_speed[False-None] 8.3042ms 8.0030ms 124.9536 Ops/s 119.8709 Ops/s $\color{#35bf28}+4.24\%$
test_td3_speed[False-backward] 12.5340ms 10.6008ms 94.3325 Ops/s 90.3307 Ops/s $\color{#35bf28}+4.43\%$
test_td3_speed[True-None] 2.0405ms 1.7752ms 563.3241 Ops/s 552.3069 Ops/s $\color{#35bf28}+1.99\%$
test_td3_speed[True-backward] 4.3304ms 3.4402ms 290.6786 Ops/s 269.3805 Ops/s $\textbf{\color{#35bf28}+7.91\%}$
test_td3_speed[reduce-overhead-None] 2.0076ms 1.7555ms 569.6335 Ops/s 559.0911 Ops/s $\color{#35bf28}+1.89\%$
test_td3_speed[reduce-overhead-backward] 3.5132ms 3.3634ms 297.3221 Ops/s 291.9419 Ops/s $\color{#35bf28}+1.84\%$
test_cql_speed[False-None] 37.1735ms 35.5458ms 28.1327 Ops/s 27.7756 Ops/s $\color{#35bf28}+1.29\%$
test_cql_speed[False-backward] 47.6192ms 45.6546ms 21.9036 Ops/s 21.6405 Ops/s $\color{#35bf28}+1.22\%$
test_cql_speed[True-None] 17.0353ms 15.5981ms 64.1105 Ops/s 62.4473 Ops/s $\color{#35bf28}+2.66\%$
test_cql_speed[True-backward] 23.7316ms 22.5657ms 44.3150 Ops/s 43.7998 Ops/s $\color{#35bf28}+1.18\%$
test_cql_speed[reduce-overhead-None] 17.6227ms 15.7107ms 63.6509 Ops/s 62.1512 Ops/s $\color{#35bf28}+2.41\%$
test_cql_speed[reduce-overhead-backward] 23.6854ms 22.7101ms 44.0333 Ops/s 42.7378 Ops/s $\color{#35bf28}+3.03\%$
test_a2c_speed[False-None] 8.2697ms 7.1742ms 139.3893 Ops/s 137.5111 Ops/s $\color{#35bf28}+1.37\%$
test_a2c_speed[False-backward] 15.6646ms 14.3513ms 69.6803 Ops/s 69.1203 Ops/s $\color{#35bf28}+0.81\%$
test_a2c_speed[True-None] 3.8817ms 3.3291ms 300.3795 Ops/s 294.4257 Ops/s $\color{#35bf28}+2.02\%$
test_a2c_speed[True-backward] 10.6738ms 9.8852ms 101.1616 Ops/s 99.6913 Ops/s $\color{#35bf28}+1.47\%$
test_a2c_speed[reduce-overhead-None] 4.0555ms 3.3298ms 300.3161 Ops/s 288.1810 Ops/s $\color{#35bf28}+4.21\%$
test_a2c_speed[reduce-overhead-backward] 10.2541ms 9.8425ms 101.6006 Ops/s 101.6383 Ops/s $\color{#d91a1a}-0.04\%$
test_ppo_speed[False-None] 9.3990ms 7.4893ms 133.5237 Ops/s 133.1582 Ops/s $\color{#35bf28}+0.27\%$
test_ppo_speed[False-backward] 16.2761ms 15.2529ms 65.5615 Ops/s 66.3625 Ops/s $\color{#d91a1a}-1.21\%$
test_ppo_speed[True-None] 4.3726ms 3.7407ms 267.3280 Ops/s 267.3133 Ops/s $+0.01\%$
test_ppo_speed[True-backward] 10.0452ms 9.6664ms 103.4516 Ops/s 102.3371 Ops/s $\color{#35bf28}+1.09\%$
test_ppo_speed[reduce-overhead-None] 4.1354ms 3.7125ms 269.3610 Ops/s 266.5307 Ops/s $\color{#35bf28}+1.06\%$
test_ppo_speed[reduce-overhead-backward] 10.0239ms 9.6404ms 103.7300 Ops/s 102.3609 Ops/s $\color{#35bf28}+1.34\%$
test_reinforce_speed[False-None] 8.0378ms 6.4714ms 154.5268 Ops/s 152.8910 Ops/s $\color{#35bf28}+1.07\%$
test_reinforce_speed[False-backward] 12.6397ms 9.8132ms 101.9031 Ops/s 101.8215 Ops/s $\color{#35bf28}+0.08\%$
test_reinforce_speed[True-None] 4.8656ms 2.6844ms 372.5279 Ops/s 369.4329 Ops/s $\color{#35bf28}+0.84\%$
test_reinforce_speed[True-backward] 9.4849ms 8.6792ms 115.2178 Ops/s 115.4982 Ops/s $\color{#d91a1a}-0.24\%$
test_reinforce_speed[reduce-overhead-None] 3.8693ms 2.6604ms 375.8899 Ops/s 373.0478 Ops/s $\color{#35bf28}+0.76\%$
test_reinforce_speed[reduce-overhead-backward] 10.3606ms 8.7447ms 114.3544 Ops/s 114.7518 Ops/s $\color{#d91a1a}-0.35\%$
test_iql_speed[False-None] 33.1669ms 31.9277ms 31.3208 Ops/s 29.8590 Ops/s $\color{#35bf28}+4.90\%$
test_iql_speed[False-backward] 46.8585ms 44.9319ms 22.2559 Ops/s 21.8741 Ops/s $\color{#35bf28}+1.75\%$
test_iql_speed[True-None] 14.4923ms 13.6168ms 73.4385 Ops/s 71.2837 Ops/s $\color{#35bf28}+3.02\%$
test_iql_speed[True-backward] 31.8431ms 24.7559ms 40.3944 Ops/s 39.9981 Ops/s $\color{#35bf28}+0.99\%$
test_iql_speed[reduce-overhead-None] 14.6527ms 13.5811ms 73.6319 Ops/s 72.9708 Ops/s $\color{#35bf28}+0.91\%$
test_iql_speed[reduce-overhead-backward] 26.2774ms 24.5889ms 40.6687 Ops/s 39.9563 Ops/s $\color{#35bf28}+1.78\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.1357ms 4.8065ms 208.0513 Ops/s 203.9316 Ops/s $\color{#35bf28}+2.02\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.9462ms 0.4838ms 2.0672 KOps/s 2.0386 KOps/s $\color{#35bf28}+1.40\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7791ms 0.4593ms 2.1771 KOps/s 2.1529 KOps/s $\color{#35bf28}+1.12\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 11.3703ms 4.8639ms 205.5948 Ops/s 207.6893 Ops/s $\color{#d91a1a}-1.01\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 3.1970ms 0.4776ms 2.0937 KOps/s 2.0897 KOps/s $\color{#35bf28}+0.19\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6802ms 0.4508ms 2.2185 KOps/s 2.1846 KOps/s $\color{#35bf28}+1.55\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.2236ms 1.5960ms 626.5739 Ops/s 620.5793 Ops/s $\color{#35bf28}+0.97\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 2.9218ms 1.5417ms 648.6498 Ops/s 647.7440 Ops/s $\color{#35bf28}+0.14\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.2016ms 4.9298ms 202.8470 Ops/s 202.1439 Ops/s $\color{#35bf28}+0.35\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.1668ms 0.6184ms 1.6171 KOps/s 1.5869 KOps/s $\color{#35bf28}+1.90\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8731ms 0.5945ms 1.6822 KOps/s 1.6708 KOps/s $\color{#35bf28}+0.68\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.6575ms 4.7537ms 210.3631 Ops/s 204.4457 Ops/s $\color{#35bf28}+2.89\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.3436ms 0.4773ms 2.0952 KOps/s 2.0536 KOps/s $\color{#35bf28}+2.03\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6844ms 0.4588ms 2.1794 KOps/s 2.0878 KOps/s $\color{#35bf28}+4.39\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.2108ms 4.6960ms 212.9476 Ops/s 207.6169 Ops/s $\color{#35bf28}+2.57\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.2778ms 0.4775ms 2.0942 KOps/s 2.0789 KOps/s $\color{#35bf28}+0.73\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6815ms 0.4519ms 2.2127 KOps/s 2.1826 KOps/s $\color{#35bf28}+1.38\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 5.1738ms 4.9085ms 203.7296 Ops/s 203.5404 Ops/s $\color{#35bf28}+0.09\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.7937ms 0.6176ms 1.6192 KOps/s 1.5941 KOps/s $\color{#35bf28}+1.58\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8730ms 0.5953ms 1.6797 KOps/s 1.6739 KOps/s $\color{#35bf28}+0.35\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 5.4898ms 4.2450ms 235.5688 Ops/s 246.9730 Ops/s $\color{#d91a1a}-4.62\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 7.9291ms 2.2961ms 435.5214 Ops/s 435.9204 Ops/s $\color{#d91a1a}-0.09\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 5.6126ms 1.3470ms 742.4035 Ops/s 758.8484 Ops/s $\color{#d91a1a}-2.17\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.3810s 11.8222ms 84.5868 Ops/s 34.7596 Ops/s $\textbf{\color{#35bf28}+143.35\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 8.5228ms 2.3296ms 429.2523 Ops/s 438.0452 Ops/s $\color{#d91a1a}-2.01\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 2.6082ms 1.2790ms 781.8624 Ops/s 771.2046 Ops/s $\color{#35bf28}+1.38\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.1623ms 4.4299ms 225.7367 Ops/s 215.6726 Ops/s $\color{#35bf28}+4.67\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 5.2475ms 2.3976ms 417.0897 Ops/s 414.2970 Ops/s $\color{#35bf28}+0.67\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 6.8078ms 1.4381ms 695.3645 Ops/s 663.9166 Ops/s $\color{#35bf28}+4.74\%$
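
For readers checking the table by hand: the Change column appears to be the relative difference between the "Ops" and "Ops on Repo HEAD" columns, and the rows rendered in bold appear to be those that move by more than about 5% in either direction. The count of bold green rows in the CPU table (20) matches the "Improved" figure in the summary line. The sketch below reproduces that arithmetic. The helper names and the 5% threshold are assumptions inferred from the rows above, not taken from the benchmark workflow itself.

```python
# Hypothetical helpers for re-deriving the "Change" column.
# The 5% threshold is an assumption inferred from which rows are bold.

UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3}


def to_ops_per_second(value: float, unit: str) -> float:
    """Normalize a throughput figure such as (2.0672, 'KOps/s') to ops/s."""
    return value * UNIT_SCALE[unit]


def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percentage change of the PR throughput relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change as improved/worsened only beyond the threshold."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# test_simple (CPU): 2.4524 Ops/s on the PR vs 2.3276 Ops/s on HEAD.
change = relative_change(
    to_ops_per_second(2.4524, "Ops/s"),
    to_ops_per_second(2.3276, "Ops/s"),
)
print(f"{change:+.2f}%", classify(change))  # +5.36% improved
```

Normalizing units first matters for rows where the two columns use different units, since the table mixes Ops/s and KOps/s.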

vmoens added the "CI" label Oct 11, 2024. (Label description: Has to do with CI setup, e.g. wheels & builds, tests...)
vmoens merged commit 194a5ff into main Oct 14, 2024
1 check passed

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 143. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}16$.

Columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change
test_simple 0.7276s 0.7271s 1.3754 Ops/s 1.3564 Ops/s $\color{#35bf28}+1.40\%$
test_transformed 1.0735s 0.9906s 1.0094 Ops/s 1.0364 Ops/s $\color{#d91a1a}-2.60\%$
test_serial 2.2416s 2.1492s 0.4653 Ops/s 0.4804 Ops/s $\color{#d91a1a}-3.15\%$
test_parallel 1.9617s 1.9196s 0.5209 Ops/s 0.5263 Ops/s $\color{#d91a1a}-1.03\%$
test_step_mdp_speed[True-True-True-True-True] 0.2423ms 39.1449μs 25.5461 KOps/s 25.7555 KOps/s $\color{#d91a1a}-0.81\%$
test_step_mdp_speed[True-True-True-True-False] 62.4810μs 22.8037μs 43.8525 KOps/s 43.9913 KOps/s $\color{#d91a1a}-0.32\%$
test_step_mdp_speed[True-True-True-False-True] 56.4310μs 21.2262μs 47.1115 KOps/s 47.6289 KOps/s $\color{#d91a1a}-1.09\%$
test_step_mdp_speed[True-True-True-False-False] 48.2710μs 12.2720μs 81.4863 KOps/s 81.4498 KOps/s $\color{#35bf28}+0.04\%$
test_step_mdp_speed[True-True-False-True-True] 87.9620μs 41.7455μs 23.9547 KOps/s 23.8944 KOps/s $\color{#35bf28}+0.25\%$
test_step_mdp_speed[True-True-False-True-False] 63.3610μs 25.3753μs 39.4084 KOps/s 39.4470 KOps/s $\color{#d91a1a}-0.10\%$
test_step_mdp_speed[True-True-False-False-True] 68.7510μs 24.0067μs 41.6550 KOps/s 40.3634 KOps/s $\color{#35bf28}+3.20\%$
test_step_mdp_speed[True-True-False-False-False] 43.4510μs 15.2228μs 65.6910 KOps/s 68.2409 KOps/s $\color{#d91a1a}-3.74\%$
test_step_mdp_speed[True-False-True-True-True] 78.8420μs 44.9340μs 22.2549 KOps/s 22.4977 KOps/s $\color{#d91a1a}-1.08\%$
test_step_mdp_speed[True-False-True-True-False] 58.2710μs 28.4568μs 35.1410 KOps/s 35.7716 KOps/s $\color{#d91a1a}-1.76\%$
test_step_mdp_speed[True-False-True-False-True] 56.1210μs 24.0213μs 41.6298 KOps/s 41.4929 KOps/s $\color{#35bf28}+0.33\%$
test_step_mdp_speed[True-False-True-False-False] 44.9310μs 15.2720μs 65.4793 KOps/s 67.0705 KOps/s $\color{#d91a1a}-2.37\%$
test_step_mdp_speed[True-False-False-True-True] 89.2610μs 47.1531μs 21.2075 KOps/s 21.4871 KOps/s $\color{#d91a1a}-1.30\%$
test_step_mdp_speed[True-False-False-True-False] 64.0010μs 30.5445μs 32.7391 KOps/s 32.6853 KOps/s $\color{#35bf28}+0.16\%$
test_step_mdp_speed[True-False-False-False-True] 59.1110μs 26.9325μs 37.1299 KOps/s 36.7042 KOps/s $\color{#35bf28}+1.16\%$
test_step_mdp_speed[True-False-False-False-False] 50.7110μs 17.8854μs 55.9114 KOps/s 56.5900 KOps/s $\color{#d91a1a}-1.20\%$
test_step_mdp_speed[False-True-True-True-True] 85.0020μs 44.4411μs 22.5017 KOps/s 22.2532 KOps/s $\color{#35bf28}+1.12\%$
test_step_mdp_speed[False-True-True-True-False] 59.1810μs 28.2137μs 35.4438 KOps/s 34.6865 KOps/s $\color{#35bf28}+2.18\%$
test_step_mdp_speed[False-True-True-False-True] 58.5710μs 29.0372μs 34.4386 KOps/s 34.0007 KOps/s $\color{#35bf28}+1.29\%$
test_step_mdp_speed[False-True-True-False-False] 2.6163ms 17.5525μs 56.9721 KOps/s 55.6431 KOps/s $\color{#35bf28}+2.39\%$
test_step_mdp_speed[False-True-False-True-True] 85.3410μs 47.1298μs 21.2180 KOps/s 20.8651 KOps/s $\color{#35bf28}+1.69\%$
test_step_mdp_speed[False-True-False-True-False] 62.0010μs 30.9902μs 32.2683 KOps/s 31.9320 KOps/s $\color{#35bf28}+1.05\%$
test_step_mdp_speed[False-True-False-False-True] 59.0610μs 31.4219μs 31.8250 KOps/s 31.5019 KOps/s $\color{#35bf28}+1.03\%$
test_step_mdp_speed[False-True-False-False-False] 46.3110μs 20.1716μs 49.5747 KOps/s 49.3988 KOps/s $\color{#35bf28}+0.36\%$
test_step_mdp_speed[False-False-True-True-True] 78.0920μs 50.4578μs 19.8186 KOps/s 20.0808 KOps/s $\color{#d91a1a}-1.31\%$
test_step_mdp_speed[False-False-True-True-False] 58.2210μs 33.4327μs 29.9109 KOps/s 29.5350 KOps/s $\color{#35bf28}+1.27\%$
test_step_mdp_speed[False-False-True-False-True] 62.9710μs 31.7063μs 31.5394 KOps/s 31.6516 KOps/s $\color{#d91a1a}-0.35\%$
test_step_mdp_speed[False-False-True-False-False] 47.8810μs 20.2291μs 49.4338 KOps/s 49.7299 KOps/s $\color{#d91a1a}-0.60\%$
test_step_mdp_speed[False-False-False-True-True] 87.2920μs 53.0456μs 18.8517 KOps/s 19.2430 KOps/s $\color{#d91a1a}-2.03\%$
test_step_mdp_speed[False-False-False-True-False] 73.7510μs 36.0181μs 27.7638 KOps/s 27.4490 KOps/s $\color{#35bf28}+1.15\%$
test_step_mdp_speed[False-False-False-False-True] 65.3210μs 34.1826μs 29.2546 KOps/s 29.3009 KOps/s $\color{#d91a1a}-0.16\%$
test_step_mdp_speed[False-False-False-False-False] 51.2210μs 23.0085μs 43.4621 KOps/s 44.3297 KOps/s $\color{#d91a1a}-1.96\%$
test_values[generalized_advantage_estimate-True-True] 24.1835ms 23.7603ms 42.0870 Ops/s 42.4789 Ops/s $\color{#d91a1a}-0.92\%$
test_values[vec_generalized_advantage_estimate-True-True] 99.6799ms 2.8751ms 347.8142 Ops/s 343.3351 Ops/s $\color{#35bf28}+1.30\%$
test_values[td0_return_estimate-False-False] 85.7910μs 64.1357μs 15.5919 KOps/s 15.4768 KOps/s $\color{#35bf28}+0.74\%$
test_values[td1_return_estimate-False-False] 53.8703ms 53.4533ms 18.7079 Ops/s 18.7482 Ops/s $\color{#d91a1a}-0.21\%$
test_values[vec_td1_return_estimate-False-False] 1.2866ms 1.0606ms 942.8257 Ops/s 942.5793 Ops/s $\color{#35bf28}+0.03\%$
test_values[td_lambda_return_estimate-True-False] 85.5319ms 84.9959ms 11.7653 Ops/s 11.7437 Ops/s $\color{#35bf28}+0.18\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.2556ms 1.0554ms 947.5129 Ops/s 942.4208 Ops/s $\color{#35bf28}+0.54\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 24.1023ms 23.6697ms 42.2481 Ops/s 42.6854 Ops/s $\color{#d91a1a}-1.02\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0189ms 0.7282ms 1.3733 KOps/s 1.3766 KOps/s $\color{#d91a1a}-0.24\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7752ms 0.6462ms 1.5475 KOps/s 1.5448 KOps/s $\color{#35bf28}+0.17\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5195ms 1.4591ms 685.3366 Ops/s 685.9378 Ops/s $\color{#d91a1a}-0.09\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7073ms 0.6608ms 1.5134 KOps/s 1.5118 KOps/s $\color{#35bf28}+0.11\%$
test_dqn_speed[False-None] 6.4704ms 1.3330ms 750.1609 Ops/s 684.8198 Ops/s $\textbf{\color{#35bf28}+9.54\%}$
test_dqn_speed[False-backward] 1.8688ms 1.8044ms 554.2112 Ops/s 547.3689 Ops/s $\color{#35bf28}+1.25\%$
test_dqn_speed[True-None] 1.0160ms 0.5641ms 1.7727 KOps/s 1.7682 KOps/s $\color{#35bf28}+0.25\%$
test_dqn_speed[True-backward] 1.0376ms 0.9900ms 1.0101 KOps/s 834.3452 Ops/s $\textbf{\color{#35bf28}+21.07\%}$
test_dqn_speed[reduce-overhead-None] 0.7998ms 0.5576ms 1.7933 KOps/s 1.7807 KOps/s $\color{#35bf28}+0.71\%$
test_dqn_speed[reduce-overhead-backward] 1.0452ms 0.9894ms 1.0107 KOps/s 1.0039 KOps/s $\color{#35bf28}+0.67\%$
test_ddpg_speed[False-None] 3.4447ms 2.7290ms 366.4293 Ops/s 371.7043 Ops/s $\color{#d91a1a}-1.42\%$
test_ddpg_speed[False-backward] 4.1717ms 3.9455ms 253.4565 Ops/s 257.5837 Ops/s $\color{#d91a1a}-1.60\%$
test_ddpg_speed[True-None] 1.5786ms 1.2397ms 806.6489 Ops/s 812.1161 Ops/s $\color{#d91a1a}-0.67\%$
test_ddpg_speed[True-backward] 2.2086ms 2.1737ms 460.0523 Ops/s 463.1647 Ops/s $\color{#d91a1a}-0.67\%$
test_ddpg_speed[reduce-overhead-None] 1.5874ms 1.2415ms 805.4594 Ops/s 823.4664 Ops/s $\color{#d91a1a}-2.19\%$
test_ddpg_speed[reduce-overhead-backward] 2.2383ms 2.1830ms 458.0898 Ops/s 463.0146 Ops/s $\color{#d91a1a}-1.06\%$
test_sac_speed[False-None] 7.7883ms 7.5263ms 132.8679 Ops/s 134.2976 Ops/s $\color{#d91a1a}-1.06\%$
test_sac_speed[False-backward] 11.1960ms 10.6854ms 93.5858 Ops/s 94.3745 Ops/s $\color{#d91a1a}-0.84\%$
test_sac_speed[True-None] 2.3363ms 2.0123ms 496.9548 Ops/s 489.7753 Ops/s $\color{#35bf28}+1.47\%$
test_sac_speed[True-backward] 5.0025ms 3.9690ms 251.9526 Ops/s 220.9916 Ops/s $\textbf{\color{#35bf28}+14.01\%}$
test_sac_speed[reduce-overhead-None] 2.3454ms 2.0082ms 497.9545 Ops/s 488.8571 Ops/s $\color{#35bf28}+1.86\%$
test_sac_speed[reduce-overhead-backward] 4.0111ms 3.8901ms 257.0635 Ops/s 255.1691 Ops/s $\color{#35bf28}+0.74\%$
test_redq_speed[False-None] 14.3688ms 9.9893ms 100.1075 Ops/s 96.5027 Ops/s $\color{#35bf28}+3.74\%$
test_redq_speed[False-backward] 17.7643ms 16.8536ms 59.3344 Ops/s 57.3851 Ops/s $\color{#35bf28}+3.40\%$
test_redq_speed[True-None] 3.8815ms 3.4219ms 292.2335 Ops/s 271.1600 Ops/s $\textbf{\color{#35bf28}+7.77\%}$
test_redq_speed[True-backward] 8.3556ms 7.9182ms 126.2913 Ops/s 113.8493 Ops/s $\textbf{\color{#35bf28}+10.93\%}$
test_redq_speed[reduce-overhead-None] 3.7720ms 3.3884ms 295.1239 Ops/s 279.8310 Ops/s $\textbf{\color{#35bf28}+5.47\%}$
test_redq_speed[reduce-overhead-backward] 9.0029ms 8.2399ms 121.3607 Ops/s 119.4786 Ops/s $\color{#35bf28}+1.58\%$
test_redq_deprec_speed[False-None] 11.7817ms 10.9731ms 91.1319 Ops/s 95.3335 Ops/s $\color{#d91a1a}-4.41\%$
test_redq_deprec_speed[False-backward] 16.8502ms 15.8777ms 62.9812 Ops/s 65.5197 Ops/s $\color{#d91a1a}-3.87\%$
test_redq_deprec_speed[True-None] 3.4854ms 3.2780ms 305.0679 Ops/s 313.3648 Ops/s $\color{#d91a1a}-2.65\%$
test_redq_deprec_speed[True-backward] 8.1439ms 7.3454ms 136.1397 Ops/s 148.5254 Ops/s $\textbf{\color{#d91a1a}-8.34\%}$
test_redq_deprec_speed[reduce-overhead-None] 3.8025ms 3.2660ms 306.1894 Ops/s 319.4470 Ops/s $\color{#d91a1a}-4.15\%$
test_redq_deprec_speed[reduce-overhead-backward] 7.4352ms 7.0826ms 141.1920 Ops/s 145.4006 Ops/s $\color{#d91a1a}-2.89\%$
test_td3_speed[False-None] 7.5582ms 7.4812ms 133.6685 Ops/s 132.8591 Ops/s $\color{#35bf28}+0.61\%$
test_td3_speed[False-backward] 10.7378ms 10.2302ms 97.7502 Ops/s 95.8352 Ops/s $\color{#35bf28}+2.00\%$
test_td3_speed[True-None] 1.9112ms 1.8681ms 535.3045 Ops/s 523.8606 Ops/s $\color{#35bf28}+2.18\%$
test_td3_speed[True-backward] 3.7338ms 3.6181ms 276.3917 Ops/s 255.5671 Ops/s $\textbf{\color{#35bf28}+8.15\%}$
test_td3_speed[reduce-overhead-None] 1.9244ms 1.8785ms 532.3443 Ops/s 517.2228 Ops/s $\color{#35bf28}+2.92\%$
test_td3_speed[reduce-overhead-backward] 3.7464ms 3.6674ms 272.6703 Ops/s 275.3038 Ops/s $\color{#d91a1a}-0.96\%$
test_cql_speed[False-None] 27.5482ms 24.3727ms 41.0294 Ops/s 40.7060 Ops/s $\color{#35bf28}+0.79\%$
test_cql_speed[False-backward] 37.0274ms 33.7851ms 29.5988 Ops/s 28.7252 Ops/s $\color{#35bf28}+3.04\%$
test_cql_speed[True-None] 12.2827ms 10.6205ms 94.1577 Ops/s 94.1740 Ops/s $\color{#d91a1a}-0.02\%$
test_cql_speed[True-backward] 16.6471ms 16.1969ms 61.7402 Ops/s 61.8996 Ops/s $\color{#d91a1a}-0.26\%$
test_cql_speed[reduce-overhead-None] 11.1462ms 10.6060ms 94.2863 Ops/s 94.0206 Ops/s $\color{#35bf28}+0.28\%$
test_cql_speed[reduce-overhead-backward] 16.6536ms 16.2419ms 61.5692 Ops/s 61.3273 Ops/s $\color{#35bf28}+0.39\%$
test_a2c_speed[False-None] 7.6275ms 5.2829ms 189.2904 Ops/s 190.8313 Ops/s $\color{#d91a1a}-0.81\%$
test_a2c_speed[False-backward] 12.0243ms 11.6848ms 85.5816 Ops/s 86.4567 Ops/s $\color{#d91a1a}-1.01\%$
test_a2c_speed[True-None] 3.8784ms 3.0646ms 326.3065 Ops/s 324.8162 Ops/s $\color{#35bf28}+0.46\%$
test_a2c_speed[True-backward] 8.9721ms 8.4859ms 117.8429 Ops/s 106.8485 Ops/s $\textbf{\color{#35bf28}+10.29\%}$
test_a2c_speed[reduce-overhead-None] 3.1766ms 3.0050ms 332.7783 Ops/s 329.6035 Ops/s $\color{#35bf28}+0.96\%$
test_a2c_speed[reduce-overhead-backward] 8.5484ms 8.3422ms 119.8720 Ops/s 119.5120 Ops/s $\color{#35bf28}+0.30\%$
test_ppo_speed[False-None] 5.7363ms 5.5092ms 181.5151 Ops/s 180.7238 Ops/s $\color{#35bf28}+0.44\%$
test_ppo_speed[False-backward] 12.5317ms 12.1106ms 82.5724 Ops/s 84.0130 Ops/s $\color{#d91a1a}-1.71\%$
test_ppo_speed[True-None] 3.6647ms 3.4252ms 291.9574 Ops/s 281.3875 Ops/s $\color{#35bf28}+3.76\%$
test_ppo_speed[True-backward] 8.4658ms 8.1619ms 122.5203 Ops/s 122.5777 Ops/s $\color{#d91a1a}-0.05\%$
test_ppo_speed[reduce-overhead-None] 3.5532ms 3.3655ms 297.1293 Ops/s 293.7204 Ops/s $\color{#35bf28}+1.16\%$
test_ppo_speed[reduce-overhead-backward] 9.4648ms 8.2932ms 120.5810 Ops/s 118.1339 Ops/s $\color{#35bf28}+2.07\%$
test_reinforce_speed[False-None] 5.0376ms 4.4627ms 224.0802 Ops/s 222.6708 Ops/s $\color{#35bf28}+0.63\%$
test_reinforce_speed[False-backward] 7.8913ms 7.3034ms 136.9232 Ops/s 137.0172 Ops/s $\color{#d91a1a}-0.07\%$
test_reinforce_speed[True-None] 2.5620ms 2.2186ms 450.7369 Ops/s 446.4033 Ops/s $\color{#35bf28}+0.97\%$
test_reinforce_speed[True-backward] 7.9276ms 7.0998ms 140.8482 Ops/s 143.2868 Ops/s $\color{#d91a1a}-1.70\%$
test_reinforce_speed[reduce-overhead-None] 2.5985ms 2.2338ms 447.6701 Ops/s 450.3925 Ops/s $\color{#d91a1a}-0.60\%$
test_reinforce_speed[reduce-overhead-backward] 7.1973ms 6.9786ms 143.2960 Ops/s 141.9873 Ops/s $\color{#35bf28}+0.92\%$
test_iql_speed[False-None] 19.4886ms 19.0916ms 52.3790 Ops/s 50.3055 Ops/s $\color{#35bf28}+4.12\%$
test_iql_speed[False-backward] 29.8036ms 29.1488ms 34.3068 Ops/s 33.3066 Ops/s $\color{#35bf28}+3.00\%$
test_iql_speed[True-None] 8.1019ms 7.6792ms 130.2221 Ops/s 123.7056 Ops/s $\textbf{\color{#35bf28}+5.27\%}$
test_iql_speed[True-backward] 16.5854ms 16.1086ms 62.0788 Ops/s 58.8256 Ops/s $\textbf{\color{#35bf28}+5.53\%}$
test_iql_speed[reduce-overhead-None] 7.9262ms 7.6938ms 129.9740 Ops/s 127.4580 Ops/s $\color{#35bf28}+1.97\%$
test_iql_speed[reduce-overhead-backward] 16.5576ms 16.1468ms 61.9319 Ops/s 60.7474 Ops/s $\color{#35bf28}+1.95\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3746ms 6.2883ms 159.0244 Ops/s 159.1687 Ops/s $\color{#d91a1a}-0.09\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 2.3140ms 0.2791ms 3.5829 KOps/s 4.0468 KOps/s $\textbf{\color{#d91a1a}-11.46\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5890ms 0.2622ms 3.8142 KOps/s 4.7618 KOps/s $\textbf{\color{#d91a1a}-19.90\%}$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4404ms 6.1975ms 161.3563 Ops/s 163.9015 Ops/s $\color{#d91a1a}-1.55\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.1869ms 0.2991ms 3.3430 KOps/s 3.5252 KOps/s $\textbf{\color{#d91a1a}-5.17\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5028ms 0.2819ms 3.5480 KOps/s 3.6666 KOps/s $\color{#d91a1a}-3.24\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5506ms 1.3304ms 751.6544 Ops/s 864.4784 Ops/s $\textbf{\color{#d91a1a}-13.05\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5757ms 1.2845ms 778.4899 Ops/s 900.7676 Ops/s $\textbf{\color{#d91a1a}-13.57\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5090ms 6.4247ms 155.6497 Ops/s 160.1296 Ops/s $\color{#d91a1a}-2.80\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.9866ms 0.4446ms 2.2493 KOps/s 2.4239 KOps/s $\textbf{\color{#d91a1a}-7.20\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7265ms 0.4622ms 2.1635 KOps/s 2.6779 KOps/s $\textbf{\color{#d91a1a}-19.21\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4172ms 6.2267ms 160.5988 Ops/s 163.3819 Ops/s $\color{#d91a1a}-1.70\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.8900ms 0.3530ms 2.8326 KOps/s 4.3731 KOps/s $\textbf{\color{#d91a1a}-35.23\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5245ms 0.3044ms 3.2850 KOps/s 4.8376 KOps/s $\textbf{\color{#d91a1a}-32.10\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.5758ms 6.1939ms 161.4483 Ops/s 166.3347 Ops/s $\color{#d91a1a}-2.94\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8389ms 0.3157ms 3.1676 KOps/s 4.3782 KOps/s $\textbf{\color{#d91a1a}-27.65\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.4235ms 0.2346ms 4.2619 KOps/s 4.8195 KOps/s $\textbf{\color{#d91a1a}-11.57\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5548ms 6.4136ms 155.9179 Ops/s 161.3477 Ops/s $\color{#d91a1a}-3.37\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.2406ms 0.4205ms 2.3780 KOps/s 2.1572 KOps/s $\textbf{\color{#35bf28}+10.23\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6523ms 0.4215ms 2.3726 KOps/s 2.5318 KOps/s $\textbf{\color{#d91a1a}-6.29\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.9280ms 5.3618ms 186.5033 Ops/s 35.7723 Ops/s $\textbf{\color{#35bf28}+421.36\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 7.3706ms 2.1784ms 459.0500 Ops/s 493.0203 Ops/s $\textbf{\color{#d91a1a}-6.89\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 7.9173ms 1.2303ms 812.7943 Ops/s 877.8830 Ops/s $\textbf{\color{#d91a1a}-7.41\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4348s 14.0215ms 71.3191 Ops/s 186.0087 Ops/s $\textbf{\color{#d91a1a}-61.66\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 9.2796ms 1.9994ms 500.1624 Ops/s 518.6367 Ops/s $\color{#d91a1a}-3.56\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.2359ms 1.1721ms 853.2039 Ops/s 808.9395 Ops/s $\textbf{\color{#35bf28}+5.47\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.2970ms 5.5658ms 179.6676 Ops/s 178.8907 Ops/s $\color{#35bf28}+0.43\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 9.4795ms 2.1717ms 460.4636 Ops/s 462.3670 Ops/s $\color{#d91a1a}-0.41\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 6.8990ms 1.3474ms 742.1927 Ops/s 734.9900 Ops/s $\color{#35bf28}+0.98\%$
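
For readers checking the numbers: the Change column appears to be the relative difference between the PR's Ops and the Ops on Repo HEAD. The sketch below reproduces that arithmetic for one row. The helper name `percent_change` is hypothetical and is not taken from the benchmark tooling; the formula is an assumption inferred from the table values, and both inputs must be in the same unit (Ops/s or KOps/s).

```python
def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative change of PR throughput versus the baseline, in percent.

    Assumed formula (inferred from the table, not from the CI scripts):
    (ops_pr / ops_head - 1) * 100.
    """
    if ops_head <= 0:
        raise ValueError("baseline throughput must be positive")
    return (ops_pr / ops_head - 1.0) * 100.0


# Values copied from the test_a2c_speed[True-backward] row above.
change = percent_change(117.8429, 106.8485)
print(f"{change:+.2f}%")  # prints +10.29%
```

The same calculation on the `test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]` row (71.3191 vs. 186.0087 Ops/s) gives about -61.66%, matching the table.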

@vmoens deleted the fix-benchmarks branch on October 17, 2024 at 13:58
Labels: CI, CLA Signed