diff --git a/docs/index.html b/docs/index.html
index 62c4f8d6..d68208e0 100644
--- a/docs/index.html
+++ b/docs/index.html
@@ -115,7 +115,8 @@
+ We obtain upper confidence bounds on the cumulative distribution function (CDF) of the total reward obtained by diffusion policies in out-of-distribution robosuite environments.
+ An upper confidence bound on the CDF can be interpreted as the worst-case distribution of reward that is consistent with the observed policy rollouts.
+ Here we show representative policy rollouts for the Square environment, and plot the in-distribution CDF of reward and our upper confidence bound constructed from 40 out-of-distribution policy rollouts.
+ The confidence bounds we obtain quantify our uncertainty in the performance of the robot in a concrete and interpretable manner.
+
+We obtain lower confidence bounds on the success rate of a diffusion policy tested in two out-of-distribution environments.
- The confidence bounds we obtain make the most efficient use of the 50 samples used to estimate the performance of the robot.
+ The confidence bounds we obtain make the most efficient use of the 50 policy rollouts used to estimate the performance of the robot.
  The confidence bounds we obtain quantify our uncertainty in the performance of the robot in a concrete and interpretable manner.