
Independence of the step-size and stochastic gradient? #8

Algue-Rythme opened this issue Apr 27, 2021 · 2 comments

@Algue-Rythme

Dear authors,
Thanks for this work.

According to the paper, Appendix F.1 on page 25: "To enforce independence of the step-size and stochastic gradient, we perform a backtracking line-search at the current iterate w_k using a mini-batch of examples that is independent of the mini-batch on which ∇f_ik(w_k) is evaluated."

I am not sure I understand correctly:

  • do you perform all the Armijo line-search computations with a batch i (and its gradient) to find the learning rate eta, before using eta to perform the gradient step with the gradient of a different batch j?

I ask because I read the implementation of Sls, and it seems that you are using the same batch (x, y) for both the update and the line search, contrary to what the paper specifies. My understanding is that you use the closure() function to perform the Armijo search, and the last Armijo iterate is actually used as the final step (still with the same closure() function).
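To make sure we are talking about the same thing, here is a minimal, self-contained sketch of the scheme I have in mind (the helper names `armijo_backtracking` and `step_with_independent_batches` are mine, not from your code): the Armijo search runs entirely on one batch to pick eta, and the update then uses the gradient of an independent batch. Passing the same batch twice recovers what I currently see in the Sls implementation.

```python
import torch


def armijo_backtracking(loss_fn, params, grad, eta=1.0, c=0.1, beta=0.5, max_iters=20):
    """Shrink eta until f(w - eta*g) <= f(w) - c * eta * ||g||^2 holds."""
    with torch.no_grad():
        f0 = loss_fn()
        grad_norm_sq = sum((g ** 2).sum() for g in grad)
        for _ in range(max_iters):
            # Try the candidate step w - eta * g ...
            for p, g in zip(params, grad):
                p -= eta * g
            f_trial = loss_fn()
            # ... and undo it before deciding whether to shrink eta.
            for p, g in zip(params, grad):
                p += eta * g
            if f_trial <= f0 - c * eta * grad_norm_sq:
                break
            eta *= beta
    return eta


def step_with_independent_batches(model, loss_fn, batch_i, batch_j):
    """Pick eta with an Armijo search on batch_i only, then take the step
    with the gradient of batch_j (pass the same batch twice to recover the
    'same batch' behaviour)."""
    params = list(model.parameters())

    # Line search on batch i: its loss and its gradient define the search.
    x_i, y_i = batch_i
    closure_i = lambda: loss_fn(model(x_i), y_i)
    grad_i = torch.autograd.grad(closure_i(), params)
    eta = armijo_backtracking(closure_i, params, grad_i)

    # Update with the gradient of the independent batch j, scaled by that eta.
    x_j, y_j = batch_j
    grad_j = torch.autograd.grad(loss_fn(model(x_j), y_j), params)
    with torch.no_grad():
        for p, g in zip(params, grad_j):
            p -= eta * g
    return eta
```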

Is there anywhere else in the code where you use this trick of forcing independence between eta and the gradient?

Thank you very much

@Algue-Rythme (Author)

Dear authors,
@IssamLaradji
I am still interested in this topic and I would be glad if you could answer my concern.
Thank you

@Saurabh-29

Hi Algue,

Yes, your understanding of the implementation is right. SLS uses the same batch (x, y) for both the update and the line search, which is in line with the description of SLS in the paper. The main paper states that the line search is performed on the same batch (x, y) on which the gradients are computed. The closure is a proxy for the loss function and returns the loss at the current parameters.
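For reference, the usage pattern looks roughly like this (a minimal sketch; the model, data, and loss here are only illustrative, not taken from the repo): a single mini-batch (x, y) defines the closure, and step() uses that same closure for both the Armijo backtracking and the accepted update.

```python
import torch
import torch.nn.functional as F
import sls  # the Sls optimizer discussed in this issue

model = torch.nn.Linear(10, 2)
opt = sls.Sls(model.parameters())

# Illustrative data; any DataLoader yielding (x, y) batches works the same way.
train_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,))),
    batch_size=16,
)

for x, y in train_loader:
    opt.zero_grad()
    # The closure is just a loss proxy: it re-evaluates the loss of this same
    # batch (x, y) at whatever parameters the optimizer currently holds.
    closure = lambda: F.cross_entropy(model(x), y)
    opt.step(closure)  # the line search and the update both use batch (x, y)
```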

Regards,
