
Add Hyperparam Optimization for GEKPLS Theta #366

Closed
vikram-s-narayan opened this issue Jun 30, 2022 · 6 comments

@vikram-s-narayan
Contributor

At present, in GEKPLS, the initial theta is supplied as a real number by the user. This theta value is repeated in an array whose length equals the number of components, which is also supplied by the user. For example, if the number of components is 2 and the initial theta value is 0.01, then an array is constructed like so: [0.01, 0.01].

The theta values need to be optimized (similar to what is done in the Python version) using an appropriate optimizer, with the objective of maximizing the reduced_likelihood_function_value.
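For concreteness, a minimal sketch of the array construction described above; the variable names are illustrative, not the actual API:

```julia
# Current behavior: a single user-supplied theta is repeated once
# per component (names here are illustrative).
theta0 = 0.01
n_comp = 2
thetas = fill(theta0, n_comp)   # -> [0.01, 0.01]
```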

@vikram-s-narayan vikram-s-narayan changed the title Add Hyperparam Optimization for GEKPLS Add Hyperparam Optimization for GEKPLS Theta Jun 30, 2022
@vikram-s-narayan
Contributor Author

I have a quick question about which optimizer package to choose for hyperparameter optimization for theta:

What is needed is a relatively lightweight (i.e. very few package dependencies) and stable (i.e. will not suddenly break) tool to simply find the theta values that maximize our reduced_likelihood_function_value.

Some things to note about our specific requirements are:

  1. Our theta array will usually consist of very few elements (as these are usually the principal components projected down from the full-dimensional space).
  2. The SMT code docstring suggests that, in practice, theta values between 0 and 20 give good results.

Hence, would it be okay to write our own super-simple optimizer that simply tries out various theta value combinations until the highest reduced_likelihood_function_value is found or a set number of iterations is reached (a rough sketch of this idea is given below)?

Or would it be better to go with an existing, ready-made package? If so, are there any recommended packages to try first?
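A rough sketch of the super-simple optimizer idea from the first question, in plain Julia; `rlfv_toy` here is a hypothetical stand-in for the model's reduced_likelihood_function_value:

```julia
# Try every combination of candidate theta values and keep the one with
# the highest objective value. `rlfv` is passed in as a function.
function grid_search_theta(rlfv, candidates, n_comp)
    best_thetas, best_val = nothing, -Inf
    for combo in Iterators.product(ntuple(_ -> candidates, n_comp)...)
        thetas = collect(combo)
        val = rlfv(thetas)
        if val > best_val
            best_thetas, best_val = thetas, val
        end
    end
    return best_thetas, best_val
end

# Toy stand-in objective, maximized at thetas = [1e-3, 1e-3]; candidates
# drawn from the (0, 20] range mentioned above.
rlfv_toy(thetas) = -sum(abs2, log10.(thetas) .+ 3)
grid_search_theta(rlfv_toy, [1e-4, 1e-3, 1e-2, 0.1, 1.0, 10.0], 2)
```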

@ChrisRackauckas
Member

Before getting there: can the GEKPLS be made differentiable? If so, differentiable hyperparameter optimization will be much faster than non-differentiable approaches, and that would completely change the algorithms one would use.
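For illustration, if a differentiable objective were available, the optimization could use gradients. A minimal sketch with Optim.jl, where `neg_rlfv` is a toy stand-in for the negative reduced_likelihood_function_value:

```julia
using Optim

# Toy differentiable stand-in for the negative reduced likelihood;
# the real objective would come from the GEKPLS internals.
neg_rlfv(thetas) = sum(abs2, thetas .- 0.01)

theta0 = fill(0.5, 2)                       # initial guess, one entry per component
res = optimize(neg_rlfv, theta0, LBFGS();
               autodiff = :forward)         # gradients via ForwardDiff
best_thetas = Optim.minimizer(res)          # -> approximately [0.01, 0.01]
```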

@vikram-s-narayan
Contributor Author

> can the GEKPLS be made differentiable?

I will research this option.

@vikram-s-narayan
Contributor Author

vikram-s-narayan commented Jul 7, 2022

The output of the reduced_likelihood_function_value, both in the Julia version of GEKPLS and in the SMT code from which GEKPLS.jl is adapted, does not appear to have any clear correlation with the root mean squared error. For example, in the SMT code, we see the following outputs:

| theta values | rlfv | rmse | remarks |
| --- | --- | --- | --- |
| [1.48487442e-02, 1.04230276e-06] | 2761.35 | 0.36 | value found by the COBYLA optimizer in SMT |
| [0.01, 0.01] | 2753.81 | 0.40 | |
| [0.001, 0.001] | 2598.12 | 0.06 | |
| [0.0001, 0.0001] | 2385.86 | 0.01 | |

The gist of the code that produced these results is available here.

This means that weaving hyperparameter optimization into the GEKPLS code will often result in sub-optimal performance.

Other disadvantages of building hyperparameter optimization into GEKPLS are:

  1. Code execution time can go up.
  2. Users may not have control over which theta hyperparameters are used.

Hence, it may be better to allow users to supply theta as a hyperparameter when they construct the GEKPLS model; users can then choose to optimize hyperparameters on their own, outside of the system.
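For illustration, usage along those lines might look like the sketch below; the constructor signature, argument names, and data are assumptions, not the settled GEKPLS.jl API:

```julia
using Surrogates

# Hypothetical interface: theta is chosen (or optimized externally) by the
# user and passed in at construction time. All shapes/names are illustrative.
n_comp, delta_x, extra_points = 2, 0.0001, 2
lb, ub = [0.0, 0.0], [10.0, 10.0]
x = [(10rand(), 10rand()) for _ in 1:20]          # training points
y = [xi[1]^2 + xi[2]^2 for xi in x]               # function values
grads = vcat([[2xi[1] 2xi[2]] for xi in x]...)    # gradients, one row per point
theta = fill(0.01, n_comp)                        # user-supplied hyperparameter
model = GEKPLS(x, y, grads, n_comp, delta_x, lb, ub, extra_points, theta)
```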

@ChrisRackauckas
Member

> Hence, it may be better to allow users to supply theta as a hyperparameter when they construct the GEKPLS model; users can then choose to optimize hyperparameters on their own, outside of the system.

Agreed.

> value found by the COBYLA optimizer in SMT

That's part of the issue. COBYLA is a derivative-free optimizer, so it will generally have trouble converging beyond a certain accuracy. Derivative-based methods are a lot more stable in how they converge, so I wouldn't be surprised if switching to one improved the hyperparameter optimization here. I still think users should have the option to provide the hyperparameter, and then we should have a hyperparameter optimization mode for when it's not provided.

BTW, this issue seems very parallel to #368, just on GEKPLS instead of Kriging.
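A small sketch of that fallback pattern, with hypothetical names: use the caller's theta when it is supplied, and only enter a hyperparameter-optimization mode when it is not:

```julia
# Stand-in for a real optimizer that would maximize the reduced likelihood.
optimize_theta(n_comp) = fill(0.01, n_comp)

# Hypothetical helper: honor a user-supplied theta, otherwise optimize.
resolve_theta(theta, n_comp) = theta === nothing ? optimize_theta(n_comp) : theta

resolve_theta(nothing, 2)      # -> [0.01, 0.01] from the (stand-in) optimizer
resolve_theta([0.5, 0.5], 2)   # -> [0.5, 0.5], the user's values untouched
```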

@vikram-s-narayan
Contributor Author

vikram-s-narayan commented Jul 9, 2022

Closing this issue: a decision has been made, along with a minor fix to facilitate that direction.
