
Add Hyperparam Optimization for GEKPLS Theta #366

Closed
vikram-s-narayan opened this issue Jun 30, 2022 · 6 comments

@vikram-s-narayan
Contributor

At present, in GEKPLS, the initial theta is supplied as a real number by the user. This theta value is repeated in an array whose length equals the number of components, which is also supplied by the user. For example, if the number of components is 2 and the initial theta value is 0.01, then an array is constructed like so: [0.01, 0.01].

The theta values need to be optimized (similar to what is done in the Python version) using an appropriate optimizer, with the objective of maximizing the reduced_likelihood_function_value.
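For concreteness, a minimal sketch of the array construction described above; the variable names are illustrative, not the actual API:

```julia
# Current behavior: a single user-supplied theta is repeated once
# per component (names here are illustrative).
theta0 = 0.01
n_comp = 2
thetas = fill(theta0, n_comp)   # -> [0.01, 0.01]
```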

@vikram-s-narayan vikram-s-narayan changed the title Add Hyperparam Optimization for GEKPLS Add Hyperparam Optimization for GEKPLS Theta Jun 30, 2022
@vikram-s-narayan
Contributor Author

I have a quick question about which optimizer package to choose for hyperparameter optimization for theta:

What is needed is a relatively lightweight (i.e. very few package dependencies) and stable (i.e. will not suddenly break) tool to simply find the theta values that maximize our reduced_likelihood_function_value.

Some things to note about our specific requirements are:

  1. Our theta array will usually consist of very few elements (as these are usually the principal components projected down from the full-dimensional space).
  2. The SMT code docstring suggests that, in practice, theta values between 0 and 20 give good results.

Hence, would it be okay to write our own super-simple optimizer that simply tries out various theta value combinations until the highest reduced_likelihood_function_value is found or a set number of iterations is reached (a rough sketch of this idea is given below)?

Or would it be better to go with an existing, ready-made package? If so, are there any recommended packages to try first?
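A rough sketch of the super-simple optimizer idea from the first question, in plain Julia; `rlfv_toy` here is a hypothetical stand-in for the model's reduced_likelihood_function_value:

```julia
# Try every combination of candidate theta values and keep the one with
# the highest objective value. `rlfv` is passed in as a function.
function grid_search_theta(rlfv, candidates, n_comp)
    best_thetas, best_val = nothing, -Inf
    for combo in Iterators.product(ntuple(_ -> candidates, n_comp)...)
        thetas = collect(combo)
        val = rlfv(thetas)
        if val > best_val
            best_thetas, best_val = thetas, val
        end
    end
    return best_thetas, best_val
end

# Toy stand-in objective, maximized at thetas = [1e-3, 1e-3]; candidates
# drawn from the (0, 20] range mentioned above.
rlfv_toy(thetas) = -sum(abs2, log10.(thetas) .+ 3)
grid_search_theta(rlfv_toy, [1e-4, 1e-3, 1e-2, 0.1, 1.0, 10.0], 2)
```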

@ChrisRackauckas
Member

Before getting there: can the GEKPLS be made differentiable? If so, differentiable hyperparameter optimization will be much faster than non-differentiable approaches, and that would completely change the algorithms one would use.
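For illustration, if a differentiable objective were available, the optimization could use gradients. A minimal sketch with Optim.jl, where `neg_rlfv` is a toy stand-in for the negative reduced_likelihood_function_value:

```julia
using Optim

# Toy differentiable stand-in for the negative reduced likelihood;
# the real objective would come from the GEKPLS internals.
neg_rlfv(thetas) = sum(abs2, thetas .- 0.01)

theta0 = fill(0.5, 2)                       # initial guess, one entry per component
res = optimize(neg_rlfv, theta0, LBFGS();
               autodiff = :forward)         # gradients via ForwardDiff
best_thetas = Optim.minimizer(res)          # -> approximately [0.01, 0.01]
```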

@vikram-s-narayan
Contributor Author

> can the GEKPLS be made differentiable?

I will research this option.

@vikram-s-narayan
Contributor Author

vikram-s-narayan commented Jul 7, 2022

The output of the reduced_likelihood_function_value, both in the Julia version of GEKPLS and in the SMT code from which GEKPLS.jl is adapted, does not appear to have any clear correlation with the root mean squared error. For example, in the SMT code, we see the following outputs:

| theta values | rlfv | rmse | remarks |
| --- | --- | --- | --- |
| [1.48487442e-02, 1.04230276e-06] | 2761.35 | 0.36 | value found by the COBYLA optimizer in SMT |
| [0.01, 0.01] | 2753.81 | 0.40 | |
| [0.001, 0.001] | 2598.12 | 0.06 | |
| [0.0001, 0.0001] | 2385.86 | 0.01 | |

The gist of the code that produced these results is available here.

This means that weaving hyperparameter optimization into the GEKPLS code will often result in sub-optimal performance.

Other disadvantages of building hyperparameter optimization into GEKPLS are:

  1. Code execution time can go up.
  2. Users may not have control over which theta hyperparameters are used.

Hence, it may be better to allow users to supply theta as a hyperparameter when they construct the GEKPLS model; users can then choose to optimize hyperparameters on their own, outside of the system.
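For illustration, usage along those lines might look like the sketch below; the constructor signature, argument names, and data are assumptions, not the settled GEKPLS.jl API:

```julia
using Surrogates

# Hypothetical interface: theta is chosen (or optimized externally) by the
# user and passed in at construction time. All shapes/names are illustrative.
n_comp, delta_x, extra_points = 2, 0.0001, 2
lb, ub = [0.0, 0.0], [10.0, 10.0]
x = [(10rand(), 10rand()) for _ in 1:20]          # training points
y = [xi[1]^2 + xi[2]^2 for xi in x]               # function values
grads = vcat([[2xi[1] 2xi[2]] for xi in x]...)    # gradients, one row per point
theta = fill(0.01, n_comp)                        # user-supplied hyperparameter
model = GEKPLS(x, y, grads, n_comp, delta_x, lb, ub, extra_points, theta)
```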

@ChrisRackauckas
Member

> Hence, it may be better to allow users to supply theta as a hyperparameter when they construct the GEKPLS model; users can then choose to optimize hyperparameters on their own, outside of the system.

Agreed.

> value found by the COBYLA optimizer in SMT

That's part of the issue. COBYLA is a derivative-free optimizer, so it will generally have trouble converging beyond a certain accuracy. Derivative-based methods are a lot more stable in how they converge, so I wouldn't be surprised if switching to one improved the hyperparameter optimization here. I still think users should have the option to provide the hyperparameter, and then we should have a hyperparameter optimization mode for when it's not provided.

BTW, this issue seems very parallel to #368, just on GEKPLS instead of Kriging.
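A small sketch of that fallback pattern, with hypothetical names: use the caller's theta when it is supplied, and only enter a hyperparameter-optimization mode when it is not:

```julia
# Stand-in for a real optimizer that would maximize the reduced likelihood.
optimize_theta(n_comp) = fill(0.01, n_comp)

# Hypothetical helper: honor a user-supplied theta, otherwise optimize.
resolve_theta(theta, n_comp) = theta === nothing ? optimize_theta(n_comp) : theta

resolve_theta(nothing, 2)      # -> [0.01, 0.01] from the (stand-in) optimizer
resolve_theta([0.5, 0.5], 2)   # -> [0.5, 0.5], the user's values untouched
```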

@vikram-s-narayan
Contributor Author

vikram-s-narayan commented Jul 9, 2022

Closing this issue: a decision has been made, along with a minor fix to facilitate that direction.
