
EM slow convergence #28

Open
gaow opened this issue Apr 24, 2020 · 6 comments
@gaow
Member

gaow commented Apr 24, 2020

I've discussed offline with @stephens999 this notebook:

https://stephenslab.github.io/mmbr/prototypes/em_optim_est_prior.html

I'm posting it here for better book-keeping of the discussion. There are several issues with this notebook:

  1. The estimated prior variance scalar looks odd when the model is completely mis-specified -- with both EM and optim, though in different ways.
  2. There are overlapping CS (credible set) issues.
  3. Using a mixture prior certainly helps compared to a mis-specified fixed prior, but to my unpleasant surprise it does not completely resolve the problem.

In a way this is okay, because every CS does contain a causal signal. But the CS overlap problem is now more severe than what we saw when investigating univariate SuSiE. The estimated prior effect sizes for some CS are off. And finally, in the presence of overlapping CS our marginal PIP needs to be adjusted, which we do not implement for now.
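For context on why overlapping CS distort marginal PIPs: the standard SuSiE-style marginal PIP combines the per-effect inclusion probabilities, so a signal captured by two overlapping single effects gets double-counted. A minimal numpy sketch with made-up `alpha` values (illustrative only, not from the notebook):

```python
import numpy as np

# SuSiE-style marginal PIP from single-effect inclusion probabilities
# alpha (L effects x p variables): PIP_j = 1 - prod_l (1 - alpha_lj)
alpha = np.array([
    [0.9, 0.1, 0.0],   # effect 1 concentrates on variable 1
    [0.9, 0.1, 0.0],   # effect 2 duplicates effect 1 (overlapping CS)
])
pip = 1 - np.prod(1 - alpha, axis=0)
print(pip)  # variable 1's PIP is inflated to 0.99 by the duplicated effect
```

With the two effects treating the same signal as independent, variable 1's PIP is 1 - 0.1 x 0.1 = 0.99 even though only one causal signal is present, which is the kind of adjustment the comment above refers to.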

Any suggestions?

@gaow
Member Author

gaow commented Apr 25, 2020

@stephens999 as I looked into this problem further I noticed it is a convergence issue that escaped my attention, because I did not output a warning message when IBSS exceeds the maximum number of iterations. After increasing the number of iterations, the results are better:

https://stephenslab.github.io/mmbr/prototypes/em_optim_est_prior.html#Updated-results:-more-EM-iterations-15

Here the EM and optim estimates still differ, but less so. EM took >900 iterations to converge, yet the result is actually better than optim: higher ELBO and no overlapping CS. With a naive mixture prior, IBSS converged in under 600 iterations and all 3 simulated signals are cleanly captured.
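The silent-non-convergence trap above is easy to avoid with an explicit warning in the outer loop. A schematic of an IBSS-style loop with ELBO-based stopping and a max-iteration warning (illustrative sketch, not mmbr's actual code; the toy fixed-point map below stands in for the real updates):

```python
import warnings

def run_ibss(update_fn, elbo_fn, state, tol=1e-6, max_iter=100):
    """Iterate a fixed-point update until the objective change falls below
    tol; warn, rather than fail silently, when max_iter is exhausted."""
    elbo_old = -float("inf")
    for it in range(1, max_iter + 1):
        state = update_fn(state)
        elbo = elbo_fn(state)
        if elbo - elbo_old < tol:
            return state, it
        elbo_old = elbo
    warnings.warn(f"IBSS did not converge in {max_iter} iterations")
    return state, max_iter

# Toy stand-in for the real updates: Newton's map for sqrt(2), with
# -(distance from the fixed point) playing the role of the ELBO.
x, niter = run_ibss(lambda x: 0.5 * (x + 2 / x),
                    lambda x: -abs(x * x - 2),
                    1.0)
print(x, niter)
```

The point is only the control flow: returning the iteration count and warning on exhaustion makes "converged" vs "hit the cap" impossible to confuse.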

@stephens999 at this point, does it still bother you that the optim and EM results are not roughly identical? Here they are:

# optim
> result2$V
0.000755795  0.000929637  0.00026145

# EM with >900 iterations
> result5$V
0.001580928  0.000940417

Notice that with optim the first and third CS are completely identical, so in both cases two unique CS are detected. Also, 0.000755795 + 0.00026145 = 0.001017245, which is closer to the first V from EM.
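The back-of-the-envelope check is easy to verify numerically (numbers copied from the output above):

```python
# Prior variance scalars reported above.
optim_v = [0.000755795, 0.000929637, 0.00026145]  # CS 1 and 3 are identical
em_v = [0.001580928, 0.000940417]

# Combining the two duplicated optim components lands closer to EM's
# first component than either component does alone.
combined = optim_v[0] + optim_v[2]
print(combined)  # 0.001017245
```
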

Still, I wonder what we can do to make EM converge faster.
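One generic option for speeding up a slowly converging EM is fixed-point acceleration such as SQUAREM (Varadhan & Roland). This is a suggestion, not something mmbr implements; a minimal sketch of one extrapolation step for a fixed-point map `F`:

```python
import numpy as np

def squarem_step(F, x0):
    """One SQUAREM extrapolation step for a fixed-point map F.
    Takes two plain EM steps, then extrapolates along the observed
    direction of change, which can cut hundreds of EM iterations down
    to a handful when convergence is slow and nearly linear."""
    x1 = F(x0)
    x2 = F(x1)
    r = x1 - x0                 # first-step change
    v = (x2 - x1) - r           # change of the change
    nv = np.linalg.norm(v)
    if nv == 0:
        return x2               # already at a fixed point
    alpha = -np.linalg.norm(r) / nv       # step length
    x_prime = x0 - 2 * alpha * r + alpha ** 2 * v
    return F(x_prime)           # stabilizing extra F evaluation

# Example: a slow linear map with fixed point 1.0; plain iteration shrinks
# the error by only 0.9 per step, while one SQUAREM step lands on it exactly.
F = lambda x: 0.9 * x + 0.1
res = squarem_step(F, np.array([0.0]))
print(res)
```

For a linear map like this the extrapolation is exact in one step; for a genuinely nonlinear EM it is wrapped in the usual outer loop with a convergence check.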

@stephens999
Collaborator

stephens999 commented Apr 25, 2020 via email

gaow added a commit that referenced this issue Apr 27, 2020
@gaow
Member Author

gaow commented Apr 27, 2020

@stephens999 sorry, I was asking whether you are bothered that the EM and optim results do not agree exactly (though they roughly agree after EM converges), as shown in my comment above. I forgot to mention that if EM (and optim) indeed converge, the estimated prior scalar agrees with our back-of-the-envelope calculation. Still, as discussed earlier, I tried to come up with a smaller example:

https://stephenslab.github.io/mmbr/prototypes/em_optim_difference.html

This involves R = 15 conditions, barely enough to show EM's slow convergence, although I have not discussed that in the link above. You can see that the coefficient estimate is off from the truth under this wrong model, but the scalar estimate is consistent with the effect size estimate. The EM and optim results for this example, as shown in the previous notebook, do agree in scale given enough EM iterations. So we should be good here.

Note that I did try to write simpler code for the task, but I still used some functions and R6 classes, because rewriting those classes as plain functions could be even more confusing. Also, we don't really seem to have an issue here, so this notebook is perhaps most useful as an example of how to manually run IBSS for debugging, should we need to do so in the future for a different goal.
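For reference, in the univariate single-effect regression the EM update for the prior variance has a simple closed form: the posterior second moment of the effect, averaged over the inclusion probabilities. A sketch of just that update, to make the quantity being estimated concrete (this is the univariate form, not the multivariate mmbr implementation):

```python
import numpy as np

def em_update_prior_var(alpha, mu, s2):
    """EM update for the prior variance V in a single-effect regression
    with effect prior b ~ N(0, V):
        V_new = sum_j alpha_j * (mu_j**2 + s2_j)
    alpha : posterior inclusion probabilities over variables (sums to 1)
    mu, s2: posterior mean / variance of the effect given variable j."""
    alpha, mu, s2 = map(np.asarray, (alpha, mu, s2))
    return float(np.sum(alpha * (mu ** 2 + s2)))

# Two equally plausible variables with small effects give a small V:
V = em_update_prior_var([0.5, 0.5], [0.1, -0.1], [0.01, 0.01])
print(V)
```

This is the scalar whose slow drift across IBSS iterations drives the long convergence times discussed in this thread.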

@gaow gaow changed the title CS overlapping issue in multivariate analysis EM slow convergence Apr 27, 2020
@gaow
Member Author

gaow commented Apr 27, 2020

@stephens999 @pcarbo I changed the title of this ticket to "EM slow convergence" because, as far as I can see now, this is the only problem with the EM-based approach to estimating the prior scalar. I should add that even with over 100 EM iterations, the computation is still faster than a couple of optim iterations for data at the R = 50 scale.

@stephens999
Collaborator

stephens999 commented Apr 27, 2020 via email

@gaow
Member Author

gaow commented Apr 28, 2020

@stephens999 okay, just getting back to this: here are the results of running optim starting from the EM solution and vice versa:

https://stephenslab.github.io/mmbr/prototypes/em_optim_est_prior.html#Assessing-impact-of-initialization-on-agreement-between-EM-and-optim-16

When initialized from each other, the results agree with whatever they were initialized from. So, as you suspected, they are different local modes.

Now the results are pretty consistent. Note that by "running X from Y" I mean we use the s_init parameter in the msusie() call, which under the hood sets the single-effect posterior mean, the single-effect PIP, and the single-effect estimate of the prior variance scalar. See this function for details.
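A toy illustration of why this cross-initialization check is informative: a local optimizer on a multimodal objective returns whichever mode it starts near, so agreement-with-initialization is evidence of distinct local optima rather than a bug. A minimal sketch using plain gradient descent on a made-up bimodal function (not mmbr code):

```python
def descend(grad, x, lr=0.01, steps=5000):
    """Plain gradient descent: follows the local gradient, so it can only
    reach the local minimum nearest its starting point."""
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Bimodal toy objective f(x) = (x^2 - 1)^2 + 0.1 x, with local minima
# near x = -1 and x = +1; this is its gradient.
grad = lambda x: 4 * x * (x * x - 1) + 0.1

left = descend(grad, -1.0)    # started near the left mode, stays there
right = descend(grad, 1.0)    # started near the right mode, stays there
print(left, right)            # two different answers, both valid local minima
```

Starting each run at the other's answer would reproduce that answer, which is exactly the pattern observed with EM and optim above.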
