Model Initialization Extremely slow #54

dc250601 · 2023-05-21T13:12:19Z

Is there a way to speed up model initialization? Every time I create the model, initialization takes over 30 minutes before training starts.

Gabri95 · 2023-06-22T16:52:04Z

Hi @dc250601

Unfortunately, this can happen for wide models.
It is caused by the slow computation of the variances needed for He weight initialization.

To speed this up, these variances can be cached so that subsequent layers using the same basisexpansion / basissampler / basismanager do not need to recompute them.
This also helps if you train your model multiple times in a row (the variances only need to be computed the first time).

The R3Conv (and R2Conv) constructor calls this method with cached=False by default, so no caching is performed.
However, you can set initialize=False to avoid initialization entirely and then manually use generalized_he_init with cached=True.
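A rough sketch of that workaround follows. The helper name `init_with_cached_he` is made up for illustration, and the attribute names `weights` and `basisexpansion` on the layer, as well as the argument order of `generalized_he_init`, are assumptions to check against your installed version. The caching flag is passed positionally because its exact keyword name is not verified here.

```python
from typing import Any, Callable


def init_with_cached_he(layer: Any, he_init: Callable[..., None]) -> Any:
    """Initialize a layer that was built with initialize=False.

    `he_init` is expected to accept (weights, basis, cache_flag), like the
    generalized_he_init function mentioned in this thread (assumed order).
    """
    # Assumed attributes: `layer.weights` (parameter) and `layer.basisexpansion`.
    he_init(layer.weights.data, layer.basisexpansion, True)
    return layer


# Intended use (import path and names are assumptions, not verified here):
#   from escnn import nn
#   conv = nn.R3Conv(in_type, out_type, kernel_size=3, initialize=False)
#   init_with_cached_he(conv, nn.init.generalized_he_init)
```

Calling the helper on every conv layer right after construction should let layers with the same basis reuse the cached variances within one process.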

Alternatively, you could try the delta-orthogonal initialization, which I think is a bit faster.
As before, you'll have to disable the automatic initialization within the conv layers by using initialize=False and then call this initialization method manually.

Let me know if these solutions work for you!

Best,
Gabriele

jacksonloper · 2023-12-03T02:54:11Z

So the He initialization is why things are slow if I have a convolution with lots of input channels and output channels? And I suppose that would be true even if I have many duplicates of the same "kind" of channels (i.e. 128 irrep(5) channels as input and 128 irrep(5) channels as output)?

Or is there some other kind of "width" that would explain the slowness? For example, by width do you mean kernel size? Basically my question is: what do you mean by a "wide" model?
