Model Initialization Extremely slow #54
Comments
Hi @dc250601

Unfortunately, this can happen for wide models. To speed this up, these variances can be cached so that subsequent layers using the same basisexpansion / basissampler / basismanager do not need to recompute them. The R3Conv (and R2Conv) constructor calls this method with

Alternatively, you could also try the delta-orthogonal initialization, which I think is a bit faster.

Let me know if these solutions work for you!

Best,
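To illustrate why caching helps when many layers share the same basis, here is a minimal, library-independent sketch. It does not use the escnn API: `VarianceCache`, `expensive_variances`, and the string basis keys are all hypothetical names introduced only for this example, and the arithmetic inside `expensive_variances` is a placeholder for the real variance computation.

```python
class VarianceCache:
    """Hypothetical memoizer: one variance computation per distinct basis."""

    def __init__(self):
        self._store = {}        # maps a basis key to its computed variances
        self.computations = 0   # counts how often the slow path actually ran

    def expensive_variances(self, basis_key, n_elements):
        # Placeholder for the costly per-basis-element variance computation.
        self.computations += 1
        return [1.0 / (i + 1) for i in range(n_elements)]

    def get(self, basis_key, n_elements, cache=True):
        # With cache=False every call recomputes, as an uncached init would.
        if not cache:
            return self.expensive_variances(basis_key, n_elements)
        key = (basis_key, n_elements)
        if key not in self._store:
            self._store[key] = self.expensive_variances(basis_key, n_elements)
        return self._store[key]


# Three layers that share the same (hypothetical) basis description.
layers = [("irrep5->irrep5", 128)] * 3

cache = VarianceCache()
for basis_key, n_elements in layers:
    cache.get(basis_key, n_elements, cache=True)

uncached = VarianceCache()
for basis_key, n_elements in layers:
    uncached.get(basis_key, n_elements, cache=False)

print(cache.computations, uncached.computations)
```

With caching, the slow computation runs once for the three identical layers; without it, it runs once per layer. In a deep network that repeats the same layer type many times, this is where the saving would come from.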
So the He initialization is why things are slow when a convolution has many input and output channels? And I suppose that holds even if I have many duplicates of the same "kind" of channel (e.g. 128 irrep(5) channels as input and 128 irrep(5) channels as output)? Or is there some other kind of "width" that would explain the slowness? For example, by width do you mean kernel size? Basically, my question is: what do you mean by a "wide" model?
Is there a way to speed up model initialization? Every time I create the model, initialization takes over 30 minutes before training starts.