Improve runtime and peak memory usage of PVEngine.run_full_mode #140
base: master
Conversation
I suppose this is ready for review! Open questions:
You probably already know this, but the complexity/organization/shape of the matrix also affects computation speed. Diagonal, banded, and symmetric matrices are easier to solve. If you can permute the matrix so that the non-zero elements are close to the diagonal, you can simplify the computation further, although I believe most sparse solvers will try to permute the matrix automatically. See: https://en.m.wikipedia.org/wiki/Band_matrix
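As a small illustration of that point: scipy's direct solver scipy.sparse.linalg.spsolve exposes a permc_spec option controlling the fill-reducing column permutation it applies before factorizing. The matrix below is synthetic, not a pvfactors view-factor matrix; it is just a sketch of comparing the default ordering against no permutation.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve

# Synthetic tridiagonal (banded) system -- illustrative only
rng = np.random.default_rng(0)
n = 100
A = sparse.diags(
    [rng.random(n - 1), rng.random(n) + n, rng.random(n - 1)],
    offsets=[-1, 0, 1], format="csc")
b = rng.random(n)

# spsolve applies a fill-reducing column permutation by default (COLAMD);
# "NATURAL" disables the permutation so the two orderings can be compared.
x_default = spsolve(A, b)
x_natural = spsolve(A, b, permc_spec="NATURAL")

assert np.allclose(x_default, x_natural)
```

Both orderings give the same solution; the permutation only affects how much fill-in the factorization produces, and therefore speed and memory.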
Thanks a lot for this @kanderso-nrel ! LGTM
I ran it quickly locally on my very old mac and I definitely saw an improvement 🚀 I only have a few minor comments below
pvfactors/engine.py
Outdated
from pvfactors.viewfactors import VFCalculator
from pvfactors.irradiance import HybridPerezOrdered
from pvfactors.config import DEFAULT_RHO_FRONT, DEFAULT_RHO_BACK


def _sparse_solve_3D(A, b):
Any chance you could add the argument and return types as type hints, please (instead of in the docstrings)? Most of the code doesn't have them, but maybe we could start adding them for new implementations.
Done, but I've not used type hints before so please let me know if I've messed it up :)
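For readers without the diff at hand, here is a sketch of what the hinted signature could look like. The loop-over-time body below is an illustrative assumption for a function solving a stack of 2-D systems, not necessarily the PR's actual implementation.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve


def _sparse_solve_3D(A: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Solve A[:, :, t] @ x[:, t] = b[:, t] for every time index t.

    A has shape (N, N, T) and b has shape (N, T); each time slice is
    converted to a sparse matrix before solving, since scipy's sparse
    types are 2-D only.
    """
    x = np.zeros_like(b)
    for t in range(A.shape[2]):
        x[:, t] = spsolve(sparse.csc_matrix(A[:, :, t]), b[:, t])
    return x


# Quick usage check on a small, well-conditioned random system
rng = np.random.default_rng(1)
A = rng.random((5, 5, 4)) + 5 * np.eye(5)[:, :, None]
b = rng.random((5, 4))
x = _sparse_solve_3D(A, b)
```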
pvfactors/engine.py
Outdated
# iteration across the time dimension is required
# - scipy 1.8.0 added new sparse arrays (as opposed to sparse matrices);
#   they are 2-D only at time of writing, but in the future if they
#   become N-D then it may make sense to use them here.
thanks a lot for adding these comments, it's super helpful 🚀 maybe we could have them in the docstrings as well?
@@ -250,6 +293,8 @@ def run_full_mode(self, fn_build_report=None):
        # Calculate absorbed irradiance
        qabs = (np.einsum('ijk,jk->ik', vf_aoi_matrix, q0)[:-1, :]
                + irradiance_abs_mat)
        del irradiance_abs_mat
        del vf_aoi_matrix

        # --- Update surfaces with values: the lists are ordered by index
        for idx_surf, ts_surface in enumerate(pvarray.all_ts_surfaces):
awesome 👍 it doesn't shock me to have the dels in this function for now, although it is true that it is not that frequent to see in python. We could maybe add a small note in the docstrings to explain why we're doing that
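As background on why the dels help: dropping the last reference to a large intermediate lets NumPy free it before the next big allocation, which lowers the peak. This is a toy demonstration with tracemalloc; the pipeline and array sizes are arbitrary stand-ins, not pvfactors code.

```python
import numpy as np
import tracemalloc


def peak_mib(fn) -> float:
    """Return the peak traced memory (in MiB) while running fn()."""
    tracemalloc.start()
    try:
        fn()
        return tracemalloc.get_traced_memory()[1] / 2**20
    finally:
        tracemalloc.stop()


def pipeline(use_del: bool) -> float:
    # Toy stand-in for a multi-stage computation; ~20 MiB per array
    shape = (300, 300, 30)
    a = np.ones(shape)
    b = np.ones(shape)
    c = a + b
    if use_del:
        del a, b      # a and b are not needed past this point
    d = c * 2         # a later stage allocates another large array
    return float(d.sum())


# With the dels, at most three large arrays are alive at once instead of four.
print(peak_mib(lambda: pipeline(False)), peak_mib(lambda: pipeline(True)))
```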
x_dense = np.linalg.solve(A, b)
x_sparse = _sparse_solve_3D(A, b)
assert x_sparse.shape == b.shape
np.testing.assert_allclose(x_dense, x_sparse)
awesome 👍
scipy.sparse.linalg.spsolve seems to include automatic column permuting but I didn't think I was qualified to try to do better than the default :) For reference, here's what a typical slice of
Passing empty timeseries inputs currently works on master but didn't work here; 480b350 fixes that and adds a relevant test. Not sure how valuable it is to support that use case, but might as well if it's easy...
So sorry for the delay, and thanks so much for the updates! LGTM 🚀
Ping @campanelli-sunpower -- these changes would be nice to have in the main branch for a project @spaneja and I are doing, any chance this could get merged soonish?
Putting this back to draft -- after further experimentation I'm no longer confident that this PR is the best approach. I will continue the discussion back in #134.
This PR does two things:

1. The runtime bottleneck of PVEngine.run_full_mode is in the matrix inversion. However, especially for large simulations, using a sparse linear solve function turns out to be significantly faster than explicitly inverting the full matrix. This replaces the linear algebra operations from numpy.linalg with sparse equivalents from scipy.sparse.linalg.
2. PVEngine.run_full_mode creates many large matrices, most of which are only needed in certain parts of the function. This results in significant (GBs) memory usage for large simulations. Especially for large simulations, including del statements when these large matrices are no longer needed reduces maximum memory usage significantly.

Here are some runtime and peak memory comparisons of a typical call to the pvlib wrapper function (pvlib.bifacial.pvfactors_timeseries in v0.9.0). All values are the ratio of this PR to v1.5.2, so e.g. a value of 0.67 means this PR runs in 2/3 the time, or uses 2/3 the memory, relative to v1.5.2.

Runtime ratio:
Peak RAM usage ratio:
The above ratios, especially the peak memory values, are relatively stable on my machine. Timing is rather less stable; reducing external CPU load as much as possible and setting the power mode to "best battery" (in an attempt to disable CPU turbo boost and such) seems to help. I would be very interested to see if others can recreate the above results, especially on the timing side -- @spaneja kindly ran some for me earlier and the results were not wholly consistent with what I've shown above.
Miscellaneous to-do:

- the del situation is probably still suboptimal
- _sparse_solve