
Improve runtime and peak memory usage of PVEngine.run_full_mode #140

Status: Draft · wants to merge 8 commits into base: master

Conversation

@kandersolar (Contributor) commented Feb 25, 2022

This PR does two things:

  1. As discussed in Runtime scales poorly with n_pvrows #134, the majority of the runtime of PVEngine.run_full_mode is spent in the matrix inversion. Especially for large simulations, a sparse linear solve turns out to be significantly faster than explicitly inverting the full dense matrix, so this PR replaces the linear algebra operations from numpy.linalg with sparse equivalents from scipy.sparse.linalg (a minimal sketch of the idea is shown after this list).
  2. PVEngine.run_full_mode creates many large matrices, most of which are needed in only part of the function, which results in significant (multiple GB) memory usage for large simulations. Adding del statements once these large matrices are no longer needed reduces peak memory usage substantially, especially for large simulations.
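For context, here is a minimal sketch of the sparse-solve idea. It is not the exact implementation in this PR; the shapes (with time as the last axis) are assumptions inferred from how the matrices are used in run_full_mode.

```python
import numpy as np
from scipy.sparse import csc_matrix
from scipy.sparse.linalg import spsolve

def sparse_solve_3d(a_mat, b_mat):
    """Solve a_mat[:, :, t] @ x[:, t] = b_mat[:, t] for each timestep t.

    Assumes a_mat has shape (n, n, n_timestamps) and b_mat has shape
    (n, n_timestamps); each per-timestep system is solved with a sparse
    factorization instead of inverting the dense matrix.
    """
    x = np.zeros_like(b_mat)
    for t in range(b_mat.shape[-1]):
        a_sparse = csc_matrix(a_mat[:, :, t])  # CSC is the format spsolve prefers
        x[:, t] = spsolve(a_sparse, b_mat[:, t])
    return x
```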

Here are some runtime and peak memory comparisons of a typical call to the pvlib wrapper function (pvlib.bifacial.pvfactors_timeseries in v0.9.0). All values are the ratio of this PR to v1.5.2, so e.g. a value of 0.67 means this PR runs in 2/3 the time, or uses 2/3 the memory, relative to v1.5.2.

Runtime ratio:

n_timestamps  100    1000   5000   10000
n_pvrows                                
3              0.91   0.77   0.81   0.80
5              0.61   0.41   0.31   0.39
7              0.60   0.40   0.30   0.34

Peak RAM usage ratio:

n_timestamps  100    1000   5000   10000
n_pvrows                                
3              0.99   0.84   0.67   0.63
5              0.97   0.67   0.60   0.59
7              0.75   0.61   0.58   0.58

The above ratios, especially the peak memory values, are relatively stable on my machine. Timing is rather less stable; reducing external CPU load as much as possible and setting the power mode to "best battery" (in an attempt to disable CPU turbo boost and such) seems to help. I would be very interested to see if others can recreate the above results, especially on the timing side -- @spaneja kindly ran some for me earlier and the results were not wholly consistent with what I've shown above.
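For anyone trying to replicate: here is one simple way to measure runtime and peak memory of a single run. This is not necessarily how the numbers above were produced; `simulate` is a hypothetical zero-argument callable wrapping the pvfactors_timeseries call.

```python
import time
import tracemalloc

def measure(simulate):
    """Return (elapsed seconds, peak MB) for one call of `simulate`."""
    tracemalloc.start()  # numpy array allocations are tracked too
    start = time.perf_counter()
    simulate()
    elapsed = time.perf_counter() - start
    _, peak_bytes = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return elapsed, peak_bytes / 1e6
```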

Miscellaneous to-do:

  • take a closer look at where memory is allocated and where it is used; the current del situation is probably still suboptimal
  • what's new
  • tests?
  • make scipy an explicit dependency (right now it's implicit via pvlib)
  • document _sparse_solve
  • ...and maybe put it somewhere that makes more sense?
  • consider making the sparse approach opt-in via a boolean kwarg or similar?

@kandersolar (Contributor Author) commented:

I suppose this is ready for review! Open questions:

  • The private sparse solve function seems a bit out of place in engine.py; should it live somewhere else?
  • Should the new sparse approach be made opt-in somehow?
  • An alternative to using del is to break up run_full_mode into smaller functions, so that the intermediate arrays created in each sub-function go out of scope when that function returns (see the toy illustration after this list). I think sub-functions are probably the more pythonic approach, but it would be a bigger change than just throwing in a few dels. Any preference?
  • Most importantly: can other people replicate the runtime and memory improvements I'm seeing locally?
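To make the third bullet concrete, here is a toy illustration of the two options; the array names and sizes are made up and neither snippet is taken from run_full_mode.

```python
import numpy as np

def with_del(n=2000):
    big = np.random.rand(n, n)    # large intermediate (~32 MB for n=2000)
    col_sums = big.sum(axis=0)
    del big                       # released before the next large allocation
    other = np.random.rand(n, n)  # peak memory never holds `big` and `other` together
    return col_sums @ other

def _first_step(n):
    big = np.random.rand(n, n)
    return big.sum(axis=0)        # `big` goes out of scope when this returns

def with_helper(n=2000):
    col_sums = _first_step(n)     # same peak-memory effect, no explicit del
    other = np.random.rand(n, n)
    return col_sums @ other
```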

@kandersolar kandersolar marked this pull request as ready for review February 26, 2022 14:28
@mikofski commented:

You probably already know this, but the complexity/organization/shape of the matrix also affects computation speed. Diagonal, banded, & symmetrical matrices are easier to solve. If you can permute the matrix so that the non-zero elements are close to the diagonal axis you can further improve the computation simplicity. Although I believe most sparse solvers will try to permute the matrix automatically. See: https://en.m.wikipedia.org/wiki/Band_matrix
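For illustration only (not something this PR does): scipy exposes a reverse Cuthill-McKee ordering that clusters nonzeros near the diagonal. Note that spsolve already applies a fill-reducing column permutation by default (permc_spec='COLAMD'), so an explicit reordering may not buy much here.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import reverse_cuthill_mckee

# small symmetric example with an off-diagonal entry far from the diagonal
A = csr_matrix(np.array([
    [1., 0., 0., 5.],
    [0., 2., 0., 0.],
    [0., 0., 3., 0.],
    [5., 0., 0., 4.],
]))
perm = reverse_cuthill_mckee(A, symmetric_mode=True)
A_banded = A[perm, :][:, perm]  # reorder rows and columns consistently
```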

@anomam (Contributor) left a comment:

Thanks a lot for this @kanderso-nrel ! LGTM
I ran it quickly locally on my very old mac and I definitely saw an improvement 🚀 I only have a few minor comments below

from pvfactors.viewfactors import VFCalculator
from pvfactors.irradiance import HybridPerezOrdered
from pvfactors.config import DEFAULT_RHO_FRONT, DEFAULT_RHO_BACK


def _sparse_solve_3D(A, b):
Contributor:

Any chance you could add the argument and return types as type hints please (instead of in the docstrings)? Most of the code doesn't have them, but maybe we could start adding them for new implementations.
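For example, the hinted signature might look something like this (a sketch, assuming the helper takes and returns plain numpy arrays):

```python
import numpy as np

def _sparse_solve_3D(A: np.ndarray, b: np.ndarray) -> np.ndarray:
    ...
```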

@kandersolar (Contributor Author):

Done, but I've not used type hints before so please let me know if I've messed it up :)

# iteration across the time dimension is required
# - scipy 1.8.0 added new sparse arrays (as opposed to sparse matrices);
# they are 2-D only at time of writing, but in the future if they
# become N-D then it may make sense to use them here.
Contributor:

thanks a lot for adding these comments, it's super helpful 🚀 maybe we could have them in the docstrings as well?

@@ -250,6 +293,8 @@ def run_full_mode(self, fn_build_report=None):
# Calculate absorbed irradiance
qabs = (np.einsum('ijk,jk->ik', vf_aoi_matrix, q0)[:-1, :]
+ irradiance_abs_mat)
del irradiance_abs_mat
del vf_aoi_matrix

# --- Update surfaces with values: the lists are ordered by index
for idx_surf, ts_surface in enumerate(pvarray.all_ts_surfaces):
Contributor:

awesome 👍 I don't mind having the dels in this function for now, although it's true that they're not that common in Python. We could maybe add a small note in the docstring to explain why we're doing that (a possible wording is sketched below).
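A hypothetical wording for such a note (not taken from the PR):

```python
def run_full_mode(self, fn_build_report=None):
    """Run the simulation in full mode.

    Notes
    -----
    Several large intermediate matrices are explicitly ``del``-ed as soon
    as they are no longer needed, which keeps peak memory usage down for
    simulations with many timestamps (see GH #140).
    """
```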

x_dense = np.linalg.solve(A, b)
x_sparse = _sparse_solve_3D(A, b)
assert x_sparse.shape == b.shape
np.testing.assert_allclose(x_dense, x_sparse)
Contributor:

awesome 👍

@kandersolar (Contributor Author) commented:

> If you can permute the matrix so that the non-zero elements are close to the diagonal axis you can further improve the computation simplicity. Although I believe most sparse solvers will try to permute the matrix automatically.

scipy.sparse.linalg.spsolve seems to include automatic column permuting but I didn't think I was qualified to try to do better than the default :) For reference, here's what a typical slice of a_mat looks like (color means nonzero):

[image: sparsity pattern of a typical a_mat slice (colored cells are nonzero)]

@kandersolar (Contributor Author) commented:

Passing empty timeseries inputs currently works on master but didn't work here; 480b350 fixes that and adds a relevant test. Not sure how valuable it is to support that use case, but might as well if it's easy...

@anomam (Contributor) left a comment:

So sorry for the delay, and thanks so much for the updates! LGTM 🚀

@kandersolar (Contributor Author) commented:

Ping @campanelli-sunpower -- these changes would be nice to have in the main branch for a project @spaneja and I are doing, any chance this could get merged soonish?

@kandersolar kandersolar marked this pull request as draft May 19, 2022 16:46
@kandersolar (Contributor Author) commented:

Putting this back to draft -- after further experimentation I'm no longer confident that this PR is the best approach. I will continue the discussion back in #134.
