
[WIP] Using the robust solver for pyMBAR - avoiding convergence Failu… #735

Conversation

RiesBen
Contributor

@RiesBen RiesBen commented Jul 4, 2024

Description

In this PR, we propose switching the default solver when pyMBAR >= 4.0.0 is in use, trading a minimal increase in runtime for an improved convergence rate, i.e. fewer errors thrown.

Todos

  • Implement feature / fix bug
  • Add tests
  • Update documentation as needed
  • Update changelog to summarize changes in behavior, enhancements, and bugfixes implemented in this PR

Status

  • Ready to go

Changelog message


RiesBen and others added 2 commits July 4, 2024 17:36

@mikemhenry
Contributor

```
    if pymbar.version.short_version >= "4.0.0" and "solver_protocol" not in self._user_extra_analysis_kwargs:
AttributeError: module 'openmmtools.multistate.pymbar' has no attribute 'version'
```

@mikemhenry
Contributor

We should use this to compare versions https://packaging.pypa.io/en/latest/version.html#packaging.version.Version
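For reference, plain string comparison is exactly what goes wrong here ("10.0.0" sorts before "4.0.0" lexicographically); a minimal sketch of the parsed comparison with packaging:

```python
from packaging.version import Version

# String comparison of version numbers is lexicographic and therefore wrong:
assert ("10.0.0" >= "4.0.0") is False

# Parsed comparison does the right thing:
assert Version("10.0.0") >= Version("4.0.0")
assert Version("4.0.0") > Version("3.1.1")
```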

@ijpulidos
Contributor

There are a few options here, from our meeting:

  • Need to benchmark if this affects the performance (is it slower?)
  • We would change the default behavior and communicate the performance impact in the CHANGELOG (release notes). Would be passed on construction of the analyzer.
  • Realtime analysis_kwargs in

    ```python
    try:  # Trap errors for MBAR being under sampled and the W_nk matrix not being normalized correctly
        mbar = analysis.mbar
        free_energy, err_free_energy = analysis.get_free_energy()
        n_equilibration_iterations = analysis.n_equilibration_iterations
        statistical_inefficiency = analysis.statistical_inefficiency
    except ParameterError as e:
        # We don't update self._last_err_free_energy here since if it
        # wasn't below the target threshold before, it won't stop MultiStateSampler now.
        logger.debug(f"ParameterError computing MBAR. {e}.")
        bump_error_counter = True
        self._online_error_bank.append(e)
        if len(self._online_error_bank) > 6:
            # Cache only the last set
            self._online_error_bank.pop(0)
        free_energy = None
    else:
        self._last_mbar_f_k_offline = mbar.f_k
        free_energy = free_energy[idx, jdx]
        self._last_err_free_energy = err_free_energy[idx, jdx]
        logger.debug("Current Free Energy Estimate is {} +- {} kT".format(free_energy,
                                                                          self._last_err_free_energy))
        # Trap a case when errors don't converge (usually due to under sampling)
        if np.isnan(self._last_err_free_energy):
            self._last_err_free_energy = np.inf
    timer.stop("MBAR")
    ```

    could stay the same as it was before.
  • Test with the overlap matrix from example in pymbar issue thread.
  • Test with "production" simulations. Just to double check how this behaves with longer trajectories and more data.
  • Make sure we have a CI matrix with pymbar 4 and the robust settings.

@mikemhenry @RiesBen please add any comments to this thread if I'm missing something. Thanks!
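The default-selection logic under discussion could be sketched roughly as follows (a hypothetical helper with illustrative names; in the PR the real logic lives in the analyzer's constructor):

```python
from packaging.version import Version

def default_analysis_kwargs(pymbar_version: str, user_kwargs: dict) -> dict:
    """Sketch: default to pymbar 4's 'robust' solver protocol,
    but never override an explicit user choice."""
    kwargs = dict(user_kwargs)
    if Version(pymbar_version) >= Version("4.0.0"):
        kwargs.setdefault("solver_protocol", "robust")
    return kwargs

# pymbar 4 without a user choice gets the robust solver
assert default_analysis_kwargs("4.1.0", {}) == {"solver_protocol": "robust"}
# an explicit user choice is respected
assert default_analysis_kwargs("4.1.0", {"solver_protocol": "default"}) == {"solver_protocol": "default"}
# pymbar 3 keeps its own default behavior
assert "solver_protocol" not in default_analysis_kwargs("3.1.1", {})
```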

@IAlibay
Contributor

IAlibay commented Aug 21, 2024

Realtime analysis_kwargs could stay the same as it was before.

I'm not fully clued in on the discussion, but my suggestion would be, if possible, to keep the kwargs equal between realtime analysis and the final analysis. I.e. the failures are more likely to happen at low sample counts.

Need to benchmark if this affects the performance (is it slower?)

  1. I believe from an exchange a little while back with @mrshirts, the relative speed of different solvers is system dependent - i.e. down to the number of iterations run not the speed of each iteration.

  2. It might be good to check what we're testing against as the performance baseline. The default solver in pymbar 3 is adaptive, whilst it's hybr with adaptive fallback in pymbar 4. Robust is adaptive w/ L-BFGS-B fallback in pymbar 4. If we're doing a performance regression test, the baseline probably should be pymbar 3 default vs pymbar 4 robust (although pymbar 4 default vs robust would be good to know too!).

@mikemhenry
Contributor

I'm not fully clued in on the discussion, but my suggestion would be, if possible, to keep the kwargs equal between realtime analysis and the final analysis. I.e. the failures are more likely to happen at low sample counts.

I think we decided against this: as the simulation goes on, the analysis takes longer, so it would be better to use a fast method for the realtime analysis.

@IAlibay
Contributor

IAlibay commented Sep 25, 2024

the analysis takes longer so it would be better to do a fast method for the realtime analysis

Assuming you're trying to avoid regressing: the "slow" method is the same method as pymbar 3, so you're not actually going "slower". Indeed, the "slow" method isn't actually clearly slower.

Going for the "fast" method probably increases your chances of getting errored values on fractional datasets.

@ijpulidos
Contributor

The real time analysis currently instantiates its own MultiStateSamplerAnalyzer and it would always use the default user kwargs specified in the changes in this PR, since we don't really provide a way for the user to change those. That is, it would always be using the "robust" solver with PyMBAR 4. I think this should be fine. I agree with @IAlibay on that with less samples we have more chances to fail, so "robust" here sounds like a good idea.

I agree that we are probably jumping the gun and maybe the "robust" solver is fast enough (maybe even faster than the pymbar 3 implementation we were using). This is something that we should probably double check to keep our sanity.

Other than that, I'd suggest that we probably want to adapt the test script in choderalab/pymbar#419 (comment) to make it a test for our MultiStateSamplerAnalyzer, just to be sure we are doing things correctly, if that makes sense.

@IAlibay
Contributor

IAlibay commented Sep 26, 2024

Maybe to add to this a little bit, what I'm advocating for is closer to:

  1. For now, robust will do - we shouldn't take a performance hit, because it's the same thing as before.
  2. In the future, optimizing things would be great.
  3. Getting the "for now" solution out first would be great.

@mrshirts

mrshirts commented Sep 26, 2024

In the future, optimizing things would be great.

I'm happy to meet with people to discuss what "in the future" means. For now, "Robust" should be the best option, and should fail in relatively few cases. There are some interesting options for creating synthetic data to improve convergence, but that adds bias.

@mikemhenry mikemhenry force-pushed the MultistateAnalyzer---default-to-robust-pyMBAR-solver branch from fa92d11 to 0d733d2 on September 26, 2024 21:36

codecov bot commented Sep 26, 2024

Codecov Report

Attention: Patch coverage is 85.00000% with 6 lines in your changes missing coverage. Please review.

Project coverage is 84.95%. Comparing base (c2a13c0) to head (b1badbb).
Report is 1 commit behind head on main.


@mikemhenry
Contributor

mikemhenry commented Sep 26, 2024

ala-thr.zip

Just uploading this file here (from choderalab/pymbar#419 (comment)) so I can download the file for a test

@mikemhenry
Contributor

@IAlibay @ijpulidos ready for review!

Contributor

@IAlibay IAlibay left a comment

couple of comments otherwise lgtm

@@ -2612,6 +2615,42 @@ def test_resume_velocities_from_legacy_storage(self):
state.velocities.value_in_unit_system(unit.md_unit_system) != 0
), "At least some velocity in sampler state from new checkpoint is expected to different from zero."

```python
@pytest.fixture
def download_nc_file(tmpdir):
```
Contributor

Would using something like pooch be better for this kind of thing?

Contributor

I thought about that but it felt like a lot to add for a single test, if we have more to download then we can add it.

```python
reporter_file = download_nc_file
reporter = MultiStateReporter(reporter_file)
analyzer = MultiStateSamplerAnalyzer(reporter, max_n_iterations=n_iterations)
f_ij, df_ij = analyzer.get_free_energy()
```
Contributor

I would encourage doing a number regression check here rather than just a pure smoke test.

Contributor

I thought about that but we already do that in other tests, happy to add it. What threshold do we want to use to check that it is close?

Contributor

This shouldn't be sampling anything, so ideally we should always be converging to a very similar value. If it changes, something changed enough to affect all our free energies. I would suggest something like 1e-4 or 1e-5 precision.

Contributor

I'll be honest, I forgot we were not sampling something, so I didn't even consider that we will get the same result each time we run (or really close to the same result)

Contributor

We probably need a docstring here telling what the test is doing and how the nc file was generated. For future reference.

```python
# free energy
assert abs(f_ij[0, -1] - -52.00083148433459) < 1e-5
# error
assert abs(df_ij[0, -1] - 0.21365627649558516) < 1e-5
```
Contributor

The fun way to do this is pytest.approx if you've not tried it before!

Contributor

I've used numpy.isclose but never pytest.approx before

Contributor

@IAlibay IAlibay Oct 1, 2024

It's the single value approximate, super fast and good error messages, I would encourage you to have a go!
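For illustration, the two absolute-difference checks in the diff could be written with pytest.approx like this (a sketch; values are copied from the diff and the 1e-5 tolerance is the one suggested above):

```python
import pytest

f_val = -52.00083148433459      # free energy value from the diff
df_val = 0.21365627649558516    # its error estimate from the diff

# pytest.approx reads more naturally than abs(a - b) < tol
# and prints both sides on failure.
assert f_val == pytest.approx(-52.00083148433459, abs=1e-5)
assert df_val == pytest.approx(0.21365627649558516, abs=1e-5)
```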

@IAlibay
Contributor

IAlibay commented Oct 1, 2024

I can't approve @mikemhenry 😅 , only comment - but lgtm

@mikemhenry
Contributor

Sounds good, I will wait for @ijpulidos to review it before we merge

Contributor

@ijpulidos ijpulidos left a comment

Looking good to me. Thanks! I would only advise adding a docstring to the test so we know what the test is doing and the NC file provenance.

@@ -2612,6 +2615,46 @@ def test_resume_velocities_from_legacy_storage(self):
state.velocities.value_in_unit_system(unit.md_unit_system) != 0
), "At least some velocity in sampler state from new checkpoint is expected to different from zero."

```python
@pytest.fixture
def download_nc_file(tmpdir):
    FILE_URL = "https://github.com/user-attachments/files/17156868/ala-thr.zip"
```
Contributor

I believe we cannot rely on this long term, since as far as I know the structure of these URLs can change without notice. Is there another way we can have a long-lived url? Not a big issue but probably something to keep in mind.


See https://github.com/choderalab/pymbar/issues/419#issuecomment-1718386779 for more
information on how the file was generated.

Contributor

Thank you for adding the comments, this is now much more clear. Great work.

@mikemhenry
Contributor

Oh I forgot to suggest a changelog entry, this is what I am thinking

New Behavior

@mikemhenry
Contributor

I am going to merge this in, but feel free to comment on the changelog entry!

@mikemhenry mikemhenry merged commit 9334bc9 into choderalab:main Oct 3, 2024
29 checks passed