
[REVIEW]: Dynamax: A Python package for probabilistic state space modeling with JAX #7069

Open
editorialbot opened this issue Aug 4, 2024 · 23 comments
Labels
review Track: 5 (DSAIS) Data Science, Artificial Intelligence, and Machine Learning


@editorialbot
Collaborator

editorialbot commented Aug 4, 2024

Submitting author: @slinderman (Scott Linderman)
Repository: https://github.com/probml/dynamax
Branch with paper.md (empty if default branch): paper
Version: v0.1.4
Editor: @osorensen
Reviewers: @thomaspinder, @gdalle
Archive: Pending

Status


Status badge code:

HTML: <a href="https://joss.theoj.org/papers/aaec4098e71833c94a74dbe1ff785d9e"><img src="https://joss.theoj.org/papers/aaec4098e71833c94a74dbe1ff785d9e/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/aaec4098e71833c94a74dbe1ff785d9e/status.svg)](https://joss.theoj.org/papers/aaec4098e71833c94a74dbe1ff785d9e)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@thomaspinder & @gdalle, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review.
First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @osorensen know.

Please start on your review when you are able, and be sure to complete your review within the next six weeks at the very latest.

Checklists

📝 Checklist for @gdalle

📝 Checklist for @thomaspinder

@editorialbot
Collaborator Author

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

@editorialbot
Collaborator Author

Software report:

github.com/AlDanial/cloc v 1.90  T=0.15 s (764.3 files/s, 298530.4 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Python                          67           3037           4303           8814
Jupyter Notebook                30              0          24675           3907
Markdown                        10             99              0            282
TeX                              1             25              0            199
reStructuredText                 2            152            208            153
YAML                             5             20              7            147
DOS Batch                        1              8              1             26
make                             1              4              7              9
TOML                             1              1              0              4
-------------------------------------------------------------------------------
SUM:                           118           3346          29201          13541
-------------------------------------------------------------------------------

Commit count by author:

   344	Scott Linderman
   178	Peter G. Chang
   135	xinglong
   131	Kevin P Murphy
    83	karalleyna
    79	gileshd
    69	Gerardo Duran-Martin
    60	petergchang
    26	Caleb Weinreb
    18	libby
    14	kostastsa
    13	slinderman
    10	Elizabeth DuPre
     8	Kevin Murphy
     6	andrewwarrington
     6	davidzoltowski
     6	patel-zeel
     4	Ravin Kumar
     3	Aleyna Kara
     2	Yixiu Zhao
     2	Zeel B Patel
     2	partev
     1	Collin Schlager
     1	Jake VanderPlas
     1	Jason Davies
     1	RaulPL
     1	Xinglong
     1	Xinglong Li
     1	xinglong-li

@editorialbot
Collaborator Author

Paper file info:

📄 Wordcount for paper.md is 960

✅ The paper includes a Statement of need section

@editorialbot
Collaborator Author

License info:

✅ License found: MIT License (Valid open source OSI approved license)

@editorialbot
Collaborator Author

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.1146/annurev-neuro-092619-094115 is OK
- 10.1017/CBO9781139344203 is OK
- 10.48550/arXiv.2305.16543 is OK
- 10.48550/arXiv.2306.03291 is OK
- 10.48550/arXiv.2305.19535 is OK
- 10.1038/s41592-024-02318-2 is OK
- 10.25080/majora-92bf1922-011 is OK
- 10.1017/cbo9780511790492 is OK
- 10.1016/j.tree.2007.10.009 is OK
- 10.1198/073500102753410408 is OK
- 10.3402/tellusa.v56i5.14462 is OK
- 10.1145/355656.355657 is OK
- 10.1109/TAC.2020.2976316 is OK
- 10.1109/TSP.2021.3103338 is OK

MISSING DOIs

- No DOI given, and none found for title: Probabilistic Machine Learning: Advanced Topics
- No DOI given, and none found for title: JAX: composable transformations of Python+NumPy pr...
- No DOI given, and none found for title: PyHSMM: Bayesian inference in HSMMs and HMMs
- No DOI given, and none found for title: Code Companion for Bayesian Filtering and Smoothin...
- No DOI given, and none found for title: SSM: Bayesian Learning and Inference for State Spa...
- No DOI given, and none found for title: JSL: JAX State-Space models (SSM) Library
- No DOI given, and none found for title: hmmlearn
- No DOI given, and none found for title: Structural Time Series (STS) in JAX

INVALID DOIs

- None

@editorialbot
Collaborator Author

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

@gdalle

gdalle commented Aug 4, 2024

Review checklist for @gdalle

Conflict of interest

  • I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository (https://github.com/probml/dynamax)?
  • License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@slinderman) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
  • Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
  • Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
  • Human and animal research: If the paper contains original data/research on human subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

@thomaspinder

thomaspinder commented Aug 5, 2024

Review checklist for @thomaspinder

Conflict of interest

  • I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

General checks

  • Repository: Is the source code for this software available at the repository (https://github.com/probml/dynamax)?
  • License: Does the repository contain a plain-text LICENSE or COPYING file with the contents of an OSI approved software license?
  • Contribution and authorship: Has the submitting author (@slinderman) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
  • Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines
  • Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
  • Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
  • Human and animal research: If the paper contains original data/research on human subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

  • Installation: Does installation proceed as outlined in the documentation?
  • Functionality: Have the functional claims of the software been confirmed?
  • Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

  • A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
  • Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
  • Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
  • Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
  • Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
  • Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

  • Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
  • A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
  • State of the field: Do the authors describe how this software compares to other commonly-used packages?
  • Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
  • References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

@osorensen
Member

👋 @gdalle, @thomaspinder, could you please update us on how it's going with your reviews?

@thomaspinder

@osorensen expecting to be done by September 1st.

@gdalle

gdalle commented Aug 23, 2024

Haven't started it yet so I will probably need two more weeks.

@thomaspinder

Dynamax Review

Firstly, I'd like to congratulate the authors and contributors of Dynamax for creating a nicely designed package in JAX. Dynamax enables practitioners to easily fit state-space models (SSMs), whilst simultaneously allowing researchers to implement their own custom SSM approaches. The repo's corresponding paper is well written and provides a clear and concise summary of the package. Finally, I enjoyed reading the documentation of Dynamax; the large number of examples and use-cases is great for new users of the package.

I have broken my review up into two sections. The first section lists the blocking issues that prevent me from marking each item of the reviewer's checklist as complete. In the second section, I suggest improvements that I would advise the authors to make; however, I do not consider these blocking concerns.

Significant issues

  • Failing tests
  • Docstring coverage
  • Broken documentation
  • Incomplete contribution guidelines
  • Missing typing
  • Incorrect typing

Failing tests

For me, running

git clone git@github.com:probml/dynamax.git
cd dynamax
pip install -e '.[dev]'
pytest dynamax

as per the docs, threw errors on a Mac M1. I was using Python 3.10 and a fresh virtual environment.

On this topic though, I see that your GitHub testing workflow uses a pinned Python version and machine. It would be good to run your tests on all supported Python versions and on both macOS and Linux machines. I have opened a PR with a suggestion that reflects this comment.
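
The matrix strategy in GitHub Actions makes this straightforward. A minimal sketch of what such a workflow could look like (the file layout, Python version list, and job names below are illustrative, not taken from the actual PR):

```yaml
# Hypothetical .github/workflows/tests.yml -- versions and layout are
# illustrative suggestions only.
name: tests
on: [push, pull_request]
jobs:
  test:
    strategy:
      matrix:
        os: [ubuntu-latest, macos-latest]
        python-version: ["3.9", "3.10", "3.11", "3.12"]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - run: pip install -e '.[dev]'
      - run: pytest dynamax
```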

Docstring coverage

The docstrings within Dynamax are inconsistent. Taking dynamax.generalized_gaussian_ssm.inference as an example file, there are no docstrings for the methods or arguments of the public classes, yet there are comprehensive docstrings for the private functions. This is unconventional, and I would encourage the authors to document the classes, methods, and functions of Dynamax more widely so as to make the package easier for people to use. Additionally, typing annotations such as the below, from dynamax.generalized_gaussian_ssm.models, are not particularly descriptive.

:param initial_mean: $m$
:param initial_covariance: $S$

It would be good to see more comprehensive documentation. To rigorously test documentation thresholds in the future, you may consider using interrogate. This will measure the number of objects for which a docstring is supplied and raise errors/warnings when the coverage decreases. I have opened a suggestive PR for how this can be achieved.

Currently, 33.1% of classes/methods/functions have docstrings (as measured with interrogate dynamax). It would be good to see this increased.
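
For reference, the kind of coverage interrogate reports can be approximated by walking the AST and checking each module, class, and function for a docstring. A self-contained stand-in sketch (not interrogate's actual implementation):

```python
import ast

def docstring_coverage(source: str) -> float:
    """Fraction of module/class/function nodes that carry a docstring."""
    tree = ast.parse(source)
    nodes = [tree] + [
        n for n in ast.walk(tree)
        if isinstance(n, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    ]
    documented = sum(ast.get_docstring(n) is not None for n in nodes)
    return documented / len(nodes)

example = '''
"""Module docstring."""

def documented():
    """Has a docstring."""

def undocumented():
    pass
'''
print(f"{docstring_coverage(example):.1%}")  # → 66.7% (2 of 3 nodes documented)
```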

Broken documentation

The documentation is broken and many notebooks do not render - Example. My suggestion would be to use something like nbsphinx to execute the notebooks each time the documentation is built, and to fail when a notebook does not execute top-to-bottom.

Incomplete contribution guidelines

Your contribution guidelines are incomplete. To push a change, one would need to add and commit the change before pushing - this is missing from the document.

Missing typing

The package has no typing. Additionally, many functions use variable names that are very hard to interpret, e.g., def _predict(m, P, f, Q, u, g_ev, g_cov) in dynamax.generalized_gaussian_ssm.inference. The combination of no typing and terse variable naming makes the code challenging to read, particularly for new users and/or those new to SSMs. I would strongly encourage the authors to either use more descriptive variable names or add typing to their code. JaxTyping is excellent for this.
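
To illustrate the kind of change meant here, below is a hypothetical rewrite of such a predict step with descriptive names and standard type hints. This is a scalar toy, not dynamax's actual _predict: the function name, signature, and finite-difference linearization are all illustrative.

```python
from typing import Callable

def predict_mean_and_cov(
    prior_mean: float,
    prior_cov: float,
    dynamics_fn: Callable[[float, float], float],
    dynamics_noise_cov: float,
    control_input: float,
) -> tuple[float, float]:
    """One EKF-style predict step for a scalar state (illustrative only).

    Compare with the original signature `_predict(m, P, f, Q, u, ...)`:
    each argument now says what it is.
    """
    # Linearize the dynamics around the prior mean via a finite difference.
    eps = 1e-6
    jacobian = (
        dynamics_fn(prior_mean + eps, control_input)
        - dynamics_fn(prior_mean - eps, control_input)
    ) / (2 * eps)
    predicted_mean = dynamics_fn(prior_mean, control_input)
    predicted_cov = jacobian * prior_cov * jacobian + dynamics_noise_cov
    return predicted_mean, predicted_cov

# Linear dynamics x' = 0.9 x + u: mean ≈ 0.9, cov ≈ 0.9 * 0.5 * 0.9 + 0.1
mean, cov = predict_mean_and_cov(1.0, 0.5, lambda x, u: 0.9 * x + u, 0.1, 0.0)
```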

Incorrect typing

Some of the typing that does exist is done incorrectly. Take ParamsGGSSM.initial_mean in dynamax.generalized_gaussian_ssm.models as an example: the shape annotation on initial_mean is incorrect, as outlined in the JaxTyping documentation. A change to the effect of

- initial_mean: Float[Array, "state_dim"]
+ initial_mean: Float[Array, " state_dim"]

should be made.

Suggested improvements

  • Large repo size
  • Tighter Python bounds
  • Package management

Large repo size

You have some very large git pack files:

❯ du -ah dynamax | sort -rh | head -n 10

224M	dynamax
209M	dynamax/.git/objects/pack
209M	dynamax/.git/objects
209M	dynamax/.git
208M	dynamax/.git/objects/pack/pack-aaa194639683ed252462b75ee0ec2203edc5c09f.pack

You may consider running git gc to clean these up unless you need them.

Tighter Python bounds

I would suggest tighter Python bounds. Currently your lower bound is 3.6, a version considered end-of-life by the Python org: https://devguide.python.org/versions/#unsupported-versions
As an aside, I appreciate that the supported versions are specified in your setup.py, but explicitly stating them in your project's README would be nice.

Package management

It is strongly advised to migrate setup.py and setup.cfg files to a pyproject.toml. I would encourage the authors to do this in order to future-proof Dynamax.
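
A minimal sketch of what such a pyproject.toml might look like (the metadata and version constraints below are illustrative, not dynamax's actual configuration; only the numpy<2.0 pin reflects something mentioned elsewhere in this thread):

```toml
# Hypothetical pyproject.toml -- fields are illustrative placeholders.
[build-system]
requires = ["setuptools>=64"]
build-backend = "setuptools.build_meta"

[project]
name = "dynamax"
version = "0.1.5"
description = "Probabilistic state space modeling with JAX"
requires-python = ">=3.9"
dependencies = [
    "jax",
    "numpy<2.0",
]

[project.optional-dependencies]
dev = ["pytest"]
```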

@osorensen
Member

Thanks a lot for your thorough review, @thomaspinder. @slinderman, you're welcome to start addressing the issues whenever you like.

@slinderman

Thank you, @osorensen and @thomaspinder! We will start working on these issues and suggestions ASAP, and I'll keep you posted.

@gdalle

gdalle commented Sep 12, 2024

Meanwhile, I'm making my way through my own review. I will gather my remarks about the code and documentation in two issues on the dynamax repo.

I might be limited by my very ancient Python skills for installation and testing of dynamax, so I'd appreciate a hand in figuring out the errors I observe.

As for the paper itself, it is very clear and does a good job of introducing dynamax. I have three minor suggestions:

Dynamax supports canonical SSMs and allows the user to construct bespoke models as needed.

Can you give more details on how this is implemented API-wise? For instance, how generic can observation distributions be?
This question was a major motivation for my own take on HMMs, coded in Julia (https://joss.theoj.org/papers/10.21105/joss.06436). Hopefully I didn't misrepresent dynamax in the state of the art there.

Dynamax provides a unique combination of low-level inference algorithms and high-level modeling objects that can support a wide range of research applications in JAX.

Would it be possible to provide concrete examples of what is missing from the previous libraries? Obviously JAX support is a big aspect, since I know that some of them are coded in NumPy (hmmlearn) or PyTorch (pomegranate).

Parallel message passing routines that leverage GPU or TPU acceleration to perform message passing in sublinear time.

What do you mean by sublinear time? Isn't it just (roughly speaking) total sequential time divided by amount of parallelism?
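
(For context on the complexity question: in the parallel-scan formulation of filtering, "sublinear" usually refers to the O(log T) span, i.e. the number of dependent parallel steps, while total work remains at least linear. A toy pure-Python inclusive scan that counts parallel steps, rather than total operations, illustrates the distinction; this is a Hillis–Steele sketch, not dynamax's implementation:)

```python
import operator

def scan_with_depth(xs, op):
    """Hillis-Steele inclusive scan, tracking the number of *parallel* steps.

    On parallel hardware each step applies `op` to all positions at once,
    so the depth is ceil(log2(T)) even though total work is O(T log T).
    """
    xs = list(xs)
    depth = 0
    offset = 1
    while offset < len(xs):
        xs = [
            xs[i] if i < offset else op(xs[i - offset], xs[i])
            for i in range(len(xs))
        ]
        offset *= 2
        depth += 1
    return xs, depth

result, depth = scan_with_depth(range(1, 9), operator.add)
print(result)  # → [1, 3, 6, 10, 15, 21, 28, 36] (prefix sums)
print(depth)   # → 3 == log2(8): wall-clock steps, not total work
```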

@gdalle

gdalle commented Sep 13, 2024

I'm halfway through the documentation and a significant number of code examples are broken by the Numpy 2.0 release (about 1 per page). I think it would be a good idea to fix them before I make a second (complete) pass.

@osorensen
Member

@slinderman, ref the post by @gdalle, could you please follow up and ping us here when done?

@slinderman

Hi @osorensen, thanks for the reminder. @gdalle, sorry for the rendering issue. I thought we had fixed this by pinning to Numpy<2.0, but somehow this slipped through the cracks. I will fix it and let you know when it's ready for another pass.

@osorensen
Member

@slinderman any updates on this?

@slinderman

slinderman commented Oct 28, 2024

Hi @osorensen, sorry for the delay. We pinned Numpy < 2.0 and recreated the documentation, and now almost all of the rendering issues should be fixed. There is a known issue with the HMC notebook (see probml/dynamax#384), but the rest look good.

Regarding the overall review progress, we have begun work on the feedback we received from @thomaspinder. It is taking more time than anticipated to fix the typing issues, but we aim to wrap that up in the coming weeks. We greatly appreciate the reviewers' feedback, and we appreciate their patience while we work to address their comments.

@osorensen
Member

Thanks for the update, @slinderman

@gdalle

gdalle commented Nov 1, 2024

@slinderman thanks for the fixes. Can you tag a new release containing them, so that running the documentation examples does not require cloning the repo?

@slinderman

Hi @gdalle, of course. Please see https://github.com/probml/dynamax/releases/tag/0.1.5, which is now available on PyPI as well.
