Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upgrade Pybind11 to v2.11.1 #160

Merged
merged 2 commits into from
Sep 5, 2023
Merged

Upgrade Pybind11 to v2.11.1 #160

merged 2 commits into from
Sep 5, 2023

Conversation

varunagrawal
Copy link
Collaborator

@varunagrawal varunagrawal commented Aug 31, 2023

Script has a vulnerability alert which affects Pybind11 tests and is causing dependabot alerts. Since Pybind11 has updated their code, this is a good opportunity to update it for the wrapper as well.

This is a patch version update so everything is backwards compatible.

I also compiled the GTSAM wrapper with these changes and everything looks good so far.

I am considering making this a git submodule so that pulling becomes easier rather than copy-pasting.
@ProfFan @gchenfc thoughts?

@varunagrawal varunagrawal self-assigned this Aug 31, 2023
@gchenfc
Copy link
Member

gchenfc commented Aug 31, 2023

I'm not sure my opinion should matter all that much, but if you're asking, here it is. :)

I don't know if constantly staying up to date with the latest pybind is really a significant and common enough issue to warrant changing the current system. I think this argument requires 2 parts:

1. The cost of changing to git submodules

Even though it seems negligible, I do think it adds a bit of friction for users not as comfortable with git to need to clone the submodule, and adding the command in the README may still cause a non-negligible amount of confusion. I myself was burned by this just last week when I blindly git cloned the apriltag ios repo without realizing it had a submodule. I had already copy-pasted the apriltag repo in before realizing an hour later that it was already present as a submodule. Copying and deleting then re-pulling only took 5 minutes, but it was an inconvenience that really wasn't necessary.

2. Lack of compelling advantages of changing to git submodules

First, I see submodules as most useful when there are bidirectional updates: both pulling from upstream but also submitting your own changes upstream. I certainly don't see us pushing PRs to pybind or every really touching the pybind code itself so why use a submodule?

Second, I see the "tedious" copy-pasting for version bumps as a positive side-effect, since unless there is a real compelling reason to upgrade to the newest version, it should be best to stick with the existing version. Upgrading introduces more room for unintended bugs and issues on different platforms, so IMO it should really only be done if there is a good reason, and lowering the barrier of upgrading is counter-productive to making upgrading a very intentional decision.
As an example, it would be helpful in this PR to describe the reasons for upgrading, and all the byproducts or changes that need to happen both positive and negative. Seeing as the changes in this PR are only to the pybind directory and it doesn't link any issues or future plans, I'm not sure what the pros are. But seeing as the changelog lists that the minimum cmake version has changed, that's a potential con.


I appreciate your work in upgrading this and appreciate the thoughts you have given towards making this maintainable. This is my personal opinion and I'm open to discussion.

In summary, I don't support changing pybind to a git submodule because I don't see the advantages but I do see that it increases confusion.

I'm also unclear the motivation for upgrading pybind; if you could discuss that, that would be helpful.

@ProfFan
Copy link
Collaborator

ProfFan commented Aug 31, 2023

I 2nd Gerry's opinions, for the following reasons:

  1. git submodule is not supported for GitHub to make release tarballs (original reason why I did not use them)
  2. if we don't vendor anything, it's great and (1) will not apply anymore but -
  3. if we are going to remove the vendor-ed 3rd party stuff, then we should use the pattern of find_package...FetchContent() and be explicit about which copy is used in the logs

@varunagrawal
Copy link
Collaborator Author

I'm with you on the git submodule bit. My git foo is strong so I find it not too bad to work with but I've been similarly burned multiple times. Figured I'd get multiple opinions before forcing changes down people's throats.

The main reason for the upgrade is that there is a security alert for scipy on the wrap module in GTSAM. I take security alerts pretty seriously and this has been months overdue.

@varunagrawal
Copy link
Collaborator Author

@ProfFan gotcha. I'm with both of you two on this.

I wish we had something like npm or bundler for C++.

@varunagrawal
Copy link
Collaborator Author

Can I get an approval please? There's no git submodule used here, just a simple version upgrade.

@varunagrawal
Copy link
Collaborator Author

PS this PR doesn't add git submodules, just updates Pybind11 to the latest stable version.

@varunagrawal varunagrawal merged commit 390cb80 into master Sep 5, 2023
8 checks passed
@varunagrawal varunagrawal deleted the upgrade-pybind11 branch September 5, 2023 15:53
@varunagrawal
Copy link
Collaborator Author

I have a bit more time to write a response to this so I'm doing it now for posterity.

1. The cost of changing to git submodules

Even though it seems negligible, I do think it adds a bit of friction for users not as comfortable with git to need to clone the submodule, and adding the command in the README may still cause a non-negligible amount of confusion. I myself was burned by this just last week when I blindly git cloned the apriltag ios repo without realizing it had a submodule. I had already copy-pasted the apriltag repo in before realizing an hour later that it was already present as a submodule. Copying and deleting then re-pulling only took 5 minutes, but it was an inconvenience that really wasn't necessary.

I humbly disagree with this. Even if you fail to initialize a github repository with the submodules, you can always do it post hoc with the git submodule update command. Adding it to the README shouldn't cause any confusion because the process is opaque and doesn't lead to any differences whether we use a submodule or not. At the end of the day, you get the same repo and file structure, and your compiler is happy.

I would argue that the lack of use of submodules is what caused your issue with the apriltag repo, specifically inexperience. You could say the same thing about rebasing in git, or using CMake. If you're going to avoid using it, how will you ever know what is the right thing to do and gain the necessary experience?

Speaking of inconvenience, it is 5 minutes of inconvenience for you, but 5 hours of inconvenience for the devs if they don't use a submodule. For this PR, I had to manually look at the files to commit, rather than simply reading the commit history that would have been captured via the submodule. And instead of copy-pasting, I would have preferred to run git checkout v2.11.1 which takes 10 seconds of my time instead of 10 minutes. That's not a fair trade-off IMO.

2. Lack of compelling advantages of changing to git submodules

First, I see submodules as most useful when there are bidirectional updates: both pulling from upstream but also submitting your own changes upstream. I certainly don't see us pushing PRs to pybind or every really touching the pybind code itself so why use a submodule?

This isn't true. A submodule lets you add an "internal" repo to an existing repo, and there is no expectation of pushing anything back. I think you may be confusing this with a fork? Instead the semantics of an internal repo allow us to quick and easily move to a specific commit (in this case, the commit that updates the scipy version requirement) and that history is captured within the PR without the PR showing 150+ files being added. This way you can use additional tools like git-bisect to find out which commit introduced a bug or move versions up and down to find the one that best suits your current state of affairs.

And who knows, maybe we get some collaborator who decides to push to pybind to make wrapping easier for us? It isn't outside the realm of possibility.

Second, I see the "tedious" copy-pasting for version bumps as a positive side-effect, since unless there is a real compelling reason to upgrade to the newest version, it should be best to stick with the existing version.

You seem to have a lot of time on your hands. Probably should have Frank email you about various problems. 😂
On a serious note, I disagree. It is tedious, and it is poor practice. Along this line of argument, you could go on to say that why use git at all? Why not just copy-paste different versions of folders, wrap-v1, wrap-v2, wrap-v2.1, etc?

Upgrading introduces more room for unintended bugs and issues on different platforms, so IMO it should really only be done if there is a good reason, and lowering the barrier of upgrading is counter-productive to making upgrading a very intentional decision.

Upgrading introduces bugs, but also introduces far more improvements. I fully expect lots of open issues to be resolved and the pybind code to be faster and resource intensive than a previous version. A good example of this is boost. I have found so many issues with boost, such as bugs in serialization, that upgrading was the only option. The maintainers of boost were not going to accept a patch back to version 1.65, while they were focusing on version 1.80, and the onus of the patch falls onto us. Another example is fixing of Python wheels for Python 3.8. We had to do this manually previously when building the wheels, but PyPA fixed this issue in setuptools so that's one less thing for us to worry about.

Additionally, newer versions are better supported in terms of package managers. I am able to install packages and their binaries in minutes using either apt, pacman or brew. Taking boost as an example again, currently v1.73 is available to install via homebrew and this takes 2 minutes to install, instead of spending 25 minutes to compile and install (just look at how long the gtsam CI takes). Unfortunately, since gtsam requires supporting 1.65, we have to wait an additional 30 minutes for the CI to run and if something breaks after the boost install, then we have to do this dance once again.
If we decide to go the CMake fetch_package route, or some other means of obtaining pybind that doesn't have it vendored in this repo, you can see why upgrading is a great choice.
I can keep going on with examples of why newer versions of Python, Ubuntu, Pytorch, etc are all better in various ways.

Lowering the barrier of upgrading is also tangential to deciding to do it. Deciding to upgrade is always a reactive one; something breaks or a bug gets exposed, or in this case there is a security alert. Once the decision is made after careful discussion (which was had on the GTSAM dependabot PR), should one abandon it because it is "too much work"? That seems ridiculous to me, especially if someone new comes along and isn't as familiar with these various systems.

In summary, I don't support changing pybind to a git submodule because I don't see the advantages but I do see that it increases confusion.

This most definitely seems like a chicken-and-egg problem to me. This is the git equivalent of "need experience to get a job, need a job to get experience". I don't believe someone as smart as Linus Torvalds would have added the submodule feature without immense thought to its advantages.

I'm also unclear the motivation for upgrading pybind; if you could discuss that, that would be helpful.

This is my mistake. I have updated the PR message to specify what the issue being resolved it. I for some reason thought you had seen the dependabot alert on GTSAM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants