Python Segfault on MacOS #1720
Comments
Hi @drpjm is this from PyPI or compiled from main? |
@ProfFan I tried to build from source and also use PyPI. |
Coming very late to this conversation. I did most of the book using python 3.9, and there all tests succeed. But I am seeing segfaults with 3.12. I will try 3.10 and then 3.11, and see whether I can track down the issue. |
Python 3.10 works (at least all tests pass without segfault) |
OK, repro with Python 3.11.9:
|
@ProfFan @varunagrawal any ideas? Maybe we need to upgrade pybind? |
Might need to run the thing within LLDB and see what is happening |
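A lighter-weight complement to LLDB is Python's standard `faulthandler` module, which prints the Python-level traceback when the interpreter receives a fatal signal such as SIGSEGV. This is a different technique from the LLDB session suggested above: it shows which Python call was active, not the C++ frames. A minimal sketch, where the helper name `enable_crash_traceback` and the log path are illustrative assumptions:

```python
import faulthandler


def enable_crash_traceback(log_path):
    """Dump the Python traceback to log_path if the process crashes.

    Hypothetical helper for illustration; only faulthandler.enable()
    is part of the standard library.
    """
    # The file must stay open for the lifetime of the process, because
    # faulthandler writes to its file descriptor from the signal handler.
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling this at the top of a failing test script, before `import gtsam`, should leave a traceback in the log file after the crash. The same handler can be enabled without touching the script by running it with `python -X faulthandler`.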
Would you be willing to upgrade pybind and give it a try? |
I forget exactly where to do it, please put me on the review so I can do it the next time. |
I had upgraded Pybind11 2 months ago. I'll take a closer look later today. |
My quick recommendation would be to try upgrading to numpy 2.0.0? IIRC there is backwards compatibility with numpy V1, but the symptoms described indicate that maybe numpy 2.0.0 is already being used and it's the latest gtsam python build that needs to be used. @drpjm can you please report your numpy version here? You can get it with |
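One way to collect the versions being asked about is sketched below. It is not necessarily the command the comment referred to; it uses only the standard-library `importlib.metadata`, and the helper names and the default package list are assumptions chosen for this thread.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_report(packages=("numpy", "gtsam", "pybind11")):
    """Map each package name to its installed version (None if missing)."""
    return {name: installed_version(name) for name in packages}
```

Printing `version_report()` inside the environment that crashes gives a compact summary to paste into the issue.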
Cool, thanks @varunagrawal . could you also tell me the PR where this version of wrap was then included into GTSAM? (Submodule or subtree? I forget) |
Here you go: #1773 |
@drpjm I re-ran the current version of |
Haven't heard back from @drpjm so I will close this for now since I can't reproduce this. If you're still having issues, please feel free to reopen. |
@varunagrawal Been very busy and had to track down the code that segfaults. I can add you as a collaborator to try it out. I was using |
Wait, @varunagrawal - I have reproduced the segfaults with Python 3.11.9, so I'm re-opening. |
I am running with numpy 2.0.1, still segfaults:
(py311) $ /Users/dellaert/mambaforge/envs/py311/bin/python /Users/dellaert/git/github/python/gtsam/tests/test_Factors.py
.Segmentation fault: 11
(py311) $ pip show numpy | grep Version
Version: 2.0.1 |
@ProfFan or @varunagrawal, with lldb I get the output below, which is mildly useless. I get unnamed symbols even when compiling GTSAM with Debug. Is that flag propagated correctly to wrap?
|
OK, after blasting away all my libraries, I have symbols: test_Factors fails with this, immediately:
and test_Cal3Fisheye fails with this:
Both seem boost serialization related! |
I set up a 3.11.9 environment on my M1 mac and I am again not able to repro. :( All tests pass here. Could it be the way boost is installed? Mine is via homebrew. |
Let me see what I can do, from what I see Python is from |
Can't reproduce the crash on develop. This is with boost 1.86 (Homebrew), Python 3.11 on conda-forge and numpy 2.0. However the PyPI version does crash. @dellaert Did you reproduce the crash with develop? |
Yeah, this is on develop, and boost 1.86 from brew, and now latest
numpy. It could be an installation problem: sometimes I get symbols,
sometimes I don’t. But 3.9 and 3.10 just work. Still, let me try and
completely blast out my build folder - I do notice that “clean” does not
clean everything.
On August 25, 2024, GitHub ***@***.***> wrote:
Can't reproduce the crash on develop. This is with boost 1.86
(Homebrew), Python 3.11 on conda-forge and numpy 2.0.
However the PyPI version does crash. @dellaert
<https://github.com/dellaert> Did you reproduce the crash with
develop?
|
Here is one possible issue. cmake says:
so it does not seem to pick up on the pybind included with wrap... |
Should be up-to-date enough? What does this say on your computer? |
But, pybind11 is included in wrap, so the bigger issue is: why does our cmake not use that one. It should not pick up on the brew one, right? |
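Interesting. Mine says
//Value Computed by CMake
pybind11_BINARY_DIR:STATIC=/Users/varunagrawal/borglab/gtsam/build/python/pybind11
//The directory containing a CMake configuration file for pybind11.
pybind11_DIR:PATH=pybind11_DIR-NOTFOUND
//Value Computed by CMake
pybind11_IS_TOP_LEVEL:STATIC=OFF
//Value Computed by CMake
pybind11_SOURCE_DIR:STATIC=/Users/varunagrawal/borglab/gtsam/wrap/pybind11
which possibly explains the issue. |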
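To compare which pybind11 a build picked up, the `pybind11_*` entries can be pulled out of `CMakeCache.txt`, whose entries have the form `NAME:TYPE=VALUE`. The following is a rough sketch, not part of GTSAM or wrap; the function name `read_cmake_cache` is an assumption made for illustration.

```python
def read_cmake_cache(text, prefix="pybind11"):
    """Collect NAME:TYPE=VALUE cache entries whose name starts with prefix."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the '//' and '#' comment lines CMake writes.
        if not line or line.startswith(("//", "#")):
            continue
        name_and_type, sep, value = line.partition("=")
        if not sep:
            continue
        # Drop the ':TYPE' suffix to keep only the variable name.
        name = name_and_type.split(":", 1)[0]
        if name.startswith(prefix):
            entries[name] = value
    return entries
```

Passing in the contents of the cache file from the build directory returns a small dictionary. A `pybind11_SOURCE_DIR` that points inside the `wrap` directory suggests the bundled copy is in use, while a `pybind11_DIR` that points at a Homebrew prefix suggests an external install was found instead.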
I made a PR since this is easy to fix via CMake. @dellaert can you please try it out? |
I'll try. In the meantime I'm also trying to create an M1 CI run, to see if the issue is reproducible on github runners |
Thanks @varunagrawal ! |
@dellaert Would I compile from source or install with pip? |
Build from source. |
I still wonder why this fixed the issue. |
I had another pybind installed using brew and it picked up on that. When
I ran make again with the changes, a lot of different flags appeared in
the cmake settings as well, indicating it now hooked up to our version.
On August 26, 2024, GitHub ***@***.***> wrote:
> Interesting. Mine says
>
> //Value Computed by CMake
> pybind11_BINARY_DIR:STATIC=/Users/varunagrawal/borglab/gtsam/build/python/pybind11
> //The directory containing a CMake configuration file for pybind11.
> pybind11_DIR:PATH=pybind11_DIR-NOTFOUND
> //Value Computed by CMake
> pybind11_IS_TOP_LEVEL:STATIC=OFF
> //Value Computed by CMake
> pybind11_SOURCE_DIR:STATIC=/Users/varunagrawal/borglab/gtsam/wrap/pybind11
>
> which possibly explains the issue.
>
I still wonder why this fixed the issue. pybind11 is header-only and
these variables look totally legit to me...
Also I have the same config as @dellaert
<https://github.com/dellaert> and cannot reproduce.
|
Closing as complete. |
Well, I still have the segfault in linux (pop-os 22.04). The
(I haven't tried all the examples). I have however successfully been able to run examples if I use the following combination (had to match the python + matplotlib + numpy versions and relative dependencies):
This issue seems to be all about MacOS, but it should apply to Linux as well. Yet I still get the segfault unless I downgrade as mentioned. |
Description
I was running through the robotics text on performing MAP with multiple sensors, and when computing the unnormalized posterior from a DiscreteConditional likelihood, I get a segfault. This is running in Python 3.11.6, Mac OSX 14.3. Mac OS reports the following:
Steps to reproduce
I am running this in a Python script, not a Jupyter notebook. I have a conductivity sensor based on the DiscreteConditional in the robotics textbook in Chapter 2.4.4.
The segfault occurs when I run something similar to the example in Chapter 2.4.10.
Expected behavior
I would expect that the posterior is computed without crashing when multiplying out the likelihood factors and prior.
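For reference, the computation expected here is a pointwise product of the prior and the likelihood over the discrete states, followed by normalization. The sketch below shows that arithmetic in plain Python with dictionaries. It does not use the GTSAM API, and the category names and probabilities are hypothetical values, not taken from the textbook.

```python
def unnormalized_posterior(prior, likelihood):
    """Multiply prior P(c) by likelihood L(c) = P(z | c) for each state c."""
    return {c: prior[c] * likelihood[c] for c in prior}


def normalize(factor):
    """Scale a non-negative factor so that its values sum to one."""
    total = sum(factor.values())
    return {c: value / total for c, value in factor.items()}


# Hypothetical numbers for illustration only.
prior = {"cardboard": 0.5, "can": 0.25, "bottle": 0.25}
likelihood = {"cardboard": 0.1, "can": 0.8, "bottle": 0.1}
posterior = normalize(unnormalized_posterior(prior, likelihood))
```

In GTSAM the same product is what the `*` operator on the discrete factors is expected to produce, so a crash at that step points at the bindings, not at the math.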
When I use a DecisionTreeFactor to represent a continuous sensor model, this crash does not occur. So it appears that there is a problem with the DiscreteConditional python object when using the * operator. It looks like it happens for any combination of DiscreteConditional or DecisionTreeFactor.
Environment
Python 3.11.6, Mac OSX 14.3 with Apple silicon (M2)