Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes to CMake scripts #950

Merged
merged 14 commits into from
May 9, 2022
Merged

Fixes to CMake scripts #950

merged 14 commits into from
May 9, 2022

Conversation

joaander
Copy link
Member

@joaander joaander commented Apr 29, 2022

Description

Make a number of fixes to the CMake scripts.

Motivation and Context

My goal is to provide a working macos-arm64 build of freud on conda-forge. The crucial fix is to change from the scikit-build provided python_extension_module to a custom function. This is needed because:

  1. Linking to libpython3.X.dylib causes segmentation faults on import:
(lldb) run -c "import freud"                                                                
Process 69557 launched: '/Users/joaander/miniforge3/bin/python3' (arm64)                    
Process 69557 stopped                                                                                                                                                                    
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10)                                                                               
    frame #0: 0x0000000101411bd0 libpython3.9.dylib`_PyObject_GC_Alloc + 56                                                                                                              
libpython3.9.dylib`_PyObject_GC_Alloc:                                                                                                                                                   
->  0x101411bd0 <+56>: ldr    x22, [x19, #0x10]                                                                                                                                          
    0x101411bd4 <+60>: add    x2, x1, #0x10             ; =0x10                                                                                                                          
    0x101411bd8 <+64>: adrp   x8, 311                                                                                                                                                    
    0x101411bdc <+68>: add    x8, x8, #0xa90            ; =0xa90                                                                                                                         
Target 0: (python3) stopped.                                                                                                                                                             
(lldb) bt                                                                                                                                                                                
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10)                                                                                        
  * frame #0: 0x0000000101411bd0 libpython3.9.dylib`_PyObject_GC_Alloc + 56                                                                                                              
    frame #1: 0x00000001012f67a0 libpython3.9.dylib`PyTuple_New + 284                                                                                                                    
    frame #2: 0x00000001012fb4c0 libpython3.9.dylib`PyType_Ready + 284                                                                                                                   
    frame #3: 0x00000001012fb470 libpython3.9.dylib`PyType_Ready + 204                                                                                                                   
    frame #4: 0x00000001012dd564 libpython3.9.dylib`PyModuleDef_Init + 32                                                                                                                
    frame #5: 0x000000010019b1a4 python3`_imp_create_dynamic + 2376
  1. Linking libpython3.X.dylib is not necessary as the functions it provides are present in the already loaded python3 image.

I also had to change from MODULE to SHARED library builds, as the conda-forge installation script automatically links modules to libpython3.X.dylib on installation.

For reasons I don't understand, these changes are not necessary when building from source in a native macos-arm64 environment. The problems only occur up in cross-compiled builds such as those made by conda-forge.

Additionally, I removed LD_LIBRARY_PATH from the TBB search path. LD_LIBRARY_PATH is a runtime search path. Users that wish to use a specific TBB at compile time should use the provided cmake variables CMAKE_PREFIX_PATH or TBBROOT. In particular, this caused problems with cross-compilation on conda-forge.

How Has This Been Tested?

I built conda-forge packages on conda-forge/freud-feedstock#48, downloaded the build artifacts, and tested import freud on a macos-arm64 system. Once I had a working import, python -m pytest . in freud's tests directory passes.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds or improves functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation improvement (updates to user guides, docstrings, or developer docs)

Checklist:

  • I have read the CONTRIBUTING document.
  • My code follows the code style of this project.
  • I have updated the documentation (if relevant).
  • I have added tests that cover my changes (if relevant).
  • All new and existing tests passed.
  • I have updated the credits.
  • I have updated the Changelog.

@joaander
Copy link
Member Author

Fixes on this pull request might help a future cibuildwheel macos-arm64 build. However, such a build will require significant additional work, such as providing a cross-compiled TBB.

@joaander joaander marked this pull request as ready for review April 29, 2022 16:49
@joaander joaander requested a review from a team as a code owner April 29, 2022 16:49
@joaander joaander requested review from tommy-waltmann and removed request for a team April 29, 2022 16:49
Copy link
Collaborator

@tommy-waltmann tommy-waltmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple small things.

doc/source/gettingstarted/installation.rst Outdated Show resolved Hide resolved
freud/CMakeLists.txt Outdated Show resolved Hide resolved
Copy link
Collaborator

@tommy-waltmann tommy-waltmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks good. Are we able to start deploying macos-arm builds for conda-forge and pypi in the next release after these changes are merged?

@joaander
Copy link
Member Author

joaander commented May 2, 2022

I'll leave pypi support up to you. I'm not sure how much effort that will take to enable. I'm happy to test any build artifacts you create when testing.

I would like at least to see macos-arm64 conda-forge builds in the next release. The MoSDeF tools are currently unable to support macos-arm64 due to upstream dependencies, one of which is freud.

@vyasr
Copy link
Collaborator

vyasr commented May 2, 2022

@joaander if this is change isn't extremely time-sensitive, I'd appreciate keeping this open for a few days to let me review it. I have been doing some work on scikit-build, and if the issues that you are fixing are more broadly applicable I would like to see if at least some of them can/should be upstreamed. If the MoSDeF timeline is tight I can always review the changes later.

@joaander
Copy link
Member Author

joaander commented May 2, 2022

@joaander if this is change isn't extremely time-sensitive, I'd appreciate keeping this open for a few days to let me review it. I have been doing some work on scikit-build, and if the issues that you are fixing are more broadly applicable I would like to see if at least some of them can/should be upstreamed. If the MoSDeF timeline is tight I can always review the changes later.

No, this is not extremely time sensitive. Feel free to reach out on Slack if you have questions that are not answered in the description. I didn't consider pushing these upstream as they are a big change from the behavior of python_extension_module as implemented in scikit-build. The new behavior more closely matches that used by pybind11's pybind11_add_module, which I use without issue to cross compile HOOMD for macos-arm64: https://github.com/pybind/pybind11/blob/75007dda72ad4508064c1f080394eae50d3a61ee/tools/pybind11Tools.cmake#L140-L207.

@joaander
Copy link
Member Author

joaander commented May 6, 2022

Vyas, have you had a chase to look at these changes?

@vyasr
Copy link
Collaborator

vyasr commented May 6, 2022

I started looking at this last night. The LD_LIBRARY_PATH issue was clearly just an oversight on my part when first putting this together, so thanks for fixing that. Regarding the other changes, the libpython errors do seem confusing. I have some questions mostly to help me understand what's going and perhaps to pose some changes to conda-forge/scikit-build teams in the future:

  • Am I understanding correctly that on Windows all the libpython symbols are not automatically loaded when running the Python interpreter, so you have to link manually? That seems surprising no?
  • The MODULE->SHARED makes sense given your explanation. It seems like from what you're saying this is specific to the cross-compilation that conda-forge is doing. My first thought was that the cross-compiling needs to link in libpython at link time rather than relying on dynamic loading, but that can't be the whole story because then I would expect to see the same problem on other architectures. Is it possible that ARM-based Macs aren't using dlopen at all, but are instead loading the modules in a different way that is causing the issue?
  • My guess as to why you had to change to a custom function would be the function target_link_libraries_with_dynamic_lookup defined in this file in scikit-build. It seems plausible that it isn't correctly deciding what to link to in the specific conda-forge case (or maybe it's linking to the right libraries but setting the wrong linker flags). Based on the errors that you saw, does anything there jump out at you?

Now that I've read this a little more in detail I'm fine pushing it forward in the near term. My only concern is whether architecture-specific linking logic is actually correct on all architectures. Could someone try installing the new conda-forge packages on an Intel Mac and a Linux box (and Windows if that's easy, although we don't need to be blocked on that) and make sure that the new packages work? If they do then this PR gets a 👍 from me.

@joaander
Copy link
Member Author

joaander commented May 6, 2022

  • Am I understanding correctly that on Windows all the libpython symbols are not automatically loaded when running the Python interpreter, so you have to link manually? That seems surprising no?

I think so, but I don't understand how dynamic library loading works on windows as well as I do for Linux. But this is the behavior that pybind11 uses in pybind11Config.cmake

    if(WIN32 OR CYGWIN)
      set_property(TARGET ${PN}::module APPEND PROPERTY INTERFACE_LINK_LIBRARIES ${PYTHON_LIBRARIES})
    endif()
  • The MODULE->SHARED makes sense given your explanation. It seems like from what you're saying this is specific to the cross-compilation that conda-forge is doing. My first thought was that the cross-compiling needs to link in libpython at link time rather than relying on dynamic loading, but that can't be the whole story because then I would expect to see the same problem on other architectures. Is it possible that ARM-based Macs aren't using dlopen at all, but are instead loading the modules in a different way that is causing the issue?

I'm less sure on this. I don't know why a conda install would change the libraries that a .so file links to. Maybe my conda install test pulled an older package out of cache? This change may not be needed, but I also don't think it hurts anything.

  • My guess as to why you had to change to a custom function would be the function target_link_libraries_with_dynamic_lookup defined in this file in scikit-build. It seems plausible that it isn't correctly deciding what to link to in the specific conda-forge case (or maybe it's linking to the right libraries but setting the wrong linker flags). Based on the errors that you saw, does anything there jump out at you?

It definitely produces binaries that link to libpython on both the Linux and Mac systems I tested - both on conda-forge and on local builds.

Now that I've read this a little more in detail I'm fine pushing it forward in the near term. My only concern is whether architecture-specific linking logic is actually correct on all architectures. Could someone try installing the new conda-forge packages on an Intel Mac and a Linux box (and Windows if that's easy, although we don't need to be blocked on that) and make sure that the new packages work? If they do then this PR gets a 👍 from me.

It should be correct, as it is the same logic used by pybind11 with HOOMD. I don't have the time to adjust that conda-forge PR to produce artifacts for all systems and then test it on all platforms. I'm confident it will work with a release build. If it doesn't, I will fix it.

@vyasr
Copy link
Collaborator

vyasr commented May 6, 2022

It should be correct, as it is the same logic used by pybind11 with HOOMD. I don't have the time to adjust that conda-forge PR to produce artifacts for all systems and then test it on all platforms. I'm confident it will work with a release build. If it doesn't, I will fix it.

Fair enough, nothing here seems particularly concerning so if you are OK with keeping an eye on conda installations once we cut the next release then we don't need to test the conda packages too extensively right now.

@vyasr
Copy link
Collaborator

vyasr commented May 6, 2022

@tommy-waltmann I'll let you finalize and merge this when you're happy.

@joaander
Copy link
Member Author

joaander commented May 9, 2022

It should be correct, as it is the same logic used by pybind11 with HOOMD. I don't have the time to adjust that conda-forge PR to produce artifacts for all systems and then test it on all platforms. I'm confident it will work with a release build. If it doesn't, I will fix it.

Fair enough, nothing here seems particularly concerning so if you are OK with keeping an eye on conda installations once we cut the next release then we don't need to test the conda packages too extensively right now.

Vyas, I added tests to the conda recipe itself: conda-forge/freud-feedstock#48 . When the build platform is the target platform, the conda build will now run all freud's tests with pytest to ensure that the built package functions.

@tommy-waltmann
Copy link
Collaborator

If I understand #961 correctly, there are some faililng tests that are due to changes introduced on this branch. With that in mind, let's fix them on this branch before merging. The failing tests appear to be floating point precision on the aarch64 architecture.

@joaander
Copy link
Member Author

joaander commented May 9, 2022

The tests are not failing due to changes introduced on this branch. Previous freud-feedstock builds did NOT run pytest, so these tests are being run for the first time in conda-forge/freud-feedstock#48 - per Vyas's suggestion that all packaged builds must be thoroughly tested on all platforms.

aarch64 compatibility in the tests is a different problem than fixing the CMake scripts to generate proper builds, so it should be handled in a different pull request.

Copy link
Collaborator

@tommy-waltmann tommy-waltmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good! Let's make another PR to fix the floating point issues on aarch64

@tommy-waltmann tommy-waltmann merged commit 76c4eb9 into master May 9, 2022
@tommy-waltmann tommy-waltmann deleted the cmake-updates branch May 9, 2022 16:02
@vyasr
Copy link
Collaborator

vyasr commented May 9, 2022

Awesome, thanks for adding those tests @joaander! I agree that fixing the architecture-specific floating point errors is out of scope for this PR, so disabling those tests on those specific architectures is fine with me. The tests that you added are sufficient to ensure that we're generating functional builds on all platforms, which was my main concern following these changes to the build.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants