The new scikit-build-core setup copies external shared objects into Python wheel #717
Comments
TileDB-Vector-Search is also being updated to not always copy the shared objects into the wheel (TileDB-Inc/TileDB-Vector-Search#361).
Now that tiledb-vcf-feedstock has been updated to 0.32.0 (TileDB-Inc/tiledb-vcf-feedstock#120), the first release to use the new scikit-build-core setup, the tiledbvcf-py conda binaries have ballooned in size because they now vendor libtiledb, libtiledbvcf, and htslib. As a concrete example, linux-64/tiledbvcf-py-0.31.1-py39h1dd0e15_0.conda is 2.0 MB and linux-64/tiledbvcf-py-0.32.0-py39h59b0bc9_0.conda is 9.4 MB.
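One way to see what is being vendored is to list the shared objects inside a built wheel. The following is a minimal sketch that assumes a standard zip-format wheel; the function name `list_shared_objects` and the wheel path in the usage comment are hypothetical, and it does not read `.conda` archives.

```python
import re
import zipfile

# Matches Linux (.so, .so.2.23), macOS (.dylib) and Windows (.dll, .pyd) binaries.
SHARED_OBJECT = re.compile(r"\.(so(\.\d+)*|dylib|dll|pyd)$")


def list_shared_objects(wheel_path):
    """Return sorted (filename, size_in_bytes) pairs for shared objects in a wheel."""
    with zipfile.ZipFile(wheel_path) as wheel:
        return sorted(
            (info.filename, info.file_size)
            for info in wheel.infolist()
            if SHARED_OBJECT.search(info.filename)
        )


# Hypothetical usage:
# for name, size in list_shared_objects("dist/tiledbvcf-0.32.0-cp39-cp39-linux_x86_64.whl"):
#     print(f"{size:>12,d}  {name}")
```

The extension module itself will always appear in the listing; any additional libtiledb, libtiledbvcf, or htslib entries are the vendored copies.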
Not only does this duplication increase the size of our cloud Docker images, but it will complicate future libtiledb updates. If we release libtiledb 2.23.1, all the other conda binaries will automatically use the new libtiledb 2.23.1, but presumably tiledbvcf-py will continue to use its vendored libtiledb 2.23.0.
That is problematic. We need to do our best to ensure a single libtiledb is used and loaded. This simplifies passing different structures back and forth in Python (i.e. creating a
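To check whether more than one libtiledb ends up loaded in a process, the memory map can be inspected on Linux. This is a rough sketch under stated assumptions: it only works where `/proc/self/maps` exists, the helper name `loaded_copies` is hypothetical, and the default pattern assumes the library file is named `libtiledb.so*`.

```python
import os
import re


def loaded_copies(maps_text, name_pattern=r"^libtiledb\.so"):
    """Return the distinct mapped file paths whose basename matches name_pattern.

    maps_text is the content of /proc/<pid>/maps (Linux only).
    """
    matcher = re.compile(name_pattern)
    paths = set()
    for line in maps_text.splitlines():
        # Columns: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) == 6:
            path = fields[5].strip()
            if matcher.search(os.path.basename(path)):
                paths.add(path)
    return sorted(paths)


# Hypothetical usage, after importing the packages of interest:
# import tiledb, tiledbvcf
# with open("/proc/self/maps") as maps:
#     print(loaded_copies(maps.read()))
```

More than one path in the result would suggest that a vendored copy and the environment's copy are both loaded.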
I suspect this is the cause of the user-reported error in https://forum.tiledb.com/t/tiledbvcf-installation-error-on-macos/710. When tiledbvcf-py is built, it builds against whatever libgoogle-cloud version upstream tiledb is currently pinned to. It then vendors this

Hence the short-term solution is to downgrade libgoogle-cloud until you find the compatible one that your version of tiledbvcf-py was built against (Azure doesn't keep old build logs, so trial and error is the only option).

Long-term, we need to stop copying the shared objects into the Python package, as we've already done for TileDB-Vector-Search (TileDB-Inc/TileDB-Vector-Search#361) and TileDB-Py (TileDB-Inc/TileDB-Py#1988). Maybe @dudoslav can help with this.
There are various situations where we want to be able to build tiledbvcf-py against an existing external `libtiledbvcf.so`:

This is the same situation that we previously addressed for tiledbsoma-py in single-cell-data/TileDB-SOMA#1937 and single-cell-data/TileDB-SOMA#2221. Unfortunately, tiledbsoma-py uses `setup.py`, so I can't directly apply the previous solution to the scikit-build-core setup we are now using for tiledbvcf-py.

I think two things need to happen:

`RUNPATH` (so that `libtiledbvcf.cpython-3XX-x86_64-linux-gnu.so` can still find the external `libtiledbvcf.so` at runtime)

Here is a reprex to demonstrate the current shared object copying behavior:
xref: #701, #702
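The `RUNPATH` point can be checked on a built extension module by reading its dynamic section. The sketch below is not the original reprex; it assumes Linux with binutils' `readelf` on the `PATH`, and the helper names `parse_search_paths` and `search_paths` are hypothetical.

```python
import re
import subprocess

# `readelf -d` prints entries such as:
#  0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN/../lib]
_ENTRY = re.compile(r"\((RUNPATH|RPATH)\)\s+Library (?:runpath|rpath): \[(.*)\]")


def parse_search_paths(readelf_output):
    """Extract RUNPATH/RPATH directory lists from `readelf -d` output."""
    found = {}
    for line in readelf_output.splitlines():
        match = _ENTRY.search(line)
        if match:
            found[match.group(1)] = match.group(2).split(":")
    return found


def search_paths(shared_object):
    """Run `readelf -d` on a shared object and parse its RUNPATH/RPATH."""
    result = subprocess.run(
        ["readelf", "-d", str(shared_object)],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_search_paths(result.stdout)
```

An empty result for the extension module would mean the dynamic loader has to rely on its default search paths or `LD_LIBRARY_PATH` to find the external `libtiledbvcf.so`.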