Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add environment variable to disable writing extra dist-info files #8877

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

adisbladis
Copy link
Contributor

@adisbladis adisbladis commented Nov 7, 2024

Summary

This change introduces the UV_NO_EXTRA_DIST_INFO environment variable as a way to opt out of the extra dist-info files that uv is creating.

This is important to achieve reproducible builds in distribution packaging, allowing to replace usage of
installer with uv pip install.

At the time of writing these files are:

  • uv_cache.json Contains timestamps which are non-reproducible. These hashes also leak in to RECORD.

  • direct_url.json Contains the path to the installed wheel. While not non-reproducible it's not required for distribution packaging.

  • INSTALLER Again, not non-reproducible, but of no value in distribution packaging.

Test Plan

Automated test added.

@adisbladis adisbladis force-pushed the reproducible-builds-no-dist-info branch 6 times, most recently from aaa8644 to 0b53fc9 Compare November 7, 2024 05:16
@zanieb
Copy link
Member

zanieb commented Nov 7, 2024

Thanks for putting up the PR!

Note to other reviewers, we briefly discussed using an environment variable for this in Discord.

I don't have strong opinions on the name, but we might want to use UV_NO_EXTRA_DIST_INFO=1 rather than UV_EXTRA_DIST_INFO=0?

Otherwise, I'm not well suited to be the primary reviewer for this. Perhaps @konstin or @charliermarsh would be interested.

@@ -525,4 +525,7 @@ impl EnvVars {

/// Ignore `.env` files when executing `uv run` commands.
pub const UV_NO_ENV_FILE: &'static str = "UV_NO_ENV_FILE";

/// Skip writing `uv` cache & installer files to site-packages dist-info.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Set to false to skip writing uv cache & installer files to site-packages dist-info." ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no other env vars written in that style, and I believe the wording makes more sense since the name was swapped from UV_EXTRA_DIST_INFO to UV_NO_EXTRA_DIST_INFO, with the logic inverted from previously.

@adisbladis adisbladis force-pushed the reproducible-builds-no-dist-info branch from 0b53fc9 to 94ebfbb Compare November 7, 2024 05:33
@adisbladis
Copy link
Contributor Author

I don't have strong opinions on the name, but we might want to use UV_NO_EXTRA_DIST_INFO=1 rather than UV_EXTRA_DIST_INFO=0?

It made more sense to me when writing this to think about it the other way around, but it's indeed seems more consistent with other options to use UV_NO_EXTRA_DIST_INFO=1.

@adisbladis adisbladis force-pushed the reproducible-builds-no-dist-info branch 3 times, most recently from 7d992a0 to 64977b4 Compare November 7, 2024 05:50
@konstin
Copy link
Member

konstin commented Nov 7, 2024

What's the motivation for going through wheel installation for repackaging over re-zipping the wheel into the target format you are interested in?

@adisbladis
Copy link
Contributor Author

adisbladis commented Nov 7, 2024

What's the motivation for going through wheel installation for repackaging over re-zipping the wheel into the target format you are interested in?

Sorry, I don't understand the question? I'll explain my use case in more detail, hopefully we'll find some understanding.

The use case for this PR is to replace usage of installer with uv in distribution packaging scripts.
These scripts need to:

  • Invoke a Python build system, such as pypa/build which takes a source tree and outputs a wheel
    • Or download a pre-built wheel
  • Invoke an installer that can install said wheel into a prefix

In nixpkgs we use:

  • pypa/installer to install Python wheels.
  • pypa/build to build Python packages from source
  • In some rare cases download pre-built wheels

For distribution packaging we want reproducible builds, meaning that the outputs produced by a packaging script should be bit-for-bit identical.
Nix comes with tooling to perform these checks: nix-build ./. -A somePackage --check.

I have implemented an alternative Python build infrastructure for Nix where I use uv.
These builds end up failing reproducibility checks because of extra dist info files.

At some point in the future I'd like to replace the nixpkgs install hook with a uv implementation too. A requisite for that is that outputs are reproducible.

This change introduces the `UV_NO_EXTRA_DIST_INFO` environment variable
as a way to opt out of the extra dist-info files that `uv` is creating.

This is important to achieve reproducible builds in distribution
packaging, allowing to replace usage of
[installer](https://pypi.org/project/installer) with `uv pip install`.

At the time of writing these files are:
- `uv_cache.json`
    Contains timestamps which are non-reproducible.
    These hashes also leak in to the `RECORD` file.

- `direct_url.json`
    Contains the path to the installed wheel.
    While not non-reproducible it's not required for distribution packaging.

- `INSTALLER`
    Again, not non-reproducible, but of no value in distribution packaging.
@adisbladis adisbladis force-pushed the reproducible-builds-no-dist-info branch from ca26dd0 to 8313229 Compare November 12, 2024 00:39
@adisbladis
Copy link
Contributor Author

Ping @charliermarsh

@charliermarsh
Copy link
Member

I'm not super excited to maintain this but I see the value. I think we should call this "installer metadata" rather than "extra dist-info", since the latter is just a term we made up within the wheel installer crate. How's that sound?

@@ -45,6 +45,7 @@ pub fn install_wheel(
installer: Option<&str>,
link_mode: LinkMode,
locks: &Locks,
extra_dist_info: bool,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be after installer.

@@ -241,6 +243,7 @@ pub(super) async fn do_sync(
allow_insecure_host: &[TrustedHost],
cache: &Cache,
printer: Printer,
extra_dist_info: bool,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not gonna make the case that anything here is well-ordered but we do consistently end with cache and then printer. And generally, we try to put the global arguments at the bottom. I think this should be after logger maybe?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants