macOS 15 Sequoia clobbers _nixbld1-4 users #10892

Open
abathur opened this issue Jun 11, 2024 · 46 comments
Labels
installer macos Nix on macOS, aka OS X, aka darwin

Comments

@abathur
Member

abathur commented Jun 11, 2024

Note: I keep this first comment up-to-date. It includes fixes. You do not have to read the thread unless you are having trouble with its suggestions!

The macOS 15 Sequoia update takes 4 UIDs in the range we've been using, clobbering any _nixbldN users in the way (typically _nixbld1-4).

This manifests as:

  • On existing installs, build errors like: error: the user '_nixbld1' in the group 'nixbld' does not exist

    To fix this, run our migration script (which relocates/replaces _nixbld users) before or after taking the macOS 15 Sequoia update:

    curl --proto '=https' --tlsv1.2 -sSf -L https://github.com/NixOS/nix/raw/master/scripts/sequoia-nixbld-user-migration.sh | bash -
    

    Caution: If you installed Nix with a third-party installer, you should check with them for additional/different instructions.

  • On fresh installs (with unpatched installer versions), the following error creating user _nixbld1:

    <main> attribute status: eDSRecordAlreadyExists
    <dscl_cmd> DS Error: -14135 (eDSRecordAlreadyExists)
    

    While the 2.24.6+ installers are fixed, older installers don't all work at the moment. (The installer fixes for this have been backported for every release back to the 2.18 series--but these aren't all quite released yet.)

    If you're trying to install a version between 2.20.0 and 2.24.5, you can explicitly override the starting UID:

    NIX_FIRST_BUILD_UID="351" sh <(curl -L <whatever release-specific installer URL you need>)
    

    If you run into this error with versions older than 2.18, you'll need to download the installer tarball for your platform, unpack it, and update the first UID in install-darwin-multi-user.sh to 351.
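
    Before overriding the first UID, it may be worth confirming that the target range is actually free. The following is a minimal sketch, assuming the default of 32 build users and input in the two-column "name uid" shape that dscl . list /Users UniqueID prints. range_conflicts is a hypothetical helper name, not part of the installer.

```shell
# Hypothetical helper: reads "name uid" lines on stdin and prints any account
# whose UID falls inside [first, first + count - 1], ignoring _nixbld users.
range_conflicts() {
  first="$1"
  count="${2:-32}"
  awk -v first="$first" -v count="$count" '
    $1 !~ /^_nixbld/ && ($2 + 0) >= (first + 0) && ($2 + 0) < (first + count) {
      print $2, $1
    }
  '
}

# Assumed usage on macOS:
#   dscl . list /Users UniqueID | range_conflicts 351 32
```

    No output would suggest that nothing else occupies the range; any line printed names an account that would collide.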

More background/context on the issue

Edit: As the macOS release is near, I'm tucking context away to focus the first comment on how users can fix broken installs.

Reports are percolating about the upcoming macOS Sequoia 15 (from people trying the beta out) using 4 UIDs in the range we've been using:

History on our previous change and ID range selection is in:

PRs to address:

@fbettag

fbettag commented Jun 11, 2024

cat /etc/passwd on sequoia shows

_aonsensed:*:300:300:Always On Sense Daemon:/var/db/aonsensed:/usr/bin/false
_modelmanagerd:*:301:301:Model Manager:/var/db/modelmanagerd:/usr/bin/false
_reportsystemmemory:*:302:302:ReportSystemMemory:/var/empty:/usr/bin/false
_swtransparencyd:*:303:303:Software Transparency Services:/var/db/swtransparencyd:/usr/bin/false
_naturallanguaged:*:304:304:Natural Language Services:/var/db/com.apple.naturallanguaged:/usr/bin/false
_oahd:*:441:441:OAH Daemon:/var/empty:/usr/bin/false
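
To pull just the role-account range out of a listing like this, a small filter can help. This is a sketch only, assuming passwd-format input and the 200-400 range mentioned in the sysadminctl usage note; uids_in_range is a hypothetical helper name.

```shell
# Hypothetical helper: reads passwd-format lines on stdin and prints
# "uid name" for entries whose UID lies in [lo, hi], sorted by UID.
uids_in_range() {
  lo="$1"
  hi="$2"
  awk -F: -v lo="$lo" -v hi="$hi" '
    !/^#/ && $3 ~ /^[0-9]+$/ && ($3 + 0) >= (lo + 0) && ($3 + 0) <= (hi + 0) {
      print $3, $1
    }
  ' | sort -n
}

# Assumed usage on the machine in question:
#   uids_in_range 200 400 < /etc/passwd
```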

@ryanbooker

ryanbooker commented Jun 11, 2024

Perhaps count down from 400? That's what I've done locally to fix the immediate issue. FYI, everything works with an arbitrary range, e.g. 3001–3032.

@abathur
Member Author

abathur commented Jun 11, 2024

No personal objection to counting down as a strategy, but:

  • IIRC the code that does this is in the general installer so we'd be changing it (and the "first" UID defaults) for linux as well.
  • Since the starting UIDs are overridable via environment variables, a naive flip in the order would technically be a breaking interface change for anyone actually doing scripted installs that pre-set UIDs. (No clue how common this is in the wild.) We could try to preserve the semantics/behavior of the current env by doing math on the value, but it may turn into a bit of a stumbling block over time if people get used to it going in reverse order and expect the variable to control the number it counts down from?

@roberth roberth added the macos Nix on macOS, aka OS X, aka darwin label Jun 12, 2024
@roberth
Member

roberth commented Jun 12, 2024

Also affects nix-darwin

@stepbrobd
Member

Question 1:
Would it be possible to work around this without reinstalling Nix on macOS systems with the ids option (exposed but not in docs, and changing the settings doesn't change anything)?

Q1 RFC:
In this file, nixbld = 300; is set but this is not used anywhere. Perhaps we can add an idempotent shell script to add/remove nixbld group and nixbld* users on every rebuild?

Question 2:
For NixOS systems, if I'm understanding the docs correctly, nixbld* users are not needed to perform builds when auto-allocate-uids or cgroups is enabled. Is there anything equivalent to these on macOS systems?

@abathur
Member Author

abathur commented Jun 12, 2024

@michaelvanstraten noted in #6153 (comment) that you can get unblocked on new installs for testing with:

To repeat @ikuz's solution again for a quick fix on macOS 15 install/reinstall with:

 NIX_FIRST_BUILD_UID="351" sh <(curl -L https://nixos.org/nix/install)

@emilazy
Member

emilazy commented Jun 15, 2024

Some thoughts on where to reassign the IDs:

We need to be below (or equal to?) UID 400, and Apple has now used up to UID 304. Clearly we should expect they might keep adding users to the low end of this range occasionally. However, running right up against the 400 limit doesn’t seem safe to me either; /etc/group on my Sonoma machine contains groups from 395 to 400, so it seems like Apple considers the upper end of the system range to be its for the taking as well. The natural place to go, then, would be in the middle of the range.

We default to 32 build users currently. The main reason you might want more is that it limits the number of concurrent build jobs, and the number of those you might want to run is proportional to the cores/threads on your machine. The highest number of cores on a currently-shipping Mac is the 24‐core M2 Ultra, they used to ship Xeons with 48 threads (24 cores × 2 threads/core), and there are rumours of a 32‐core M3 Ultra and even a 64‐core M3 Ultimate. We can’t fit more than 96 users at the absolute maximum as of Sequoia, and that would obviously be risky, so let’s say we want to plan to have space for around 64 to 80 build users.

I suggest we start at 331 (if we want to keep the last digit matching the build user number) or 330 (if we don’t). That gives us enough margin for Apple to add ~26 new users before we run into problems again, 38 empty spaces above the top UID for the current default of 32 users, and just enough space to squeeze in 64 users before hitting the 395 ID that Apple has already used for a group.

The other good candidate would be 321/320; that reduces the margin on the low end to ~16 new Apple users, but increases the margin on the upper end, to (if using 320) just barely allow squeezing in 80 users in the available range. Personally I feel like the release of an 80‐core Mac would make me scared for 96 to 128‐core Macs that we have no realistic way of adding enough users for with our current approach anyway, and we have precedent that Apple is happy to add users on the low end of the range, so I lean towards 331 to give us more margin on that side. But I’m ambivalent if people have a strong preference for more margin to add and think that Apple will continue adding system users at a restrained enough pace compared to core count inflation. 331 spreads our users around the middle of the available range for 32 users, 321 does the same for 64 users.
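
The margins quoted for each candidate can be reproduced with a little shell arithmetic. This is a sketch using only the figures from this discussion (Apple's highest UID of 304, the group seen at 395, the 400 ceiling); uid_margins is a hypothetical helper name.

```shell
# Values taken from the discussion; adjust if the facts change.
APPLE_MAX_UID=304      # highest UID Sequoia uses below 400
FIRST_APPLE_GID=395    # lowest of the 395-400 groups seen on Sonoma
RANGE_CEILING=400      # documented upper bound for role accounts

# Hypothetical helper: prints the margins for a starting UID and user count.
uid_margins() {
  start="$1"
  count="${2:-32}"
  top=$(( start + count - 1 ))
  echo "low_margin=$(( start - APPLE_MAX_UID - 1 ))" \
       "top_uid=${top}" \
       "free_above=$(( RANGE_CEILING - top ))" \
       "fits_before_${FIRST_APPLE_GID}=$(( FIRST_APPLE_GID - start ))"
}

uid_margins 331 32   # low_margin=26 top_uid=362 free_above=38 fits_before_395=64
uid_margins 321 32   # low_margin=16 top_uid=352 free_above=48 fits_before_395=74
```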

This would also be a good opportunity to change the group ID; the current default of 30000 has the unfortunate effect of making the group show up in System Settings, unlike system groups. I would suggest using a related ID of 330 or 320 depending on the choice, and perhaps renaming it to _nixbld for consistency with the user names and other system groups (unless there’s any reason not to?).

Since we have to coordinate this with two installers and nix-darwin and get a migration plan in place before the release of Sequoia, I hope we can commit to a set of IDs ASAP to enable this to go smoothly.

@emilazy
Member

emilazy commented Jun 15, 2024

Incidentally I note that _oahd has UID 441 which makes me wonder if Apple has secretly expanded the system user range without updating the meagre documentation of it? But I remember it being a pain to test and reproduce the issues that led us to use the system UID range in the first place, so it’d need careful verification of the bounds if we wanted to see if we could go beyond 400.

@emilazy
Member

emilazy commented Jun 15, 2024

Some more investigation:

It seems like groups with GID < 500 don’t show up in System Settings. This may imply that the system UID range has also expanded, but I don’t know a convenient way to check, and anyway considering that Apple is using the middle of that range with 441 it might be awkward real estate to occupy; who knows what values Apple might pick in future. (There are also groups with higher GIDs that don’t show up in System Settings (e.g. com.apple.sharepoint.group.1/“<name>’s Public Folder”) that have dsAttrTypeNative:IsHidden: 1 in dscl(1) and Directory Utility, but setting that for the nixbld group didn’t seem to help.)

The maximum in‐use UID < 400 went from 297 to 304 in one version. I don’t know what the historical growth rate is like, but I’d definitely be more comfortable with 331 than 321 given that. If the growth continued at the rate of Sequoia (which seems unlikely, but still), we’d have to think of a new idea in 4 OS releases rather than 3. In general it seems like we’re on borrowed time here and I’m not really sure what the long‐term solution is. We may have to migrate now with the expectation of migrating again later.
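
The "4 OS releases rather than 3" estimate follows from a ceiling division. Here is a sketch, assuming Apple keeps adding 7 UIDs per release (the 297 to 304 jump), which is exactly the pessimistic assumption described above; releases_until_collision is a hypothetical helper name.

```shell
# Hypothetical helper: whole OS releases until Apple's highest UID reaches
# `start`, assuming it keeps growing by `per_release` UIDs each release.
releases_until_collision() {
  start="$1"
  current_max="${2:-304}"
  per_release="${3:-7}"     # 297 -> 304 in one release
  gap=$(( start - current_max ))
  echo $(( (gap + per_release - 1) / per_release ))   # ceiling division
}

releases_until_collision 331   # prints 4
releases_until_collision 321   # prints 3
```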

If we could verify that UIDs < 500 now work fine, I suppose one solution would be to start at, say, 360 now, and hope Apple don’t eat up the lower 400s range if we start having machines that want 64 users. But I don’t remember how to test that.

@abathur
Member Author

abathur commented Jun 15, 2024

Incidentally I note that _oahd has UID 441 which makes me wonder if Apple has secretly expanded the system user range without updating the meagre documentation of it?

Does this mean you looked at the usage info for sysadminctl and it still says 200-400?

But I remember it being a pain to test and reproduce the issues that led us to use the system UID range in the first place, so it’d need careful verification of the bounds if we wanted to see if we could go beyond 400.

Yeah. The main manifestation I recall was the system giving an obtuse adrenaline-inducing error message and booting into recovery mode during system updates that require a full reboot cycle. Looking back over the thread, it looks like that got less-scary on the next point release. The other was the build users showing up in a user list.

If we wanted to try moving outside of the current range, I suspect a workable protocol would be installing into the new range on a sub-sequoia version and then running the sequoia update and seeing if it blows up. (I don't currently have a spare mac that's eligible for this update, so someone else will have to drive...)

It can't save us from needing to fix this UID issue in the short run, but one way to address the long-run problem would be to figure out if we can get the various issues with auto-allocate-uids sorted out in order to make it the default. (The detsys folks tried defaulting to this in their installer and ran into trouble that compelled them to revert it on both macOS and Linux.)

After I hit post here I'll start drafting a feedback/radar report for Apple. Once I do, I'll also email their devrel about it, and post the FB number here in case anyone wants to refer to it from their own FB report. I'm not terribly optimistic about that helping (for example, I never got a response on the reports I opened about the big sur issue in 2021), but I guess there's an outside chance they'll improve their updater to relocate any UID/GID they trample to another valid ID.

(That would leave people with Weird installs--they might fail to fully clean up when they follow the uninstall instructions for example--but I think it would at least not break every existing macOS install made since early 2022 or whenever the multi-user default was released...)


For searchability, here's one real manifestation of this on an existing install (from #10912):

...
these 13 derivations will be built:
  /nix/store/jcrd05mlpsw8wmixwd133pv3q3xbm18w-nerdfonts-3.2.1.drv
  ...
error: the user '_nixbld1' in the group 'nixbld' does not exist

@emilazy
Member

emilazy commented Jun 15, 2024

Does this mean you looked at the usage info for sysadminctl and it still says 200-400?

On Sonoma, which already has that _oahd user, yes; so if that’s within the system UID range and there’s not something weird going on with that user specifically and groups in the 400 to 500 range (perfectly possible! OAH is the internal name for Rosetta 2, as I understand it, so I wouldn’t be surprised if there are strange things going on there), then they expanded it without updating what passes for the “documentation”. I haven’t tried Sequoia yet, so I can’t comment on what the command says there.

Yeah. The main manifestation I recall was the system giving an obtuse adrenaline-inducing error message and booting into recovery mode during system updates that require a full reboot cycle. Looking back over the thread, it looks like that got less-scary on the next point release. The other was the build users showing up in a user list.

I think filling up the visible user list with 32 random daemon users is scary and off‐putting to users (especially in the absence of an official upstream uninstaller), so even if the more fundamental issues might be solved now I’d be reluctant to settle on that unless we can find another way to hide them.

It can't save us from needing to fix this UID issue in the short run, but one way to address the long-run problem would be to figure out if we can get the various issues with auto-allocate-uids sorted out in order to make it the default. (The detsys folks tried defaulting to this in their installer and ran into trouble that compelled them to revert it on both macOS and Linux.)

Yes, I would love this. If we can commit in the interim to e.g. UIDs starting at 331 and a GID of 330, hopefully that would give us enough runway to make something workable out of auto-allocate-uids. I remember hearing that the problems were worse on macOS than Linux, though (e.g. DeterminateSystems/nix-installer#521, DeterminateSystems/nix-installer#580 (comment) – I guess the lack of user namespaces really makes it tricky).

@abathur
Member Author

abathur commented Jun 15, 2024

Ok, I've reported this in FB13917314 and emailed the devrel about it. For reference, report is roughly:

macOS 15 Sequoia beta installer clobbering existing role users with UIDs 301-304

We're getting reports (example: #10912) that the Sequoia update is clobbering existing build users for the Nix package manager, causing later errors such as:

error: the user '_nixbld1' in the group 'nixbld' does not exist

Users who have taken the update report seeing new users in this range in /etc/passwd:

_aonsensed:*:300:300:Always On Sense Daemon:/var/db/aonsensed:/usr/bin/false
_modelmanagerd:*:301:301:Model Manager:/var/db/modelmanagerd:/usr/bin/false
_reportsystemmemory:*:302:302:ReportSystemMemory:/var/empty:/usr/bin/false
_swtransparencyd:*:303:303:Software Transparency Services:/var/db/swtransparencyd:/usr/bin/false
_naturallanguaged:*:304:304:Natural Language Services:/var/db/com.apple.naturallanguaged:/usr/bin/false
_oahd:*:441:441:OAH Daemon:/var/empty:/usr/bin/false

A few years ago, the Nix installer used UIDs from 30001-30032 by default. The issue I reported in FB8997501 started causing trouble when users with these UIDs were present, so in response we took the hint from the usage note in sysadminctl ("Role accounts require name starting with _ and UID in 200-400 range") and migrated our build user UID defaults to 301-332.

The current behavior of the beta installer will break all existing multi-user Nix installs on macOS made in the last few years, confusing a lot of users in the process.

I can imagine at least two improvements that would help us out, here:

  • If these UIDs don't need to be hardcoded on your end, avoid clobbering existing role users and select a UID that doesn't clash.
  • If these UIDs need to be hardcoded, relocate any existing users to new unoccupied UIDs in the role user range.

@emilazy
Member

emilazy commented Jun 15, 2024

I can confirm that Sequoia’s sysadminctl says the same thing, so if there’s been any change it remains undocumented.

@ahcm

ahcm commented Jul 7, 2024

On macOS 15 beta, users added with -roleAccount but no leading _ get an automatic user id >500.
With _ underscore one gets the following message:
Role account requires specified UID in 450-499 range.

While the help still gives the footnote:
*Role accounts require name starting with _ and UID in 200-400 range.

@lloeki

lloeki commented Jul 12, 2024

For searchability too (thanks @abathur!), I had error: the user '_nixbld1' in the group 'nixbld' does not exist, if anyone needs to get out of this hole in a pinch, here's what I did to fill in the blanks:

# check your _nixbld users for who's missing
dscl . list /Users UniqueID | grep _nixbld | sort -n -k2

# check what are the non-nix used ones, so that you can find a hole in there
dscl . list /Users UniqueID | grep -v _nixbld | sort -n -k2

# fill in the blanks, mine were 1 to 4, I picked 401 as the start:
for i in {1..4}; do
  sudo dscl . -create "/Users/_nixbld${i}" UniqueID $(( 400 + ${i} ))
  sudo dscl . -create "/Users/_nixbld${i}" PrimaryGroupID 30000
  sudo dscl . -create "/Users/_nixbld${i}" IsHidden 1
  sudo dscl . -create "/Users/_nixbld${i}" RealName "_nixbld${i}"
  sudo dscl . -create "/Users/_nixbld${i}" NFSHomeDirectory '/var/empty'
  # (disabled) sudo createhomedir -cu "_nixbld${i}"
  sudo dscl . -create "/Users/_nixbld${i}" UserShell /sbin/nologin
done

# that's just for having the survivors be consistent, also using 401 as the start
for i in {5..32}; do
  id="$(dscl . -read "/Users/_nixbld${i}" UniqueID | cut -d' ' -f2)"
  sudo dscl . change "/Users/_nixbld${i}" UniqueID $id $(( 400 + ${i} ))
done
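
To see at a glance which build users are missing, before or after a repair like this, the dscl listing can be checked for gaps. This is a sketch, assuming the default of 32 build users and the "name uid" output shape of dscl . list /Users UniqueID; missing_nixbld_users is a hypothetical helper name.

```shell
# Hypothetical helper: reads "name uid" lines on stdin and prints every
# _nixbldN (N = 1..count) that has no entry.
missing_nixbld_users() {
  count="${1:-32}"
  awk -v count="$count" '
    $1 ~ /^_nixbld[0-9]+$/ { seen[substr($1, 8) + 0] = 1 }
    END {
      for (i = 1; i <= count; i++) {
        if (!(i in seen)) print "_nixbld" i
      }
    }
  '
}

# Assumed usage on macOS:
#   dscl . list /Users UniqueID | missing_nixbld_users 32
```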

@lloeki

lloeki commented Jul 16, 2024

FYI on my work laptop (still Sonoma 14) I have these:

-----8<-----
_backgroundassets        291
_mobilegestalthelper     293
_audiomxd                294
_terminusd               295
_neuralengine            296
_eligibilityd            297
-----8<-----
_nixbld{1-32} {301-332}
-----8<-----
_oahd                    441
_sentinelguard           498
_sentinel                499

Note:

  • _oahd 441 is already there; apparently it's Rosetta 2 related (/usr/libexec/rosetta/oahd) and part of the AOT compiler
  • I think _sentinelguard 498 and _sentinel 499 come from some company-managed security software (SentinelOne)

EDIT: confirmed that these two are SentinelOne

@Teebor-Choka

@michaelvanstraten noted in #6153 (comment) that you can get unblocked on new installs for testing with:

To repeat @ikuz's solution again for a quick fix on macOS 15 install/reinstall with:

 NIX_FIRST_BUILD_UID="351" sh <(curl -L https://nixos.org/nix/install)

I had to first remove the group and users with:

sudo dscl . delete /Groups/nixbld
for i in $(seq 1 32); do sudo dscl . -delete /Users/_nixbld$i; done

then

NIX_FIRST_BUILD_UID="351" sh <(curl -L https://nixos.org/nix/install)

@n8henrie
Contributor

Hmmm, I ran the script from @emilazy here (thank you, always so helpful!) but it only migrated 1 of 32 nixbld users to 351, the rest were left in the 30,000 range. I thought that seemed wrong, so then I deleted all the nixbld users and ran PATH=$(getconf PATH) NIX_FIRST_BUILD_UID="351" sh <(curl -L https://nixos.org/nix/install) (the PATH part because the installer seems to choke on GNU curl, which is first on my PATH, instead of the macOS one). Got most of the way through the installer, but now Alacritty has been beach balled for the last 15 minutes...

Hmmm.

@abathur
Member Author

abathur commented Sep 23, 2024

Hmmm, I ran the script from @emilazy here (thank you, always so helpful!)

Just to clarify for anyone reading this far down, there aren't multiple competing scripts here. This is the same script and same install instruction listed in the first comment.

but it only migrated 1 of 32 nixbld users to 351, the rest were left in the 30,000 range.

A bit of a guess without knowing what you had before/after or what the script did, but it sounds like you have (or had) an edge-case install with a mix of _nixbldN users and older nixbldN users?

The installer hasn't been creating users with 30000+ UIDs since 0431cf6 (first released with nix 2.4), and that commit also changed from nixbldN to _nixbldN. (Basically, it sounds like you had a working 2.4+ install with some lingering artifacts of an older pre-2.4 install.)

I thought that seemed wrong, so then I deleted all the nixbld users

I don't think you need to do this if you get everything working soon--but if you keep having trouble here I'd recommend fully uninstalling Nix and reinstalling. The instructions are at https://nixos.org/manual/nix/stable/installation/uninstall.html#macos. (But since those instructions focus on modern installs, I'd run the commands for cleaning up the build group and users with both the _nixbld and nixbld prefix.)

and ran PATH=$(getconf PATH) NIX_FIRST_BUILD_UID="351" sh <(curl -L https://nixos.org/nix/install)

Again to be clear for anyone following along, the UIDs are already fixed in the latest release (the one you get from https://nixos.org/nix/install) and you don't need to set your own NIX_FIRST_BUILD_UID. (This was just a workaround suggested before fixes were merged.)

(the PATH part because the installer seems to choke on GNU curl, which is first on my PATH, instead of the macos one).

What does "choke" mean in terms of the error, here? I would be a little surprised if GNU curl intrinsically causes any trouble here. (That's what most people installing on Linux will be using...)

Got most of the way through the installer, but now Alacritty has been beach balled for the last 15 minutes...

Does this mean Alacritty's own UI isn't even responding? Where did it halt? Terminal.app has an "inspector" in the right-click menu that'll show you what's running--not sure if Alacritty has something similar, but if not it may be worth a run in Terminal.app if this keeps up.

@n8henrie
Contributor

n8henrie commented Sep 23, 2024

@abathur Thanks for your input!

The installer hasn't been creating users with 30000+ UIDs since

No, sorry I was unclear -- they were created in the script I (we) linked, it specifically noted moving them "temporarily," but in my case they were never moved back for some reason. Look for ((TEMP_NIX_FIRST_BUILD_UID=31000))

This was just a workaround suggested before fixes were merged.

Ah, ok thanks.

What does "choke" mean in terms of the error, here?

This has been a longstanding issue for me, I have to put MacOS curl first on the PATH or else the nix installer doesn't work. (Same issue / workaround with the asahi installer, fwiw.)

Or maybe it's not curl in this case (and I'm remembering wrong), currently I get an error when unpacking the archive:

$ sh <(curl -L https://nixos.org/nix/install)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  4267  100  4267    0     0  10441      0 --:--:-- --:--:-- --:--:--     0
downloading Nix 2.24.7 binary tarball for aarch64-darwin from 'https://releases.nixos.org/nix/nix-2.24.7/nix-2.24.7-aarch64-darwin.tar.xz' to '/var/folders/kb/tw_lp_xd2_bbv0hqk4m0bvt80000gn/T/nix-binary-tarball-unpack.nJf7SW58hu'...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 14.6M  100 14.6M    0     0  18.8M      0 --:--:-- --:--:-- --:--:-- 18.8M
tar (child): xz: Cannot exec: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
/dev/fd/63: failed to unpack 'https://releases.nixos.org/nix/nix-2.24.7/nix-2.24.7-aarch64-darwin.tar.xz'

This issue is resolved by eliminating non-MacOS utilities from the PATH: PATH=$(getconf PATH) sh <(curl ...
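
The error output above comes from tar failing to exec xz, so it can help to see which copies of the relevant tools the shell would pick with and without the custom PATH. This is a sketch, assuming the tools of interest are tar, xz and curl; compare_tool_paths is a hypothetical helper name.

```shell
# Hypothetical helper: prints where each named tool resolves, first with the
# current PATH and then with the system default PATH from `getconf PATH`.
compare_tool_paths() {
  sys_path="$(getconf PATH)"
  for tool in "$@"; do
    current="$(command -v "$tool" || echo 'not found')"
    # The PATH change stays inside the command substitution's subshell.
    system="$(PATH="$sys_path"; command -v "$tool" || echo 'not found')"
    echo "${tool}: current=${current} system=${system}"
  done
}

compare_tool_paths tar xz curl
```

Any tool whose two locations differ is a candidate for the mismatch.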

Does this mean Alacritty's own UI isn't even responding?

Alacritty froze at sudo /usr/sbin/diskutil unmount force disk4s7. Other apps still working. After an hour or so I force killed Alacritty and reran the installer in Terminal.app, which made it much farther (past creating the nixbld users), but has now been stuck for about 30 minutes at sudo /usr/sbin/chown -R root:nixbld /nix.

EDIT: Sampling Terminal.app in Activity Monitor, it looks like it just has a lot to chown -- slowly chugging its way through.

EDIT2: Reinstall in Terminal.app eventually succeeded, took 45 mins or so on my M1 MBP, likely due to a fairly large nix store. Lesson learned: don't reinstall nix from a terminal emulator that's in the nix store, thanks @emilazy and @abathur!

@emilazy
Member

emilazy commented Sep 23, 2024

Was your Alacritty perhaps stored in the Nix store?

@n8henrie
Contributor

Aha! Yes of course.

@abathur
Member Author

abathur commented Sep 23, 2024

No, sorry I was unclear -- they were created in the script I (we) linked, it specifically noted moving them "temporarily," but in my case they were never moved back for some reason. Look for ((TEMP_NIX_FIRST_BUILD_UID=31000))

That sounds more manageable :)

Not super confident without a log, but one cause I'm aware of would be if there was already a gap in your set of _nixbldN users (i.e., if _nixbld33 were missing).

Or maybe it's not curl in this case (and I'm remembering wrong), currently I get an error when unpacking the archive:

$ sh <(curl -L https://nixos.org/nix/install)
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  4267  100  4267    0     0  10441      0 --:--:-- --:--:-- --:--:--     0
downloading Nix 2.24.7 binary tarball for aarch64-darwin from 'https://releases.nixos.org/nix/nix-2.24.7/nix-2.24.7-aarch64-darwin.tar.xz' to '/var/folders/kb/tw_lp_xd2_bbv0hqk4m0bvt80000gn/T/nix-binary-tarball-unpack.nJf7SW58hu'...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 14.6M  100 14.6M    0     0  18.8M      0 --:--:-- --:--:-- --:--:-- 18.8M
tar (child): xz: Cannot exec: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
/dev/fd/63: failed to unpack 'https://releases.nixos.org/nix/nix-2.24.7/nix-2.24.7-aarch64-darwin.tar.xz'

Does type -a tar show a location other than /usr/bin/tar? (Maybe gnutar from Nix?) If so, I think it's actually that causing the trouble.

If so, can you open a new issue for the problem and symptoms?

(It sounds like there was a previous attempt to address this, but I guess it was only a partial fix:

Even if there isn't a good way around, documenting it may help others.)

@n8henrie
Contributor

Does type -a tar show a location other than /usr/bin/tar?

Yes, this is what I was trying to say above. I prefer having the GNU utilities so I can use the same awk / sed / grep incantations across my linux and macos machines.

$ type -a tar
tar is /etc/profiles/per-user/n8henrie/bin/tar
tar is /run/current-system/sw/bin/tar
tar is /usr/bin/tar

If so, can you open a new issue for the problem and symptoms?

#11570

@juboba

juboba commented Sep 28, 2024

I love this community.

@torgeir

torgeir commented Oct 9, 2024

I recommend curl --proto '=https' --tlsv1.2 -sSf -L https://github.com/NixOS/nix/raw/master/scripts/sequoia-nixbld-user-migration.sh | bash -, which enforces secure protocols, handles errors a bit better, and has a slightly more legible “obviously from the upstream Nix repository” flavour to it. This is patterned on the commands used by rustup and the Determinate Systems installers.

And you'd make for an even more trustworthy suggestion of a command for people to pipe into their shell by also pinning it to the current revision, also improving its relevance for historical purposes.

Suggestion to change the url in your first post to
https://github.com/NixOS/nix/blob/8b2ffbae3adc2418a6221c24619d9bca51852d05/scripts/sequoia-nixbld-user-migration.sh @abathur

(obtained by pressing y when viewing it on github)
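
For piping into a shell, note that a /blob/ URL serves GitHub's HTML viewer page rather than the script itself, so a pinned command would need the /raw/ form. This is a sketch, assuming the /raw/<revision>/ path works the same way as the /raw/master/ path used in the first comment; migration_script_url is a hypothetical helper name.

```shell
# Hypothetical helper: builds the raw-file URL for the migration script at a
# given revision (a branch name or a full commit hash).
migration_script_url() {
  rev="${1:-master}"
  echo "https://github.com/NixOS/nix/raw/${rev}/scripts/sequoia-nixbld-user-migration.sh"
}

# Assumed usage, mirroring the command from the first comment:
#   curl --proto '=https' --tlsv1.2 -sSf -L \
#     "$(migration_script_url 8b2ffbae3adc2418a6221c24619d9bca51852d05)" | bash -
```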

@mkenigs
Contributor

mkenigs commented Oct 9, 2024

And you'd make for an even more trustworthy suggestion of a command for people to pipe into their shell by also pinning it to the current revision, also improving its relevance for historical purposes.

Suggestion to change the url in your first post to https://github.com/NixOS/nix/blob/8b2ffbae3adc2418a6221c24619d9bca51852d05/scripts/sequoia-nixbld-user-migration.sh @abathur

(obtained by pressing y when viewing it on github)

Commands like this get copied and pasted around, and if there are patches to the script, it's preferable if people get the latest version rather than pinning to something that might be stale and not have fixes

@emilazy
Member

emilazy commented Oct 9, 2024

Yes, if the Nix repository is compromised then users have bigger problems. Especially since the very comment containing the command to run is in the Nix repository.

@butterflyhug

I have two Macs with Nix, both installed via the official https://nixos.org/nix/install installer. One of these machines is running MacOS 14.7 and the other has been upgraded to 15.0.1 (via 15.0). On both machines, the recommended sequoia-nixbld-user-migration.sh script is erroring out on its first dscl command when I run it from my admin user account, more or less immediately after I enter my password for the script's invocation of sudo. The full command output on both machines is identical:

$ curl --proto '=https' --tlsv1.2 -sSf -L https://github.com/NixOS/nix/raw/master/scripts/sequoia-nixbld-user-migration.sh > /tmp/sequoia-nixbld-user-migration.sh
$ bash /tmp/sequoia-nixbld-user-migration.sh
Attempting to migrate _nixbld users.

Step 1: move existing _nixbld users out of the destination UID range.
Password:
<main> attribute status: eDSPermissionError
<dscl_cmd> DS Error: -14120 (eDSPermissionError)

(I believe the only difference here vs the recommended one-liner in the issue description is that I'm separating download from execution. I make a habit of keeping a local copy of all download-and-execute scripts like this, so that I can manually inspect the exact script that I have executed if/when anything goes wrong.)

Unsurprisingly given that the two scripts appear to be invoking dscl in the same way, the Determinate Systems installer's repair script that was also suggested upthread ultimately produces this same dscl error for me. It looks like my Nix installs are a bit old at this point (2.18.8 and 2.17.0, respectively), so I should upgrade Nix anyway... but it also doesn't seem like my stale Nix versions should really affect these scripts given my (limited) understanding of what the scripts are doing?

@abathur
Member Author

abathur commented Oct 11, 2024

I agree that those versions shouldn't have anything to do with the issue. Afaik this is our first report of the problem, so the tractability of this will likely hinge on what you can figure out locally (at least until we figure out how to reproduce it).

A few questions to get us started:

  • Are these Macs ~managed by an org (using MDM profiles)?
  • Have you restored them from Time Machine backups or used Migration Assistant (or any other means of image deployment or porting/recovering data)?
  • Can you manually invoke the failing dscl command while watching system logs (via Console.app or the log command) and see if it coughs up any clues about the failure?
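
For the last question, one way to watch the logs from a second terminal is sketched below. The choice of processes is an assumption: opendirectoryd is the daemon that services dscl requests and tccd handles privacy permission checks, but the relevant messages may come from somewhere else. The names DSCL_LOG_PREDICATE and watch_directory_logs are hypothetical.

```shell
#!/usr/bin/env bash
# Assumption: these two processes are the ones worth watching; adjust the
# predicate if the failure turns out to be logged elsewhere.
DSCL_LOG_PREDICATE='process == "opendirectoryd" OR process == "tccd"'

# Hypothetical helper: stream matching log entries until interrupted (Ctrl-C).
# Run it in one terminal, then invoke the failing dscl command in another.
watch_directory_logs() {
  log stream --style compact --level debug --predicate "$DSCL_LOG_PREDICATE"
}
```

Call watch_directory_logs first, reproduce the failure, then stop the stream and look at the entries logged around the time of the error.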

@butterflyhug

butterflyhug commented Oct 11, 2024

No active MDM profiles on either Mac. Technically, the 14.7 machine is erroneously listed on an MDM auto-enrollment list from its previous life as an MDM-managed machine for about 6 months in 2020, but has never actually been re-enrolled after the previous owner removed their MDM profile and Recovery-wiped the machine for resale. (Yeah, I'm annoyed that the former owners could never be bothered to fix their mess after confirming that their auto-enrollment claim is erroneous, but that whole sordid tale should be irrelevant for our current purposes.) The 15.0.1 machine has no previous owners and has always been completely MDM-free; it originally shipped directly from Apple in early 2023 with 13.x preinstalled, and has been kept regularly updated with production (non-beta) OS releases since.

Also no restored backups; I set up both machines from scratch from their clean OS installs as soon as they entered my hands and never looked back. As a result they have both been through multiple major macOS upgrades over time. IIRC my original Nix installation on each of these machines was after upgrading the corresponding Mac's OS to 14.x.

Adding set -o xtrace to the top of the Nix-maintained script reveals that the exact command that is failing is sudo dscl . -create /Users/_nixbld5 UniqueID 31000. Here's a copy of Console.app's logs from the time period covering a manual invocation of that command on the machine running macOS 15.0.1, although unfortunately I'm not really spotting anything we didn't already know in there.
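
For reference, the same trace can be captured without editing the script. This is only a sketch: last_traced_command is a hypothetical helper, not part of the Nix script, and it assumes the script stops at its first failing command, as it did here.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the Nix migration script.
# Usage: last_traced_command SCRIPT LOG_FILE
last_traced_command() {
  local script="$1" log_file="$2" status=0
  # `bash -x` has the same effect as adding `set -o xtrace` at the top of the
  # script. Trace lines go to stderr, so send both streams to the log file.
  bash -x "$script" >"$log_file" 2>&1 || status=$?
  # Traced commands start with "+"; if the script stopped at its first
  # failure, the last traced line is the command that failed.
  grep '^+' "$log_file" | tail -n 1
  # Pass the script's own exit status back to the caller.
  return "$status"
}
```

For example, last_traced_command /tmp/sequoia-nixbld-user-migration.sh /tmp/nixbld-trace.log (the log path is an arbitrary choice) prints the last traced command and leaves the full output in the log file. The script's output is hidden while it runs, though sudo should still prompt on the terminal.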

@abathur
Member Author

abathur commented Oct 11, 2024

IIRC Warp is a terminal app, yeah? Are you using it on both systems? If so, can you try again in Terminal.app?

@butterflyhug

Oh hey, I found this old comment which suggested that the problem might be my choice of terminal emulator (I had previously discounted that as being potentially relevant because I'm using different third-party terminal apps on each of the two machines), and indeed the script works as expected in Terminal.app.

(fun, looks like we got to the same place at the same time 🙂 )

@Enzime
Member

Enzime commented Oct 11, 2024

@butterflyhug have you granted full disk access to either Terminal.app or Warp?

@abathur
Member Author

abathur commented Oct 12, 2024

@Enzime fair question. In my case, the Terminal.app hunch came from the log attached earlier, which shows some sandbox/tccd errors related to Warp. That makes me think macOS is further restricting permissions here.

I suspect we'd have heard by now if full disk access (FDA) were required to run these dscl commands broadly, but it is certainly still possible that there's something specific up with these systems and that FDA explains why the command succeeded in Terminal (though I am hoping this isn't the reason).

@n8henrie
Contributor

Terminal.app doesn't have FDA by default either.
