-
-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix: make shmoverride.so initfirst #148
Conversation
I guess it's an interaction with a constructor from another library. Does NixOS use LD_PRELOAD (with something that has a constructor) and/or patches some of the library used by Xwayland to add/modify a constructor? |
Nope, the only LD_PRELOADed library is shmoverride, the rest of runtime dependencies are explicit elf shared object dependencies. NixOS patches rarely touch runtime logic, and I see no such changes in llvm/mesa. The fsync call is coming from llvm, which is used by mesa. Maybe llvm/mesa update caused this? |
OpenQA test summaryComplete test suite and dependencies: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2024092617-4.3&flavor=pull-requests Test run included the following:
New failures, excluding unstableCompared to: https://openqa.qubes-os.org/tests/overview?distri=qubesos&version=4.3&build=2024070519-4.3&flavor=update
Failed tests61 failures
Fixed failuresCompared to: https://openqa.qubes-os.org/tests/105374#dependencies 9 fixed
Unstable tests
|
As I can see, old llvm version was initialized before shmoverride too, but it wasn't calling fstat before. I believe it is correct to have -z initfirst linker flag. |
This looks to be caused by this PR. I see the following in the log:
|
That's strange, I have tested this change on QubesOS with no problems, it should not affect library loading at all, it only affects initialization. Except if there was some build failure, and LD_PRELOADed library is not found?.. |
Well, maybe it is loaded, but constructor is not called? Or it fails? I don't see any errors from it in Xorg.0.log. You can find more logs in "logs & assets" tab on openqa (click any of those failed tests in the comment) |
Yep, also a possibility, I'll research. |
bdc4bea
to
e7d4a0c
Compare
For some reason, it is not working when built using QubesOS build system, and I have no idea why. I'll include it with my NixOS patchset until I figure out what would be best way to workaround this issue. In NixOS, I think it would be easier to directly patch Xorg, instead of hooking its functions. |
So, it was created, but later removed?
For test, I restored the initfirst link flag, removed the
It's captured when the process is still running, so the last destructor call is not on Xwayland exit (which you can see also from the PID). I've added also process PID and argv[0], and as you can see, some of those are in different processes - did |
Huh, both current and previous version of this patch works fine on NixOS with Xwayland |
Maybe it did, but some other initializer spawned processes before try_init was called? It is being called on NixOS, earlier try_init calls are only inserted in the case if some other library initializer (e.g llvm) wants to use fstat/mmap, it is however wrong to assume fstat/mmap will always be called before other processes will spawn |
Looking at what got spawned, it's unlikely.
Yes. |
Have you used only one patch (try_init), or both at the same time (try_init + initfirst)? try_init doesn't require initfirst to be set |
I had an alternative (or additional?) idea to unsetting Regardless of all that, removing the file if taking the lock failed is simply wrong... |
How about using patchelf to reroute calls to fstat/mmap from wanted library to shmoverride.so directly? (Just --add-needed shmoverride.so, --rename-dynamic-symbols fstat -> shmoverride_fstat, etc) |
That's not going to fly, we strongly prefer to not need to modify X server binary/build (use the one from given distribution, which can be Fedora, Debian, Arch, I've seen people doing Alpine now and probably few more). Replacing binary with a wrapper is already PITA in some distros... |
Could it be that |
I'm testing things in #151 |
I think this is the case, the |
And #152 has extended config, but due to getenv not working with initfirst, it's broken. |
So, all in all, IMO the slower (current) version is better. The overhead shouldn't be that big. In fact, it looks like the compiler is smart enough to inline the |
I was checking on godbolt, and __builtin_expect makes this optimization always work, without it - I haven't seen inlining here. |
Is there anything left to change here? If not, can you make it non-draft? |
After I have rebased my qubesos nixos dom0 on latest nixos version, I have encountered Xwayland/Xorg failing with segfault.
The cause of segfault was fstat hook being called before library constructor:
This was not caused by constructor not finding fstat with dlsym, I have checked that no logs being emitted by this library (and double-checked by using bpftrace to catch all fprints to stderr), constructor defenitely was not being called.
I'm not sure why, but
-z initfirst
linker flag fixes this constructor call ordering issue, making constructor being run before fsync, thus fixing the segfault, and restoring the Xwayland/Xorg operation.I can provide more info/logs about this issue, but my knowledge about dynamic linking is limited, and I have no idea what might be useful here.