-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes LTO + build-std + Oz failed to resolve undefined symbols #109821
Fixes LTO + build-std + Oz failed to resolve undefined symbols #109821
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @oli-obk (or someone else) soon. Please see the contribution instructions for more information. Namely, in order to ensure the minimum review times lag, PR authors and assigned reviewers should ensure that the review label (
|
compiler-builtins shouldn't depend on symbols exported by libcore. Even without LTO this is an issue due to compiler-builtins being added at the end of the linker invocations and ld.bfd (and other classical unix linkers) don't allow object files to depend on symbols exported by anything passed earlier on the commandline. compiler-builtins needs to be at the end as everything else depends on compiler-builtins. Another way to fix #108853 would be to make the respective |
@bjorn3 You would know this better than me, but... isn't I can think of two edge cases:
|
AFAIK |
@bjorn3 @cr1901 I tested with
I'm curious as to why compiler-builtins depend on libcore, but compiler-builtins shouldn't depend on symbols exported by libcore. It looks strange to me and causes other issues, such as compiler-builtins requiring the If this is ok, we have to copy all the symbols/definitions needed by compiler-builtins to compiler-builtins, even under Oz builds. For this case, the Let's think a bit further, about
If we think of the LTO of rustc as truly Link Time Optimization, this patch provides a similar stage of Symbol Resolution. |
All crates that contain any function depend on a number of so called lang items. These are things like the
This is why compiler-builtins must be built with
In case of linker plugin LTO ( |
I don't expect them to do that. I expect it to be a breaking change in some cases. In any case here is an example: cat >foo.c <<EOF
void foo() {}
EOF
cat >bar.c <<EOF
void foo();
void main() { foo(); }
EOF
gcc -c foo.c -o foo.o
ar q foo.a foo.o
gcc -c bar.c -o bar.o
ar q bar.a bar.o
gcc bar.a foo.a -o test # use comes before definition => works
gcc foo.a bar.a -o test # definition comes before use => doesn't work
/usr/bin/ld: bar.a(bar.o): in function `main':
bar.c:(.text+0xa): undefined reference to `foo'
collect2: error: ld returned 1 exit status ld.bfd --version
GNU ld (GNU Binutils for Debian) 2.35.2
Copyright (C) 2020 Free Software Foundation, Inc.
This program is free software; you may redistribute it under the terms of
the GNU General Public License version 3 or (at your option) a later version.
This program has absolutely no warranty. |
In other words, |
This is indeed the behavior of the current version of rustc. There is no guarantee that this will remain the case so user code shouldn't depend on it, but if it changed, we can easily change compiler-builtins at the same time to fix it. |
I think I understand why compiler-builtins depend on libcore. Can you explain more why compiler-builtins shouldn't depend on symbols exported by libcore? Forgive me for not understanding. Due to the
I don't think
According to my understanding and practice of LTO, by calling the
I figured out why using
The difference is that the So I think this patch is still a reasonable solution. I don't think inline is an appropriate solution. It should be a workaround.
|
Both core and compiler-builtins will be passed to the linker as archives rather than raw object files, which means that even with the exception for raw object files, compiler-builtins is not allowed to depend on core.
The |
12e7b2d
to
52d6abc
Compare
These commits modify the If this was intentional then you can ignore this comment. |
This comment has been minimized.
This comment has been minimized.
It only occurs under non-LTO. This patch only affects LTO. In the case of LTO, the core merges into the LTO object file. I read the When to
This missing method of adding inline may not be appropriate. And when it comes to the next missing method, we can't guarantee that inline will work. I've added support for wasm, although more work is needed to address the license and the compilation warning. |
Just to be clear, by "exception", you mean exception to:
,correct? (Should the bold be "archive files"? It's the only way that makes sense to me, since there is an example of an archive depending on a symbol from something passed earlier on the command line here.)
My own experience w/ msp430 code is that this is often correct. You have to call small function a number of times before the overhead of calling to a single function elsewhere is smaller than the size cost of inlining at every call site. And for the smallest binaries, you're unlikely to reach this threshold b/c the number of calls to any given |
In case of non-LTO compiler-builtins is not allowed to depend on core with or without this PR. If compiler-builtins doesn't depend on core, this PR is not necessary.
This is under normal circumstances when
It will work.
Just enabling the wasm feature of object is not enough. That only enables support for reading the symbol table from linked wasm modules, not wasm object files. The object crate currently doesn't know how to parse the linking custom section: gimli-rs/object#471 |
Truncating integers as happens here should not generate any code if inlined, right?
Yes, and yes. |
If so, this PR may not be necessary. But the truth is that compiler-builtins depend on core. I know we can make compiler-builtins not depend on core symbolically by inlining or by turning off share-generics. However, there is some symbols dependency on core, which still ensures consistency of function implementation. So I'm still not sure why we want compiler-builtins to depend on core, and why we want compiler-builtins to not depend on core.
I read https://man.openbsd.org/man1/ld.bfd.1 and https://lld.llvm.org/ELF/warn_backrefs.html. To be precise, this rule does not allow an undefined symbol to be found from an earlier archive file. For example, we assume that
At this point, our case under LTO becomes We can find the issue where the problem occurred much earlier, see #72140. This can be reproduced in the stable version. I have also added a smaller version, see https://github.com/DianQK/rust-109821-reproducible/tree/main/stable.
Thank you for the reminder. I realize that the wasm-related fixes need more work. |
52d6abc
to
ec1e86d
Compare
What do you think of this patch, any ideas? @alexcrichton |
#109983 landed like 2 hours ago. Can you verify if the next nightly works? The next nightly should be released in a couple of hours. |
I tried the nightly build (c49c4fb) and now |
r? pnkfelix |
…e compiler_builtins.
ec1e86d
to
e7a98a5
Compare
At this point, based on @bjorn3 's argument above (see comments here, here, here, and here), I'm inclined to close this PR. However, I do not yet understand what is going on for #72140. Things marked with |
You can refer to https://github.com/DianQK/rust-109821-reproducible/tree/main/stable to see a simpler example. |
Right, okay. So I guess the implicit question for @bjorn3 here is: What is the right way to address that problem without a pass of the form being proposed in this PR. Or in other words: What is supposed to be happening with those symbols? Should they be unconditionally exported (rather than having their export status be conditionalized on a pass like the one added by this PR)? |
From an LTO perspective, it is normal to export symbols for object files that do not participate in LTO. Here's some highlighting of the key content:
--- a.h ---
extern int foo1(void);
--- a.c ---
#include "a.h"
int foo1(void) {
...
}
--- main.c ---
#include "a.h"
int main() {
return foo1();
} Phase 3 : Optimize Bitcode Files The example of the instructions is just the right way to illustrate this issue. If we say
|
that sounds like a very strong statement (potentially rendering the but I think I should move my follow-up questions about |
Based on the #35637 implementation, it appears that the semantics are more similar to If we consider the semantics to be similar to If we say
Maybe you can consider this pull request as adding partial LTO support. |
I read the documentation at https://doc.rust-lang.org/reference/attributes/codegen.html#the-no_builtins-attribute again.
#![no_builtins] are the same as -fno-builtin .
In terms of semantics, I think it does not explicitly statement that it prohibits external dependencies or excludes participation in LTO. In that case, the fact that compiler-builtins cannot exhibit dependencies on core externally is a separate topic. If I revisit the problem discussed in #35540.
This attribute can be preserved in LTO, see https://clang.godbolt.org/z/Wod6KK6eq. It can work at a function level and be preserved. The clang ensures that LTO is not lost by adding this attribute, see https://clang.godbolt.org/z/z4j6Wsod5. I am currently attempting to add the no-builtins attribute to Rust, which should allow Finally, I think this can be transformed into two patches:
|
Based on my conversations with @DianQK (e.g. #72140 (comment)), I'm closing this PR in favor of the plan outlined in #113716 (comment) |
…, r=pnkfelix Add the `no-builtins` attribute to functions when `no_builtins` is applied at the crate level. **When `no_builtins` is applied at the crate level, we should add the `no-builtins` attribute to each function to ensure it takes effect in LTO.** This is also the reason why no_builtins does not take effect in LTO as mentioned in rust-lang#35540. Now, `#![no_builtins]` should be similar to `-fno-builtin` in clang/gcc, see https://clang.godbolt.org/z/z4j6Wsod5. Next, we should make `#![no_builtins]` participate in LTO again. That makes sense, as LTO also takes into consideration function-level instruction optimizations, such as the MachineOutliner. More importantly, when a user writes a large `#![no_builtins]` crate, they would like this crate to participate in LTO as well. We should also add a function-level no_builtins attribute to allow users to have more control over it. This is similar to Clang's `__attribute__((no_builtin))` feature, see https://clang.godbolt.org/z/Wod6KK6eq. Before implementing this feature, maybe we should discuss whether to support more fine-grained control, such as `__attribute__((no_builtin("memcpy")))`. Related discussions: - rust-lang#109821 - rust-lang#35540 Next (a separate pull request?): - [ ] Revert rust-lang#35637 - [ ] Add a function-level `no_builtin` attribute?
…, r=pnkfelix Add the `no-builtins` attribute to functions when `no_builtins` is applied at the crate level. **When `no_builtins` is applied at the crate level, we should add the `no-builtins` attribute to each function to ensure it takes effect in LTO.** This is also the reason why no_builtins does not take effect in LTO as mentioned in rust-lang#35540. Now, `#![no_builtins]` should be similar to `-fno-builtin` in clang/gcc, see https://clang.godbolt.org/z/z4j6Wsod5. Next, we should make `#![no_builtins]` participate in LTO again. That makes sense, as LTO also takes into consideration function-level instruction optimizations, such as the MachineOutliner. More importantly, when a user writes a large `#![no_builtins]` crate, they would like this crate to participate in LTO as well. We should also add a function-level no_builtins attribute to allow users to have more control over it. This is similar to Clang's `__attribute__((no_builtin))` feature, see https://clang.godbolt.org/z/Wod6KK6eq. Before implementing this feature, maybe we should discuss whether to support more fine-grained control, such as `__attribute__((no_builtin("memcpy")))`. Related discussions: - rust-lang#109821 - rust-lang#35540 Next (a separate pull request?): - [ ] Revert rust-lang#35637 - [ ] Add a function-level `no_builtin` attribute?
…, r=pnkfelix Add the `no-builtins` attribute to functions when `no_builtins` is applied at the crate level. **When `no_builtins` is applied at the crate level, we should add the `no-builtins` attribute to each function to ensure it takes effect in LTO.** This is also the reason why no_builtins does not take effect in LTO as mentioned in rust-lang#35540. Now, `#![no_builtins]` should be similar to `-fno-builtin` in clang/gcc, see https://clang.godbolt.org/z/z4j6Wsod5. Next, we should make `#![no_builtins]` participate in LTO again. That makes sense, as LTO also takes into consideration function-level instruction optimizations, such as the MachineOutliner. More importantly, when a user writes a large `#![no_builtins]` crate, they would like this crate to participate in LTO as well. We should also add a function-level no_builtins attribute to allow users to have more control over it. This is similar to Clang's `__attribute__((no_builtin))` feature, see https://clang.godbolt.org/z/Wod6KK6eq. Before implementing this feature, maybe we should discuss whether to support more fine-grained control, such as `__attribute__((no_builtin("memcpy")))`. Related discussions: - rust-lang#109821 - rust-lang#35540 Next (a separate pull request?): - [ ] Revert rust-lang#35637 - [ ] Add a function-level `no_builtin` attribute?
…, r=pnkfelix Add the `no-builtins` attribute to functions when `no_builtins` is applied at the crate level. **When `no_builtins` is applied at the crate level, we should add the `no-builtins` attribute to each function to ensure it takes effect in LTO.** This is also the reason why no_builtins does not take effect in LTO as mentioned in rust-lang#35540. Now, `#![no_builtins]` should be similar to `-fno-builtin` in clang/gcc, see https://clang.godbolt.org/z/z4j6Wsod5. Next, we should make `#![no_builtins]` participate in LTO again. That makes sense, as LTO also takes into consideration function-level instruction optimizations, such as the MachineOutliner. More importantly, when a user writes a large `#![no_builtins]` crate, they would like this crate to participate in LTO as well. We should also add a function-level no_builtins attribute to allow users to have more control over it. This is similar to Clang's `__attribute__((no_builtin))` feature, see https://clang.godbolt.org/z/Wod6KK6eq. Before implementing this feature, maybe we should discuss whether to support more fine-grained control, such as `__attribute__((no_builtin("memcpy")))`. Related discussions: - rust-lang#109821 - rust-lang#35540 Next (a separate pull request?): - [ ] Revert rust-lang#35637 - [ ] Add a function-level `no_builtin` attribute?
Fixes #108853. Fixes #72140. Fixes #110606.
The problem is that
compiler_builtins
is not involved in LTO. Whenshare-generics
+Oz/s/1
is turned on, the symbols needed forcompiler_builtins
are defined incore
or other target files are involved in LTO.In this case, we should export the relevant symbols at LTO. And we can write similar code to reproduce it in the stable version. Please see https://github.com/DianQK/rust-109821-reproducible.
Before LTO:
After LTO (Before the PR):
After LTO (After the PR):
Why does the problem only occur in Windows?
The rustc adds the
-gc-sections
on Linux and-Wl,-dead_strip
on macOS. This problem can also be reproduced if these parameters are removed.Another candidate patch that allows compiler_builtins to generate LLVM IR to be involved in LTO. Let
ignored_for_lto
always return false.TODO:
rustc_codegen_gcc
.