Allow Adding Runtime Precompiles to Module Image #55161

Open
PatrickHaecker opened this issue Jul 18, 2024 · 13 comments
Labels
compiler:precompilation Precompilation of modules feature Indicates new feature / enhancement requests pkgimage

Comments

@PatrickHaecker

PatrickHaecker commented Jul 18, 2024

There should be a command line parameter called --append-precompiles or --save-compile or something similar which makes Julia save all precompiles which happened during runtime (either in parallel at runtime or when the program finishes). The precompiles should probably be saved in the module image where the precompile was triggered from (the caller of the function).

This way, the precompilation image file contains both the precompiles from the precompile pass and the precompiles which came up during runtime (possibly the result of multiple runs (as the union set operation) if new calls come up in different runs). If the module is changed, all precompiles would get invalidated as usual. Afterwards, the number of precompiles grows first with the first precompile run and then possibly with each run (although most modules will not grow anymore at all after precompilation or only for the first run).
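The "union set operation" across runs can be approximated today outside of Julia's image format. The following is only a sketch, assuming each run was started with --trace-compile pointing at its own file; the function name `merge_precompile_traces` and the file names are hypothetical and not part of Julia or of this proposal.

```julia
# Hypothetical helper: merge the precompile statements recorded by several
# runs (one --trace-compile file per run) into one duplicate-free list.
function merge_precompile_traces(paths::AbstractString...)
    merged = Set{String}()
    for path in paths, line in eachline(path)
        statement = strip(line)
        # Keep only lines that look like precompile statements.
        startswith(statement, "precompile(") || continue
        push!(merged, String(statement))
    end
    # Sorting keeps the result stable from run to run.
    return sort!(collect(merged))
end
```

A statement that shows up in several runs is kept once, so the merged list only grows when a run triggers a compilation that no earlier run did.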

I am not sure how the REPL fits into this feature. Until there are good ideas, it might be best not to have this feature (saving new precompiles) in an interactive session.

With this mechanism in place I could even imagine that the regular precompile run might not be needed for a lot of use cases. Just run the code and whenever some method is called the first time, precompile it, save the precompilation and go on. That means after a code change the functions are precompiled iff they are actually used, i.e. no currently unused function is precompiled and no function is precompiled more than once (even when accounting for multiple program runs, as long as the involved code is not modified).

@bvdmitri
Contributor

Would be also amazing if those precompiled modules could be saved as shared libraries locally to the files/scripts/dev packages as Manifest.toml currently does. So whenever someone starts a script or developing a package those precompiled modules would be picked up automatically.

@nsajko nsajko added compiler:precompilation Precompilation of modules feature Indicates new feature / enhancement requests pkgimage labels Jul 18, 2024
@nsajko
Contributor

nsajko commented Jul 18, 2024

This is kind of already here, I think. It's possible to track all compilation, saving precompile calls to a file with --trace-compile=file_name. After that just put the precompile statements into a package and use the package from your startup.jl. Perhaps some of this should be further automated?
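As a rough illustration of that manual workflow: the file written by --trace-compile contains lines of the form `precompile(Tuple{typeof(f), ArgTypes...})`, which can be evaluated again in a later session or at the top level of a package. The helper below is a sketch under that assumption; `replay_precompiles` and the file name in the usage comment are hypothetical, and every module mentioned in the recorded statements must already be loaded for a statement to succeed.

```julia
# Hypothetical helper: evaluate the precompile statements recorded by an
# earlier run, e.g. one started as
#   julia --trace-compile=precompile_trace.jl script.jl
function replay_precompiles(path::AbstractString; target::Module = Main)
    succeeded = 0
    failed = 0
    for line in eachline(path)
        statement = strip(line)
        startswith(statement, "precompile(") || continue
        try
            # `precompile` returns `true` when the signature could be compiled.
            ok = Core.eval(target, Meta.parse(statement))
            ok === true ? (succeeded += 1) : (failed += 1)
        catch
            # Typically a type or module that is not loaded in this session.
            failed += 1
        end
    end
    return (; succeeded, failed)
end
```

Calling this from a normal session only warms up that session. For the results to be cached, the statements need to run while a package is being precompiled, which is the "put the precompile statements into a package" step.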

@bvdmitri
Contributor

@nsajko
That sounds like a lot of work for a regular user, especially for those who are not computer scientists or have a poor understanding of how compilers work. Additionally, this setup might not work well with Revise.

Perhaps some of this should be further automated?

That's how I read the initial proposal, yes. It would be nice to automate all of this, running the precompile statements behind the scenes and making it possible to dump and reload the precompiled modules between Julia sessions.

@PatrickHaecker
Author

PatrickHaecker commented Jul 18, 2024

This is kind of already here, I think. It's possible to track all compilation, saving precompile calls to a file with --trace-compile=file_name. After that just put the precompile statements into a package and use the package from your startup.jl. Perhaps some of this should be further automated?

Thanks, @nsajko, indeed I am currently following a similar, but even more complicated manual workflow (due to supporting both PackageCompiler and a "development build" and needing to investigate which precompiles were triggered by which module). I was thinking towards a better solution and this proposal is what I came up with from a user perspective. I would probably even set an alias on my system to have this parameter activated by default, because that's the behavior I want nearly all the time.

I also assume that we have most of the major building blocks already. As far as I know, only the "which module was responsible for this runtime precompile" part is missing a user interface, and this sounds like low-hanging fruit. I guess it's only a matter of putting the blocks together, but I do not know about the implementation details of these blocks. But to me that sounds like the best of both worlds (classical AoT compiled languages and interpreted/JIT/JAoT compiled languages).
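To make the missing piece concrete: a precompile signature only names the function and its argument types, so what can be recovered from it directly is the module that defines the function, not the module that triggered the compilation. The sketch below groups signatures by defining module as a rough stand-in. Both helper names are hypothetical, and `Base.unwrap_unionall` is an internal, non-exported function.

```julia
# Hypothetical helper: the module that defines the function named in a
# signature such as Tuple{typeof(f), Int}. This is NOT the calling module.
function defining_module(sig::Type{<:Tuple})
    function_type = Base.unwrap_unionall(sig).parameters[1]
    return parentmodule(function_type)
end

# Hypothetical helper: bucket signatures by the module found above.
function group_by_defining_module(signatures)
    groups = Dict{Module, Vector{Any}}()
    for sig in signatures
        push!(get!(groups, defining_module(sig), Any[]), sig)
    end
    return groups
end
```

Attributing a compilation to its caller, as the proposal asks, would need information that is not present in the signature itself.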

@PatrickHaecker
Author

Would be also amazing if those precompiled modules could be saved as shared libraries locally to the files/scripts/dev packages as Manifest.toml currently does. So whenever someone starts a script or developing a package those precompiled modules would be picked up automatically.

I agree that this would be cool when it works, but would it work so often? At least the native code would fail whenever someone has a different computer architecture / instruction set. Optimizations are also sometimes very CPU-specific (e.g. AVX-512). So this, although interesting, opens up a whole lot of additional questions. Therefore, I propose to have a separate issue for that question, if you want to follow up on this proposal.

@bvdmitri
Contributor

but would it work so often?

I think so. In my opinion, it would cover the vast majority of cases since most people use the same laptop or computer for months/years. For example, if I worked on experiments yesterday, I could restart the Julia session today with all the functions precompiled from yesterday. That's the use case for 99% of users. Currently, most of the cache is lost between sessions (although packages do precompile some stuff, the majority of session-specific compilations are wiped). What I was trying to propose is basically equivalent to not closing the Julia terminal session overnight and keeping it alive for days. Would be nice to just dump all the available precompiled cache in a binary file and restart julia session with this binary file later on such that it feels you never actually closed your terminal.

You are correct that this wouldn't work on a different computer architecture. However, I believe your original proposal would face the same issue, as precompiled modules wouldn't work between different computer architectures anyway. I may have misunderstood your proposal, though.

@mkitti
Contributor

mkitti commented Jul 18, 2024

JIT compilation and generating compilation output that can be reused in another session are fundamentally different. JIT compilation can avoid some indirection since it can assume the locations of procedures will not change. Whereas compilation that can be reloaded requires some indirection.

Julia's compilation modes can be affected by the options below, but they are not meant for the end user. The closest end-user implementation of what you want is pkgimages, which use the options below internally.

I'm unclear if you have tried pkgimages and PrecompileTools.jl and how they may or may not be applicable to your problem.

$ julia --help-hidden

    julia [switches] -- [programfile] [args...]

Switches (a '*' marks the default value, if applicable):

 --compile={yes*|no|all|min}
                          Enable or disable JIT compiler, or request exhaustive or minimal compilation

 --output-o <name>        Generate an object file (including system image data)
 --output-ji <name>       Generate a system image data file (.ji)
 --strip-metadata         Remove docstrings and source location info from system image
 --strip-ir               Remove IR (intermediate representation) of compiled functions

 --output-unopt-bc <name> Generate unoptimized LLVM bitcode (.bc)
 --output-bc <name>       Generate LLVM bitcode (.bc)
 --output-asm <name>      Generate an assembly file (.s)
 --output-incremental={yes|no*}
                          Generate an incremental output file (rather than complete)
 --trace-compile={stderr,name}
                          Print precompile statements for methods compiled during execution or save to a path
 --image-codegen          Force generate code in imaging mode
 --permalloc-pkgimg={yes|no*} Copy the data section of package images into memory

@PatrickHaecker
Author

Would be nice to just dump all the available precompiled cache in a binary file and restart julia session with this binary file later on such that it feels you never actually closed your terminal.

The use case of the same computer should already be covered in my proposal. I thought you wanted to extend it, but I think we have the same use case in our mind.

@mkitti
Contributor

mkitti commented Jul 19, 2024

Relaying the thought from the related Discourse thread, have you considered using PrecompileTools.jl with a Startup package?

https://julialang.github.io/PrecompileTools.jl/stable/#Tutorial:-local-%22Startup%22-packages

@PatrickHaecker
Author

JIT compilation can avoid some indirection since it can assume the locations of procedures will not change. Whereas compilation that can be reloaded requires some indirection.

Thanks for the explanations. So if I got it, then the feature request really is "if the flag is provided, generate relocatable JIT code which is then saved to the image file if it is not in there already".

The idea is to make use of pkgimages to basically achieve a similar, but faster and more comfortable effect than using PrecompileTools.jl.

I am not sure which of these command line arguments might support the described use case, so I tested them separately and commented them so that we can see whether my understanding is correct. Probably I need at least a combination of them, but I am unclear which one.

 --compile={yes*|no|all|min}         Seems to be orthogonal
 --output-o <name>        Results in "ERROR: File "boot.jl" not found"
 --output-ji <name>       Results in "ERROR: File "boot.jl" not found"
 --strip-metadata         Seems to be orthogonal
 --strip-ir               Seems to be orthogonal

 --output-unopt-bc <name>       Seems to be orthogonal
 --output-bc <name>       Seems to be orthogonal
 --output-asm <name>      Seems to be orthogonal
 --output-incremental={yes|no*}       This works, but I am not sure what it does without any other options, but it does not seem to save runtime precompiles per se.
 --trace-compile={stderr,name}       This might be a building block, but as long as it does not output the calling module, I am not sure how much it will help.
 --image-codegen          I am not sure what it does without any other options, but it does not seem to save runtime precompiles per se.
 --permalloc-pkgimg={yes|no*}          This sounds unrelated

I have the feeling that I did not state very well what I want to have as a solution. I tried to improve the wording, but please give me hints about what would help you to help me. :-)

@PatrickHaecker
Author

Relaying the thought from the related Discourse thread, have you considered using PrecompileTools.jl with a Startup package?

https://julialang.github.io/PrecompileTools.jl/stable/#Tutorial:-local-%22Startup%22-packages

Thanks for the hint, I guess you are referring to this thread. Ideally I do not want to have to set up or maintain anything, as this should really support a development workflow where things change. If a function gets precompiled outside of a module precompilation run, it should just be saved by Julia for next time without any function-specific configuration and without an additional run (as with a workload with PrecompileTools.jl).

@mkitti
Contributor

mkitti commented Jul 23, 2024

The idea is to make use of pkgimages to basically achieve a similar, but faster and more comfortable effect than using PrecompileTools.jl.

What is uncomfortable about PrecompileTools.jl? You technically do not need it, but it makes compiling modular relocatable code a lot more pleasant.

You could simply run code at the package module top-level and it will be saved into the pkgimage. However, it will run every single time you try to load the code. We can strip the functionality of PrecompileTools.jl down to this single if statement.

if ccall(:jl_generating_output, Cint, ()) == 1
    # code to compile to disk
end
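For illustration, here is how that guard might sit inside a package module. This is only a sketch: the module `GuardedExample` and the function `greet` are hypothetical, and the guard is the same `ccall` as in the snippet, which is nonzero only while Julia is generating output such as a package image.

```julia
module GuardedExample  # hypothetical package module

export greet

greet(name::AbstractString) = "Hello, " * name

# Run the workload only while Julia is generating output (for example during
# package precompilation), so a plain `include` does not execute it.
if ccall(:jl_generating_output, Cint, ()) == 1
    greet("world")  # compiles greet(::String) so it can be cached
end

end # module
```

In a regular session the guard is false and the workload is skipped, while `GuardedExample.greet("Julia")` still works as usual.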

You also need to consider when the serialized code is valid and when it is not. This is the issue that pkgimages solve.

Perhaps the issue is that you have a script and not a package. If so look into https://github.com/jolin-io/JuliaScript.jl which automates the creation of a package for a script.

If you want to understand the output options, you may want to study how PackageCompiler.jl works:

https://github.com/JuliaLang/PackageCompiler.jl/blob/master/src%2FPackageCompiler.jl#L462-L465

        cmd = `$(get_julia_cmd()) --cpu-target=$cpu_target $sysimage_build_args
            --sysimage=$base_sysimage --project=$project --output-o=$(object_file)
            $outputo_file`
        @debug "running $cmd"

@PatrickHaecker
Author

Thanks for all the hints, @mkitti .

What is uncomfortable about PrecompileTools.jl? You technically do not need it, but it makes compiling modular relocatable code a lot more pleasant.

The workflow with PrecompileTools.jl looks like this:

  • Getting familiar with PrecompileTools.jl
  • Importing/using it
  • Setting up a workload
  • Defining the compile workload
  • Updating the workload whenever there is a more significant change to your code
  • Paying not only the compilation time but also the additional runtime of the workload for every change

In comparison the proposed workflow looks like this:

  • Call e.g. julia --append-precompiles and it just works.

You could simply run code at the package module top-level and it will be saved into the pkgimage

Yes, for most of the cases this works. However, there are cases where this does not work (I hit some of these cases, too), see e.g. here or there.

Perhaps the issue is that you have a script and not a package. If so look into https://github.com/jolin-io/JuliaScript.jl which automates the creation of a package for a script.

I am using packages. Thanks a lot for pointing me towards JuliaScript.jl. I watched the JuliaCon talk about it and it looks interesting. However, as the first compilation run takes more time, this is at least currently not an improvement for code which changes frequently.


4 participants