Make TimeZones.jl relocatable #479

Merged: 21 commits merged into JuliaTime:master from lc/relocatable on Dec 20, 2024

Contributor

@lcontento lcontento commented Oct 29, 2024

Fixes #467

Half of a possible solution to #467. The other half is at JuliaTime/TZJData.jl#32.

As discussed in the issue, there are many possible ways to fix this. I have tested that this approach works when including TimeZones.jl in a system image and relocating the depot. I think it should also be 100% equivalent to the old code in a normal setup. I cannot rule out other hidden relocatability issues, but even if this is not the long-term solution to the problem, I think it is a decent fix for most people's use cases.

@omus
Member

omus commented Oct 30, 2024

Would be good to re-run and post the benchmarks from #457 around importing to see if this has an impact on load time.

@lcontento
Contributor Author

lcontento commented Oct 30, 2024

> Would be good to re-run and post the benchmarks from #457 around importing to see if this has an impact on load time.

I tried the benchmarks. The first run after precompiling the packages is considerably slower, but subsequent runs (after restarting the REPL) are faster and consistent. This holds for all three cases reported below; I am not sure why (maybe the artifact is re-downloaded or re-checked every time the package is recompiled?). I have reported only the timings from the second run onward (let me know if you want to see the first-run timings): they all seem comparable and within the run-to-run variation I observe.

Before this PR:

julia> @time_imports import TimeZones
      0.5 ms  Scratch
      3.7 ms  InlineStrings
      0.2 ms  TZJData
      0.4 ms  Compat
      0.2 ms  Compat  CompatLinearAlgebraExt
      0.3 ms  ExprTools
      1.1 ms  Mocking
               ┌ 0.7 ms TimeZones.TZData.__init__() 
               ├ 0.0 ms TimeZones.__init__() 
     31.1 ms  TimeZones 36.28% compilation time

julia> using BenchmarkTools, TimeZones

julia> @btime istimezone("Europe/Warsaw");
  72.906 ns (1 allocation: 48 bytes)

This PR (with vanilla TZJData.jl)

julia> @time_imports import TimeZones
      0.4 ms  Scratch
      3.7 ms  InlineStrings
      0.3 ms  TZJData
      0.4 ms  Compat
      0.2 ms  Compat  CompatLinearAlgebraExt
      0.3 ms  ExprTools
      1.0 ms  Mocking
               ┌ 0.5 ms TimeZones.TZData.__init__() 
               ├ 0.1 ms TimeZones.__init__() 
     30.3 ms  TimeZones 33.97% compilation time

julia> using BenchmarkTools, TimeZones

julia> @btime istimezone("Europe/Warsaw");
  77.583 ns (1 allocation: 48 bytes)

This PR (with relocatable TZJData.jl)

julia> @time_imports import TimeZones
      0.4 ms  Scratch
      3.8 ms  InlineStrings
      1.0 ms  TZJData
      0.5 ms  Compat
      0.2 ms  Compat  CompatLinearAlgebraExt
      0.3 ms  ExprTools
      1.1 ms  Mocking
               ┌ 0.7 ms TimeZones.TZData.__init__() 
               ├ 0.1 ms TimeZones.__init__() 
     30.5 ms  TimeZones 36.08% compilation time

julia> using BenchmarkTools, TimeZones

julia> @btime istimezone("Europe/Warsaw");
  74.769 ns (1 allocation: 48 bytes)


codecov bot commented Nov 4, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 92.69%. Comparing base (2bc8f50) to head (8d430e1).
Report is 20 commits behind head on master.

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #479      +/-   ##
==========================================
- Coverage   92.79%   92.69%   -0.11%     
==========================================
  Files          39       38       -1     
  Lines        1818     1834      +16     
==========================================
+ Hits         1687     1700      +13     
- Misses        131      134       +3     


@lcontento lcontento requested a review from omus November 4, 2024 17:14
@lcontento
Contributor Author

I tried to avoid the code coverage errors by excluding the new code, but the project level check still fails. The errors on Julia nightly were not there last time I looked at this, but they do not seem related to this PR.
@omus Is there anything else you would like to see improved here?

@omus
Member

omus commented Dec 17, 2024

I'll try to get this PR moving this week

@omus
Member

omus commented Dec 19, 2024

We are good to move forward here. I'll need to take care of the failing nightly tests but I'll do that in another PR first

src/TimeZones.jl Outdated
error("TZJData.jl with TZDATA_VERSION = $(TZJData.TZDATA_VERSION) is supposed to be relocatable!")
end
# Backwards compatibility for TZJData versions below v1.3.1
artifact_dict = Artifacts.parse_toml(joinpath(pkgdir(TZJData), "Artifacts.toml"))
Contributor Author

This will not work with a system image because pkgdir(TZJData) is the path of the TZJData package on the machine that created the system image.

Contributor Author

One could try to take the last part of the path and look into all depots until one finds the new package path. However, I am not 100% sure that having the source of the package in the depot is a requirement; I think you can probably run the system image without downloading the source. Hence my original, but definitely not elegant, approach of hard-coding the hashes...
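For readers unfamiliar with the hard-coded-hash idea, here is a minimal sketch of how it could look. It is an illustration under assumptions, not the code in this PR: `TZJDATA_TREE_HASH` and `compiled_dir` are hypothetical names, and the hash is simply the artifact directory name that appears in the transcripts further down this thread. The point is that `Artifacts.artifact_path` consults the depots active at call time, so nothing from the build machine is stored.

```julia
using Artifacts

# Hypothetical constant: the tree hash of one tzjdata artifact, taken from the
# artifact directory name shown in this thread. A real implementation would
# need one entry per supported TZJData release.
const TZJDATA_TREE_HASH = Base.SHA1("7fdea2a12522469ca39925546d1fd93c10748180")

# Resolve the artifact directory against the depots that are active now.
# Overrides are ignored here only to keep the sketch deterministic.
compiled_dir() = Artifacts.artifact_path(TZJDATA_TREE_HASH; honor_overrides=false)

println(compiled_dir())
# Reports whether the artifact is actually installed in any active depot.
println(Artifacts.artifact_exists(TZJDATA_TREE_HASH; honor_overrides=false))
```

Only the hash is stored as a constant, so the returned directory follows `JULIA_DEPOT_PATH`; the cost is that the hash table has to be kept in sync with TZJData releases by hand.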

Member

Really? That seems like a bug as we're calling pkgdir at package initialization time so this should be looked up on the system which is running the sysimage

Contributor Author

I guess it's a situation similar to @__DIR__ which is also non-relocatable.
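To make the comparison concrete, here is a small sketch contrasting a path captured when a module is compiled with a value computed each time it is loaded. The module `RelocationDemo` and its fields are hypothetical and unrelated to the TimeZones.jl source; whether `pkgdir` behaves like the first or the second case inside a system image is exactly what this thread goes on to test.

```julia
module RelocationDemo

# Evaluated while the module is being compiled: if this module were baked into
# a precompile cache or system image, the value would be stored as it was on
# the build machine.
const BAKED_DIR = @__DIR__

# Filled in by __init__, which runs every time the module is loaded, so the
# value reflects the machine and depot that are running the code.
const RUNTIME_DEPOT = Ref{String}("")

function __init__()
    RUNTIME_DEPOT[] = isempty(Base.DEPOT_PATH) ? "" : first(Base.DEPOT_PATH)
end

end # module

println("captured at compile time: ", RelocationDemo.BAKED_DIR)
println("looked up at load time:   ", RelocationDemo.RUNTIME_DEPOT[])
```

Run as a plain script, both values come from the current machine; the difference only shows up once the module is compiled in one place and loaded in another.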

Member

@omus omus Dec 20, 2024

Appears to be working as intended (note for future readers: you need to use TZJData 1.3.0):

❯ JULIA_DEPOT_PATH=/tmp/depot julia -E 'using Pkg; Pkg.add(PackageSpec(name="TimeZones", version="1.19.0")); using TimeZones; TimeZones._COMPILED_DIR[]'
  Installing known registries into `/tmp/depot`
       Added `General` registry to /tmp/depot/registries
    Updating registry at `/tmp/depot/registries/General.toml`
   Resolving package versions...
   Installed Mocking ─────── v0.8.1
   Installed InlineStrings ─ v1.4.2
   Installed Compat ──────── v4.16.0
   Installed TZJData ─────── v1.3.0+2024b
   Installed ExprTools ───── v0.1.10
   Installed Scratch ─────── v1.2.1
   Installed TimeZones ───── v1.19.0
  Downloaded artifact: tzjdata
    Updating `/private/tmp/depot/environments/v1.11/Project.toml`
  [f269a46b] + TimeZones v1.19.0
    Updating `/private/tmp/depot/environments/v1.11/Manifest.toml`
  [34da2185] + Compat v4.16.0
  [e2ba6199] + ExprTools v0.1.10
  [842dd82b] + InlineStrings v1.4.2
  [78c3b35d] + Mocking v0.8.1
  [6c6a2e73] + Scratch v1.2.1
  [dc5dba14] + TZJData v1.3.0+2024b
  [f269a46b] + TimeZones v1.19.0
  [0dad84c5] + ArgTools v1.1.2
  [56f22d72] + Artifacts v1.11.0
  [ade2ca70] + Dates v1.11.0
  [f43a241f] + Downloads v1.6.0
  [7b1f6079] + FileWatching v1.11.0
  [b27032c2] + LibCURL v0.6.4
  [8f399da3] + Libdl v1.11.0
  [ca575930] + NetworkOptions v1.2.0
  [de0858da] + Printf v1.11.0
  [9a3f8284] + Random v1.11.0
  [ea8e919c] + SHA v0.7.0
  [fa267f1f] + TOML v1.0.3
  [cf7118a7] + UUIDs v1.11.0
  [4ec0a83e] + Unicode v1.11.0
  [deac9b47] + LibCURL_jll v8.6.0+0
  [29816b5a] + LibSSH2_jll v1.11.0+1
  [c8ffd9c3] + MbedTLS_jll v2.28.6+0
  [14a3606d] + MozillaCACerts_jll v2023.12.12
  [83775a58] + Zlib_jll v1.2.13+1
  [8e850ede] + nghttp2_jll v1.59.0+0
  [3f19e933] + p7zip_jll v17.4.0+2
Precompiling project...
  9 dependencies successfully precompiled in 7 seconds. 14 already precompiled.
"/tmp/depot/artifacts/7fdea2a12522469ca39925546d1fd93c10748180"

❯ mv /tmp/depot /tmp/depot2

❯ JULIA_DEPOT_PATH=/tmp/depot2 julia -E 'using TimeZones; TimeZones._COMPILED_DIR[]'
"/tmp/depot/artifacts/7fdea2a12522469ca39925546d1fd93c10748180"

❯ rm -rf /tmp/depot2

❯ JULIA_DEPOT_PATH=/tmp/depot julia -E 'using Pkg; Pkg.add(PackageSpec(url="https://github.com/lcontento/TimeZones.jl", rev="lc/relocatable")); using TimeZones; TimeZones._COMPILED_DIR[]'
  Installing known registries into `/tmp/depot`
       Added `General` registry to /tmp/depot/registries
     Cloning git-repo `https://github.com/lcontento/TimeZones.jl`
    Updating git-repo `https://github.com/lcontento/TimeZones.jl`
    Updating registry at `/tmp/depot/registries/General.toml`
   Resolving package versions...
   Installed Scratch ─────── v1.2.1
   Installed Compat ──────── v4.16.0
   Installed InlineStrings ─ v1.4.2
   Installed TZJData ─────── v1.3.0+2024b
   Installed Mocking ─────── v0.8.1
   Installed ExprTools ───── v0.1.10
  Downloaded artifact: tzjdata
    Updating `/private/tmp/depot/environments/v1.11/Project.toml`
  [f269a46b] + TimeZones v1.19.0 `https://github.com/lcontento/TimeZones.jl#lc/relocatable`
    Updating `/private/tmp/depot/environments/v1.11/Manifest.toml`
  [34da2185] + Compat v4.16.0
  [e2ba6199] + ExprTools v0.1.10
  [842dd82b] + InlineStrings v1.4.2
  [78c3b35d] + Mocking v0.8.1
  [6c6a2e73] + Scratch v1.2.1
  [dc5dba14] + TZJData v1.3.0+2024b
  [f269a46b] + TimeZones v1.19.0 `https://github.com/lcontento/TimeZones.jl#lc/relocatable`
  [0dad84c5] + ArgTools v1.1.2
  [56f22d72] + Artifacts v1.11.0
  [ade2ca70] + Dates v1.11.0
  [f43a241f] + Downloads v1.6.0
  [7b1f6079] + FileWatching v1.11.0
  [b27032c2] + LibCURL v0.6.4
  [8f399da3] + Libdl v1.11.0
  [ca575930] + NetworkOptions v1.2.0
  [de0858da] + Printf v1.11.0
  [9a3f8284] + Random v1.11.0
  [ea8e919c] + SHA v0.7.0
  [fa267f1f] + TOML v1.0.3
  [cf7118a7] + UUIDs v1.11.0
  [4ec0a83e] + Unicode v1.11.0
  [deac9b47] + LibCURL_jll v8.6.0+0
  [29816b5a] + LibSSH2_jll v1.11.0+1
  [c8ffd9c3] + MbedTLS_jll v2.28.6+0
  [14a3606d] + MozillaCACerts_jll v2023.12.12
  [83775a58] + Zlib_jll v1.2.13+1
  [8e850ede] + nghttp2_jll v1.59.0+0
  [3f19e933] + p7zip_jll v17.4.0+2
Precompiling project...
  9 dependencies successfully precompiled in 6 seconds. 14 already precompiled.
"/tmp/depot/artifacts/7fdea2a12522469ca39925546d1fd93c10748180"

❯ mv /tmp/depot /tmp/depot2

❯ JULIA_DEPOT_PATH=/tmp/depot2 julia -E 'using TimeZones; TimeZones._COMPILED_DIR[]'
"/tmp/depot2/artifacts/7fdea2a12522469ca39925546d1fd93c10748180"

Maybe there is something I'm forgetting about sysimages. Can you verify this works for your use case?

Member

@omus omus Dec 20, 2024

Thanks. I was also working on an example:

docker run -it --entrypoint=/bin/bash julia:1.11.2

apt-get update -qq
apt-get install -qq build-essential

cat >demo.jl <<EOF
using TimeZones
TimeZone("America/Winnipeg")
EOF

julia -e '
    using Pkg
    Pkg.add([
        PackageSpec(name="PackageCompiler", version="1"),
        PackageSpec(url="https://github.com/lcontento/TimeZones.jl", rev="lc/relocatable"), 
        PackageSpec(name="TZJData", version="1.3.0"),
    ])
    '

julia -e '
    using PackageCompiler
    PackageCompiler.create_sysimage(["TimeZones"]; sysimage_path="demo-sysimage.so", precompile_execution_file="demo.jl")
    '

echo "With sysimage"
julia -Jdemo-sysimage.so --trace-compile=stderr demo.jl

rm -rf /root/.julia/packages /root/.julia/compiled
echo "With sysimage after delete"
julia -Jdemo-sysimage.so --trace-compile=stderr demo.jl

The system image no longer requires the package contents or the .ji to be present in the depot. However, the artifact must still exist. With the pkgdir-based solution I have in place, pkgdir still points to the location where Artifacts.toml would be, but loading the file fails.

Update: I just realized I was using an old version of PackageCompiler.jl

Member

I suspect I'll roll back to using the hardcoded hashes but I'll do a little more experimentation first

Member

@omus omus Dec 20, 2024

I came up with an improved MWE:

rm -rf build_depot build_env sysimage_depot sysimage_env
mkdir build_depot build_env sysimage_env
JULIA_DEPOT_PATH=build_depot julia --project=sysimage_env -e '
    using Pkg
    Pkg.add([
        PackageSpec(name="TZJData", version=v"1.3.0"),
        PackageSpec(url="https://github.com/lcontento/TimeZones.jl", rev="lc/relocatable"),
    ])
    Pkg.precompile()'
JULIA_DEPOT_PATH=build_depot julia --project=build_env -e '
    using Pkg
    Pkg.add(PackageSpec(name="PackageCompiler", version="2"))
    Pkg.precompile()
    using PackageCompiler
    create_sysimage(; project="sysimage_env", sysimage_path="sysimage.so")'
JULIA_DEPOT_PATH=build_depot julia -Jsysimage.so --project=sysimage_env -E 'using TimeZones; TimeZones._COMPILED_DIR[]'
mv build_depot sysimage_depot
JULIA_DEPOT_PATH=sysimage_depot julia -Jsysimage.so --project=sysimage_env -E 'using TimeZones; TimeZones._COMPILED_DIR[]'

It looks like we can use Base.identify_package and Base.locate_package to get the package location with the activated sysimage.
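A rough sketch of what such a lookup could look like follows. The helper `package_root` is hypothetical and is not the code that was merged. `Base.identify_package` and `Base.locate_package` are unexported Base functions, so their behavior may differ between Julia versions, and the sketch assumes the usual `<root>/src/<Name>.jl` package layout.

```julia
# Hypothetical helper: find a package's root directory at run time by asking
# the active environment and depots, instead of relying on a stored path.
function package_root(name::String)
    pkg = Base.identify_package(name)   # a Base.PkgId, or nothing if unknown
    pkg === nothing && return nothing
    entry = Base.locate_package(pkg)    # path to the entry file, or nothing
    entry === nothing && return nothing
    return dirname(dirname(entry))      # strip "src/<Name>.jl"
end

root = package_root("TZJData")
if root === nothing
    println("TZJData is not available in the active environment")
else
    println(joinpath(root, "Artifacts.toml"))
end
```

Both calls may return `nothing`, which matters for the sysimage case discussed above, where the package source may be absent from the depot.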

Contributor

How long does it take to build a sysimage that contains TimeZones.jl? If it just takes a few minutes, could we maybe add a sysimage test to CI, so that we can catch any relocatability regressions in the future?

We could check for ENV["CI"], so that people who run Pkg.test("TimeZones") locally don't have to run the test if they don't want to.
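A minimal sketch of that gating, assuming the CI provider sets `CI=true` (GitHub Actions does). The function name `run_sysimage_tests` and the placeholder test body are hypothetical.

```julia
using Test

# Hypothetical helper: run the expensive tests only when CI=true is set.
# The environment is passed as an argument so the check is easy to exercise.
run_sysimage_tests(env=ENV) = lowercase(get(env, "CI", "false")) == "true"

@testset "sysimage relocatability" begin
    if run_sysimage_tests()
        # Placeholder for the real steps: build a sysimage, move the depot,
        # then load TimeZones from the relocated depot.
        @test true
    else
        @info "Skipping sysimage tests; set CI=true to run them"
    end
end
```

Locally, running with `CI=true` set in the environment would opt in to the slow path.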

Member

@omus omus Dec 20, 2024

I ended up writing these tests directly in GHA. If I find this to be problematic I'll pull them out into the TimeZones tests directly in the future

@omus
Member

omus commented Dec 20, 2024

I think I would prefer to take the CodeCov hit and try to address the coverage drop later

@omus
Member

omus commented Dec 20, 2024

I re-ran the benchmark on Julia 1.11.2:

With this PR:

julia> @time_imports import TimeZones
      0.7 ms  Printf
     17.2 ms  Dates
      0.4 ms  Scratch
      3.7 ms  InlineStrings
      0.5 ms  TZJData
               ┌ 0.1 ms p7zip_jll.__init__()
      3.7 ms  p7zip_jll
               ┌ 0.0 ms NetworkOptions.__init__()
      2.3 ms  NetworkOptions
      8.0 ms  ArgTools
               ┌ 0.3 ms nghttp2_jll.__init__()
      2.3 ms  nghttp2_jll
               ┌ 1.6 ms LibCURL_jll.__init__()
      3.4 ms  LibCURL_jll
               ┌ 0.0 ms MozillaCACerts_jll.__init__()
      2.0 ms  MozillaCACerts_jll
               ┌ 0.0 ms LibCURL.__init__()
      1.1 ms  LibCURL
               ┌ 2.9 ms Downloads.Curl.__init__()
     15.7 ms  Downloads
      0.3 ms  UUIDs
      0.4 ms  Compat
      0.3 ms  Compat → CompatLinearAlgebraExt
      0.4 ms  ExprTools
      0.5 ms  Mocking
               ┌ 3.6 ms TimeZones.TZData.__init__() 62.16% compilation time
               ├ 0.1 ms TimeZones.__init__()
     24.9 ms  TimeZones 8.91% compilation time

julia> using BenchmarkTools, TimeZones

julia> @btime istimezone("Europe/Warsaw");
  73.451 ns (1 allocation: 48 bytes)

Before this PR (with updated TZJData):

julia> @time_imports import TimeZones
      0.6 ms  Printf
     14.4 ms  Dates
      0.5 ms  Scratch
      2.9 ms  InlineStrings
      0.4 ms  TZJData
               ┌ 0.0 ms p7zip_jll.__init__()
      3.1 ms  p7zip_jll
               ┌ 0.0 ms NetworkOptions.__init__()
      1.8 ms  NetworkOptions
      7.2 ms  ArgTools
               ┌ 0.4 ms nghttp2_jll.__init__()
      2.4 ms  nghttp2_jll
               ┌ 1.6 ms LibCURL_jll.__init__()
      3.4 ms  LibCURL_jll
               ┌ 0.0 ms MozillaCACerts_jll.__init__()
      2.3 ms  MozillaCACerts_jll
               ┌ 0.0 ms LibCURL.__init__()
      1.4 ms  LibCURL
               ┌ 2.8 ms Downloads.Curl.__init__()
     16.3 ms  Downloads
      0.4 ms  UUIDs
      0.4 ms  Compat
      0.3 ms  Compat → CompatLinearAlgebraExt
      0.4 ms  ExprTools
      0.5 ms  Mocking
               ┌ 2.7 ms TimeZones.TZData.__init__() 79.32% compilation time
               ├ 0.0 ms TimeZones.__init__()
     24.2 ms  TimeZones 8.97% compilation time

julia> using BenchmarkTools, TimeZones

julia> @btime istimezone("Europe/Warsaw");
  73.484 ns (1 allocation: 48 bytes)

@omus
Member

omus commented Dec 20, 2024

Hmmm, the CI sysimage test isn't failing when I expected it to. Ah, I forgot to force the old version of TZJData.jl

@omus
Member

omus commented Dec 20, 2024

I've verified that the tests are failing when using pkgdir and older versions of TZJData: 73da68a. Should be good to merge this after the CI wraps up. Thanks everyone!

@omus omus merged commit d6eff71 into JuliaTime:master Dec 20, 2024
20 of 21 checks passed
@lcontento lcontento deleted the lc/relocatable branch January 7, 2025 11:05
Development

Successfully merging this pull request may close these issues.

Relocatability no longer given due to link to TZJData.ARTIFACT_DIR and storing path in const string reference
3 participants