Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ci: add sccache to speed up ci build #824

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

xxchan
Copy link
Contributor

@xxchan xxchan commented Dec 18, 2024

e.g., the unit job takes 10min in total, and build takes 5min. So I think we can introduce cache to speed it up.

image

@@ -104,6 +106,9 @@ jobs:
steps:
- uses: actions/checkout@v4

- name: Setup sccache
uses: mozilla-actions/[email protected]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The latest version is 0.0.7, but I only see 0.0.4 listed here https://github.com/apache/infrastructure-actions/blob/main/approved_patterns.yml

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we consider submit a pr to ask infra team to update approved pattens? cc @Xuanwo

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Signed-off-by: xxchan <[email protected]>
@xxchan xxchan force-pushed the xxchan/sacred-lobster branch from 15cb5f9 to deb7371 Compare December 18, 2024 14:27
@xxchan
Copy link
Contributor Author

xxchan commented Dec 18, 2024

comparison:
before: https://github.com/apache/iceberg-rust/actions/runs/12394303990
image

after: https://github.com/apache/iceberg-rust/actions/runs/12395899529?pr=824

image

Seems not as good as expected, but generally reduced >1 min. UT reduced >2 min

@xxchan
Copy link
Contributor Author

xxchan commented Dec 18, 2024

Seems caused by Mozilla-Actions/sccache-action#50

@Fokko
Copy link
Contributor

Fokko commented Dec 18, 2024

Windows build took even longer after the change 🤔

@xxchan
Copy link
Contributor Author

xxchan commented Dec 19, 2024

The cache hit rate was very low, and there's plenty of Cache write errors (I guess it's rate limit Mozilla-Actions/sccache-action#50).

After running it for more times to warm up the cache, it should perform better.

image

@xxchan
Copy link
Contributor Author

xxchan commented Dec 19, 2024

I found an alternative to sccache: swatinem/rust-cache@v2 (also approved by ASF)

They work differently:

  • sccache writes/reads cache for each individual build artifact.
  • rust-cache has a large cache for all dependencies (keyed by Cargo.toml and Cargo.lock)

It's possible that rust-cache works better with GHA cache. But we will need to check in Cargo.lock (currently ignored) to make it work better. (We can consider doing it, and it has other benefits like a reproduciable build for CI and all developers.)


IMO sticking with sccache is also not too bad as it seems simpler to me

@liurenjie1024
Copy link
Contributor

I found an alternative to sccache: swatinem/rust-cache@v2 (also approved by ASF)

They work differently:

  • sccache writes/reads cache for each individual build artifact.
  • rust-cache has a large cache for all dependencies (keyed by Cargo.toml and Cargo.lock)

It's possible that rust-cache works better with GHA cache. But we will need to check in Cargo.lock (currently ignored) to make it work better. (We can consider doing it, and it has other benefits like a reproduciable build for CI and all developers.)

IMO sticking with sccache is also not too bad as it seems simpler to me

I'm not a big fan of checking in Cargo.lock as it's an antipattern for library. I think sscache is good enough.

Copy link
Contributor

@liurenjie1024 liurenjie1024 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @xxchan for this pr! Personally I have no strong opinions about this, as it doesn't seem to improve performance too much, while rate limit error may even slow down the test. But in general it's not bad for rust project. I'm looking forward to hear others' ideas. cc @Xuanwo @Fokko @kevinjqliu

@xxchan
Copy link
Contributor Author

xxchan commented Dec 20, 2024

To be more precise about the performance gain: For unit test: previous compile time is 5m 06s, now 2m 35s

cache stats:

(non-cacheable calls could be the bottleneck in this case. They are usually some dependencies with build scripts e.g., C++ dependencies, and are slow to build. I guess rust-cache may also perform better for them)

image

@xxchan
Copy link
Contributor Author

xxchan commented Dec 20, 2024

I'm not a big fan of checking in Cargo.lock as it's an antipattern for library

Just FYI that it's not considered an antipattern any more https://blog.rust-lang.org/2023/08/29/committing-lockfiles.html

And e.g., opendal has it: https://github.com/apache/opendal/blob/main/core/Cargo.lock

I don't have strong opinion and I'm OK with both though.

@kevinjqliu
Copy link

kevinjqliu commented Dec 20, 2024

Since this PR only affects CI and seems to be an overall improvement. I don't see any downside of including this

btw im not sure if everyone has access to apache INFRA jira, happy to open a ticket to allow the 0.0.7 version of the library

@Fokko
Copy link
Contributor

Fokko commented Dec 21, 2024

The downside is additional complexity and maintenance. The Windows job takes the most time, and after the change it takes even more, so the maximum runtime gets worse, just the average runtime improves. I'm leaning toward what @liurenjie1024 is saying, and avoiding caching. Curious to learn what @Xuanwo thinks

@xxchan xxchan force-pushed the xxchan/sacred-lobster branch from 4b8e5ca to 92e30c4 Compare December 22, 2024 02:28
@xxchan xxchan force-pushed the xxchan/sacred-lobster branch from 92e30c4 to 22013a8 Compare December 22, 2024 03:35
Signed-off-by: xxchan <[email protected]>
@liurenjie1024
Copy link
Contributor

I'm not a big fan of checking in Cargo.lock as it's an antipattern for library

Just FYI that it's not considered an antipattern any more https://blog.rust-lang.org/2023/08/29/committing-lockfiles.html

And e.g., opendal has it: https://github.com/apache/opendal/blob/main/core/Cargo.lock

I don't have strong opinion and I'm OK with both though.

Thanks for the reference, I wasn't aware of it.

Would you mind to have a try with rust-cache? I'm not sure about the difference and it would be better to make decision with actual pr and result.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants