Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Limit CI runs to pushes and pull requests on main repo #4305

Merged
merged 3 commits into from
Nov 12, 2023

Conversation

rgommers
Copy link
Contributor

This is a follow-up to gh-4271.

  • Add conditions to all CI jobs to only run on main repo by default
  • Cancel running CI jobs when new changes are pushed to a PR
  • Only trigger AWS Graviton 3 job on develop and release-** branches

At the moment, when a contributor pushes the latest develop to their own branch to bring their own fork in sync with main, or if they push another branch, this triggers 30 CI jobs to run. Most will complete silently and only burn CPU time unnecessarily. If there's a failure, this may result in unexpected failure notifications. And the AWS Graviton3 run won't complete at all and time out, since the Cirun hook will only work when triggered from the main repo.

The group expression ensures that the cancel-in-progress behavior is to only cancel if a new commit is pushed to the PR for which the job is running, not other PRs. This is a fairly standard snippet, used also in CI jobs for NumPy and other projects.

For the last change, triggering only on develop and release-** branches, that could make sense to apply to all jobs as well (it's what we do in NumPy/SciPy). But I wasn't sure, that's more a personal preference. The difference is that that prevents the job from triggering to begin with, and hence it avoids the few seconds needed to start an instance, evaluate the if: github.repository == snippet, and shut down again. That process generates actions in the log on your own fork (note that arm64 graviton cirun is not visible here, because of the third commit):

image

If you want me to add that to all CI jobs, please let me know @martin-frbg.

This is a follow-up to OpenMathLibgh-4271. At the moment, when a contributor
pushes the latest `develop` to their own branch to bring their own
fork in sync with `main`, or if they push another branch, this triggers
30 CI jobs to run. Most will complete silently and only burn CPU
time unnecessarily. If there's a failure, this may result in unexpected
failure notifications. And the AWS Graviton3 run won't complete at all
and time out, since the Cirun hook will only work when triggered from
the main repo.
The `group` expression ensures that the cancel-in-progress
behavior is to only cancel if a new commit is pushed to the PR for
which the job is running, not other PRs.

This is a fairly standard snippet, used also in CI jobs for NumPy
and other projects.
@martin-frbg
Copy link
Collaborator

Thank you very much. I do not think it makes sense to expand the "only develop and release branches" bit to all CI jobs (if I understand correctly, this would e.g. disable CI on PRs against the present "riscv" branch). Preventing spurious runs outside the main repo would seem to make sense for the time-limited Cirrus service however. (I am not entirely sure how this would affect contributors who happen to have CI app bindings set up for their own fork in order to try PRs "in private" first - guess they would need to back out the change ?)

@martin-frbg martin-frbg merged commit 6f094c3 into OpenMathLib:develop Nov 12, 2023
64 checks passed
@rgommers rgommers deleted the ci-limit-runs branch November 13, 2023 10:50
@rgommers
Copy link
Contributor Author

Thanks Martin.

I do not think it makes sense to expand the "only develop and release branches" bit to all CI jobs (if I understand correctly, this would e.g. disable CI on PRs against the present "riscv" branch).

The intent was "all actively maintained branches", I didn't know about riscv being active. That said, riscv is orthogonal to AWS Graviton3, so the third commit in this PR doesn't need a change I think.

I am not entirely sure how this would affect contributors who happen to have CI app bindings set up for their own fork in order to try PRs "in private" first - guess they would need to back out the change

This PR didn't affect Azure or Cirrus, so nothing changes there. I'll note that we do have custom logic to be able to skip runs on those better in NumPy/SciPy, but that's more cumbersome to deal with so I didn't bother to include that unless there's a real need.

For GitHub Actions yes, the default behavior changes so if you want the old default back then you need to enable something as a contributor. In my experience, if I want to test something on my own fork before submitting a PR but after it works locally, it's for a specific CI config that I don't have locally. In those cases I'll insert a DEBUG: run on fork commit that has a 1 line diff to change the if: condition on the job of interest. And then drop that commit again before opening a PR.

@martin-frbg
Copy link
Collaborator

thanks, my comment was only in response to your question if it would be useful to apply similar restrictions to all the other CI jobs
(I may try that with cirrus though, seeing that we ran out out free credits in the middle of the month again)

@rgommers
Copy link
Contributor Author

(I may try that with cirrus though, seeing that we ran out out free credits in the middle of the month again)

Two things we did that helped:

  1. Don't run CI on merge commits into main (/ develop), since they're invariably successful if PRs had green CI, and in the unlikely case there's a problem, it will get caught soon after on the next PR anyway
  2. Sequence Cirrus jobs to let macOS arm64 run only after Linux/FreeBSD completes successfully (since macOS is ~5x more expensive per job)

The logic for some of that has to be written in Starlark rather than YAML. https://github.com/scipy/scipy/blob/main/.cirrus.star may contain something of interest.

Also, I spotted what looks like a quick win: the most expensive job is AppleM1/LLVM/CMAKE, because it's running the build on a single core. Not sure if that's on purpose? Happy to submit a PR if not.

@martin-frbg
Copy link
Collaborator

Thanks, that's certainly not intentional, just me forgetting that CMAKE won't instruct make and the MAKEFLAGS as output by getarch is useless unless applied manually. Maybe I'll look into caching the brew install llvm later - this is 40 seconds of single-core utilization in each m1/llvm job.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants