
Improve CI for wasmer_wamr feature #116

Merged: mattyg merged 62 commits into develop from fix/integration-wasmer-wamr on Sep 11, 2024
Conversation

@mattyg (Member) commented Aug 30, 2024

Some additional cleanup I missed in #112

I discovered that Windows runners in the holochain repo were failing to build these crates with the wasmer_wamr feature, then realized I hadn't updated the Windows CI here to also test against that feature.

I am still unable to build with the wasmer_wamr feature on Windows, but I have made some progress. For now I've commented out those CI jobs. See #117.

I'm also not sure if this CI setup for running benchmarking makes sense. Do we want to run benchmarking on every platform and feature? If we're running `cargo bench` on a fresh runner every time, does it even have a prior run to compare to? Should it be comparing to develop for each PR?

@mattyg changed the title from "Fix/integration wasmer wamr" to "Some cleanup on wasmer/wamr integration" on Aug 30, 2024
mattyg (Member Author):

This is not being used anymore and appears years outdated.

os: ["ubuntu-latest", "macos-latest"]
wasmer-feature: ["wasmer_sys", "wasmer_wamr"]
mattyg (Member Author):

Separate CI jobs for each wasmer feature, for ease of review.

- run: nix develop --command ${{ matrix.script }}
- run: nix develop --command ./${{ matrix.script }}-${{ matrix.wasmer-feature }}.sh

test-windows:
mattyg (Member Author) commented Aug 30, 2024:

  • Test both wasmer features on Windows
  • Move the test-windows CI job into the same file to make it more obvious that both jobs need to be modified together (a sketch of such a matrix job follows below)
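
For readers less familiar with GitHub Actions, here is a minimal sketch of the kind of matrix job being discussed, running each wasmer feature on each OS via a per-feature script. The job name, script paths, and action versions are illustrative assumptions, not the exact workflow in this PR:

```yaml
jobs:
  test:
    strategy:
      fail-fast: false
      matrix:
        # Matches the matrix shown in the diff above; Windows is handled by a
        # separate test-windows job kept in the same workflow file.
        os: ["ubuntu-latest", "macos-latest"]
        wasmer-feature: ["wasmer_sys", "wasmer_wamr"]
    runs-on: ${{ matrix.os }}
    steps:
      - uses: actions/checkout@v4
      - uses: cachix/install-nix-action@v27
      # One script per feature (e.g. ./scripts/test-wasmer_sys.sh), so each
      # matrix entry only builds and tests against its own wasmer feature.
      - run: nix develop --command ./scripts/test-${{ matrix.wasmer-feature }}.sh
```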

@@ -103,7 +103,7 @@ very reasonable overhead for the host to build a completely fresh instance.
Calling a function with `holochain_wasmer_host::guest::call()` takes several `us`
for small input/output values and some `ms` for ~1mb of input/output data.

To see benchmarks on your system run `nix-shell --run ./bench.sh`.
To see benchmarks on your system run `nix-shell --run ./scripts/bench.sh`.
mattyg (Member Author):

Move all scripts into the scripts dir.

@mattyg changed the title from "Some cleanup on wasmer/wamr integration" to "WIP Some cleanup on wasmer/wamr integration" on Aug 30, 2024
@mattyg marked this pull request as ready for review on August 30, 2024
@@ -6,13 +6,18 @@ on:
- develop
pull_request: {}

concurrency:
mattyg (Member Author):

Cancel superseded CI jobs.

@mattyg changed the title from "WIP Some cleanup on wasmer/wamr integration" to "Improve CI for wasmer_wamr feature" on Sep 1, 2024
@neonphog (Contributor) commented Sep 6, 2024:

> I'm also not sure if this CI setup for running benchmarking makes sense. Do we want to run benchmarking on every platform and feature? If we're running `cargo bench` on a fresh runner every time, does it even have a prior run to compare to? Should it be comparing to develop for each PR?

Indeed, we need to figure out our benchmarking strategy in general: whether we're just visually inspecting or setting some threshold failure value, and whether we care about comparing changes or just looking at absolute spot results.

But that doesn't have to be solved for this PR : )

@neonphog (Contributor) left a review:

Nice, LGTM

@jost-s (Contributor) left a review:

Good question about the benchmarking. There's no comparison to previous results taking place right now. I tend to think we only need benchmarks on one platform, namely Linux.

runs-on: ${{ matrix.os }}
steps:
- name: increase swap space
if: matrix.os == 'ubuntu-latest'
uses: actionhippie/swap-space@v1
Contributor:

In holochain there's an action for Ubuntu to free up disk space - does that achieve what you need the swap space for?

mattyg (Member Author):

No, the bench runs on CI were crashing when run in series due to running out of memory.

@ThetaSinner (Member):

> does it even have a prior run to compare to?

I was confused by this. The benchmarks are actually set up to do a number of iterations in a known amount of time. So the performance benchmark is how many iterations they can complete in that time. So if performance regresses then we'd catch it. That's not so useful for performance being improved and then harmed again, unless those tests are actively maintained.

@@ -6,15 +6,31 @@ on:
- develop
pull_request: {}

concurrency:
group: ${{ github.head_ref }}
ThetaSinner (Member):

This should include the pipeline, otherwise it's concurrency across all CI workflows that use the head ref - e.g. https://github.com/holochain/holochain/blob/develop/.github/workflows/test.yml#L10

That uses github.workflow to keep workflow concurrency separate

ThetaSinner (Member):

Yes, but I think probably both? I think either `head_ref` or `ref_name` will conflict if you put them into two different workflows.

mattyg (Member Author):

Updated -- I just copied the settings from the holochain workflow.
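
For illustration, the kind of concurrency block discussed above might look roughly like this; the exact expression is a sketch, not necessarily what was committed:

```yaml
concurrency:
  # Scope the group to this workflow as well as the branch, so runs of other
  # workflows on the same branch are not cancelled by this one.
  group: ${{ github.workflow }}-${{ github.head_ref || github.ref_name }}
  # Cancel a superseded in-progress run when new commits are pushed.
  cancel-in-progress: true
```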

@@ -0,0 +1,4 @@
#! /usr/bin/env bash

./bench-wasmer_sys.sh && \
ThetaSinner (Member):

Any reason to use `&&` rather than just sequential commands?

mattyg (Member Author):

I guess so that if the first command fails you see the failure output as the last text in the terminal rather than having it get buried by the output of the subsequent command. I don't have a strong preference either way.

@ThetaSinner (Member) left a review:

Looks great! Really appreciate CI improvements being made on this repo; it's not had that much attention recently :)

@mattyg (Member Author) commented Sep 11, 2024:

> > does it even have a prior run to compare to?
>
> I was confused by this. The benchmarks are actually set up to do a number of iterations in a known amount of time. So the performance benchmark is how many iterations they can complete in that time. So if performance regresses then we'd catch it. That's not so useful for performance being improved and then harmed again, unless those tests are actively maintained.

First it runs a bunch of iterations to get a performance range, then that range is compared to saved results from the prior run if they are found in `target/criterion/` (see https://bheisler.github.io/criterion.rs/book/analysis.html#comparison).

I'm thinking the ideal setup would have develop benchmark results saved, and then run the PR benchmark to compare against it.
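
One hedged sketch of how that could be wired up with GitHub's cache action -- the step layout, key names, and script path are assumptions for illustration, not what the merged workflow does. The idea is to persist `target/criterion/` from runs on develop and restore it in PR runs, so criterion has a baseline to compare against:

```yaml
      # Restore the most recent baseline saved from a develop run, if any,
      # so criterion can compare this run's results against it.
      - uses: actions/cache/restore@v4
        with:
          path: target/criterion
          key: bench-${{ runner.os }}-${{ github.sha }}
          restore-keys: |
            bench-${{ runner.os }}-
      - run: nix develop --command ./scripts/bench.sh
      # Only runs on develop save a new baseline. Cache entries are immutable,
      # hence the per-commit key plus prefix-matching restore-keys above.
      - uses: actions/cache/save@v4
        if: github.ref == 'refs/heads/develop'
        with:
          path: target/criterion
          key: bench-${{ runner.os }}-${{ github.sha }}
```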

@mattyg merged commit 280f3ac into develop on Sep 11, 2024
8 checks passed
@mattyg deleted the fix/integration-wasmer-wamr branch on September 11, 2024