fix macOS CI issue (was: support cabal new-build) #240

Closed
gelisam wanted to merge 13 commits from issue-186/cabal-v2

Owner

gelisam commented Dec 29, 2020

This was originally an attempt to fix #186 (now continued at #242), but then I encountered an annoying bug in CI and this PR became a testing ground for that bug instead. I'll open another PR.

apparently with cabal-v2 the executable and the tests don't have access
to that auto-generated module.

gelisam commented Dec 29, 2020

When compiling using stack, there is a HASKELL_PACKAGE_SANDBOXES environment variable available at compile time giving the path to the package-dbs (there can be more than one). I was happy to find a similar HASKELL_DIST_DIR environment variable available with cabal-v2, but it's strange; when I print the variable's value at compile-time, I see it's /.../dist-newstyle/build/x86_64-osx/ghc-8.4.4/haskell-awk-1.1.1/t/reference, which does seem to be a path to a package-db. But when I actually run hawk, I get:

$ ~/.cabal/bin/hawk '2+2'
error: GhcException "can't find a package database at dist"

which means that the value is now dist. But I'm not looking at the environment variable at runtime, I'm saving it at compile time! So how come it is different?

Another difficulty is that the tests fail:

$ cabal v2-test
Running 1 test suites...
Test suite reference: RUNNING...

src/System/Console/Hawk/Lock.hs:23:1: error:
    Could not find module ‘System.FileLock’
Test suite reference: FAIL

This is a module from the filelock package, which is listed in the dependencies of both the executable and the test suite. I am guessing that this is, once again, a case in which doctest needs to be cajoled.


gelisam commented Dec 30, 2020

arh, cannot execute binary file on the macOS build again! This has nothing to do with this PR, it's something I struggled with in #238, it disappeared on its own after clearing the cache, and now it's back again. I should get to the bottom of this.


gelisam commented Dec 30, 2020

at 2020-12-28 9pm, https://github.com/gelisam/hawk/runs/1620047078 successfully found the cache at key macOS-stack-5c609afcc0a994ba05d1daed3f933e5ca8fc15a7fc70153c5dfcdf2cd435605e-2, and thus successfully ran the problematic binary file.

at 2020-12-29 12pm, https://github.com/gelisam/hawk/runs/1622766015 failed to find the cache at that same key. So I am guessing they get cleared daily. That's unfortunate for our build times, but whatever. It built successfully, so it successfully ran the problematic binary. The binary was then saved to the cache.

at 2020-12-29 6pm, https://github.com/gelisam/hawk/pull/240/checks?check_run_id=1623867348 successfully found the cache at that same key, but this time it was not able to run /Users/runner/.stack/setup-exe-cache/x86_64-osx/Cabal-simple_mPHDZzAJ_2.2.0.1_ghc-8.4.4.

Maybe the tar commands don't preserve the permissions? Maybe GitHub's cache system corrupts the cache somehow? Maybe GitHub has different kinds of macOS machines in its rotation, and the file is only considered executable on some of them?


gelisam commented Dec 30, 2020

I re-ran the build a couple of times, and it always fails with cannot execute binary file. So I don't think there are different kinds of macOS machines in the rotation.


gelisam commented Dec 30, 2020

The build passes when the cache is empty, the executable is still fine just before it is added to the cache, but then it is no longer fine when it is restored from the cache. Like last time. Well, at least it's consistent.

ghc is still installed via stack when it is set to false. maybe its
meaning is reversed?

gelisam commented Dec 30, 2020

According to ls -l and du -sh, it's the exact same file: same length, same permissions, same owner and group. So why won't it run??


gelisam commented Dec 30, 2020

Aha! The md5 of the file is different. The file is indeed corrupted! It's a bug in actions/cache@v2!

when putting the file into the cache:

Darwin Mac-1609350922102.local 19.6.0 Darwin Kernel Version 19.6.0: Thu Oct 29 22:56:45 PDT 2020; root:xnu-6153.141.2.2~1/RELEASE_X86_64 x86_64
-rwxr-xr-x  1 runner  staff  22467592 Dec 30 18:00 /Users/runner/.stack/setup-exe-cache/x86_64-osx/Cabal-simple_mPHDZzAJ_2.2.0.1_ghc-8.4.4
MD5 (/Users/runner/.stack/setup-exe-cache/x86_64-osx/Cabal-simple_mPHDZzAJ_2.2.0.1_ghc-8.4.4) = ffa64f019124ff6108ce099120cf1a54
/Users/runner/.stack/setup-exe-cache/x86_64-osx/Cabal-simple_mPHDZzAJ_2.2.0.1_ghc-8.4.4: Mach-O 64-bit executable x86_64

when extracting the file from the cache:

Darwin Mac-1609351629299.local 19.6.0 Darwin Kernel Version 19.6.0: Thu Oct 29 22:56:45 PDT 2020; root:xnu-6153.141.2.2~1/RELEASE_X86_64 x86_64
-rwxr-xr-x  1 runner  staff  22467592 Dec 30 18:00 /Users/runner/.stack/setup-exe-cache/x86_64-osx/Cabal-simple_mPHDZzAJ_2.2.0.1_ghc-8.4.4
MD5 (/Users/runner/.stack/setup-exe-cache/x86_64-osx/Cabal-simple_mPHDZzAJ_2.2.0.1_ghc-8.4.4) = 5bfa9c7d99ecbfc919996612846e0424
/Users/runner/.stack/setup-exe-cache/x86_64-osx/Cabal-simple_mPHDZzAJ_2.2.0.1_ghc-8.4.4: data
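
The two reports above look like the output of uname, ls -l, md5 and file. A minimal sketch of a helper that collects the same four facts, assuming it is run once before the cache is saved and once after it is restored; the name `fingerprint` is hypothetical and not taken from the workflow:

```shell
# Print a fingerprint of a file so that the copy saved to the cache and the
# copy restored from it can be compared line by line.
# `fingerprint` is a hypothetical helper name, not part of the workflow.
fingerprint() {
  target="$1"
  uname -a                      # which machine and kernel produced the report
  ls -l "$target"               # size, permissions, owner and group
  if command -v md5 >/dev/null 2>&1; then
    md5 "$target"               # macOS checksum tool
  else
    md5sum "$target"            # GNU coreutils fallback
  fi
  if command -v file >/dev/null 2>&1; then
    file "$target"              # e.g. "Mach-O 64-bit executable" vs. "data"
  fi
}
```

If the ls -l lines match but the checksum lines differ, the contents changed while the metadata stayed the same, which is the situation shown above.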


gelisam commented Dec 30, 2020

Looks like it has already been reported: actions/cache#445


gelisam commented Dec 30, 2020

A comment on a similar bug report claims to have found a workaround, but the link to that workaround is broken: actions/cache#403 (comment)

Fortunately I managed to find a different link to that workaround: shadowsocks/shadowsocks-rust@e88a536#diff-bc668a2c9f2299cef15b222055b4b4d5311646caec2e7610e540cee18ae9b948

Basically, the solution is to add

      - name: Install GNU tar
        run: |
          brew install gnu-tar
          # GITHUB_PATH replaces the ::add-path:: workflow command, which GitHub has disabled
          echo "/usr/local/opt/gnu-tar/libexec/gnubin" >> "$GITHUB_PATH"

before the actions/cache step so that it uses GNU tar instead of the BSD tar that macOS installs by default. So I guess that means the bug is in BSD tar?
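
To check that the PATH change actually took effect, one option is to look at the first line of tar --version. A small sketch, assuming GNU tar announces itself with "GNU tar" and BSD tar with "bsdtar" in that banner; both function names are hypothetical:

```shell
# Classify a `tar --version` banner as GNU tar or BSD tar (bsdtar).
# `classify_tar_version` and `tar_flavor` are hypothetical helper names.
classify_tar_version() {
  case "$1" in
    *"GNU tar"*) echo gnu ;;
    *bsdtar*)    echo bsd ;;
    *)           echo unknown ;;
  esac
}

# Report the flavor of whichever `tar` comes first on PATH.
tar_flavor() {
  classify_tar_version "$(tar --version 2>/dev/null | head -n 1)"
}
```

Running tar_flavor in a step placed just before actions/cache would show which tar a PATH lookup finds. It says nothing about a tool that calls tar through an absolute path.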


gelisam commented Dec 30, 2020

I see that the command is /usr/bin/tar --use-compress-program zstd, so it could be a bug in zstd. I tried installing it on my local machine, but it crashes during decompression for some reason, so I was not able to confirm the bug.
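
One way to try to reproduce the corruption locally would be to round-trip a file through tar and the external compressor and compare the bytes. This is only a sketch: `roundtrip_check` is a hypothetical helper, and the compressor commands are parameters because the exact flags the cache action passes are not shown in this thread.

```shell
# Round-trip a file through tar plus an external compressor and check that
# the bytes survive. `roundtrip_check` is a hypothetical helper name.
roundtrip_check() {
  src="$1"                       # file to test
  compress="${2:-zstd}"          # command used when creating the archive
  decompress="${3:-zstd -d}"     # command used when extracting
  work="$(mktemp -d)" || return 2
  mkdir "$work/out"
  # Archive by basename so extraction recreates the file under $work/out.
  tar --use-compress-program "$compress" -cf "$work/archive" \
      -C "$(dirname "$src")" "$(basename "$src")" &&
  tar --use-compress-program "$decompress" -xf "$work/archive" -C "$work/out" &&
  cmp -s "$src" "$work/out/$(basename "$src")"
  status=$?                      # 0 only if every step succeeded and bytes match
  rm -rf "$work"
  return "$status"
}
```

For example, `roundtrip_check ./some-binary && echo intact || echo corrupted` would exercise the zstd path, and passing `gzip` and `"gzip -d"` as the second and third arguments would show whether the problem follows tar or the compressor.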

gelisam changed the title from "support cabal new-build" to "fix macOS CI issue (was: support cabal new-build)" on Dec 30, 2020

gelisam commented Dec 30, 2020

Moved to #241

gelisam closed this on Dec 31, 2020
gelisam deleted the issue-186/cabal-v2 branch on December 31, 2020 00:08