[QT-637] Streamline our build pipeline (#24892)

Context
-------
Building and testing Vault artifacts on pull requests and merges is responsible for about one third of our overall spend on Vault CI. Of the artifacts that we ship as part of a release, we run Enos testing scenarios on the `linux/amd64` and `linux/arm64` binaries and their derivative artifacts. The extended build artifacts for non-Linux platforms or less common machine architectures are not tested at this time. They are built, notarized, and signed as part of every pull request update and merge. As we don't actually test these artifacts, the only gain we get from this rather expensive behavior is that we won't merge a change that would prevent Vault from building on one of the extended targets. Extended platform or architecture changes are quite rare, so performing this work as frequently as we do is costly in both money and developer time for little relative safety benefit.

Goals
-----
Rethink and implement how and when we build binaries and artifacts of Vault so that we can spend less money on repetitive work while also reducing the time it takes for the build and test pipelines to complete.

Solution
--------
Instead of building all release artifacts on every push, we'll opt to build only our testable (core) artifacts. With this change we are introducing a bit of risk. We could merge a change that breaks an extended platform and only find out after the fact when we trigger a complete build for a release. We'll hedge against that risk by building all of the release targets on a scheduled cadence to ensure that they are still buildable. We'll make building all of the targets optional on any pull request by use of a `build/all` label on the pull request.

Further considerations
----------------------
* We want to reduce the total number of workflows and runners for all of our pipelines if possible. As each workflow runner has infrastructure cost and runner time penalties, using a single runner over many is often preferred.
* Many of our job runners have been optimized for cost and performance. We should simplify the choices of which runners to use.
* CRT requires us to use the same build workflow in both CE and Ent. Historically that meant that modifying `build.yml` in CE would result in a merge conflict with `build.yml` in Ent, and break our merge workflows.
* Workflow flow control in both `build.yml` and `ci.yml` can be quite complicated, as each needs to maintain compatibility whether executed as CE or Ent, and when triggered with various GitHub events like pull_request, push, and workflow_call, each with their own requirements.
* Many jobs utilize similar patterns of flow control and metadata but are not reusable.
* Workflow call depth has a maximum of four, so we need to be quite considerate when calling other workflows.
* Called workflows can only have 10 inputs.

Implementation
--------------
* Refactor the `build.yml` workflow to be agnostic to whether or not it is executing in CE or Ent. That makes future updates to the build much easier as we won't have to worry about merge conflicts when the change is merged downstream.
* Extract common steps in workflows into composite actions that we can reuse.
* Fix bugs where some but not all workflows would use different Git references when building and testing a pull request.
* Rewrite the application, docs, and UI change helpers as a composite action. This allows us to re-use this logic to make consistent behavior choices across build and CI.
* Combine several `build.yml` and `ci.yml` jobs into our final job. This reduces the number of workflows required for the same behavior while saving time overall.
* Update most of our action pins.
Results
-------
| Metric            | Before   | After   | Diff  |
|-------------------|----------|---------|-------|
| Duration:         | ~14-18m  | ~15-18m | ~ =   |
| Workflows:        | 43       | 18      | - 58% |
| Billable time:    | ~1h15m   | 16m     | - 79% |
| Saved artifacts:  | 34       | 12      | - 65% |

Infra costs should map closely to billable time. Network I/O costs should map closely to the workflow count. Storage costs should map directly to saved artifacts.

We could probably get parity with duration by getting more clever with our UBI container build, as that's where we're seeing the increase. I'm not yet concerned as it takes roughly the same time for this job to complete as it did before.

While the CI workflow was not the focus of the PR, some shared refactoring does show some marginal improvements there.

| Metric            | Before   | After    | Diff   |
|-------------------|----------|----------|--------|
| Duration:         | ~24m     | ~12.75m  | - 15%  |
| Workflows:        | 55       | 47       | - 8%   |
| Billable time:    | ~4h20m   | ~3h36m   | - 7%   |

Further focus on streamlining the CI workflows would likely result in a few more marginal improvements, but nothing on the order of what we've seen with the build workflow.

Signed-off-by: Ryan Cragun <[email protected]>
hashicorp · Feb 6, 2024 · 89c75d3
1 parent 87d76fc
commit 89c75d3
Showing 31 changed files with 1,664 additions and 1,049 deletions.
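The Solution section of the commit message amounts to a small decision: build everything on the scheduled cadence or when a pull request carries the `build/all` label, and build only the core artifacts otherwise. A minimal sketch of that decision follows. The function name `build_scope`, the `core`/`all` return values, and the space separated label list are illustrative assumptions and are not taken from the workflows themselves.

```shell
#!/usr/bin/env bash
# Sketch only: decide which set of artifacts a workflow run should build.
# `build_scope`, the "core"/"all" values, and the label list format are
# hypothetical. Only the `build/all` label and the scheduled complete build
# come from the commit message.
build_scope() {
  local event_name="$1" # e.g. pull_request, push, schedule
  local labels="$2"     # space separated pull request labels

  case "$event_name" in
    schedule)
      # The scheduled run always builds every release target.
      echo "all"
      ;;
    pull_request)
      # A pull request opts in to the complete build with the build/all label.
      case " ${labels} " in
        *" build/all "*) echo "all" ;;
        *) echo "core" ;;
      esac
      ;;
    *)
      # Everything else (for example a merge) builds only the testable artifacts.
      echo "core"
      ;;
  esac
}
```

For example, `build_scope pull_request "bug build/all"` prints `all`, while `build_scope push ""` prints `core`.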
diff --git a/.github/actions/build-vault/action.yml b/.github/actions/build-vault/action.yml
@@ -0,0 +1,201 @@
+# Copyright (c) HashiCorp, Inc.
+# SPDX-License-Identifier: BUSL-1.1
+
+---
+name: Build Vault
+description: |
+  Build various Vault binaries and package them into Zip bundles, Deb and RPM packages,
+  and various container images. Upload the resulting artifacts to Github Actions artifact storage.
+  This composite action is used across both CE and Ent, thus it should maintain compatibility with
+  both repositories.
+
+inputs:
+  github-token:
+    type: string
+    description: An elevated Github token to access private Go modules if necessary.
+    default: ""
+  cgo-enabled:
+    type: number
+    description: Enable or disable CGO during the build.
+    default: 0
+  create-docker-container:
+    type: boolean
+    description: Package the binary into a Docker/AWS container.
+    default: true
+  create-redhat-container:
+    type: boolean
+    description: Package the binary into a Redhat container.
+    default: false
+  create-packages:
+    type: boolean
+    description: Package the binaries into deb and rpm formats.
+    default: true
+  goos:
+    type: string
+    description: The Go GOOS value environment variable to set during the build.
+  goarch:
+    type: string
+    description: The Go GOARCH value environment variable to set during the build.
+  goarm:
+    type: string
+    description: The Go GOARM value environment variable to set during the build.
+    default: ""
+  goexperiment:
+    type: string
+    description: Which Go experiments to enable.
+    default: ""
+  go-tags:
+    type: string
+    description: A comma separated list of tags to pass to the Go compiler during build.
+    default: ""
+  package-name:
+    type: string
+    description: The name to use for the linux packages.
+    default: ${{ github.event.repository.name }}
+  vault-binary-name:
+    type: string
+    description: The name of the vault binary.
+    default: vault
+  vault-edition:
+    type: string
+    description: The edition of vault to build.
+  vault-version:
+    type: string
+    description: The version metadata to inject into the build via the linker.
+  web-ui-cache-key:
+    type: string
+    description: The cache key for restoring the pre-built web UI artifact.
+
+outputs:
+  vault-binary-path:
+    description: The location of the built binary.
+    value: ${{ steps.containerize.outputs.vault-binary-path != '' && steps.containerize.outputs.vault-binary-path || steps.metadata.outputs.binary-path }}
+
+runs:
+  using: composite
+  steps:
+    - name: Ensure zstd is available for actions/cache
+      # actions/cache restores based on cache key and "cache version", the former is unique to the
+      # build job or web UI, the latter is a hash which is based on the runner OS, the paths being
+      # cached, and the program used to compress it. Most of our workflows will use zstd to compress
+      # the cached artifact so we have to have it around for our machines to get both a version match
+      # and to decompress it. Most runners include zstd by default but there are exceptions like
+      # our Ubuntu 20.04 compatibility runners which do not.
+      shell: bash
+      run: which zstd || (sudo apt update && sudo apt install -y zstd)
+    - uses: ./.github/actions/set-up-go
+      with:
+        github-token: ${{ inputs.github-token }}
+    - if: inputs.vault-edition != 'ce'
+      name: Configure Git
+      shell: bash
+      run: git config --global url."https://${{ inputs.github-token }}:@github.com".insteadOf "https://github.com"
+    - name: Restore UI from cache
+      uses: actions/cache@e12d46a63a90f2fae62d114769bbf2a179198b5c # v3.3.3
+      with:
+        # Restore the UI asset from the UI build workflow. Never use a partial restore key.
+        enableCrossOsArchive: true
+        fail-on-cache-miss: true
+        path: http/web_ui
+        key: ${{ inputs.web-ui-cache-key }}
+    - name: Metadata
+      id: metadata
+      env:
+        # We need these for the artifact basename helper
+        GOARCH: ${{ inputs.goarch }}
+        GOOS: ${{ inputs.goos }}
+        VERSION: ${{ inputs.vault-version }}
+        VERSION_METADATA: ${{ inputs.vault-edition != 'ce' && inputs.vault-edition || '' }}
+      shell: bash
+      run: |
+        if [[ '${{ inputs.vault-edition }}' =~ 'ce' ]]; then
+          build_step_name='Vault ${{ inputs.goos }} ${{ inputs.goarch }} v${{ inputs.vault-version }}'
+          package_version='${{ inputs.vault-version }}'
+        else
+          build_step_name='Vault ${{ inputs.goos }} ${{ inputs.goarch }} v${{ inputs.vault-version }}+${{ inputs.vault-edition }}'
+          package_version='${{ inputs.vault-version }}+ent' # this should always be +ent here regardless of enterprise edition
+        fi
+        {
+          echo "artifact-basename=$(make ci-get-artifact-basename)"
+          echo "binary-path=dist/${{ inputs.vault-binary-name }}"
+          echo "build-step-name=${build_step_name}"
+          echo "package-version=${package_version}"
+        } | tee -a "$GITHUB_OUTPUT"
+    - name: ${{ steps.metadata.outputs.build-step-name }}
+      env:
+        CGO_ENABLED: ${{ inputs.cgo-enabled }}
+        GO_TAGS: ${{ inputs.go-tags }}
+        GOARCH: ${{ inputs.goarch }}
+        GOARM: ${{ inputs.goarm }}
+        GOOS: ${{ inputs.goos }}
+        GOEXPERIMENT: ${{ inputs.goexperiment }}
+        GOPRIVATE: github.com/hashicorp
+        VERSION: ${{ inputs.vault-version }}
+        VERSION_METADATA: ${{ inputs.vault-edition != 'ce' && inputs.vault-edition || '' }}
+      shell: bash
+      run: make ci-build
+    - if: inputs.vault-edition != 'ce'
+      shell: bash
+      run: make ci-prepare-legal
+    - name: Bundle Vault
+      env:
+        BUNDLE_PATH: out/${{ steps.metadata.outputs.artifact-basename }}.zip
+      shell: bash
+      run: make ci-bundle
+    # Use actions/upload-artifact @3.x until https://hashicorp.atlassian.net/browse/HREL-99 is resolved
+    - uses: actions/upload-artifact@a8a3f3ad30e3422c9c7b888a15615d19a852ae32 # v3.1.3
+      with:
+        name: ${{ steps.metadata.outputs.artifact-basename }}.zip
+        path: out/${{ steps.metadata.outputs.artifact-basename }}.zip
+        if-no-files-found: error
+    - if: inputs.create-packages == 'true'
+      uses: hashicorp/actions-packaging-linux@v1
+      with:
+        name: ${{ inputs.package-name }}
+        description: Vault is a tool for secrets management, encryption as a service, and privileged access management.
+        arch: ${{ inputs.goarch }}
+        version: ${{ steps.metadata.outputs.package-version }}
+        maintainer: HashiCorp
+        homepage: https://github.com/hashicorp/vault
+        license: BUSL-1.1
+        binary: ${{ steps.metadata.outputs.binary-path }}
+        deb_depends: openssl
+        rpm_depends: openssl
+        config_dir: .release/linux/package/
+        preinstall: .release/linux/preinst
+        postinstall: .release/linux/postinst
+        postremove: .release/linux/postrm
+    - if: inputs.create-packages == 'true'
+      id: package-files
+      name: Determine package file names
+      shell: bash
+      run: |
+        {
+          echo "rpm-files=$(basename out/*.rpm)"
+          echo "deb-files=$(basename out/*.deb)"
+        } | tee -a "$GITHUB_OUTPUT"
+    - if: inputs.create-packages == 'true'
+      # Use actions/upload-artifact @3.x until https://hashicorp.atlassian.net/browse/HREL-99 is resolved
+      uses: actions/upload-artifact@a8a3f3ad30e3422c9c7b888a15615d19a852ae32 # v3.1.3
+      with:
+        name: ${{ steps.package-files.outputs.rpm-files }}
+        path: out/${{ steps.package-files.outputs.rpm-files }}
+        if-no-files-found: error
+    - if: inputs.create-packages == 'true'
+      # Use actions/upload-artifact @3.x until https://hashicorp.atlassian.net/browse/HREL-99 is resolved
+      uses: actions/upload-artifact@a8a3f3ad30e3422c9c7b888a15615d19a852ae32 # v3.1.3
+      with:
+        name: ${{ steps.package-files.outputs.deb-files }}
+        path: out/${{ steps.package-files.outputs.deb-files }}
+        if-no-files-found: error
+    # Do our containerization last as it will move the binary location if we create containers.
+    - uses: ./.github/actions/containerize
+      id: containerize
+      with:
+        docker: ${{ inputs.create-docker-container }}
+        redhat: ${{ inputs.create-redhat-container }}
+        goarch: ${{ inputs.goarch }}
+        goos: ${{ inputs.goos }}
+        vault-binary-path: ${{ steps.metadata.outputs.binary-path }}
+        vault-edition: ${{ inputs.vault-edition }}
+        vault-version: ${{ inputs.vault-version }}
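The naming rules in the Metadata step of the build-vault action can be read as two small pure functions. The sketch below restates them outside of the workflow so they can be run locally. The function names are hypothetical, the version `1.16.0` in the usage note is only an example, and the sketch compares the edition with exact equality, whereas the step itself uses a `=~ 'ce'` match.

```shell
#!/usr/bin/env bash
# Sketch only: the package version and build step name rules from the
# Metadata step, restated as functions. The function names are hypothetical.

# CE packages use the bare version. Any other edition is packaged as +ent,
# regardless of which enterprise edition is being built.
package_version() {
  local edition="$1" version="$2"
  if [ "$edition" = "ce" ]; then
    echo "$version"
  else
    echo "${version}+ent"
  fi
}

# The step name carries the full edition name for non-CE builds.
build_step_name() {
  local goos="$1" goarch="$2" edition="$3" version="$4"
  if [ "$edition" = "ce" ]; then
    echo "Vault ${goos} ${goarch} v${version}"
  else
    echo "Vault ${goos} ${goarch} v${version}+${edition}"
  fi
}
```

Calling `package_version ent.hsm 1.16.0` prints `1.16.0+ent`, and `build_step_name linux amd64 ent.hsm 1.16.0` prints `Vault linux amd64 v1.16.0+ent.hsm`.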
diff --git a/.github/actions/changed-files/action.yml b/.github/actions/changed-files/action.yml
@@ -0,0 +1,73 @@
+# Copyright (c) HashiCorp, Inc.
+# SPDX-License-Identifier: BUSL-1.1
+
+---
+name: Determine what files changed between two git references.
+description: |
+  Determine what files have changed between two git references. If the github.event_name is
+  pull_request we'll compare the github.base_ref (merge target) and pull request head SHA.
+  For other event types we'll gather the changed files from the most recent commit. This allows
+  us to support PR and merge workflows.
+
+outputs:
+  app-changed:
+    description: Whether or not the vault Go app was modified.
+    value: ${{ steps.changed-files.outputs.app-changed }}
+  docs-changed:
+    description: Whether or not the documentation was modified.
+    value: ${{ steps.changed-files.outputs.docs-changed }}
+  ui-changed:
+    description: Whether or not the web UI was modified.
+    value: ${{ steps.changed-files.outputs.ui-changed }}
+  files:
+    description: All of the file names that changed.
+    value: ${{ steps.changed-files.outputs.files }}
+
+runs:
+  using: composite
+  steps:
+    - id: ref
+      shell: bash
+      name: ref
+      run: |
+        # Determine our desired checkout ref.
+        #
+        # * If the trigger event is pull_request we will default to a magical merge SHA that Github
+        #   creates. This SHA is the product of merging our PR into the merge target branch at
+        #   the point in time when we created the PR. When you push a change to a PR branch
+        #   Github updates this branch if it can. When you rebase a PR it updates this branch.
+        #
+        # * If the trigger event is pull_request and a `checkout-head` tag is present or the
+        #   checkout-head input is set, we'll use HEAD of the PR branch instead of the magical
+        #   merge SHA.
+        #
+        # * If the trigger event is a push (merge) then we'll get the latest commit that was pushed.
+        #
+        # * For any other event type we'll default to whatever is default in Github.
+        if [ '${{ github.event_name }}' = 'pull_request' ]; then
+          checkout_ref='${{ github.event.pull_request.head.sha }}'
+        elif [ '${{ github.event_name }}' = 'push' ]; then
+          # For a push (merge) use the latest commit that was pushed.
+          checkout_ref='${{ github.event.after && github.event.after || github.event.push.after }}'
+        else
+          checkout_ref='${{ github.ref }}'
+        fi
+        echo "ref=${checkout_ref}" | tee -a "$GITHUB_OUTPUT"
+    - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
+      with:
+        repository: ${{ github.repository }}
+        path: "changed-files"
+        # The fetch-depth could probably be optimized at some point. It's currently set to zero to
+        # ensure that we have a successful diff, regardless of how many commits might be
+        # present between the two references we're comparing. It would be nice to change this
+        # depending on the number of commits by using the push.commits and/or pull_request.commits
+        # payload fields, however, they have different behavior and limitations. For now we'll do
+        # the slow but sure thing of getting the whole repository.
+        fetch-depth: 0
+        ref: ${{ steps.ref.outputs.ref }}
+    - id: changed-files
+      name: changed-files
+      # This script writes output values to $GITHUB_OUTPUT and STDOUT
+      shell: bash
+      run: ./.github/scripts/changed-files.sh ${{ github.event_name }} ${{ github.ref_name }} ${{ github.base_ref }}
+      working-directory: changed-files
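The changed-files action reports three booleans (`app-changed`, `docs-changed`, `ui-changed`) that other jobs use for flow control. The `changed-files.sh` script that computes them is not shown in this section, so the following is only a sketch of how such a classification could work and does not reproduce that script. The function name and the `website/` and `ui/` path prefixes are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: classify a list of changed file paths into the three outputs
# that the changed-files action exposes. The function name and the path
# prefixes (website/ for docs, ui/ for the web UI) are assumptions and are
# not taken from changed-files.sh.
classify_changed_files() {
  local app=false docs=false ui=false file
  for file in "$@"; do
    case "$file" in
      website/*) docs=true ;;
      ui/*) ui=true ;;
      *) app=true ;; # anything else is treated as an app change
    esac
  done
  # Emit key=value lines in the same shape used for $GITHUB_OUTPUT.
  echo "app-changed=${app}"
  echo "docs-changed=${docs}"
  echo "ui-changed=${ui}"
}
```

In a workflow the output would typically be piped through `tee -a "$GITHUB_OUTPUT"`, as the actions in this change do for their own metadata.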
diff --git a/.github/actions/checkout/action.yml b/.github/actions/checkout/action.yml
@@ -0,0 +1,77 @@
+# Copyright (c) HashiCorp, Inc.
+# SPDX-License-Identifier: BUSL-1.1
+
+---
+name: Check out the correct git reference.
+description: |
+  Determine and checkout the correct Git reference depending on the actions event type and tags.
+
+inputs:
+  checkout-head:
+    description: |
+      Whether or not to check out HEAD on a pull request. This can also be triggered with a
+      `checkout-head` tag.
+    default: 'false'
+  path:
+    description: Relative path to $GITHUB_WORKSPACE to check out to
+    default: ""
+
+outputs:
+  ref:
+    description: The git reference that was checked out.
+    value: ${{ steps.ref.outputs.ref }}
+  depth:
+    description: The fetch depth that was checked out.
+    value: ${{ steps.ref.outputs.depth }}
+
+runs:
+  using: composite
+  steps:
+    - id: ref
+      shell: bash
+      run: |
+        # Determine our desired checkout ref and fetch depth. Depending on our workflow event
+        # trigger, inputs, and tags, we'll check out different references at different depths.
+        #
+        # * If the trigger event is a pull request we will default to a magical merge SHA that Github
+        #   creates. Essentially, this SHA is the product of merging our PR into the merge target
+        #   branch at some point in time. When you push a change to a PR branch Github updates this
+        #   branch if it can.
+        # * If the trigger event is a pull request and a `checkout-head` tag is present or the
+        #   checkout-head input is set, we'll use HEAD of the PR branch instead of the magical
+        #   merge SHA.
+        # * If the trigger event is a push (merge) then we'll get the latest commit that was pushed.
+        # * For any other event type we'll default to whatever is default in Github.
+        #
+        # Our fetch depth will vary depending on what our chosen SHA is. We normally want to do
+        # the most shallow clone possible for speed, but we also need to support getting history
+        # for determining what files have changed, etc. We'll always check out one level deep for
+        # merges or standard pull requests. If checking out HEAD is requested we'll fetch a deeper
+        # history because we need all commits on the branch.
+        #
+        if [ '${{ github.event_name }}' = 'pull_request' ]; then
+          if [ '${{ contains(github.event.pull_request.labels.*.name, 'checkout-head') || inputs.checkout-head == 'true' }}' = 'true' ]; then
+            checkout_ref='${{ github.event.pull_request.head.sha }}'
+            fetch_depth=0
+          else
+            checkout_ref='${{ github.ref }}'
+            fetch_depth=1
+          fi
+        elif [ '${{ github.event_name }}' = 'push' ]; then
+          # For a push (merge) use the latest commit that was pushed.
+          checkout_ref='${{ github.event.after && github.event.after || github.event.push.after }}'
+          fetch_depth=1
+        else
+          checkout_ref='${{ github.ref }}'
+          fetch_depth=0
+        fi
+
+        {
+          echo "ref=${checkout_ref}"
+          echo "depth=${fetch_depth}"
+        } | tee -a "$GITHUB_OUTPUT"
+    - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
+      with:
+        path: ${{ inputs.path }}
+        fetch-depth: ${{ steps.ref.outputs.depth }}
+        ref: ${{ steps.ref.outputs.ref }}
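The ref and fetch depth selection in the checkout action depends only on the event name, the `checkout-head` request, and a few values from the event payload, so it can be exercised locally as a plain function. The sketch below mirrors the branches of the `ref` step. The function name and its positional parameters are hypothetical stand-ins for the `${{ ... }}` expressions that the action interpolates.

```shell
#!/usr/bin/env bash
# Sketch only: the ref and fetch depth selection from the checkout action's
# `ref` step, restated as a function. The function name and parameter order
# are hypothetical.
select_checkout() {
  local event_name="$1"    # pull_request, push, or anything else
  local checkout_head="$2" # "true" when the checkout-head label or input is set
  local head_sha="$3"      # HEAD SHA of the pull request branch
  local pushed_sha="$4"    # latest commit of a push event
  local github_ref="$5"    # default ref for the event
  local ref depth

  case "$event_name" in
    pull_request)
      if [ "$checkout_head" = "true" ]; then
        # HEAD of the PR branch needs the full history of the branch.
        ref="$head_sha"
        depth=0
      else
        # The merge ref only needs a shallow clone.
        ref="$github_ref"
        depth=1
      fi
      ;;
    push)
      ref="$pushed_sha"
      depth=1
      ;;
    *)
      ref="$github_ref"
      depth=0
      ;;
  esac

  echo "ref=${ref}"
  echo "depth=${depth}"
}
```

For instance, `select_checkout pull_request false abc123 "" refs/pull/1/merge` prints `ref=refs/pull/1/merge` followed by `depth=1`. The SHA and ref values here are placeholders.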