Remove target locking #43

Closed
wants to merge 3 commits into from
145 changes: 52 additions & 93 deletions rust/Earthfile
@@ -1,96 +1,81 @@
VERSION --global-cache 0.7
VERSION --use-function-keyword --global-cache 0.7

`--use-function-keyword` is only available in version 0.8, but the repo and samples are pinned to version 0.7.

Hi Alex, that feature flag is available since 0.7.22 https://github.com/earthly/earthly/blob/v0.7.22/features/features.go#L67
Maybe your CLI is not updated?

# INIT sets some configuration in the environment (to be used by rest of the functions later on).
# Arguments:
# - cache_prefix: Overrides cache prefix for cache IDS. Its value is exported to the build environment under the entry: $EARTHLY_CARGO_CACHE_PREFIX (see below)
# This function sets the following entries in the calling environment:
# - CARGO_HOME: Changed to point to an internal location in the mount cache
# - PATH: Changed to include original $CARGO_HOME/bin
# - EARTHLY_CARGO_CACHE_PREFIX: Value of the cache_prefix, by default ${EARTHLY_TARGET_PROJECT_NO_TAG}#${OS_RELEASE}#earthly-cargo-cache#${EARTHLY_GIT_BRANCH}
# - CARGO_INSTALL_ROOT: Changed to point to the original $CARGO_HOME
# - EARTHLY_RUST_CARGO_HOME_CACHE: Cache mount definition for the $CARGO_HOME
# - EARTHLY_RUST_TARGET_CACHE: Cache mount definition for ./target folder

# INIT sets some configuration in the environment (used by following functions), and installs required dependencies.
# - cache_prefix: Overrides cache prefix for cache IDS. Its value is exported to the build environment under the entry: $EARTHLY_CACHE_PREFIX. By default ${EARTHLY_TARGET_PROJECT_NO_TAG}#${OS_RELEASE}#earthly-cargo-cache
# - keep_fingerprints (false): Instructs the following +CARGO calls to don't remove the Cargo fingerprints of the source packages. Use only when source packages have been COPYed with --keep-ts option.
# - sweep_days (4): +CARGO uses cargo-sweep to clean build artifacts that haven't been accessed for this number of days.
INIT:
COMMAND
RUN if [ -n "$EARTHLY_CACHE_PREFIX" ]; then \
FUNCTION
RUN if [ -n "$EARTHLY_CARGO_CACHE_PREFIX" ]; then \
echo "+INIT has already been called in this build environment" ; \
exit 1; \
fi
IF [ "$CARGO_HOME" = "" ]
ENV CARGO_HOME="$HOME/.cargo"
END

# Add current cargo home binaries folder to the path
IF ! echo $PATH | grep -E -q "(^|:)$CARGO_HOME/bin($|:)"
ENV PATH="$PATH:$CARGO_HOME/bin"
END
DO +INSTALL_CARGO_SWEEP
RUN mkdir -p /tmp/earthly/cfg
#https://docs.earthly.dev/docs/earthfile/builtin-args
ARG EARTHLY_GIT_BRANCH
ARG EARTHLY_TARGET_PROJECT_NO_TAG

# $EARTHLY_CACHE_PREFIX
ARG EARTHLY_TARGET_PROJECT_NO_TAG #https://docs.earthly.dev/docs/earthfile/builtin-args
ARG OS_RELEASE=$(md5sum /etc/os-release | cut -d ' ' -f 1)
ARG cache_prefix="${EARTHLY_TARGET_PROJECT_NO_TAG}#${OS_RELEASE}#earthly-cargo-cache"
ENV EARTHLY_CACHE_PREFIX=$cache_prefix

# $EARTHLY_KEEP_FINGERPRINTS
ARG keep_fingerprints=false
ENV EARTHLY_KEEP_FINGERPRINTS=$keep_fingerprints
ARG cache_prefix="${EARTHLY_TARGET_PROJECT_NO_TAG}#${OS_RELEASE}#earthly-cargo-cache#${EARTHLY_GIT_BRANCH}"
ENV EARTHLY_CARGO_CACHE_PREFIX=$cache_prefix

# $EARTHLY_SWEEP_DAYS
ARG sweep_days=4
ENV EARTHLY_SWEEP_DAYS=$sweep_days

# Make sure that crates installed through this function are stored in the original cargo home, and not in the cargo home within the mount cache.
# This way, if BK garbage-collects them, the build is not broken.
# The following entry will make crates installed through this function reside in the original cargo home, and not in the cargo home within the mount cache.
# This way, if BK garbage-collects them, the build won't be broken.
ENV CARGO_INSTALL_ROOT=$CARGO_HOME
# We change $CARGO_HOME while keeping $ORIGINAL_CARGO_HOME/bin directory in the path. This way, the Cargo binary is still accessible and the whole $CARGO_HOME is within the global cache

# We now change $CARGO_HOME while keeping the original $CARGO_HOME/bin directory in the path. This way, the cargo binaries are still accessible and the whole $CARGO_HOME is within the global cache
# ($CARGO_HOME/.package-cache has to be in the cache so Cargo can properly synchronize parallel access to $CARGO_HOME resources).
ENV CARGO_HOME="/tmp/earthly/.cargo"

# Set Cargo caches
ENV EARTHLY_RUST_CARGO_HOME_CACHE="type=cache,mode=0777,id=$EARTHLY_CARGO_CACHE_PREFIX#cargo-home,sharing=shared,target=$CARGO_HOME"
ENV EARTHLY_RUST_TARGET_CACHE="type=cache,mode=0777,id=${EARTHLY_CARGO_CACHE_PREFIX}#target,sharing=shared,target=target"

# CARGO runs the cargo command "cargo $args".
# This function is thread safe. Parallel builds of targets calling this function should be free of race conditions.
# Notice that in order to run this function, +INIT must be called first.
# Arguments:
# - args: Cargo subcommand and its arguments. Required.
# - output: Regex matching output artifacts files to be copied to ./target folder in the caller filesystem (image layers).
# Use this argument when you want to SAVE an ARTIFACT from the target folder (mounted cache), always trying to minimize the total size of the copied fileset.
# For example --output="release/[^\./]+" would keep all the files in /target/release that don't have any extension.
CARGO:
COMMAND
FUNCTION
DO +CHECK_INITED
ARG --required args
ARG output
DO +SET_CACHE_MOUNTS_ENV
IF [ "$EARTHLY_KEEP_FINGERPRINTS" = "false" ]
DO +REMOVE_SOURCE_FINGERPRINTS
END
RUN --mount=$EARTHLY_RUST_CARGO_HOME_CACHE --mount=$EARTHLY_RUST_TARGET_CACHE \
set -e; \
cargo $args; \
cargo sweep -r -t $EARTHLY_SWEEP_DAYS; \
cargo sweep -r -i;
cargo $args;
IF [ "$output" != "" ]
DO +COPY_OUTPUT --output=$output
END

# SET_CACHE_MOUNTS_ENV sets the following entries in the environment, to be used to mount the cargo caches.
# - EARTHLY_RUST_CARGO_HOME_CACHE: Code of the mount cache for the cargo home.
# - EARTHLY_RUST_TARGET_CACHE: Code of the mount cache for the target folder.
# Notice that in order to run this function, +INIT must be called first.
# Example:
# DO rust+SET_CACHE_MOUNTS_ENV
# RUN --mount=$EARTHLY_RUST_CARGO_HOME_CACHE --mount=$EARTHLY_RUST_TARGET_CACHE cargo build --release
SET_CACHE_MOUNTS_ENV:
COMMAND
DO +CHECK_INITED
ARG EARTHLY_TARGET_NAME #https://docs.earthly.dev/docs/earthfile/builtin-args
ENV EARTHLY_RUST_CARGO_HOME_CACHE="type=cache,mode=0777,id=$EARTHLY_CACHE_PREFIX#cargo-home,sharing=shared,target=$CARGO_HOME"
ENV EARTHLY_RUST_TARGET_CACHE="type=cache,mode=0777,id=${EARTHLY_CACHE_PREFIX}#target#${EARTHLY_TARGET_NAME},sharing=locked,target=target"

# COPY_OUTPUT copies files out of the target cache into the image layers.
# Use this function when you want to SAVE an ARTIFACT from the target folder (mounted cache), always trying to minimize the total size of the copied fileset.
# Notice that in order to run this function, +SET_CACHE_MOUNTS_ENV or +CARGO must be called first.
# Notice that in order to run this function, +INIT must be called first.
# Arguments:
# - output: Regex matching output artifacts files to be copied to ./target folder in the caller filesystem (image layers).
# Example:
# DO rust+SET_CACHE_MOUNTS_ENV
# DO rust+INIT
# RUN --mount=$EARTHLY_RUST_CARGO_HOME_CACHE --mount=$EARTHLY_RUST_TARGET_CACHE cargo build --release
# DO rust+COPY_OUTPUT --output="release/[^\./]+" # Keep all the files in /target/release that don't have any extension.
COPY_OUTPUT:
COMMAND
FUNCTION
DO +CHECK_INITED
ARG --required output
ARG TMP_FOLDER="/tmp/earthly/lib/rust"
RUN if [ ! -n "$EARTHLY_RUST_TARGET_CACHE" ]; then \
@@ -108,53 +93,27 @@ COPY_OUTPUT:
RUN mkdir -p target; \
mv $TMP_FOLDER/* target 2>/dev/null || echo "no files found within ./target matching the provided output regexp" ;

get-tomljson:
FROM alpine:3.18.3
ARG USERARCH
ARG version=2.1.0
RUN wget -O tomljson.tar.xz https://github.com/pelletier/go-toml/releases/download/v${version}/tomljson_${version}_linux_${USERARCH}.tar.xz && \
tar -xf tomljson.tar.xz; \
chmod +x tomljson
SAVE ARTIFACT tomljson

get-jq:
FROM alpine:3.18.3
ARG USERARCH
ARG version=1.7
RUN wget -O jq https://github.com/jqlang/jq/releases/download/jq-${version}/jq-linux-${USERARCH} && \
chmod +x jq
SAVE ARTIFACT jq

INSTALL_CARGO_SWEEP:
COMMAND
RUN if [ ! -f $CARGO_HOME/bin/cargo-sweep ]; then \
# SWEEP runs cargo-sweep to clean build artifacts that haven't been accessed for a number of days.
# Notice that in order to run this function, +INIT must be called first.
# Arguments:
# - days: Number of days. Default value: 4
SWEEP:
FUNCTION
DO +CHECK_INITED
ARG days=4
RUN --mount=$EARTHLY_RUST_CARGO_HOME_CACHE --mount=$EARTHLY_RUST_TARGET_CACHE \
set -e; \
if [ ! -f $CARGO_HOME/bin/cargo-sweep ]; then \
echo "Installing cargo sweep" ; \
cargo install cargo-sweep --root $CARGO_HOME; \
fi;

REMOVE_SOURCE_FINGERPRINTS:
COMMAND
DO +CHECK_INITED
COPY +get-tomljson/tomljson /tmp/tomljson
COPY +get-jq/jq /tmp/jq
RUN if [ ! -n "$EARTHLY_RUST_TARGET_CACHE" ]; then \
echo "+SET_CACHE_MOUNTS_ENV has not been called yet in this build environment" ; \
exit 1; \
fi;
RUN --mount=$EARTHLY_RUST_TARGET_CACHE \
set -e;\
source_libs=$(find . -name Cargo.toml -exec bash -c '/tmp/tomljson {} | /tmp/jq -r .package.name; printf "\n"' \;) ; \
fingerprint_folders=$(find target -name .fingerprint) ; \
for fingerprint_folder in $fingerprint_folders; do \
cd $fingerprint_folder; \
for source_lib in $source_libs; do \
find . -maxdepth 1 -regex "\./$source_lib-[^-]+" -exec bash -c 'echo "deleting $(readlink -f {})"; rm -rf {}' \; ; \
done \
done;
fi; \
cargo sweep -r -t $days; \
cargo sweep -r -i;

CHECK_INITED:
COMMAND
RUN if [ ! -n "$EARTHLY_CACHE_PREFIX" ]; then \
FUNCTION
RUN if [ ! -n "$EARTHLY_CARGO_CACHE_PREFIX" ]; then \
echo "+INIT has not been called yet in this build environment" ; \
exit 1; \
fi;
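
The `+INIT` and `+CHECK_INITED` guards above both key off a single environment entry. A minimal shell sketch of that pattern follows; the function names `init` and `check_inited` are hypothetical stand-ins for illustration, not part of the library:

```shell
# Hypothetical stand-ins for the +INIT / +CHECK_INITED guards in the Earthfile.
# Both rely on one marker variable: EARTHLY_CARGO_CACHE_PREFIX.

check_inited() {
    # Fail if the marker variable is empty or unset.
    if [ -z "${EARTHLY_CARGO_CACHE_PREFIX:-}" ]; then
        echo "+INIT has not been called yet in this build environment"
        return 1
    fi
}

init() {
    # Refuse to run twice in the same environment.
    if [ -n "${EARTHLY_CARGO_CACHE_PREFIX:-}" ]; then
        echo "+INIT has already been called in this build environment"
        return 1
    fi
    # $1: cache prefix (assumed to be non-empty in this sketch)
    EARTHLY_CARGO_CACHE_PREFIX="$1"
}
```

Because the marker doubles as the cache prefix, an empty `cache_prefix` would presumably defeat both guards; this sketch does not handle that case.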

115 changes: 65 additions & 50 deletions rust/README.md
@@ -6,37 +6,32 @@ Earthly's official collection of Rust [functions](https://docs.earthly.dev/docs/

First, import the library up in your Earthfile:
```earthfile
VERSION --global-cache 0.7
VERSION 0.8
IMPORT github.com/earthly/lib/rust:<version/commit> AS rust
```
> :warning: Due to [this issue](https://github.com/earthly/earthly/issues/3490), make sure to enable `--global-cache` in the calling Earthfile, as shown above.
> *Due to [this issue](https://github.com/earthly/earthly/issues/3490), make sure to enable `--global-cache` in the calling Earthfile, as shown above.*
This is no longer shown above, probably because it's not needed with 0.8 any more?


## +INIT
## +INIT

This function sets some configuration in the environment (used by following functions), and installs required dependencies.
It must be called once per build environment, to avoid passing repetitive arguments to the functions called after it, and to install required dependencies before the source files are copied from the build context.
Note that this function changes `$CARGO_HOME` in the calling environment to point to a cache mount later on.
It is recommended then that all interaction with cargo is done throug the `+CARGO` function or using cache mounts returned by `+SET_CACHE_MOUNTS_ENV`.
INIT sets some entries in the calling environment (used later by the rest of the functions), in particular the ones related to mounting the Cargo caches:
- `EARTHLY_RUST_CARGO_HOME_CACHE`: Definition of the mount cache for the cargo home.
- `EARTHLY_RUST_TARGET_CACHE`: Definition of the mount cache for the target folder.

It must be called once per build environment.

It is recommended that all interaction with Cargo is done with the previous caches mounted.

### Usage

Call once per build environment:
```earthfile
DO rust+INIT ...
DO rust+INIT
```

### Arguments
#### `cache_prefix`
Overrides cache prefix for cache IDS. Its value is exported to the build environment under the entry: `$EARTHLY_CACHE_PREFIX`.
By default `${EARTHLY_TARGET_PROJECT_NO_TAG}#${OS_RELEASE}#earthly-cargo-cache`

#### `keep_fingerprints (false)`
Instructs the following `+CARGO` calls to don't remove the Cargo fingerprints of the source packages. Use only when source packages have been COPYed with `--keep-ts `option.
Cargo caches compilations of packages in `target` folder based on their last modification timestamps.
By default, this function removes the fingerprints of the packages found in the source code, to force their recompilation and work even when the Earthly `COPY` commands used overwrote the timestamps.

#### `sweep_days (4)`
`+CARGO` calls use cargo-sweep to clean build artifacts that haven't been accessed for this number of days.
Sets the prefix to be used in the IDs of the two mount caches. By default: `${EARTHLY_TARGET_PROJECT_NO_TAG}#${OS_RELEASE}#earthly-cargo-cache#${EARTHLY_GIT_BRANCH}`
Its value is exported in the build environment as: `$EARTHLY_CARGO_CACHE_PREFIX`.

## +CARGO

@@ -63,42 +58,43 @@ Use this argument when you want to `SAVE ARTIFACT` from the target folder (mount

For example `--output="release/[^\./]+"` would keep all the files in `/target/release` that don't have any extension.

### Thread safety
This function is thread safe. Parallel builds of targets calling this function should be free of race conditions.

## +SET_CACHE_MOUNTS_ENV

Sets the following entries in the environment, to be used to mount the cargo caches.
- `EARTHLY_RUST_CARGO_HOME_CACHE`: Code of the mount cache for the cargo home.
- `EARTHLY_RUST_TARGET_CACHE`: Code of the mount cache for the target folder.

Notice that in order to run this function, [+INIT](#init) must be called first.

### Example

```earthfile
cross:
...
DO rust+SET_CACHE_MOUNTS_ENV
WITH DOCKER
RUN --mount=$EARTHLY_RUST_CARGO_HOME_CACHE --mount=$EARTHLY_RUST_TARGET_CACHE cross build --target $TARGET --release
END
release:
FROM ...
DO rust+CARGO --args="build --release" --output="release/[^\./]+" # Keep all the files in /target/release that don't have any extension.
```

## COPY_OUTPUT
This function copies files out of the target cache into the image layers.
Use it function when you want to `SAVE ARTIFACT` from the target folder (mounted cache), always trying to minimize the total size of the copied fileset.

Notice that in order to run this function, `+SET_CACHE_MOUNTS_ENV` or `+CARGO` must be called first.
Use it when you want to perform `SAVE ARTIFACT` from the target folder (mounted cache), trying to minimize the total size of the copied fileset.
Notice that in order to run this function, `+INIT` must be called first.

### Arguments
#### `output`
Regex matching output artifacts files to be copied to `./target` folder in the caller filesystem (image layers).

### Example
```earthfile
DO rust+SET_RUST_CACHE_MOUNTS
RUN --mount=$EARTHLY_RUST_CARGO_HOME_CACHE --mount=$EARTHLY_RUST_TARGET_CACHE cargo build --release
DO rust+COPY_OUTPUT --output="release/[^\./]+" # Keep all the files in /target/release that don't have any extension.
release:
DO rust+INIT
RUN --mount=$EARTHLY_RUST_CARGO_HOME_CACHE --mount=$EARTHLY_RUST_TARGET_CACHE cargo build --release
DO rust+COPY_OUTPUT --output="release/[^\./]+" # Keep all the files in /target/release that don't have any extension.
```
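
To get a feel for which files an `--output` regex selects, the sketch below filters a list of paths relative to `./target`. It assumes the regex must match the whole relative path; the helper `select_outputs` and the sample file names are hypothetical, and the part of `COPY_OUTPUT` that applies the regex is not shown in this diff:

```shell
# Hypothetical helper: print the paths (relative to ./target) that an
# --output regex would select, assuming it must match the whole path.
select_outputs() {
    # $1: extended regex; stdin: one relative path per line
    grep -E -x -- "$1"
}

# Sample file names are made up for illustration.
printf '%s\n' \
    "release/myapp" \
    "release/myapp.d" \
    "release/libfoo.rlib" \
    "release/deps/myapp-1a2b3c" \
  | select_outputs 'release/[^\./]+'
```

Under that assumption only `release/myapp` is printed: the other entries contain either a dot or a further path separator after `release/`.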

## SWEEP
This function runs cargo-sweep to clean build artifacts that haven't been accessed for a number of days.
Notice that in order to run this function, `+INIT` must be called first.

### Arguments
#### `days`
Number of days. Default value: 4

### Example
```earthfile
sweep:
DO rust+INIT
DO rust+SWEEP --days=10
```

## Complete example
@@ -123,7 +119,7 @@ Suppose the following project:
The Earthfile would look like:

```earthfile
VERSION --global-cache 0.7
VERSION 0.8

# Imports the library definition from default branch (in a real case, specify version or commit to guarantee immutability)
IMPORT github.com/earthly/lib/rust AS rust
@@ -135,15 +131,17 @@ install:
RUN cargo install --locked cargo-deny
RUN rustup component add clippy
RUN rustup component add rustfmt
# Call +INIT before copying the source file to avoid installing depencies every time source code changes.
# Call +INIT before copying the source file to avoid installing dependencies every time source code changes.
# This parametrization will be used in future calls to functions of the library
DO rust+INIT --keep_fingerprints=true
DO rust+INIT

source:
FROM +install
# Always copy with --keep-ts for Cargo to detect changes
COPY --keep-ts Cargo.toml Cargo.lock ./
COPY --keep-ts deny.toml ./
COPY --keep-ts --dir package1 package2 ./
DO rust+CARGO --args="check"

# build builds with the Cargo release profile
build:
@@ -182,15 +180,32 @@ all:

## Mount caches and parallelization

This library uses several mount caches per tuple of `{project, os_release}`:
- One cache mount for `$CARGO_HOME`, shared across all target builds without any locking involved.
- A family of locked cache mounts for `$CARGO_TARGET_DIR`. One per target.
This library uses two mount caches per tuple of `{project, branch, os_release}`:
- One cache mount for `$CARGO_HOME`, shared across all target builds without any locking involved.
- One cache mount for `$CARGO_TARGET_DIR`, shared across all target builds without any locking involved.

Notice that:
- the previous target builds might belong to one or multiple Earthly builds.
- builds will only be blocked by concurrent ones of the same target
- Earthly will perform no locking across the builds. Instead, Cargo locking will be in place, the same way as if those builds were concurrent Cargo processes in a local machine.

For example, running `earthly +all` in the previous example will:
- run all targets (`+lint,+build,+test,+fmt,+check-dependencies`) in parallel without any blocking involved
- use a common cache mount for `$CARGO_HOME`
- use one individual `$CARGO_TARGET_DIR` cache mount per target
- use a common cache mount for `$CARGO_TARGET_DIR`
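
The `{project, branch, os_release}` tuple ends up in the cache IDs. The sketch below reproduces the string layout used by the new `+INIT` in this PR; the helpers `cache_prefix` and `target_cache` and the sample values are hypothetical:

```shell
# Hypothetical helpers mirroring how the new +INIT assembles the cache IDs.
cache_prefix() {
    # $1: project, $2: hash of /etc/os-release, $3: git branch
    printf '%s#%s#earthly-cargo-cache#%s' "$1" "$2" "$3"
}

target_cache() {
    # Same layout as EARTHLY_RUST_TARGET_CACHE in the Earthfile diff.
    printf 'type=cache,mode=0777,id=%s#target,sharing=shared,target=target' \
        "$(cache_prefix "$@")"
}

# Sample values are made up for illustration.
target_cache "github.com/example/app" "0123abcd" "main"; echo
target_cache "github.com/example/app" "0123abcd" "feature-x"; echo
```

The two printed mount definitions differ only in the branch segment of `id`, so builds on different branches would not share a target cache, while all targets on one branch would.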

## Explicitly mounting the caches

In scenarios where running Cargo via `DO rust+CARGO` is not feasible, you can alternatively mount the caches as follows:
```earthfile
RUN --mount=$EARTHLY_RUST_CARGO_HOME_CACHE --mount=$EARTHLY_RUST_TARGET_CACHE ...
```

### Example

```earthfile
cross:
...
WITH DOCKER
RUN --mount=$EARTHLY_RUST_CARGO_HOME_CACHE --mount=$EARTHLY_RUST_TARGET_CACHE cross build --target $TARGET --release
END
```