Parallelize retrieving resolved packages #8220

Open
wants to merge 5 commits into
base: main

fortmarek
Contributor

@fortmarek fortmarek commented Jan 14, 2025

Parallelize retrieving resolved packages

Motivation:

This PR brings back the parallelization reverted here.

As raised in the revert, checkouts and registry downloads mutate the WorkspaceState, and running these operations in parallel can lead to a crash since WorkspaceState is not thread-safe.

Modifications:

Based on this suggestion, I'm making WorkspaceState an actor to make access to it thread-safe.

Additionally, I turned ManagedDependencies into a struct to ensure its thread-safety, too. Two methods of this type mutated self: add and remove. I changed them to return a new copy of ManagedDependencies instead and added counterpart methods in WorkspaceState where the stored value is replaced. Since ManagedDependencies is a Collection, making it into an actor because of two mutating methods didn't feel like the right fit. However, let me know what you think about it.

I added comments on the key modifications to make them easier to find, since there's a lot of noise from all the async and await I had to add. You can find these comments below.

Result:

Retrieving resolved packages in parallel should not lead to crashes.

As suggested by @bkhouri here, I ran WorkspaceTests multiple times to see if I ran into any thread-safety issues:

for ((i=1; i<=30; i++)); do swift test --filter WorkspaceTests.WorkspaceTests; done
for ((i=1; i<=10; i++)); do swift test --filter WorkspaceTests; done

@fortmarek fortmarek changed the title from "Make WorkspaceState into actor to make it thread-safe" to "Parallelize retrieving resolved packages" on Jan 15, 2025
)
default:
throw InternalError("invalid resolved package type \(resolvedPackage.packageRef.kind)")
await withThrowingTaskGroup(of: Void.self) { taskGroup in
Contributor Author

Bringing back the parallelization of retrieving resolved packages.
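
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of retrieving packages in parallel with a throwing task group. It is not the code from this PR: `ResolvedPackage`, `RetrievedPackages`, `retrieve` and `retrieveAll` are hypothetical stand-ins, and the checkout or download is simulated with a short sleep.

```swift
// Hypothetical stand-in for a resolved package; not SwiftPM's real type.
struct ResolvedPackage: Sendable {
    let identity: String
}

// Hypothetical stand-in for the shared state. Being an actor, it serializes
// the mutations coming from the concurrently running child tasks.
actor RetrievedPackages {
    private(set) var identities: Set<String> = []

    func record(_ identity: String) {
        identities.insert(identity)
    }
}

// Simulates a checkout or registry download, then records the result.
func retrieve(_ package: ResolvedPackage, into state: RetrievedPackages) async throws {
    try await Task.sleep(nanoseconds: 1_000_000)
    await state.record(package.identity)
}

// Starts one child task per package and waits for all of them.
func retrieveAll(_ packages: [ResolvedPackage], into state: RetrievedPackages) async throws {
    try await withThrowingTaskGroup(of: Void.self) { taskGroup in
        for package in packages {
            taskGroup.addTask {
                try await retrieve(package, into: state)
            }
        }
        // Rethrows the first error thrown by any child task, if there is one.
        try await taskGroup.waitForAll()
    }
}
```

The child tasks only touch the shared state through the actor, which is what makes running them concurrently safe in this sketch.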

@@ -19,7 +19,7 @@ import SourceControl
import struct TSCUtility.Version

/// Represents the workspace internal state persisted on disk.
-public final class WorkspaceState {
+public actor WorkspaceState {
Contributor Author

Making WorkspaceState an actor to make it thread-safe.

@@ -172,12 +172,18 @@ extension Workspace.ManagedDependency: CustomStringConvertible {

extension Workspace {
/// A collection of managed dependencies.
-final public class ManagedDependencies {
+public struct ManagedDependencies {
Contributor Author

Turning ManagedDependencies into a struct. The alternative would be to make it an actor but since it conforms to Collection, it wouldn't be that easy.

Only two methods of ManagedDependencies mutated self: add and remove. Instead of mutating self, they can return a copy of Self, and the mutation happens in WorkspaceState.
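
A minimal sketch of that idea, assuming a dictionary-backed collection. `ManagedDependency` and `ManagedDependenciesSketch` here are simplified, hypothetical stand-ins rather than the real SwiftPM types.

```swift
// Simplified, hypothetical stand-in for Workspace.ManagedDependency.
struct ManagedDependency: Equatable, Sendable {
    let identity: String
    let version: String
}

// A value-type collection whose add and remove return updated copies.
struct ManagedDependenciesSketch: Collection, Sendable {
    private var storage: [String: ManagedDependency] = [:]

    // Non-mutating: returns an updated copy and leaves self untouched.
    func add(_ dependency: ManagedDependency) -> Self {
        var copy = self
        copy.storage[dependency.identity] = dependency
        return copy
    }

    // Non-mutating: returns a copy without the given identity.
    func remove(_ identity: String) -> Self {
        var copy = self
        copy.storage[identity] = nil
        return copy
    }

    // Collection conformance, forwarded to the dictionary's values.
    typealias Index = Dictionary<String, ManagedDependency>.Index
    var startIndex: Index { storage.values.startIndex }
    var endIndex: Index { storage.values.endIndex }
    subscript(position: Index) -> ManagedDependency { storage.values[position] }
    func index(after i: Index) -> Index { storage.values.index(after: i) }
}
```

Because every change produces a new value, a reader holding an older copy never sees it change underneath them.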

Comment on lines +92 to +98
public func add(dependency: Workspace.ManagedDependency) {
    dependencies = dependencies.add(dependency)
}

public func remove(identity: PackageIdentity) {
    dependencies = dependencies.remove(identity)
}
Contributor Author

To keep the setter of dependencies private, we need helper methods for the two operations that mutate ManagedDependencies. Alternatively, we could make the setter public. However, I find that the helpers make the call sites in WorkspaceState consumers nicer, so I'm leaning towards keeping the current solution.
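
To illustrate the trade-off, here is a small sketch of an actor that keeps its setter private and exposes the two helpers. All names (`Dependencies`, `WorkspaceStateSketch`, `addConcurrently`) are hypothetical, and the value type is reduced to a set of identity strings.

```swift
// Hypothetical value type with copy-returning add and remove.
struct Dependencies: Sendable {
    private(set) var identities: Set<String> = []

    func add(_ identity: String) -> Dependencies {
        var copy = self
        copy.identities.insert(identity)
        return copy
    }

    func remove(_ identity: String) -> Dependencies {
        var copy = self
        copy.identities.remove(identity)
        return copy
    }
}

// Hypothetical stand-in for WorkspaceState.
actor WorkspaceStateSketch {
    // Readable from outside the actor, writable only from inside it.
    private(set) var dependencies = Dependencies()

    func add(dependency identity: String) {
        dependencies = dependencies.add(identity)
    }

    func remove(identity: String) {
        dependencies = dependencies.remove(identity)
    }
}

// Call sites stay short: one awaited call per mutation, even when the
// mutations come from many concurrent child tasks.
func addConcurrently(count: Int, to state: WorkspaceStateSketch) async {
    await withTaskGroup(of: Void.self) { group in
        for i in 0..<count {
            group.addTask { await state.add(dependency: "package-\(i)") }
        }
    }
}
```

Each helper reads and replaces the stored value within a single actor-isolated call, so concurrent callers cannot interleave between the read and the write.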

@bkhouri
Contributor

bkhouri commented Jan 15, 2025

From the root of the repository, I ran the following shell script against this PR, and there were no entries in the associated failed directory:

#!/bin/bash
# cmd="swift test --very-verbose --filter CommandsTests.RunCommandTests/testMultipleExecutableAndExplicitExecutable"
cmd=(
    "git log -n1"
    "time swift test --filter \"Workspace*Tests\" --parallel --disable-swift-testing"
)
BASEDIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"


custom_run=$(date -u +'%Y%m%dT%H%M%S%Z')
root_log_dir="${HOME}/Downloads/can_be_deleted/swiftpm_logs/${custom_run}"
failed_dir="${root_log_dir}/failed"

caffeinate_cmd=$(command -v caffeinate)

function cleanup() {
    echo ""
    echo ""
    echo "******************************************************"
    echo "Root log directory  : ${root_log_dir}"
    echo "Failed log directory: ${failed_dir}"
}

trap cleanup EXIT

mkdir -p ${failed_dir}


num_iterations=300
# for num in $(seq -w 2); 
echo "Executing         : ${cmd[*]}"
echo "Root Log Directory: ${root_log_dir}"
set -x 
which swift 
xcode-select -p
swift --version
set +x
for num in $(seq -w $num_iterations); 
do 
    (
        cd ${BASEDIR}
        log_dir=${root_log_dir}/${num}
        message="[${num}/${num_iterations}] executing and writing log to ${log_dir}"
        echo "${message} ..."
        mkdir -p ${log_dir}
        test_log=${log_dir}/swift_test_console_${num}.txt
        rm -rf ${test_log}
        start_time=$(date +%s)
        for ((i = 0; i < ${#cmd[@]}; i++))
        do
            full_cmd="${caffeinate_cmd} ${cmd[${i}]}"
            echo "❯❯❯ Executing: ${full_cmd}" >> ${test_log}
            # set -x
            # caffeinate ${cmd} 2>&1 | tee -a "${test_log}"
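            # Run through bash -c so the quotes embedded in the command are parsed instead of passed literally.
            bash -c "${full_cmd}" >> "${test_log}" 2>&1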
            rc=$?
            # set +x
            if [ ${rc} -ne 0 ] ; then
            # if [ ${PIPESTATUS[0]} -ne 0 ] ; then
                ln -s ${log_dir} ${failed_dir}
            fi

        done
        end_time=$(date +%s)
        echo "${message} completed in $(( ($end_time - $start_time) )) seconds"
        
        # echo "Waiting 0.5 second..."
        # echo "Done run number ${num}.  Log dir: ${log_dir}"
        sleep 0.5 
    )
done
echo "Done all ${num_iterations} iterations"

@fortmarek
Contributor Author

Thanks for checking @bkhouri. I believe then the previous thread-safety issues should be fixed now 🙏 cc @dschaefer2

@bkhouri
Contributor

bkhouri commented Jan 16, 2025

I added the utility script in #8227

@fortmarek
Contributor Author

fortmarek commented Jan 16, 2025

Also, we did some performance testing of this branch at https://github.com/tuist/registry-tests using some real-world Package.swift files sent by folks from the Tuist community. Clean checkouts (with both registry and source control resolution) are between 33 % and 51 % faster. In other words, clean checkouts will be up to 2x as fast once this ships.

Here's some of the relevant data we gathered:

Measurement data

Package 1 (41 dependencies)

Source control with Swift 6.0.3

Source: https://github.com/tuist/registry-tests/actions/runs/12785286884/job/35643543479#step:12:1276

Time (mean ± σ):     73.026 s ±  8.001 s    [User: 89.784 s, System: 17.401 s]
Range (min … max):   66.706 s … 86.556 s    5 runs

Registry with Swift 6.0.3

Source: https://github.com/tuist/registry-tests/actions/runs/12785286884/job/35643544361#step:6:760

Time (mean ± σ):     81.317 s ±  6.360 s    [User: 68.083 s, System: 10.738 s]
Range (min … max):   72.128 s … 86.903 s    5 runs

Source control using swift-package from this branch

Source: https://github.com/tuist/registry-tests/actions/runs/12785286884/job/35643543098#step:13:1006

Time (mean ± σ):     36.802 s ±  1.074 s    [User: 61.925 s, System: 9.012 s]
Range (min … max):   35.342 s … 37.982 s    5 runs
  • ~51 % faster than the Swift 6.0.3 release

Registry using swift-package from this branch

Source: https://github.com/tuist/registry-tests/actions/runs/12785286884/job/35643544096#step:11:514

Time (mean ± σ):     42.102 s ±  3.453 s    [User: 68.086 s, System: 10.986 s]
Range (min … max):   38.757 s … 46.024 s    5 runs
  • ~48 % faster than the Swift 6.0.3 release

Package 2 (63 dependencies)

Source control with Swift 6.0.3

Source: https://github.com/tuist/registry-tests/actions/runs/12785286884/job/35643545273#step:12:3070

Time (mean ± σ):     309.176 s ± 36.704 s    [User: 346.258 s, System: 67.350 s]
Range (min … max):   273.085 s … 348.343 s    5 runs

Registry with Swift 6.0.3

Source: https://github.com/tuist/registry-tests/actions/runs/12785286884/job/35643546221#step:6:2116

Time (mean ± σ):     289.427 s ± 18.107 s    [User: 173.334 s, System: 31.886 s]
Range (min … max):   263.536 s … 311.480 s    5 runs

Source control using swift-package from this branch

Source: https://github.com/tuist/registry-tests/actions/runs/12785286884/job/35643544647#step:13:2614

Time (mean ± σ):     196.464 s ± 13.283 s    [User: 219.689 s, System: 36.472 s]
Range (min … max):   185.316 s … 215.061 s    5 runs
  • ~38 % faster than the current Swift release

Registry using swift-package from this branch

Source: https://github.com/tuist/registry-tests/actions/runs/12785286884/job/35643545584#step:11:1750

Time (mean ± σ):     193.458 s ±  5.784 s    [User: 122.490 s, System: 17.624 s]
Range (min … max):   185.861 s … 201.466 s    5 runs
  • ~33 % faster than the current Swift release
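
As a rough sketch of how "X % faster" relates to "N x as fast", the snippet below derives both from the mean times reported above. The `Measurement` type is a hypothetical helper, and values computed from the means can differ by a point or two from the rounded figures quoted in the bullets.

```swift
// Mean wall-clock times in seconds, copied from the measurements above.
struct Measurement {
    let name: String
    let baselineMean: Double // Swift 6.0.3
    let branchMean: Double   // swift-package built from this branch

    // Fraction of the baseline time that was saved ("X % faster").
    var timeReduction: Double { (baselineMean - branchMean) / baselineMean }

    // How many times as fast the branch is ("N x as fast").
    var speedup: Double { baselineMean / branchMean }
}

let measurements = [
    Measurement(name: "Package 1, source control", baselineMean: 73.026, branchMean: 36.802),
    Measurement(name: "Package 1, registry", baselineMean: 81.317, branchMean: 42.102),
    Measurement(name: "Package 2, source control", baselineMean: 309.176, branchMean: 196.464),
    Measurement(name: "Package 2, registry", baselineMean: 289.427, branchMean: 193.458),
]

for m in measurements {
    // Round to one decimal place for the percentage, two for the factor.
    let percent = (m.timeReduction * 1000).rounded() / 10
    let factor = (m.speedup * 100).rounded() / 100
    print("\(m.name): \(percent) % less time, \(factor)x as fast")
}
```

Halving the time corresponds to a 50 % reduction and a 2x speedup, which is why the best case here is described as "up to 2x as fast".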

@bkhouri
Contributor

bkhouri commented Jan 17, 2025

@fortmarek Have you tried executing all of the swift tests on your change? Maybe it's me, but I'm seeing a few test failures when running swift test --parallel.

@fortmarek
Contributor Author

Thanks @bkhouri! I don't think it has anything to do with running the tests in parallel, but some PackageCommandTests are actually failing. I will look into it 🙂

@fortmarek
Contributor Author

@bkhouri the issue should be fixed – SwiftCommand should now conform to AsyncParsableCommand instead of ParsableCommand as the run method has been changed to being async.

@bkhouri
Contributor

bkhouri commented Jan 23, 2025

> @bkhouri the issue should be fixed – SwiftCommand should now conform to AsyncParsableCommand instead of ParsableCommand as the run method has been changed to being async.

@fortmarek Sorry for the delay. Let me rerun the script against your latest changes to ensure no regressions or intermittent issues were introduced.

I'll trigger the CI builds against the change in the interim.

@bkhouri
Contributor

bkhouri commented Jan 23, 2025

@swift-ci please test

@bkhouri
Contributor

bkhouri commented Jan 23, 2025

@swift-ci please test macos linux
