Runners not starting #64

Closed
annuh opened this issue Feb 29, 2024 · 20 comments · Fixed by #65
Labels
bug Something isn't working

Comments

@annuh

annuh commented Feb 29, 2024

I'm experiencing the same issue(s) as #63. I'm creating a new issue since that other issue is closed.

Problem:
Starting a runner (Virtual Machine --> Start) doesn't start a runner.

Debug logs:
The debug log from Tartelet mentions this error:

2024-02-29T12:44:16Z [VirtualMachineResourcesCopier] [Info] Failed removing resources at file:///Users/anne/Library/Application%20Support/dk.shape.Tartelet/runner-1/.Trashes: “.Trashes” couldn’t be removed because you don’t have permission to access it.
2024-02-29T12:44:16Z [VirtualMachineResourcesCopier] [Info] Failed copying resources from file:///Users/anne/Library/Application%20Support/dk.shape.Tartelet/Virtual%20Machine%20Resources/.Trashes/ to file:///Users/anne/Library/Application%20Support/dk.shape.Tartelet/runner-1/.Trashes: “.Trashes” couldn’t be copied because you don’t have permission to access “runner-1”.
2024-02-29T12:44:16Z [EphemeralVirtualMachineFactory] [Error] Failed making ephemeral virtual machine as resources could not be created: “.Trashes” couldn’t be copied because you don’t have permission to access “runner-1”.
2024-02-29T12:44:16Z [VirtualMachineFleetLive] [Error] Could not create virtual machine named runner-1: “.Trashes” couldn’t be copied because you don’t have permission to access “runner-1”.
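
To confirm that the entry named in the log is actually present before removing anything, a small check along these lines may help. This is only a sketch: the function names are made up for illustration, and the default path is taken from the log output above.

```shell
# Sketch: report whether a resources directory contains a .Trashes entry.
# has_trashes and check_resources are hypothetical helper names.
has_trashes() {
  # $1: directory to inspect
  [ -e "$1/.Trashes" ]
}

check_resources() {
  # Default path mirrors the one in the log; pass another directory to override.
  dir="${1:-$HOME/Library/Application Support/dk.shape.Tartelet/Virtual Machine Resources}"
  if has_trashes "$dir"; then
    echo "found: $dir/.Trashes"
    ls -ld "$dir/.Trashes"   # show owner and permissions of the entry
  else
    echo "no .Trashes in $dir"
  fi
}
```

Calling check_resources without arguments inspects the default Tartelet resources folder for the current user.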

When I delete this .Trashes directory manually via sudo rm -rf ~/Library/Application\ Support/dk.shape.Tartelet/Virtual\ Machine\ Resources/.Trashes, a runner is started, but it exits immediately with the following error:

image

Steps to reproduce:

I was able to reproduce it with the normal installation steps as described in the wiki: https://github.com/shapehq/tartelet/wiki/Configuring-Tartelet.
Using a newer IPSW file did not make a difference.

# old
tart create runner --disk-size=120 --from-ipsw=https://updates.cdn-apple.com/2023WinterFCS/fullrestores/032-48346/EFF99C1E-C408-4E7A-A448-12E1468AF06C/UniversalMac_13.2.1_22D68_Restore.ipsw

# new
tart create runner --disk-size=120 --from-ipsw=https://updates.cdn-apple.com/2024WinterFCS/fullrestores/052-40770/72916BCC-D357-422D-A4A2-EF1DEDF6968C/UniversalMac_14.3.1_23D60_Restore.ipsw

Other details:

Tartelet shows an 'Unknown' VM
image

These files exist in my Virtual Machine Resources:
image

Used versions

Tartelet: Version 0.7.1 (1)
Tart: 2.6.1

Tartelet did work in the past without any issues! I'm not sure exactly when it broke, but I did upgrade macOS (to 14.3.1 (23D60)) and may have updated Tart via a brew upgrade.

Does somebody have an idea what is wrong?

@leohidalgo

+1 I'm getting the same error

@leohidalgo

@annuh can you try this?

sudo rm -rf ~/Library/Application\ Support/dk.shape.Tartelet/Virtual\ Machine\ Resources/.Trashes
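
Because this path contains spaces, it has to be escaped or quoted; otherwise the shell hands rm several separate arguments instead of one path. The sketch below only counts arguments and deletes nothing; count_args is a helper name introduced here for illustration.

```shell
# Count how many arguments the shell actually passes along.
count_args() {
  echo "$#"
}

# Unquoted: the spaces split the path into four separate words.
count_args ~/Library/Application Support/dk.shape.Tartelet/Virtual Machine Resources/.Trashes
# prints 4

# Quoted: a single argument, which is what rm should receive.
count_args "$HOME/Library/Application Support/dk.shape.Tartelet/Virtual Machine Resources/.Trashes"
# prints 1
```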

@annuh
Author

annuh commented Feb 29, 2024

@annuh can you try this?

sudo rm -rf ~/Library/Application\ Support/dk.shape.Tartelet/Virtual\ Machine\ Resources/.Trashes

Yes, running that command results in the 2nd issue where runners start but stop right away.

@leohidalgo

Me too. I've updated macOS to 14.4 and the error still exists.

@fkorotkov

This might be due to an underlying bug in VirtioFS support in macOS itself. See cirruslabs/tart#567 for more details.

By mentions of "My Shared Files" I assume Tartelet mounts scripts and executes them. It may be worth changing the logic to SSH in and execute them directly. We use this technique in our managed GitHub Actions integration.
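
A rough sketch of that idea, assuming the VM has Remote Login enabled and that tart ip returns its address. This is not how Tartelet or the managed integration implements it; run_script_in_vm, the VM name, the user, and the script path are all placeholders.

```shell
# Sketch only: run a host-side script inside a Tart VM over SSH
# instead of mounting it into the guest via a shared folder.
run_script_in_vm() {
  vm_name="$1"   # e.g. runner (placeholder)
  user="$2"      # e.g. runner (placeholder)
  script="$3"    # path to a script on the host (placeholder)

  # Ask Tart for the VM's IP address; give up if it has none yet.
  ip="$(tart ip "$vm_name")" || return 1

  # 'bash -s' reads the script from stdin, so nothing is mounted or copied.
  ssh "$user@$ip" 'bash -s' < "$script"
}
```

For example, run_script_in_vm runner runner ./start.sh would stream a local start.sh to the VM named runner and execute it there.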

@simonbs
Contributor

simonbs commented Mar 8, 2024

Just wanted to give everyone an update on this. I'm working on a version of Tartelet that addresses this issue by using SSH to transfer files between the host machine and the virtual machines, as recommended by @fkorotkov.

We've just started testing this internally at Shape and I hope we can release a version of Tartelet that uses this very soon.

You can find the changes on the feature/ssh branch. I'm also using this as an opportunity to heavily clean up the codebase to make Tartelet easier to maintain going forward.

@antonnyman

antonnyman commented Mar 8, 2024

@annuh Sorry for not responding in the other thread.

For me, the issue was that the files from Virtual Machine Resources were copied to the runner's root directory so I changed the directory in the start.command file like this. The behavior @annuh describes was exactly what happened to me.

I had to make a self-signed app to use on our machines for this. After reading your comment @simonbs I assume you have a better solution and this is not worthy of a PR? 🙂

I also had issues starting the runners after I had entered edit mode and then shut it down.
It seemed to be a permissions issue (?), so I created the Virtual Machine Resources directory myself and copied the startup files there. Then the runners started automatically.

@simonbs
Contributor

simonbs commented Mar 8, 2024

@antonnyman Normally, I'd very much love a PR for fixes like this, but I'm so close to having moved Tartelet away from using the resources folder altogether that I don't think it makes sense. I have good faith that copying the resources from the host machine to the virtual machine will be more reliable than what we've been using up until now 😊

@antonnyman

Looking forward to the new changes! 🙂

I could provide the built app for those who are stuck in this thread, as a temporary workaround, but only if you think it is a good idea @simonbs.

Great work with Tartelet and thanks for open-sourcing it!

@leohidalgo

Maybe @simonbs could release a build containing only @antonnyman's workaround while working on the final one.

@simonbs simonbs added the bug Something isn't working label Mar 9, 2024
@simonbs simonbs reopened this Mar 9, 2024
@simonbs
Contributor

simonbs commented Mar 9, 2024

Tartelet 0.8.0 is now available and fixes this issue. Please consult the release notes for details on migrating from 0.7.1 to 0.8.0. Specifically, this involves:

  1. Enabling Remote Login on the virtual machine.
  2. Configuring Tartelet with the credentials needed to SSH into the virtual machine. If you've followed the guide on configuring Tartelet the username will be runner and the password will be runner too.
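
After completing the two steps, one way to sanity-check the setup from the host is to test whether the VM accepts connections on the SSH port. This is a sketch under assumptions: the function names are hypothetical, the IP address would come from something like tart ip runner, and the systemsetup line in the comment is one possible command-line way to turn on Remote Login inside the guest.

```shell
# Inside the VM (assumption: run once while editing the base image):
#   sudo systemsetup -setremotelogin on
# The same setting is available in System Settings under Sharing.

# On the host: check that the VM accepts connections on port 22.
ssh_reachable() {
  # $1: IP address of the VM
  nc -z -w 5 "$1" 22
}

check_vm_ssh() {
  if ssh_reachable "$1"; then
    echo "ssh reachable on $1"
  else
    echo "ssh NOT reachable on $1"
    return 1
  fi
}
```

If the port is reachable, logging in manually with the configured username and password is a reasonable next check before starting the fleet.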

I consider this issue to be addressed now and will close it.

Admittedly, 0.8.0 brings some bigger changes to Tartelet. Please don't hesitate to open a new issue if you encounter any challenges with these changes 🙏

@simonbs simonbs closed this as completed Mar 9, 2024
@leohidalgo

The loop keeps happening on my machine. I don't see anything unusual in the logs.

@leohidalgo

I found this:
image

@simonbs
Contributor

simonbs commented Mar 13, 2024

I suppose I can make a build with the com.apple.security.automation.apple-events entitlement, but I'm unsure why not having it would be a problem on your machine and not on everyone else's 🤔
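
To see whether an installed build carries that entitlement, the code signature can be inspected with codesign. A minimal sketch, assuming the app is installed at /Applications/Tartelet.app (the path and the helper name has_entitlement are assumptions):

```shell
# Sketch: check whether an app bundle's signature lists a given entitlement.
has_entitlement() {
  app="$1"   # path to the .app bundle
  key="$2"   # entitlement key to look for

  # codesign prints the embedded entitlements; grep -F matches the key literally.
  codesign -d --entitlements - "$app" 2>/dev/null | grep -qF -- "$key"
}

# Example (macOS only, path is an assumption):
# has_entitlement /Applications/Tartelet.app com.apple.security.automation.apple-events \
#   && echo "entitlement present" || echo "entitlement missing"
```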

@leohidalgo

Thanks. If you want, you can send me a build here to test before releasing a new version.

@leohidalgo

@simonbs it still happens with the latest version, 8.0.2 🙁

@jpatters

jpatters commented Mar 14, 2024

I am having this issue as well, although I'm not seeing any errors in the logs. Things were working fine and then it just stopped.
It is not attempting to connect to the VM over SSH. I just changed the password to nonsense and there are no failures in the logs (I was seeing them earlier when I had things misconfigured).
I've tracked my issue to the VM not getting an IP address. At this point I am unsure whether it is related to the original issue this was created for, or even whether the issue is with Tartelet or Tart.
Using just the tart command, I have no connection if I just do tart run runner. However, if I run it in bridged mode via tart run runner --net-bridged en0, it does work.
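
A small polling loop can make the "no IP address" symptom easier to spot while the VM boots. This is a sketch, not part of Tart or Tartelet: wait_for_ip and the WAIT_SECONDS variable are names invented here, and it relies only on tart ip printing the address once the VM has one.

```shell
# Sketch: poll `tart ip` until the VM reports an address or we give up.
wait_for_ip() {
  vm="$1"              # VM name, e.g. runner
  tries="${2:-10}"     # how many times to ask before giving up
  i=0
  while [ "$i" -lt "$tries" ]; do
    # Succeeds only when tart exits cleanly and prints a non-empty address.
    if ip="$(tart ip "$vm" 2>/dev/null)" && [ -n "$ip" ]; then
      echo "$ip"
      return 0
    fi
    i=$((i + 1))
    sleep "${WAIT_SECONDS:-3}"   # pause between attempts
  done
  echo "no IP for $vm after $tries tries" >&2
  return 1
}
```

If this times out for a VM started with plain tart run runner but succeeds when the VM runs with --net-bridged en0, the problem is likely on the address-assignment side rather than in SSH.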

@jpatters

My issue was due to running out of addresses. All were leased to old VMs. Deleting /var/db/dhcpd_leases resolved my issue.
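
Before deleting the file, it can be useful to see how many leases it currently holds. The sketch below assumes each lease entry in /var/db/dhcpd_leases contains one ip_address= line; that format is an assumption worth verifying on your own machine, and count_leases is a hypothetical helper name.

```shell
# Sketch: count lease entries in the DHCP lease file.
count_leases() {
  file="${1:-/var/db/dhcpd_leases}"

  # A missing or unreadable file is reported as zero leases.
  if [ ! -r "$file" ]; then
    echo 0
    return 0
  fi

  # Assumption: one ip_address= line per lease entry.
  # grep -c exits non-zero when the count is 0, hence the || true.
  grep -c 'ip_address=' "$file" || true
}
```

A count that keeps growing toward the size of the address pool while no VMs are running would match the exhaustion described here.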

@simonbs
Contributor

simonbs commented Mar 16, 2024

My issue was due to running out of addresses. All were leased to old VMs. Deleting /var/db/dhcpd_leases resolved my issue.

This is a known limitation that's documented in Tart's FAQ. This can happen for several reasons, one of them being that your runner has had a lot of short jobs in a short amount of time or that Tartelet has failed to start the virtual machine for other reasons, causing it to be shut down, and restarted, ultimately exhausting the pool.

@jpatters

This is a known limitation that's documented in Tart's FAQ. This can happen for several reasons, one of them being that your runner has had a lot of short jobs in a short amount of time or that Tartelet has failed to start the virtual machine for other reasons, causing it to be shut down, and restarted, ultimately exhausting the pool.

Awesome. Thanks for the info.


6 participants