Random Runner Crashes After Upgrading to 0.8 #76

Open
stmitt opened this issue Apr 18, 2024 · 6 comments

Comments

@stmitt
Contributor

stmitt commented Apr 18, 2024

We recently upgraded to Tartelet 0.8, and since then, we've been experiencing random crashes of our runners. This issue is impacting our CI/CD workflow, and we're trying to identify the cause.

We are now running Tartelet 0.8.3 and Tart 2.9.0 with the Xcode 15.3 image from Cirrus Labs. The host is running macOS Sonoma 14.4.1.

This is how the VMs look when we connect to the host:
Screenshot 2024-04-17 at 12 57 20

We are unsure whether this problem is related to Tartelet itself, the underlying Tart tool, the VM image, or our GitHub Actions job. Has anyone else experienced similar issues after upgrading? Any insights or suggestions would be greatly appreciated.

Thank you for your assistance!

@simonbs
Contributor

simonbs commented Apr 22, 2024

Thanks for opening the issue.

Unfortunately, I do not have much to contribute except mentioning that we have three Apple M1 Mac minis running macOS Sonoma 14.3.1, and each of these machines is running two virtual machines with macOS Sonoma 14.1.2, and we have not seen these crashes.

If anyone is seeing the same issue and can provide some more information, then I'm happy to look into whether there's something that can be improved in Tartelet to address it.

@greg-cook
Contributor

We're also experiencing this, although we only recently switched from Orchard to Tartelet, now that the Cirrus images are working.

Next time it happens I will collect and post the crash report.

Our workaround for now is going to be automating the termination of the orphaned VMs and restarting tartelet.
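
A minimal sketch of what such a workaround could look like, not the script actually in use. It assumes the orphaned VMs can be told apart by some health probe, which this thread does not define, so `is_responsive` is a hypothetical callable you would have to supply yourself (for example an SSH check). The helper names are made up for illustration. `tart stop`, `pkill -x` and `open -a` are existing commands, but the app and process name `Tartelet` is an assumption.

```python
from __future__ import annotations

import subprocess
import time
from typing import Callable, Iterable


def find_orphans(running_vms: Iterable[str],
                 is_responsive: Callable[[str], bool]) -> list[str]:
    """Return the VMs that are running but fail the health probe.

    `is_responsive` is a hypothetical probe; how to detect an orphaned
    VM reliably is an open question in this thread.
    """
    return [name for name in running_vms if not is_responsive(name)]


def recover(orphans: list[str],
            stop_vm: Callable[[str], None],
            restart_app: Callable[[], None]) -> bool:
    """Stop every orphaned VM, then restart the app once.

    Returns True if any recovery action was taken, False otherwise.
    """
    if not orphans:
        return False
    for name in orphans:
        stop_vm(name)
    restart_app()
    return True


def stop_with_tart(name: str) -> None:
    # `tart stop <name>` asks Tart to stop the named VM.
    subprocess.run(["tart", "stop", name], check=False)


def restart_tartelet() -> None:
    # Assumption: the app and its process are both named "Tartelet".
    subprocess.run(["pkill", "-x", "Tartelet"], check=False)
    time.sleep(2)  # crude pause so the old process can exit first
    subprocess.run(["open", "-a", "Tartelet"], check=False)
```

Run periodically (from cron or launchd, say), this would be called as `recover(find_orphans(names, probe), stop_with_tart, restart_tartelet)`, where `names` and `probe` come from your own setup. Passing the stop and restart actions in as callables keeps the selection logic testable without touching real VMs.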

@kondratk

@simonbs I've experienced the crash mentioned above; a crash log is attached below.
@greg-cook, could you please share how you are automating the restart of Tartelet and the VMs?

crashlog.txt

@stmitt
Contributor Author

stmitt commented Jun 5, 2024

Today I was able to extract the kernel panic log. Unfortunately, I cannot tell from it what caused the issue. Here is the log, in case anybody else can.

https://pastebin.com/Af4i7mcV

@greg-cook How do you detect if a VM is orphaned?

@eigenraven

We've been seeing very similar crashes that require a manual restart on our runners. Right now they're all on macOS 14.5 (host and guest), using the Cirrus Xcode 15.4 images as a base. Some of the crashes seem to be kernel panics and some are loginwindow crashes. Both leave the VM running in a zombie state without Tartelet's terminal appearing, and Tartelet never attempts to restart them. Closing the VM windows lets Tartelet continue as usual.

@kuhnroyal

This is probably the same as #93
