
Slow to write cache file questions #770

Open
two-plus-two-equals-five opened this issue Dec 31, 2023 · 4 comments

two-plus-two-equals-five commented Dec 31, 2023

Hey all, thanks for building such a handy action! It's been really easy to use so far (and nice tests btw 🤩) ❤️

I wanted to ask: is it common for the "Post Docker Cache" step to take ~3 min to store the cache file in GitHub, or am I doing something wrong to make this happen?
[Screenshot: 2023-12-31 at 2:19:48 am]
[Screenshot: 2023-12-31 at 2:21:01 am]

The Cache saved successfully line is output about 10 seconds into the "Post Docker Cache" step, so I'm wondering what else is going on in the other 2 minutes and ~50 seconds.

Is this slowness a known limitation imposed by GitHub? (Note: I'm using the free tier.)
I'm asking because I'm new to TypeScript, but from what I've been able to piece together from your action's code, I think it's using GitHub's cache library/action under the hood to handle storing the cache file in GitHub. I can't tell which version it's using, but could this slowness be caused by the version of the cache action your code depends on?

I'm mainly raising this because my jobs take ~40 seconds to complete without caching enabled, but over 3 min when caching is enabled and the cache file is not found. It feels like such a shame to have such a useful action be affected by something like this, so if I can help fix this in any way, please say!
[Screenshot: 2023-12-31 at 2:19:37 am]

My workflow file:
[Screenshot: 2023-12-31 at 2:48:57 am]

I hope the above information is helpful. Again, if I can help in any way, please say. Oh and happy new year to you all!

@Kurt-von-Laven
Contributor

> Hey all, thanks for building such a handy action! It's been really easy to use so far (and nice tests btw 🤩) ❤️

You're welcome! We're glad you're enjoying it.

> I wanted to ask: is it common for the "Post Docker Cache" step to take ~3 min to store the cache file in GitHub, or am I doing something wrong to make this happen?

The literal answers to your questions are yes to the former and no to the latter. While it is common for saving large Docker images to take multiple minutes, your small cache size of ~241 MB should save in ~10 seconds on the current free GitHub-hosted runners.

> The Cache saved successfully line is output about 10 seconds into the "Post Docker Cache" step, so I'm wondering what else is going on in the other 2 minutes and ~50 seconds.

The last line of saveDockerImages() calls cache.saveCache(), which in turn calls cacheHttpClient.saveCache() on its last line. The last line of cacheHttpClient.saveCache() logs Cache saved successfully, and we await all asynchronous operations, so my first guess would be that your runner happened to hang as I have observed this to happen occasionally in the past. Do you observe the post step to take ~3 minutes consistently, or was this an uncharacteristically slow run? I have also observed a "Heisenberg effect" where GitHub Actions can sometimes run significantly slower while being watched, particularly if they emit a lot of logs. Can you confirm that the slowness persists when nobody is watching the action run? While the Heisenberg effect could exacerbate the slowness, it wouldn't adequately explain all of it on its own.
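
For anyone unfamiliar with the code, a rough sketch of that chain is below. It is not our actual source: the tarball path and key are illustrative, and only cache.saveCache() is the real @actions/cache API.

```typescript
import * as cache from "@actions/cache";

// Illustrative values only; the action computes its own path and cache key.
const DOCKER_IMAGES_PATH = "~/.docker-images.tar";

async function saveDockerImages(key: string): Promise<void> {
  // By this point `docker save` has already written the images to the tarball.
  // cache.saveCache() resolves only after the library's internal
  // cacheHttpClient.saveCache() has uploaded the archive and logged
  // "Cache saved successfully", so nothing in the action itself should still
  // be running once that line appears in the log.
  await cache.saveCache([DOCKER_IMAGES_PATH], key);
}
```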

> Is this slowness a known limitation imposed by GitHub? (Note: I'm using the free tier.) I'm asking because I'm new to TypeScript, but from what I've been able to piece together from your action's code, I think it's using GitHub's cache library/action under the hood to handle storing the cache file in GitHub. I can't tell which version it's using, but could this slowness be caused by the version of the cache action your code depends on?

Assuming you are observing this slowness in the post step consistently, I am not aware of a reason why anything would be happening after Cache saved successfully has been emitted. If you wanted to prove that the cache has been successfully saved after ~10 seconds, you could look for the cache in the GitHub UI before the ~3 minute delay. If anyone has ideas of what we could do on our side to improve performance, we are always open to suggestions. We're using GitHub's official cache library, not the action, at the latest released version. We use Renovate to keep our dependencies up to date.
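
As an aside, the same check (looking for the cache before the delay ends) can also be scripted against the REST API's list-caches endpoint rather than the UI. A minimal sketch, not something our action does, with a placeholder owner, repo, and token:

```typescript
import { Octokit } from "@octokit/rest";

// Placeholder auth and repository; substitute your own.
const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

const { data } = await octokit.request("GET /repos/{owner}/{repo}/actions/caches", {
  owner: "your-user",
  repo: "your-repo",
});

// Each entry reports the cache key, size, and creation time, which is enough
// to confirm the ~241 MB cache landed well before the post step finished.
for (const entry of data.actions_caches) {
  console.log(entry.key, entry.size_in_bytes, entry.created_at);
}
```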

> I'm mainly raising this because my jobs take ~40 seconds to complete without caching enabled, but over 3 min when caching is enabled and the cache file is not found. It feels like such a shame to have such a useful action be affected by something like this, so if I can help fix this in any way, please say!

If you are able to, I recommend hosting your Docker images on the GitHub Container Registry (GHCR) rather than Docker Hub. That way there would be no need to use our action, which is intended for users who don't control where their Docker images are hosted. As an added bonus, you would no longer be subject to Docker Hub's rate limits in case you might find yourself impacted by those in the future.

> I hope the above information is helpful. Again, if I can help in any way, please say. Oh and happy new year to you all!

Thank you! Happy New Year!

Kurt-von-Laven self-assigned this Jan 6, 2024
@Kurt-von-Laven
Contributor

I am closing this on the assumption that the above answered your question, but feel free to follow up if not or if you have thoughts on how to improve performance.

Kurt-von-Laven added the wontfix (This will not be worked on) and performance (Make it faster or use fewer resources) labels Feb 17, 2024
@two-plus-two-equals-five
Author

Hey @Kurt-von-Laven, thanks a lot for your really thorough reply - and apologies for my late response, my Grandad was diagnosed with late-stage leukaemia soon after I had posted this "issue", and I've been a mess since then 🫠.

We've had good news since, so I've come out of my hermit shell recently and spotted your response.

In response to your questions:

> ... Do you observe the post step to take ~3 minutes consistently, or was this an uncharacteristically slow run?
> ...
> Can you confirm that the slowness persists when nobody is watching the action run?

Yep, from my testing this post step consistently takes ~3 minutes to complete regardless of whether I watch it or not.

> If you are able to, I recommend hosting your Docker images on the GitHub Container Registry (GHCR) rather than Docker Hub. That way there would be no need to use our action, which is intended for users who don't control where their Docker images are hosted. As an added bonus, you would no longer be subject to Docker Hub's rate limits in case you might find yourself impacted by those in the future.

Thanks for the heads up, I'll give it a try.

> The last line of saveDockerImages() calls cache.saveCache(), which in turn calls cacheHttpClient.saveCache() on its last line. The last line of cacheHttpClient.saveCache() logs Cache saved successfully, and we await all asynchronous operations, so my first guess would be that your runner happened to hang as I have observed this to happen occasionally in the past.

Since it seems to be 3 min 30 seconds consistently (give or take a second) running the same commit each time, could it be that one of the cache call's async promises is never resolving and there's some timeout cut-off at the 2-3 min mark? My tests take ~40 seconds to run.
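
To make what I mean concrete, here's a hypothetical sketch (not the action's actual code; the path, key, and 60-second figure are made up) of how one of these calls could be raced against a timer, so a promise that never settles would show up in the logs instead of just stalling the post step:

```typescript
import * as cache from "@actions/cache";

// Resolve with the promise's value, or with "timed out" after `ms` milliseconds,
// logging which branch won so a hang becomes visible in the step's output.
async function withTimeout<T>(
  promise: Promise<T>,
  ms: number,
  label: string,
): Promise<T | "timed out"> {
  let timer: NodeJS.Timeout | undefined;
  const timeout = new Promise<"timed out">((resolve) => {
    timer = setTimeout(() => {
      console.log(`${label} still pending after ${ms} ms`);
      resolve("timed out");
    }, ms);
  });
  try {
    return await Promise.race([promise, timeout]);
  } finally {
    if (timer !== undefined) clearTimeout(timer);
  }
}

// Illustrative usage with a placeholder path and key.
await withTimeout(cache.saveCache(["~/.docker-images.tar"], "docker-abc123"), 60_000, "saveCache");
```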

Thank you again for everything!

@Kurt-von-Laven
Contributor

> Hey @Kurt-von-Laven, thanks a lot for your really thorough reply - and apologies for my late response, my Grandad was diagnosed with late-stage leukaemia soon after I had posted this "issue", and I've been a mess since then 🫠.
>
> We've had good news since, so I've come out of my hermit shell recently and spotted your response.

So sorry your family has been going through such a rough time. Glad your grandpa has pulled through! I'm sure you must have many things to catch up on, so I will certainly understand if troubleshooting our action isn't your top priority.

> Since it seems to be 3 min 30 seconds consistently (give or take a second) running the same commit each time, could it be that one of the cache call's async promises is never resolving and there's some timeout cut-off at the 2-3 min mark? My tests take ~40 seconds to run.

That would be surprising on the one hand given that the last line of the asynchronous chain is reached, but on the other I think the relevant timeout may be 3 minutes. If that is indeed the issue, I don't know what we can do about it on our end, and it seems to my mind like a bug in @actions/cache, but let's investigate further. You mentioned that your tests take ~40 seconds to run, and the entire workflow takes ~210 seconds. From your original post, I gather that it takes ~10 seconds for the cache to be saved successfully, so the remaining ~160 seconds (or ~140 seconds based on your first post) are mysteriously spent waiting for our post step to finish.

Can you rerun your workflow with debug logging enabled in case that might illuminate what is even happening after Cache saved successfully is emitted? It would be helpful to have some sense of the timing of the log statements that follow, so please hit Shift + T to show timestamps before capturing a screenshot of the "Post Docker Cache" step this time. Please don't watch the workflow run during this experiment as that can significantly impact the timing, especially with debug logging enabled.
