Rebooting GitLab may trigger ClamAV alarm #6114
Comments
Assignee to provide symptoms and solution in description. |
I don't think we need an elaborate demo. We discovered that the attempted fixes from the first two PRs (#6155 and #6315) weren't effective before we even got to the demo. IOW, we will likely make the same discovery about PR #6374 organically, during normal operations. |
@hannes-ucsc: "Rebooting the instance still results in a false alarm, for example:" https://groups.google.com/a/ucsc.edu/g/azul-group/c/zkmzRVv_rec/m/WbNWhMsKBQAJ
|
Assignee to consider increasing the frequency of the cronjob to |
There is a contradiction in the above comment: */18 would not be an increase. Assignee to formalize plan. |
The alarm fires when no successful clamscan message has been logged within the last 24 hours. On average, a successful scan takes anywhere from 10 to 14 hours ( quicker on
Currently, clamscan is set up to run twice a day. This causes the alarm to fire if the scan following a reboot takes longer than the scan that completed just prior to the reboot. For this example, assume an 11 hour scan starting at 00 and 12:
Since a systemd timer won't start a service that is still running from its previous activation by that timer, I propose setting the clamscan timer to run 6 times a day, or every 4 hours.
Example, 11 hour scan starting every 4 hours (00, 04, 08, 12, 16, 20):
Example, 14 hour scan starting every 4 hours (00, 04, 08, 12, 16, 20):
|
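The scheduling argument in the comment above can be sketched with a small simulation. This is only an illustration under stated assumptions, not the project's tooling: time is counted in whole hours, the timer fires at multiples of the interval, an activation is skipped while a scan is still running, a reboot kills the running scan instantly, and no catch-up run is started after the reboot. The function names and the reboot hour (13) are hypothetical.

```python
def completion_times(interval_h, duration_h, reboot_at_h, horizon_h=96):
    """Return the hours at which scans complete successfully.

    Assumptions (not taken from the real deployment): the timer fires at
    multiples of interval_h, a firing is a no-op while a scan is running,
    and a single reboot at reboot_at_h kills whatever scan is in progress.
    """
    completions = []
    busy_until = 0  # hour at which the service becomes free again
    for start in range(0, horizon_h, interval_h):
        if start < busy_until:
            continue  # previous scan still running, activation skipped
        end = start + duration_h
        if start < reboot_at_h < end:
            busy_until = reboot_at_h  # scan cancelled by the reboot
            continue
        completions.append(end)
        busy_until = end
    return completions


def max_gap(completions):
    """Longest stretch, in hours, between two consecutive successful scans."""
    return max(b - a for a, b in zip(completions, completions[1:]))


# An 11 hour scan, with a reboot one hour into the second scan.
print(max_gap(completion_times(12, 11, 13)))  # twice a day: 24
print(max_gap(completion_times(4, 11, 13)))   # every 4 hours: 16
```

For this particular reboot time the shorter interval reduces the longest gap between successful scans. Other reboot times and scan durations give different gaps, so the numbers above should not be read as a worst-case bound.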
@hannes-ucsc: "Let's just start the unit every hour. If the scan takes less than an hour, it's actually desirable to start it on the next full hour. Extra care to be taken to ensure that the scans aren't running in parallel or overlap." |
Note my edits to the demo instructions. |
The resolution was incomplete: a scan recently took 26 hours to complete following a reboot of GitLab. |
Spike to compare hierarchical folder sizes on the two lower GitLab instances using a treemap based GUI. |
Unfortunately, it isn't possible to generate a hierarchical folder size comparison of a given instance without proprietary software like TreeMap, which can connect to the instance, perform the scan, and generate the desired map. The viable open source choice was dev_ncdu.json To display the generated reports, install
|
Originally, this was run without The file download links in the previous comment have been updated to link to the |
The azul-clamscan-<deployment> alarm is triggered if a "clamscan succeeded" log message is not produced within an 18 hour period. Since the ClamAV scan runs twice daily and takes many hours to complete, a reboot of the GitLab instance (due to an update, backup, or testing) can cancel an ongoing scan and prevent the next successful completion from falling within 18 hours of the previous one.
The recommended solution is to increase the alarm's period to 24 hours.
Note: 24 hours is the maximum allowed time period for an alarm with one evaluation period. (from: Common features of CloudWatch alarms)