In mozilla-releng/build-puppet#498, I turned off signing scriptworker's verbosity for both the script and daemon. This broke two things:

- scriptworker produces no console output when `verbose` is `False`. This means scriptworker logs stop going to papertrail.
- nagios relies on `worker.log` being up to date to see if the daemon is alive. If the daemon is alive but idle for a long period of time with `verbose` set to `False`, the log ages past the warning age.

I turned verbose back on for the daemon in mozilla-releng/build-puppet#500. However, we shouldn't have to keep the daemon verbose.

We should:

- separate log level from screen output in the config, so we can set the log level to INFO but keep console output for papertrail. (Or, we can add `worker.log` to the syslog config.)
- do something to avoid triggering nagios if the log level is higher than DEBUG:
  - log `claimWork` at the INFO level
  - output an occasional `scriptworker is still alive` every half hour or so if we've been idle
  - stop monitoring `worker.log` in nagios, since we haven't really had an issue with hung scriptworker daemons for a long time

We can also filter away logs at papertrail so they don't count against log transfer. We should be able to drop DEBUG there while leaving the nagios check unaffected.
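
Two of the suggestions in the issue body (decoupling the log level from console output, and emitting an occasional `scriptworker is still alive` line while idle) could look roughly like the sketch below. It uses only the standard `logging` module. `configure_logging`, `IdleHeartbeat`, the logger name, and the `console` flag are hypothetical names for illustration, not scriptworker's actual API or config keys.

```python
import logging
import sys
import time


def configure_logging(log_path, level=logging.INFO, console=True):
    """Hypothetical helper: the file handler always exists, and the console
    handler is controlled by its own flag rather than by the log level."""
    log = logging.getLogger("scriptworker_sketch")  # assumed logger name
    log.setLevel(level)
    # Drop handlers from any earlier call so repeated setup does not duplicate output.
    for handler in list(log.handlers):
        log.removeHandler(handler)
        handler.close()
    formatter = logging.Formatter("%(asctime)s %(levelname)s - %(message)s")
    file_handler = logging.FileHandler(log_path)
    file_handler.setFormatter(formatter)
    log.addHandler(file_handler)
    if console:
        # Console output (e.g. for a log shipper) stays on even at INFO.
        stream_handler = logging.StreamHandler(sys.stdout)
        stream_handler.setFormatter(formatter)
        log.addHandler(stream_handler)
    return log


class IdleHeartbeat:
    """Hypothetical helper: log an INFO line once `interval` seconds pass
    without activity, so the log file keeps a fresh modification time."""

    def __init__(self, log, interval=1800, clock=time.monotonic):
        self.log = log
        self.interval = interval
        self.clock = clock
        self.last_activity = clock()

    def mark_activity(self):
        """Call whenever real work is logged (for example, a task was claimed)."""
        self.last_activity = self.clock()

    def tick(self):
        """Call on every poll; returns True if a heartbeat line was written."""
        now = self.clock()
        if now - self.last_activity >= self.interval:
            self.log.info("scriptworker is still alive")
            self.last_activity = now
            return True
        return False
```

In a polling loop, the daemon would call `mark_activity()` when it claims a task and `tick()` on each iteration. With the default 1800-second interval, an idle daemon at the INFO level would still touch `worker.log` about every half hour, provided the nagios warning age is set longer than that.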