Does this not support monitoring of node ? #324

Open
amandpay opened this issue Jul 1, 2024 · 14 comments · Fixed by #333


amandpay commented Jul 1, 2024

My RBAC files look like this:

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: {{ .Release.Name }}
rules:
  - apiGroups: [""]
    resources: ["pods", "pods/log", "events", "nodes"]
    verbs: ["get", "watch", "list"]

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: {{ .Release.Name }}
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: {{ .Release.Name }}
subjects:
  - kind: ServiceAccount
    name: {{ .Release.Name }}
    namespace: {{ .Release.Namespace }}

However, one of my nodes was in the Ready,SchedulingDisabled state and no alert fired. Is this expected, or is some other configuration needed?

Author

amandpay commented Jul 1, 2024

I want to be alerted when a node's Ready condition has status False.
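For readers comparing the two node states mentioned above: in the Kubernetes Node API, `SchedulingDisabled` is derived from `spec.unschedulable`, while readiness is reported by the `Ready` entry in `status.conditions`. A check on the Ready condition alone would therefore not flag a cordoned node. The sketch below illustrates that distinction on plain dictionaries shaped like `kubectl get node -o json` output. It is not kwatch's implementation; `ready_status` and `node_problems` are hypothetical names chosen for this example, and the sample nodes are made up.

```python
def ready_status(node: dict) -> str:
    """Return the status of the node's Ready condition: "True", "False" or "Unknown"."""
    for cond in node.get("status", {}).get("conditions", []):
        if cond.get("type") == "Ready":
            return cond.get("status", "Unknown")
    # No Ready condition reported at all: treat it as Unknown.
    return "Unknown"


def node_problems(node: dict) -> list:
    """List the reasons a node might deserve an alert (hypothetical policy)."""
    problems = []
    status = ready_status(node)
    if status != "True":
        # kubectl prints NotReady for both "False" and "Unknown".
        problems.append(f"NotReady (Ready={status})")
    if node.get("spec", {}).get("unschedulable", False):
        # A cordoned node: kubectl prints Ready,SchedulingDisabled.
        problems.append("SchedulingDisabled")
    return problems


# Made-up sample nodes for illustration only.
cordoned = {
    "spec": {"unschedulable": True},
    "status": {"conditions": [{"type": "Ready", "status": "True"}]},
}
kubelet_down = {
    "spec": {},
    "status": {"conditions": [{"type": "Ready", "status": "Unknown"}]},
}

print(node_problems(cordoned))      # ['SchedulingDisabled']
print(node_problems(kubelet_down))  # ['NotReady (Ready=Unknown)']
```

Treating `Unknown` the same as `False` is a design choice in this sketch: a node whose kubelet has stopped reporting usually ends up with Ready set to `Unknown` rather than `False`, so a check for `False` only could miss it.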

Author

amandpay commented Jul 3, 2024

@abahmed Any idea?

abahmed self-assigned this Jul 11, 2024
Owner

abahmed commented Jul 11, 2024

Thank you @amandpay for raising this.
Currently we don't support node monitoring, but we can work on supporting it soon.

Author

amandpay commented Jul 15, 2024

Hi @abahmed, is there a specific timeline for this?

Owner

abahmed commented Jul 15, 2024

@amandpay I will work on it ASAP; it is expected to land within a week.

@amandpay
Author

Thanks @abahmed.
It would be great if an alert for ImagePullBackOff could also be added in the same release.

Author

amandpay commented Jul 17, 2024

@abahmed Today I observed the following: I had a pod with a termination grace period of 60 seconds and a PreStop sleep of 30 seconds, so the pod stayed in the Terminating state for 30 seconds because of the sleep, and I got an alert for it.
Is there any way to avoid such alerts during deployments?
The alert fired for only a few pods, not all of them, with the message below:

{This message is too long to display here. Please visit the external source app to view the message.}
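One possible way to tell an expected shutdown from a stuck pod is sketched here. This is an assumption for illustration, not something kwatch is known to do or expose as configuration. The sketch relies on the understanding that, once deletion is requested, a pod's `metadata.deletionTimestamp` holds the time by which the grace period ends. The name `in_grace_period` and the sample pod are hypothetical.

```python
from datetime import datetime, timezone


def in_grace_period(pod: dict, now: datetime) -> bool:
    """Return True if the pod is being deleted and its grace period has not yet expired."""
    ts = pod.get("metadata", {}).get("deletionTimestamp")
    if ts is None:
        # The pod is not being deleted at all.
        return False
    # Kubernetes serialises timestamps as RFC 3339 with a trailing "Z".
    deadline = datetime.fromisoformat(ts.replace("Z", "+00:00"))
    return now <= deadline


# Made-up pod whose grace period ends at 10:00:30 UTC.
pod = {"metadata": {"name": "web-0", "deletionTimestamp": "2024-07-17T10:00:30Z"}}

during = datetime(2024, 7, 17, 10, 0, 0, tzinfo=timezone.utc)
after = datetime(2024, 7, 17, 10, 1, 0, tzinfo=timezone.utc)

print(in_grace_period(pod, during))  # True
print(in_grace_period(pod, after))   # False
```

Under this policy, an alert for a Terminating pod would be held back while `in_grace_period` returns True and raised only if the pod outlives its deadline.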

Owner

abahmed commented Jul 17, 2024

Hello @amandpay
I think this may be related to issue #323; I am currently working on a fix.

@amandpay
Author

Hey @abahmed, any update on this?

Owner

abahmed commented Jul 25, 2024

@amandpay
Released in v0.10.0 🎉

@amandpay
Author

@abahmed Is this working? My node was in the NotReady state for 5 minutes because the kubelet was not running, but I received no alerts.

Owner

abahmed commented Jul 25, 2024

@amandpay Can you share logs of kwatch?

abahmed reopened this Jul 25, 2024
Author

amandpay commented Jul 26, 2024

@amandpay Can you share logs of kwatch?

@abahmed

image

Author

amandpay commented Aug 8, 2024

Hi @abahmed, any updates here?
