This repository has been archived by the owner on Nov 2, 2021. It is now read-only.
Could you provide the logs from the dcgm-exporter itself?
It looks like there are two dcgm-exporter instances: one that is aware of the k8s environment (it was able to connect to the pod API) and another that is not. The container_name, pod_namespace, and pod_name labels are gathered from the k8s infrastructure. If those labels are missing, the connection from dcgm-exporter to k8s failed, and that should be reflected in the dcgm-exporter logs.
WBR,
Nik
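While collecting the logs, one quick way to confirm this theory is to scan the scraped metrics text for series that appear both with and without the k8s labels. The following is only a rough sketch: it assumes the metrics are available as Prometheus text exposition format, the function names are hypothetical, and the sample metric and label values in the usage note are made up for illustration.

```python
import re
from collections import defaultdict

# Labels that, per the comment above, are gathered from k8s.
K8S_LABELS = ("container_name", "pod_namespace", "pod_name")

# Minimal matchers for Prometheus text exposition sample lines.
SAMPLE_RE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Yield (metric_name, labels_dict) for every sample line in the text."""
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = SAMPLE_RE.match(line)
        if match is None:
            continue  # ignore lines this simple parser cannot read
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        yield match.group("name"), labels


def split_by_k8s_labels(text):
    """Group samples by metric name plus non-k8s labels."""
    groups = defaultdict(lambda: {"with_k8s": [], "without_k8s": []})
    for name, labels in parse_samples(text):
        base = tuple(sorted(
            (k, v) for k, v in labels.items() if k not in K8S_LABELS
        ))
        has_k8s = all(labels.get(k) for k in K8S_LABELS)
        bucket = "with_k8s" if has_k8s else "without_k8s"
        groups[(name, base)][bucket].append(labels)
    return groups


def suspected_duplicates(text):
    """Return series reported both with and without the k8s labels."""
    return sorted(
        key for key, group in split_by_k8s_labels(text).items()
        if group["with_k8s"] and group["without_k8s"]
    )
```

For example, feeding it one line such as `dcgm_gpu_temp{gpu="0",uuid="GPU-aaa"} 41` and a second line for the same GPU that also carries container_name, pod_namespace, and pod_name would report that series as a suspected duplicate. A series that only ever appears without the k8s labels is not flagged, since it may simply belong to a GPU with no pod attached.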
yaml: pod-gpu-exporter-daemonset.yaml
docker image: pod-gpu-metrics-exporter:1.0.0-alpha
dcgm: dcgm-exporter:1.4.6
Duplicate metrics occurred after a job had been scheduled on this server for a long time.