All data for the metric kepler_container_package_joules_total is zero #109
Replies: 3 comments
-
The fact that the metrics show up in Prometheus means the kepler-exporter is being scraped. I checked the data source in Grafana and its test passes. The dashboard is built with all the visualizations, so Grafana can talk to Prometheus.
-
Some metrics have values, but the metric used by nearly every visualization in Grafana is not being read.
-
@ajgillette Shall we move this to an issue?
-
I have switched to deploying Kepler on a lab cluster, and I'm still not getting any data in the Grafana dashboard. I checked the queries in the dashboard, and they rely heavily on the metric "kepler_container_package_joules_total". When I examine that metric on the Prometheus query page (port-forwarding port 9090), I see that the value is always zero.
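The same check can be scripted against the Prometheus HTTP API instead of the query page. This is a minimal sketch, assuming Prometheus is reachable at http://localhost:9090 through the port-forward; the helper names (`build_query_url`, `fetch`, `summarize`) are made up for illustration and are not part of Kepler or Prometheus.

```python
import json
import urllib.parse
import urllib.request

# Metric named in the dashboard queries.
METRIC = "kepler_container_package_joules_total"


def build_query_url(base_url, query):
    """Build an instant-query URL for the Prometheus HTTP API."""
    params = urllib.parse.urlencode({"query": query})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def fetch(base_url, query):
    """Run the instant query and return the decoded JSON response."""
    with urllib.request.urlopen(build_query_url(base_url, query), timeout=10) as resp:
        return json.load(resp)


def summarize(payload):
    """Return (series_count, nonzero_count) for an instant-query response."""
    if payload.get("status") != "success":
        raise ValueError("query failed: %r" % payload.get("error"))
    results = payload["data"]["result"]
    # Each vector sample is [timestamp, "value-as-string"].
    nonzero = sum(1 for r in results if float(r["value"][1]) != 0.0)
    return len(results), nonzero
```

With the port-forward running, `summarize(fetch("http://localhost:9090", METRIC))` returns the number of series and how many of them are non-zero; a result with a positive series count and a non-zero count of 0 would match what the query page shows here.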
I checked inside the kepler-exporter pod, and all the kernel headers and modules are visible (mapped to the expected place).
Here's the kepler-exporter log:
kubectl logs -n kepler kepler-exporter-d456k
I0628 18:37:31.354308 1 gpu.go:46] Failed to init nvml, err: could not init nvml: error opening libnvidia-ml.so.1: libnvidia-ml.so.1: cannot open shared object file: No such file or directory
I0628 18:37:31.359458 1 acpi.go:67] Could not find any ACPI power meter path. Is it a VM?
I0628 18:37:31.370258 1 exporter.go:151] Kepler running on version: c5ab287
I0628 18:37:31.370297 1 config.go:210] using gCgroup ID in the BPF program: true
I0628 18:37:31.370337 1 config.go:212] kernel version: 5.15
I0628 18:37:31.370415 1 config.go:172] kernel source dir is set to /usr/share/kepler/kernel_sources
I0628 18:37:31.370535 1 exporter.go:169] EnabledBPFBatchDelete: true
I0628 18:37:31.370578 1 rapl_msr_util.go:129] failed to open path /dev/cpu/0/msr: no such file or directory
I0628 18:37:31.370620 1 power.go:64] Not able to obtain power, use estimate method
I0628 18:39:01.429764 1 exporter.go:182] Initializing the GPU collector
I0628 18:39:01.430807 1 watcher.go:67] Using in cluster k8s config
cannot attach kprobe, probe entry may not exist
perf_event_open: No such file or directory
I0628 18:39:06.559697 1 bcc_attacher.go:108] failed to attach perf event cpu_cycles_hc_reader: failed to open bpf perf event: no such file or directory
perf_event_open: No such file or directory
I0628 18:39:06.559792 1 bcc_attacher.go:108] failed to attach perf event cpu_ref_cycles_hc_reader: failed to open bpf perf event: no such file or directory
perf_event_open: No such file or directory
I0628 18:39:06.559859 1 bcc_attacher.go:108] failed to attach perf event cpu_instr_hc_reader: failed to open bpf perf event: no such file or directory
perf_event_open: No such file or directory
I0628 18:39:06.559933 1 bcc_attacher.go:108] failed to attach perf event cache_miss_hc_reader: failed to open bpf perf event: no such file or directory
I0628 18:39:06.559963 1 bcc_attacher.go:171] Successfully load eBPF module with option: [-DMAP_SIZE=10240 -DNUM_CPUS=8 -DSET_GROUP_ID]
I0628 18:39:06.576183 1 exporter.go:226] Started Kepler in 1m35.205964464s
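Several lines in the log above report that a power or counter source was unavailable, which may be relevant to the zero readings. The following is a minimal sketch, assuming the log has been saved as text, that groups those lines by kind. The patterns are copied from messages in this log; the category names and the `scan_log` helper are hypothetical labels, not anything Kepler defines.

```python
import re

# Patterns copied from messages seen in the log above.
# The dictionary keys are illustrative labels only.
PATTERNS = {
    "perf_event_attach_failed": re.compile(r"failed to attach perf event (\w+)"),
    "msr_unavailable": re.compile(r"failed to open path (/dev/cpu/\d+/msr)"),
    "power_estimated": re.compile(r"Not able to obtain power, use estimate method"),
    "acpi_missing": re.compile(r"Could not find any ACPI power meter path"),
}


def scan_log(text):
    """Collect, per category, the matched detail from each log line."""
    findings = {name: [] for name in PATTERNS}
    for line in text.splitlines():
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                # Use the captured detail when the pattern has a group,
                # otherwise keep the whole matched message.
                detail = match.group(1) if match.groups() else match.group(0)
                findings[name].append(detail)
    return findings
```

Feeding it the output of `kubectl logs -n kepler kepler-exporter-d456k` would list which perf event readers failed to attach and whether the exporter fell back to the estimate method.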