All data for the metric kepler_container_package_joules_total is zero #109

ajgillette · 2023-06-28T18:55:10Z

I have switched to deploying Kepler on a lab cluster, and I'm still not getting any data in the Grafana dashboard. I checked the queries in the dashboard, and they rely heavily on the metric "kepler_container_package_joules_total". When I examine that metric on the Prometheus query page (port-forwarding port 9090), I see that the value is always zero.
I checked inside the kepler-exporter pod, and all the kernel headers and modules are visible (mapped to the expected place).
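
For anyone reproducing this check outside the Prometheus UI, here is a minimal sketch in Python that runs the same instant query through the Prometheus HTTP API (`GET /api/v1/query`) and reports whether every returned sample is zero. It assumes the port-forward mentioned above makes Prometheus reachable at http://localhost:9090. The function names are hypothetical and not part of Kepler or Prometheus.

```python
import json
import urllib.parse
import urllib.request

METRIC = "kepler_container_package_joules_total"


def sample_values(response: dict) -> list[float]:
    """Extract the numeric sample values from an instant-query response."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response.get('error', 'unknown error')}")
    # Each instant-vector entry looks like {"metric": {...}, "value": [ts, "123.4"]}.
    return [float(item["value"][1]) for item in response["data"]["result"]]


def all_zero(response: dict) -> bool:
    """True when the query returned at least one series and every sample is 0."""
    values = sample_values(response)
    return bool(values) and all(v == 0.0 for v in values)


def query_prometheus(base_url: str, promql: str) -> dict:
    """Run an instant query via GET /api/v1/query (needs a reachable Prometheus)."""
    url = f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)
```

With the port-forward running, `all_zero(query_prometheus("http://localhost:9090", METRIC))` should return `True` in the situation described here. An empty result is reported as `False` on purpose, so that "no series at all" is not confused with "series present but zero".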

Here's the Kepler-exporter log:

kubectl logs -n kepler kepler-exporter-d456k
I0628 18:37:31.354308 1 gpu.go:46] Failed to init nvml, err: could not init nvml: error opening libnvidia-ml.so.1: libnvidia-ml.so.1: cannot open shared object file: No such file or directory
I0628 18:37:31.359458 1 acpi.go:67] Could not find any ACPI power meter path. Is it a VM?
I0628 18:37:31.370258 1 exporter.go:151] Kepler running on version: c5ab287
I0628 18:37:31.370297 1 config.go:210] using gCgroup ID in the BPF program: true
I0628 18:37:31.370337 1 config.go:212] kernel version: 5.15
I0628 18:37:31.370415 1 config.go:172] kernel source dir is set to /usr/share/kepler/kernel_sources
I0628 18:37:31.370535 1 exporter.go:169] EnabledBPFBatchDelete: true
I0628 18:37:31.370578 1 rapl_msr_util.go:129] failed to open path /dev/cpu/0/msr: no such file or directory
I0628 18:37:31.370620 1 power.go:64] Not able to obtain power, use estimate method
I0628 18:39:01.429764 1 exporter.go:182] Initializing the GPU collector
I0628 18:39:01.430807 1 watcher.go:67] Using in cluster k8s config
cannot attach kprobe, probe entry may not exist
perf_event_open: No such file or directory
I0628 18:39:06.559697 1 bcc_attacher.go:108] failed to attach perf event cpu_cycles_hc_reader: failed to open bpf perf event: no such file or directory
perf_event_open: No such file or directory
I0628 18:39:06.559792 1 bcc_attacher.go:108] failed to attach perf event cpu_ref_cycles_hc_reader: failed to open bpf perf event: no such file or directory
perf_event_open: No such file or directory
I0628 18:39:06.559859 1 bcc_attacher.go:108] failed to attach perf event cpu_instr_hc_reader: failed to open bpf perf event: no such file or directory
perf_event_open: No such file or directory
I0628 18:39:06.559933 1 bcc_attacher.go:108] failed to attach perf event cache_miss_hc_reader: failed to open bpf perf event: no such file or directory
I0628 18:39:06.559963 1 bcc_attacher.go:171] Successfully load eBPF module with option: [-DMAP_SIZE=10240 -DNUM_CPUS=8 -DSET_GROUP_ID]
I0628 18:39:06.576183 1 exporter.go:226] Started Kepler in 1m35.205964464s
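
The log above contains several messages about unavailable power sources and perf events. As a rough aid for comparing logs across nodes, here is a small Python sketch that scans exporter log text for those messages. The substrings are copied from this log. The short descriptions and the `summarize` helper are this sketch's own reading and naming, not anything defined by Kepler.

```python
import re

# Substrings taken from the log above, mapped to a short description.
# The descriptions are an interpretation, not Kepler documentation.
INDICATORS = {
    "failed to open path /dev/cpu/0/msr": "could not open /dev/cpu/0/msr",
    "Could not find any ACPI power meter path": "no ACPI power meter path found",
    "Not able to obtain power, use estimate method": "falling back to estimated power",
    "Failed to init nvml": "NVML library could not be initialized",
}

# Matches e.g. "failed to attach perf event cpu_cycles_hc_reader: ..."
PERF_EVENT = re.compile(r"failed to attach perf event (\w+):")


def summarize(log_text: str) -> dict:
    """Return the indicators found and the perf events that failed to attach."""
    found = [desc for needle, desc in INDICATORS.items() if needle in log_text]
    failed_events = PERF_EVENT.findall(log_text)
    return {"indicators": found, "failed_perf_events": failed_events}
```

Feeding it the output of `kubectl logs -n kepler kepler-exporter-d456k` (for example, saved to a file and read back as a string) lists which of these messages appear and which perf events could not be attached.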

ajgillette · 2023-06-28T18:57:55Z

Author

The fact that the metrics show up in Prometheus means the kepler-exporter is being scraped. I checked the data source in Grafana and its connection test passes. The dashboard renders all of its visualizations, so Grafana can talk to Prometheus.
The issue must be related to the kepler-exporter's ability to connect to and read data from the eBPF code.

ajgillette · 2023-06-28T18:59:22Z

Author

Some metrics have values, but the metric used by pretty much every visualization in Grafana, kepler_container_package_joules_total, still reports only zeros.
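
To make "some metrics have values" concrete, here is a Python sketch that splits metric names into those with at least one non-zero sample and those that are zero everywhere. It assumes its input is the parsed JSON of a Prometheus instant query such as `{__name__=~"kepler_.*"}`, and the helper name `split_by_activity` is hypothetical.

```python
from collections import defaultdict


def split_by_activity(response: dict) -> tuple[list[str], list[str]]:
    """Split metric names into (has non-zero samples, all samples zero)."""
    nonzero_seen = defaultdict(bool)
    for item in response["data"]["result"]:
        # Selector queries keep the metric name in the "__name__" label.
        name = item["metric"].get("__name__", "<unnamed>")
        value = float(item["value"][1])
        nonzero_seen[name] = nonzero_seen[name] or value != 0.0
    active = sorted(n for n, seen in nonzero_seen.items() if seen)
    silent = sorted(n for n, seen in nonzero_seen.items() if not seen)
    return active, silent
```

The second list then shows exactly which metrics are stuck at zero, which helps narrow down which data source is missing.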

sunya-ch · 2023-09-19T04:03:18Z

Maintainer

@ajgillette Shall we move this to an issue?
