This repository has been archived by the owner on Nov 2, 2021. It is now read-only.

nvidia-dcgm-exporter creates huge logs inside container #182

Open
boniek83 opened this issue May 4, 2021 · 6 comments

boniek83 commented May 4, 2021

Either the log's size should be limited by some configurable option, the log shouldn't be created at all, or a PV/PVC should be used. Ephemeral storage ain't free :)

root@nvidia-dcgm-exporter-ck74t:/# du -skh /var/log/*
4.0K    /var/log/alternatives.log
48K     /var/log/apt
60K     /var/log/bootstrap.log
0       /var/log/btmp
184K    /var/log/dpkg.log
4.0K    /var/log/faillog
32K     /var/log/lastlog
1.4G    /var/log/nv-hostengine.log
0       /var/log/wtmp

dualvtable (Contributor) commented

Hi @boniek83, which version of dcgm-exporter are you using?

boniek83 (Author) commented

nvcr.io/nvidia/k8s/dcgm-exporter:2.1.4-2.2.0-ubuntu20.04
This is the version shipped with gpu-operator v1.6.2.

jfolz commented Jun 9, 2021

I think this may be related to what we're seeing in #194. Our biggest nv-hostengine.log was something like 8+ GB.

IsQiao commented Jul 15, 2021

Same issue here.

treydock (Contributor) commented

Based on feedback from NVIDIA I set the following environment variable to silence the extra logging:

__DCGM_DBG_LVL=NONE

Now the only thing written to /var/log/nv-hostengine.log is 1 or 2 messages every 30 seconds.
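
For anyone wondering how to get that variable into the container, here is a minimal sketch. It assumes the exporter runs as a DaemonSet named nvidia-dcgm-exporter (inferred from the pod name earlier in this thread) and that it lives in a namespace called gpu-operator-resources, which is only a guess; check with `kubectl get ds -A`. The helper name `dcgm_quiet_cmd` is made up for illustration. It only prints the kubectl command so it can be reviewed before running it.

```shell
# Both defaults are assumptions; override them to match your cluster.
NAMESPACE="${NAMESPACE:-gpu-operator-resources}"   # assumed namespace
DAEMONSET="${DAEMONSET:-nvidia-dcgm-exporter}"     # inferred from the pod name

# Hypothetical helper: print (do not run) the command that would set
# __DCGM_DBG_LVL=NONE on the exporter DaemonSet.
dcgm_quiet_cmd() {
  printf 'kubectl -n %s set env daemonset/%s __DCGM_DBG_LVL=NONE\n' \
    "$NAMESPACE" "$DAEMONSET"
}

dcgm_quiet_cmd
```

If the DaemonSet is managed by gpu-operator, a manual change like this may be reverted when the operator reconciles, so setting the variable through whatever deploys the exporter is probably the more durable route.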

boniek83 (Author) commented

Nice, but not good enough, since it still logs something. We don't know whether the amount of data being logged will change between releases. This should be logged to stdout or to a dedicated persistent volume, or we should just have an option to disable it altogether.
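
Until something like that exists, a stopgap is to cap the file from outside the process. The sketch below is only an illustration of that idea, not anything dcgm-exporter provides: `truncate_if_large` is a made-up helper that empties a file once it exceeds a byte limit. Whether nv-hostengine opens its log in append mode is not confirmed in this thread; if it does not, the file may come back sparse at its old offset after truncation, although the disk blocks would still be freed.

```shell
# Hypothetical helper: empty a log file in place once it exceeds a size limit.
# Usage: truncate_if_large <path> <max_bytes>
truncate_if_large() {
  log="$1"
  max_bytes="$2"

  # Nothing to do if the file does not exist.
  [ -f "$log" ] || return 0

  # wc -c reports the size in bytes; the arithmetic expansion strips padding.
  size=$(( $(wc -c < "$log") ))

  if [ "$size" -gt "$max_bytes" ]; then
    # Truncate in place so the writing process keeps a valid file descriptor.
    : > "$log"
  fi
}
```

For example, `truncate_if_large /var/log/nv-hostengine.log 104857600` would cap the log at roughly 100 MiB each time it is called; the limit and how often to call it are arbitrary choices.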
