Interface Statistics

Use netdata to collect interface statistics from a remote Linux system.

I wrote this tool because I wanted interface statistics with one-second granularity on an EdgeRouter.

Usage

Run ifstat.py on the Linux system whose interface statistics you want to collect
- Modify INTERFACES, LISTEN and PORT according to your needs
Deploy the netdata plugin and its config to a running netdata server
1. Copy netdata/ifstat.chart.py to the netdata plugins.d/python.d.plugin directory
2. Edit netdata/ifstat.conf and then copy it to the etc/netdata/python.d directory

How does it work

ifstat.py starts a simple socket server on the remote Linux system
- Upon receiving a request, it reads various files under /sys/class/net/<itf>/statistics/, combines them into CSV format and sends the result back
netdata/ifstat.chart.py is a netdata plugin written in Python
- It sends requests to the socket server defined in netdata/ifstat.conf, parses the response and generates output for netdata
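
The server side described above can be sketched roughly as follows. This is a minimal illustration, not the actual ifstat.py: the counter list, the default values of INTERFACES, LISTEN and PORT, the use of TCP, and the helper names read_counters, build_csv, StatsHandler and serve are all assumptions made for the example.

```python
import os
import socketserver

# Hypothetical settings; the real ifstat.py may use other values and fields.
INTERFACES = ["eth0", "eth1"]
COUNTERS = ["rx_bytes", "tx_bytes", "rx_packets", "tx_packets"]
LISTEN = "0.0.0.0"
PORT = 9999
SYSFS_NET = "/sys/class/net"


def read_counters(itf, base=SYSFS_NET):
    """Read the selected counters of one interface from sysfs as strings."""
    values = []
    for name in COUNTERS:
        path = os.path.join(base, itf, "statistics", name)
        with open(path) as f:
            values.append(f.read().strip())
    return values


def build_csv(interfaces, base=SYSFS_NET):
    """Build one CSV line per interface: name followed by its counters."""
    lines = [",".join([itf] + read_counters(itf, base)) for itf in interfaces]
    return "\n".join(lines) + "\n"


class StatsHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # Any request triggers a fresh read, so the reply is always current.
        self.request.recv(1024)
        self.request.sendall(build_csv(INTERFACES).encode())


def serve():
    """Serve requests until interrupted (assumes TCP; an assumption)."""
    with socketserver.TCPServer((LISTEN, PORT), StatsHandler) as server:
        server.serve_forever()
```

Calling serve() starts the listener. Because the counters are read inside the handler, nothing is sampled until a client actually asks for data.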

Because data collection is triggered by netdata, each sample is read at the moment netdata asks for it, so the data is always current. This avoids getting data that is not aligned with the collection frequency. We can also stop collection simply by removing the socket server from the config, without touching the remote Linux system.
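
As an illustration of collection on demand, the sketch below shows what a client could do once per collection cycle: connect, send a request, read the CSV reply and parse it. It is not the code of netdata/ifstat.chart.py. The request payload, the function names fetch and parse_csv, and the assumption that the server closes the connection after replying are all hypothetical.

```python
import socket


def parse_csv(text):
    """Turn lines like 'eth0,1,2' into {'eth0': [1, 2]}, skipping blank lines."""
    data = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        name, *values = line.split(",")
        data[name] = [int(v) for v in values]
    return data


def fetch(host, port, timeout=1.0):
    """Request one fresh sample from the remote server and parse it."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"get\n")  # hypothetical request payload
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:  # server closed the connection: reply is complete
                break
            chunks.append(chunk)
    return parse_csv(b"".join(chunks).decode())
```

Since the remote side only reads its counters when fetch is called, the timestamp of each sample follows the collector's schedule rather than a separate loop on the device.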

Overhead on EdgeRouter

Observed by watching the output of top:

On the ER-X, collecting 4 interfaces takes less than 1.3% CPU
On the ER-12, collecting 8 interfaces takes 1% to 2% CPU

Memory (RES) used is less than 4800 bytes.

Known issue

When hardware offloading is enabled, the rx/tx bytes/packets statistics may be far below the actual values. My guess is that packets are forwarded directly by hardware and are therefore not counted on some interfaces.

For me, the reduced 50us NAT+forwarding latency is more valuable than correct statistics, so I keep offloading enabled and disable it only when I need correct statistics.

Debugging netdata plugin

After deploying the Python plugin, run the following command:

bash plugins.d/python.d.plugin ifstat debug trace

If everything works fine, we will see lines starting with BEGIN, SET, etc., which are the output intended for netdata.

The wrong approach (separating data generation and collection)

The approach I first used turned out to be wrong. In this approach:

The remote device generates data in a while loop, sleeping a little less than 1 second between data generations
The netdata plugin collects data every second

The key problem here is that the generated data does not align with the netdata collection interval, which may lead to wrong statistics values. Consider the following scenario (suppose time starts at 0 seconds):

data generated at 0.01 seconds
netdata collects data at 1.00 seconds
data generated at 1.01 seconds
netdata collects data at 2.02 seconds

In this case, netdata collects a value that represents a 2-second interval instead of a 1-second one. Thus we could see false spikes in the chart.
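
The misalignment can be reproduced with a few lines of arithmetic. The sketch below is only an illustration: the helper collected_spans is hypothetical, times are in milliseconds, and it assumes that the generator produces one more sample at 2.01 seconds (just before the second collection) and that each collection picks up the most recent sample.

```python
from bisect import bisect_right


def collected_spans(gen_ms, collect_ms):
    """Return the time (ms) between the samples seen by consecutive collections.

    gen_ms must be sorted; each collection sees the latest sample generated
    at or before its own timestamp.
    """
    picked = []
    for t in collect_ms:
        i = bisect_right(gen_ms, t)  # count of samples generated up to t
        if i:
            picked.append(gen_ms[i - 1])
    return [b - a for a, b in zip(picked, picked[1:])]


# Timeline from the scenario, plus the assumed sample at 2010 ms.
gen_ms = [10, 1010, 2010]
collect_ms = [1000, 2020]
print(collected_spans(gen_ms, collect_ms))  # [2000]
```

The two collections are about one second apart, yet the samples they see are 2000 ms apart, so the counter difference covers two seconds of traffic. With the second collection at 2000 ms instead of 2020 ms, the same function returns [1000].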

The proper way to do this is described in the documentation. The core idea is to collect data at an exactly constant rate.

So it is better to let the netdata plugin's collection trigger data generation. That is why I moved to a Python socket server for monitoring EdgeRouter interfaces. I do not use SNMP because of the update latency of snmpd.

For reference, obselect/ifstat.sh is kept as an example of collecting data separately.
