Future_Kdriver_Supervisor
Many VNFs are currently being "migrated" from purpose-built hardware
solutions into virtualized environments, and they often expect
network interfaces (vNICs attached to SR-IOV VFs) to have the
properties we are used to in traditional switches: 802.1Q VLAN
trunking, VLAN push/pop, security, mirroring, statistics, QoS, etc.
In addition, multi-tenant virtualized environments require SR-IOV
implementations in which security and network integrity cannot be
compromised.
Much as a guest operating system requires a hypervisor to provide
access to the underlying real operating system and hardware, and to
ensure that policies are enforced, there is a similar need for a
"NIC hypervisor" when virtual functions (VFs) are directly
available through SR-IOV. This effort is an attempt to "fill the
gap" by adding the necessary functionality to the NIC's Linux
kernel driver.
To support the SR-IOV hypervisor management interface, the kernel driver is required to expose a new sysfs file hierarchy under the existing /sys/class/net. Below is the proposed sysfs file structure needed to support the SR-IOV hypervisor management interface.
/sys/class/net/<interface-name>/device/sriov [1]
+-- qos
|   +-- [TC, 0-7]              # TC
|       +-- priority           # list of PCP values mapped to this TC
|       +-- lsp                # link strict priority
|       +-- max_bw             # max bandwidth for this class
|       +-- min_bw             # min bandwidth for this class
+-- egress_mirror              # mirror traffic from this PF to specified VF
+-- ingress_mirror
+-- [VF-id, 0 ... 127] [2]
    +-- vlan_mirror            # list of VLANs to mirror to this VF
    +-- trunk                  # list of VLANs to filter on (802.1Q trunk)
    +-- tpid                   # TPID of outer (s-tag) 0x8100 | 0x88A8
    +-- egress_mirror          # mirror traffic from this VF to specified VF
    +-- ingress_mirror
    +-- mac_anti_spoof         # enable/disable MAC anti spoofing
    +-- vlan_anti_spoof        # enable/disable VLAN anti spoofing
    +-- loopback               # enable/disable local traffic loopback (VEB/VEPA)
    +-- default_mac            # default MAC, if not set use random
    +-- mac_list               # list of additional MACs (00:11:22:33:44:55, aa:bb:cc:dd:ee:ff)
    +-- ucast_promisc          # unicast promiscuous
    +-- mcast_promisc          # multicast promiscuous
    +-- allow_bcast            # allow/not allow bcast
    +-- strip_stag             # strip outer tag (s-tag)
    +-- enable                 # enable/disable VF
    +-- link_state             # up/down
    +-- queue_type             # type of queues: 0 - RSS, 1 - QoS
    +-- num_queues             # num of RSS queues allocated to this VF; if queue_type is QoS, same as number of TCs set in PF
    +-- max_tx_rate            # ignored if TC in use
    +-- min_tx_rate            # ignored if TC in use
    +-- stats                  # 64 bit counters
    |   +-- rx_bytes
    |   +-- rx_packets
    |   +-- rx_dropped
    |   +-- tx_bytes
    |   +-- tx_packets
    |   +-- tx_dropped
    |   +-- tx_spoofed
    |   +-- reset_stats        # reset VF stats counters
    +-- qos
        +-- [TC, 0-7]
            +-- share          # % share of TC for this VF
[1] The kobject hierarchy starting at "sriov" is not available in the existing kernel sysfs; the device driver is required to implement this interface.
[2] Assumes that the maximum number of VFs supported by a PF is 128. To support a device with more than 128 SR-IOV instances, a "vfx" kobject is added alongside 0..127. With the "vfx" kobject, users need to pass the VF index as the first parameter, followed by ":".
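The path layout of this hierarchy can be captured in a small shell helper. This is only a sketch: the function name sriov_path and the SRIOV_SYSFS_ROOT variable are hypothetical and not part of the proposed interface. The variable is there so the helper can be pointed at a scratch directory on a system whose driver does not implement the hierarchy.

```sh
# Hypothetical helper: build the sysfs path of a PF-level or VF-level attribute.
# SRIOV_SYSFS_ROOT is an assumption of this sketch; it defaults to /sys/class/net.
SRIOV_SYSFS_ROOT="${SRIOV_SYSFS_ROOT:-/sys/class/net}"

# usage: sriov_path <pf> <attribute>           PF level, e.g. qos/0/priority
#        sriov_path <pf> <vf-id> <attribute>   VF level, e.g. trunk
sriov_path() {
    case $# in
        2) printf '%s/%s/device/sriov/%s\n' "$SRIOV_SYSFS_ROOT" "$1" "$2" ;;
        3) printf '%s/%s/device/sriov/%s/%s\n' "$SRIOV_SYSFS_ROOT" "$1" "$2" "$3" ;;
        *) echo "usage: sriov_path <pf> [vf-id] <attribute>" >&2; return 1 ;;
    esac
}
```

With this helper, a command such as cat "$(sriov_path p1p1 3 trunk)" would read the trunk attribute of VF 3 on PF p1p1.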
Below are definitions of SR-IOV hypervisor functions:
The vlan_mirror sysfs kobject supports both ingress and egress traffic mirroring.
Example of how a user could mirror traffic based upon VLANs 2,4,6,18-22 to VF 3 of PF p1p1
# echo add 2,4,6,18-22 > /sys/class/net/p1p1/device/sriov/3/vlan_mirror
Example of how a user could remove VLANs 4 and 15-17 from traffic mirroring at destination VF 3.
# echo rem 4,15-17 > /sys/class/net/p1p1/device/sriov/3/vlan_mirror
Example of how a user could remove all VLANs from mirroring at VF 3.
# echo rem 0-4095 > /sys/class/net/p1p1/device/sriov/3/vlan_mirror
The trunk sysfs kobject supports two operations: add and rem. The add operation lets users add one or more VLAN IDs to the VF VLAN filtering list. The rem operation removes VLAN IDs from the VF VLAN filtering list.
Example of how a user could add multiple VLAN tags (VLANs 2,4,5,10-20) for filtering on VF 1 of PF p1p2 through sysfs:
# echo add 2,4,5,10-20 > /sys/class/net/p1p2/device/sriov/1/trunk
Example of how a user could remove VLANs 5 and 11-13 from VF 1 of PF p1p2 through sysfs:
# echo rem 5,11-13 > /sys/class/net/p1p2/device/sriov/1/trunk
Note: for rem, any VLAN ID that is not on the VLAN filtering list is ignored.
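The VLAN list syntax used by vlan_mirror and trunk (comma-separated IDs and ranges) can be checked before it is written to sysfs. The sketch below is a hypothetical user-space helper, not part of the driver interface: expand_vlans prints one VLAN ID per line and fails on entries that are not decimal numbers or ranges within 0-4095. The 0-4095 bound is taken from the "rem 0-4095" example; rejecting leading zeros is an assumption made to keep the shell arithmetic simple.

```sh
# Expand a VLAN list such as "2,4,5,10-20" into one ID per line.
# Returns non-zero on malformed entries or IDs above 4095.
expand_vlans() {
    old_ifs=$IFS
    IFS=,
    set -- $1            # split the list on commas
    IFS=$old_ifs
    for entry in "$@"; do
        case $entry in
            *-*) first=${entry%%-*}; last=${entry#*-} ;;
            *)   first=$entry; last=$entry ;;
        esac
        for n in "$first" "$last"; do
            case $n in
                # empty, non-digit, or leading-zero values are rejected
                ''|*[!0-9]*|0?*) echo "bad VLAN entry: $entry" >&2; return 1 ;;
            esac
        done
        if [ "$first" -gt "$last" ] || [ "$last" -gt 4095 ]; then
            echo "VLAN range out of order or above 4095: $entry" >&2
            return 1
        fi
        while [ "$first" -le "$last" ]; do
            echo "$first"
            first=$((first + 1))
        done
    done
}
```

A caller could guard a write with it, for example: expand_vlans 2,4,5,10-20 > /dev/null && echo add 2,4,5,10-20 > /sys/class/net/p1p2/device/sriov/1/trunk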
The tpid sysfs kobject is used to specify the TPID of the outer VLAN tag (s-tag). The default value should be 0x8100. It can be set to 0x8100 or 0x88A8, or to the decimal equivalents 33024 or 34984.
Example of how a user could set the TPID to 0x88a8:
# echo 0x88a8 > /sys/class/net/p1p2/device/sriov/1/tpid
To show the configured value:
# cat /sys/class/net/p1p2/device/sriov/1/tpid
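Because the TPID may be given in hex or decimal, a wrapper script may want to normalize it first. The following is a minimal sketch; normalize_tpid is a hypothetical name, and it accepts only the two values named in the text (0x8100 = 33024 and 0x88A8 = 34984).

```sh
# Map an accepted TPID spelling to lowercase hex; reject everything else.
normalize_tpid() {
    case $1 in
        0x8100|0X8100|33024)           echo 0x8100 ;;
        0x88[aA]8|0X88[aA]8|34984)     echo 0x88a8 ;;
        *) echo "unsupported TPID: $1" >&2; return 1 ;;
    esac
}
```

For example, normalize_tpid 34984 prints 0x88a8, which could then be written to the tpid attribute.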
The egress_mirror sysfs kobject supports egress traffic mirroring.
Example of how a user could add egress traffic mirroring from VF 1 to VF 7 on PF p1p2
# echo add 7 > /sys/class/net/p1p2/device/sriov/1/egress_mirror
Example of how a user could remove egress traffic mirroring from VF 1 to VF 7 on PF p1p2
# echo rem 7 > /sys/class/net/p1p2/device/sriov/1/egress_mirror
The ingress_mirror sysfs kobject supports ingress traffic mirroring.
Example of how a user could mirror ingress traffic from VF 1 to VF 7 on PF p1p2
# echo add 7 > /sys/class/net/p1p2/device/sriov/1/ingress_mirror
Example of how a user could show the current ingress mirroring configuration
# cat /sys/class/net/p1p2/device/sriov/1/ingress_mirror
The mac_anti_spoof sysfs kobject enables or disables MAC
anti-spoofing. Currently ip link controls VLAN and MAC
anti-spoofing together. If set to OFF, this feature allows VFs to
transmit packets with any source MAC, which is needed for some
L2 applications as well as for vNIC bonding within VMs. If set
to ON, violating packets have to be dropped and the tx_spoofed
stats counter has to be incremented.
Example of how a user could enable MAC anti-spoof for PF p1p2 VF 1
# echo 1 > /sys/class/net/p1p2/device/sriov/1/mac_anti_spoof
Example of how a user could disable MAC anti-spoof for PF p1p2 VF 1
# echo 0 > /sys/class/net/p1p2/device/sriov/1/mac_anti_spoof
The vlan_anti_spoof sysfs kobject enables or disables VLAN
anti-spoofing. Currently ip link controls VLAN and MAC
anti-spoofing together. If set to ON, this feature allows VFs to
transmit only packets carrying a VLAN tag specified in the "trunk"
settings, and does not allow "untagged" packets to be
transmitted. Violations have to increment the tx_spoofed
stats counter.
Example of how a user could enable VLAN anti-spoof for PF p1p2 VF 1
# echo 1 > /sys/class/net/p1p2/device/sriov/1/vlan_anti_spoof
Example of how a user could disable VLAN anti-spoof for PF p1p2 VF 1
# echo 0 > /sys/class/net/p1p2/device/sriov/1/vlan_anti_spoof
To display the current setting
# cat /sys/class/net/p1p2/device/sriov/1/vlan_anti_spoof
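Since the two anti-spoof controls are separate files, a script that wants the combined behavior has to write both. The sketch below shows one way this could look; set_anti_spoof and SRIOV_SYSFS_ROOT are hypothetical names, and reading each value back with cat assumes the attribute returns what was written.

```sh
# Write the same value (1 = on, 0 = off) to both anti-spoof attributes of a VF
# and print what each attribute reports afterwards.
# usage: set_anti_spoof <pf> <vf-id> <0|1>
set_anti_spoof() {
    dir="${SRIOV_SYSFS_ROOT:-/sys/class/net}/$1/device/sriov/$2"
    for attr in mac_anti_spoof vlan_anti_spoof; do
        echo "$3" > "$dir/$attr" || return 1
        printf '%s=%s\n' "$attr" "$(cat "$dir/$attr")"
    done
}
```

For example, set_anti_spoof p1p2 1 1 would enable both checks on VF 1 of PF p1p2.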
The loopback sysfs kobject enables or disables local loopback (VEB/VEPA).
Example of how a user could allow traffic switching between VFs on the same PF
# echo 1 > /sys/class/net/p1p2/device/sriov/loopback
Example of how a user could hairpin traffic through the switch the PF is connected to
# echo 0 > /sys/class/net/p1p2/device/sriov/loopback
Example of how to show the loopback configuration
# cat /sys/class/net/p1p2/device/sriov/loopback
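The 1/0 values map onto the two bridging modes named in the text: 1 switches VF-to-VF traffic locally (VEB) and 0 sends it out to the adjacent switch (VEPA). A script could hide the numeric values behind mode names; the helper below is a hypothetical sketch, and the mode keywords "veb" and "vepa" are assumptions of the sketch, not values the sysfs file accepts.

```sh
# Translate a bridging mode name into the value written to the loopback attribute.
loopback_value() {
    case $1 in
        veb|VEB)   echo 1 ;;   # local switching between VFs on the same PF
        vepa|VEPA) echo 0 ;;   # hairpin through the external switch
        *) echo "unknown mode: $1" >&2; return 1 ;;
    esac
}
```

For example: echo "$(loopback_value vepa)" > /sys/class/net/p1p2/device/sriov/loopback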
The default_mac sysfs kobject supports setting the default MAC address. If the MAC address is set by this command, the PF won't allow the VF to change it using an MBOX request.
Example of setting the default MAC address of VF 1
# echo "00:11:22:33:44:55" > /sys/class/net/p1p2/device/sriov/1/default_mac
Example of how to show the default MAC address
# cat /sys/class/net/p1p2/device/sriov/1/default_mac
The mac_list sysfs kobject supports adding additional MACs to the VF. The default MAC is taken from "ip link set p1p2 vf 1 mac 00:11:22:33:44:55" if configured. If it is not configured, a random address is assigned to the VF by the NIC. If the MAC is configured using the ip link command, the VF is not allowed to change it via MBOX/AdminQ requests.
Example of how to add MACs 00:11:22:33:44:55 and 00:66:55:44:33:22 to PF p1p2 VF 1
# echo add "00:11:22:33:44:55,00:66:55:44:33:22" > /sys/class/net/p1p2/device/sriov/1/mac_list
Example of how to delete MAC 00:11:22:33:44:55 from the above VF device
# echo rem 00:11:22:33:44:55 > /sys/class/net/p1p2/device/sriov/1/mac_list
Example of how to display a VF MAC address list
# cat /sys/class/net/p1p2/device/sriov/1/mac_list
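A comma-separated MAC list can be sanity-checked in user space before it is written to mac_list. The following sketch uses a hypothetical function name, valid_mac_list, and assumes the colon-separated six-octet notation used in the examples with no spaces between entries.

```sh
# Succeed only if the argument is one or more MAC addresses
# (six colon-separated hex octets each), separated by commas.
valid_mac_list() {
    mac='[0-9A-Fa-f]{2}(:[0-9A-Fa-f]{2}){5}'
    printf '%s\n' "$1" | grep -Eq "^${mac}(,${mac})*\$"
}
```

For example: valid_mac_list "00:11:22:33:44:55,00:66:55:44:33:22" && echo add "00:11:22:33:44:55,00:66:55:44:33:22" > /sys/class/net/p1p2/device/sriov/1/mac_list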
The ucast_promisc sysfs kobject supports setting/unsetting VF device unicast promiscuous mode
Example of how to set unicast promiscuous on PF p1p2 VF 1
# echo 1 > /sys/class/net/p1p2/device/sriov/1/ucast_promisc
Example of how to unset unicast promiscuous on PF p1p2 VF 1
# echo 0 > /sys/class/net/p1p2/device/sriov/1/ucast_promisc
Example of how to show the current promiscuous mode configuration
# cat /sys/class/net/p1p2/device/sriov/1/ucast_promisc
The mcast_promisc sysfs kobject supports setting/unsetting VF device multicast promiscuous mode
Example of how to set multicast promiscuous on PF p1p2 VF 1
# echo 1 > /sys/class/net/p1p2/device/sriov/1/mcast_promisc
Example of how to unset multicast promiscuous on PF p1p2 VF 1
# echo 0 > /sys/class/net/p1p2/device/sriov/1/mcast_promisc
Example of how to show the current promiscuous mode configuration
# cat /sys/class/net/p1p2/device/sriov/1/mcast_promisc
The allow_bcast sysfs kobject supports enabling/disabling reception of broadcast packets by the VF device
Example of how to allow broadcast on PF p1p2 VF 1
# echo 1 > /sys/class/net/p1p2/device/sriov/1/allow_bcast
Example of how to disable broadcast on PF p1p2 VF 1
# echo 0 > /sys/class/net/p1p2/device/sriov/1/allow_bcast
Example of how to show the current broadcast configuration
# cat /sys/class/net/p1p2/device/sriov/1/allow_bcast
The strip_stag sysfs kobject supports enabling/disabling VF
device outer VLAN stripping. If the VLAN tag is stripped, its
information has to be posted to the RX descriptor. On transmit,
the VLAN ID from the TX descriptor has to be inserted into the packet.
Example of how to enable VLAN stripping on VF 3
# echo 1 > /sys/class/net/p1p1/device/sriov/3/strip_stag
Example of how to disable VLAN stripping on VF 3
# echo 0 > /sys/class/net/p1p1/device/sriov/3/strip_stag
The enable sysfs kobject supports enabling/disabling VF device
Example of how to enable VF 3
# echo 1 > /sys/class/net/p1p1/device/sriov/3/enable
Example of how to disable VF 3
# echo 0 > /sys/class/net/p1p1/device/sriov/3/enable
Show VF 3 enable state
# cat /sys/class/net/p1p1/device/sriov/3/enable
The link_state sysfs kobject displays link status (up/down/disabled)
Example of how to show the link state of VF 3
# cat /sys/class/net/p1p1/device/sriov/3/link_state
The stats sysfs kobject supports getting VF statistics (64bit counters)
Example of how to display stats of VF 1
#cat /sys/class/net/p1p2/device/sriov/1/stats
rx_bytes
rx_dropped
rx_packets
tx_bytes
tx_dropped
tx_packets
tx_spoofed
Example of how to display anti-spoofing violations counter for VF 1
#cat /sys/class/net/p1p2/device/sriov/1/stats/tx_spoofed
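Since each counter is also exposed as its own file, a script could collect all of them in one pass. The sketch below assumes the seven counter names listed above and one numeric value per file; show_vf_stats and SRIOV_SYSFS_ROOT are hypothetical names introduced for illustration.

```sh
# Print "<counter> <value>" for each known counter of a VF.
# Counters the driver does not expose are skipped silently.
# usage: show_vf_stats <pf> <vf-id>
show_vf_stats() {
    dir="${SRIOV_SYSFS_ROOT:-/sys/class/net}/$1/device/sriov/$2/stats"
    for name in rx_bytes rx_packets rx_dropped \
                tx_bytes tx_packets tx_dropped tx_spoofed; do
        [ -r "$dir/$name" ] || continue
        printf '%s %s\n' "$name" "$(cat "$dir/$name")"
    done
}
```

For example, show_vf_stats p1p2 1 would list the counters of VF 1 on PF p1p2, one per line.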
The reset_stats sysfs kobject resets VF stats counters
Example of how to reset stats for VF 1
# echo 1 > /sys/class/net/p1p2/device/sriov/1/stats/reset_stats
The queue_type sysfs kobject is used to set the type of queues: 0 - RSS, 1 - QoS (default RSS)
Example of how to set queue type RSS for VF 3
# echo 0 > /sys/class/net/p1p1/device/sriov/3/queue_type
Example of how to set type QoS for VF 3
# echo 1 > /sys/class/net/p1p1/device/sriov/3/queue_type
Show VF 3 queue type
# cat /sys/class/net/p1p1/device/sriov/3/queue_type
The num_queues sysfs kobject is used to set the number of queues for a VF. If queue_type is QoS, the number of queues cannot be set (it has to be equal to the number of TCs set for the PF)
Example of how to set 8 queues for VF 5 if queue_type is RSS
# echo 8 > /sys/class/net/p1p1/device/sriov/5/num_queues
Show the number of queues for VF 5
# cat /sys/class/net/p1p1/device/sriov/5/num_queues
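The dependency between queue_type and num_queues can be checked by a wrapper before writing. This is a sketch under assumptions: set_num_queues and SRIOV_SYSFS_ROOT are hypothetical names, and the check assumes queue_type reads back as the literal 0 or 1 that was written to it.

```sh
# Set the number of RSS queues for a VF, refusing when the VF uses QoS queues.
# usage: set_num_queues <pf> <vf-id> <count>
set_num_queues() {
    dir="${SRIOV_SYSFS_ROOT:-/sys/class/net}/$1/device/sriov/$2"
    if [ "$(cat "$dir/queue_type")" = "1" ]; then
        echo "VF $2 uses QoS queues; num_queues follows the PF TC count" >&2
        return 1
    fi
    echo "$3" > "$dir/num_queues"
}
```

For example, set_num_queues p1p1 5 8 would request 8 queues for VF 5 only while its queue type is RSS.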
The max_tx_rate sysfs kobject is used to set the maximum transmit rate in Mbps for a VF (ignored if TC QoS is in use)
Example of how to set a 200 Mbps limit for VF 3
# echo 200 > /sys/class/net/p1p1/device/sriov/3/max_tx_rate
Show VF 3 max_tx_rate
# cat /sys/class/net/p1p1/device/sriov/3/max_tx_rate
The min_tx_rate sysfs kobject is used to set the minimum transmit rate in Mbps for a VF (ignored if TC QoS is in use)
Example of how to set a 20 Mbps minimum rate for VF 3
# echo 20 > /sys/class/net/p1p1/device/sriov/3/min_tx_rate
Show VF 3 min_tx_rate
# cat /sys/class/net/p1p1/device/sriov/3/min_tx_rate
The share sysfs kobject is used to set the traffic share of the specified traffic class for a VF (TC settings are done for the PF in the relevant QoS section)
Example of how to set a 15% traffic share of TC1 for VF 7
# echo 15 > /sys/class/net/p1p1/device/sriov/7/qos/1/share
To display current setting for TC3 VF 8
# cat /sys/class/net/p1p1/device/sriov/8/qos/3/share
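Because share is a percentage of a TC, an operator may want to see how much of a TC has been handed out across VFs. The sketch below sums the share values of one TC over all VF directories of a PF. Treating a total above 100 as a problem is an assumption; the text does not say how the driver handles oversubscription. tc_share_total and SRIOV_SYSFS_ROOT are hypothetical names.

```sh
# Sum the "share" values configured for one TC across all VFs of a PF.
# usage: tc_share_total <pf> <tc>
tc_share_total() {
    base="${SRIOV_SYSFS_ROOT:-/sys/class/net}/$1/device/sriov"
    total=0
    for f in "$base"/*/qos/"$2"/share; do
        [ -f "$f" ] || continue      # glob did not match or VF has no share file
        total=$((total + $(cat "$f")))
    done
    echo "$total"
}
```

For example, [ "$(tc_share_total p1p1 1)" -le 100 ] could be used as a check after changing shares for TC1.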
The priority sysfs kobject is used to set the list of PCP values mapped to a traffic class
Example of how to map priorities 0 and 1 to traffic class 0
# echo 0,1 > /sys/class/net/p1p1/device/sriov/qos/0/priority
To display current setting for TC3
# cat /sys/class/net/p1p1/device/sriov/qos/3/priority
The lsp sysfs kobject is used to set Link Strict Priority
Example of how to set LSP for traffic class 0
# echo 0,1 > /sys/class/net/p1p1/device/sriov/qos/0/lsp
To display current setting for TC0
# cat /sys/class/net/p1p1/device/sriov/qos/0/lsp
The max_bw sysfs kobject is used to set the maximum bandwidth in Mbps for a TC
Example of how to set a maximum bandwidth of 2 Gbps for traffic class 2
# echo 2000 > /sys/class/net/p1p1/device/sriov/qos/2/max_bw
To display current setting for TC0
# cat /sys/class/net/p1p1/device/sriov/qos/0/max_bw
The min_bw sysfs kobject is used to set the minimum bandwidth in Mbps for a TC
Example of how to set a minimum bandwidth of 20 Mbps for traffic class 2
# echo 20 > /sys/class/net/p1p1/device/sriov/qos/2/min_bw
To display current setting for TC2
# cat /sys/class/net/p1p1/device/sriov/qos/2/min_bw
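The PF-level QoS attributes of one traffic class can be set together. The sketch below writes priority, min_bw and max_bw for a TC and refuses a minimum above the maximum. That ordering check is an assumption of the sketch, not a rule stated in the text, and configure_tc and SRIOV_SYSFS_ROOT are hypothetical names.

```sh
# Configure one PF traffic class: PCP list, then min and max bandwidth in Mbps.
# usage: configure_tc <pf> <tc> <pcp-list> <min-mbps> <max-mbps>
configure_tc() {
    dir="${SRIOV_SYSFS_ROOT:-/sys/class/net}/$1/device/sriov/qos/$2"
    if [ "$4" -gt "$5" ]; then
        echo "min_bw ($4) exceeds max_bw ($5)" >&2
        return 1
    fi
    echo "$3" > "$dir/priority" &&
    echo "$4" > "$dir/min_bw" &&
    echo "$5" > "$dir/max_bw"
}
```

For example, configure_tc p1p1 2 2,3 20 2000 would map PCP values 2 and 3 to TC2 with 20 Mbps minimum and 2 Gbps maximum bandwidth.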