Configure SR-IOV Network Device Plugin for Kubernetes¶
The SR-IOV CNI plugin exposes the Scalable Function (SF) on the DPU node to Kubernetes. To do so, you must create an SF ConfigMap resource.
Create an SF ConfigMap on the Host Node¶
To configure the SR-IOV CNI plugin, create an SF ConfigMap resource on the host with the following settings.
Create a sf-cm.yaml file with the example contents below.

```shell
vi sf-cm.yaml
```
Example Contents:
Note: Make sure to update the values in the example content with the actual values for your environment.
```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: sriovdp-config
  namespace: kube-system
data:
  config.json: |
    {
      "resourceList": [
        {
          "resourceName": "bf3_p0_sf",
          "resourcePrefix": "nvidia.com",
          "deviceType": "auxNetDevice",
          "selectors": [{
            "vendors": ["15b3"],
            "devices": ["a2dc"],
            "pciAddresses": ["0000:03:00.0"],
            "pfNames": ["p0#1"],
            "auxTypes": ["sf"]
          }]
        },
        {
          "resourceName": "bf3_p1_sf",
          "resourcePrefix": "nvidia.com",
          "deviceType": "auxNetDevice",
          "selectors": [{
            "vendors": ["15b3"],
            "devices": ["a2dc"],
            "pciAddresses": ["0000:03:00.1"],
            "pfNames": ["p1#1"],
            "auxTypes": ["sf"]
          }]
        }
      ]
    }
```
Apply the SF ConfigMap on the host node.

```shell
kubectl apply -f sf-cm.yaml
```
Install SR-IOV Network Device Plugin for Kubernetes¶
Download the sriovdp-daemonset.yaml file.

```shell
wget https://raw.github.com/k8snetworkplumbingwg/sriov-network-device-plugin/master/deployments/sriovdp-daemonset.yaml
```
Add tolerations to the sriovdp-daemonset.yaml file to allow the sriovdp pods to run on the DPU nodes. Add the section below under spec.

```yaml
tolerations:
- key: "dpu"
  value: "true"
  operator: "Equal"
```
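For a DaemonSet, tolerations apply to the pods it creates, so they belong in the pod template spec. Assuming the standard DaemonSet layout of the upstream manifest, the placement looks like this (surrounding fields elided):

```yaml
spec:
  template:
    spec:
      # Allows the device plugin pods to be scheduled on DPU nodes
      # carrying the dpu=true taint.
      tolerations:
      - key: "dpu"
        value: "true"
        operator: "Equal"
```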
Apply the SR-IOV network device plugin.

```shell
kubectl create -f sriovdp-daemonset.yaml
```
Verify that an SR-IOV device plugin pod was created for each node in the cluster.

```shell
kubectl get pods -owide -n kube-system
```

Sample Response:

```
kube-sriov-device-plugin-nstrs   1/1   Running   0   2d2h    <IP address>   localhost.localdomain   <none>   <none>
kube-sriov-device-plugin-p8mv5   1/1   Running   0   2d22h   <IP address>   sm-hgx1                 <none>   <none>
```
Verify the pod deployed for the DPU. The pod should read the SF ConfigMap and create a resource pool for each SF. The pod iterates through all the PCI resources but should eventually locate the correct ones.

```shell
kubectl logs pod/kube-sriov-device-plugin-nstrs -n kube-system
```
In the example logs below, look for the new resource servers created for the bf3_p0_sf and bf3_p1_sf ResourcePools.

```
I0814 15:33:51.566759 1 manager.go:57] Using Kubelet Plugin Registry Mode
I0814 15:33:51.567877 1 main.go:46] resource manager reading configs
I0814 15:33:51.568002 1 manager.go:86] raw ResourceList: {
    "resourceList": [
        {
            "resourceName": "bf3_p0_sf",
            "resourcePrefix": "nvidia.com",
            "deviceType": "auxNetDevice",
            "selectors": [{
                "vendors": ["15b3"],
                "devices": ["a2dc"],
                "pciAddresses": ["0000:03:00.0"],
                "pfNames": ["p0#1"],
                "auxTypes": ["sf"]
            }]
        },
        {
            "resourceName": "bf3_p1_sf",
            "resourcePrefix": "nvidia.com",
            "deviceType": "auxNetDevice",
            "selectors": [{
                "vendors": ["15b3"],
                "devices": ["a2dc"],
                "pciAddresses": ["0000:03:00.1"],
                "pfNames": ["p1#1"],
                "auxTypes": ["sf"]
            }]
        }
    ]
}
I0814 15:33:51.569064 1 factory.go:211] *types.AuxNetDeviceSelectors for resource bf3_p0_sf is [0x40000d24e0]
I0814 15:33:51.569131 1 factory.go:211] *types.AuxNetDeviceSelectors for resource bf3_p1_sf is [0x40000d2680]
...
I0814 15:33:51.641690 1 manager.go:156] New resource server is created for bf3_p0_sf ResourcePool
I0814 15:33:51.641701 1 manager.go:121] Creating new ...
I0814 15:33:51.692891 1 factory.go:124] device added: [identifier: mlx5_core.sf.5, vendor: 15b3, device: a2dc, driver: mlx5_core]
I0814 15:33:51.692939 1 manager.go:156] New resource server is created for bf3_p1_sf ResourcePool
```
Examine the DPU node to ensure that it has the correct resources available.

```shell
kubectl describe node localhost.localdomain
```
Sample Response:
```
Name:        localhost.localdomain
Capacity:
  nvidia.com/bf3_p0_sf:  1
  nvidia.com/bf3_p1_sf:  1
Allocatable:
  nvidia.com/bf3_p0_sf:  1
  nvidia.com/bf3_p1_sf:  1
```
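Once the node advertises the resources, a workload can request an SF by its resource name. A minimal sketch, assuming a hypothetical pod name and a generic busybox image (attaching the SF to a secondary network additionally requires the Multus annotation covered in the next steps):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sf-test-pod            # illustrative name
spec:
  containers:
  - name: app
    image: busybox             # illustrative image
    command: ["sleep", "infinity"]
    resources:
      requests:
        nvidia.com/bf3_p0_sf: "1"   # request one SF from the p0 pool
      limits:
        nvidia.com/bf3_p0_sf: "1"
```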
Next Steps:
- Update the resource name in the Multus Network Attachment Definition as per your network configuration.
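As a sketch of that next step, a Multus NetworkAttachmentDefinition can be tied to an advertised resource through the k8s.v1.cni.cncf.io/resourceName annotation; the network name and IPAM settings below are illustrative assumptions, not part of this guide:

```yaml
apiVersion: k8s.cni.cncf.io/v1
kind: NetworkAttachmentDefinition
metadata:
  name: sf-p0-net              # illustrative name
  annotations:
    # Binds this network to the resource pool created above.
    k8s.v1.cni.cncf.io/resourceName: nvidia.com/bf3_p0_sf
spec:
  config: '{
    "cniVersion": "0.3.1",
    "type": "sriov",
    "ipam": {
      "type": "static"         # placeholder IPAM; adjust for your network
    }
  }'
```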