Cluster Requirements

Overview

Prior to integrating Service Proxy for Kubernetes (SPK) into the OpenShift cluster, review this document to ensure the required software components are installed and properly configured.

Software support

The SPK and Red Hat software versions listed below are the tested versions. F5 recommends these versions for the best performance and installation experience.

SPK              OpenShift
1.9.2            4.14.26
1.9.1            4.14.10
1.9.0            4.12.26 and 4.14.1
1.8.2            4.12.26
1.8.0            4.12.22
1.7.10 - 1.7.11  4.12.59
1.7.5 - 1.7.9    4.12.45
1.7.4            4.12.44
1.7.3            4.12.26
1.7.2            4.12.14
1.7.1            4.12.7
1.7.0            4.12.1
1.5.0 - 1.6.1    4.10.13
1.4.12 - 1.4.15  4.8.10
1.3.1 - 1.4.11   4.7.8

Pod Networking

To support low-latency 5G workloads, SPK relies on Single Root I/O Virtualization (SR-IOV) and the Open Virtual Network with Kubernetes (OVN-Kubernetes) CNI. To ensure the cluster supports multi-homed Pods (the ability to select either the default virtual CNI or the SR-IOV / OVN-Kubernetes physical CNI), review the sections below.
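
As a brief illustration, a multi-homed Pod selects its additional networks through the Multus k8s.v1.cni.cncf.io/networks annotation. This is a minimal sketch only; the network and image names are hypothetical placeholders:

    apiVersion: v1
    kind: Pod
    metadata:
      name: multi-homed-example                        # placeholder name
      annotations:
        # Attach the Pod to two additional (physical) networks by referencing
        # their network attachment definitions; placeholder names shown here.
        k8s.v1.cni.cncf.io/networks: sriov-external,sriov-internal
    spec:
      containers:
        - name: app
          image: registry.example.com/app:latest       # placeholder image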

Network Operator

To properly manage the cluster networks, the OpenShift Cluster Network Operator must be installed.
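
As a quick verification that the operator is installed, you can check its deployment and cluster operator status; a minimal sketch, assuming cluster-admin access with the oc CLI:

    # Verify the Cluster Network Operator deployment exists
    oc get -n openshift-network-operator deployment/network-operator

    # Confirm the network cluster operator reports Available
    oc get clusteroperator/network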

Important: OpenShift 4.8 requires configuring local gateway mode using the steps below:

  1. Create the manifest files:

    openshift-install create manifests --dir=<install dir>
    
  2. Create a ConfigMap in the new manifests directory, and add the following YAML code:

    apiVersion: v1
    kind: ConfigMap
    metadata:
      name: gateway-mode-config
      namespace: openshift-network-operator
    data:
      mode: "local"
    immutable: true
    
  3. Create the cluster:

    openshift-install create cluster --dir=<install dir>
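
Once the cluster is up, you can optionally confirm that the gateway mode ConfigMap was applied; a minimal check, assuming the oc CLI is logged in to the cluster:

    oc get configmap gateway-mode-config -n openshift-network-operator -o yaml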
    

Note: See the Cluster Network Operator installation documentation on GitHub.

SR-IOV

Supported NICs

The table below lists the currently supported NICs.

NIC                     PF PCI IDs  VF PCI IDs  Firmware
Intel XL710             8086:1583   8086:154c   NVM 8.5
Intel XXV710            8086:158b   8086:154c   NVM 8.5
Mellanox ConnectX-5     15b3:1017   15b3:1018   16.31.2006
Mellanox ConnectX-6     15b3:101b   15b3:101c   26.33.1048
Mellanox ConnectX-6 Dx  15b3:101d   15b3:101e   22.33.1048
Mellanox ConnectX-6 Lx  15b3:101f   15b3:101e   26.33.1048

VF Configuration

To define the SR-IOV Virtual Functions (VFs) used by the Service Proxy Traffic Management Microkernel (TMM), configure the following OpenShift network objects:

  • An external and an internal network node policy.
  • An external and an internal network attachment definition (see the sketch following this list):
    • Set the spoofChk parameter to off.
    • Set the trust parameter to on.
    • Set the capabilities parameter to '{"mac": true, "ips": true}'.
    • Do not set the vlan parameter; set the F5SPKVlan tag parameter instead.
    • Do not set the ipam parameter; set the F5SPKVlan internal parameter instead.
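
The sketch below shows a minimal external network attachment definition reflecting these settings; the metadata name and resourceName annotation are hypothetical placeholders, and the authoritative examples are in the SPK Config File Reference:

    apiVersion: "k8s.cni.cncf.io/v1"
    kind: NetworkAttachmentDefinition
    metadata:
      name: sriov-external                                            # placeholder name
      annotations:
        k8s.v1.cni.cncf.io/resourceName: openshift.io/sriov_external  # placeholder resource
    spec:
      # No "vlan" or "ipam" keys: the F5SPKVlan tag and internal
      # parameters carry those settings instead.
      config: '{
        "cniVersion": "0.3.1",
        "type": "sriov",
        "capabilities": {"mac": true, "ips": true},
        "spoofchk": "off",
        "trust": "on"
      }'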

Note: Refer to the SPK Config File Reference for examples.

CPU Allocation

Multiprocessor servers divide memory and CPUs into multiple NUMA nodes, each having a non-shared system bus. When installing the SPK Controller, the CPUs and SR-IOV VFs allocated to the Service Proxy TMM container must share the same NUMA node. To ensure the cluster handles CPU NUMA node alignment properly, ensure the following parameters are set (see the sketch after this list):

  • Set the Performance Profile’s Topology Manager Policy to single-numa-node.
  • Set the Kubelet configuration’s CPU Manager Policy to static.
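
This is a minimal sketch of the two settings; the resource names and CPU ranges are illustrative assumptions, not values to copy verbatim:

    # PerformanceProfile fragment: set the Topology Manager policy
    apiVersion: performance.openshift.io/v2
    kind: PerformanceProfile
    metadata:
      name: numa-aligned-profile       # placeholder name
    spec:
      cpu:
        isolated: "4-39"               # illustrative CPU ranges
        reserved: "0-3"
      numa:
        topologyPolicy: single-numa-node
    ---
    # KubeletConfig fragment: enable the static CPU Manager policy
    apiVersion: machineconfiguration.openshift.io/v1
    kind: KubeletConfig
    metadata:
      name: cpumanager-enabled         # placeholder name
    spec:
      machineConfigPoolSelector:
        matchLabels:
          pools.operator.machineconfiguration.openshift.io/worker: ""
      kubeletConfig:
        cpuManagerPolicy: static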

Simultaneous Multithreading (SMT)

Simultaneous multithreading (SMT) is a technique used on modern processors to improve workload efficiency by allowing multiple independent threads to run on a single physical core at the same time.

SMT generally benefits performance and can increase the efficiency of most systems; for this reason, it is often enabled by default in the kernel where the hardware supports it. However, performance can be reduced in some circumstances, notably when an application’s threads create a bottleneck that the CPU design does not handle well. In such cases, you may want to disable SMT.

  • Disable SMT - To install Pods with Guaranteed QoS, each OpenShift worker node must have Simultaneous Multithreading (SMT) disabled in the BIOS.
  • Enable SMT - If you choose to enable SMT on an existing cluster, OpenShift version 4.12.10 is recommended.

F5 recommends disabling SMT for the best latency and performance.

To install SPK with SMT enabled, you must activate the full-pcpus-only CPU Manager policy option in Kubernetes. This is achieved by adding the kubeletconfig.experimental annotation shown in the example performance-profile.yaml file below.

  • Edit the current performance-profile.yaml file and remove the nosmt entry under spec.additionalKernelArgs:

    apiVersion: performance.openshift.io/v2
    kind: PerformanceProfile
    metadata:
      annotations:
        kubeletconfig.experimental: "{\"cpuManagerPolicyOptions\": {\"full-pcpus-only\": \"true\"}}"
      labels:
        machineconfiguration.openshift.io/role: worker
      name: performance-profile
    spec:
      additionalKernelArgs:
        - nmi_watchdog=0
        - audit=0
        - mce=off
        - processor.max_cstate=1
        - idle=poll
        - intel_idle.max_cstate=0
      cpu:
        balanceIsolated: false
        isolated: 4-39,44-79
        reserved: 0-3,40-43
      hugepages:
        defaultHugepagesSize: 2M
        pages:
          - count: 32768
            node: 0
            size: 2M
          - count: 32768
            node: 1
            size: 2M
      machineConfigPoolSelector:
        pools.operator.machineconfiguration.openshift.io/worker: ""
      nodeSelector:
        node-role.kubernetes.io/worker: ""
      numa:
        topologyPolicy: single-numa-node
    

Warning: Enabling the full-pcpus-only option affects every Guaranteed QoS Pod in the cluster. All Guaranteed QoS Pods must request SMT-aligned CPU counts (for example, a minimum of 2, then 4, and so on) to avoid the SMTAlignmentError scheduling error. For more information, see CPU Management Policies.
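
For example, a Guaranteed QoS Pod with an SMT-aligned CPU request might look like the following sketch; the Pod and image names are hypothetical placeholders:

    apiVersion: v1
    kind: Pod
    metadata:
      name: guaranteed-qos-example                     # placeholder name
    spec:
      containers:
        - name: app
          image: registry.example.com/app:latest       # placeholder image
          resources:
            # Requests must equal limits for Guaranteed QoS, and the CPU
            # count must be SMT-aligned (a multiple of 2 here) to avoid
            # SMTAlignmentError under full-pcpus-only.
            requests:
              cpu: "2"
              memory: 2Gi
            limits:
              cpu: "2"
              memory: 2Gi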

CVE Summary

CVE, short for Common Vulnerabilities and Exposures, is a list of publicly disclosed computer security flaws. When someone refers to a CVE, they mean a security flaw that’s been assigned a CVE ID number.

How to check whether a Red Hat kernel has been patched for a CVE

There are a few ways to check whether a specific kernel has been patched for a specific CVE. If you have the RPM, you can use the rpm command to check the change log and grep for the CVE ID.

Example:

rpm -qp kernel-3.10.0-862.11.6.el7.x86_64.rpm --changelog | grep CVE-2017-12190

If the kernel package for the kernel in question is in a repo that is configured and enabled on your server, you could use yum as follows:

yum list --cve CVE-2017-12190 | grep kernel.x86_64
kernel.x86_64                    3.10.0-327.22.2.el7     @rhel-7-server-eus-rpms
kernel.x86_64                    3.10.0-514.2.2.el7      @rhel-7-server-rpms    
kernel.x86_64                    3.10.0-693.2.2.el7      @rhel-7-server-rpms    
kernel.x86_64                    3.10.0-862.14.4.el7     rhel-7-server-rpms 

This output shows that the kernels listed above include patches for CVE-2017-12190.

Verifying whether an OpenShift version is vulnerable to a CVE

Red Hat OpenShift Container Platform (RHOCP) 4 includes a fully managed node operating system, Red Hat Enterprise Linux CoreOS, commonly referred to as RHCOS.

The OpenShift cluster updates RHCOS when applying cluster updates, sometimes moving between RHEL minor releases. The table below shows which RHEL minor version underlies each currently supported RHCOS/OCP version:

RHCOS/OCP Version  RHEL Version
4.6                RHEL 8.2
4.7.0 - 4.7.23     RHEL 8.3
4.7.24+            RHEL 8.4
4.8                RHEL 8.4
4.9                RHEL 8.4
4.10               RHEL 8.4
4.11               RHEL 8.6
4.12               RHEL 8.6
4.13               RHEL 9.2

OCP 4.12 uses RHEL 8.6. Because the patches above were applied to RHEL 8, they are also available in OCP 4.12.
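
To check which RHCOS/RHEL image your own nodes are running, you can inspect the node OS image; a minimal check using standard oc commands:

    # The OS-IMAGE column shows the RHCOS version of each node
    oc get nodes -o wide

    # Or query the osImage field directly
    oc get nodes -o jsonpath='{.items[*].status.nodeInfo.osImage}'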

Mitigations

  • Mitigations generally involve applying Intel microcode patches, kernel/OS patches, or compiler mitigations such as “return trampoline” (retpoline).
  • Another common mitigation is to ensure that no untrusted process or Deployment shares the same physical CPU core; configure each Deployment to use two or more whole cores (see the Guaranteed QoS Pod sketch earlier in this document).
  • When the full-pcpus-only policy option is specified along with the static CPU Manager policy, an additional check in the static policy’s allocation logic ensures that CPUs are allocated as full cores. Because of this check, a Pod never has to acquire single threads to fill partially allocated cores.
  • If the Kubernetes instance is configured properly with the static CPU policy and the full-pcpus-only policy option, then when TMM starts with the correct CPU resource count (for example, 2), it is assigned both threads of the same core, meaning it owns the whole core. No thread of that core is assigned to another workload.
  • However, behavior still depends on how Kubernetes is configured and how TMM starts. There are two stages:
    1. When everything is configured properly and TMM holds two threads of the same core, mapres currently detects these two threads as separate cores and therefore starts two TMM threads. This initial mode should carry a warning about the performance implications.
    2. In a future implementation, when TMM starts with two threads, mapres (or a similar component) detects SMT, validates that the threads belong to the same core, and uses only one of them; that is, it starts only a single TMM thread per core, effectively utilizing the entire core.

Persistent storage

The required Fluentd logging collector, dSSM database, and Traffic Management Microkernel (TMM) debug sidecar require available Kubernetes persistent storage to bind to during installation.
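
As an illustrative sketch only (the storage class, capacity, and path are assumptions that will differ per cluster), a local PersistentVolume that these components could bind to might look like:

    apiVersion: v1
    kind: PersistentVolume
    metadata:
      name: spk-storage-pv                   # placeholder name
    spec:
      capacity:
        storage: 10Gi                        # illustrative size
      accessModes:
        - ReadWriteOnce
      persistentVolumeReclaimPolicy: Retain
      storageClassName: local-storage        # placeholder storage class
      hostPath:
        path: /var/lib/spk-storage           # illustrative local path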

Feedback

Provide feedback to improve this document by emailing spkdocs@f5.com.

Supplemental