SPK Fixes and Known Issues

This list highlights fixes and known issues for this SPK release.

SPK Release Notes

SPK Release Information

Version: 2.0.0
Build: 2.0.0

Note: This content is current as of the software release date.
Updates to bug information occur periodically. For the most up-to-date bug data, see Bug Tracker.


Cumulative fixes from SPK v2.0.0 that are included in this release
Known Issues in SPK v2.0.0



Cumulative fixes from SPK v2.0.0 that are included in this release

ID Number Severity Links to More Info Description
1824317-1 3-Major BT1824317 Controller fails to discover endpoints from services that use named ports

Cumulative fix details for SPK v2.0.0 that are included in this release


1824317-1 : Controller fails to discover endpoints from services that use named ports

Links to More Info: BT1824317

Component: SPK

Symptoms:
The controller may not be able to find the target port from the endpoints and application deployments. When this happens, the controller does not configure TMM for the respective ingress TCP CR, and the CR status may reflect a "False" state.

Conditions:
There are three conditions which could lead to this issue:

1. There may be a discrepancy in the pod and endpoint caches maintained by the controller. The caching mechanism is provided by the Kubernetes client-go library. If the pod cache is not updated and the service uses a named port, the controller may not resolve the target port. If the controller resolves the target port from the pod cache but the endpoint cache is not updated, the controller may still fail to find endpoints matching the port number. This can only happen if the pod or endpoint cache never synchronizes with the Kubernetes API server due to environmental issues, problems with application pods, and so on.

2. The app's pods or endpoints might be in a faulty or inconsistent state. This could lead to incorrect updates of the pod and endpoint resources on the Kubernetes API server.

3. The Kubernetes API server itself might not be updating the pod or endpoint resources correctly due to environmental factors or underlying infrastructure issues.
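For context, a "named port" means the Service's targetPort refers to a port name defined in the pod spec rather than a numeric port, so the controller must resolve the name before it can discover matching endpoints. A minimal sketch (all names and numbers are illustrative):

```yaml
# Illustrative only: a Service whose targetPort is a name, not a number
apiVersion: v1
kind: Service
metadata:
  name: my-app-service
spec:
  selector:
    app: my-app
  ports:
    - port: 8080
      targetPort: web        # named port; must be resolved via the pod spec below
---
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  labels:
    app: my-app
spec:
  containers:
    - name: app
      image: my-app:latest
      ports:
        - name: web          # the name the Service's targetPort refers to
          containerPort: 9376
```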

Impact:
The controller may fail to configure TMM with ingress configuration even if an ingress TCP CR exists, application pods are deployed, and they appear to be running fine.

Workaround:
The first workaround is to scale the application deployment down and back up, which triggers new events so the pod and endpoint caches in the controller can synchronize with the Kubernetes API server.

The second workaround is to scale the controller down and back up. That way, the controller reprocesses all the events related to the CR, application service, endpoints, and so on when it comes back up, and its caches can synchronize.
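The two scaling workarounds above can be sketched with kubectl; the deployment and namespace names here are illustrative and must be replaced with those from your environment:

```shell
# Workaround 1: bounce the application deployment to regenerate pod/endpoint events
kubectl scale deployment my-app --replicas=0 -n my-watched-namespace
kubectl scale deployment my-app --replicas=2 -n my-watched-namespace

# Workaround 2: bounce the SPK controller so it rebuilds its caches on startup
# (controller deployment name is illustrative; check your SPK namespace)
kubectl scale deployment f5-tmm-controller --replicas=0 -n spk-namespace
kubectl scale deployment f5-tmm-controller --replicas=1 -n spk-namespace
```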

Fix:
The fix below is made under the assumption that the issue stems from a cache synchronization problem. However, if the root cause is an application or environmental issue, the CNE controller will not be able to resolve it.

Replaced the implementation that relies on the application pod to retrieve the target port. Instead, the same information is obtained directly from the endpoints/endpointSlices resource; the controller now relies only on endpoints/endpointSlices resources to perform service discovery.

If an endpoint belonging to the service of interest lacks the port number specified in the service, the controller records the endpoint details stored in its cache to collect useful data for debugging purposes. In that case, the controller also retrieves the endpoint details directly from the Kubernetes API server and logs the information.
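To check what the controller sees after this fix, you can inspect the EndpointSlices for a service directly and confirm they carry the expected port name and number. The service and namespace names below are illustrative; the `kubernetes.io/service-name` label is the standard label linking an EndpointSlice to its Service:

```shell
# List the EndpointSlices backing the service and show their ports/addresses
kubectl get endpointslices -n my-watched-namespace \
  -l kubernetes.io/service-name=my-app-service -o yaml
```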



Known Issues in SPK v2.0.0


SPK Issues

ID Number Severity Links to More Info Description
1926357-1 3-Major Pods stuck in "ContainerCreating" status in SPK-watched namespace
1926205-2 3-Major Modifying destination address and destination port in F5SPKIngressTCP and F5SPKIngressUDP CRs does not update
1890689-1 3-Major Undesired egress routing from pods in SPK-watched namespaces with ICNI2 and MEG enabled
1889425-1 3-Major Egress traffic fails during an f5-tmm pod restart event
1933273 3-Major The Helm chart installation fails for f5-crdconversion when using a custom service account due to missing ownership metadata

Known Issue details for SPK v2.0.0

1926357-1 : Pods stuck in "ContainerCreating" status in SPK-watched namespace

Component: SPK

Symptoms:
Pods in an SPK-watched namespace might remain in the "ContainerCreating" status due to a misconfiguration in the SPK or the AdminPolicyBasedExternalRoute custom resource.

Conditions:
There are four conditions which could lead to this issue:

1. SPK is deployed with f5-tmm.tmm.icni2.enabled set to true and f5-tmm.tmm.ovn_meg.enabled set to false in the Helm overrides file.

2. One or more namespaces are listed in the controller.watchNamespace field of the Helm overrides file.

3. An AdminPolicyBasedExternalRoute CR is applied with one of the watched namespaces listed in the from section and f5-tmm in the nextHops section.

4. A deployment is applied in the same watched namespace named in the AdminPolicyBasedExternalRoute CR.
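A minimal sketch of such an AdminPolicyBasedExternalRoute CR is shown below. The field names follow the OVN-Kubernetes API for this CRD; all names, labels, and the network-attachment reference are illustrative and should be verified against your cluster's CRD and SPK deployment:

```yaml
# Illustrative only: routes egress from a watched namespace through f5-tmm
apiVersion: k8s.ovn.org/v1
kind: AdminPolicyBasedExternalRoute
metadata:
  name: spk-egress-route
spec:
  from:
    namespaceSelector:
      matchLabels:
        kubernetes.io/metadata.name: my-watched-namespace
  nextHops:
    dynamic:
      - podSelector:
          matchLabels:
            app: f5-tmm            # selects the f5-tmm pod as the next hop
        namespaceSelector:
          matchLabels:
            kubernetes.io/metadata.name: spk-namespace
        networkAttachmentName: spk-namespace/tmm-net   # illustrative attachment
```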

Impact:
Pods in watched namespaces will never reach the Ready state and are completely inoperative.

Workaround:
Delete all AdminPolicyBasedExternalRoute CRs referring to the watched namespace, or re-install the SPK with f5-tmm.tmm.ovn_meg.enabled set to true.
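The first workaround can be sketched as follows; the CR name is illustrative, and you should delete only the CRs whose from section references the affected watched namespace:

```shell
# Find AdminPolicyBasedExternalRoute CRs (cluster-scoped) and inspect their 'from' sections
kubectl get adminpolicybasedexternalroutes
kubectl get adminpolicybasedexternalroute spk-egress-route -o yaml

# Delete the CR(s) referring to the watched namespace
kubectl delete adminpolicybasedexternalroute spk-egress-route
```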


1926205-2 : Modifying destination address and destination port in F5SPKIngressTCP and F5SPKIngressUDP CRs does not update

Component: SPK

Symptoms:
When a destination address or destination port is modified in F5SPKIngressTCP or F5SPKIngressUDP CRs, the updated value is not sent to TMM.

Conditions:
Modify the destination address or destination port in an existing F5SPKIngressTCP or F5SPKIngressUDP CR.

Impact:
The updated value is not sent to TMM.

Workaround:
Delete and recreate the CR.
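The delete-and-recreate workaround can be sketched as below. The lowercase resource name, CR name, namespace, and manifest file are all illustrative; check the exact resource name with `kubectl api-resources` in your cluster:

```shell
# Delete the stale CR, then re-apply the manifest containing the updated
# destination address/port so the controller pushes the new value to TMM
kubectl delete f5spkingresstcp my-ingress -n my-watched-namespace
kubectl apply -f my-ingress-updated.yaml -n my-watched-namespace
```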


1890689-1 : Undesired egress routing from pods in SPK-watched namespaces with ICNI2 and MEG enabled

Component: SPK

Symptoms:
TMM should act as the external gateway for egress traffic only for pods in namespaces matched by an AdminPolicyBasedExternalRoute CR. However, when both f5-tmm.tmm.icni2.enabled and f5-tmm.tmm.ovn_meg.enabled are set to true in the Helm overrides file, all pods in watched namespaces can egress through the f5-tmm pod, even if they do not have a matching AdminPolicyBasedExternalRoute CR.

Conditions:
There are four conditions which could lead to this issue:

1. SPK is installed with multiple application namespaces in controller.watchNamespace, and both f5-tmm.tmm.icni2.enabled and f5-tmm.tmm.ovn_meg.enabled are set to true in the Helm overrides file.

2. An F5SPKEgress CR is applied to the SPK namespace.

3. AdminPolicyBasedExternalRoute CR(s) are applied with only some of the watched namespaces in the from section.

4. One or more watched namespaces do not have a matching AdminPolicyBasedExternalRoute CR.

Impact:
All pods in watched namespaces egress through the f5-tmm pod, despite not all namespaces having an AdminPolicyBasedExternalRoute CR listing them in the from section.

Workaround:
Reinstall SPK with f5-tmm.tmm.icni2.enabled set to false in the Helm overrides file. This ensures that only watched namespaces with a matching AdminPolicyBasedExternalRoute CR can egress through the f5-tmm pod.
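The values implied by this workaround can be sketched as a Helm overrides fragment. The nesting is inferred from the dotted option names used above (f5-tmm.tmm.icni2.enabled, f5-tmm.tmm.ovn_meg.enabled); verify it against your chart's values.yaml before reinstalling:

```yaml
# Illustrative Helm overrides fragment for the reinstall
f5-tmm:
  tmm:
    icni2:
      enabled: false   # disable ICNI2 so only APBExternalRoute-matched namespaces egress via f5-tmm
    ovn_meg:
      enabled: true
```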


1889425-1 : Egress traffic fails during an f5-tmm pod restart event

Component: SPK

Symptoms:
Egress traffic fails while one or more f5-tmm pods are restarting.

Conditions:
There are two conditions which could lead to this issue:

- SPK is installed and configured to handle egress traffic from watched namespaces.

- One or more f5-tmm pods crash or are restarted by Kubernetes.

Impact:
Egress traffic might fail because traffic is routed to an f5-tmm pod that is restarting.

Workaround:
None


1933273 : The Helm chart installation fails for f5-crdconversion when using a custom Service Account (SA) due to missing ownership metadata

Component: SPK

Symptoms:

The f5-crdconversion Helm chart fails to deploy when a custom Service Account (SA) is specified during installation. Helm requires ownership metadata labels and annotations (app.kubernetes.io/managed-by, meta.helm.sh/release-name, and meta.helm.sh/release-namespace) to be present. However, the chart does not automatically add these to the custom SA, causing Helm to reject the deployment. To complete the installation successfully, you must manually add the required labels and annotations to the custom SA.

Conditions:
There are three conditions which could lead to this issue:

  1. The following configuration is used in the Helm chart (values.yaml):

        crdconversion:
          serviceAccount:
            name: f5-sa-rdt

  2. The custom Service Account is created manually:

        kubectl create sa f5-sa-rdt

  3. The Helm chart is installed with the custom Service Account:

        helm install f5-crdconversion f5-crdconversion -f f5-crdconversion/values.yaml

Impact:
The f5-crdconversion Helm chart fails to deploy when a custom Service Account (SA) is specified during installation, with the following error:

    Error: INSTALLATION FAILED: Unable to continue with install: ServiceAccount "f5-sa-rdt" in namespace "default" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "f5-crdconversion"; annotation validation error: missing key "meta.helm.sh/release-namespace": must be set to "default"

Workaround:

  1. Label the custom Service Account:

        kubectl label serviceaccount f5-sa-rdt -n default app.kubernetes.io/managed-by=Helm --overwrite

  2. Annotate the custom Service Account:

        kubectl annotate serviceaccount f5-sa-rdt -n default \
        meta.helm.sh/release-name=f5-crdconversion \
        meta.helm.sh/release-namespace=default --overwrite

  3. Install the Helm chart:

        helm install f5-crdconversion f5-crdconversion -f f5-crdconversion/values.yaml

