Troubleshooting DNS/NAT46

Overview

The Service Proxy for Kubernetes (SPK) DNS/NAT46 feature is part of the F5SPKEgress Custom Resource (CR) and enables connectivity between internal IPv4 Pods and external IPv6 hosts. The DNS/NAT46 feature relies on a number of basic networking configurations to translate IPv4 and IPv6 connections successfully. If you have configured the DNS/NAT46 feature and are unable to translate between hosts, use this document to identify the missing or improperly configured networking components.

Configuration review

Review the points below to ensure the essential DNS/NAT46 configuration components are in place; commands to help verify these items follow the list:

  • You must enable Intelligent CNI 2 (iCNI2) when installing the Ingress Controller.
  • You must have an associated F5SPKDnscache CR.
  • The IP address defined in the dnsNat46PoolIps parameter must not be reachable by internal Pods.
  • The dSSM Database Pods must be installed.
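
Most of these points can be checked from the command line. The commands below are a sketch: the F5SPKDnscache resource name and the spk-utilities Project used for the dSSM Pods are assumptions, so adjust them for your environment:

    # Discover the resource name registered for the F5SPKDnscache CRD,
    # then list any instances in the Ingress Controller Project.
    oc api-resources | grep -i dnscache
    oc get f5spkdnscache -n spk-ingress

    # Confirm the dSSM database and Sentinel Pods are running
    # (spk-utilities is an assumed Project name).
    oc get pods -n spk-utilities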

Requirements

Before getting started, ensure the Debug Sidecar is enabled (the default behavior).
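
To confirm the sidecar is present, list the containers in the f5-tmm Deployment; a container named debug is expected (the container name is an assumption based on a default installation):

    oc get deploy/f5-tmm -n spk-ingress \
      -o jsonpath='{.spec.template.spec.containers[*].name}{"\n"}'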

Procedure

Use the steps below to verify the required networking components are present and correctly configured.

  1. Switch to the Ingress Controller and Service Proxy TMM Project:

    oc project <project>
    

    In this example, the Ingress Controller is installed in the spk-ingress Project:

    oc project spk-ingress
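
    Before continuing, you can confirm the Service Proxy TMM Pod is running in the selected Project:

    oc get pods | grep f5-tmm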
    
  2. Obtain Service Proxy TMM’s IPv4 and IPv6 routing tables:

    A. Obtain the IPv4 routing table:

    oc exec -it deploy/f5-tmm -- ip r
    

    The command output should resemble the following:

    default via 169.254.0.254 dev tmm
    10.20.2.0/24 dev external-1 proto kernel scope link src 10.20.2.207
    10.130.0.0/23 dev eth0 proto kernel scope link src 10.130.0.9
    10.144.175.0/24 dev internal proto kernel scope link src 10.144.175.231
    

    B. Obtain the IPv6 routing table:

    oc exec -it deploy/f5-tmm -- ip -6 r
    

    The command output should resemble the following:

    2002::10:20:2:0/112 dev external-2 proto kernel metric 256 pref medium
    2002::/32 via 2002::10:20:2:206 dev external-2 metric 1024 pref medium
    
  3. Verify that the IP address defined in the F5SPKEgress CR dnsNat46PoolIps parameter is reachable from TMM:

    A. In this example, the dnsNat46PoolIps parameter is set to 10.10.2.100 and should be accessible via the external-1 interface. The routing table below shows that the IP address is not reachable:

    default via 169.254.0.254 dev tmm
    10.20.2.0/24 dev external-1 proto kernel scope link src 10.20.2.207
    10.130.0.0/23 dev eth0 proto kernel scope link src 10.130.0.9
    10.144.175.0/24 dev internal proto kernel scope link src 10.144.175.231
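
    You can also ask the routing stack directly which route, if any, would be used to reach the pool IP address; when no route exists, the command reports that the network is unreachable:

    oc exec -it deploy/f5-tmm -- ip route get 10.10.2.100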
    

    B. If the IP address is not reachable, copy the example F5SPKStaticRoute CR below to a file named staticroute-dns.yaml:

    apiVersion: "k8s.f5net.com/v1"
    kind: F5SPKStaticRoute
    metadata:
      name: "staticroute-dns"
      namespace: spk-ingress
    spec:
      destination: 10.10.2.100
      prefixLen: 32
      type: gateway
      gateway: 10.20.2.206
    

    C. Install the static route to enable reachability:

    oc apply -f staticroute-dns.yaml
    

    D. After installing the F5SPKStaticRoute CR, repeat Step 2 above to verify that a route for 10.10.2.100 has been added and the IP address is now reachable:

    default via 169.254.0.254 dev tmm
    10.10.2.100 via 10.20.2.206 dev external-1
    10.20.2.0/24 dev external-1 proto kernel scope link src 10.20.2.207
    10.130.0.0/23 dev eth0 proto kernel scope link src 10.130.0.9
    10.144.175.0/24 dev internal proto kernel scope link src 10.144.175.231
    
  4. If the external IPv6 application is still not accessible, packet captures with tcpdump are required. Obtain the Service Proxy TMM interface information:

    oc exec -it deploy/f5-tmm -- ip a | grep -i '<interface names>:' -A2
    

    In this example, three interfaces are filtered: internal, external-1, and external-2:

    oc exec -it deploy/f5-tmm -- ip a | grep 'internal:\|external-1:\|external-2:' -A2
    
      7: external-1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
        link/ether a6:73:48:a4:de:cd brd ff:ff:ff:ff:ff:ff
        inet 10.20.2.207/24 brd 10.20.2.0 scope global external-1
    --
      8: internal: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
        link/ether 12:f3:10:d0:47:f7 brd ff:ff:ff:ff:ff:ff
        inet 10.144.175.231/24 brd 10.144.175.0 scope global internal
    --
      9: external-2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500
        link/ether a6:73:48:a4:de:cd brd ff:ff:ff:ff:ff:ff
        inet6 2002::10:20:2:207/112 scope global
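
    Alternatively, iproute2's brief mode prints one line per interface, which can be easier to scan (this assumes a reasonably recent iproute2 build in the image):

    oc exec -it deploy/f5-tmm -- ip -br a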
    
  5. Enter the Service Proxy TMM debug sidecar:

    oc exec -it deploy/f5-tmm -c debug -- bash
    
  6. Start tcpdump on the external IPv4 interface, filter for DNS packets on port 53, and connect from the internal Pod:

    tcpdump -ni <external IPv4 interface> port 53 
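
    While the capture is running, trigger a lookup from one of the internal IPv4 Pods. The Pod name and the availability of nslookup inside the Pod are assumptions; any DNS client will do:

    oc exec -it <internal pod> -n <pod project> -- nslookup ipv6.f5.com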
    

    In this example, the DNS server 10.10.2.101 is not responding on the external-1 interface:

    tcpdump -ni external-1 port 53
    
    listening on external-1, link-type EN10MB (Ethernet), capture size 65535 bytes
    16:25:09.230728 IP 10.10.2.101.36227 > 10.20.2.206.53: 41724+ AAAA? ipv6.f5.com. (33) out slot1/tmm1
    16:25:09.230746 IP 10.10.2.101.36227 > 10.20.2.206.53: 8954+ A? ipv6.f5.com. (33) out slot1/tmm1
    16:25:09.235973 IP 10.10.2.101.46877 > 10.20.2.206.53: 8954+ A? ipv6.f5.com. (33) out slot1/tmm0
    16:25:09.235987 IP 10.10.2.101.46877 > 10.20.2.206.53: 41724+ AAAA? ipv6.f5.com. (33) out slot1/tmm0
    

    After configuring the DNS server to respond on the proper interface, the internal Pod receives a response:

    Note: The 10.2.2.1 IP address is issued by TMM from the dnsNat46Ipv4Subnet.

    16:27:19.183862 IP 10.128.3.218.55087 > 1.2.3.4.53: 30790+ A? ipv6.f5.com. (32) in slot1/tmm1 
    16:27:19.183892 IP 10.128.3.218.55087 > 1.2.3.4.53: 2377+ AAAA? ipv6.f5.com. (32) in slot1/tmm1 
    16:27:19.238302 IP 1.2.3.4.53 > 10.128.3.218.55087: 30790* 1/1/0 A 10.2.2.1 (93) out slot1/tmm1 lis=egress-dns-ipv4
    16:27:19.238346 IP 1.2.3.4.53 > 10.128.3.218.55087: 2377* 1/0/0 AAAA 2002::10:20:2:216 (60) out slot1/tmm1 lis=egress-dns-ipv4
    
  7. If DNS/NAT46 translation is still not successful, start tcpdump on the external IPv6 interface and filter for application packets by service port:

    tcpdump -ni <external IPv6 interface> port <service port>
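
    As with the DNS capture, trigger the application traffic from an internal Pod while tcpdump is running; the availability of curl in the Pod is an assumption:

    oc exec -it <internal pod> -n <pod project> -- curl -v http://ipv6.f5.com/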
    

    In this example, the Pod attempts a connection to application service port 80, and the connection is reset (the [R.] flag in the second packet):

    23:07:48.407393 IP6 2002::10:20:2:101.43266 > 2002::10:20:2:216.80: Flags [S], seq 3294182200, win 26580, 
    23:07:48.410721 IP6 2002::10:20:2:216.80 > 2002::10:20:2:101.43266: Flags [R.], seq 0, ack 3294182201, win 0, 
    

    The application service was not exposed in the remote cluster. After exposing the service, the client receives a response on service port 80:

    23:12:59.250111 IP6 2002::10:20:2:101.57914 > 2002::10:20:2:216.80: Flags [S], seq 991607777, win 26580, 
    23:12:59.251822 IP6 2002::10:20:2:216.80 > 2002::10:20:2:101.57914: Flags [S.], seq 3169072611, ack 991607778, win 14400, 
    23:12:59.254113 IP6 2002::10:20:2:101.57914 > 2002::10:20:2:216.80: Flags [.], ack 1, win 208, 
    23:12:59.255245 IP6 2002::10:20:2:101.57914 > 2002::10:20:2:216.80: Flags [P.], seq 1:142, ack 1, win 208, 
    23:12:59.256931 IP6 2002::10:20:2:216.80 > 2002::10:20:2:101.57914: Flags [.], ack 142, win 14541, 
    23:12:59.258614 IP6 2002::10:20:2:216.80 > 2002::10:20:2:101.57914: Flags [P.], seq 1:1429, ack 142, win 14541, 
    23:12:59.265990 IP6 2002::10:20:2:101.57914 > 2002::10:20:2:216.80: Flags [F.], seq 142, ack 3760, win 623, 
    23:12:59.268233 IP6 2002::10:20:2:216.80 > 2002::10:20:2:101.57914: Flags [.], ack 143, win 14541, 
    23:12:59.268246 IP6 2002::10:20:2:216.80 > 2002::10:20:2:101.57914: Flags [F.], seq 3760, ack 143, win 14541, 
    23:12:59.269932 IP6 2002::10:20:2:101.57914 > 2002::10:20:2:216.80: Flags [.], ack 3761, win 623, 
    
  8. If DNS/NAT46 translation is still not successful, view the Service Proxy TMM logs.

    Note: If you enabled Fluentd Logging, refer to the Viewing Logs section for assistance.
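
    If you are not using Fluentd, you can read the TMM logs directly from the Pod; the f5-tmm container name is an assumption and may differ in your installation:

    oc logs deploy/f5-tmm -c f5-tmm | grep -i redis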

    In this example, the SESSIONDB_EXTERNAL_SERVICE (Sentinel Service object name) is misspelled in the Ingress Controller Helm values file:

    {"type":"tmm0","pod_name":"f5-tmm","log":"redis_dns_resolver_cb/177: DNS resolution failed for type=1 with rcode=3 rr=0\nredis_reconnect_later/901: Scheduling REDIS connect: 2\n"
    

    After correcting the Sentinel Service object name and reinstalling the Ingress Controller, TMM is able to connect to the dSSM database:

    {"type":"tmm0","pod_name":"f5-tmm","log":"redis_sentinel_connected/687: Connecion establishment with REDIS SENTINEL server successful\n",
    

    Other errors may be evident when viewing the egress-ipv4-dns46-irule events. A successful DB entry begins and ends with the following messages:

    {"type":"tmm0","pod_name":"f5-tmm","log":"<134>f5-tmm-84d46ddcb6-bskbb -l=32[19]: Rule egress-ipv4-dns46-irule <CLIENT_ACCEPTED>: <191> DNS46 (10.128.0.29) debug  ***** iRule: Simple DNS46 v0.6 executed *****\n"
    
    {"type":"tmm0","pod_name":"f5-tmm","log":"<134>f5-tmm-84d46ddcb6-bskbb -l=32[19]: Rule egress-ipv4-dns46-irule <DNS_RESPONSE>: <191> DNS46 (10.128.0.29) debug  ***** iRule: Simple DNS46 v0.6 successfully completed *****\n"
    

Feedback

Provide feedback to improve this document by emailing spkdocs@f5.com.