BIG-IP Next for Kubernetes Coremond
Overview
The F5 Coremond component runs as a DaemonSet on BIG-IP Next for Kubernetes. It is designed to monitor and collect kernel core files (core dumps) from processes that terminate unexpectedly. It then converts these core files into F5-specific core files for further analysis. A core file is a snapshot of a process or program’s memory and register state at the moment it crashes. These core files are crucial for performing root cause analysis as they provide detailed insights into the state of the system and the conditions leading up to the crash.
When a process stops unexpectedly, the OS generates a core file in the pod volume /var/crash, which is mapped to /home/crash/f5 on the host machine. Coremond then uses this core file to create an F5 core file. This automated crash data collection helps engineers diagnose issues quickly, improving system stability and reliability.
Prerequisites
Ensure you have the following:
A working Kubernetes cluster running on an Ubuntu OS platform.
The Apport service running to handle crash reports.
core_pattern set correctly in the host file /proc/sys/kernel/core_pattern.
platformType: other set in values.yaml.
Platform-Specific Core Patterns
Generic: Ubuntu-based platforms use Apport for crash reporting, and this pattern ensures that core dumps are handled correctly by Apport.
|/usr/share/apport/apport -p%p -s%s -c%c -d%d -P%P -u%u -g%g -- %E
OCP: OCP uses systemd-coredump to capture and process core dumps. The pattern correctly passes the process ID (%P), user ID (%u), group ID (%g), signal (%s), timestamp (%t), and other relevant metadata.
|/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h %e
Robin: Robin.io uses a traditional file-based core dump storage format, where…
/var/crash/core.%e.%p.%h.%t
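As a rough illustration of the three patterns above, the following shell sketch classifies a core_pattern string by the handler it names. The function name classify_core_pattern and the labels it prints are hypothetical and are not part of Coremond; the sketch relies only on the kernel convention that a leading | pipes the dump to a program, while anything else is treated as a file name template.

```shell
# Hypothetical helper (not part of Coremond): classify a core_pattern string.
# A leading "|" means the kernel pipes the core dump to the named program;
# any other non-empty value is a file name template.
classify_core_pattern() {
    case "$1" in
        "|"*apport*)           echo "apport" ;;
        "|"*systemd-coredump*) echo "systemd-coredump" ;;
        "|"*)                  echo "other-pipe" ;;
        "")                    echo "unknown" ;;
        *)                     echo "file" ;;
    esac
}

# Usage sketch, run on the host:
# classify_core_pattern "$(cat /proc/sys/kernel/core_pattern)"
```

On an Ubuntu host with Apport enabled, the usage line would be expected to report the Apport handler; any other result suggests the prerequisite on core_pattern is worth rechecking.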
Generate a Core File
To generate a core file, follow these steps:
Run the command to get the list of pods.
kubectl get pods -A
Sample Output
NAMESPACE NAME READY STATUS RESTARTS AGE
default cluster-cert-manager-57cc54f85-fst6r 1/1 Running 0 7m45s
default cluster-cert-manager-cainjector-575b969df5-x6rjz 1/1 Running 0 7m45s
default cluster-cert-manager-webhook-854c759f74-vhk9c 1/1 Running 0 7m45s
default f5-afm-648c784cb-c2zzv 2/2 Running 0 6m57s
default f5-cne-controller-568c7cf87c-4tkns 4/4 Running 0 6m57s
default f5-ipam-operator-7bc8dccd9-ktz2f 1/1 Running 0 7m16s
default f5-node-labeler-bdz8v 0/1 Init:0/1 0 6m58s
default f5-observer-0 2/2 Running 0 6m48s
default f5-observer-operator-59d46f69b7-fm6gw 2/2 Running 0 6m49s
default f5-observer-receiver-0 2/2 Running 0 6m49s
default f5-tmm-pn4tq 7/7 Running 0 6m42s
default flo-f5-lifecycle-operator-58958c4f77-lmbs7 2/2 Running 0 7m16s
default otel-collector-5f8c9fbd8-dqxpt 1/1 Running 0 6m42s
f5-utils f5-coremond-bnhf7 2/2 Running 0 4m51s
f5-utils f5-crdconversion-576f7b7579-4d5n2 2/2 Running 0 6m47s
f5-utils f5-dssm-db-0 3/3 Running 0 6m45s
f5-utils f5-dssm-db-1 3/3 Running 0 5m26s
f5-utils f5-dssm-db-2 3/3 Running 0 4m50s
f5-utils f5-dssm-sentinel-0 3/3 Running 0 6m47s
f5-utils f5-dssm-sentinel-1 3/3 Running 0 5m16s
f5-utils f5-dssm-sentinel-2 0/3 Pending 0 4m40s
f5-utils f5-ipam-ctlr-595c467d8d-mfs58 2/2 Running 0 6m45s
f5-utils f5-rabbit-6c7d56ddfb-87jnf 2/2 Running 0 6m50s
f5-utils f5-spk-cwc-6f89988c86-5m56n 3/3 Running 0 6m45s
f5-utils f5-toda-fluentd-bf845d465-sfm62 1/1 Running 0 6m50s
kube-system coredns-ff8999cc5-4w2rc 1/1 Running 0 7m46s
kube-system csi-nfs-controller-69dc5b4c8c-c56g8 5/5 Running 0 7m23s
kube-system csi-nfs-node-rdvsn 3/3 Running 0 7m23s
kube-system helm-install-multus-99k8d 0/1 Completed 0 7m46s
kube-system local-path-provisioner-698b58967b-j22xd 1/1 Running 0 7m46s
kube-system metrics-server-8584b5786c-b4qkq 1/1 Running 0 7m46s
kube-system multus-lkxdn 1/1 Running 0 7m32s
Run the command to get the process list.
kubectl exec f5-observer-0 -- ps aux
Sample Output
Defaulting container name to f5-observer.
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
f5docker 1 0.0 0.0 711880 3024 ? Ssl 09:45 0:00 /init
f5docker 25 0.0 0.0 3024 1200 ? S 09:45 0:00 s6-svscan -c30 -t0 /var/run/s6/services
f5docker 27 0.0 0.0 3036 1264 ? S 09:45 0:00 s6-supervise observer
f5docker 28 0.0 0.0 3036 1268 ? S 09:45 0:00 s6-supervise qkview-collect-daemon
f5docker 29 1.2 0.3 1270624 49540 ? Ssl 09:45 0:01 observer
f5docker 30 0.0 0.0 1235736 9892 ? Ssl 09:45 0:00 /usr/bin/qkview-collect-daemon
f5docker 212 0.0 0.0 7072 1592 ? Rs 09:47 0:00 ps aux
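Run the command to send signal 11 (SIGSEGV) to a process, which terminates it and generates a core dump. The last argument is the PID of the target process, taken from the process list.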
kubectl exec f5-observer-0 -- kill -11 9
Sample Output
Defaulting container name to f5-observer.
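The steps above can be combined so that the PID is looked up rather than typed by hand. This is a minimal sketch: the helper name pid_of_command is hypothetical, and it assumes the ps aux layout shown in the sample output, where the command is the eleventh column.

```shell
# Hypothetical helper: read `ps aux` output on stdin and print the PID of each
# process whose COMMAND column (field 11) exactly matches the given name.
# Lines such as "s6-supervise observer" are skipped because their field 11
# is "s6-supervise", not "observer".
pid_of_command() {
    awk -v name="$1" 'NR > 1 && $11 == name { print $2 }'
}

# Usage sketch (needs cluster access; pod and process names are taken from
# the sample output above):
# pid=$(kubectl exec f5-observer-0 -- ps aux | pid_of_command observer)
# kubectl exec f5-observer-0 -- kill -11 "$pid"
```

Matching on the exact command name avoids signalling a supervisor process that merely has the target name as an argument.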
Validate the Core File
To verify the generated core file, follow the instructions below:
Run the command to get the Coremond pod name.
kubectl get pods -A
Sample Output
NAMESPACE NAME READY STATUS RESTARTS AGE
default cluster-cert-manager-57cc54f85-fst6r 1/1 Running 0 7m45s
default cluster-cert-manager-cainjector-575b969df5-x6rjz 1/1 Running 0 7m45s
default cluster-cert-manager-webhook-854c759f74-vhk9c 1/1 Running 0 7m45s
default f5-afm-648c784cb-c2zzv 2/2 Running 0 6m57s
default f5-cne-controller-568c7cf87c-4tkns 4/4 Running 0 6m57s
default f5-ipam-operator-7bc8dccd9-ktz2f 1/1 Running 0 7m16s
default f5-node-labeler-bdz8v 0/1 Init:0/1 0 6m58s
default f5-observer-0 2/2 Running 0 6m48s
default f5-observer-operator-59d46f69b7-fm6gw 2/2 Running 0 6m49s
default f5-observer-receiver-0 2/2 Running 0 6m49s
default f5-tmm-pn4tq 7/7 Running 0 6m42s
default flo-f5-lifecycle-operator-58958c4f77-lmbs7 2/2 Running 0 7m16s
default otel-collector-5f8c9fbd8-dqxpt 1/1 Running 0 6m42s
f5-utils f5-coremond-bnhf7 2/2 Running 0 4m51s
f5-utils f5-crdconversion-576f7b7579-4d5n2 2/2 Running 0 6m47s
f5-utils f5-dssm-db-0 3/3 Running 0 6m45s
f5-utils f5-dssm-db-1 3/3 Running 0 5m26s
f5-utils f5-dssm-db-2 3/3 Running 0 4m50s
f5-utils f5-dssm-sentinel-0 3/3 Running 0 6m47s
f5-utils f5-dssm-sentinel-1 3/3 Running 0 5m16s
f5-utils f5-dssm-sentinel-2 0/3 Pending 0 4m40s
f5-utils f5-ipam-ctlr-595c467d8d-mfs58 2/2 Running 0 6m45s
f5-utils f5-rabbit-6c7d56ddfb-87jnf 2/2 Running 0 6m50s
f5-utils f5-spk-cwc-6f89988c86-5m56n 3/3 Running 0 6m45s
f5-utils f5-toda-fluentd-bf845d465-sfm62 1/1 Running 0 6m50s
kube-system coredns-ff8999cc5-4w2rc 1/1 Running 0 7m46s
kube-system csi-nfs-controller-69dc5b4c8c-c56g8 5/5 Running 0 7m23s
kube-system csi-nfs-node-rdvsn 3/3 Running 0 7m23s
kube-system helm-install-multus-99k8d 0/1 Completed 0 7m46s
kube-system local-path-provisioner-698b58967b-j22xd 1/1 Running 0 7m46s
kube-system metrics-server-8584b5786c-b4qkq 1/1 Running 0 7m46s
kube-system multus-lkxdn 1/1 Running 0 7m32s
Run the command to view the Coremond logs and find the core file that was created.
kubectl -n f5-utils logs f5-coremond-bnhf7 -c f5-coremond
Sample Output
2025-03-23 14:01:35,840 CRIT Supervisor is running as root. Privileges were not dropped because no user is specified in the config file. If you intend to run as root, you can set user=root in the config file to avoid this message.
2025-03-23 14:01:35,843 INFO supervisord started with pid 1
2025-03-23 14:01:36,848 INFO spawned: 'coremond' with pid 7
2025-03-23 14:01:36,852 INFO spawned: 'crashagent' with pid 8
2025-03-23 14:01:36,860 INFO spawned: 'qkview-collect' with pid 9
"ts"="2025-03-23 14:01:36.867"|"l"="info"|"m"="POD_NAME is not set; defaulting to hostname"|"lt"="A"|"proc"="crashagent"|"ct"="f5-coremond"|"v"="1.0"
"ts"="2025-03-23 14:01:36.867"|"l"="info"|"m"="listen and serve"|"lt"="A"|"proc"="crashagent"|"addr"="/run/apport.socket"|"ct"="f5-coremond"|"v"="1.0"
"ts"="2025-03-23 14:01:36.903"|"l"="error"|"m"="failed to read levels file /logs/.minlevel.yaml: open /logs/.minlevel.yaml: no such file or directory"|"lt"="A"|"ct"="f5-coremond"|"v"="1.0"
"ts"="2025-03-23 14:01:36.906"|"l"="info"|"m"="coremond started"|"lt"="A"|"version"="0.7.27+0.0.6"|"commitHash"="359bc2a"|"buildDate"="2025-03-13T19:53:15Z"|"ct"="f5-coremond"|"v"="1.0"
"ts"="2025-03-23 14:01:36.906"|"l"="info"|"m"="coremon dest"|"lt"="A"|"dest"="/var/cores/k3d-minibip-server-0"|"ct"="f5-coremond"|"v"="1.0"
"ts"="2025-03-23 14:01:36.948"|"l"="error"|"m"="failed to read levels file /logs/.minlevel.yaml: open /logs/.minlevel.yaml: no such file or directory"|"lt"="A"|"ct"="f5-coremond"|"v"="1.0"
"ts"="2025-03-23 14:01:36.953"|"l"="info"|"m"="grpc server is starting up"|"lt"="A"|"proc"="qkd"|"address"="0.0.0.0:19891"|"ct"="f5-coremond"|"v"="1.0"
"ts"="2025-03-23 14:01:37.107"|"l"="error"|"m"="no such file or directory"|"lt"="A"|"path"="/logs/.minlevel.yaml"|"ct"="f5-coremond"|"v"="1.0"
"ts"="2025-03-23 14:01:37.150"|"l"="error"|"m"="no such file or directory"|"lt"="A"|"proc"="qkd"|"path"="/logs/.minlevel.yaml"|"ct"="f5-coremond"|"v"="1.0"
2025-03-23 14:01:38,152 INFO success: coremond entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-03-23 14:01:38,152 INFO success: crashagent entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
2025-03-23 14:01:38,152 INFO success: qkview-collect entered RUNNING state, process has stayed up for > than 1 seconds (startsecs)
"ts"="2025-03-23 14:03:35.923"|"l"="info"|"m"="new core file detected"|"lt"="A"|"file"="/var/crash/core.observer.9.f5-observer-0.1742738615173108860"|"ct"="f5-coremond"|"v"="1.0"
"ts"="2025-03-23 14:03:35.950"|"l"="error"|"m"="failed to list pods"|"lt"="A"|"err"="pods is forbidden: User "system:serviceaccount:f5-utils:default" cannot list resource "pods" in API group "" at the cluster scope"|"ct"="f5-coremond"|"v"="1.0"
"ts"="2025-03-23 14:03:35.950"|"l"="info"|"m"="creating coredump"|"lt"="A"|"src"="/var/crash/core.observer.9.f5-observer-0.1742738615173108860"|"dst"="/var/cores/k3d-minibip-server-0/core.f5-observer-0.f5-observer.observer.9.1742738615173108860"|"ct"="f5-coremond"|"v"="1.0"
"ts"="2025-03-23 14:03:36.996"|"l"="info"|"m"="deleting src core file"|"lt"="A"|"src"="/var/crash/core.observer.9.f5-observer-0.1742738615173108860"|"dst"="/var/cores/k3d-minibip-server-0/core.f5-observer-0.f5-observer.observer.9.1742738615173108860"|"ct"="f5-coremond"|"v"="1.0"
Run the command to verify the F5 core file that Coremond created.
kubectl -n f5-utils exec f5-coremond-bnhf7 -- ls /var/cores/k3d-minibip-server-0/
Sample Output
Defaulting container name to f5-coremond.
Use 'kubectl describe pod/f5-coremond-bnhf7 -n f5-utils' to see all of the containers in this pod.
core.f5-observer-0.f5-observer.observer.9.1742738615173108860.gz
core.f5-observer-0.f5-observer.observer.9.1742738615173108860.gz.crc
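The two validation steps above can be sketched as a pair of small filters. Both helper names are hypothetical. The first assumes the pipe-delimited log format shown in the sample output; the second assumes, based only on the sample listing, that each .gz core archive is accompanied by a .gz.crc file.

```shell
# Hypothetical helper: read Coremond log lines on stdin and print the "dst"
# path from each "creating coredump" entry.
coredump_destinations() {
    sed -n '/"m"="creating coredump"/s/.*"dst"="\([^"]*\)".*/\1/p'
}

# Hypothetical helper: read a file listing on stdin (one name per line) and
# print every .gz archive that has no matching .gz.crc entry.
# Assumption: archives and .crc files come in pairs, as in the sample listing.
missing_crc() {
    awk '
        /\.gz\.crc$/ { crc[substr($0, 1, length($0) - 4)] = 1; next }
        /\.gz$/      { gz[$0] = 1 }
        END          { for (f in gz) if (!(f in crc)) print f }
    '
}

# Usage sketch (needs cluster access; names are from the sample output above):
# kubectl -n f5-utils logs f5-coremond-bnhf7 -c f5-coremond | coredump_destinations
# kubectl -n f5-utils exec f5-coremond-bnhf7 -- ls /var/cores/k3d-minibip-server-0/ | missing_crc
```

An empty result from missing_crc means every archive in the listing has a companion file; it does not check the archive contents.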