Disaster recovery guide

This guide provides instructions for configuring and managing the F5 Insight disaster recovery (DR) feature.

What is F5 Insight DR?

The F5 Insight disaster recovery feature helps your system keep running even if one server fails. It works by maintaining two synchronized systems:

  1. Primary instance — Handles all active operations.
  2. Standby instance — A backup system that continuously syncs data from the primary.

If the primary fails or requires maintenance, you can “promote” the standby to take over as the primary. After repairing the original primary, you can perform a “failback” to return it to normal operation.

Important

Before promoting the standby, ensure the primary is fenced (not actively scraping BIG-IP devices). This prevents problems caused by both instances trying to process data at the same time.


How DR works

DR status and user interface

  • Primary node — Shows “Active” and “Read/Write” statuses in green. You can make and apply changes here.
  • Standby node — Shows “Standby” and “Read-only” in gray. You cannot make changes on the standby node.
  • DR pair status — Displays “Operational Status: Active” in green when everything is running correctly.

When you access the standby node, you will see a banner:

“This node is in standby mode. Configuration changes must be made on the primary node.”

This visual indicator confirms you are viewing the backup instance rather than the active system.

Data synchronization

F5 Insight automatically copies all data (for example, configuration changes, device information, and performance metrics) from the primary instance to the standby instance in real time.

Security with WireGuard tunnel

All data replication is secured through a WireGuard tunnel:

  • This is an encrypted connection between the primary and standby nodes.
  • It uses UDP port 51820 and requires minimal system resources.

Before you start

Infrastructure requirements

  • Two F5 Insight instances with the same software version. The instances must have network connectivity to each other, and you must have Secure Shell (SSH) access with sudo privileges on both systems.
  • Confirm these network ports are open between the instances:
    • UDP port 51820 for the WireGuard tunnel (critical).
    • TCP port 22 for SSH (initial setup only).
    • TCP port 443 for the HTTPS UI.
  • Administrator credentials are required for both instances for the web UI and SSH.
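
The TCP ports in the list above can be probed from one instance to the other before starting. The following is a minimal sketch, not an F5-provided tool: the function names check_tcp and report_ports are hypothetical, and it assumes bash and the coreutils timeout command are available on the instance. UDP port 51820 is not probed because UDP is connectionless and a simple connect test cannot confirm it; the WireGuard handshake itself is the more reliable indicator for that port.

```shell
#!/usr/bin/env bash
# Sketch: probe the TCP ports required by this guide from one instance to its peer.
# Function names are illustrative and not part of F5 Insight.

# check_tcp HOST PORT: returns 0 if a TCP connection opens within 3 seconds.
check_tcp() {
  local host="$1" port="$2"
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# report_ports HOST: prints one status line for SSH (22) and the HTTPS UI (443).
report_ports() {
  local host="$1" port
  for port in 22 443; do
    if check_tcp "$host" "$port"; then
      echo "tcp/${port} open"
    else
      echo "tcp/${port} unreachable"
    fi
  done
}

# Example, run on the primary against the standby address used in this guide:
#   report_ports 10.145.16.150
```

Run the same check in the opposite direction from the standby, since a firewall rule may allow traffic one way only.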

Checklist before setting up DR

  • Both instances are operational.
  • Passwordless SSH is set up between the primary and standby instances.

Set up DR

Prepare the instances

Verify the system is working on both instances. Record their IP addresses, for example:

  • Primary: 10.145.18.134
  • Standby: 10.145.16.150

Configure SSH access

  1. Generate an SSH key on the primary if one does not already exist.
  2. Copy the public key to the standby instance.
  3. Repeat the process from standby to primary.
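
The three steps above might look like the following with standard OpenSSH tools. This is a sketch under stated assumptions: the account name admin, the key path, the ed25519 key type, and the empty passphrase are illustrative choices, not values taken from F5 Insight documentation. The helper ensure_key is hypothetical and only wraps ssh-keygen so that an existing key is never overwritten.

```shell
#!/usr/bin/env bash
# Sketch: set up key-based SSH between the two instances.
# "admin", the key path, and the key type are assumptions for illustration.

# ensure_key KEYFILE: create a key pair only if KEYFILE does not already exist.
ensure_key() {
  local keyfile="$1"
  if [ -f "$keyfile" ]; then
    echo "key exists: $keyfile"
  else
    mkdir -p "$(dirname "$keyfile")"
    # -N "" sets an empty passphrase so later logins need no prompt.
    ssh-keygen -q -t ed25519 -N "" -f "$keyfile"
    echo "key created: $keyfile"
  fi
}

# On the primary (10.145.18.134), targeting the standby:
#   ensure_key "$HOME/.ssh/id_ed25519"
#   ssh-copy-id -i "$HOME/.ssh/id_ed25519.pub" admin@10.145.16.150
#   ssh -o BatchMode=yes admin@10.145.16.150 true && echo "passwordless SSH OK"
# Then repeat on the standby, targeting the primary (10.145.18.134).
```

The BatchMode=yes option makes ssh fail instead of prompting, so the final command succeeds only if key-based login actually works.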

Set up the WireGuard tunnel

  1. Log in to the primary instance through SSH.
  2. Go to /opt/f5insight/scripts.
  3. Run the WireGuard setup script to:
    • Generate keys.
    • Establish the tunnel.
    • Configure iptables rules.

Once complete, a confirmation message displays.
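
Beyond the confirmation message, the tunnel can be checked with the standard wg utility, which reports the time of the latest handshake for each peer. The sketch below assumes the wg command is installed alongside the tunnel; the helper name handshake_status and the 180-second freshness threshold are hypothetical choices for illustration. It reads the output of wg show all latest-handshakes (interface name, peer public key, and handshake time in epoch seconds).

```shell
#!/usr/bin/env bash
# Sketch: classify WireGuard peers by how recent their last handshake is.
# handshake_status and the 180-second threshold are illustrative assumptions.

# handshake_status NOW MAX_AGE: reads "interface  peer-key  epoch-seconds"
# lines on stdin and prints "<interface> ok|stale|no-handshake" per peer.
handshake_status() {
  awk -v now="$1" -v max="$2" '
    $3 == 0         { print $1, "no-handshake"; next }
    now - $3 <= max { print $1, "ok"; next }
                    { print $1, "stale" }
  '
}

# Example, run on either instance (wg requires root privileges):
#   sudo wg show all latest-handshakes | handshake_status "$(date +%s)" 180
```

A result of no-handshake usually points to UDP port 51820 being blocked between the instances or to mismatched keys.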

Set up DR in the UI

  1. Access the primary instance web UI.

  2. Go to Settings > Disaster Recovery.

  3. Click Configure on the Standby Peer Node card.

  4. Provide the following details:

    • Primary (hostname and IP): for example, hostname: primary-node, IP: 10.145.18.134.
    • Standby (hostname and IP): for example, hostname: standby-node, IP: 10.145.16.150.
  5. Click Save Configuration.

Saving may take a few minutes while the initial data synchronization completes.

Verify DR configuration

Go back to the DR settings page:

  • Primary node — Should show “Active” and “Read/Write.”
  • Standby node — Should show “Standby” and “Read-only.”
  • DR pair status — Shows “Active” with a green indicator.

To verify standby configuration, open a browser and go to your standby instance IP address. The standby mode banner should be visible on all settings pages: “This node is in standby mode. Configuration changes must be made on the primary node.”

Verify data synchronization

To verify data replication, make a configuration change on the primary instance, such as adding a device or modifying a setting. Then access the standby UI and refresh the page to verify the change has been synchronized.

For example, a BIG-IP device named bigip/bigip-africa-central is added to the primary.

The Homepage and all dashboards on the standby should then start showing data for this device.

Disaster recovery operations

Failover (promote standby to primary)

If the primary instance fails or needs maintenance, promote the standby to act as the primary.

  1. Fence the primary so that it stops collecting data. On the primary, run the following command to scale the OTEL collector to zero replicas:

    sudo kubectl scale deployment otel-collector --replicas=0 -n f5-insight
    

  2. Access the standby instance (https://<standby-ip>).

  3. Go to Settings > Disaster Recovery.

  4. Click Promote Node to Primary on the Standby Node card.

    A confirmation message displays: “Switching node roles in progress.”

  5. Log back in to the new primary instance and verify the status.

Post-promotion steps:

  • Update BIG-IP log configurations to redirect logs to the new primary.
  • You can now make changes on the new primary instance.
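
Because promotion should happen only after the primary is fenced, it can help to confirm the result of the scale command in step 1 before clicking Promote Node to Primary. The following is a minimal sketch: the helper name fence_state is hypothetical, while the deployment name and namespace are taken from the command in step 1. It relies on kubectl's jsonpath output and on the assumption that Kubernetes leaves status.replicas empty once no pods remain.

```shell
#!/usr/bin/env bash
# Sketch: confirm the OTEL collector on the primary is scaled to zero.
# fence_state is an illustrative helper, not part of F5 Insight.

# fence_state DESIRED [RUNNING]: RUNNING defaults to 0 when kubectl prints nothing.
fence_state() {
  local spec="$1" running="${2:-0}"
  if [ "$spec" = "0" ] && [ "$running" = "0" ]; then
    echo "fenced"
  else
    echo "not fenced (desired=${spec}, running=${running})"
  fi
}

# Example, run on the primary (command substitution is left unquoted on purpose
# so the two numbers are passed as separate arguments):
#   fence_state $(sudo kubectl get deployment otel-collector -n f5-insight \
#     -o jsonpath='{.spec.replicas} {.status.replicas}')
```

If the output is not "fenced", wait a few seconds and check again, since pods can take a moment to terminate after the scale command.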

Failback (restore the original primary)

After fixing the original primary instance:

  1. Access the original primary (https://<primary-ip>).

  2. Go to Settings > Disaster Recovery.

  3. Click Demote Node to Standby on the Primary Node card.

    A confirmation message displays: “Switching node roles in progress.”

  4. Log back in to verify the system returns to standby mode. Confirm the standby mode banner is visible.

Conclusion

The F5 Insight DR feature helps your system stay functional even during failures. Synchronized instances, encrypted data replication, and UI-based controls make it simple for teams to maintain business continuity. Regular maintenance and familiarity with the failover and failback processes will keep your system running smoothly.