Backup and restore guide

Snapshots provide a way to capture the state of the VNFM HA cluster. Take a VNFM snapshot daily, preferably during off-peak hours. You can automate this with the REST API instead of having an operator run the snapshot manually as shown in this user guide.

Back up the virtual machines that the F5 VNF Managers run on at regular intervals. Your backup policy dictates the schedule, which typically involves daily, weekly, monthly, and yearly backups as required. The method for backing up the F5 VNF Manager virtual machines falls outside the scope of this document.


Take snapshots of the VNFM HA cluster at regular intervals (daily is suggested). You can automate them through the REST-based Service API, or an operator can create them manually using the Snapshots or VNFM CLI.
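The daily automation described above could be driven from a scheduler such as cron. The following is a minimal sketch that wraps the create-snapshot REST call from this guide. The `MANAGER_*` variable names, their placeholder defaults, the `daily-YYYYMMDD` ID scheme, and the function names are all illustrative assumptions, not part of VNFM. The script only prints the request unless `RUN=1` is set.

```shell
#!/bin/sh
# Sketch: build (and optionally send) a daily create-snapshot request.
# Every name and default below is an assumption made for illustration.

MANAGER_IP="${MANAGER_IP:-192.0.2.10}"                # placeholder address
MANAGER_TENANT="${MANAGER_TENANT:-default_tenant}"    # placeholder tenant
MANAGER_USER="${MANAGER_USER:-admin}"                 # placeholder user
MANAGER_PASSWORD="${MANAGER_PASSWORD:-}"              # supply via environment

# Date-stamped snapshot ID; pass a date string to override today's date.
snapshot_id() {
    printf 'daily-%s\n' "${1:-$(date +%Y%m%d)}"
}

# URL of the create-snapshot endpoint shown in Code Block 2.
snapshot_url() {
    printf 'http://%s/api/v3.1/snapshots/%s\n' "$MANAGER_IP" "$1"
}

# Send the PUT request when RUN=1; otherwise print what would be sent.
create_snapshot() {
    id="$(snapshot_id)"
    if [ "${RUN:-0}" = "1" ]; then
        curl -X PUT --header "Tenant: $MANAGER_TENANT" \
            -u "$MANAGER_USER:$MANAGER_PASSWORD" \
            "$(snapshot_url "$id")"
    else
        echo "dry run: PUT $(snapshot_url "$id")"
    fi
}

create_snapshot
```

A crontab entry along the lines of `0 2 * * * RUN=1 /path/to/daily-snapshot.sh` (the script path is hypothetical) would run it at 02:00 each day, assuming that is an off-peak time for the deployment.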

Create snapshot

  1. Create snapshot:

    Code Block 1 CLI

    vnfm snapshots create --include-metrics --include-credentials SNAPSHOT_ID

    Code Block 2 REST

      curl -X PUT --header "Tenant: <manager-tenant>" -u <manager-username>:<manager-password> "http://<manager-ip>/api/v3.1/snapshots/<snapshot-id>"
    The parameter specification is available in the :ref:`VNFM REST API <CreateSnapshot>`.
  2. Download snapshot:

    Code Block 3 CLI

    vnfm snapshots download [OPTIONS] SNAPSHOT_ID

    Code Block 4 REST

    curl -X GET --header "Tenant: <manager-tenant>" -u <manager-username>:<manager-password> "http://<manager-ip>/api/v3.1/snapshots/<snapshot-id>/archive" > <snapshot-archive-filename>.zip

    The parameter specification is available in the VNFM REST API.
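When the download step is scripted, it helps to confirm that an archive was actually written. By default curl saves an HTTP error body to the output file, so a failed request can otherwise leave behind a file that looks like a snapshot. The sketch below is one way to guard against that, using curl's `--fail` and `-o` options. The function names and `MANAGER_*` variables are illustrative assumptions, and the archive URL is passed in by the caller.

```shell
#!/bin/sh
# Sketch: download a snapshot archive and check that a non-empty file exists.
# Function and variable names are assumptions made for illustration.

MANAGER_TENANT="${MANAGER_TENANT:-default_tenant}"   # placeholder tenant
MANAGER_USER="${MANAGER_USER:-admin}"                # placeholder user
MANAGER_PASSWORD="${MANAGER_PASSWORD:-}"             # supply via environment

# $1 = archive URL, $2 = output file.
# --fail makes curl exit non-zero on HTTP errors instead of saving the error body.
download_snapshot() {
    curl --fail --silent --show-error -X GET \
        --header "Tenant: $MANAGER_TENANT" \
        -u "$MANAGER_USER:$MANAGER_PASSWORD" \
        -o "$2" "$1"
}

# Succeeds only if the file exists and is not empty.
archive_ok() {
    [ -s "$1" ]
}

# Example use (URL and file name are placeholders):
#   download_snapshot "$ARCHIVE_URL" snapshot.zip && archive_ok snapshot.zip
```

The check is deliberately shallow. It does not validate the archive contents, only that the request succeeded and produced data.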

Apply snapshot

  1. Upload snapshot:

    Code Block 5 CLI

    vnfm snapshots upload [OPTIONS] SNAPSHOT_PATH

    Code Block 6 REST

      curl -X PUT \
      --header "Tenant: <manager-tenant>" \
      -u <manager-username>:<manager-password>
    For the endpoint URL and parameter specification, see the :ref:`VNFM REST API <UpldSnapshot>`.
  2. Restore snapshot:

    Code Block 7 CLI

    vnfm snapshots restore [OPTIONS] --tenant-name <TEXT> SNAPSHOT_ID

    Code Block 8 REST

       curl -s -X POST \
       --header "Content-Type: application/json" \
       --header "Tenant: <manager-tenant>" \
       -u <manager-username>:<manager-password> \
       -d '{"tenant_name": "<manager-tenant>", "recreate_deployments_envs": true, "force": false, "restore_certificates": false, "no_reboot": false}'
    For the endpoint URL and parameter specification, see the :ref:`VNFM REST API <RestoreSnapshot>`.
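If the restore request is scripted, the JSON body from Code Block 8 can be generated by a small helper so the tenant name is set in one place. This is a sketch only: `restore_payload` is a hypothetical name, the flag values are copied from the example above, and the helper does not escape special characters in the tenant name.

```shell
#!/bin/sh
# Sketch: build the restore request body used in Code Block 8.
# restore_payload is a hypothetical helper, not part of the VNFM tooling.

# $1 = tenant name (assumed to contain no quotes or backslashes).
restore_payload() {
    printf '{"tenant_name": "%s", "recreate_deployments_envs": true, "force": false, "restore_certificates": false, "no_reboot": false}\n' "$1"
}

# Example: pass the generated body to curl with -d "$(restore_payload my_tenant)".
```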

Failure Recovery

Use the following steps to recover failed nodes in specific situations.

Whole cluster down or working incorrectly

  1. Save the files in /home/admin/files/ssl/ and /etc/cloudify/ssl/.
  2. Tear down the managers.
  3. Install fresh managers with the existing certificates specified in /home/admin/config.yaml.
  4. Create the cluster and join the managers to it.
  5. Apply the latest working snapshot on the active manager.
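Step 1 above can be done with a single tar archive per manager. The following is a minimal sketch. `backup_certs` and the output path are illustrative assumptions, and reading /etc/cloudify/ssl/ may require elevated privileges depending on how the manager was installed.

```shell
#!/bin/sh
# Sketch: archive certificate directories before tearing down a manager.
# backup_certs is a hypothetical helper name.

# $1 = output tarball; remaining arguments = directories to save.
backup_certs() {
    out="$1"
    shift
    tar -czf "$out" "$@"
}

# Example on a manager (directories from step 1; output path is a placeholder):
#   backup_certs /tmp/vnfm-ssl-backup.tar.gz /home/admin/files/ssl /etc/cloudify/ssl
```

Copy the resulting archive off the manager before the teardown in step 2, since the virtual machine's local disk may not survive it.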

One manager cluster node down

  1. Remove the failed manager from the cluster.
  2. Destroy the manager.
  3. Bootstrap a fresh manager.
  4. Join the new manager to the existing cluster.

Effect: Healthy manager cluster

Active manager node down

  1. Another healthy manager in the cluster automatically becomes the active manager.
  2. Investigate the error.
  3. Either:
    • Fix the problem, or
    • Destroy the manager, then:
      1. Install a new manager.
      2. Join it to the cluster.

Effect: Healthy manager cluster

Split brain

A split brain occurs when the managers lose connectivity to each other for a while. Each manager then considers the others unhealthy and becomes a master. Once connectivity returns, a single master is chosen for the cluster, based on which manager has the newest version of the PostgreSQL database. The other managers become standbys and are synced with the data from the active manager, so any data, installed deployments, and plugins that existed only on those managers are lost.

What’s Next?