Last updated on: June 15 2024.

ESP Protocol

This page gives an overview of Encapsulating Security Payload (ESP) and considers potential trouble spots.

TL;DR

ESP carries the protected data packets, and that ESP flow is the tunnel. Enabling NAT detection avoids many of the common problems described below.

About ESP

ESP is IP protocol 50 and has no concept of a port number. Instead, ESP uses a Security Parameters Index (SPI) to identify the flow and a sequence (Seq) number to provide an anti-replay capability.

  • SPI - The Security Parameters Index is a 32-bit value carried in the ESP header. The receiver uses it to identify the Security Association (SA) that the packet belongs to.

  • Seq - A counter value that increases by one for each packet sent.
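
The two fields above are the first eight bytes of every ESP packet: a 32-bit SPI followed by a 32-bit sequence number, both in network byte order. The following is a minimal sketch of reading them; the function name and the sample bytes are illustrative only and are not taken from any product.

```python
import struct

def parse_esp_header(packet: bytes) -> tuple[int, int]:
    """Return (spi, seq) from the first 8 bytes of an ESP packet."""
    if len(packet) < 8:
        raise ValueError("an ESP header needs at least 8 bytes")
    # "!II" = two unsigned 32-bit integers in network (big-endian) order
    spi, seq = struct.unpack("!II", packet[:8])
    return spi, seq

# Made-up sample: SPI 0xc0ffee01, sequence number 5, then 16 bytes
# standing in for the encrypted payload.
sample = bytes.fromhex("c0ffee01" "00000005") + b"\x00" * 16
spi, seq = parse_esp_header(sample)
print(f"SPI=0x{spi:08x} Seq={seq}")
```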

Reference: RFC 4303, https://www.ietf.org/rfc/rfc4303.txt

The ESP header is inserted after the IP header and before the next layer protocol header (transport mode) or before an encapsulated IP header (tunnel mode). RFC 4303 describes these modes in more detail.

ESP can be used to provide confidentiality, data origin authentication, connectionless integrity, an anti-replay service (a form of partial sequence integrity), and (limited) traffic flow confidentiality.
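
The anti-replay service relies on the receiver tracking which sequence numbers it has already accepted. The following is a simplified sketch of a sliding-window check in the spirit of RFC 4303. The class name and the 64-packet window size are assumptions made for illustration. A real implementation also verifies packet integrity before updating the window and may use extended (64-bit) sequence numbers, neither of which is shown here.

```python
class ReplayWindow:
    """Toy anti-replay window; not a complete RFC 4303 implementation."""

    def __init__(self, size: int = 64):
        self.size = size
        self.highest = 0   # highest sequence number accepted so far
        self.bitmap = 0    # bit i set => sequence number (highest - i) was seen

    def accept(self, seq: int) -> bool:
        if seq == 0:
            return False   # the first packet on an SA carries sequence number 1
        if seq > self.highest:
            # Newer than anything seen: slide the window forward.
            shift = seq - self.highest
            mask = (1 << self.size) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.highest = seq
            return True
        offset = self.highest - seq
        if offset >= self.size:
            return False   # too old, outside the window
        if self.bitmap & (1 << offset):
            return False   # already seen: replay
        self.bitmap |= 1 << offset
        return True

window = ReplayWindow()
for seq in (1, 3, 2, 3, 100, 36, 37):
    print(seq, "accepted" if window.accept(seq) else "dropped")
```

Out-of-order packets inside the window (such as 2 arriving after 3) are still accepted once, which is why the window is a range rather than a single counter.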

The alternative to ESP is the Authentication Header (AH). AH provides integrity and data origin authentication but no encryption, so it is rarely deployed.

After the SAs are established, the flow of user data (the protected network connections) is encapsulated into ESP using the definitions of the negotiated IPsec SA. It is important to remember that one ESP SPI provides one-way connectivity. For bi-directional communication, two IPsec SAs are established so that two ESP flows exist: one for peer A -> B and another for peer B -> A.

If NAT exists between the two peers and NAT detection is enabled, the SA negotiation “floats” to UDP port 4500. ESP communication then also happens over UDP port 4500: each ESP packet is simply wrapped in a UDP header.
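
The UDP wrapping can be sketched as follows. Because negotiation and ESP share port 4500 once NAT is detected, RFC 3948 distinguishes them by the first four bytes of the UDP payload: four zero bytes (the Non-ESP marker) mean a negotiation message, a single 0xFF byte is a NAT keepalive, and anything else is ESP starting with a non-zero SPI. The function names and sample packet below are illustrative assumptions, and the UDP checksum is left at zero as RFC 3948 recommends for IPv4.

```python
import struct

NAT_T_PORT = 4500
NON_ESP_MARKER = b"\x00\x00\x00\x00"

def udp_encapsulate(esp_packet: bytes,
                    src_port: int = NAT_T_PORT,
                    dst_port: int = NAT_T_PORT) -> bytes:
    """Prefix an ESP packet with an 8-byte UDP header (checksum 0)."""
    length = 8 + len(esp_packet)   # UDP length covers header plus payload
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    return header + esp_packet

def classify_payload(udp_payload: bytes) -> str:
    """Classify the payload of a datagram seen on UDP port 4500."""
    if udp_payload == b"\xff":
        return "nat-keepalive"
    if len(udp_payload) < 4:
        return "unknown"
    if udp_payload[:4] == NON_ESP_MARKER:
        return "ike"
    return "esp"

# Made-up ESP packet: SPI 0xc0ffee01, sequence number 1, 16 payload bytes.
esp = struct.pack("!II", 0xC0FFEE01, 1) + b"\x00" * 16
datagram = udp_encapsulate(esp)
print(classify_payload(datagram[8:]))
```

Once wrapped this way, in-path devices see an ordinary UDP flow with real port numbers, which NAT and stateful firewalls can track.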

Common Problems

If NAT exists in-path but the peers do not have NAT detection enabled, ESP may partially or completely fail. Stateful firewalls expect initiated flows to use predictable ports, but ESP has none. Instead, the SPI occupies the first four bytes after the IP header, the same byte location where UDP and TCP carry their source and destination ports.
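
To see why this overlap matters, consider a device that reads the first four bytes after the IP header as a pair of 16-bit ports. The sketch below, using a made-up SPI value and an illustrative helper name, shows the "ports" such a device would derive from an ESP packet.

```python
import struct

def read_as_ports(first_bytes: bytes) -> tuple[int, int]:
    """Interpret the first 4 bytes after the IP header as (src_port, dst_port)."""
    return struct.unpack("!HH", first_bytes[:4])

spi = 0xC0FFEE01                      # made-up SPI for illustration
esp_start = struct.pack("!I", spi)    # the SPI is the first ESP field

src_port, dst_port = read_as_ports(esp_start)
print(f"SPI 0x{spi:08x} read as ports: src={src_port} dst={dst_port}")
```

The resulting values carry no port semantics, and since each direction uses a different SPI, the return traffic yields an unrelated pair.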

A firewall or other inspection/proxy device may set up a flow based on the SPI value of ESP for peer A -> B. Since the SPI value is different for the return flow peer B -> A, there is no match to an established connection and the packet is dropped. Even if bi-directional flow works (for example, if the flow is keyed on IP addresses only), the connection may time out once traffic falls idle for a few minutes, after which ESP packets from peer B can no longer reach peer A.
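
Both failure modes can be illustrated with a toy model of a stateful device. This is a hypothetical sketch, not the behavior of any specific firewall: the class, the 120-second idle timeout, and the addresses (taken from documentation ranges) are all assumptions.

```python
class ToyFirewall:
    """Toy stateful flow table; time is passed in explicitly as seconds."""

    def __init__(self, key_on_spi: bool, idle_timeout: float = 120.0):
        self.key_on_spi = key_on_spi
        self.idle_timeout = idle_timeout
        self.flows = {}   # flow key -> time the flow was last seen

    def _key(self, src, dst, spi):
        return (src, dst, spi) if self.key_on_spi else (src, dst)

    def outbound(self, src, dst, spi, now):
        """Peer A sends a packet; the device creates or refreshes the flow."""
        self.flows[self._key(src, dst, spi)] = now

    def inbound_allowed(self, src, dst, spi, now):
        """A return packet passes only if it matches a live reverse flow."""
        key = self._key(dst, src, spi)
        last_seen = self.flows.get(key)
        if last_seen is None or now - last_seen > self.idle_timeout:
            return False
        self.flows[key] = now
        return True

A, B = "192.0.2.1", "198.51.100.1"
SPI_A_TO_B, SPI_B_TO_A = 0x1111, 0x2222   # each direction has its own SPI

# Case 1: flow keyed on SPI, so the return SPI never matches.
fw = ToyFirewall(key_on_spi=True)
fw.outbound(A, B, SPI_A_TO_B, now=0)
print(fw.inbound_allowed(B, A, SPI_B_TO_A, now=1))

# Case 2: flow keyed on addresses only; works until the flow idles out.
fw = ToyFirewall(key_on_spi=False)
fw.outbound(A, B, SPI_A_TO_B, now=0)
print(fw.inbound_allowed(B, A, SPI_B_TO_A, now=1))
print(fw.inbound_allowed(B, A, SPI_B_TO_A, now=300))
```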

Even without NAT, firewalls may drop ESP by default because it’s an unusual IP protocol.

If in-path network devices are suspected of dropping ESP, using the “force” NAT option in the ike-peer config may help.
