networking:mtu
Differences
This shows you the differences between two versions of the page.
networking:mtu [2022/07/01 11:36] – aperez | networking:mtu [2025/10/04 15:48] (current) – aperez | ||
---|---|---|---|
Line 22: | Line 22: | ||
Before we look at TCP MSS, it helps to understand the structure of the “unit” that’s being sent over the internet. | Before we look at TCP MSS, it helps to understand the structure of the “unit” that’s being sent over the internet. | ||
+ | |||
+ | |||
+ | ---- | ||
+ | |||
+ | **Xubuntu - Linux** | ||
+ | |||
+ | <code bash> | ||
+ | aperez@St-Francis: | ||
+ | PING 8.8.8.8 (8.8.8.8) 1472(1500) bytes of data. | ||
+ | 1480 bytes from 8.8.8.8: icmp_seq=1 ttl=116 time=8.30 ms | ||
+ | 1480 bytes from 8.8.8.8: icmp_seq=2 ttl=116 time=8.06 ms | ||
+ | 1480 bytes from 8.8.8.8: icmp_seq=3 ttl=116 time=8.53 ms | ||
+ | 1480 bytes from 8.8.8.8: icmp_seq=4 ttl=116 time=8.32 ms | ||
+ | |||
+ | --- 8.8.8.8 ping statistics --- | ||
+ | 4 packets transmitted, | ||
+ | rtt min/ | ||
+ | </code> | ||
+ | |||
+ | ---- | ||
+ | |||
+ | |||
+ | |||
+ | <note tip> | ||
+ | **MTU & PMTU Validation (side note)** | ||
+ | |||
+ | Quick reference on how to confirm the effective MTU and Path MTU (PMTU) end-to-end. Use these tests when enabling jumbo frames or troubleshooting connectivity. | ||
+ | |||
+ | ---- | ||
+ | |||
+ | **1) ICMP Check with ping** | ||
+ | Command: | ||
+ | <code bash> | ||
+ | ping -M do -s 1472 8.8.8.8 | ||
+ | </code> | ||
+ | |||
+ | Expected output (OK, MTU 1500): | ||
+ | <code> | ||
+ | PING 8.8.8.8 (8.8.8.8) 1472(1500) bytes of data. | ||
+ | 1480 bytes from 8.8.8.8: icmp_seq=1 ttl=114 time=8.0 ms | ||
+ | 1480 bytes from 8.8.8.8: icmp_seq=2 ttl=114 time=7.9 ms | ||
+ | --- 8.8.8.8 ping statistics --- | ||
+ | 2 packets transmitted, | ||
+ | </code> | ||
+ | |||
+ | 👉 If **1472 passes**, effective MTU ≈ **1500**. | ||
+ | 👉 If it fails (e.g., *Frag needed*), some hop on the path has an MTU below 1500. | ||
+ | |||
+ | ---- | ||
+ | |||
+ | **2) Path MTU Discovery with tracepath** | ||
+ | Command: | ||
+ | <code bash> | ||
+ | tracepath -n 8.8.8.8 | ||
+ | </code> | ||
+ | |||
+ | **Case A - Standard network (PMTU 1500):** | ||
+ | <code> | ||
+ | 1: 192.168.1.1 | ||
+ | 1: 192.168.1.1 | ||
+ | 2: 10.10.10.1 | ||
+ | 3: 8.8.8.8 | ||
+ | | ||
+ | </code> | ||
+ | |||
+ | **Case B - Jumbo end-to-end (PMTU ~9000):** | ||
+ | <code> | ||
+ | 1: 10.0.0.1 | ||
+ | 1: 10.0.0.1 | ||
+ | 2: 172.20.0.1 | ||
+ | 3: 8.8.8.8 | ||
+ | | ||
+ | </code> | ||
+ | |||
+ | 👉 If you see **`pmtu 1500`**, the path is limited to standard frames. | ||
+ | 👉 If you see **`pmtu 9000`** (or similar), jumbo frames are preserved across the path. | ||
+ | 👉 If it drops (e.g., 9000 → 1500 mid-path), a hop does not support jumbo. | ||
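The three readings above can be checked mechanically. The following is a minimal sketch, assuming only that `tracepath` marks each reported value with a `pmtu N` token as described above; the function names `pmtu_values` and `last_pmtu` are hypothetical helpers, not part of `tracepath`.

```shell
# pmtu_values: read tracepath output on stdin and print every "pmtu N" value, in order.
pmtu_values() {
    sed -n 's/.*pmtu \([0-9][0-9]*\).*/\1/p'
}

# last_pmtu: print only the final value, i.e. the PMTU reported for the whole path.
last_pmtu() {
    pmtu_values | tail -n 1
}
```

A possible use is `tracepath -n 8.8.8.8 | last_pmtu`. If `pmtu_values` prints a larger number followed by a smaller one, the PMTU dropped at some hop along the way.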
+ | |||
+ | ---- | ||
+ | |||
+ | **Validation summary** | ||
+ | |||
+ | ^ Test ^ Expected outcome ^ | ||
+ | | `ping -M do -s 1472` | Successful reply ⇒ effective MTU ≈ 1500 | | ||
+ | | `tracepath -n` | Reports PMTU 1500 (standard) or 9000 (jumbo)| | ||
+ | |||
+ | </note> | ||
+ | |||
+ | ---- | ||
+ | |||
+ | **Windows** | ||
+ | |||
+ | |||
+ | <code dos> | ||
+ | C: | ||
+ | |||
+ | Pinging 8.8.8.8 with 1440 bytes of data: | ||
+ | Reply from 8.8.8.8: bytes=1440 time=8ms TTL=116 | ||
+ | Reply from 8.8.8.8: bytes=1440 time=8ms TTL=116 | ||
+ | Reply from 8.8.8.8: bytes=1440 time=8ms TTL=116 | ||
+ | Reply from 8.8.8.8: bytes=1440 time=8ms TTL=116 | ||
+ | </code> | ||
+ | |||
+ | |||
+ | |||
+ | ---- | ||
+ | |||
+ | |||
+ | <code dos> | ||
+ | C: | ||
+ | |||
+ | Pinging 8.8.8.8 with 1440 bytes of data: | ||
+ | Reply from 8.8.8.8: bytes=1440 time=8ms TTL=116 | ||
+ | Reply from 8.8.8.8: bytes=1440 time=8ms TTL=116 | ||
+ | Reply from 8.8.8.8: bytes=1440 time=8ms TTL=116 | ||
+ | Reply from 8.8.8.8: bytes=1440 time=8ms TTL=116 | ||
+ | </code> | ||
+ | |||
+ | **Technical Explanation:** | ||
+ | |||
+ | The `-l` parameter of Windows `ping` sets only the ICMP payload (data) size. | ||
+ | The packet that must fit within the MTU also includes the protocol headers: | ||
+ | |||
+ | * **20 bytes** → IPv4 header (source/destination addresses and other fields, no options) | ||
+ | * **8 bytes** → ICMP header | ||
+ | |||
+ | Therefore: | ||
+ | |||
+ | Packet size = Payload (-l) + 20 + 8 | ||
+ | Example: `1440 + 28 = 1468` → the packets above are **1468 bytes** each, so the path MTU is at least 1468. | ||
+ | |||
+ | 👉 To obtain the true MTU, add **28 bytes** to the largest `-l` value that still gets a reply with the Don't Fragment flag (`-f`) set. | ||
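The 28-byte arithmetic can be written as two small shell helpers. This is a minimal sketch, assuming an IPv4 header without options (20 bytes) and an ICMP echo header (8 bytes); the function names `packet_size` and `max_payload` are illustrative only.

```shell
# Header sizes assumed: IPv4 without options (20 bytes) and ICMP echo (8 bytes).
IP_HEADER=20
ICMP_HEADER=8

# packet_size PAYLOAD: size of the IP packet for a given ping payload
# (-l on Windows, -s on Linux).
packet_size() {
    echo $(( $1 + IP_HEADER + ICMP_HEADER ))
}

# max_payload MTU: largest ping payload that fits the given MTU unfragmented.
max_payload() {
    echo $(( $1 - IP_HEADER - ICMP_HEADER ))
}

packet_size 1440   # prints 1468
max_payload 1500   # prints 1472
max_payload 9000   # prints 8972
```

`max_payload` gives the value to pass to `ping` when testing a specific MTU: 1472 for a standard 1500-byte link, 8972 for a 9000-byte jumbo link.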
+ | |||
+ | |||
+ | ---- | ||
+ | |||
+ | > **Note: PMTU and PMTUD (Path MTU / Path MTU Discovery)** | ||
+ | |||
+ | The *Path Maximum Transmission Unit (PMTU)* is the **largest IP packet size that can travel end-to-end without fragmentation**. | ||
+ | It is determined by the **smallest MTU along the entire path**. | ||
+ | |||
+ | Example: | ||
+ | * Link 1: MTU = 9000 | ||
+ | * Link 2: MTU = 1500 | ||
+ | * Link 3: MTU = 1400 | ||
+ | → **PMTU = 1400 bytes** | ||
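The rule "smallest MTU along the path" is a plain minimum. A minimal sketch, assuming the per-link MTUs are already known and passed as arguments (the helper name `path_mtu` is hypothetical):

```shell
# path_mtu MTU...: print the smallest of the link MTUs given as arguments.
path_mtu() {
    min=$1
    for mtu in "$@"; do
        if [ "$mtu" -lt "$min" ]; then
            min=$mtu
        fi
    done
    echo "$min"
}

path_mtu 9000 1500 1400   # prints 1400
```

In practice the per-link values are usually not known in advance, which is why the PMTU has to be discovered by probing.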
+ | |||
+ | ---- | ||
+ | |||
+ | **Path MTU Discovery (PMTUD):** | ||
+ | A dynamic mechanism to discover the PMTU using the **DF (Don’t Fragment)** bit. | ||
+ | |||
+ | Steps: | ||
+ | - Source sends a large packet with DF=1. | ||
+ | - If a router cannot forward due to its MTU, it discards the packet and replies with ICMP *Fragmentation Needed* (Type 3, Code 4). | ||
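+ | - The source lowers its packet size (using the next-hop MTU reported in the ICMP message, when present) and retries until packets pass; that size is the **real PMTU**. | ||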
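The steps above can be approximated by hand from a Linux host. The sketch below is not the kernel's PMTUD, which reacts to ICMP *Fragmentation Needed* messages; it is a manual binary search that sends DF-flagged pings of different sizes. It assumes iputils `ping` (`-M do`, `-c`, `-W`, `-s`); the names `probe`, `find_max_payload` and the `TARGET` variable are hypothetical. Any lost reply is treated as "too big", so results on a lossy path may be too low.

```shell
# Destination to probe; override by setting TARGET in the environment.
TARGET=${TARGET:-8.8.8.8}

# probe SIZE: succeeds if one DF-flagged echo with SIZE payload bytes is answered.
probe() {
    ping -M do -c 1 -W 1 -s "$1" "$TARGET" >/dev/null 2>&1
}

# find_max_payload LOW HIGH: binary search for the largest payload that probe accepts.
# Assumes a payload of LOW bytes passes; prints the largest passing payload in [LOW, HIGH].
find_max_payload() {
    low=$1
    high=$2
    while [ "$low" -lt "$high" ]; do
        # Round up so the loop always makes progress when low + 1 = high.
        mid=$(( (low + high + 1) / 2 ))
        if probe "$mid"; then
            low=$mid
        else
            high=$(( mid - 1 ))
        fi
    done
    echo "$low"
}
```

For example, `payload=$(find_max_payload 0 8972); echo "PMTU is about $(( payload + 28 ))"` searches up to a 9000-byte jumbo MTU and adds the 28 header bytes back to the result.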
+ | |||
+ | ---- | ||
+ | |||
+ | **Verification on Linux:** | ||
+ | |||
+ | ping -M do -s 1472 8.8.8.8 | ||
+ | → If it replies: path supports 1500 (1472+28 headers). | ||
+ | → If it fails: PMTU is smaller. | ||
+ | |||
+ | tracepath -n 8.8.8.8 | ||
+ | → Displays estimated PMTU along the route. | ||
+ | |||
+ | ---- | ||
+ | |||
+ | **Common pitfalls** | ||
+ | * Blocking ICMP → breaks PMTUD (TCP sessions may hang). | ||
+ | * Tunnel/VPN overhead → reduces the usable MTU (e.g., IPsec: 1500 → ~1400). | ||
+ | * Misconfigured Jumbo Frames → one 1500 hop breaks 9000 end-to-end. | ||
+ | |||
+ | ---- | ||
+ | |||
+ | **In summary:** | ||
+ | * **PMTU** = max packet size without fragmentation. | ||
+ | * **PMTUD** = process to dynamically discover it using ICMP + DF. | ||
+ | |||
+ | |||
+ | ---- | ||
Line 113: | Line 296: | ||
A common mistake is to configure the “ip tcp mss-adjust 1420” command on the customer’s tunnel interface. This does not work because the SYN packets sent from the server toward the end user do not traverse the GRE tunnel interface; they go via the original ISP’s interface. | A common mistake is to configure the “ip tcp mss-adjust 1420” command on the customer’s tunnel interface. This does not work because the SYN packets sent from the server toward the end user do not traverse the GRE tunnel interface; they go via the original ISP’s interface. | ||
+ | |||
+ | |||
+ | |||
+ | ---- | ||
+ | ---- | ||
+ | |||
+ | |||
+ | |||
+ | {{ : | ||
+ | |||
+ | {{pdfjs 46em >: | ||
+ | |||
+ | ---- | ||
+ | ---- | ||
+ | |||
+ | |||
+ | {{ : | ||
+ | |||
+ | {{pdfjs 46em >: | ||
+ | ---- | ||
+ | ---- | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
networking/mtu.1656693378.txt.gz · Last modified: 2022/07/01 11:36 by aperez