Hi,
Problem-
Offloading of VM packets (TSO enabled in the VMs) degrades severely as the number of VMs on a host increases.
What controls the offloading of VM packets, and how can we improve it?
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Description-
We have a testbed OpenStack deployment. We boot 1, 10 and 25 VMs on a single compute node and start iperf traffic (the VMs are the iperf clients).
We simultaneously run tcpdump on the veth pair connecting each VM to the OVS bridges.
The tcpdump data shows that as the number of VMs on a host increases, the percentage of offloaded packets degrades severely.
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Host configuration- 12 cores ( 24 vCPU ), 40 GB RAM
[root@rhel7-25 ~]# uname -a
Linux rhel7-25.in.ibm.com 3.10.0-229.el7.x86_64 #1 SMP Thu Jan 29 18:37:38 EST 2015 x86_64 x86_64 x86_64 GNU/Linux
VM MTU is set to 1450
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Analysis-
-----------------------------------
| VMs | % Non-offloaded packets   |
|-----|---------------------------|
| 1   | 11.11%                    |
| 10  | 71.78%                    |
| 25  | 80.44%                    |
-----------------------------------
Thus we see a significant drop in offloaded packets when 10 and 25 VMs (all TSO enabled) send iperf data simultaneously.
A non-offloaded packet here means an Ethernet frame of size 1464 bytes (the VM MTU of 1450 plus the 14-byte Ethernet header).
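For anyone who wants to reproduce the percentages from their own capture, here is a minimal sketch of one way to count them. It assumes tcpdump lines in the same format as the dumps in this mail, treats a frame as non-offloaded when its length is exactly MTU + 14, and skips frames with no TCP payload. The function name and the choice to ignore zero-payload frames are assumptions for illustration, so the result may not match the table above exactly.

```python
import re

# Patterns for the tcpdump line format shown in this mail, e.g.
# "... IPv4, length 1464: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 1398"
FRAME_LEN = re.compile(r"length (\d+):")
TCP_PAYLOAD = re.compile(r"tcp (\d+)$")


def non_offloaded_percent(lines, mtu=1450, eth_header=14):
    """Percentage of data-carrying frames whose size is exactly MTU + Ethernet header.

    Assumption: frames with zero TCP payload (SYN, pure ACK) are ignored.
    """
    wire_size = mtu + eth_header  # 1464 for the setup described here
    data_frames = 0
    non_offloaded = 0
    for line in lines:
        m_len = FRAME_LEN.search(line)
        m_tcp = TCP_PAYLOAD.search(line.strip())
        if not m_len or not m_tcp:
            continue  # not a line in the expected format
        if int(m_tcp.group(1)) == 0:
            continue  # no payload, nothing to offload
        data_frames += 1
        if int(m_len.group(1)) == wire_size:
            non_offloaded += 1
    if data_frames == 0:
        return 0.0
    return 100.0 * non_offloaded / data_frames
```

It can be fed an open log file directly, for example `non_offloaded_percent(open("qvo255d8cdd-90.log"))` after filtering for the iperf server address.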
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Tcpdump details-
Iperf Server IP- 1.1.1.34
For 1 VM, we see mostly offloaded packets-
[piyush@rhel7-34 25]$ cat qvoed7aa38d-22.log | grep "> 1.1.1.34.5001" | head -n 30
14:36:26.331073 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 74: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 0
14:36:26.331917 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 66: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 0
14:36:26.331946 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 90: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 24
14:36:26.331977 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 7056: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 6990
14:36:26.332018 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 5658: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 5592
14:36:26.332527 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 7056: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 6990
14:36:26.332560 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 9852: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 9786
14:36:26.333024 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 8454: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 8388
14:36:26.333054 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 7056: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 6990
14:36:26.333076 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 4260: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 4194
14:36:26.333530 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 16842: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 16776
14:36:26.333568 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 4260: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 4194
14:36:26.333886 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 21036: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 20970
14:36:26.333925 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 2862: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 2796
14:36:26.334303 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 21036: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 20970
14:36:26.334349 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 2862: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 2796
14:36:26.334741 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 22434: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 22368
14:36:26.335118 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 25230: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 25164
14:36:26.335566 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 25230: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 25164
14:36:26.336007 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 23832: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 23766
14:36:26.336050 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 2862: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 2796
14:36:26.336453 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 26628: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 26562
14:36:26.336898 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 22434: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 22368
14:36:26.336941 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 5658: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 5592
14:36:26.337235 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 23832: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 23766
14:36:26.337603 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 21036: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 20970
14:36:26.337644 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 8454: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 8388
14:36:26.337987 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 18240: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 18174
14:36:26.338040 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 12648: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 12582
14:36:26.338356 fa:16:3e:98:41:8b > fa:16:3e:ef:5f:16, IPv4, length 14046: 10.20.7.3.50395 > 1.1.1.34.5001: tcp 13980
For 10 VMs, we see a reduction in the size of offloaded packets. Tcpdump for one of the 10 VMs-
[piyush@rhel7-34 25]$ cat qvo255d8cdd-90.log | grep "> 1.1.1.34.5001" | head -n 30
15:09:25.024790 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 74: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 0
15:09:25.026834 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 66: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 0
15:09:25.026870 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 90: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 24
15:09:25.027186 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.027213 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 5658: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 5592
15:09:25.032500 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 5658: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 5592
15:09:25.032539 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 1464: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 1398
15:09:25.032567 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.035122 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.035631 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.035661 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.038508 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.038904 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.039300 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.040740 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 4260: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 4194
15:09:25.040774 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 2862: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 2796
15:09:25.040995 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.041235 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.042599 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.043209 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.043592 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.044312 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.044551 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.045232 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.045251 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.049951 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 5658: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 5592
15:09:25.049977 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 1464: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 1398
15:09:25.049996 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 7056: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 6990
15:09:25.050020 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 2862: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 2796
15:09:25.050039 fa:16:3e:b9:f8:ec > fa:16:3e:c1:de:cc, IPv4, length 4260: 10.20.18.3.36798 > 1.1.1.34.5001: tcp 4194
For 25 VMs, we see very few offloaded packets. Tcpdump for one of the 25 VMs-
15:52:31.543613 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 1464: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 1398
15:52:31.543637 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 2862: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 2796
15:52:31.543957 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 2862: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 2796
15:52:31.544090 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 4260: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 4194
15:52:31.544272 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 1464: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 1398
15:52:31.544296 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 1464: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 1398
15:52:31.544316 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 1464: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 1398
15:52:31.544340 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 1464: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 1398
15:52:31.545034 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 1464: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 1398
15:52:31.545066 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 5658: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 5592
15:52:31.545474 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 1464: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 1398
15:52:31.545501 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 2862: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 2796
15:52:31.545539 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 2862: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 2796
15:52:31.545572 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 7056: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 6990
15:52:31.545736 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 1464: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 1398
15:52:31.545807 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 1464: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 1398
15:52:31.545813 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 1464: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 1398
15:52:31.545934 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 1464: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 1398
15:52:31.545956 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 1464: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 1398
15:52:31.545974 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 1464: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 1398
15:52:31.546012 fa:16:3e:3c:7d:78 > fa:16:3e:aa:af:d5, IPv4, length 1464: 10.20.10.3.45892 > 1.1.1.34.5001: tcp 1398
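The dumps above can also be read as "MTU-sized segments per captured frame": every frame carries 66 bytes of headers (length minus TCP payload), so a 1464-byte frame holds 1398 bytes of payload, and the larger frames hold a whole multiple of that (6990 = 5 x 1398, 26562 = 19 x 1398). A small sketch of that arithmetic follows; the helper names are hypothetical and the 1398-byte unit is taken from these captures only.

```python
import re

TCP_PAYLOAD = re.compile(r"tcp (\d+)$")
MSS_PAYLOAD = 1398  # payload of a 1464-byte frame in these dumps (1464 - 66)


def segments_per_frame(payload, mss=MSS_PAYLOAD):
    """Number of MTU-sized segments a captured frame would be split into."""
    return -(-payload // mss)  # ceiling division


def mean_segments(lines, mss=MSS_PAYLOAD):
    """Average segments per frame, counting only frames with at least one full segment."""
    counts = []
    for line in lines:
        m = TCP_PAYLOAD.search(line.strip())
        if m and int(m.group(1)) >= mss:
            counts.append(segments_per_frame(int(m.group(1)), mss))
    return sum(counts) / len(counts) if counts else 0.0
```

A mean close to 1 indicates that almost nothing is being aggregated, which is what the 25 VM capture looks like, while the 1 VM capture reaches up to 19 segments in a single frame.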
Thanks and regards,
Piyush Raman
Mail: pirsriva@in.ibm.com