
An OpenStack Crime Story solved by tcpdump, sysdig, and iostat – Episode 3

16.9.2014 | 5 minutes of reading time

Previously on OpenStack Crime Investigation … Two load balancers running as virtual machines in our OpenStack based cloud, sharing a keepalived-based highly available IP address, started to flap, switching the IP address back and forth. After ruling out a misconfiguration of keepalived and issues in the virtual network, I finally got the hint that the problem might originate not in the virtual, but in the bare-metal world of our cloud. Maybe high IO was causing the gaps between the keepalived VRRP packets.

When I arrived at bare-metal host node01, hosting virtual host loadbalancer01, I was anxious to see the IO statistics. The machine had to be under heavy IO load if the virtual machine’s VRRP packets were delayed by up to five seconds.

I switched on my iostat flashlight and saw this:

$ iostat
Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              12.00         0.00       144.00          0        144
sdc               0.00         0.00         0.00          0          0
sdb               6.00         0.00        24.00          0         24
sdd               0.00         0.00         0.00          0          0
sde               0.00         0.00         0.00          0          0
sdf              20.00         0.00       118.00          0        118
sdg               0.00         0.00         0.00          0          0
sdi              22.00         0.00       112.00          0        112
sdh               0.00         0.00         0.00          0          0
sdj               0.00         0.00         0.00          0          0
sdk              21.00         0.00        96.50          0         96
sdl               0.00         0.00         0.00          0          0
sdm               9.00         0.00        64.00          0         64

Nothing? Nothing at all? No IO on the disks? Maybe my bigger flashlight iotop could help:

$ iotop

Unfortunately, what I saw was too ghastly to show here, so I decided to omit the screenshots of iotop [1]. It was pure horror. Six qemu processes eating the physical CPUs alive in IO.

So, no disk IO, but super high IO caused by qemu. It had to be network IO then. But all performance counters showed almost no network activity. What if this IO wasn’t real, but virtual? It could be the virtual network driver! It had to be the virtual network driver.

I checked the OpenStack configuration. It was set to use the para-virtualized network driver vhost_net.

I checked the running qemu processes. They were also configured to use the para-virtualized network driver.

$ ps aux | grep qemu
libvirt+  6875 66.4  8.3 63752992 11063572 ?   Sl   Sep05 4781:47 /usr/bin/qemu-system-x86_64
 -name instance-000000dd -S ... -netdev tap,fd=25,id=hostnet0,vhost=on,vhostfd=27 ...
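The telling detail in that output is the vhost=on flag inside the -netdev option: it is what requests the in-kernel backend for a tap device. A minimal sketch of that check, run here against the sample line from the ps output above rather than a live process list:

```shell
# Sketch: decide from a qemu -netdev option string whether the in-kernel
# vhost backend was requested. The sample line is copied from the ps
# output above; on a real host you would feed in `ps -eo args` instead.
netdev='-netdev tap,fd=25,id=hostnet0,vhost=on,vhostfd=27'
case "$netdev" in
  *vhost=on*) backend="in-kernel vhost" ;;
  *)          backend="userspace virtio" ;;
esac
echo "$backend"   # in-kernel vhost
```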

I was getting closer! I checked the kernel modules. Kernel module vhost_net was loaded and active.

$ lsmod | grep net
vhost_net              18104  2
vhost                  29009  1 vhost_net
macvtap                18255  1 vhost_net
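For scripting such a check, parsing lsmod output is fragile; a loaded module also shows up as a directory under /sys/module. A small sketch, assuming sysfs is mounted at /sys as on any modern Ubuntu:

```shell
# Sketch: check for the vhost_net module via sysfs instead of parsing
# lsmod output (assumption: sysfs mounted at /sys, standard on Ubuntu).
if [ -d /sys/module/vhost_net ]; then
  state="loaded"
else
  state="not loaded"
fi
echo "vhost_net is $state"
```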

I checked the qemu-kvm configuration and froze.

$ cat /etc/default/qemu-kvm
# To disable qemu-kvm's page merging feature, set KSM_ENABLED=0 and
# sudo restart qemu-kvm
KSM_ENABLED=1
SLEEP_MILLISECS=200
# To load the vhost_net module, which in some cases can speed up
# network performance, set VHOST_NET_ENABLED to 1.
VHOST_NET_ENABLED=0

# Set this to 1 if you want hugepages to be available to kvm under
# /run/hugepages/kvm
KVM_HUGEPAGES=0

vhost_net was disabled by default for qemu-kvm. We were running all packets through userspace and qemu instead of handing them directly to the kernel, as vhost_net does! That’s where the lag was coming from!

I acted immediately to rescue the victims. I made the huge, extremely complicated, full one-byte change on all our compute nodes, modifying VHOST_NET_ENABLED=0 to VHOST_NET_ENABLED=1, restarted all virtual machines, and finally, after days of constant screaming in pain, the flapping between the two load balancers stopped.
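That one-byte change is easy to script. A sketch, rehearsed here on a scratch copy of the config rather than the real file (on the compute nodes the path is /etc/default/qemu-kvm, and the qemu-kvm service plus the guests need a restart afterwards):

```shell
# Sketch of the one-byte fix, applied to a scratch copy of the config.
# On a real compute node you would point sed at /etc/default/qemu-kvm
# and then restart the qemu-kvm service and the virtual machines.
conf=$(mktemp)
printf '%s\n' 'KSM_ENABLED=1' 'SLEEP_MILLISECS=200' \
  'VHOST_NET_ENABLED=0' 'KVM_HUGEPAGES=0' > "$conf"
sed -i 's/^VHOST_NET_ENABLED=0$/VHOST_NET_ENABLED=1/' "$conf"
grep '^VHOST_NET' "$conf"   # VHOST_NET_ENABLED=1
```

Note that `sed -i` without a backup suffix is GNU sed syntax, which is what Ubuntu ships.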

I did it! I saved them!

But I couldn’t stop here. I wanted to find out who had done this to the poor little load balancers. Who was behind this conspiracy of crippled network latency?

I knew there was only one way to finally catch the culprit. I set a trap. I installed a fresh, clean, virgin Ubuntu 14.04 in a virtual machine and then, well, then I waited for apt-get install qemu-kvm to finish:

$ sudo apt-get install qemu-kvm
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following extra packages will be installed:
  acl cpu-checker ipxe-qemu libaio1 libasound2 libasound2-data libasyncns0
  libbluetooth3 libboost-system1.54.0 libboost-thread1.54.0 libbrlapi0.6
  libcaca0 libfdt1 libflac8 libjpeg-turbo8 libjpeg8 libnspr4 libnss3
  libnss3-nssdb libogg0 libpulse0 librados2 librbd1 libsdl1.2debian
  libseccomp2 libsndfile1 libspice-server1 libusbredirparser1 libvorbis0a
  libvorbisenc2 libxen-4.4 libxenstore3.0 libyajl2 msr-tools qemu-keymaps
  qemu-system-common qemu-system-x86 qemu-utils seabios sharutils
Suggested packages:
  libasound2-plugins alsa-utils pulseaudio samba vde2 sgabios debootstrap
  bsd-mailx mailx
The following NEW packages will be installed:
  acl cpu-checker ipxe-qemu libaio1 libasound2 libasound2-data libasyncns0
  libbluetooth3 libboost-system1.54.0 libboost-thread1.54.0 libbrlapi0.6
  libcaca0 libfdt1 libflac8 libjpeg-turbo8 libjpeg8 libnspr4 libnss3
  libnss3-nssdb libogg0 libpulse0 librados2 librbd1 libsdl1.2debian
  libseccomp2 libsndfile1 libspice-server1 libusbredirparser1 libvorbis0a
  libvorbisenc2 libxen-4.4 libxenstore3.0 libyajl2 msr-tools qemu-keymaps
  qemu-kvm qemu-system-common qemu-system-x86 qemu-utils seabios sharutils
0 upgraded, 41 newly installed, 0 to remove and 2 not upgraded.
Need to get 3631 kB/8671 kB of archives.
After this operation, 42.0 MB of additional disk space will be used.
Do you want to continue? [Y/n]
...
Setting up qemu-system-x86 (2.0.0+dfsg-2ubuntu1.3) ...
qemu-kvm start/running
Setting up qemu-utils (2.0.0+dfsg-2ubuntu1.3) ...
Processing triggers for ureadahead (0.100.0-16) ...
Setting up qemu-kvm (2.0.0+dfsg-2ubuntu1.3) ...
Processing triggers for libc-bin (2.19-0ubuntu6.3) ...

And then, I let the trap snap:

$ cat /etc/default/qemu-kvm
# To disable qemu-kvm's page merging feature, set KSM_ENABLED=0 and
# sudo restart qemu-kvm
KSM_ENABLED=1
SLEEP_MILLISECS=200
# To load the vhost_net module, which in some cases can speed up
# network performance, set VHOST_NET_ENABLED to 1.
VHOST_NET_ENABLED=0

# Set this to 1 if you want hugepages to be available to kvm under
# /run/hugepages/kvm
KVM_HUGEPAGES=0

I could not believe it! It was Ubuntu’s own default setting. Ubuntu, the very foundation of our cloud, had decided to turn vhost_net off by default, despite all modern hardware supporting it. Ubuntu was convicted, and I could finally rest.


This is the end of my detective story. I found and arrested the criminal Ubuntu default setting and was able to prevent it from further crippling our virtual network latencies.

Please feel free to leave comments and ask questions about the details of my journey. I’m already negotiating to sell the movie rights. But maybe there will be another season of OpenStack Crime Investigation in the future. So stay tuned to the codecentric blog.

Footnotes

1. Eh, and because I lost them.

