
Hi list,

Vivek and I talked yesterday via IRC and here is what we found.

First of all, I need to define the "freeze" or "hang" situation in order to avoid misunderstandings. When the "hang" state occurs, it is impossible to do any I/O on the system. vmstat shows 250-300 blocked threads. Therefore, it is not possible to open new ssh connections or to log in on the server's console. Established ssh connections, however, keep working, and it is possible to run commands. Just don't touch any files; that immediately hangs the connection as well.

Okay, now that we have that cleared up ... During the "hang" situation, it is possible to switch from the cfq to the deadline scheduler for /dev/sdb (the PV for the LVs). This makes the I/O happen and the system becomes responsive as usual.

After applying the patch correctly this time (the earlier failure was my mistake), we could see these debug messages from debugfs:

qemu-kvm-3955 [004] 823.360594: 8,16 Q WS 3408164104 + 480 [qemu-kvm]
qemu-kvm-3955 [004] 823.360594: 8,16 S WS 3408164104 + 480 [qemu-kvm]
   <...>-3252 [016] 823.361146: 8,16 m N cfq idle timer fired
   <...>-3252 [016] 823.361151: 8,16 m N cfq3683 slice expired t=0
   <...>-3252 [016] 823.361154: 8,16 m N cfq3683 sl_used=2 disp=25 charge=2 iops=0 sect=9944
   <...>-3252 [016] 823.361154: 8,16 m N cfq3683 del_from_rr
   <...>-3252 [016] 823.361161: 8,16 m N cfq schedule dispatch: busy_queues=38 rq_queued=132 rq_in_driver=0

Quoting Vivek Goyal:

"cfq has 38 busy queues and 132 requests queued and it tries to schedule a dispatch and that dispatch never happens for some reason and cfq is hung"

So the next idea was to try 2.6.38-rc6, just in case this is a bug in the workqueue logic that has been fixed there in the meantime. And it turns out: with 2.6.38-rc6, the problem does not happen.

I will see whether I can bisect the kernel patches and find out which one fixed it. I still have to figure out a way to do that (a rough sketch of the bisect commands is at the very end of this mail), but if I manage it, I will keep you posted.

Vivek then asked me to use another 2.6.37 debug patch and re-run the test. Attached are the logs from that run.

Regards
Dominik

On 02/23/2011 02:37 PM, Dominik Klein wrote:
Hi
so I ran these tests again, no patch applied yet, and - at least once - it worked. I did everything exactly the same way as before. Since the logs are 8 MB even when compressed with bzip2 --best, and I don't want everybody to have to download them, I uploaded them to an external hoster: http://www.megaupload.com/?d=SWKTC0V4
Traces were created with:

blktrace -n 64 -b 16384 -d /dev/sdb -o - | blkparse -i -
blktrace -n 64 -b 16384 -d /dev/vdisks/kernel3 -o - | blkparse -i -
Can you please apply the attached patch?
Unfortunately not. It cannot be applied to 2.6.37. I guess your source is newer, and I fail to find the right places in the file to apply the changes manually.
This just makes CFQ's output a little more verbose. Run the test again and capture the trace.
- Start the trace on /dev/sdb
- Start the dd jobs in the virtual machines
- Wait for the system to hang
- Press CTRL-C
- Make sure there were no lost events; otherwise increase the size and number of buffers.
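Spelled out, the whole procedure would look roughly like this (the dd line inside the guests is only a placeholder, the actual test jobs may differ):

  # host: trace the PV and write the parsed trace to a file
  blktrace -n 64 -b 16384 -d /dev/sdb -o - | blkparse -i - > sdb.trace

  # inside each virtual machine: generate write load (placeholder job)
  dd if=/dev/zero of=/root/testfile bs=1M count=4096 oflag=direct

  # once the system hangs: CTRL-C the blktrace pipeline, then check the
  # summary blkparse prints at the end of sdb.trace for "Skips"/lost events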
Tried that. Unfortunately, even with the maximum buffer size of 16 M [1], this leaves some Skips. I also tried increasing the number of buffers beyond 64, but that produced Oopses.
However, I attached kernel3's blktrace of a case where the error occurred. Maybe you can read something from that.
Can you also open tracing in another window and trace one of the throttled dm devices, say /dev/vdisks/kernel3, following the same procedure as above? So let the two traces run in parallel.
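In a second terminal, running in parallel with the /dev/sdb trace, that boils down to something like (the output file name is arbitrary):

  blktrace -n 64 -b 16384 -d /dev/vdisks/kernel3 -o - | blkparse -i - > kernel3.trace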
So what next?
Regards
Dominik
[1] http://git.kernel.org/?p=linux/kernel/git/axboe/blktrace.git;a=blob_plain;f=... and look for "Invalid buffer"
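For the bisect between 2.6.37 and 2.6.38-rc6 mentioned at the top of this mail, a rough sketch would be the "inverted" use of git bisect, where "bad" is taken to mean "the hang is gone", so that git hunts down the first fixed commit (this assumes a clone of the mainline tree and that every step gets built, booted and tested with the dd jobs):

  git bisect start
  git bisect bad v2.6.38-rc6   # "bad" here means: the hang does NOT happen
  git bisect good v2.6.37      # "good" here means: the hang still happens
  # build and boot the revision git checks out, run the dd test, then:
  git bisect good              # if the system still hangs
  git bisect bad               # if the hang is gone
  # repeat until git names the first "bad" commit, i.e. the fix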