On 02/21/2011 09:19 AM, Dominik Klein wrote:
>> - Is it possible to capture 10-15 second blktrace on your
>> underlying physical device. That should give me some idea what's
>> happening.
>
> Will do, read on.
Just realized I missed this one ... I should have done it right away.
So here goes.
Setup as in the first email: 8 machines, 2 important ones and 6
unimportant ones throttled to ~10 MB/s. group_isolation=1. Each VM
dd'ing zeroes.
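
For reference, a minimal sketch of how such a setup is typically wired
up on a cgroup-v1 blkio hierarchy -- the mount point, the group name
and the 8:16 device numbers below are my assumptions, not values taken
from this host:

  # assumed blkio mount point and device numbers (sdb = 8:16)
  mount -t cgroup -o blkio none /sys/fs/cgroup/blkio

  # CFQ tunable that keeps each cgroup on its own service tree
  echo 1 > /sys/block/sdb/queue/iosched/group_isolation

  # cap writes of the "unimportant" group to ~10 MB/s on sdb
  mkdir /sys/fs/cgroup/blkio/unimportant
  echo "8:16 10485760" > /sys/fs/cgroup/blkio/unimportant/blkio.throttle.write_bps_device

  # put one of the qemu-kvm processes (placeholder PID) into the group
  echo $QEMU_PID > /sys/fs/cgroup/blkio/unimportant/tasks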
blktrace -d /dev/sdb -w 30
=== sdb ===
CPU 0: 4769 events, 224 KiB data
CPU 1: 28079 events, 1317 KiB data
CPU 2: 1179 events, 56 KiB data
CPU 3: 5529 events, 260 KiB data
CPU 4: 295 events, 14 KiB data
CPU 5: 649 events, 31 KiB data
CPU 6: 185 events, 9 KiB data
CPU 7: 180 events, 9 KiB data
CPU 8: 17 events, 1 KiB data
CPU 9: 12 events, 1 KiB data
CPU 10: 6 events, 1 KiB data
CPU 11: 55 events, 3 KiB data
CPU 12: 28005 events, 1313 KiB data
CPU 13: 1542 events, 73 KiB data
CPU 14: 4814 events, 226 KiB data
CPU 15: 389 events, 19 KiB data
CPU 16: 1545 events, 73 KiB data
CPU 17: 119 events, 6 KiB data
CPU 18: 3019 events, 142 KiB data
CPU 19: 62 events, 3 KiB data
CPU 20: 800 events, 38 KiB data
CPU 21: 17 events, 1 KiB data
CPU 22: 243 events, 12 KiB data
CPU 23: 1 events, 1 KiB data
Total: 81511 events (dropped 0), 3822 KiB data
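
(The raw per-CPU trace files are not very readable on their own; the
usual post-processing is blkparse/btt -- the file names below are just
what blktrace produces by default for sdb:)

  # merge the per-CPU sdb.blktrace.* files into a readable event log
  blkparse -i sdb -o sdb.txt

  # or dump a binary stream and let btt summarize queue/dispatch/completion times
  blkparse -i sdb -d sdb.bin -O
  btt -i sdb.bin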
A very constant 296 blocked processes in vmstat during this run. But
apparently no data is written at all (see the "bo" column). A sketch
for inspecting those blocked tasks follows the vmstat output below.
vmstat 2
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r   b swpd      free  buff  cache   si   so    bi    bo    in    cs us sy id wa
 0 296    0 125254224 21432 142016    0    0    16   633   181   331  0  0 93  7
 0 296    0 125253728 21432 142016    0    0     0     0 17115 33794  0  0 25 75
 0 296    0 125254112 21432 142016    0    0     0     0 17084 33721  0  0 25 74
 1 296    0 125254352 21440 142012    0    0     0    18 17047 33736  0  0 25 75
 0 296    0 125304224 21440 131060    0    0     0     0 17630 33989  0  1 23 76
 1 296    0 125306496 21440 130260    0    0     0     0 16810 33401  0  0 20 80
 4 296    0 125307208 21440 129856    0    0     0     0 17169 33744  0  0 26 74
 0 296    0 125307496 21448 129508    0    0     0    14 17105 33650  0  0 36 64
 0 296    0 125307712 21452 129672    0    0     2  1340 17117 33674  0  0 22 78
 1 296    0 125307752 21452 129520    0    0     0     0 16875 33438  0  0 29 70
 1 296    0 125307776 21452 129520    0    0     0     0 16959 33560  0  0 21 79
 1 296    0 125307792 21460 129520    0    0     0    12 16700 33239  0  0 15 85
 1 296    0 125307808 21460 129520    0    0     0     0 16750 33274  0  0 25 74
 1 296    0 125307808 21460 129520    0    0     0     0 17020 33601  0  0 26 74
 1 296    0 125308272 21460 129520    0    0     0     0 17080 33616  0  0 20 80
 1 296    0 125308408 21460 129520    0    0     0     0 16428 32972  0  0 42 58
 1 296    0 125308016 21460 129524    0    0     0     0 17021 33624  0  0 22 77
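
To see where those ~296 blocked (D-state) tasks are actually stuck, one
option -- not done in this run -- is to dump their kernel stacks; the
PID below is just a placeholder for one of the hung dd/qemu-kvm tasks:

  # dump blocked tasks' stacks to the kernel log (needs sysrq enabled)
  echo w > /proc/sysrq-trigger
  dmesg | tail -n 200

  # or inspect a single hung task directly (if /proc/<pid>/stack is available)
  cat /proc/12345/stack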
While we're on that ... it is now impossible for me to recover from
this state without pulling the power plug.
On the VMs' consoles I see messages like
INFO: task (kjournald|flush-254|dd|rs:main|...) blocked for more than
120 seconds.
While the ssh sessions through which the dd was started seem intact
(pressing enter gives a new line), it is impossible to cancel the dd
command. Logging in on the VMs' consoles is also impossible.
Opening a new ssh session to the host does not work either. Killing the
qemu-kvm processes from a session opened earlier leaves zombie processes.
Moving the VMs back to the root cgroup makes no difference either.
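
(For completeness, "moving the VMs back to the root cgroup" means
roughly the following -- the mount point is again an assumption, and
even after this the tasks stay blocked here:)

  # reattach a qemu-kvm process (placeholder PID) to the root blkio group
  echo $QEMU_PID > /sys/fs/cgroup/blkio/tasks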
Regards
Dominik