
<snip>
I was also running "vmstat 2" while the throttled machine was doing IO and noticed that the number of blocked processes went up to around 25-35. I am assuming these are all qemu IO threads blocked waiting for throttled IO to finish. I am not sure whether blocked processes also contribute towards the load average.
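By the way, the blocked (D state) count can be watched directly next to the load average; a quick sketch of what I was looking at (column layout may differ slightly between vmstat versions):

# vmstat 2                        (the "b" column is tasks in uninterruptible sleep)
# grep procs_blocked /proc/stat   (the same counter, straight from the kernel)
# cat /proc/loadavg

Comparing the "b" column against loadavg while dd runs in the guest should make any correlation visible.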
While googling a bit, I found a wiki page which says the following:
"Most UNIX systems count only processes in the running (on CPU) or runnable (waiting for CPU) states. However, Linux also includes processes in uninterruptible sleep states (usually waiting for disk activity), which can lead to markedly different results if many processes remain blocked in I/O due to a busy or stalled I/O system."
If this is true, that explains the high system load in your testing. Throttling is working and we have around 30-35 IO threads/processes per qemu instance. You have 8 qemu instances running, so roughly 240-280 processes are blocked waiting for IO to finish, and that would explain the high load. But that is expected given that we are throttling IO, isn't it?
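One way to verify that the blocked tasks really are the qemu IO threads would be to dump everything currently in D state while the guest is doing buffered IO. A rough sketch (exact ps field names may vary with your procps version):

# ps -eLo state,pid,lwp,comm | awk '$1 == "D"'

If the theory is right, the bulk of the output should be threads belonging to your qemu processes.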
I also tried direct IO in the virtual machine, and that seems to spawn only 1 IO thread.
# time dd if=/dev/zero of=/mnt/vdb/testfile bs=1M count=1500 oflag=direct
1500+0 records in
1500+0 records out
1572864000 bytes (1.6 GB) copied, 31.4301 s, 50.0 MB/s

real    0m31.664s
user    0m0.003s
sys     0m0.819s
While running this I noticed that the number of blocked processes stayed at 1 the whole time, hence the low load average. Try the oflag=direct option in your tests.
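To compare the two cases on the host, you could also count D-state threads per qemu instance while dd runs in the guest. A sketch, assuming your qemu processes show up under a name matching "qemu" (adjust the pgrep pattern for your binary name):

# for p in $(pgrep qemu); do echo -n "$p: "; ps -Lo state -p $p | grep -c '^D'; done

With buffered IO I would expect this to show the 30-35 blocked threads per instance mentioned above, and with oflag=direct roughly 1.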
I can second that. With oflag=direct in the VM, the host shows 1 blocked process most of the time. Without oflag it shows 1-16.

Regards
Dominik