Hello everyone,
I’m looking at an issue where I see guests freezing (process state Dl) during a block
disk mirror from one storage to another (NFS); the network stack of the guest can
freeze for up to 10 seconds.
Looking at the storage and I/O, I noticed good throughput and low latency (<3 ms), and I am
having trouble tracking down the source of the issue, as neither storage nor networking
shows problems. Interestingly, when I run the same test with virtio-blk, I do not see
process freezes at the same frequency or duration as with virtio-scsi, which seems to
indicate a client-side rather than a storage-side problem.
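To narrow down where a thread sits while it is in the D (uninterruptible sleep) state, one option is to sample the per-thread state from /proc on the host. The following is only a rough sketch, assuming Linux and that the process of interest is the QEMU process for the guest; the function names are made up for illustration, and the wchan file may be empty or "0" depending on kernel version and permissions.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line.

    comm may itself contain spaces or parentheses, so split on the
    last ')' instead of splitting on whitespace.
    """
    left, right = line.rsplit(")", 1)
    pid_str, comm = left.split("(", 1)
    return int(pid_str), comm, right.split()[0]


def d_state_threads(pid):
    """List (tid, comm, wchan) for threads of pid that are in state D."""
    blocked = []
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as f:
                _, comm, state = parse_stat_line(f.read())
            if state != "D":
                continue
            with open(f"{task_dir}/{tid}/wchan") as f:
                wchan = f.read().strip()
        except OSError:
            continue  # thread exited between listdir() and open()
        blocked.append((int(tid), comm, wchan))
    return blocked
```

Calling d_state_threads() with the QEMU PID in a loop (for example every 100 ms) while the mirror runs would show which threads block and, where the kernel exposes it, the kernel function they are waiting in.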
I looked at the syscalls and nothing stood out:
% time     seconds  usecs/call     calls    errors syscall
------ ----------- ----------- --------- --------- ----------------
 28.51   20.672654        8339      2479           ioctl
 27.81   20.162714        3379      5967        31 futex
 22.02   15.964498         785     20335           poll
 15.22   11.038403         150     73561           io_submit
  4.17    3.023285          41     73540           lseek
  1.20    0.868003           5    158591           write
  0.63    0.459030          11     42871           ppoll
  0.22    0.159263           8     19314           recvmsg
  0.16    0.115520           5     22526           read
  0.04    0.029149       29149         1           restart_syscall
  0.01    0.009252          28       330           sendmsg
  0.00    0.001221        1221         1           munmap
  0.00    0.000458          22        21           fcntl
  0.00    0.000286          95         3           openat
  0.00    0.000166           5        32           rt_sigprocmask
  0.00    0.000103          10        10           fdatasync
  0.00    0.000099          25         4           clone
  0.00    0.000081           7        12           mmap
  0.00    0.000077          19         4           close
  0.00    0.000076           6        12           mprotect
  0.00    0.000056          14         4           madvise
  0.00    0.000025           6         4           set_robust_list
  0.00    0.000023           6         4           prctl
------ ----------- ----------- --------- --------- ----------------
100.00   72.504442                419626        31 total
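A summary like this is sorted by total time, which can hide calls that are rare but slow. As a small sketch (the helper names are hypothetical, and it assumes the text is an strace -c style summary with whitespace-separated columns), the rows can be re-ranked by average time per call:

```python
from typing import List, NamedTuple


class SyscallRow(NamedTuple):
    name: str
    seconds: float
    usecs_per_call: int
    calls: int
    errors: int


def parse_strace_summary(text: str) -> List[SyscallRow]:
    """Parse the data rows of an strace -c style summary table."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        # Data rows have 5 fields (errors column empty) or 6 fields.
        if len(parts) not in (5, 6) or parts[-1] == "total":
            continue
        try:
            seconds = float(parts[1])
            usecs_per_call = int(parts[2])
            calls = int(parts[3])
            errors = int(parts[4]) if len(parts) == 6 else 0
        except ValueError:
            continue  # header or separator line
        rows.append(SyscallRow(parts[-1], seconds, usecs_per_call, calls, errors))
    return rows


def slowest_per_call(rows: List[SyscallRow], top: int = 5) -> List[SyscallRow]:
    """Return the rows with the highest average latency per call."""
    return sorted(rows, key=lambda r: r.usecs_per_call, reverse=True)[:top]
```

On the table above this ranks restart_syscall (a single call), ioctl and futex first. If the traced process is QEMU, a large share of the ioctl time may simply be vCPU threads sitting in KVM_RUN; that is an assumption, not something the summary itself shows.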
Does anyone have an idea how to better debug this issue?
Thanks
Bjoern