On 26/01/2016 18:21, Chris Friesen wrote:
>>
>> My question is, why doesn't QEMU continue processing virtio packets
>> while the dirty page scanning and memory transfer over the network
>> are proceeding?
>
> QEMU (or vhost) _are_ processing virtio traffic, because otherwise
> you'd have no delay, only dropped packets. Or am I missing something?
I have separate timestamps embedded in each packet: one for when it was
sent, and one for when it was echoed back by the target (which is the
VM being migrated). What I'm seeing is that packets are sent to the
guest every millisecond, but while the migration is in progress they
get delayed somewhere for over a second on the way to the destination
VM. Once the migration is over, a burst of packets gets delivered to
the app in the guest, processed all at once, and echoed back to the
sender in one big batch (and a number of packets are dropped,
presumably because a buffer overflows somewhere).
That doesn't exclude a bug somewhere in the net/ code, but it doesn't
pinpoint it to QEMU or vhost-net either.
In any case, what I would do is use tracing at every level (guest
kernel, QEMU, host kernel) for packet rx and tx, and find out at which
layer the hiccup appears.
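As one concrete probe point inside the guest, something like
SO_TIMESTAMPNS (Linux-specific; a sketch only, with a made-up port
matching whatever the echo app uses) lets you compare when the guest
kernel received a packet with when the application actually read it,
which separates "delayed below the guest" from "delayed inside the
guest":

/* Guest-side sketch: the kernel stamps each datagram on receive, so a
 * large kernel-rx -> app-read gap means the packet sat in the guest,
 * while a large sender -> kernel-rx gap (from the payload timestamp)
 * means it was delayed before the guest ever saw it. */
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <time.h>

int main(void)
{
    int s = socket(AF_INET, SOCK_DGRAM, 0), on = 1;
    struct sockaddr_in addr = {
        .sin_family      = AF_INET,
        .sin_port        = htons(4242),
        .sin_addr.s_addr = htonl(INADDR_ANY),
    };
    bind(s, (struct sockaddr *)&addr, sizeof(addr));
    setsockopt(s, SOL_SOCKET, SO_TIMESTAMPNS, &on, sizeof(on));

    for (;;) {
        char buf[2048];
        union { char b[256]; struct cmsghdr align; } ctrl;
        struct iovec iov = { .iov_base = buf, .iov_len = sizeof(buf) };
        struct msghdr msg = {
            .msg_iov = &iov, .msg_iovlen = 1,
            .msg_control = ctrl.b, .msg_controllen = sizeof(ctrl.b),
        };
        if (recvmsg(s, &msg, 0) < 0)
            continue;

        struct timespec now, krx = { 0, 0 };
        clock_gettime(CLOCK_REALTIME, &now);
        for (struct cmsghdr *c = CMSG_FIRSTHDR(&msg); c;
             c = CMSG_NXTHDR(&msg, c)) {
            if (c->cmsg_level == SOL_SOCKET &&
                c->cmsg_type == SCM_TIMESTAMPNS) {
                memcpy(&krx, CMSG_DATA(c), sizeof(krx));
            }
        }
        printf("kernel-rx -> app-read gap: %.3f ms\n",
               (now.tv_sec - krx.tv_sec) * 1e3 +
               (now.tv_nsec - krx.tv_nsec) / 1e6);
        /* ...stamp t_echoed in the payload and send it back here... */
    }
}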
Paolo
For comparison, we have a DPDK-based fastpath NIC type that we added
(sort of like vhost-net), and it continues to process packets while the
dirty page scanning is going on. Only the actual cutover affects it.