I'm using libvirt under Debian 12 (9.0.0-4+deb12u2 w/qemu
7.2+dfsg-7+deb12u12).
I have a VM using sr-iov, and configured it with a failover macvtap
interface so I could live migrate it. However, there is a significant
delay at the end of the migration, resulting in a lot of lost traffic.
With only the macvtap interface, migration completes immediately once
the memory transfer finishes, with no loss of traffic.
I enabled debug logging, and found the following. On the source system,
it logs that the system is paused for the cutover:
2025-04-30 01:08:12.526+0000: 1696180: debug :
qemuMigrationAnyCompleted:1957 : Migration paused before switchover
At that point, for almost a minute, the source system just keeps
printing the same statistics:
2025-04-30 01:08:12.923+0000: 1696272: info :
qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_REPLY:
mon=0x7f8fdc0ad2f0 reply={"return": {"expected-downtime": 300,
"status": "device", "setup-time": 297, "total-time": 26107,
"ram": {"total": 137452265472, "postcopy-requests": 0,
"dirty-sync-count": 3, "multifd-bytes": 2821784576,
"pages-per-second": 297855, "downtime-bytes": 13208,
"page-size": 4096, "remaining": 0, "postcopy-bytes": 0,
"mbps": 9786.9158461538464, "transferred": 3117658825,
"dirty-sync-missed-zero-copy": 0, "precopy-bytes": 295861041,
"duplicate": 32874480, "dirty-pages-rate": 56, "skipped": 0,
"normal-bytes": 2804301824, "normal": 684644}}, "id": "libvirt-577"}
[...]
2025-04-30 01:09:06.290+0000: 1696272: info :
qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_REPLY:
mon=0x7f8fdc0ad2f0 reply={"return": {"expected-downtime": 300,
"status": "device", "setup-time": 297, "total-time": 79474,
"ram": {"total": 137452265472, "postcopy-requests": 0,
"dirty-sync-count": 3, "multifd-bytes": 2821784576,
"pages-per-second": 297855, "downtime-bytes": 13208,
"page-size": 4096, "remaining": 0, "postcopy-bytes": 0,
"mbps": 9786.9158461538464, "transferred": 3117658825,
"dirty-sync-missed-zero-copy": 0, "precopy-bytes": 295861041,
"duplicate": 32874480, "dirty-pages-rate": 56, "skipped": 0,
"normal-bytes": 2804301824, "normal": 684644}}, "id": "libvirt-629"}
until finally it completes:
2025-04-30 01:09:06.327+0000: 1696272: info :
qemuMonitorJSONIOProcessLine:203 : QEMU_MONITOR_RECV_EVENT:
mon=0x7f8fdc0ad2f0 event={"timestamp": {"seconds": 1745975346,
"microseconds": 327382}, "event": "MIGRATION",
"data": {"status": "completed"}}
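As a rough check on "the same statistics", the first and last query-migrate replies quoted above can be diffed field by field. This is only a minimal Python sketch: the JSON is copied from the two log excerpts, and the `flatten` helper is a hypothetical name introduced for illustration, not part of libvirt or QEMU.

```python
import json

# First and last replies, copied from the source-side log excerpts above.
FIRST = """{"return": {"expected-downtime": 300, "status": "device",
"setup-time": 297, "total-time": 26107, "ram": {"total": 137452265472,
"postcopy-requests": 0, "dirty-sync-count": 3, "multifd-bytes": 2821784576,
"pages-per-second": 297855, "downtime-bytes": 13208, "page-size": 4096,
"remaining": 0, "postcopy-bytes": 0, "mbps": 9786.9158461538464,
"transferred": 3117658825, "dirty-sync-missed-zero-copy": 0,
"precopy-bytes": 295861041, "duplicate": 32874480, "dirty-pages-rate": 56,
"skipped": 0, "normal-bytes": 2804301824, "normal": 684644}},
"id": "libvirt-577"}"""

LAST = """{"return": {"expected-downtime": 300, "status": "device",
"setup-time": 297, "total-time": 79474, "ram": {"total": 137452265472,
"postcopy-requests": 0, "dirty-sync-count": 3, "multifd-bytes": 2821784576,
"pages-per-second": 297855, "downtime-bytes": 13208, "page-size": 4096,
"remaining": 0, "postcopy-bytes": 0, "mbps": 9786.9158461538464,
"transferred": 3117658825, "dirty-sync-missed-zero-copy": 0,
"precopy-bytes": 295861041, "duplicate": 32874480, "dirty-pages-rate": 56,
"skipped": 0, "normal-bytes": 2804301824, "normal": 684644}},
"id": "libvirt-629"}"""


def flatten(obj, prefix=""):
    """Turn nested dicts into a flat {"a.b.c": value} mapping (hypothetical helper)."""
    out = {}
    for key, value in obj.items():
        path = prefix + key
        if isinstance(value, dict):
            out.update(flatten(value, path + "."))
        else:
            out[path] = value
    return out


first = flatten(json.loads(FIRST))
last = flatten(json.loads(LAST))

# Keep only the fields whose value differs between the two replies.
changed = {k: (first[k], last[k]) for k in first if first[k] != last.get(k)}

for field, (before, after) in changed.items():
    print(f"{field}: {before} -> {after}")
```

Only `return.total-time` and the reply `id` differ. `total-time` advances by 53367 ms while `remaining` stays at 0 and `status` stays at "device", so the memory transfer itself is finished for the whole stall.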
On the destination side, the log shows a FAILOVER_NEGOTIATED event for
the device ua-sr-iov-backup:
2025-04-30 01:08:12.923+0000: 1384503: info :
qemuMonitorJSONIOProcessLine:203 : QEMU_MONITOR_RECV_EVENT:
mon=0x7fc7900ab2f0 event={"timestamp": {"seconds": 1745975292,
"microseconds": 922783}, "event": "FAILOVER_NEGOTIATED",
"data": {"device-id": "ua-sr-iov-backup"}}
Then nothing happens for about a minute until it says it is done:
2025-04-30 01:09:06.328+0000: 1384503: debug :
qemuMonitorJSONIOProcessLine:189 : Line [{"timestamp":
{"seconds": 1745975346, "microseconds": 327991},
"event": "MIGRATION", "data": {"status": "completed"}}]
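To put a number on "about a minute", the gaps can be computed from the libvirt timestamps quoted above. This is a minimal Python sketch: the timestamps are copied from the log lines, and `parse_ts` and `gap_seconds` are hypothetical helper names used only for illustration.

```python
from datetime import datetime


def parse_ts(ts):
    """Parse the timestamp prefix of a libvirt log line, e.g. '2025-04-30 01:08:12.526+0000'."""
    return datetime.strptime(ts, "%Y-%m-%d %H:%M:%S.%f%z")


def gap_seconds(start, end):
    """Seconds elapsed between two libvirt log timestamps."""
    return (parse_ts(end) - parse_ts(start)).total_seconds()


# Source: "Migration paused before switchover" -> MIGRATION "completed" event.
source_gap = gap_seconds("2025-04-30 01:08:12.526+0000",
                         "2025-04-30 01:09:06.327+0000")

# Destination: FAILOVER_NEGOTIATED event -> MIGRATION "completed" line.
dest_gap = gap_seconds("2025-04-30 01:08:12.923+0000",
                       "2025-04-30 01:09:06.328+0000")

print(f"source stall:      {source_gap:.3f} s")
print(f"destination stall: {dest_gap:.3f} s")
```

Both gaps come out at roughly 53 to 54 seconds, so the source and the destination are idle over the same window.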
Any thoughts on what is causing this delay? It's clearly somehow
related to the sr-iov component of the migration.
Thanks much…