Significant delay live migrating a VM with an SR-IOV interface
by Paul B. Henson
I'm using libvirt under Debian 12 (libvirt 9.0.0-4+deb12u2 with qemu
7.2+dfsg-7+deb12u12).
I have a VM using SR-IOV, and configured it with a failover macvtap
interface so I could live migrate it. However, there is a significant
delay at the end of the migration, resulting in a lot of lost traffic. If
I only have the macvtap interface, migration completes immediately at
the end of the memory transfer, with no loss of traffic.
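For reference, the failover pair is configured more or less like this
(simplified; the MAC, host NIC, mode, and VF PCI address below are
illustrative, not my exact values):

<!-- persistent interface: the macvtap/virtio side that stays attached across migration -->
<interface type='direct'>
  <mac address='52:54:00:aa:bb:cc'/>
  <source dev='enp65s0f0' mode='bridge'/>
  <model type='virtio'/>
  <alias name='ua-sr-iov-backup'/>
  <teaming type='persistent'/>
</interface>
<!-- transient interface: the SR-IOV VF, unplugged before migration and re-plugged on the destination -->
<interface type='hostdev' managed='yes'>
  <mac address='52:54:00:aa:bb:cc'/>
  <source>
    <address type='pci' domain='0x0000' bus='0x41' slot='0x02' function='0x0'/>
  </source>
  <teaming type='transient' persistent='ua-sr-iov-backup'/>
</interface>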
I enabled debug logging, and found the following. On the source system,
it logs that the system is paused for the cutover:
2025-04-30 01:08:12.526+0000: 1696180: debug :
qemuMigrationAnyCompleted:1957 : Migration paused before switchover
At that point, for almost a minute, the source system just keeps
printing the same statistics:
2025-04-30 01:08:12.923+0000: 1696272: info :
qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_REPLY:
mon=0x7f8fdc0ad2f0 reply={"return": {"expected-downtime": 300, "status":
"device", "setup-time": 297, "total-time": 26107, "ram": {"total":
137452265472, "postcopy-requests": 0, "dirty-sync-count": 3,
"multifd-bytes": 2821784576, "pages-per-second": 297855,
"downtime-bytes": 13208, "page-size": 4096, "remaining": 0,
"postcopy-bytes": 0, "mbps": 9786.9158461538464, "transferred":
3117658825, "dirty-sync-missed-zero-copy": 0, "precopy-bytes":
295861041, "duplicate": 32874480, "dirty-pages-rate": 56, "skipped": 0,
"normal-bytes": 2804301824, "normal": 684644}}, "id": "libvirt-577"}
[...]
2025-04-30 01:09:06.290+0000: 1696272: info :
qemuMonitorJSONIOProcessLine:208 : QEMU_MONITOR_RECV_REPLY:
mon=0x7f8fdc0ad2f0 reply={"return": {"expected-downtime": 300, "status":
"device", "setup-time": 297, "total-time": 79474, "ram": {"total":
137452265472, "postcopy-requests": 0, "dirty-sync-count": 3,
"multifd-bytes": 2821784576, "pages-per-second": 297855,
"downtime-bytes": 13208, "page-size": 4096, "remaining": 0,
"postcopy-bytes": 0, "mbps": 9786.9158461538464, "transferred":
3117658825, "dirty-sync-missed-zero-copy": 0, "precopy-bytes":
295861041, "duplicate": 32874480, "dirty-pages-rate": 56, "skipped": 0,
"normal-bytes": 2804301824, "normal": 684644}}, "id": "libvirt-629"}
until finally it completes:
2025-04-30 01:09:06.327+0000: 1696272: info :
qemuMonitorJSONIOProcessLine:203 : QEMU_MONITOR_RECV_EVENT:
mon=0x7f8fdc0ad2f0 event={"timestamp": {"seconds": 1745975346,
"microseconds": 327382}, "event": "MIGRATION", "data": {"status": "completed"}}
On the destination side, it says something about negotiating failover
for the network link:
2025-04-30 01:08:12.923+0000: 1384503: info :
qemuMonitorJSONIOProcessLine:203 : QEMU_MONITOR_RECV_EVENT:
mon=0x7fc7900ab2f0 event={"timestamp": {"seconds": 1745975292,
"microseconds": 922783}, "event": "FAILOVER_NEGOTIATED", "data":
{"device-id": "ua-sr-iov-backup"}}
Then nothing happens for about a minute until it says it is done:
2025-04-30 01:09:06.328+0000: 1384503: debug :
qemuMonitorJSONIOProcessLine:189 : Line [{"timestamp": {"seconds":
1745975346, "microseconds": 327991}, "event": "MIGRATION", "data":
{"status": "completed"}}]
Any thoughts on what is going on here to cause this delay? It's clearly
somehow related to the SR-IOV component of the migration.
Thanks much…
19 hours, 31 minutes
virt-install iscsi direct - Target not found
by kgore4 une
When I try to use an iscsi-direct pool in a "--disk" clause for
virt-install, I get the error "iSCSI: Failed to connect to LUN : Failed to log
in to target. Status: Target not found(515)". I've seen that sort of error
before when the initiator name isn't used; the SAN returns different LUNs
depending on the initiator.
I've run out of ideas on what to try next. Any advice welcome. I've
included what I thought was relevant below.
klint.
The disk parameter to virt-install is (it's part of a script, but the
variables are correct when executed):
[code]
--disk
vol=${poolName}/unit:0:0:${vLun1},xpath.set="./source/initiator/iqn/@name='iqn.2024-11.localdomain.agbu.agbuvh1:${vName}'"
\
[/code]
I added the xpath.set because I noticed the initiator wasn't in the debug
output of the disk definition; it didn't work without it either.
The iscsi-direct pool is defined and appears to work: it's active, and
vol-list shows the correct LUNs.
Using --debug on virt-install, I can see the drive is detected early in the
process, since it has picked up the size of the drive.
[code]
[Wed, 19 Feb 2025 16:43:33 virt-install 2174805] DEBUG (cli:3554) Parsed
--disk volume as: pool=agbu-ldap1 vol=unit:0:0:3
[Wed, 19 Feb 2025 16:43:33 virt-install 2174805] DEBUG (disk:648)
disk.set_vol_object: volxml=
<volume type='network'>
<name>unit:0:0:3</name>
<key>ip-10.1.4.3:3260-iscsi-iqn.1992-09.com.seagate:01.array.00c0fff6c846-lun-3</key>
<capacity unit='bytes'>49996103168</capacity>
<allocation unit='bytes'>49996103168</allocation>
<target>
<path>ip-10.1.4.3:3260-iscsi-iqn.1992-09.com.seagate:01.array.00c0fff6c846-lun-3</path>
</target>
</volume>
[Wed, 19 Feb 2025 16:43:33 virt-install 2174805] DEBUG (disk:650)
disk.set_vol_object: poolxml=
<pool type='iscsi-direct'>
<name>agbu-ldap1</name>
<uuid>1c4ae810-9bae-433c-a92f-7d3501b6ba80</uuid>
<capacity unit='bytes'>49996103168</capacity>
<allocation unit='bytes'>49996103168</allocation>
<available unit='bytes'>0</available>
<source>
<host name='10.1.4.3'/>
<device path='iqn.1992-09.com.seagate:01.array.00c0fff6c846'/>
<initiator>
<iqn name='iqn.2024-11.localdomain.agbu.agbuvh1:agbu-ldap'/>
</initiator>
</source>
</pool>
[/code]
The generated initial_xml for the disk looks like
[code]
<disk type="network" device="disk">
<driver name="qemu" type="raw"/>
<source protocol="iscsi"
name="iqn.1992-09.com.seagate:01.array.00c0fff6c846-lun-3">
<host name="10.1.4.3"/>
<initiator>
<iqn name="iqn.2024-11.localdomain.agbu.agbuvh1:agbu-ldap"/>
</initiator>
</source>
<target dev="vda" bus="virtio"/>
</disk>
[/code]
The generated final_xml looks like
[code]
<disk type="network" device="disk">
<driver name="qemu" type="raw"/>
<source protocol="iscsi"
name="iqn.1992-09.com.seagate:01.array.00c0fff6c846-lun-3">
<host name="10.1.4.3"/>
<initiator>
<iqn name="iqn.2024-11.localdomain.agbu.agbuvh1:agbu-ldap"/>
</initiator>
</source>
<target dev="vda" bus="virtio"/>
</disk>
[/code]
The full error is
[code]
[Wed, 19 Feb 2025 16:43:35 virt-install 2174805] DEBUG (cli:256) File
"/usr/bin/virt-install", line 8, in <module>
virtinstall.runcli()
File "/usr/share/virt-manager/virtinst/virtinstall.py", line 1233, in
runcli
sys.exit(main())
File "/usr/share/virt-manager/virtinst/virtinstall.py", line 1226, in main
start_install(guest, installer, options)
File "/usr/share/virt-manager/virtinst/virtinstall.py", line 974, in
start_install
fail(e, do_exit=False)
File "/usr/share/virt-manager/virtinst/cli.py", line 256, in fail
log.debug("".join(traceback.format_stack()))
[Wed, 19 Feb 2025 16:43:35 virt-install 2174805] ERROR (cli:257) internal
error: process exited while connecting to monitor:
2025-02-19T05:43:35.075695Z qemu-system-x86_64: -blockdev
{"driver":"iscsi","portal":"10.1.4.3:3260","target":"iqn.1992-09.com.seagate:01.array.00c0fff6c846-lun-3","lun":0,"transport":"tcp","initiator-name":"iqn.2024-11.localdomain.agbu.agbuvh1:agbu-ldap","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"}:
iSCSI: Failed to connect to LUN : Failed to log in to target. Status:
Target not found(515)
[Wed, 19 Feb 2025 16:43:35 virt-install 2174805] DEBUG (cli:259)
Traceback (most recent call last):
File "/usr/share/virt-manager/virtinst/virtinstall.py", line 954, in
start_install
domain = installer.start_install(
^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/share/virt-manager/virtinst/install/installer.py", line 695,
in start_install
domain = self._create_guest(
^^^^^^^^^^^^^^^^^^^
File "/usr/share/virt-manager/virtinst/install/installer.py", line 637,
in _create_guest
domain = self.conn.createXML(initial_xml or final_xml, 0)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3/dist-packages/libvirt.py", line 4481, in createXML
raise libvirtError('virDomainCreateXML() failed')
libvirt.libvirtError: internal error: process exited while connecting to
monitor: 2025-02-19T05:43:35.075695Z qemu-system-x86_64: -blockdev
{"driver":"iscsi","portal":"10.1.4.3:3260","target":"iqn.1992-09.com.seagate:01.array.00c0fff6c846-lun-3","lun":0,"transport":"tcp","initiator-name":"iqn.2024-11.localdomain.agbu.agbuvh1:agbu-ldap","node-name":"libvirt-2-storage","auto-read-only":true,"discard":"unmap"}:
iSCSI: Failed to connect to LUN : Failed to log in to target. Status:
Target not found(515)
[/code]
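If it helps, I assume the same blockdev options could be exercised outside of
libvirt with qemu-img, something like (untested; options copied from the
error above):
[code]
# untested: feed qemu-img the same iscsi options qemu-system was given via -blockdev
qemu-img info --image-opts driver=iscsi,transport=tcp,portal=10.1.4.3:3260,target=iqn.1992-09.com.seagate:01.array.00c0fff6c846-lun-3,lun=0,initiator-name=iqn.2024-11.localdomain.agbu.agbuvh1:agbu-ldap
[/code]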
Things that could affect the answer:
* What I'm calling a SAN is a Seagate Exos X iSCSI unit
* virtual host is Debian 12
* virsh version 9.0.0
* iscsiadm version 2.1.8
20 hours, 4 minutes
Guest cannot access host HTTP service in NAT
by icefrog1950@gmail.com
Hi,
I've hit a libvirt networking problem where the guest cannot access an HTTP service on the host. I've debugged the issue and tried a few things, described below. Thanks for your suggestions!
Environment
-------------
guest IP: 192.168.122.46 (Linux, default NAT, installed using virt-manager)
host1 IP: 192.168.3.16 (CentOS 8.5 running libvirt and qemu, default libvirt iptables rules)
HTTP service: 192.168.3.16:70 (firewall rules already allow this port)
host2: 192.168.3.65:70 (for testing only)
guest network XML:
<interface type="network">
<mac address="52:54:00:f5:a8:9f"/>
<source network="default" portid="6e8ce7e7-6517-43fa-b113-aaddb6c1bc08" bridge="virbr0"/>
<target dev="vnet2"/>
<model type="e1000"/>
<alias name="net0"/>
<address type="pci" domain="0x0000" bus="0x00" slot="0x03" function="0x0"/>
</interface>
1. guest and host1/host2 can ping each other
2. guest can reach host2's HTTP service
3. guest cannot reach host1's HTTP service
When I capture traffic in the guest, Wireshark shows:
192.168.122.46 -> 192.168.3.16 // SYN ok
192.168.122.46 <- 192.168.3.16 // Destination unreachable (Port unreachable)
guest route table:
------------------
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.122.1 0.0.0.0 UG 100 0 0 eth0
192.168.122.0 0.0.0.0 255.255.255.0 U 100 0 0 eth0
host1 route table:
------------------
Destination Gateway Genmask Flags Metric Ref Use Iface
0.0.0.0 192.168.3.1 0.0.0.0 UG 100 0 0 enp3s0
192.168.3.0 0.0.0.0 255.255.255.0 U 100 0 0 enp3s0
192.168.122.0 0.0.0.0 255.255.255.0 UG 0 0 0 virbr0
I deleted the last route and added one to make sure traffic from host1 to the guests goes through 192.168.122.1:
------------------
Destination Gateway Genmask Flags Metric Ref Use Iface
(other rules omitted)
192.168.122.0 192.168.122.1 255.255.255.0 UG 0 0 0 virbr0
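That is, roughly:
# drop the kernel's connected route for the NAT subnet
ip route del 192.168.122.0/24 dev virbr0
# and route the subnet via the bridge address instead, matching the table above
ip route add 192.168.122.0/24 via 192.168.122.1 dev virbr0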
However, traceroute shows the new route does not take effect (traffic should go to 192.168.122.1 first), and the guest still cannot reach host1's HTTP service.
[!] guest traffic to http://192.168.3.16:70 does not go through 192.168.122.1
traceroute 192.168.122.46
traceroute to 192.168.122.46 (192.168.122.46), 30 hops max, 60 byte packets
1 192.168.122.46 (192.168.122.46) 0.200 ms 0.194 ms 0.194 ms
# guest traffic to http://192.168.3.65:70 goes through 192.168.122.1
traceroute 192.168.3.65
traceroute to 192.168.3.65 (192.168.3.65), 30 hops max, 60 byte packets
1 192.168.122.1 (192.168.122.1) 0.244 ms 0.050 ms 0.120 ms
2 192.168.3.65 (192.168.3.65) 0.823 ms !X 0.802 ms !X 0.789 ms !X
In sum, is there a way to force the guest to go through 192.168.122.1 when visiting the host machine?
-------------------
libvirt iptables rules:
-P INPUT ACCEPT
-P FORWARD ACCEPT
-P OUTPUT ACCEPT
-N LIBVIRT_INP
-N LIBVIRT_OUT
-N LIBVIRT_FWO
-N LIBVIRT_FWI
-N LIBVIRT_FWX
-A INPUT -j LIBVIRT_INP
-A FORWARD -j LIBVIRT_FWX
-A FORWARD -j LIBVIRT_FWI
-A FORWARD -j LIBVIRT_FWO
-A OUTPUT -j LIBVIRT_OUT
-A LIBVIRT_INP -i virbr0 -p udp -m udp --dport 53 -j ACCEPT
-A LIBVIRT_INP -i virbr0 -p tcp -m tcp --dport 53 -j ACCEPT
-A LIBVIRT_INP -i virbr0 -p udp -m udp --dport 67 -j ACCEPT
-A LIBVIRT_INP -i virbr0 -p tcp -m tcp --dport 67 -j ACCEPT
-A LIBVIRT_OUT -o virbr0 -p udp -m udp --dport 53 -j ACCEPT
-A LIBVIRT_OUT -o virbr0 -p tcp -m tcp --dport 53 -j ACCEPT
-A LIBVIRT_OUT -o virbr0 -p udp -m udp --dport 68 -j ACCEPT
-A LIBVIRT_OUT -o virbr0 -p tcp -m tcp --dport 68 -j ACCEPT
-A LIBVIRT_FWO -s 192.168.122.0/24 -i virbr0 -j ACCEPT
-A LIBVIRT_FWO -i virbr0 -j REJECT --reject-with icmp-port-unreachable # deleting this rule does not help
-A LIBVIRT_FWI -d 192.168.122.0/24 -o virbr0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A LIBVIRT_FWI -o virbr0 -j REJECT --reject-with icmp-port-unreachable # deleting this rule does not help
-A LIBVIRT_FWX -i virbr0 -o virbr0 -j ACCEPT
1 week, 1 day
Booting from DVD
by Vince Schielack III
I need to pass the host optical drive through to a guest and have the guest boot from it.
I can set the boot order and make sure the ROM BAR is enabled, but the VM won't boot from the DVD unless I explicitly select it in the boot menu (3rd option).
I’d like to do this to avoid a lengthy and unnecessary dd copy to an ISO. That would essentially double my system installation time by reading the DVD twice.
3 weeks
Windows 11 fails to install under QEMU/libvirt while VirtualBox succeeds — why?
by Paul Larochelle
Hello,
I'm writing to better understand a surprising discrepancy I encountered while attempting to install Windows 11 in different virtualization environments.
On my Arch Linux system, I tested two setups:
1. **VirtualBox (GUI):** Windows 11 installs successfully out of the box.
2. **QEMU/KVM with libvirt (manually crafted XML):** Windows 11 refuses to install, stating that the system doesn't meet the requirements.
The libvirt domain configuration includes:
- UEFI boot using OVMF (`OVMF_CODE.4m.fd` and `OVMF_VARS.ms.fd`)
- TPM 2.0 emulator (`tpm-crb` with `backend type='emulator' version='2.0'`)
- Secure Boot enabled (verified using Microsoft-signed vars)
- 8 GiB of RAM, 4 vCPUs
- VirtIO disk + virtio-win ISO attached
- QXL or VirtIO video model
- `<hyperv>` feature set enabled
- Valid boot order (CD-ROM first, then disk)
Despite this, Windows 11 either refuses installation with the "This PC can't run Windows 11" message or fails to detect a valid bootable device.
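For concreteness, the relevant parts of the domain XML look roughly like this (machine type, file paths, and NVRAM location are illustrative; this is a simplified sketch, not the full config):

```xml
<os>
  <type arch='x86_64' machine='q35'>hvm</type>
  <!-- OVMF firmware with Secure Boot support and Microsoft-signed vars template -->
  <loader readonly='yes' secure='yes' type='pflash'>/usr/share/edk2/x64/OVMF_CODE.4m.fd</loader>
  <nvram template='/usr/share/edk2/x64/OVMF_VARS.ms.fd'>/var/lib/libvirt/qemu/nvram/win11_VARS.fd</nvram>
  <boot dev='cdrom'/>
  <boot dev='hd'/>
</os>
<features>
  <acpi/>
  <apic/>
  <!-- SMM is required when the loader has secure='yes' -->
  <smm state='on'/>
  <hyperv>
    <relaxed state='on'/>
    <vapic state='on'/>
    <spinlocks state='on' retries='8191'/>
  </hyperv>
</features>
<devices>
  <!-- software TPM 2.0 (swtpm emulator) -->
  <tpm model='tpm-crb'>
    <backend type='emulator' version='2.0'/>
  </tpm>
</devices>
```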
In contrast, VirtualBox seems to pass all checks without exposing TPM configuration explicitly or enabling Secure Boot manually.
---
**My question:**
What is VirtualBox doing under the hood that makes Windows 11 accept the environment without issues?
- Is it exposing a minimal TPM implicitly?
- Is it modifying SMBIOS/ACPI fields in a way that satisfies Windows validation logic?
- Are there known tricks or missing XML elements in libvirt domains to replicate this behavior?
My goal is not to bypass Microsoft's requirements, but rather to understand the technical differences and replicate a compliant setup in QEMU/libvirt, ideally without resorting to ISO modifications.
Any insight or guidance would be highly appreciated.
Best regards, Paul
3 weeks
Cannot restore internal snapshots since libvirt 11.2.0
by anonym
Hi!
Since libvirt 11.2.0, attempting to restore internal snapshots fails for
me with:
error: Failed to revert snapshot foo1
error: operation failed: load of internal snapshot 'foo1' job
failed: Device 'libvirt-1-format' is writable but does not support snapshots
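For context, this is a plain create-and-revert of an internal snapshot on a qcow2-backed domain, essentially (domain name illustrative):

# create an internal snapshot, then try to revert to it
virsh snapshot-create-as mydomain foo1
virsh snapshot-revert mydomain foo1   # fails with the error above since 11.2.0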
I reported this issue on GitLab [0], where there is more info and a
reproducer, but I also wanted to ask the wider community here: has anyone
experienced this and maybe even come up with a workaround?
[0] https://gitlab.com/libvirt/libvirt/-/issues/771
Cheers!
4 weeks, 1 day