[libvirt] Question about verifying same uid:gid in src and dst for live migration

Hi,

When I do a live migration with the virsh command line, using NFS shared storage between two systems that have the same security mechanism and the same kvm/qemu/libvirt versions, I encounter the following error:

    debug : qemuMonitorJSONIOProcessLine:193 : Line [{"timestamp": {"seconds": 1524893525, "microseconds": 522686}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk0", "nospace": false, "node-name": "#block120", "reason": "Permission denied", "operation": "write", "action": "report"}}]
    ...
    error: internal error: qemu unexpectedly closed the monitor: qemu-system-x86_64: load of migration failed: Input/output error
    ...

Based on the "Permission denied" and "write" information, I found two ways to fix this error:

- Change the mode of the guest's .qcow2 file from 644 to 646
- Keep qemu's uid the same on the src host and the dst host (they were not the same before I changed them)

My environment and test cases:

    src:~ # id qemu
    uid=473(qemu) gid=476(qemu) groups=488(kvm),476(qemu)
    dst:~ # id qemu
    uid=467(qemu) gid=470(qemu) groups=488(kvm),470(qemu)

In /etc/libvirt/qemu.conf, my configuration is the following default:

    # The user for QEMU processes run by the system instance. It can be
    # specified as a user name or as a user id. The qemu driver will try to
    # parse this value first as a name and then, if the name doesn't exist,
    # as a user id.
    #
    # Since a sequence of digits is a valid user name, a leading plus sign
    # can be used to ensure that a user id will not be interpreted as a user
    # name.
    #
    # Some examples of valid values are:
    #
    #       user = "qemu"   # A user named "qemu"
    #       user = "+0"     # Super user (uid=0)
    #       user = "100"    # A user named "100" or a user with uid=100
    #
    #user = "root"

    # The group for QEMU processes run by the system instance. It can be
    # specified in a similar way to user.
    #group = "root"

    # Whether libvirt should dynamically change file ownership
    # to match the configured user/group above. Defaults to 1.
    # Set to 0 to disable file ownership changes.
    #dynamic_ownership = 1

On the src, I do a live migration with "virsh -d 0 migrate --live vm-name qemu+ssh://dst-ip/system":

- after a vm is defined, user:group=root:root
- after a vm is started, user:group=qemu:qemu
- after migration begins, user:group=467:470 (that is dst's uid:gid)
- after migration succeeds, user:group=467:470 (that is dst's uid:gid)
- after a vm is destroyed, user:group=root:root (back to the src's)
- after migration fails, user:group=467:470; the vm is still running on src, but the file inside the guest becomes read-only even though its mode is 644

Other notes:

- I ran the same test with libvirt v3.3.0 and v4.0.0; both show this error.

Having confirmed that keeping qemu's uid identical on the src host and the dst host fixes the issue, my question is whether a fix in libvirt should be pursued, or whether it is enough to document the requirement for the same uid:gid across the host systems in a migration cluster.

BTW, if a fix is needed, maybe the pre-migration checks in libvirt could detect a different uid and/or gid and fail sooner with a better, explicit error like "Should keep the qemu uid in src and dst be the same for migration, or else migration will fail"?

Has anyone noticed this, and could you give some suggestions? Thanks a lot!

Have a nice day, thanks again
Fei
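The early check proposed above can be sketched outside libvirt. This is only an illustration, not libvirt code: it assumes the output of `id qemu` has already been collected from each host (for example over ssh), and the function names `parse_id_output` and `check_migration_ids` are hypothetical.

```python
import re


def parse_id_output(text):
    """Extract (uid, gid) from one line of `id <user>` output."""
    uid = re.search(r"\buid=(\d+)", text)
    gid = re.search(r"\bgid=(\d+)", text)
    if uid is None or gid is None:
        raise ValueError(f"not `id` output: {text!r}")
    return int(uid.group(1)), int(gid.group(1))


def check_migration_ids(src_id_output, dst_id_output):
    """Return None if uid:gid match on both hosts, else an error string."""
    src = parse_id_output(src_id_output)
    dst = parse_id_output(dst_id_output)
    if src == dst:
        return None
    return (
        "qemu uid:gid differs between hosts (src %d:%d, dst %d:%d); "
        "live migration on shared storage is likely to fail" % (src + dst)
    )


# The two `id qemu` lines reported in this message.
SRC = "uid=473(qemu) gid=476(qemu) groups=488(kvm),476(qemu)"
DST = "uid=467(qemu) gid=470(qemu) groups=488(kvm),470(qemu)"

if __name__ == "__main__":
    print(check_migration_ids(SRC, DST))
```

Run against the two hosts in this report, the check returns an error string naming both uid:gid pairs, so the mismatch would be visible before the migration starts.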

On Wed, May 09, 2018 at 01:45:53PM +0800, Fei Li wrote:
Hi,
When I do live migration using virsh command line based on NFS shared storage between two systems having the same security mechanism and having the same kvm/qemu/libvirt version, I encounter the following error:
debug : qemuMonitorJSONIOProcessLine:193 : Line [{"timestamp": {"seconds": 1524893525, "microseconds": 522686}, "event": "BLOCK_IO_ERROR", "data": {"device": "drive-virtio-disk0", "nospace": false, "node-name": "#block120", "reason": "Permission denied", "operation": "write", "action": "report"}}] ... error: internal error: qemu unexpectedly closed the monitor: qemu-system-x86_64: load of migration failed: Input/output error ...
According to the "Permission denied" && "write" information, I find the below 2 ways can fix this error:
- Change the mode of guest's .qcow2 file from 644 to 646
Absolutely not - any process or user that can access the mount could then compromise your disk images.
- Keep qemu's uid the same one between src host and dst host (They are not same before I change them)
You *must* have the same uid+gid between source and dest hosts
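A simplified model of the classic owner/group/other permission check shows why both observations in this exchange hold. It is a sketch only: it ignores ACLs, root privileges and any NFS-specific ID mapping, and the helper name `may_write` is hypothetical. The numeric IDs are the ones reported in the original message.

```python
import stat


def may_write(file_uid, file_gid, mode, proc_uid, proc_gids):
    """Owner/group/other write check (no ACLs, no root override)."""
    if proc_uid == file_uid:
        return bool(mode & stat.S_IWUSR)
    if file_gid in proc_gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# Image ownership after migration begins: dst's qemu ids, 467:470.
IMAGE_UID, IMAGE_GID = 467, 470

# qemu on src runs as 473 with groups 476 and 488 (kvm).
SRC_QEMU = (473, {476, 488})
# qemu on dst runs as 467 with groups 470 and 488 (kvm).
DST_QEMU = (467, {470, 488})

if __name__ == "__main__":
    # Mode 644: src's qemu falls into "other" and cannot write.
    print(may_write(IMAGE_UID, IMAGE_GID, 0o644, *SRC_QEMU))
    # Mode 644: dst's qemu is the owner and can write.
    print(may_write(IMAGE_UID, IMAGE_GID, 0o644, *DST_QEMU))
    # Mode 646: src's qemu can write, but so can every other account.
    print(may_write(IMAGE_UID, IMAGE_GID, 0o646, *SRC_QEMU))
    print(may_write(IMAGE_UID, IMAGE_GID, 0o646, 1000, {1000}))
```

Under this model, mode 646 only appears to fix the problem because it grants write access to "other", which is every account that can reach the mount. Matching uid:gid on both hosts keeps the image at 644 while qemu stays the owner on either side.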
After confirming that keeping qemu's uid identical between src host and dst host can fix such issue, my question is whether a fix in libvirt should be pursued or just document the requirement for same uid:gid across host systems in a migration cluster is ok?
In Fedora and RHEL at least, the system is set up so that these users get a fixed uid:gid upon installation, to avoid this kind of problem.

Regards,
Daniel
-- 
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Wed, May 9, 2018 at 7:45 AM, Fei Li <fli@suse.com> wrote:
[...]
Hi Fei Li,

Yes, we have seen this issue and fixed it downstream. IMHO there is not much that libvirt as an upstream project can (or should) do about UID allocation on different hosts.

For Debian/Ubuntu this was fixed [1][2] by ensuring that the relevant user always gets the same IDs, via a pre-reserved UID/GID assigned on install. I'd think you might want to implement something similar, but that is up to your consideration.

[1]: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=844339
[2]: https://bugs.launchpad.net/ubuntu/+source/libvirt/+bug/1637601
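Whether a host actually follows such a reserved-ID scheme can be checked with a few lines. The sketch below is an assumption-laden illustration: it works on passwd(5)-formatted text passed in as a string, the helper names `lookup_ids` and `has_reserved_ids` are hypothetical, and the "reserved" pair 473:476 is simply the src host's IDs from this thread, not a value any distribution reserves.

```python
def lookup_ids(passwd_text, user):
    """Return (uid, gid) for `user` from passwd(5)-formatted text, or None."""
    for line in passwd_text.splitlines():
        fields = line.split(":")
        if len(fields) >= 4 and fields[0] == user:
            return int(fields[2]), int(fields[3])
    return None


def has_reserved_ids(passwd_text, user, expected_uid, expected_gid):
    """True if `user` exists and has exactly the expected uid:gid."""
    return lookup_ids(passwd_text, user) == (expected_uid, expected_gid)


# Illustrative passwd entries built from the IDs reported in this thread.
SRC_PASSWD = "root:x:0:0:root:/root:/bin/bash\nqemu:x:473:476:qemu user:/:/sbin/nologin\n"
DST_PASSWD = "root:x:0:0:root:/root:/bin/bash\nqemu:x:467:470:qemu user:/:/sbin/nologin\n"

if __name__ == "__main__":
    # Treat 473:476 as the (hypothetical) reserved pair for "qemu".
    print(has_reserved_ids(SRC_PASSWD, "qemu", 473, 476))
    print(has_reserved_ids(DST_PASSWD, "qemu", 473, 476))
```

On a real host the text could come from reading /etc/passwd, and the expected pair would be whatever the distribution or site policy reserves for the QEMU user.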
-- 
Christian Ehrhardt
Software Engineer, Ubuntu Server
Canonical Ltd
participants (3): Christian Ehrhardt, Daniel P. Berrangé, Fei Li