
On Wed, Mar 09, 2016 at 01:01:40PM -0500, Lars Kellogg-Stedman wrote:
I ran into an odd problem today. I wanted to share it here in the hopes of maybe saving someone else some lost time.
When you run libvirtd as an unprivileged user (e.g., if you target qemu:///session from a non-root account), then libvirt will open a unix domain socket in one of two places:
- If XDG_RUNTIME_DIR is defined, then at $XDG_RUNTIME_DIR/libvirt/libvirt-sock
- If XDG_RUNTIME_DIR is *not* defined, then at $HOME/.cache/libvirt/libvirt-sock
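To make the two cases concrete, here is a minimal shell sketch of the lookup rule described above. The function name libvirt_session_socket is made up for illustration, and treating an empty XDG_RUNTIME_DIR the same as an unset one is an assumption of this sketch; it is not taken from libvirt's own code.

```shell
# Hypothetical helper: print the path where a session-mode socket would be
# expected, following the two rules listed above.
# Assumption: an empty XDG_RUNTIME_DIR is treated like an unset one.
libvirt_session_socket() {
    if [ -n "${XDG_RUNTIME_DIR:-}" ]; then
        printf '%s\n' "$XDG_RUNTIME_DIR/libvirt/libvirt-sock"
    else
        printf '%s\n' "$HOME/.cache/libvirt/libvirt-sock"
    fi
}
```

Running libvirt_session_socket once from an ssh login and once after `su -` should show whether the two shells disagree about the socket location.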
On a CentOS 7 system, at least, if you ssh directly into an account, XDG_RUNTIME_DIR is set. But! If you `su -` to the account from root, e.g.:
# su - stack
Then XDG_RUNTIME_DIR is *not* set.
I see, I didn't realize this. See below for a quick test, based on what you mentioned.
The problem is a little subtle, because most operations you will perform will work just fine in both cases: you can query for defined but not active guests, storage pools, volumes, and so forth without a problem and you'll get the same answer.
Let's put this to the test.

I'm on a root shell:

    $ whoami
    root

`su -` into a user:

    $ su - kashyapc
    $ echo $XDG_RUNTIME_DIR

Try to enumerate instances; this fails on the first attempt:

    $ virsh list
    error: failed to connect to the hypervisor
    error: no valid connection
    error: Failed to connect socket to '/home/kashyapc/.cache/libvirt/libvirt-sock': No such file or directory

The above step seems to have prompted libvirt to create the socket:

    $ file ~/.cache/libvirt/libvirt-sock
    /home/kashyapc/.cache/libvirt/libvirt-sock: socket

So, the second attempt to enumerate instances works just fine, since the socket has been created.

* * *
The problem crops up when you start a guest, which results in a persistent libvirtd process. Now, depending on how you got to your account, you will either (a) talk to the persistent process, and you'll be able to see the running guests, or (b) you'll end up spawning a new ephemeral libvirtd process listening in the *other* location, and you won't see anything, and you will wonder why there is a qemu process running for your guest but it's not showing up in "virsh list" and what the heck is going on here.
Test-2
------

If I don't `su -` to get to my account at first, but spawn a new shell (Ctrl + Shift + t), the XDG_RUNTIME_DIR variable is set:

    $ echo $XDG_RUNTIME_DIR
    /run/user/1000

    $ virsh list --all
     Id    Name                           State
    ----------------------------------------------------
     -     vm1                            shut off

    $ virsh start vm1
    Domain vm1 started

And the socket is created under /run/user/1000/libvirt:

    $ ls /run/user/1000/libvirt/
    hostdevmgr  libvirtd.pid  libvirt-sock  network  qemu  storage

    $ ls ~/.cache/libvirt/
    hostdevmgr  libvirt  network  qemu  storage  virsh

Then, continuing in the same shell, do:

    $ sudo -i
    $ su - kashyapc

Now, again try to enumerate the instance we started a few steps above (this fails, as virsh is looking for the socket in the other location):

    $ virsh list
    error: failed to connect to the hypervisor
    error: no valid connection
    error: Failed to connect socket to '/home/kashyapc/.cache/libvirt/libvirt-sock': No such file or directory

But the socket gets created anyway, so try to enumerate the instance again:

    $ virsh list

    $ virsh list --all
     Id    Name                           State
    ----------------------------------------------------
     -     vm1                            shut off

As you've observed, while the QEMU process for the above VM is still intact, from libvirt's perspective the instance is (a) not enumerated with `virsh list`; and (b) when the '--all' flag is supplied to `virsh list`, the VM is listed as "shut off", which can cause more confusion.

(Finally: in the above session, if I log out of the `su -`'ed user session and the root session, then I'm back to the 'pristine shell state' where libvirt behaves 'properly'.)
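A quick way to tell which situation you are in is to look at both candidate locations at once. The following is only a rough sketch: the function name check_session_sockets is made up, and guessing the runtime directory as /run/user/<uid> is an assumption based on the /run/user/1000 path seen above; it may not hold on every system.

```shell
# Hypothetical helper: report whether a socket file exists at each of the two
# candidate session locations.
#   $1 = runtime directory (e.g. /run/user/1000), $2 = home directory
# Note: -S only checks that the path is a socket file; it does not prove that
# a libvirtd process is still listening on it.
check_session_sockets() {
    for sock in "$1/libvirt/libvirt-sock" "$2/.cache/libvirt/libvirt-sock"; do
        if [ -S "$sock" ]; then
            echo "socket present: $sock"
        else
            echo "no socket:      $sock"
        fi
    done
}
```

For example:

    $ check_session_sockets "/run/user/$(id -u)" "$HOME"

If a socket shows up in the location your current shell is *not* using, that is a hint the running guest belongs to a libvirtd started from the other kind of login.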
I don't know if there's a good solution to this, but the failure mode is really non-obvious.
This seems worth filing a bug for.

-- 
/kashyap