On Wed, Mar 09, 2016 at 01:01:40PM -0500, Lars Kellogg-Stedman wrote:
I ran into an odd problem today. I wanted to share it here in the
hopes of maybe saving someone else some lost time.
When you run libvirtd as an unprivileged user (e.g., if you target
qemu:///session from a non-root account), then libvirt will open a
unix domain socket in one of two places:
- If XDG_RUNTIME_DIR is defined, then inside
$XDG_RUNTIME_DIR/libvirt/libvirt-sock
- If XDG_RUNTIME_DIR is *not* defined, then inside
$HOME/.cache/libvirt/libvirt-sock
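
The two rules above can be sketched as a small shell function. This is only an illustration of the lookup described here, not libvirt's own code: `libvirt_session_socket` is a hypothetical helper name, and treating an empty XDG_RUNTIME_DIR the same as an unset one is an assumption.

```shell
# Hypothetical helper (not part of libvirt): print the session socket
# path a client would be expected to use, following the two rules above.
# Assumption: an empty XDG_RUNTIME_DIR is treated like an unset one.
libvirt_session_socket() {
    if [ -n "${XDG_RUNTIME_DIR:-}" ]; then
        printf '%s\n' "$XDG_RUNTIME_DIR/libvirt/libvirt-sock"
    else
        printf '%s\n' "$HOME/.cache/libvirt/libvirt-sock"
    fi
}

# Show the path expected in the current shell environment.
libvirt_session_socket
```

Running this once after a direct ssh login and once after `su -` should make the difference between the two environments visible at a glance.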
With a CentOS 7 system, at least, if you ssh directly into an
account, XDG_RUNTIME_DIR is set. But! If you `su -` to the account
from root, e.g:
# su - stack
Then XDG_RUNTIME_DIR is *not* set.
I see, I didn't realize this. See below for a quick test based on what
you mentioned.
The problem is a little subtle, because most operations you will
perform will work just fine in both cases: you can query for defined
but not active guests, storage pools, volumes, and so forth without a
problem and you'll get the same answer.
Let's put this to the test.
I'm on a root shell:
$ whoami
root
`su -` into a user:
$ su - kashyapc
$ echo $XDG_RUNTIME_DIR
Try to enumerate instances; this fails on the first attempt:
$ virsh list
error: failed to connect to the hypervisor
error: no valid connection
error: Failed to connect socket to
'/home/kashyapc/.cache/libvirt/libvirt-sock': No such file or directory
The above step seems to have prompted libvirt to create the socket:
$ file ~/.cache/libvirt/libvirt-sock
/home/kashyapc/.cache/libvirt/libvirt-sock: socket
So, the second attempt to enumerate instances works just fine, since the
socket now exists.
* * *
The problem crops up when you start a guest, which results in a
persistent libvirtd process. Now, depending on how you got to your
account, you will either (a) talk to the persistent process, and
you'll be able to see the running guests, or (b) you'll end up
spawning a new ephemeral libvirtd process listening in the *other*
location, and you won't see anything, and you will wonder why there is
a qemu process running for your guest but it's not showing up in
"virsh list" and what the heck is going on here.
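
When this happens, a quick check of both candidate locations can show which socket a running daemon is actually listening on. The sketch below is a rough diagnostic, not anything shipped with libvirt: `check_libvirt_sockets` is a hypothetical name, and `/run/user/<uid>` is assumed to be the runtime directory a regular login would have set.

```shell
# Hypothetical diagnostic: report which of the two candidate session
# sockets exist. Arguments are optional and default to the assumed
# runtime directory (/run/user/<uid>) and the current $HOME.
check_libvirt_sockets() {
    runtime_dir=${1:-/run/user/$(id -u)}
    home_dir=${2:-$HOME}
    for sock in "$runtime_dir/libvirt/libvirt-sock" \
                "$home_dir/.cache/libvirt/libvirt-sock"; do
        if [ -S "$sock" ]; then
            echo "present: $sock"
        else
            echo "absent:  $sock"
        fi
    done
}

check_libvirt_sockets

# List libvirtd processes owned by this user, with their command lines.
pgrep -u "$(id -u)" -a libvirtd || echo "no libvirtd found for this user"
```

If the socket under the runtime directory is present while the shell has no XDG_RUNTIME_DIR set, that would be consistent with case (b) above.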
Test-2
------
If I don't `su -` to get to my account at first, but spawn a new shell
(Ctrl + Shift + t) instead, the XDG_RUNTIME_DIR variable is set:
$ echo $XDG_RUNTIME_DIR
/run/user/1000
$ virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     vm1                            shut off
$ virsh start vm1
Domain vm1 started
And the socket is created under /run/user/1000/libvirt:
$ ls /run/user/1000/libvirt/
hostdevmgr libvirtd.pid libvirt-sock network qemu storage
$ ls ~/.cache/libvirt/
hostdevmgr libvirt network qemu storage virsh
Then, continuing in the same shell, do:
$ sudo -i
$ su - kashyapc
Now, try again to enumerate the instance we started a few steps above
(it fails, as virsh is looking for the socket in the other location):
$ virsh list
error: failed to connect to the hypervisor
error: no valid connection
error: Failed to connect socket to
'/home/kashyapc/.cache/libvirt/libvirt-sock': No such file or directory
But the socket gets created anyway, so try to enumerate the instance again:
$ virsh list
$ virsh list --all
 Id    Name                           State
----------------------------------------------------
 -     vm1                            shut off
As you've observed, while the QEMU process for the above VM is still
intact, from libvirt's perspective the instance is (a) not enumerated
with `virsh list`; and (b) when the '--all' flag is supplied to `virsh
list`, listed as "shut off", which can cause more confusion.
(Finally: in the above session, if I log out of the `su -`'ed user
session and the root session, then I'm back to the 'pristine shell state'
where libvirt behaves 'properly'.)
I don't know if there's a good solution to this, but the failure mode
is really non-obvious.
This seems worth filing a bug for.
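
In the meantime, one possible workaround after `su -` might be to point XDG_RUNTIME_DIR back at the directory a regular login would have used. This is an untested sketch, not an official fix: `restore_xdg_runtime_dir` is a hypothetical name, and it assumes the runtime directory is `/run/user/<uid>` (as in the /run/user/1000 example above) and that it already exists and is owned by the current user.

```shell
# Hypothetical workaround: if XDG_RUNTIME_DIR is unset or empty, set it
# to the assumed per-user runtime directory, but only when that
# directory exists and is owned by the current user.
restore_xdg_runtime_dir() {
    candidate=${1:-/run/user/$(id -u)}
    if [ -z "${XDG_RUNTIME_DIR:-}" ] && [ -d "$candidate" ] && [ -O "$candidate" ]; then
        XDG_RUNTIME_DIR=$candidate
        export XDG_RUNTIME_DIR
    fi
}

restore_xdg_runtime_dir
echo "XDG_RUNTIME_DIR=${XDG_RUNTIME_DIR:-<unset>}"
```

With the variable restored, `virsh list` should look for the socket under the runtime directory again and find the already-running libvirtd. If the directory does not exist (for example, when the user has no regular login session), the function leaves the environment untouched.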
--
/kashyap