The /etc/libvirt/libvirtd.conf file was modified to include the following:
log_level = 1 log_filters="3:remote 4:event 3:json 3:rpc"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"
The service was then restarted, and I ran "tail -f" against the log to determine
exactly which entries were being created as a result of specific actions on my part.
I then reproduced the error.
I ran "virsh console proxy2" expecting it to tell me whether the VM was running,
and that resulted in the VM booting. (So it's not just "list --all"
that's causing the unexpected boots.) That resulted in the following chunk of data
being written to the log:
2018-12-12 18:47:57.361+0000: 21264: debug : virNetlinkEventCallback:707 : dispatching to
max 0 clients, called from event watch 72018-12-12 18:47:57.361+0000: 21264: debug :
virNetlinkEventCallback:720 : event not handled.2018-12-12 18:47:57.361+0000: 21264: debug
: virNetlinkEventCallback:707 : dispatching to max 0 clients, called from event watch
72018-12-12 18:47:57.361+0000: 21264: debug : virNetlinkEventCallback:720 : event not
handled.2018-12-12 18:47:57.361+0000: 21264: debug : virNetlinkEventCallback:707 :
dispatching to max 0 clients, called from event watch 72018-12-12 18:47:57.361+0000:
21264: debug : virNetlinkEventCallback:720 : event not handled.2018-12-12
18:47:57.373+0000: 21323: debug : udevHandleOneDevice:1677 : udev action:
'add'2018-12-12 18:47:57.373+0000: 21323: debug : udevGetDeviceProperty:149 :
Found property key 'DRIVER' value '<null>' for device with sysname
'tap3'2018-12-12 18:47:57.374+0000: 21323: debug : udevGetDeviceProperty:149 :
Found property key 'SUBSYSTEM' value 'net' for device with sysname
'tap3'2018-12-12 18:47:57.374+0000: 21323: debug : udevGetDeviceProperty:149 :
Found property key 'INTERFACE' value 'tap3' for device with sysname
'tap3'2018-12-12 18:47:57.374+0000: 21323: debug : udevGetDeviceSysfsAttr:217 :
Found sysfs attribute 'address' value 'fe:52:16:f6:1a:8d' for device with
sysname 'tap3'2018-12-12 18:47:57.374+0000: 21323: debug :
udevGetDeviceSysfsAttr:217 : Found sysfs attribute 'addr_len' value '6'
for device with sysname 'tap3'2018-12-12 18:47:57.374+0000: 21323: debug :
virFileClose:110 : Closed fd 292018-12-12 18:47:57.375+0000: 21323: debug :
virPCIGetDeviceAddressFromSysfsLink:2624 : '/sys/class/net/tap3/device' does not
exist2018-12-12 18:47:57.375+0000: 21323: debug : virFileClose:110 : Closed fd
292018-12-12 18:47:57.375+0000: 21323: info : virObjectNew:254 : OBJECT_NEW:
obj=0x7f39f8001f70 classname=virNodeDeviceObj2018-12-12 18:47:57.375+0000: 21323: info :
virObjectRef:388 : OBJECT_REF: obj=0x7f39f8001f702018-12-12 18:47:57.375+0000: 21323: info
: virObjectNew:254 : OBJECT_NEW: obj=0x7f39f8002030
classname=virNodeDeviceEventLifecycle2018-12-12 18:47:57.375+0000: 21323: info :
virObjectUnref:350 : OBJECT_UNREF: obj=0x7f39f8001f702018-12-12 18:47:57.375+0000: 21323:
info : virObjectUnref:350 : OBJECT_UNREF: obj=0x7f39f80020302018-12-12 18:47:57.375+0000:
21323: info : virObjectUnref:352 : OBJECT_DISPOSE: obj=0x7f39f80020302018-12-12
18:47:57.375+0000: 21323: debug : udevHandleOneDevice:1677 : udev action:
'add'2018-12-12 18:47:57.375+0000: 21323: debug : udevGetDeviceProperty:149 :
Found property key 'DRIVER' value '<null>' for device with sysname
'rx-0'2018-12-12 18:47:57.375+0000: 21323: debug : udevGetDeviceProperty:149 :
Found property key 'SUBSYSTEM' value 'queues' for device with sysname
'rx-0'2018-12-12 18:47:57.375+0000: 21323: debug : udevGetDeviceType:1349 : Could
not determine device type for device with sysfs name 'rx-0'2018-12-12
18:47:57.375+0000: 21323: debug : udevAddOneDevice:1541 : Discarding device -1
0x7f39f8001900 /sys/devices/virtual/net/tap3/queues/rx-02018-12-12 18:47:57.375+0000:
21323: debug : udevHandleOneDevice:1677 : udev action: 'add'2018-12-12
18:47:57.375+0000: 21323: debug : udevGetDeviceProperty:149 : Found property key
'DRIVER' value '<null>' for device with sysname
'tx-0'2018-12-12 18:47:57.375+0000: 21323: debug : udevGetDeviceProperty:149 :
Found property key 'SUBSYSTEM' value 'queues' for device with sysname
'tx-0'2018-12-12 18:47:57.375+0000: 21323: debug : udevGetDeviceType:1349 : Could
not determine device type for device with sysfs name 'tx-0'2018-12-12
18:47:57.375+0000: 21323: debug : udevAddOneDevice:1541 : Discarding device -1
0x7f39f8001900 /sys/devices/virtual/net/tap3/queues/tx-02018-12-12 18:47:57.460+0000:
21264: debug : virNetlinkEventCallback:707 : dispatching to max 0 clients, called from
event watch 72018-12-12 18:47:57.460+0000: 21264: debug : virNetlinkEventCallback:720 :
event not handled.2018-12-12 18:47:57.464+0000: 21323: debug : udevHandleOneDevice:1677 :
udev action: 'change'2018-12-12 18:47:57.464+0000: 21323: debug :
udevGetDeviceProperty:149 : Found property key 'DRIVER' value
'<null>' for device with sysname 'kvm'2018-12-12 18:47:57.464+0000:
21323: debug : udevGetDeviceProperty:149 : Found property key 'SUBSYSTEM' value
'misc' for device with sysname 'kvm'2018-12-12 18:47:57.464+0000: 21323:
debug : udevGetDeviceType:1349 : Could not determine device type for device with sysfs
name 'kvm'2018-12-12 18:47:57.464+0000: 21323: debug : udevAddOneDevice:1541 :
Discarding device -1 0x7f39f8001900 /sys/devices/virtual/misc/kvm
I then ran "virsh shutdown proxy2" to stop the VM, resulting in the following
log output:
2018-12-12 18:49:24.556+0000: 21264: debug : virNetlinkEventCallback:707 : dispatching to
max 0 clients, called from event watch 72018-12-12 18:49:24.556+0000: 21264: debug :
virNetlinkEventCallback:720 : event not handled.2018-12-12 18:49:24.556+0000: 21264: debug
: virNetlinkEventCallback:707 : dispatching to max 0 clients, called from event watch
72018-12-12 18:49:24.556+0000: 21264: debug : virNetlinkEventCallback:720 : event not
handled.2018-12-12 18:49:24.556+0000: 21264: debug : virNetlinkEventCallback:707 :
dispatching to max 0 clients, called from event watch 72018-12-12 18:49:24.556+0000:
21264: debug : virNetlinkEventCallback:720 : event not handled.2018-12-12
18:49:24.561+0000: 21323: debug : udevHandleOneDevice:1677 : udev action:
'remove'2018-12-12 18:49:24.561+0000: 21323: debug : udevRemoveOneDevice:1409 :
Failed to find device to remove that has udev name
'/sys/devices/virtual/net/tap3/queues/tx-0'2018-12-12 18:49:24.561+0000: 21323:
debug : udevHandleOneDevice:1677 : udev action: 'remove'2018-12-12
18:49:24.562+0000: 21323: debug : udevRemoveOneDevice:1409 : Failed to find device to
remove that has udev name '/sys/devices/virtual/net/tap3/queues/rx-0'2018-12-12
18:49:24.562+0000: 21323: debug : udevHandleOneDevice:1677 : udev action:
'remove'2018-12-12 18:49:24.562+0000: 21323: info : virObjectRef:388 : OBJECT_REF:
obj=0x7f39f8001f702018-12-12 18:49:24.562+0000: 21323: info : virObjectNew:254 :
OBJECT_NEW: obj=0x7f39f80016f0 classname=virNodeDeviceEventLifecycle2018-12-12
18:49:24.562+0000: 21323: debug : udevRemoveOneDevice:1419 : Removing device
'net_tap3_fe_52_16_f6_1a_8d' with sysfs path
'/sys/devices/virtual/net/tap3'2018-12-12 18:49:24.562+0000: 21323: info :
virObjectRef:388 : OBJECT_REF: obj=0x7f39f8001f702018-12-12 18:49:24.562+0000: 21323: info
: virObjectUnref:350 : OBJECT_UNREF: obj=0x7f39f8001f702018-12-12 18:49:24.563+0000:
21323: info : virObjectUnref:350 : OBJECT_UNREF: obj=0x7f39f8001f702018-12-12
18:49:24.563+0000: 21323: info : virObjectUnref:350 : OBJECT_UNREF:
obj=0x7f39f8001f702018-12-12 18:49:24.563+0000: 21323: info : virObjectUnref:352 :
OBJECT_DISPOSE: obj=0x7f39f8001f702018-12-12 18:49:24.563+0000: 21323: info :
virObjectUnref:350 : OBJECT_UNREF: obj=0x7f39f80016f02018-12-12 18:49:24.563+0000: 21323:
info : virObjectUnref:352 : OBJECT_DISPOSE: obj=0x7f39f80016f02018-12-12
18:49:24.633+0000: 21264: debug : virNetlinkEventCallback:707 : dispatching to max 0
clients, called from event watch 72018-12-12 18:49:24.633+0000: 21264: debug :
virNetlinkEventCallback:720 : event not handled.2018-12-12 18:49:24.646+0000: 21323: debug
: udevHandleOneDevice:1677 : udev action: 'change'2018-12-12 18:49:24.646+0000:
21323: debug : udevGetDeviceProperty:149 : Found property key 'DRIVER' value
'<null>' for device with sysname 'kvm'2018-12-12 18:49:24.648+0000:
21323: debug : udevGetDeviceProperty:149 : Found property key 'SUBSYSTEM' value
'misc' for device with sysname 'kvm'2018-12-12 18:49:24.648+0000: 21323:
debug : udevGetDeviceType:1349 : Could not determine device type for device with sysfs
name 'kvm'2018-12-12 18:49:24.648+0000: 21323: debug : udevAddOneDevice:1541 :
Discarding device -1 0x7f39f8001560 /sys/devices/virtual/misc/kvm
I then ran "watch virsh list --all" for 10 minutes, and it continuously showed
the VM in a "shut off" state. Next I ran "virsh console proxy2" and
"virsh list --all" immediately after killing the "watch" process, and
it dumped the following output to the screen:
[pkukla@vs2 ~]$ virsh console proxy2error: The domain is not running
[pkukla@vs2 ~]$ virsh list --all Id Name
State---------------------------------------------------- - proxy2
shut off - proxy2-clone shut off
Then I waited ~2-3 minutes and ran "virsh list --all" again, and it caused the
VM to boot:
[pkukla@vs2 ~]$ virsh list --all Id Name
State---------------------------------------------------- 1 proxy2
running - proxy2-clone shut off
The debug output from the log corresponding to this boot sequence:
2018-12-12 19:03:53.712+0000: 21264: debug : virNetlinkEventCallback:707 : dispatching to
max 0 clients, called from event watch 72018-12-12 19:03:53.712+0000: 21264: debug :
virNetlinkEventCallback:720 : event not handled.2018-12-12 19:03:53.712+0000: 21264: debug
: virNetlinkEventCallback:707 : dispatching to max 0 clients, called from event watch
72018-12-12 19:03:53.712+0000: 21264: debug : virNetlinkEventCallback:720 : event not
handled.2018-12-12 19:03:53.713+0000: 21264: debug : virNetlinkEventCallback:707 :
dispatching to max 0 clients, called from event watch 72018-12-12 19:03:53.713+0000:
21264: debug : virNetlinkEventCallback:720 : event not handled.2018-12-12
19:03:53.723+0000: 21323: debug : udevHandleOneDevice:1677 : udev action:
'add'2018-12-12 19:03:53.723+0000: 21323: debug : udevGetDeviceProperty:149 :
Found property key 'DRIVER' value '<null>' for device with sysname
'tap3'2018-12-12 19:03:53.724+0000: 21323: debug : udevGetDeviceProperty:149 :
Found property key 'SUBSYSTEM' value 'net' for device with sysname
'tap3'2018-12-12 19:03:53.724+0000: 21323: debug : udevGetDeviceProperty:149 :
Found property key 'INTERFACE' value 'tap3' for device with sysname
'tap3'2018-12-12 19:03:53.724+0000: 21323: debug : udevGetDeviceSysfsAttr:217 :
Found sysfs attribute 'address' value 'fe:c9:eb:76:37:d8' for device with
sysname 'tap3'2018-12-12 19:03:53.724+0000: 21323: debug :
udevGetDeviceSysfsAttr:217 : Found sysfs attribute 'addr_len' value '6'
for device with sysname 'tap3'2018-12-12 19:03:53.724+0000: 21323: debug :
virFileClose:110 : Closed fd 292018-12-12 19:03:53.725+0000: 21323: debug :
virPCIGetDeviceAddressFromSysfsLink:2624 : '/sys/class/net/tap3/device' does not
exist2018-12-12 19:03:53.725+0000: 21323: debug : virFileClose:110 : Closed fd
292018-12-12 19:03:53.725+0000: 21323: info : virObjectNew:254 : OBJECT_NEW:
obj=0x7f39f8001760 classname=virNodeDeviceObj2018-12-12 19:03:53.725+0000: 21323: info :
virObjectRef:388 : OBJECT_REF: obj=0x7f39f80017602018-12-12 19:03:53.725+0000: 21323: info
: virObjectNew:254 : OBJECT_NEW: obj=0x7f39f80017d0
classname=virNodeDeviceEventLifecycle2018-12-12 19:03:53.725+0000: 21323: info :
virObjectUnref:350 : OBJECT_UNREF: obj=0x7f39f80017602018-12-12 19:03:53.725+0000: 21323:
info : virObjectUnref:350 : OBJECT_UNREF: obj=0x7f39f80017d02018-12-12 19:03:53.725+0000:
21323: info : virObjectUnref:352 : OBJECT_DISPOSE: obj=0x7f39f80017d02018-12-12
19:03:53.725+0000: 21323: debug : udevHandleOneDevice:1677 : udev action:
'add'2018-12-12 19:03:53.725+0000: 21323: debug : udevGetDeviceProperty:149 :
Found property key 'DRIVER' value '<null>' for device with sysname
'tx-0'2018-12-12 19:03:53.725+0000: 21323: debug : udevGetDeviceProperty:149 :
Found property key 'SUBSYSTEM' value 'queues' for device with sysname
'tx-0'2018-12-12 19:03:53.725+0000: 21323: debug : udevGetDeviceType:1349 : Could
not determine device type for device with sysfs name 'tx-0'2018-12-12
19:03:53.725+0000: 21323: debug : udevAddOneDevice:1541 : Discarding device -1
0x7f39f8001f60 /sys/devices/virtual/net/tap3/queues/tx-02018-12-12 19:03:53.725+0000:
21323: debug : udevHandleOneDevice:1677 : udev action: 'add'2018-12-12
19:03:53.725+0000: 21323: debug : udevGetDeviceProperty:149 : Found property key
'DRIVER' value '<null>' for device with sysname
'rx-0'2018-12-12 19:03:53.725+0000: 21323: debug : udevGetDeviceProperty:149 :
Found property key 'SUBSYSTEM' value 'queues' for device with sysname
'rx-0'2018-12-12 19:03:53.725+0000: 21323: debug : udevGetDeviceType:1349 : Could
not determine device type for device with sysfs name 'rx-0'2018-12-12
19:03:53.725+0000: 21323: debug : udevAddOneDevice:1541 : Discarding device -1
0x7f39f8001f60 /sys/devices/virtual/net/tap3/queues/rx-02018-12-12 19:03:53.800+0000:
21264: debug : virNetlinkEventCallback:707 : dispatching to max 0 clients, called from
event watch 72018-12-12 19:03:53.800+0000: 21264: debug : virNetlinkEventCallback:720 :
event not handled.2018-12-12 19:03:53.804+0000: 21323: debug : udevHandleOneDevice:1677 :
udev action: 'change'2018-12-12 19:03:53.804+0000: 21323: debug :
udevGetDeviceProperty:149 : Found property key 'DRIVER' value
'<null>' for device with sysname 'kvm'2018-12-12 19:03:53.804+0000:
21323: debug : udevGetDeviceProperty:149 : Found property key 'SUBSYSTEM' value
'misc' for device with sysname 'kvm'2018-12-12 19:03:53.804+0000: 21323:
debug : udevGetDeviceType:1349 : Could not determine device type for device with sysfs
name 'kvm'2018-12-12 19:03:53.804+0000: 21323: debug : udevAddOneDevice:1541 :
Discarding device -1 0x7f39f8001f60 /sys/devices/virtual/misc/kvm
I then disabled logging and restarted the service.
I've been doing has been under my personal account ("pkukla") on the
server. The "root" account has production instances, so I have not been working
under that account because I don't want to break anything. In fact, I had only
noticed this and reproduced it while working in my personal account, and I originally
wasn't sure that it was happening with any other user accounts.
While collecting this data, however, I am reasonably sure that I managed to boot one of
the VMs in the "root" account with this bug. I didn't monitor the VM in the
same way that I've been monitoring the bug in my own account, but a server that
shouldn't have been booted did boot around the time that I was working on the server
as "root", and I've confirmed with the other people with the appropriate
privileges to login as "root", and none of them started the VM. The VM in the
"root" userspace that was booted was the "slo_redmine" VM. I do not
know whether it was booted during the time that logging was enabled, so maybe that is
reflected in the log, also?
If I *did* managed to repro the bug while working as "root", then this implies
something else: This is apparently only affecting VM's in the user's on
environment. For example, I've been reproducing the trigger conditions frequently
over the past few days to diagnose this, but all of my testing as user "pkukla"
does NOT seem to have affected the VMs in the "root" user space. I'm not
willing to guarantee that the effects of this are only confined to a user's own VMs
because I haven't exhaustively tested the theory, but it seems likely that's the
case.
The full log has been provided to you as a link in another email.
Please let me know if you have any questions, and thanks again for looking at this!
Peter Kukla
On Wednesday, December 12, 2018, 2:19:10 AM EST, Jiri Denemark
<jdenemar(a)redhat.com> wrote:
On Tue, Dec 11, 2018 at 14:39:40 +0000, Peter Kukla wrote:
Hello,
I have a CentOS 7 client running on a CentOS 7 server. "virsh --version"
reports that 3.9.0 is being used.
I'm issuing a "virsh shutdown" command to shutdown the client. I then
confirm that the client has actually shutdown using "virsh list --all".
Lately I've been seeing instances where I shutdown the client, turn away to work on
something else, and then turn back to find that the client has booted again.
Initially I thought I was mistaken, and that I somehow hadn't properly shutdown the
client, but further experimentation shows that's definitely not the case. I've
been able to reproduce this repeatedly using the following steps.
1. Run "virsh shutdown CLIENT"2. Wait ~30-60 seconds3. Run "virsh list
--all; virsh console CLIENT"
If I run "virsh console CLIENT" on its own when the client is in a shutdown
state, I get the expected "error: The domain is not running" message. If I omit
step #2 and run step #3 immediately after shutting down the client, virsh also complains
that the client's not started and refuses to open a console.
However, if I add a pause -- step 2, above -- between the shutdown and the "list
--all" commands, then "list --all" shows that the client is running, and a
console is opened to the VM.
I have confirmed that the client boot is definitely being started by the "list
--all" command (and not being started by something else while I'm not looking) by
combining the "list --all" and "console" commands in step 3. This is
ensuring that the console is started as soon as the "list --all" is complete,
and the resulting screen output shows the BIOS boot options quickly followed by the kernel
boot messages being dumped to the console. I can watch the entire boot process from start
to finish.
That pause between the shutdown and the "list --all" is critical. I've
shutdown the client and then immediately followed it up with the command:
watch virsh list --all
...which will run the "list --all" command every 2 seconds. This will run
indefinitely without triggering the boot, which eliminates the possibility of the boot
occurring due to the amount of time since the shutdown.
It may be that other virsh commands are also causing the client to boot, but thus far
I've only reproduced it using the list --all command.
Can anyone tell me why this is happening? A bug? Something in my config? Something
else?
Thanks in advance,
This is indeed very strange and we would need to look at the logs to
check what's going on. Could you follow the steps in
https://wiki.libvirt.org/page/DebugLogs#Runtime_setting to enable debug
logs for libvirtd and share the (compressed) libvirtd.log somewhere?
Jirka