Hi Laine,
I'm sorry for starting 3 threads with the same topic, seems like my mail was
heavily delayed yesterday.
[ removing personal mail ]
Op 8/7/2015 om 12:24 AM schreef Laine Stump:
>>>Subject: Reproducible live disk detach bug
>>>
>>>Hey guys,
>>>
>>>I am currently running into a reproducible libvirt bug when I detach a
>>>disk from a running VM using libvirt 1.2.18 and qemu 2.3.0.
>>>
>>>I can attach a disk to a running VM without any problems.
>>>During the attach operation I don't create a disk alias name, which
>>>results in libvirt generating an unique alias.
>>>Using virsh dumpxml I get the following output(snippet):
>>>
>>><disk type='file' device='disk'>
>>> <driver name='qemu' type='raw'
cache='none'/>
>>> <source
file='/var/local/mnt/test-storage/test-storage.raw'/>
>>> <backingStore/>
>>> <target dev='vde' bus='virtio'/>
>>> <iotune>
>>> <read_bytes_sec>94371840</read_bytes_sec>
>>> <write_bytes_sec>31457280</write_bytes_sec>
>>> <read_iops_sec>1000</read_iops_sec>
>>> <write_iops_sec>200</write_iops_sec>
>>> </iotune>
>>> <alias name='virtio-disk4'/>
>>> <address type='pci' domain='0x0000'
bus='0x00' slot='0x09' function='0x0'/>
>>></disk>
>>>
>>>As you can see, libvirt generates 'virtio-disk4' as an alias. But as
>>>soon as I try to detach the disk from the running VM, I will receive the
>>>following two errors in my libvirt log:
>>>
>>>2015-08-06 10:48:59.943+0000: 11657: warning :
>>>qemuMonitorJSONHandleDeviceDeleted:939 : missing device in device
>>>deleted event
>>>2015-08-06 10:48:59.945+0000: 11662: error :
>>>qemuMonitorTextDriveDel:2594 : operation failed: deleting
>>>drive-virtio-disk4 drive failed: 2015-08-06T10:48:59.945058Z Device
>>>'drive-virtio-disk4' not found
>>>
>>>Apparently libvirt tries to detach the disk with a drive- prefix, which
>>>doesn't exist?
>I can't tell you right off why the failure message is there, but the
>alias with the added "drive-" is easily explained.
>
>Each <disk> element in the libvirt config corresponds to two entities in
>the qemu virtual machine - a "device" component (what is seen by the
>guest) and a "drive" component (what resources are used on the host to
>back up that device). Each of these corresponds to a separate qemu
>commandline argument (named "-device" and "-drive"), and each of
those
>must have a different, unique "id" option. For example, this is what is
>added to my qemu commandline when I add a virtio disk named
>"rawhide.qcow2" to my virtual machine:
>
>-drive
>file=/var/lib/libvirt/images/rawhide.qcow2,if=none,id=drive-virtio-disk0,format=qcow2
>-device
>virtio-blk-pci,scsi=off,bus=pci.2,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
>
>
>If you look at the status XML of the domain while it is running, you'll
>see that the <alias> for the disk element is "virtio-disk0", and
that's
>what is given to the -device half of the disk as its qemu id. The -drive
>half uses "drive-$alias" as its qemu id though (you'll notice that the
>-device half references that id with its "drive" option - that's how
the
>two halves are tied together by qemu).
>
>So, there is nothing unusual about qemu wanting to delete something
>called drive-virtio-disk4 when you have a <disk> with
>alias='virtio-disk4'. It obviously isn't right that an error message is
>being logged. Something sounds familiar about the "missing device in
>device deleted event" message, but I can't recall details. Does anyone
>else have a better memory? Frank - would it be possible for you to try
>this same sequence of events with the same libvirt and an older qemu
>(and/or vice-versa)?
>
I have some more interesting debug information.
I'm -not- able to reproduce it on libvirt 1.2.11 - qemu 2.3.0 and libvirt
1.2.11 - qemu 1.7.1 using a FreeBSD VM.
I'm able to reproduce the errors with libvirt 1.2.11 - qemu 2.3.0 and
libvirt 1.2.18 - qemu 2.3.0 using a Debian VM.
Seems like the error triggers depending on the guest OS?
During hot-unplugging of network interfaces libvirt also triggers the same
error on a Debian VM:
2015-08-06 10:48:59.945+0000: 11662: error : qemuMonitorJSONHandleDeviceDeleted:939 :
missing device in device deleted event
I gathered some logging information with info level logging enabled while
detaching a disk:
http://pastebin.com/raw.php?i=vVLuavG2
The interesting this is this line:
2015-08-06 21:14:56.847+0000: 8037: info : qemuMonitorIOProcess:452 :
QEMU_MONITOR_IO_PROCESS: mon=0x7f9477cc0fc0 buf={"return":
"2015-08-06T21:14:56.847359Z Device 'drive-virtio-disk4' not found\r\n",
"id": "libvirt-12"}
Notice how the data read from the monitor command has a timestamp
prefixed on it. This is very bad. We absolutely do not expect to
see a timestamp on HMP monitor replies, so whatever is doing that
in QEMU needs fixing.
AFAIK, QEMU is only supposed to add timestamps to stuff it prints
on stderr, when -msg timestamp=on. It should never timestamp HMP
Regards,
Daniel
--
|: