On Monday, 2 August 2021 14:30:05 CEST Peter Krempa wrote:
On Mon, Aug 02, 2021 at 14:20:44 +0200, Vojtech Juranek wrote:
> Hi,
> as a follow-up to BZ #1883399 [1], we are reviewing vdsm VM migration
> flows and solving a few follow-up bugs, e.g. BZ #1981079 [2]. I have a couple
> of questions related to libvirt:
>
> * if we run a disk extend during migration, the migration can finish
> sooner than the disk extend. In that case we will try to set the disk
> threshold on an already stopped VM (we handle the libvirt event that the VM was
> stopped, but due to the Python GIL there can be a delay between receiving the
> corresponding event from libvirt and handling it). In that case we get
> VIR_ERR_OPERATION_INVALID from libvirt when setting the disk threshold.
Actually, I was wrong here: the issue is caused by a delay in the libvirt
setBlockThreshold() call. From the vdsm log:
2021-08-02 09:06:01,918-0400 WARN (mailbox-hsm/3) [virt.vm]
(vmId='2dad9038-3e3a-4b5e-8d20-b0da37d9ef79') setting theshold using dom
<vdsm.virt.virdomain.Notifying object at 0x7fd06610df28> (drivemonitor:122)
[...]
2021-08-02 09:06:03,967-0400 WARN (libvirt/events) [virt.vm]
(vmId='2dad9038-3e3a-4b5e-8d20-b0da37d9ef79') libvirt event Stopped detail 3
opaque None (vm:5657)
[...]
2021-08-02 09:06:03,969-0400 WARN (mailbox-hsm/3) [virt.vm]
(vmId='2dad9038-3e3a-4b5e-8d20-b0da37d9ef79') Domain not connected, skipping set
block threshold for drive 'sdc' (drivemonitor:133)
So it took about 2 seconds for the libvirt setBlockThreshold() call to return; in the meantime
the migration finished, and we got a VIR_ERR_OPERATION_INVALID error from the setBlockThreshold()
call.
What is the reason for this delay? Is this operation intentionally delayed until
migration finishes?
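To put a number on the delay, here is a small sketch that computes the intervals from the three timestamps in the log excerpt above. It assumes the first message is logged right before the setBlockThreshold() call and the last one right after it failed; the helper name parse_vdsm_timestamp is only illustrative and not part of vdsm.

```python
from datetime import datetime


def parse_vdsm_timestamp(text):
    """Parse a timestamp as it appears in the log excerpt,
    e.g. '2021-08-02 09:06:01,918-0400' (milliseconds after the comma)."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S,%f%z")


# Timestamps copied from the three log messages quoted above.
call_started = parse_vdsm_timestamp("2021-08-02 09:06:01,918-0400")
vm_stopped = parse_vdsm_timestamp("2021-08-02 09:06:03,967-0400")
call_returned = parse_vdsm_timestamp("2021-08-02 09:06:03,969-0400")

# Time between "setting threshold" and "skipping set block threshold".
call_duration = (call_returned - call_started).total_seconds()

# Time between the Stopped event and the failed call being reported.
stop_to_return = (call_returned - vm_stopped).total_seconds()

print(f"call took {call_duration:.3f}s, "
      f"returned {stop_to_return:.3f}s after the Stopped event")
```

So the call appears to return almost immediately after the Stopped event is logged, which is what prompted the question.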
I posted the relevant libvirt debug log at
https://pastebin.com/YkdKYKM5
> Is it safe to
> catch this exception and ignore it, or is it thrown for various reasons, so that
> the root cause can be something other than a stopped VM?
The API to set the block threshold level can return the following errors,
listed with the cases in which each can happen:
VIR_ERR_OPERATION_UNSUPPORTED <- unlikely, as new qemu supports it
VIR_ERR_INVALID_ARG <- disk was not found in VM definition
VIR_ERR_INTERNAL_ERROR <- on error from qemu
Thus VIR_ERR_OPERATION_INVALID seems to be safe to ignore in your
specific case, while not ignoring others can be used to catch problems.
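A minimal sketch of what that could look like on the caller's side, using the libvirt-python bindings (libvirt.libvirtError, get_error_code() and virDomain.setBlockThreshold()). The function name and the logger are hypothetical and not taken from vdsm; the try/except around the import is only a stand-in so the sketch runs where libvirt-python is not installed.

```python
import logging

try:
    import libvirt  # libvirt-python bindings
except ImportError:
    # Stand-in for illustration only, so the sketch runs without
    # libvirt-python; real code would simply import libvirt.
    import types

    class _StubLibvirtError(Exception):
        def get_error_code(self):
            return None

    libvirt = types.SimpleNamespace(
        libvirtError=_StubLibvirtError,
        VIR_ERR_OPERATION_INVALID=55,  # value from libvirt's virErrorNumber
    )

log = logging.getLogger("drivemonitor-sketch")  # hypothetical logger name


def set_threshold_ignoring_stopped_vm(dom, drive_name, threshold):
    """Set the block threshold on drive_name.

    Returns True if the threshold was set, False if libvirt reported
    VIR_ERR_OPERATION_INVALID (e.g. the VM stopped in the meantime).
    Any other libvirt error is re-raised so real problems stay visible.
    """
    try:
        dom.setBlockThreshold(drive_name, threshold)
    except libvirt.libvirtError as e:
        if e.get_error_code() == libvirt.VIR_ERR_OPERATION_INVALID:
            log.warning("Domain not running, skipping block threshold "
                        "for drive %r", drive_name)
            return False
        raise
    return True
```

Here dom is expected to be a libvirt.virDomain (or a wrapper exposing the same method). Only the one error code is swallowed; VIR_ERR_INVALID_ARG, VIR_ERR_INTERNAL_ERROR and the rest still propagate.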
Thanks for your answer.