
On Thu, Mar 14, 2019 at 15:31:48 +0100, Michal Privoznik wrote:
On 3/14/19 3:14 PM, Peter Krempa wrote:
On Thu, Mar 14, 2019 at 14:56:48 +0100, Michal Privoznik wrote:
On 3/14/19 2:18 PM, Peter Krempa wrote:
On Thu, Mar 14, 2019 at 13:22:38 +0100, Michal Privoznik wrote:
[...]
How can this be considered success? Also this introduces a possible regression. The DEVICE_DELETED event should be fired only after the device was entirely unplugged. Claiming success before seeing the event can lead to another race when qemu deleted the device from the internal list so that 'device_del' does not see it any more but did not finish cleanup fully.
We need to start the '*Remove' handler only after the DEVICE_DELETED event was received.
I beg to differ. If we were to report error here users would see the API failing with error "Device not found". So they'd run 'virsh dumpxml' only to find the device there. I don't find such behaviour sane. If one API tells me a devie is not there then another one shall not tell otherwise.
Well. The user semantics can be confusing here. What we can't allow though is that some of the steps done in the qemuDomainRemove*Device will fail because qemu will still have some internal reference to some backend object.
I'm not quite sure I follow. qemuDomainRemove*Device will be run exactly once. Not any more times. Running it more times is a problem, but I'm failing to see how my patch allows that. Can you shed more light into that please?
What I've meant is that qemuDomainRemove*Device shall never be called earlier than DEVICE_DELETED is received. What I've forgot to take into account is that we will still call qemuDomainWaitForDeviceRemoval (as your comment mentions). This means that the scenario I was outlining will not happen as success from this API does not automatically mean we'll try to delete the backends of the device (which I didn't notice at first). I'd probably just delete the mention of dumping XML. I think the statement that qemuDomainWaitForDeviceRemoval should be clear enough.