On Tue, 29 Sep 2020 15:53:39 -0400
Laine Stump <laine(a)redhat.com> wrote:
> +
> + if (vdpafdset < 0) {
> + VIR_WARN("Cannot determine fdset for vdpa device");
> + } else {
> + if (qemuMonitorRemoveFdset(priv->mon, vdpafdset) < 0) {
> + /* if it fails, there's not much we can do... just
> carry on */
> + VIR_WARN("failed to close vdpa device");
> + }
> + }
I agree there's not much we can do here to make the situation better,
but is it really going to be okay to inform the user that the device
is now free, since it apparently isn't? If we go ahead and send the
DEVICE_DELETED event up to the application, then it will think that
the same vdpa device is now available to be re-used elsewhere. Do you
have an idea what are the odds on that being true? (I don't, that's
why I'm asking :-)).
I don't either ;)
It may be safer to return failure, so the device is just stuck shown
as in-use by this guest; that would be bad, but maybe not as bad as
if it was still actually being used by this guest somehow (possible,
since the fd couldn't be deleted), and a 2nd guest started using it
too. (I really don't know what the consequences of any of this might
be; just trying to inject my sunny disposition into the mix;
truthfully I'd be willing to accept either way, just wanted to make
sure it's considered).
Well, that's a good point. The reason that I printed a warning rather
than returning an error is because I was influenced by some of the
nearby code.
In order to remove a network device, this function has to do a couple
things (depending on the type of network device). First It removes the
netdev (netdev_del), and then it may need to do some additional cleanup.
For TYPE_VHOSTUSER, it needs to detach a chardev. For TYPE_VDPA, it
needs to close the fd that we passed to qemu. So what do we do if
'netdev_del' succeeds, but 'remove-fd' does not?
If we return an error from this function, the caller will interpret that
as if we failed to remove the network device. But qemu has already
removed the netdev. So things are in an inconsistent state.
TYPE_VHOSTUSER just carries on without even printing a warning if the
chardev can't be removed. So I did something similar here for vDPA, but
added a warning. I'm not sure that there's really a "good" solution
here.
Regarding the possibility of a second guest attempting to use the vdpa
device that was unsuccessfully removed: I have only tested with the
vdpa_sim kernel module, but if the fd is not closed, attempting to
re-use it with a different guest fails like this:
error: Failed to attach device from vdpa.xml
error: Unable to open '/dev/vhost-vdpa-0' for vdpa device: Device or
resource busy
Jonathon