On 10/1/20 5:08 PM, Jonathon Jongsma wrote:
On Tue, 29 Sep 2020 15:53:39 -0400
Laine Stump <laine(a)redhat.com> wrote:
>> +
>> + if (vdpafdset < 0) {
>> + VIR_WARN("Cannot determine fdset for vdpa device");
>> + } else {
>> + if (qemuMonitorRemoveFdset(priv->mon, vdpafdset) < 0) {
>> + /* if it fails, there's not much we can do... just
>> carry on */
>> + VIR_WARN("failed to close vdpa device");
>> + }
>> + }
>
> I agree there's not much we can do here to make the situation better,
> but is it really going to be okay to inform the user that the device
> is now free, since it apparently isn't? If we go ahead and send the
> DEVICE_DELETED event up to the application, then it will think that
> the same vdpa device is now available to be re-used elsewhere. Do you
> have an idea what are the odds on that being true? (I don't, that's
> why I'm asking :-)).
I don't either ;)
> It may be safer to return failure, so the device is just stuck shown
> as in-use by this guest; that would be bad, but maybe not as bad as
> if it was still actually being used by this guest somehow (possible,
> since the fd couldn't be deleted), and a 2nd guest started using it
> too. (I really don't know what the consequences of any of this might
> be; just trying to inject my sunny disposition into the mix;
> truthfully I'd be willing to accept either way, just wanted to make
> sure it's considered).
Well, that's a good point. The reason that I printed a warning rather
than returning an error is because I was influenced by some of the
nearby code.
In order to remove a network device, this function has to do a couple
things (depending on the type of network device). First It removes the
netdev (netdev_del), and then it may need to do some additional cleanup.
For TYPE_VHOSTUSER, it needs to detach a chardev. For TYPE_VDPA, it
needs to close the fd that we passed to qemu. So what do we do if
'netdev_del' succeeds, but 'remove-fd' does not?
If we return an error from this function, the caller will interpret that
as if we failed to remove the network device. But qemu has already
removed the netdev. So things are in an inconsistent state.
TYPE_VHOSTUSER just carries on without even printing a warning if the
chardev can't be removed. So I did something similar here for vDPA, but
added a warning. I'm not sure that there's really a "good" solution
here.
Regarding the possibility of a second guest attempting to use the vdpa
device that was unsuccessfully removed: I have only tested with the
vdpa_sim kernel module, but if the fd is not closed, attempting to
re-use it with a different guest fails like this:
error: Failed to attach device from vdpa.xml
error: Unable to open '/dev/vhost-vdpa-0' for vdpa device: Device or
resource busy
Okay, based on that explanation, I think your solution is as good as, or
better than, any other.