On Mon, 1 Feb 2021 11:33:08 +0100
Erik Skultety <eskultet(a)redhat.com> wrote:
On Mon, Feb 01, 2021 at 09:48:11AM +0000, Daniel P. Berrangé wrote:
> On Mon, Feb 01, 2021 at 10:45:43AM +0100, Erik Skultety wrote:
> > On Mon, Feb 01, 2021 at 09:42:32AM +0000, Daniel P. Berrangé
> > wrote:
> > > On Fri, Jan 29, 2021 at 05:34:36PM -0600, Jonathon Jongsma
> > > wrote:
> > > > On Thu, 7 Jan 2021 17:43:54 +0100
> > > > Erik Skultety <eskultet(a)redhat.com> wrote:
> > > >
> > > > > > Tested with v6.10.0-283-g1948d4e61e.
> > > > > >
> > > > > > 1.Can define/start/destroy mdev device successfully;
> > > > > >
> > > > > > > 2.'virsh nodedev-list' has no '--active' option, which is
> > > > > > > inconsistent with the description in the patch:
> > > > > > # virsh nodedev-list --active
> > > > > > > error: command 'nodedev-list' doesn't support option --active
> > > > > >
> > > > > > > 3.virsh client hangs when trying to destroy an mdev device
> > > > > > > that is in use by a VM, and after that all 'virsh nodedev*'
> > > > > > > commands hang. If libvirtd is restarted after that,
> > > > > > > libvirtd hangs too.
> > > > >
> > > > > > It hangs because the underlying write to the 'remove' sysfs
> > > > > > attribute now blocks for some reason, and since we rely on
> > > > > > mdevctl to do that write for us, "it hangs" too. I'm not
> > > > > > trying to make an excuse, it's plain wrong. I'd love to rely
> > > > > > on such basic functionality, but it looks like we'll have to
> > > > > > go with an extremely ugly workaround and try to get the list
> > > > > > of active domains from the nodedev driver and see whether
> > > > > > any of them has the device assigned before we try to destroy
> > > > > > the mdev via the nodedev driver.
> > > >
> > > > So, I've been trying to figure out a way to do this, but as
> > > > far as I know, there's no way to get a list of active domains
> > > > from within the nodedev driver, and I can't think of any
> > > > better ways to handle it. Any ideas?
> > >
> > > Correct, the nodedev driver isn't permitted to talk to any of
> > > the virt drivers.
> >
> > Oh, not even via secondary connection? What makes nodedev so
> > special, since we can open a secondary connection from e.g. the
> > storage driver?
>
> It is technically possible, but it should never be done, because it
> introduces a bi-directional dependency between the daemons which
> introduces the danger of deadlocking them. None of the secondary
> drivers should connect to the hypervisor drivers.
>
> > > Is there anything in sysfs which reports whether the device is
> > > in use ?
> >
> > Nothing that I know of, the way it used to work was that you
> > tried to write to sysfs and kernel returned a write error with
> > "device in use" or something like that, but that has changed
> > since :(.

Without having tried this, and since mdevctl is just a Bash script,
can we bypass mdevctl on destroys a little bit by constructing the
path to the sysfs attribute ourselves and performing a non-blocking
write of zero bytes to see if we get an error? If we do, we'll skip
mdevctl and report an error. If we don't, we'll invoke mdevctl to
remove the device in order to remain consistent. Would that be an
acceptable workaround (provided it would work)?

As far as I can tell, this doesn't work. According to my
tests, attempting to write zero bytes to $(mdev_sysfs_path)/remove
doesn't result in an error if the mdev is in use by a vm. It just
"successfully" writes zero bytes. Adding Alex to cc in case he has an
idea for a workaround here.
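
For reference, a minimal sketch of the kind of probe being discussed:
open the attribute without blocking, write zero bytes, and report
whether the kernel returned an error. It is written in Python for
brevity rather than the C libvirt would use, the function name
probe_zero_byte_write is made up for illustration, and the caller is
assumed to have built the path to the mdev's 'remove' attribute
already. This is not the exact test that was run.

```python
import errno
import os


def probe_zero_byte_write(attr_path):
    """Attempt a non-blocking, zero-byte write to attr_path.

    Returns None if neither open() nor write() raised an error,
    otherwise the symbolic errno name (for example "EBUSY").
    attr_path is assumed to be the mdev's sysfs 'remove' attribute;
    building that path is left to the caller.
    """
    try:
        fd = os.open(attr_path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    try:
        # Zero-length write: nothing is sent, so this cannot by
        # itself request removal of the device.
        os.write(fd, b"")
    except OSError as exc:
        return errno.errorcode.get(exc.errno, str(exc.errno))
    finally:
        os.close(fd)
    return None
```

A return value of None only means that no error was reported. As
noted above, that is also what happens when the mdev is in use by a
VM, so this probe cannot tell a busy device from an idle one.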
Jonathon