On Mon, Dec 18, 2023 at 01:25:02 -0500, Laine Stump wrote:
On 11/28/23 9:58 AM, Peter Krempa wrote:
> On Mon, Nov 06, 2023 at 02:38:57 -0500, Laine Stump wrote:
> > This patch makes it possible to manually specify which VFIO variant
> > driver to use for PCI hostdev device assignment, so that, e.g. you
> > could force use of the generic vfio-pci driver with
> >
> > <driver name='vfio-pci'/>
> >
> > when libvirt would have normally (after applying a subsequent patch)
> > found a "better match" for a device in the active kernel's
> > modules.alias file. (The main use of this manual override would be to
> > work around a bug in a new VFIO variant driver by temporarily not
> > using that driver).
> >
> > This sounds like a simple addition to XML parsing/formatting, but of
> > course it's messier than that, since the attribute we want to use was
> > previously used in config for a now non-existent purpose (choosing a
> > type of device assignment that was removed from the kernel nearly a
> > decade ago), and continues to be used *internal to the C code*.
> >
> > Background:
> >
> > Beginning when VFIO device assignment support was added to <hostdev>
> > or <interface type='hostdev'/>) in libvirt's QEMU driver, it
was
> > possible to specify which device assignment backend to use with
> > "<driver name='vfio|kvm'/>". This was only useful for a
couple of
> > years in the early 2010's when VFIO device assignment was newly
> > introduced, but "legacy KVM" device assignment was still possible
(in
> > reality "<driver name='blah'/>" was only intended to be
used (1) in
> > the *very* early days of VFIO when "kvm" was still the default, or
(2)
> > when the host kernel was too old to have VFIO support (meaning it was
> > e.g. pre-RHEL7) or (3) some bug was encountered with the then-new VFIO
> > code that could have been avoided by switching back to the older style
> > of device assignment (I don't recall anyone ever needing to set it for
> > this reason, but that is one of the main reasons we added the knob).
> >
> > Later, when the libxl (Xen) driver in libvirt began supporting "device
> > passthrough" with <hostdev>, they added <driver
name='xen'/> to
> > indicate that mode of operation. But in practice this was also never
> > necessary in any config, since Xen only supports one type of device
> > assignment, and so the attribute was anyway set in the C object by the
> > libxl driver code prior to calling any hypervisor-common code
> > (i.e. the stuff in hypervisor/hostdev.c and util/virpci.c) that needs
> > to know which type of device assignment is being used - setting it in
> > the XML config was superfluous (kind of like me saying "I am now
> > describing this patch in a human language", ref: Perd Hapley on
"Parks
> > and Recreation").
> >
> > So the setting was available in the XML, but never needed to be set by
> > the user. Because it was automatically set in the C code though,
> > sometimes libvirt-generated XML would contain the option even though
> > the user hadn't included it in the original input XML (e.g. the libxl
> > driver sets it in the PostParse, so all saved configs with a PCI
> > <hostdev> device will have <driver name='xen'/>; also
status XML from
> > the QEMU and libxl drivers will contain <driver name='vfio|xen'/>
- in
> > both cases completely unnecessary and redundant information).
> >
> > When adding support for specifying a variant driver for VFIO device
> > assignment, from the beginning I have wanted to use <driver
> > name='blah'/> to force a specific variant driver (e.g. <driver
> > name='mlx5_vfio_pci'/>), with the assumption that the name
attribute
> > could be easily repurposed because it had been *completely* unused for
> > so long. What I discovered though, was that 1) the field in the C
> > object corresponding to driver name was an enum value rather than a
> > string (so parser and formatter need to be changed, not just the
> > driver code looking at a string in the C object), and 2) even though
> > the XML attribute was effectively unused (except in some output), the
> > field in the C object was *always* being set internally as a way for
> > the hypervisor driver code to inform the hypervisor-common hostdev
> > manager (in src/hypervisor) which method of device assignment to
> > use. So re-use wasn't as simple as I'd wished.
> >
> > However.
> >
> > What I have hit upon that permits us to use
> > <driver name='mlx5_vfio_pci'/> is to do the following:
> >
> > 1) rename the existing driver name attribute to driver *type* - this
> > will now be the enum that is set internally by the hypervisor
> > driver prior to calling the hostdev manager (and also is what will
> > show up in status XML; the libxl driver was modified in a previous
> > patch so that the setting isn't done until domain runtime (rather
> > than during PostParse), so it will no longer show up in
> > regurgitated libxml config)
> >
> > 2) add a new attribute with the now-newly-unused name "name" - this
> > will be a string that can be used to communicate a specific host
> > kernel driver to the hostdev manager.
> >
> > 3) In the parsing code for <driver>, if <driver
name='vfio|xen|kvm'/>
> > is encountered, set the driver *type* to the appropriate enum
> > instead, and clear our the name string. (since the four places
> > that use the hostdev <driver> element now all call a common parser
> > function, this check is only needed in a single place)
> >
> > I could alternately just leave "name" as-is, and create a new
> > attribute with a never-before-used name within driver. I suppose
> > instead of "type" and "name", we could instead have
"name and "model"
> > (where "model" would be the string that could be set to
> > "mlx5_vfio_pci"). I just prefer type/name to name/model, and think
> > it's been long enough since <driver name='blah'/> has had a
useful
> > purpose (and the fixup for backward compatibility is limited to a few
> > lines in one function) that this change is safe.
>
> I'm not entirely sold on the reusing of the 'name' attribute and this
> paragraph seems to confirm that it's just for optics.
>
> I'm a bit worried that some tool may be checking the output here however
> useless it is and that tool might then need changing just because we
> wanted "better looking" attribute names.
Sigh. Yeah, you're probably right.
Okay, so how about keeping <driver name='vfio|xen'> (but still minimizing
its auto-generation as much as possible), and using
<driver model='blah'/> when we want to specify a particular driver?
That sounds good. The 'name=' attribute should be present in the runtime
XML when it was selected, but based on your previous explanation we
don't need to keep putting it into the config at post-parse time.
If it was specified though, don't remove it.
> > (another alternate would be to add an extra parameter to
the C API
> > that calls into the hostdev manager rather than keeping/adding the
> > publicly visible "type" attribute, but it seems at least possible
that
> > sometime in the future another alternate "type" of device assignment
> > could be introduced for either Xen or QEMU, and we would then need to
> > add back the XML attribute anyway)
> >
> > I'd be fine with doing it one of the other ways, but decided to post
> > this way first, since it's the one that I find the most aesthetically
> > pleasing :-)
>
> Okay this paragraph is a bit too much of discussion for a commit
> message. Please put stuff like this under the lines or as a RFC
> question.
Yeah, I meant to do that but forgot.
>
>
> > Note that a few test cases have been modified - places where an
> > existing "name='vfio'" was changed to
"type='vfio'" were in in status
> > XML or only the *output* XML for a test (except the case of the
> > virnetworkportxml2xmltest, which doesn't have a separate directory for
> > the XML result; fortunately the converged parsing of <driver> between
> > domain/network/networkport means that the test cases for network and
> > domain XML are already testing the same code that would convert
"name"
> > to "type" in thte networkport XML)
> >
> > Signed-off-by: Laine Stump <laine(a)redhat.com>
> > ---
>
> The code looks okay but I'm not a fan of the rename for cosmetical
> reasons. Especially if we fiddle with the value in the field with
> original name and display it back to the user/management sw.
I'll redo everything to use <driver model> and try to get it posted in the
next day or two. If you or anyone else thinks of a better name let me know.