On Mon, Dec 09, 2019 at 02:23:38PM -0600, Jonathon Jongsma wrote:
On Mon, 2019-11-18 at 19:00 +0000, Daniel P. Berrangé wrote:
> On Mon, Nov 18, 2019 at 10:06:34AM -0700, Alex Williamson wrote:
> > Hey folks,
> >
> > We had some discussions at KVM Forum around mdev live migration and
> > what that might mean for libvirt handling of mdev devices and
> > potential libvirt/mdevctl[1] flows. I believe the current situation
> > is that libvirt knows nothing about an mdev beyond the UUID in the
> > XML. It expects the mdev to exist on the system prior to starting
> > the VM. The intention is for mdevctl to step in here by providing
> > persistence for mdev devices, such that these pre-defined mdevs are
> > potentially not just ephemeral; for example, we can tag specific
> > mdevs for automatic startup on each boot.
> >
> > It seems the next step in this journey is to figure out if libvirt
> > can interact with mdevctl to "manage" a device. I believe we've
> > avoided defining managed='yes' behavior for mdev hostdevs up to this
> > point because creating an mdev device involves policy decisions. For
> > example, which parent device hosts the mdev, are there optimal NUMA
> > considerations, are there performance versus power considerations,
> > what is the nature of the mdev, etc. mdevctl doesn't necessarily
> > want to make placement decisions either, but it does understand how
> > to create and remove an mdev, what its type is, how to associate it
> > with a fixed parent, apply attributes, etc. So would it be
> > reasonable that, for a managed='yes' mdev hostdev device, libvirt
> > might attempt to use mdevctl to start an mdev by UUID and stop it
> > when the VM is shut down? This assumes the mdev referenced by the
> > UUID is already defined and known to mdevctl. I'd expect semantics
> > much like managed='yes' around vfio-pci binding, e.g. start/stop if
> > it doesn't exist, leave it alone if it already exists.
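
The managed='yes' semantics proposed above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not
libvirt code: the ManagedMdev class and its method names are
hypothetical, and it assumes an active mdev shows up as
/sys/bus/mdev/devices/<UUID> and that the UUID is already defined in
mdevctl, so that "mdevctl start -u" and "mdevctl stop -u" are enough.

```python
import subprocess
from pathlib import Path

# Assumption: active mdevs are visible here, one entry per UUID.
MDEV_SYSFS = Path("/sys/bus/mdev/devices")


class ManagedMdev:
    """Hypothetical helper: start an mdev only if it is not already
    active, and later stop only an mdev that this helper started."""

    def __init__(self, uuid, sysfs_root=MDEV_SYSFS, run=subprocess.run):
        self.uuid = uuid
        self.sysfs_root = Path(sysfs_root)
        self.run = run                  # injectable so it can be faked
        self.started_by_us = False

    def is_active(self):
        return (self.sysfs_root / self.uuid).exists()

    def prepare(self):
        """Call before VM start."""
        if self.is_active():
            return                      # already exists: leave it alone
        self.run(["mdevctl", "start", "-u", self.uuid], check=True)
        self.started_by_us = True

    def release(self):
        """Call after VM shutdown."""
        if self.started_by_us:
            self.run(["mdevctl", "stop", "-u", self.uuid], check=True)
            self.started_by_us = False
```

Tracking started_by_us is what keeps a pre-existing mdev (for example
one that mdevctl started at boot) from being torn down when the VM
stops.
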
> >
> > If that much seems reasonable, and someone is willing to invest some
> > development time to support it, what are then the next steps to
> > enable migration?
>
> The first step is to deal with our virNodeDevice APIs.
>
> Currently we have
>
> - Listing devices via ( virConnectListAllNodeDevices )
> - Create transient device ( virNodeDeviceCreateXML )
> - Delete transient device ( virNodeDeviceDestroy )
>
> The create/delete APIs only deal with NPIV HBAs right now, so we need
> to extend that to deal with mdevs as a first step.
>
> This entails defining an XML format that can represent the
> information we need about an mdev.

So, there is already an XML format that represents information about an
mdev device [1]. Do you mean extending that to add any additional
properties needed for mdevctl, or defining something new?

[1] https://libvirt.org/drvnodedev.html#MDEV

We'll use that with whatever additions we need to create devices.

To define and create an mdev, mdevctl needs a UUID, a parent device,
and a type. These properties all appear to be supported via the
existing XML format.

mdevctl also supports assigning arbitrary sysfs attributes to a device.
These attributes have an explicit ordering and are written to sysfs in
the specified order when a device is started. This might be the only
thing that doesn't fit into the current XML format.
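
To make the gap concrete, here is a minimal sketch of pulling the
properties mdevctl needs out of node device XML. The <parent> and
<type id=.../> elements follow the existing mdev format; deriving the
UUID from an "mdev_<uuid with underscores>" device name is an
assumption, and the <attr name=... value=.../> elements are purely
hypothetical, shown only as one way ordered attributes might be
carried. The attribute names and the parse_mdev helper are made up for
illustration.

```python
import xml.etree.ElementTree as ET

EXAMPLE_XML = """
<device>
  <name>mdev_4b20d080_1b54_4048_85b3_a6a62d165c01</name>
  <parent>pci_0000_06_00_0</parent>
  <capability type='mdev'>
    <type id='nvidia-11'/>
    <!-- hypothetical extension: ordered sysfs attributes -->
    <attr name='example_attr_a' value='1'/>
    <attr name='example_attr_b' value='2'/>
  </capability>
</device>
"""


def parse_mdev(xml_text):
    """Return the UUID, parent, type and ordered attributes of an mdev."""
    root = ET.fromstring(xml_text)
    cap = root.find("capability[@type='mdev']")
    if cap is None:
        raise ValueError("not an mdev node device")
    type_elem = cap.find("type")
    if type_elem is None:
        raise ValueError("mdev capability has no <type>")

    # Assumption: the device name encodes the UUID with '_' for '-'.
    name = root.findtext("name", default="")
    uuid = None
    if name.startswith("mdev_"):
        uuid = name[len("mdev_"):].replace("_", "-")

    # findall() returns elements in document order, so a list of
    # (name, value) pairs preserves the ordering mdevctl cares about.
    attrs = [(a.get("name"), a.get("value")) for a in cap.findall("attr")]

    return {
        "uuid": uuid,
        "parent": root.findtext("parent"),
        "type": type_elem.get("id"),
        "attrs": attrs,
    }
```

A list of pairs is used rather than a dict to make it explicit that
the order of attributes is part of the data and not an accident of
parsing.
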

Well, we need to define a schema, but there will also need to be some
kind of validation added because, AFAICT, mdevctl does no validation, so
a plain passthrough of this allows arbitrary writing of files anywhere
on the host given a suitably malicious attribute name.
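
As a rough idea of the kind of validation meant here, the sketch below
rejects attribute names that could escape the device's sysfs
directory. It is an illustration only: safe_attr_path is a hypothetical
helper, and it assumes attribute names are a single path component
made of letters, digits, '_', '.' and '-', which may be stricter than
what real devices need.

```python
import os
import re

# Assumption: a valid attribute name is one plain path component.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_.-]+")


def safe_attr_path(device_dir, attr_name):
    """Return the path of attr_name under device_dir, or raise
    ValueError if the name could point outside that directory."""
    if attr_name in (".", "..") or not _SAFE_NAME.fullmatch(attr_name):
        raise ValueError(f"invalid attribute name: {attr_name!r}")

    base = os.path.realpath(device_dir)
    # realpath() also resolves symlinks, so a name that is a symlink
    # leading elsewhere is caught by the containment check below.
    target = os.path.realpath(os.path.join(base, attr_name))
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"attribute escapes device dir: {attr_name!r}")
    return target
```

With this in place, names such as "../../etc/passwd" or "/etc/passwd"
are refused before anything is written, while a plain name resolves to
a file directly inside the device directory.
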
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|