On Tue, 2019-08-20 at 08:44 +0200, Erik Skultety wrote:
On Mon, Aug 19, 2019 at 02:30:05PM -0500, Jonathon Jongsma wrote:
> When a host is rebooted, any mediated devices that were previously
> configured will disappear. There have been requests for libvirt to
> handle persisting these mediated devices across reboots, but the
> decision was made that this should be handled at a lower level.
> mdevctl
> is a new tool that handles registration and persistence of mediated
> devices. If desired, mdevctl can automatically start these mediated
> devices when the parent device becomes available.
>
> Since mdevctl is the recommended solution for handling persistent
> mediated devices, point users there when they encounter an error
> for a
> missing mediated device.
Is mdevctl already being packaged for distros?
Alex said that he's currently working on getting it into fedora. But as
far as I know it's not officially in any distros yet, so the patch may
be a bit premature.
> Signed-off-by: Jonathon Jongsma <jjongsma(a)redhat.com>
> ---
>
> NOTE:
> - previous patch which attempted to start missing devices using
> mdevctl has
> been withdrawn.
> src/util/virmdev.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/src/util/virmdev.c b/src/util/virmdev.c
> index 3d5488cdae..df02fc8d28 100644
> --- a/src/util/virmdev.c
> +++ b/src/util/virmdev.c
> @@ -149,7 +149,9 @@ virMediatedDeviceNew(const char *uuidstr,
> virMediatedDeviceModelType model)
>
> if (!virFileExists(sysfspath)) {
> virReportError(VIR_ERR_DEVICE_MISSING,
> - _("mediated device '%s' not found"),
> uuidstr);
> + _("mediated device '%s' not found. "
> + "Persistent devices can be managed with
> 'mdevctl'."),
> + uuidstr);
mdevctl is very cool, but unless it's being packaged for distros
which to me is
kind of a guarantee (maintenance commitment may be a better wording)
that it
will not stay just a cool experiment, I'm not sure we want to
reference it
directly in the error messages. Of course, it's been a few months
that I
followed up on mdevctl, so I don't know where it stands as of now,
can you
refresh my memory?
Now, my response comes quite late (sorry for that) as I've been
thinking about
the original patch you sent and I was also trying to remember all the
details
(I already know I surely forgot some bits). Last time we discussed
this, I
think the issue with libvirt in this case is migration of domains
with mdevs in
the future. Right now, libvirt domains only hold the UUID to
reference an
mdev, so as not to duplicate the information we already have about
the mdev in
the nodedev driver.
I'd very happily delegate the responsibility onto mdevctl, but do we
have a
guarantee that mdevctl is available on the destination machine?
Because if not,
then libvirt wouldn't be able to resume the machine because the
device would
likely not exist if someone didn't pre-create it manually. Then
there's the
problem with the UUID->mdev_type mapping because the state of mdevctl
would not
be transferred. On the source, the UUID maps to some type, but the
destination
doesn't know about this mapping, does it? This is why we discussed
that even
though mdevctl provides us with clear benefits, libvirt would still
need to
keep track of the mdevs. The initial idea was to re-work the nodedev-
driver (I
have a GSoC project listed on
libvirt.org), so that it:
a) supports the define;start;destroy;etc. commands which all the
drivers do
b) specifies what an mdev XML looks like and allows to define it
c) extends the current Create (from XML) function so that it supports
creation
of more than just NPIV/vHBAs - currently it's tied to NPIV
Once in place, the user/management layer would then have to make sure
that they
migrate the persistent configs across all the compute nodes before
the
migration actually takes place. Libvirt would then "start" (aka
create) the
mdevs on demand when the domain is started.
I haven't seen the migration discussion for a while, so I don't know
where we
stand ATM, I'm saying this because I don't know if and how the
mdev_types on
newer revisions of vGPU HW are compatible which plays a role in the
XML
structure we'd select for an mdev in libvirt.
Like I said, this is roughly where we were standing at the last time
we had a
discussion about this, situation may have changed already, these were
just my
2 cents and I myself need to find time to catch up with mdevs a bit
:).
Erik and I had a brief discussion on IRC. One of his questions was
whether mdevctl exposed any configuration:
<eskultet> I can imagine libvirt depending on mdevctl if it exposes
some configs or profiles, because once we have those, LP can migrate
those and libvirt should IMO be fine
And the answer is that mdevctl does provide a way to export
configuration in JSON format. For example, here's mdevctl showing the
mediated device that I configured with my laptop GPU:
$ mdevctl list -d --dumpjson
[
{
"0000:00:02.0": [
{
"20e75edd-1c34-43a2-a951-f1ee1e77e789": {
"mdev_type": "i915-GVTg_V5_8",
"start": "auto"
}
}
]
}
]
Something above libvirt would need to ensure that a suitable device
exists on the migration destination and then translate the PCI address
somehow, but it looks like mdevctl probably gives us enough
information. I need to think a bit more deeply about the exact
requirements and implementation details. I'd welcome any guidance from
those with more experience in this area.
Jonathon