On Tue, May 14, 2019 at 11:51:35AM +0200, Cornelia Huck wrote:
On Tue, 14 May 2019 03:47:36 -0400
Yan Zhao <yan.y.zhao(a)intel.com> wrote:
> On Tue, May 14, 2019 at 03:43:44PM +0800, Erik Skultety wrote:
> > On Tue, May 14, 2019 at 03:32:19AM -0400, Yan Zhao wrote:
> > > On Tue, May 14, 2019 at 03:20:40PM +0800, Erik Skultety wrote:
> > > > That said, from libvirt POV as a consumer, I'd expect there to be truly only 2
> > > > errors (I believe Alex has mentioned something similar in one of his responses
> > > > in one of the threads):
> > > > a) read error indicating that an mdev type doesn't support migration
> > > > - I assume if one type doesn't support migration, none of the other
> > > > types exposed on the parent device do, is that a fair assumption?
Probably; but there might be cases where the migratability depends not
on the device type, but how the partitioning has been done... or is
that too contrived?
No, you have a point - once again I let my thoughts be carried away by the idea
of heterogeneous setups, which is a discussion for another time anyway, I was
just thinking out loud.
> > > > b) write error indicating that the mdev types are incompatible for
> > > > migration
> > > >
> > > > Regards,
> > > > Erik
> > > Thanks for this explanation.
> > > so, can we arrive at below agreements?
> > >
> > > 1. "not to define the specific errno returned for a specific situation,
> > > let the vendor driver decide, userspace simply needs to know that an errno on
> > > read indicates the device does not support migration version comparison and
> > > that an errno on write indicates the devices are incompatible or the target
> > > doesn't support migration versions. "
> > > 2. vendor driver should log detailed error reasons in kernel log.
> >
> > That would be my take on this, yes, but I'm open to hear any other suggestions and
> > ideas I couldn't think of as well.
So, read to find out whether migration is supported at all, write to
find out whether it is supported for that concrete pairing is
reasonable for libvirt?
Yes, more specifically, in the prepare phase of migration, we'd retrieve the
string (potentially reporting an error like: "Failed to query migration
support: <errno translation>"), put the string into the migration cookie and
do the check with write on destination. The only thing is that if the error is
on the destination, the error message in kernel log lives only on the
destination, which doesn't help libvirt users, so it would require setting up
remote logging, but for layered products, this is not a problem since those
already utilize central logging nodes.
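
As a rough illustration of that flow (not libvirt code), the read-on-source /
write-on-destination check could be sketched as below. The function names, the
second error message and the attribute paths passed in are hypothetical; the
only thing taken from this thread is that any errno on read means "no migration
support" and any errno on write means "incompatible, or unsupported on the
target".

```python
import os


def query_migration_version(attr_path):
    """Source side: read the opaque version string from the sysfs attribute.

    Returns (version, None) on success, or (None, message) on any errno.
    """
    try:
        with open(attr_path, "r") as attr:
            return attr.read().strip(), None
    except OSError as err:
        # Any errno on read is treated alike: migration is not supported.
        reason = err.strerror or str(err)
        return None, "Failed to query migration support: " + reason


def check_migration_version(attr_path, version):
    """Destination side: write the source's string to the target attribute.

    Returns (True, None) if the write is accepted, else (False, message).
    """
    try:
        # The buffered write is flushed when the file is closed, which still
        # happens inside this try block, so a rejected write is caught here.
        with open(attr_path, "w") as attr:
            attr.write(version)
        return True, None
    except OSError as err:
        # Any errno on write is treated alike: incompatible or unsupported.
        reason = err.strerror or str(err)
        return False, "Migration version check failed: " + reason
```

The string returned by the first function is what would travel in the migration
cookie; the detailed reason for a rejected write would still only be in the
destination's kernel log.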
Then there are the libvirt-specific bits, out of scope of this discussion:
whether we should only assume identical mdev type pairs, or whether we should
employ a best-effort approach and iterate over all the available types exposed
by the vendor and check whether any of the types would support this migration
(back to your note, Connie: partitioning would come into the picture here).
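
Purely to make the best-effort variant concrete, here is a minimal sketch of
such an iteration. The helper name, the one-subdirectory-per-type layout and
the attribute name "version" are assumptions for illustration only; nothing in
this thread settles them.

```python
import os


def find_compatible_types(types_dir, version, attr_name="version"):
    """Return the names of the types whose attribute accepts `version`.

    `types_dir` is assumed to hold one subdirectory per mdev type, each with
    a writable attribute called `attr_name` (hypothetical name).
    """
    compatible = []
    for type_name in sorted(os.listdir(types_dir)):
        attr_path = os.path.join(types_dir, type_name, attr_name)
        try:
            # O_WRONLY without O_CREAT: a missing attribute is an error,
            # it is never created as a side effect.
            fd = os.open(attr_path, os.O_WRONLY)
            try:
                os.write(fd, version.encode())
            finally:
                os.close(fd)
        except OSError:
            # Any errno means this type cannot take the migration; the
            # specific value is deliberately not interpreted.
            continue
        compatible.append(type_name)
    return compatible
```

An empty result would then map to the same "incompatible" outcome as a single
failed write.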
> >
> > Erik
> got it. thanks a lot!
>
> hi Cornelia and Dave,
> do you also agree on:
> 1. "not to define the specific errno returned for a specific situation,
> let the vendor driver decide, userspace simply needs to know that an errno on
> read indicates the device does not support migration version comparison and
> that an errno on write indicates the devices are incompatible or the target
> doesn't support migration versions. "
> 2. vendor driver should log detailed error reasons in kernel log.
Two questions:
- How reasonable is it to refer to the system log in order to find out
what exactly went wrong?
- If detailed error reporting is basically done to the syslog, do
different error codes still provide useful information? Or should the
vendor driver decide what it wants to do?
I'd leave anything beyond returning -1 on read/write from/to the sysfs to the
vendor driver, as user space has no control over it. Even if there were a
facility to interpret different return codes for us, I'm not sure (in this
migration-related case) how much userspace would be able to recover or
fall back anyway: you either can or cannot migrate smoothly.
Regards,
Erik