On Tue, Mar 07, 2017 at 04:15:03PM +0100, Pavel Hrdina wrote:
On Tue, Mar 07, 2017 at 02:40:55PM +0000, Daniel P. Berrange wrote:
> On Tue, Mar 07, 2017 at 03:32:37PM +0100, Pavel Hrdina wrote:
> > On Tue, Mar 07, 2017 at 02:55:43PM +0100, Andrea Bolognani wrote:
> > > On Tue, 2017-03-07 at 09:19 +0100, Pavel Hrdina wrote:
> > > > > However, after migration is complete, the <controller>
> > > > > element has model='nec-xhci' instead of
model='pci-ohci',
> > > > > which means that power cycling the guest results in
> > > > > breaking the guest ABI.
> > > >
> > > > I'm not so sure that this is an ABI change. The guest ABI is to
ensure
> > > > that the same guest XML will always start the same QEMU
guest. However
> > > > the PERSISTENT migration can make ABI changes because it is the same
as
> > > > virsh dumpxml $domain > $domain.xml && copy the XML onto
remote host
> > > > and virsh define $domain.xml. This would also change the *model*.
> > > >
> > > > If this would be considered to be guest ABI stable it would mean
that
> > > > other changes done by using this flag would be wrong because they
also
> > > > modifies the persistent XML during migration.
> > >
> > > I assume there are very good reasons for persistent
> > > migration to behave differently, but as a user I find
> > > it extremely surprising. Do you have any insight on the
> > > rationale behind allowing ABI changes when performing
> > > persistent migration?
> >
> > Well it's a different kind of migration which is almost similar to the
> > virsh dumpxml --inactive && virsh define, with one more feature,
migration
> > hook that can update the XML before it's defined on the destination.
> >
> > I agree that it may seem that we should handle this type of migration to
> > be guest ABI stable, on the other hand this basically defines new domain
> > on the destination where we allow to do the ABI updates.
> >
> > I'm adding Dan to CC to get his opinion about the guest ABI.
>
> So, there's no particular reason why things must be different for
> live vs persistent migration. The only fundamental requirement is
> that we must never change ABI underneath a running guest, which
> mandates the strong ABI guarantee for live migration & save/restore.
>
> For persistent/offline migration, there's no running guest, so from
> a technical POV we have more freedom in some sense. That said, a
> change in guest hardaware that can cause a guest to become unbootable
> is not a nice thing todo.
>
> For this reason, when you use 'defineXML' libvirt aims to immediately
> fill in (most) XML elements that would affect guest ABI. ie we assign
> addresses at define time, not start time, and we expand non-versioned
> machine types into versioned machine types, etc.
>
> So, yes, we should try to *not* change ABI across a dumpxml & define
> operational.
>
> When we first introduced PCI device addressing in libvirt, we had an
> old state where XML contained no addresses, and upon starting in new
> libvirt we have to assign addresses which might not be the same as
> those seen with previous libvirt before addressing was recorded. Once
> the addresses are assigned though, we must never change them.
This is basically the same issue like with the PCI addresses. We used "-usb"
instead of "-device ...", but when we switched to using the "-device
..."
way we didn't record the default model in the XML. The default model
is recorded in the XML since Libvirt 2.2.0.
Ok, that makes sense then. Until we are able to use -device we can't
gaurantee stability for an arch.
> - if we recorded 'pci-ohci' in the XML, we must never
change that
> to 'nec-xhci'.
This will never happen.
> - if we didn't record 'pci-ochi' in the XML, we should fill in the
> model that is most likely to match what the previous default
> behaviour was, which would be 'pci-ochi' not 'nec-xhci' IIUC.
This patch tries to move from 'pci-ohci' to 'nec-xhci' as a default USB
controller because the current default model is probably broken, see [1],
and will never be fixed. In addition I would like to do this change in
Libvirt so all other tools that uses Libvirt don't have to implement
this change by themselves.
In this case I think it's better to change the default from 'pci-ohci' to
'nec-xhci' if you define a new domain where the model is not specified.
If I look at qemu-system-ppc64 -usb, that provides an 'nec-xhci' device.
I'm curious why libvirt picked pci-ohci ? Was -usb previously providing
a pci-ohci device instead of nec-xhci ?
I guess the thing that could justify the switch is that we don't guarantee
what default devices you'll get for any given guest. It is dependent on
the particular version of the machine type used. So in that sense if you
have an XML doc which is not fully populated with device info, you are
already liable to see changes in default devices based on which QEMU
you happen to define the XML against first.
There is a concern whether we should allow ABI change for persistent
migration which is more general issue that will affect some other changes
that are already "allowed" for persistent migration. I'm starting to feel
that we should remove the "VIR_DOMAIN_DEF_PARSE_ABI_UPDATE" from
"virDomainDefParseNode" call in "qemuMigrationCookieXMLParse"
function
which currently allows to do ABI changes for persistent migration.
Regards,
Daniel
--
|:
http://berrange.com -o-
http://www.flickr.com/photos/dberrange/ :|
|:
http://libvirt.org -o-
http://virt-manager.org :|
|:
http://entangle-photo.org -o-
http://search.cpan.org/~danberr/ :|