On Tue, Oct 02, 2018 at 06:26:12PM +0200, Andrea Bolognani wrote:
> On Tue, 2018-10-02 at 17:19 +0200, Peter Krempa wrote:
> > On Tue, Oct 02, 2018 at 16:14:39 +0200, Andrea Bolognani wrote:
> [...]
> > > Two concrete examples are considered here: one is the
> > > virConnectNumOfDomains() API which, while known to be racy and to
> > > have non-racy alternatives, can still be used by developers without
> > > getting any kind of warning in the process; the other one is the
> > > ability to define a domain without specifying the machine type, which
> >
> > Okay, but for these particular ones we could do a compile time warning.
>
> I believe we should really have both, so that we cover applications
> that are rebuilt against newer libvirt all the time, such as open
> source projects included in our CI or in a Linux distro, as well as
> those that aren't and only pick up a new libvirt version from time
> to time through OS updates, such as home-grown management tools.
>
> > Not that we can ever remove them, though.
>
> True, that has been our policy so far. Doesn't mean it cannot ever
> change :)
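
As an aside, for anyone not familiar with the API in question: the
race is the classic count-then-list pattern, and
virConnectListAllDomains(), available since libvirt 0.9.13, does the
same job in a single call. A rough C sketch:

  #include <libvirt/libvirt.h>
  #include <stdlib.h>

  /* Racy: the set of domains can change between the two calls,
   * so 'ids' may end up truncated or only partially filled. */
  void list_domains_racy(virConnectPtr conn)
  {
      int n = virConnectNumOfDomains(conn);
      if (n < 0)
          return;
      int *ids = malloc(sizeof(int) * n);
      n = virConnectListDomains(conn, ids, n);
      /* ... use ids[0..n-1] ... */
      free(ids);
  }

  /* Race-free: a single call returns a consistent snapshot. */
  void list_domains_all(virConnectPtr conn)
  {
      virDomainPtr *domains = NULL;
      int n = virConnectListAllDomains(conn, &domains,
                                       VIR_CONNECT_LIST_DOMAINS_ACTIVE);
      for (int i = 0; i < n; i++)
          virDomainFree(domains[i]);
      free(domains);
  }
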
> [...]
> > Our documentation states in multiple places that, for fields not
> > populated by the user, what the default will become is mostly
> > hypervisor dependent.
> >
> > In my opinion the machine type is similarly hypervisor dependent,
> > and in this case the "hypervisor" for the libvirt-qemu
> > infrastructure also involves libvirt's qemu driver.
>
> I don't necessarily disagree with you, but it should be noted that
> attempts to change libvirt's own defaults have been rejected time
> and time again on the basis that existing applications were, despite
> that being a very bad idea, relying on them.
> > > adoption, as well as being a manifestation of the more general
> > > problem of libvirt's defaults being sometimes too conservative and
> > > at odds with the existence of slimmed-down QEMU binaries built
> > > with reduction of the total attack surface in mind.
> >
> > If your qemu binary does not support a certain feature, libvirt
> > will know it. We have capability detection and, for that matter, we
> > also have machine type detection (we fill in the default according
> > to the canonical name). In such a case we are free to choose
> > anything which will satisfy the default.
> >
> > I'm afraid though that the downstreams you are mentioning can't in
> > fact fully drop i440fx for some reason and thus are trying to
> > weasel around it by attempting to make us change the default. This
> > I don't consider a worthy goal. If they want to slim down qemu,
> > they are welcome to, and we can pick a suitable different default.
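
As an aside, the canonical-name handling mentioned above can be seen
in the capabilities XML, where alias machine types such as 'pc' and
'q35' are mapped to versioned canonical names. Abridged output, with
versions and CPU limits that are merely illustrative:

  <guest>
    <arch name='x86_64'>
      ...
      <machine canonical='pc-i440fx-3.0' maxCpus='255'>pc</machine>
      <machine canonical='pc-q35-3.0' maxCpus='288'>q35</machine>
      ...
    </arch>
  </guest>
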
> q35 is what sparked the discussion, but it's far from the only
> offender. For example, if I create a guest using
>
>   $ virt-install \
>       --os-variant fedora28 \
>       ...
>
> then libosinfo will be queried for information about supported
> network cards, and the resulting XML will look like
>
>   <interface type='network'>
>     <model type='virtio'/>
>     ...
>
> However, libvirt's own default for x86_64 guests' network devices is
> rtl8139, which means that if I later 'virsh edit' the guest or
> 'virsh attach-device' a new network interface I will get that model
> instead of virtio; to add insult to injury, the above will happen
> even if my QEMU binary has rtl8139 compiled out and virtio-net-pci
> compiled in!

That's not correct actually. If rtl8139 is missing in QEMU, libvirt
will try e1000, and if that's missing it'll try virtio-net.
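
Either way, the fallback only matters when the model is left
implicit: an application that wants virtio regardless can say so in
the device XML it hands to libvirt. A rough example, where the
domain name and network are placeholders:

  $ cat net.xml
  <interface type='network'>
    <source network='default'/>
    <model type='virtio'/>
  </interface>
  $ virsh attach-device mydomain net.xml --live
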
> > We can also consider using what qemu provides as a default. Users
> > will get the default they asked for. They can always specify their
> > specific type if their software is not happy with it.
>
> Using QEMU's default machine type is exactly what we were doing
> until very recently, but we changed that because QEMU's default has
> been i440fx for so long that applications have come to rely on it
> and would break if q35 suddenly started showing up instead. This
> goes to show that they should not have been relying on either QEMU's
> or libvirt's default in the first place: they should have been
> providing the machine type explicitly, possibly as obtained by
> querying libosinfo. Or, looking at it from the other side, it shows
> that libvirt should have required them to provide the machine type
> instead of trying to be helpful and filling it in when absent. We
> can't retroactively mandate that applications do that, but we can
> deprecate such behavior and thus steer them firmly towards the
> proper solution.
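
For the record, pinning the machine type is a one-line change in the
domain XML's <os> element; libvirt expands an alias such as 'q35' to
the canonical versioned name:

  <os>
    <type arch='x86_64' machine='q35'>hvm</type>
  </os>
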
The problem with saying applications were doing it "wrong" is that
this definition of "wrong" changes. Applications were perfectly
justified in not providing a machine type, because the concept
didn't even exist in earlier libvirt. Once it did exist, we still
only supported x86, and there was no q35, so it was still valid to
not specify it.

Even today it is reasonable to not care about the machine type in
cases where the app only cares about x86.

Our view of the "best" way to configure a guest is changing, and in
many cases it is becoming increasingly clear that there's no single
"best" way, and no single perfect default.

Taking something that's historically optional and saying it should
be mandatory is a de facto API breakage. Deprecating it doesn't
magically stop it being an API breakage. It is just giving apps
a warning that we're about to hurt them, and I don't consider that
a good thing.

I think we're largely missing the bigger picture here. Configuring
guests, and using libvirt APIs in general, can be very complicated.
We provide basic API contract docs, and a crude XML schema reference,
but this is woefully insufficient. There's very little telling apps
about the big-picture way to configure things / implement tasks.

We're missing a good developer guide where you'd give info to app
devs about how to effectively use libvirt, so it is no surprise that
apps do things that are sub-optimal. Providing better docs to app
devs would be far more useful than deprecation warnings which have
minimal contextual guidance.

Regards,
Daniel
--
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|