On Thu, Sep 16, 2021 at 21:16:03 -0400, Laine Stump wrote:
On 9/13/21 11:38 AM, Peter Krempa wrote:
> On Fri, Sep 10, 2021 at 11:18:20 -0400, Laine Stump wrote:
> > On 9/10/21 3:30 AM, Peter Krempa wrote:
> > > On Thu, Sep 09, 2021 at 13:46:41 -0400, Laine Stump wrote:
Look, my problem is that we have an utterly inconsistent user story in
regards to device update and we are repeating our own mistakes.
If we are stating that the user should use the FULL xml for update, then
they should use the full XML and we should not arbitrarily choose what
we allow.
Similarly if we want to allow partial XML then that's fine as long as we
are consistent with ourselves, the user expectations and the docs.
Now for the repeating of our mistake part:
1) originally the alias was completely ignored in device-update,
since it
was always set automatically by libvirt, and so it was a read-only attribute
from the application's PoV.
2) Then user-specified aliases were added, with no associated change to the
update-device code, and QE testing found that a requested change to alias
during live device update was silently ignored:
https://bugzilla.redhat.com/1585108
Same for the ACPI index. It was added, and device-update was neglected.
3) This led to a patch that checked for any change in alias in the
update-device XML, with no special case for "not specified" (i.e. basically
what you're suggesting the code should be):
https://gitlab.com/libvirt/libvirt/-/commit/4ad54a417a1bbe10aeffcce783f43...
Yup, that's also exactly what this patch is doing.
4) That patch resulted in a bug filed by RHV saying that the behavior
of the
update-device API had changed, which broke their existing code for changing
the media in a CD drive or something like that (they don't read back the
normalized XML after they've started the guest, so they don't have any info
about auto-generated values, and thus weren't including alias in the XML
they sent to update-device (this was before RHV started specifying their own
aliases):
https://bugzilla.redhat.com/1621910
So this is still going to be a hypothetical.
Imagine a management application that uses partial XML when updating a
device because they generate it rather than taking a snippet from
dumpxml and modifying it. (We do have an example at least in terms of
disk update here.)
The same way as in libvirt, they figure out that they need to start
using the ACPI index for some reason. They implement it in their full
domain XML generator, but similarly to libvirt where we neglect the
update-device code they don't modify the generator that generates the
update XML.
So now they are using the ACPI index when defining the XML but not when
updating the device. The update-device code in libvirt thinks now that
they are attempting to change the ACPI index, which would seem to be a
regression, because nobody actually touched the update device XML
generator. This is also amplified by the fact libvirt actually fixed a
similar bug before. Bam a new BZ gets filed for this.
I grant you a mitigating circumstance that ACPI index is not really a
commonly used feature, so there's a chance that the scenario above might
not happen, but that doesn't mean that it's not a latent problem given
what we've done before.
5) *That* bug resulted in the code being changed to what it is today,
with
this patch:
https://gitlab.com/libvirt/libvirt/-/commit/b48d9e939bcf32a8d6e571313637e...
So essentially we are trapped into this "ignore unspecified alias" semantic
because alias previously existed but wasn't user-specified, and suddenly
requiring it broke existing applications trying to do things completely
unrelated to aliases.
See, we've acknowledged actually that partial XMLs are fine and thus
there was my reasoning that the same logic as with the alias should be
applied, otherwise it's opening a gate for another history repeat.