On 1/31/22 14:18, Daniel P. Berrangé wrote:
> On Mon, May 10, 2021 at 03:16:11PM +0200, Pavel Hrdina wrote:
>> When QEMU introduces new firmware features libvirt will fail until we
>> list that feature in our code as well which doesn't sound right.
>>
>> We should simply ignore the new feature until we add a proper support
>> for it.
>>
>> Reported-by: Laszlo Ersek <lersek(a)redhat.com>
>> Signed-off-by: Pavel Hrdina <phrdina(a)redhat.com>
>> ---
>> src/qemu/qemu_firmware.c | 12 ++++++------
>> 1 file changed, 6 insertions(+), 6 deletions(-)
>
> Re-visiting this patch....
>
> During guest startup, libvirt parses all firmware descriptors and
> treats any errors during parsing as fatal.
>
> The problem we were fixing here was that on the QEMU / EDK2 side a
> new firmware feature called 'amd-sev-es' was added which libvirt
> didn't know about, so all libvirt guests using <os
firmware='efi'/>
> broke when a distro shipped a descriptor.
>
> We semi-future proofed ourselves by ignoring unknown features
> hereafter.
>
> I'm now coming to introduce a new feature in the firmware descriptors
> to allow for OVMF builds which have a read-only CODE and *NOT* VARs
> file at all.
>
> Currently the firmware descriptor treats nvram path as mandatory
> and libvirt validates that.
>
> IOW, as soon as I make the nvram path optional and some distro
> ships a firmware descriptor taking advantage of that, libvirt again
> breaks for all guests using <os firmware='efi'/>. I've thought
about
> various different ways to handle this but can't figure out any way
> to support an VARs-free firmware descriptor in a way that's backwards
> compatible with libvirt's current parsing code.
>
> This is my current qemu patch:
>
>
https://listman.redhat.com/archives/libvir-list/2022-January/msg01316.html
>
> We weren't future proofed enough.
>
> It is nice reporting errors when parsing firmware descriptors because
> it highlights real world mistakes.
>
> It is bad reporting errors when parsing firmware descriptors because
> it introduces potential for bogus failure scenarios any time the spec
> is extended.
>
>
> If we quit reporting errors and simply log a warning and ignore the
> firmware descriptor we couldn't parse entirely, then we are more
> robust if a completely new firmware descriptor is introduced - we
> can still parse existing firmware descriptor files and probably
> do something sensible.
>
> On the flipside, if an existing firmware descriptor is modified and
> we ignore it due to some unexpected data being present, we cause
> a regression.
>
> I feel like we're in a bit of a no-win situation here wrt error
> reporting and future proofing.
>
> My gut feeling is that we have little choice but to rely on the OS
> distro not to introduce firmware descriptors with fields that
> libvirt is too old to support, except in limited scenarios like
> the features enum case below where we can reasonably ignore a
> very specific error.
>
> Anyone have other ideas ?
I think it's sane requirement for distros to ship FW descriptors that
libvirt understands. Or backport patches for libvirt. One way or
another, if a mgmt app would want to use SEV it would need bleeding edge
libvirt anyway. And we can use those number prefixes to make new
firmware files de-prioritized so that users who don't care are not
affected.
Oh, I hadn't checked how number prioritization affects it. So you're
saying that we stop parsing the firmware file as soon as we find a
valid match. That does indeed limit the impact of the change.
Regards,
Daniel
--
|: