[PATCH v2] docs: expand firmware descriptor to allow flash without NVRAM

The current firmware descriptor schema for flash requires that both the executable and NVRAM template paths be provided. This is fine for the most common usage of EDK2 builds in virtualization where the separate _CODE and _VARS files are provided.
With confidential computing technology like AMD SEV, persistent storage of variables may be completely disabled because the firmware requires a known clean state on every cold boot. There is no way to express this in the firmware descriptor today.
Even with regular EDK2 builds it is possible to create a firmware that has both executable code and variable persistence in a single file. This hasn't been commonly used, since it would mean every guest bootup would need to clone the full firmware file, leading to redundant duplicate storage of the code portion. In some scenarios this may not matter and might even be beneficial. For example, if a public cloud allows users to bring their own firmware, such that the user can pre-enroll their own secure boot keys, you're going to have this copied on disk for each tenant already. At this point it can be simpler to just deal with a single file rather than split builds. The firmware descriptor ought to be able to express this combined firmware model too.
This all points towards expanding the schema for flash with a 'mode' concept:
- "split" - the current implicit behaviour with separate files for code and variables.
- "combined" - the alternate behaviour where a single file contains both code and variables.
- "stateless" - the confidential computing use case where storage of variables is completely disabled, leaving only the code.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 docs/interop/firmware.json | 54 ++++++++++++++++++++++++++++++++------
 1 file changed, 46 insertions(+), 8 deletions(-)
In v2:
- Mark 'mode' as optional field
- Misc typos in docs
diff --git a/docs/interop/firmware.json b/docs/interop/firmware.json
index 8d8b0be030..f5d1d0b6e7 100644
--- a/docs/interop/firmware.json
+++ b/docs/interop/firmware.json
@@ -210,24 +210,61 @@
   'data' : { 'filename' : 'str',
              'format' : 'BlockdevDriver' } }
 
+
+##
+# @FirmwareFlashType:
+#
+# Describes how the firmware build handles code versus variable
+# persistence.
+#
+# @split: the executable file contains code while the NVRAM
+#         template provides variable storage. The executable
+#         must be configured read-only and can be shared between
+#         multiple guests. The NVRAM template must be cloned
+#         for each new guest and configured read-write.
+#
+# @combined: the executable file contains both code and
+#            variable storage. The executable must be cloned
+#            for each new guest and configured read-write.
+#            No NVRAM template will be specified.
+#
+# @stateless: the executable file contains code and variable
+#             storage is not persisted. The executed must
+#             be configured read-only and can be shared
+#             between multiple guests. No NVRAM template
+#             will be specified.
+#
+# Since: 7.0.0
+##
+{ 'enum': 'FirmwareFlashMode',
+  'data': [ 'split', 'combined', 'stateless' ] }
+
 ##
 # @FirmwareMappingFlash:
 #
 # Describes loading and mapping properties for the firmware executable
 # and its accompanying NVRAM file, when @FirmwareDevice is @flash.
 #
-# @executable: Identifies the firmware executable. The firmware
-#              executable may be shared by multiple virtual machine
-#              definitions. The preferred corresponding QEMU command
-#              line options are
+# @mode: describes how the firmware build handles code versus variable
+#        storage. If not present, it must be treated as if it was
+#        configured with value ``split``. Since: 7.0.0
+#
+# @executable: Identifies the firmware executable. The @mode
+#              indicates whether there will be an associated
+#              NVRAM template present. The preferred
+#              corresponding QEMU command line options are
 #                  -drive if=none,id=pflash0,readonly=on,file=@executable.@filename,format=@executable.@format
 #                  -machine pflash0=pflash0
-#              or equivalent -blockdev instead of -drive.
+#              or equivalent -blockdev instead of -drive. When
+#              @mode is ``combined`` the executable must be
+#              cloned before use and configured with readonly=off.
 #              With QEMU versions older than 4.0, you have to use
 #                  -drive if=pflash,unit=0,readonly=on,file=@executable.@filename,format=@executable.@format
 #
 # @nvram-template: Identifies the NVRAM template compatible with
-#                  @executable. Management software instantiates an
+#                  @executable, when @mode is set to ``split``,
+#                  otherwise it should not be present.
+#                  Management software instantiates an
 #                  individual copy -- a specific NVRAM file -- from
 #                  @nvram-template.@filename for each new virtual
 #                  machine definition created. @nvram-template.@filename
@@ -246,8 +283,9 @@
 # Since: 3.0
 ##
 { 'struct' : 'FirmwareMappingFlash',
-  'data' : { 'executable' : 'FirmwareFlashFile',
-             'nvram-template' : 'FirmwareFlashFile' } }
+  'data' : { '*mode': 'FirmwareFlashMode',
+             'executable' : 'FirmwareFlashFile',
+             '*nvram-template' : 'FirmwareFlashFile' } }
 
 ##
 # @FirmwareMappingKernel:
--
2.34.1
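As an illustration of the three modes described in the patch, the following is a minimal Python sketch of how management software might interpret a flash mapping. It is not taken from QEMU or libvirt: the names FlashPlan and plan_flash_mapping are hypothetical, and rejecting an 'nvram-template' outside "split" mode is a stricter reading than the "should not be present" wording in the patch.

```python
from dataclasses import dataclass

VALID_MODES = ("split", "combined", "stateless")


@dataclass(frozen=True)
class FlashPlan:
    """Hypothetical summary of what a management tool has to do."""
    mode: str
    clone_executable: bool      # per-guest copy of the executable needed?
    executable_readonly: bool   # map the executable read-only?
    needs_nvram_template: bool  # instantiate a per-guest NVRAM file?


def plan_flash_mapping(mapping: dict) -> FlashPlan:
    """Interpret a FirmwareMappingFlash-style dict using the rules in the patch."""
    # Per the patch, an absent 'mode' must be treated as 'split'.
    mode = mapping.get("mode", "split")
    if mode not in VALID_MODES:
        raise ValueError(f"unknown flash mode: {mode!r}")
    if "executable" not in mapping:
        raise ValueError("'executable' is required")

    has_nvram = "nvram-template" in mapping
    if mode == "split" and not has_nvram:
        raise ValueError("'split' mode needs an 'nvram-template'")
    if mode != "split" and has_nvram:
        # Assumption: treat the "should not be present" wording as an error.
        raise ValueError(f"'nvram-template' is not expected in {mode!r} mode")

    return FlashPlan(
        mode=mode,
        clone_executable=(mode == "combined"),
        executable_readonly=(mode != "combined"),
        needs_nvram_template=(mode == "split"),
    )
```

A descriptor without a 'mode' key therefore yields the same plan as one that says "split", which is what keeps existing descriptors valid.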

On Mon, Jan 31, 2022 at 12:55:09PM +0000, Daniel P. Berrangé wrote:
The current firmware descriptor schema for flash requires that both the executable and NVRAM template paths be provided. This is fine for the most common usage of EDK2 builds in virtualization where the separate _CODE and _VARS files are provided.
With confidential computing technology like AMD SEV, persistent storage of variables may be completely disabled because the firmware requires a known clean state on every cold boot. There is no way to express this in the firmware descriptor today.
Even with regular EDK2 builds it is possible to create a firmware that has both executable code and variable persistence in a single file. This hasn't been commonly used, since it would mean every guest bootup would need to clone the full firmware file, leading to redundant duplicate storage of the code portion. In some scenarios this may not matter and might even be beneficial. For example, if a public cloud allows users to bring their own firmware, such that the user can pre-enroll their own secure boot keys, you're going to have this copied on disk for each tenant already. At this point it can be simpler to just deal with a single file rather than split builds. The firmware descriptor ought to be able to express this combined firmware model too.
Cool, TIL that it's possible to include both the executable and the variables file into a single file. I briefly wondered if in this "combined" mode whether the no. of duplicate copies can ever fill up the storage. I doubt that, as the combined size of _VARS + _CODE is just about 2MB. So it only starts mattering if you're running tens of thousands of guests.
This all points towards expanding the schema for flash with a 'mode' concept:
- "split" - the current implicit behaviour with separate files for code and variables.
- "combined" - the alternate behaviour where a single file contains both code and variables.
- "stateless" - the confidential computing use case where storage of variables is completely disabled, leaving only the code.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 docs/interop/firmware.json | 54 ++++++++++++++++++++++++++++++++------
 1 file changed, 46 insertions(+), 8 deletions(-)
In v2:
- Mark 'mode' as optional field
- Misc typos in docs
diff --git a/docs/interop/firmware.json b/docs/interop/firmware.json
index 8d8b0be030..f5d1d0b6e7 100644
--- a/docs/interop/firmware.json
+++ b/docs/interop/firmware.json
@@ -210,24 +210,61 @@
   'data' : { 'filename' : 'str',
              'format' : 'BlockdevDriver' } }
+
+##
+# @FirmwareFlashType:
+#
+# Describes how the firmware build handles code versus variable
+# persistence.
+#
+# @split: the executable file contains code while the NVRAM
+#         template provides variable storage. The executable
+#         must be configured read-only and can be shared between
+#         multiple guests. The NVRAM template must be cloned
+#         for each new guest and configured read-write.
+#
+# @combined: the executable file contains both code and
+#            variable storage. The executable must be cloned
+#            for each new guest and configured read-write.
+#            No NVRAM template will be specified.
Given my above wondering, is it worth adding a note here about storage considerations when running a large number of guests in the "combined" mode? If not, ignore my comment.
+#
+# @stateless: the executable file contains code and variable
+#             storage is not persisted. The executed must
I guess you meant: s/executed/executable/ Whoever is applying the patch can touch it up.
+#             be configured read-only and can be shared
+#             between multiple guests. No NVRAM template
+#             will be specified.
+#
+# Since: 7.0.0
+##
+{ 'enum': 'FirmwareFlashMode',
+  'data': [ 'split', 'combined', 'stateless' ] }
+
 ##
 # @FirmwareMappingFlash:
 #
 # Describes loading and mapping properties for the firmware executable
 # and its accompanying NVRAM file, when @FirmwareDevice is @flash.
 #
-# @executable: Identifies the firmware executable. The firmware
-#              executable may be shared by multiple virtual machine
-#              definitions. The preferred corresponding QEMU command
-#              line options are
+# @mode: describes how the firmware build handles code versus variable
+#        storage. If not present, it must be treated as if it was
+#        configured with value ``split``. Since: 7.0.0
For consistency, might want to capitalize the first word: s/describes/Describes/
(Here too, maintainer can touch it up.)
[...]
The concept looks very clear, and obviously useful. FWIW:
Reviewed-by: Kashyap Chamarthy <kchamart@redhat.com>
-- 2.34.1
-- /kashyap
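The back-of-the-envelope storage estimate in this message can be worked through with a few lines of Python. This is only an arithmetic sketch: the guest count of 20,000 is a hypothetical figure standing in for "tens of thousands", and the "about 2MB" image size is read here as 2 MiB.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def duplicated_storage_bytes(guests: int, firmware_bytes: int) -> int:
    """Total space used when every guest holds its own full firmware copy."""
    return guests * firmware_bytes


# Hypothetical fleet size; 2 MiB approximates the combined _CODE + _VARS size.
total = duplicated_storage_bytes(guests=20_000, firmware_bytes=2 * MIB)
print(f"{total / GIB:.1f} GiB")  # prints "39.1 GiB"
```

Even at that scale the total is a few tens of GiB across the whole fleet, which matches the intuition in the message that the cost is small per guest.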

On Mon, Jan 31, 2022 at 03:00:33PM +0100, Kashyap Chamarthy wrote:
On Mon, Jan 31, 2022 at 12:55:09PM +0000, Daniel P. Berrangé wrote:
The current firmware descriptor schema for flash requires that both the executable and NVRAM template paths be provided. This is fine for the most common usage of EDK2 builds in virtualization where the separate _CODE and _VARS files are provided.
With confidential computing technology like AMD SEV, persistent storage of variables may be completely disabled because the firmware requires a known clean state on every cold boot. There is no way to express this in the firmware descriptor today.
Even with regular EDK2 builds it is possible to create a firmware that has both executable code and variable persistence in a single file. This hasn't been commonly used, since it would mean every guest bootup would need to clone the full firmware file, leading to redundant duplicate storage of the code portion. In some scenarios this may not matter and might even be beneficial. For example, if a public cloud allows users to bring their own firmware, such that the user can pre-enroll their own secure boot keys, you're going to have this copied on disk for each tenant already. At this point it can be simpler to just deal with a single file rather than split builds. The firmware descriptor ought to be able to express this combined firmware model too.
Cool, TIL that it's possible to include both the executable and the variables file into a single file.
I briefly wondered if in this "combined" mode whether the no. of duplicate copies can ever fill up the storage. I doubt that, as the combined size of _VARS + _CODE is just about 2MB. So it only starts mattering if you're running tens of thousands of guests.
When guest root / data disk sizes are measured in 100's of MB, or GBs, I struggle to get worried about even a 16 MB OVMF blob being copied per guest. The firmware can be provided in qcow2 format too, so if really concerned, just create a qcow2 file with a backing store pointing to the readonly master, so you're only paying the price of the delta for any guest VARs writes. That's more efficient than what we do today with copying the separate raw format VARS.fd file.
This all points towards expanding the schema for flash with a 'mode' concept:
- "split" - the current implicit behaviour with separate files for code and variables.
- "combined" - the alternate behaviour where a single file contains both code and variables.
- "stateless" - the confidential computing use case where storage of variables is completely disabled, leaving only the code.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 docs/interop/firmware.json | 54 ++++++++++++++++++++++++++++++++------
 1 file changed, 46 insertions(+), 8 deletions(-)
In v2:
- Mark 'mode' as optional field
- Misc typos in docs
diff --git a/docs/interop/firmware.json b/docs/interop/firmware.json
index 8d8b0be030..f5d1d0b6e7 100644
--- a/docs/interop/firmware.json
+++ b/docs/interop/firmware.json
@@ -210,24 +210,61 @@
   'data' : { 'filename' : 'str',
              'format' : 'BlockdevDriver' } }
+
+##
+# @FirmwareFlashType:
+#
+# Describes how the firmware build handles code versus variable
+# persistence.
+#
+# @split: the executable file contains code while the NVRAM
+#         template provides variable storage. The executable
+#         must be configured read-only and can be shared between
+#         multiple guests. The NVRAM template must be cloned
+#         for each new guest and configured read-write.
+#
+# @combined: the executable file contains both code and
+#            variable storage. The executable must be cloned
+#            for each new guest and configured read-write.
+#            No NVRAM template will be specified.
Given my above wondering, is it worth adding a note here about storage considerations when running a large number of guests in the "combined" mode? If not, ignore my comment.
I don't think it's worth worrying about.
+#
+# @stateless: the executable file contains code and variable
+#             storage is not persisted. The executed must
I guess you meant: s/executed/executable/
Oops, yes.
Whoever is applying the patch can touch it up.
+#             be configured read-only and can be shared
+#             between multiple guests. No NVRAM template
+#             will be specified.
+#
+# Since: 7.0.0
+##
+{ 'enum': 'FirmwareFlashMode',
+  'data': [ 'split', 'combined', 'stateless' ] }
+
 ##
 # @FirmwareMappingFlash:
 #
 # Describes loading and mapping properties for the firmware executable
 # and its accompanying NVRAM file, when @FirmwareDevice is @flash.
 #
-# @executable: Identifies the firmware executable. The firmware
-#              executable may be shared by multiple virtual machine
-#              definitions. The preferred corresponding QEMU command
-#              line options are
+# @mode: describes how the firmware build handles code versus variable
+#        storage. If not present, it must be treated as if it was
+#        configured with value ``split``. Since: 7.0.0
For consistency, might want to capitalize the first word: s/describes/Describes/
Yep
(Here too, maintainer can touch it up.)
[...]
The concept looks very clear, and obviously useful. FWIW:
Reviewed-by: Kashyap Chamarthy <kchamart@redhat.com>
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
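The qcow2 overlay suggestion in the message above can be sketched as follows. This is a minimal illustration, not how libvirt does it: it only builds the argument list for "qemu-img create" with a backing file (the -f, -b and -F options), and the function name and all paths are hypothetical.

```python
def overlay_create_argv(base_path: str, base_format: str, overlay_path: str) -> list:
    """Build a 'qemu-img create' command for a per-guest qcow2 overlay.

    The overlay stores only the blocks a guest writes (for example its
    variable store); unmodified blocks are read from the shared,
    read-only base image.
    """
    return [
        "qemu-img", "create",
        "-f", "qcow2",        # format of the new overlay file
        "-b", base_path,      # backing file: the read-only master image
        "-F", base_format,    # format of the backing file
        overlay_path,
    ]


# Hypothetical paths, for illustration only.
argv = overlay_create_argv("/usr/share/firmware/combined.fd", "raw",
                           "/var/lib/guests/guest1/firmware.qcow2")
print(" ".join(argv))
```

The list can be handed to subprocess.run(argv, check=True) on a host where qemu-img is installed; the sketch stops short of running it so that it has no side effects.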

On Mon, Jan 31, 2022 at 02:36:46PM +0000, Daniel P. Berrangé wrote:
On Mon, Jan 31, 2022 at 03:00:33PM +0100, Kashyap Chamarthy wrote:
On Mon, Jan 31, 2022 at 12:55:09PM +0000, Daniel P. Berrangé wrote:
[...]
I briefly wondered if in this "combined" mode whether the no. of duplicate copies can ever fill up the storage. I doubt that, as the combined size of _VARS + _CODE is just about 2MB. So it only starts mattering if you're running tens of thousands of guests.
When guest root / data disk sizes are measured in 100's of MB, or GBs, I struggle to get worried about even a 16 MB OVMF blob being copied per guest.
Heh, fair enough.
The firmware can be provided in qcow2 format too, so if really concerned, just create a qcow2 file with a backing store pointing to the readonly master, so you're only paying the price of the delta for any guest VARs writes. That's more efficient than what we do today with copying the separate raw format VARS.fd file.
That's nice, I didn't know the qcow2 possibility in this context. For some reason I assumed the file format always has to be raw here. Your qcow2 point above should be documented, if it isn't already. Although I don't know the right place for it. [...] -- /kashyap

On Mon, Jan 31, 2022 at 04:21:36PM +0100, Kashyap Chamarthy wrote:
On Mon, Jan 31, 2022 at 02:36:46PM +0000, Daniel P. Berrangé wrote:
On Mon, Jan 31, 2022 at 03:00:33PM +0100, Kashyap Chamarthy wrote:
On Mon, Jan 31, 2022 at 12:55:09PM +0000, Daniel P. Berrangé wrote:
[...]
I briefly wondered if in this "combined" mode whether the no. of duplicate copies can ever fill up the storage. I doubt that, as the combined size of _VARS + _CODE is just about 2MB. So it only starts mattering if you're running tens of thousands of guests.
When guest root / data disk sizes are measured in 100's of MB, or GBs, I struggle to get worried about even a 16 MB OVMF blob being copied per guest.
Heh, fair enough.
The firmware can be provided in qcow2 format too, so if really concerned, just create a qcow2 file with a backing store pointing to the readonly master, so you're only paying the price of the delta for any guest VARs writes. That's more efficient than what we do today with copying the separate raw format VARS.fd file.
That's nice, I didn't know the qcow2 possibility in this context. For some reason I assumed the file format always has to be raw here. Your qcow2 point above should be documented, if it isn't already. Although I don't know the right place for it.
There's already a format field in the descriptor, but even if the firmware is distributed as raw, libvirt can choose to put a qcow2 overlay on it, as it's all configured with -blockdev.
Regards,
Daniel

On Mon, Jan 31, 2022 at 03:35:02PM +0000, Daniel P. Berrangé wrote:
On Mon, Jan 31, 2022 at 04:21:36PM +0100, Kashyap Chamarthy wrote:
On Mon, Jan 31, 2022 at 02:36:46PM +0000, Daniel P. Berrangé wrote:
[...]
The firmware can be provided in qcow2 format too, so if really concerned, just create a qcow2 file with a backing store pointing to the readonly master, so you're only paying the price of the delta for any guest VARs writes. That's more efficient than what we do today with copying the separate raw format VARS.fd file.
That's nice, I didn't know the qcow2 possibility in this context. For some reason I assumed the file format always has to be raw here. Your qcow2 point above should be documented, if it isn't already. Although I don't know the right place for it.
There's already a format field in the descriptor, but even if the firmware is distributed as raw, libvirt can choose to put a qcow2 overlay on it, as it's all configured with -blockdev.
Ah, understood. I should've first checked the spec to look for the @format field. For others reading the thread, the @format bit is located here in firmware.json:
[...]
# @FirmwareFlashFile:
#
# Defines common properties that are necessary for loading a firmware
# file into a pflash chip. The corresponding QEMU command line option is
# "-drive file=@filename,format=@format". Note however that the
# option-argument shown here is incomplete; it is completed under
# @FirmwareMappingFlash.
#
# @filename: Specifies the filename on the host filesystem where the
#            firmware file can be found.
#
# @format: Specifies the block format of the file pointed-to by
#          @filename, such as @raw or @qcow2.
[...]
--
/kashyap
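To show how @filename and @format feed into the command line documented under @FirmwareMappingFlash, here is a small Python sketch that assembles the -drive and -machine options quoted in the patch. The function name is hypothetical; the readonly=off handling for "combined" follows the patch text, and doubling commas in the filename follows QEMU's usual escaping rule for option strings.

```python
def pflash_executable_args(executable: dict, mode: str = "split") -> list:
    """Build the documented -drive/-machine options for the firmware executable.

    'executable' is a FirmwareFlashFile-style dict with 'filename' and
    'format' keys. In 'combined' mode the filename is expected to point
    at the per-guest clone, which is mapped read-write.
    """
    readonly = "off" if mode == "combined" else "on"
    # A literal comma inside a QEMU option value is written as two commas.
    filename = executable["filename"].replace(",", ",,")
    drive = (
        f"if=none,id=pflash0,readonly={readonly},"
        f"file={filename},format={executable['format']}"
    )
    return ["-drive", drive, "-machine", "pflash0=pflash0"]


# Hypothetical descriptor fragment, for illustration only.
args = pflash_executable_args(
    {"filename": "/usr/share/OVMF/OVMF_CODE.fd", "format": "raw"}
)
print(" ".join(args))
```

Management software that prefers -blockdev, as mentioned in the thread, would express the same filename, format and read-only setting through that option instead.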

On Mon, Jan 31, 2022 at 04:21:36PM +0100, Kashyap Chamarthy wrote:
On Mon, Jan 31, 2022 at 02:36:46PM +0000, Daniel P. Berrangé wrote:
On Mon, Jan 31, 2022 at 03:00:33PM +0100, Kashyap Chamarthy wrote:
On Mon, Jan 31, 2022 at 12:55:09PM +0000, Daniel P. Berrangé wrote:
[...]
I briefly wondered if in this "combined" mode whether the no. of duplicate copies can ever fill up the storage. I doubt that, as the combined size of _VARS + _CODE is just about 2MB. So it only starts mattering if you're running tens of thousands of guests.
When guest root / data disk sizes are measured in 100's of MB, or GBs, I struggle to get worried about even a 16 MB OVMF blob being copied per guest.
Heh, fair enough.
Main advantage of the split is that it is much easier to update the firmware code without smashing the guest vars, not so much the disk space requirements.
take care,
Gerd
participants (3)
- Daniel P. Berrangé
- Gerd Hoffmann
- Kashyap Chamarthy