On 01/22/14 12:45, Laine Stump wrote:
On 01/22/2014 12:45 PM, Daniel P. Berrange wrote:
> On Wed, Jan 22, 2014 at 01:33:18AM +0100, Laszlo Ersek wrote:
>> Recently,
>>
>> commit 96fddee322c7d39a57cfdc5e7be71326d597d30a
>> Author: Laine Stump <laine(a)laine.org>
>> Date: Mon Dec 2 14:07:12 2013 +0200
>>
>> qemu: add "-boot strict" to commandline whenever possible
>>
>> introduced a regression for OVMF guests. The symptoms and causes are
>> described in patch 3/4, and in
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1056258
>>
>> Let's allow users to opt-out of "-boot strict=on" while preserving it as
>> default.
> I don't really get from that bug description why this can't be
> made to work as desired in OVMF. It seems like it is just a
> bug in the OVMF impl that it doesn't work.
I was on the verge of making that same comment in question form. From
the information in the patches and the BZ, it sounds like either "--boot
strict" is implemented incorrectly for OVMF, or OVMF doesn't do the
proper thing with the "HALT".
What does OVMF do with bootable devices that aren't given a specific
boot order? For seabios, those devices are all on the boot list
following those with specific orders; this is what necessitates --boot
strict. The behavior of the option should be consistent regardless of
BIOS choice.
Here's again how OVMF works, in detail.
First, the list of openfirmware device paths is downloaded from fw_cfg.
Then they are translated to UEFI device path *prefixes*. This
translation (even just for the prefixes) is inexact (best effort),
because no complete mapping exists. Also it can only cover UEFI device
path prefixes because the OpenFirmware device paths don't extend into
file paths. In UEFI you can have two separate boot options that boot two
separate files from the exact same device (including partition), and you
can't distinguish these in OpenFirmware device paths. Certainly not on
the qemu command line.
OK, so now you have two lists, the list of UEFI boot options (pre-set by
the user in the firmware, or auto-generated by the firmware, doesn't
matter), and the translated prefix list from qemu/fw_cfg.
OVMF then iterates over the translated fw_cfg list. For each entry, it
looks up the first UEFI boot option whose device path starts with that
translated prefix. If one is found, the UEFI boot option is appended to
the output list and is marked as having been added to the output list.
When the outer loop completes, you have a third list (the output list)
which describes the user's boot preference. You also have some boot
options that are unmarked (left unmatched by any translated fw_cfg
entry). The question is what you do with these.
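The matching loop described above can be sketched as follows. This is a
simplified Python illustration, not the edk2 code: device paths are
modeled as plain strings, prefix matching is a plain startswith(), the
function name build_boot_order is made up for this sketch, and skipping
options that are already marked is an assumption made here to avoid
duplicates.

```python
def build_boot_order(fw_cfg_prefixes, uefi_options):
    """Return (ordered, unmatched) for the matching loop described above.

    fw_cfg_prefixes: translated UEFI device path prefixes, in fw_cfg order.
    uefi_options:    full UEFI boot option device paths, in firmware order.
    """
    marked = [False] * len(uefi_options)
    ordered = []
    for prefix in fw_cfg_prefixes:
        for i, option in enumerate(uefi_options):
            # Assumption: an option that was already added is not added again.
            if not marked[i] and option.startswith(prefix):
                ordered.append(option)
                marked[i] = True
                break  # only the first prefix match is taken
    # Whatever no fw_cfg entry matched is left over for the policy decision.
    unmatched = [o for i, o in enumerate(uefi_options) if not marked[i]]
    return ordered, unmatched
```

The second return value is exactly the set of "unmarked" boot options
whose fate is discussed next.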
Originally, I simply dropped these. This is precisely the -boot
strict=on behavior. And it was wrong. Users wanted to keep at least
*some* of these entries at the end of the list. My first question was
"ok why don't you just specify those in fw_cfg?" And the answer is that
those options *cannot* be specified.
Therefore now we have a "survival policy" in OVMF that tacks *some* of
the unmatched boot options to the end of the list. I *can* most
certainly implement HALT parsing in OVMF, and make the survival policy
dependent on presence or absence of HALT.
But that still doesn't change the fact that you need to enable the user
to decide about passing HALT or not.
Obviously I need to describe the translations and to give you UEFI boot
option (device path) examples, otherwise you apparently simply don't
believe me.
The following stuff is implemented in
"OvmfPkg/Library/PlatformBdsLib/QemuBootOrder.c" in the edk2 tree. The
OpenFirmware device path is what comes from qemu over fw_cfg, and the
UEFI device path prefix is what the translation must output, in order
for the prefix matching to work.
(1) IDE disk, or IDE CD-ROM:
//
// OpenFirmware device path (IDE disk, IDE CD-ROM):
//
// /pci@i0cf8/ide@1,1/drive@0/disk@0
//      ^         ^ ^       ^      ^
//      |         | |       |      master or slave
//      |         | |       primary or secondary
//      |         PCI slot & function holding IDE controller
//      PCI root at system bus port, PIO
//
// UEFI device path:
//
// PciRoot(0x0)/Pci(0x1,0x1)/Ata(Primary,Master,0x0)
//                                              ^
//                                              fixed LUN
(2) floppy disk:
//
// OpenFirmware device path (floppy disk):
//
// /pci@i0cf8/isa@1/fdc@03f0/floppy@0
//      ^         ^     ^           ^
//      |         |     |           A: or B:
//      |         |     ISA controller io-port (hex)
//      |         PCI slot holding ISA controller
//      PCI root at system bus port, PIO
//
// UEFI device path:
//
// PciRoot(0x0)/Pci(0x1,0x0)/Floppy(0x0)
//                                  ^
//                                  ACPI UID
//
(3) virtio-block disk:
//
// OpenFirmware device path (virtio-blk disk):
//
// /pci@i0cf8/scsi@6[,3]/disk@0,0
//      ^          ^  ^       ^ ^
//      |          |  |       fixed
//      |          |  PCI function corresponding to disk
//      |          |  (optional)
//      |          PCI slot holding disk
//      PCI root at system bus port, PIO
//
// UEFI device path prefix:
//
// PciRoot(0x0)/Pci(0x6,0x0)/HD( -- if PCI function is 0 or absent
// PciRoot(0x0)/Pci(0x6,0x3)/HD( -- if PCI function is present and
//                                  nonzero
//
(4) virtio-scsi unit (including disk and CD-ROM):
//
// OpenFirmware device path (virtio-scsi disk):
//
// /pci@i0cf8/scsi@7[,3]/channel@0/disk@2,3
//      ^          ^             ^      ^ ^
//      |          |             |      | LUN
//      |          |             |      target
//      |          |             channel (unused, fixed 0)
//      |          PCI slot[, function] holding SCSI controller
//      PCI root at system bus port, PIO
//
// UEFI device path prefix:
//
// PciRoot(0x0)/Pci(0x7,0x0)/Scsi(0x2,0x3) -- if PCI function is 0
//                                            or absent
// PciRoot(0x0)/Pci(0x7,0x3)/Scsi(0x2,0x3) -- if PCI function is
//                                            present and nonzero
//
(5) Ethernet
//
// OpenFirmware device path (Ethernet NIC):
//
// /pci@i0cf8/ethernet@3[,2]/ethernet-phy@0
//      ^              ^                  ^
//      |              |                  fixed
//      |              PCI slot[, function] holding Ethernet card
//      PCI root at system bus port, PIO
//
// UEFI device path prefix (dependent on presence of nonzero PCI
// function):
//
// PciRoot(0x0)/Pci(0x3,0x0)/MAC(525400E15EEF,0x1)
// PciRoot(0x0)/Pci(0x3,0x2)/MAC(525400E15EEF,0x1)
//                               ^            ^
//                               MAC address  IfType
//                                            (1 == Ethernet)
//
// (Some UEFI NIC drivers don't set 0x1 for IfType.)
//
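To make the translation step concrete, here is a minimal Python sketch
covering only patterns (1) and (3) above. It is not the edk2
implementation: the function name translate_ofw is made up, the
regular expressions are my own reading of the diagrams, and I am
assuming that the unit addresses are hexadecimal and that drive@0 /
disk@0 mean Primary / Master as in the example.

```python
import re

# Pattern (1): IDE disk or CD-ROM.
IDE_RE = re.compile(
    r"/pci@i0cf8/ide@([0-9a-f]+),([0-9a-f]+)/drive@([01])/disk@([01])")
# Pattern (3): virtio-blk disk, PCI function optional.
VIRTIO_BLK_RE = re.compile(
    r"/pci@i0cf8/scsi@([0-9a-f]+)(?:,([0-9a-f]+))?/disk@0,0")


def translate_ofw(path):
    """Translate an OpenFirmware path to a UEFI device path prefix.

    Returns None for anything this sketch does not recognize.
    """
    m = IDE_RE.fullmatch(path)
    if m:
        slot, func = int(m.group(1), 16), int(m.group(2), 16)
        channel = "Primary" if m.group(3) == "0" else "Secondary"
        device = "Master" if m.group(4) == "0" else "Slave"
        return f"PciRoot(0x0)/Pci(0x{slot:X},0x{func:X})/Ata({channel},{device},0x0)"
    m = VIRTIO_BLK_RE.fullmatch(path)
    if m:
        slot = int(m.group(1), 16)
        func = int(m.group(2), 16) if m.group(2) else 0  # absent means 0
        return f"PciRoot(0x0)/Pci(0x{slot:X},0x{func:X})/HD("
    return None


print(translate_ofw("/pci@i0cf8/ide@1,1/drive@0/disk@0"))
print(translate_ofw("/pci@i0cf8/scsi@6,3/disk@0,0"))
```

Note that the virtio-blk result is deliberately an open-ended prefix
ending in "HD(", because the OpenFirmware path carries no partition or
file information.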
Anything in fw_cfg that doesn't match any of these OpenFirmware patterns
is non-translatable right now (notice how all the patterns describe PCI
devices). But the question is not what I can currently recognize and
translate in OVMF, the question is what can come over the fw_cfg channel.
Here are two examples for you of valid, existing UEFI device paths of
this kind (breaking out each device path node on a separate line for
readability):
(a)
MemoryMapped(0xB,0x9F8C2000,0x9FFA1FFF)/
FvFile(7C04A583-9E3E-4F1C-AD65-E05268D0B4D1)
This happens to be the memory-mapped UEFI shell image. Please tell me
how you can specify a boot preference for it, all across libvirt and
qemu. You can rely on the GUID in the FvFile() node being well-known.
(b)
VenHw(C1E791A2-64CF-4B68-BDF1-1C31DABBDC84,0000131C00000000)/
HD(1,GPT,2F972E52-F7E0-4504-9FE7-F60E66352266,0x800,0x32000)/
\Image
This is the file called "Image", in the root directory of the filesystem
on the GPT hard disk partition identified by the HD() node, which can be
reached behind the Vendor Hardware device that corresponds to the GUID
and the rest of the binary garbage visible in the VenHw node.
In detail this happens to be a virtio-mmio block device whose register
range is mapped at 0x1C130000.
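For reference, the address can be recovered from the VenHw() node with
a few lines of Python. This is only a sketch, and it assumes that the
data bytes after the GUID encode the base address as a little-endian
integer, which is consistent with the value quoted above; the helper
name venhw_base_address is made up.

```python
def venhw_base_address(data_hex):
    """Decode the VenHw() data bytes as a little-endian unsigned integer."""
    return int.from_bytes(bytes.fromhex(data_hex), byteorder="little")


print(hex(venhw_base_address("0000131C00000000")))  # 0x1c130000
```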
Please tell me how you can express this boot preference across libvirt
and qemu, in the OpenFirmware-format fw_cfg boot order.
The user request I had gotten earlier was to keep (a) on the list. As I
said above, the original approach was exactly -boot strict=on, which
meant that I invariably killed the UEFI shell (option (a)), because it's
inexpressible through the fw_cfg interface we have.
After the user request we introduced the survival policy, which
currently preserves, as fallbacks and in their original relative order,
all unmatched UEFI boot options that start with
- neither PciRoot() (because we're saying that PCI devices can already
be expressed fairly well),
- nor HD() (this is a relative UEFI device path that the system can
complete with its missing prefix devpath segment for matching, and we're
saying that you likely want to boot HD() paths from PCI devices, which
you can already express fairly well).
This "policy" is obviously subject to change, it's just a heuristic
that looks halfway reasonable, and keeps stuff that we *know* qemu (and
likely OpenFirmware in general) can't express.
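Expressed as code, the current heuristic is roughly the following.
Again this is a Python sketch over plain strings rather than the real
device path structures, and the names survives and
apply_survival_policy are made up for illustration.

```python
def survives(option):
    """Keep an unmatched option unless it starts with PciRoot( or HD(."""
    return not option.startswith(("PciRoot(", "HD("))


def apply_survival_policy(ordered, unmatched):
    """Append the surviving unmatched options, in their original order."""
    return list(ordered) + [o for o in unmatched if survives(o)]
```

With this filter, the MemoryMapped() shell entry from (a) and the
VenHw() entry from (b) survive as fallbacks, while an unmatched
PciRoot() entry is dropped.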
You *can* say that I should make this survival policy dependent on HALT,
but that still requires HALT not to be a *constant*.
If you make HALT a constant, and I comply with it in OVMF, then the user
can never reach the UEFI shell as a fallback.
If you make HALT a constant, and I ignore it in OVMF, then I might reach
a (say) CD-ROM entry in OVMF, even if the user doesn't specify it.
If you make HALT configurable, then I *could* comply with it, because
the user can add or remove it as he/she wishes.
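The last variant could look something like this. It is a hypothetical
Python sketch: OVMF does not parse HALT today, the halt parameter
simply stands for "HALT was present in the fw_cfg boot order", and the
function name final_boot_order is made up.

```python
def final_boot_order(ordered, unmatched, halt):
    """Build the final list, honoring HALT if the user asked for it."""
    if halt:
        # Strict: boot only what the user listed explicitly.
        return list(ordered)
    # Not strict: keep fallbacks that cannot be expressed via fw_cfg.
    fallbacks = [o for o in unmatched
                 if not o.startswith(("PciRoot(", "HD("))]
    return list(ordered) + fallbacks
```

With halt fixed to True the UEFI shell can never appear as a fallback,
and with halt ignored the strictness request is lost; only a
user-controlled halt gives both behaviors.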
I'm unable to describe the problem any better than this, sorry.
Thanks
Laszlo