The recent change in libvirt to pass storage arguments to qemu via
"-blockdev", explicitly passing backing file chain information rather
than relying on qemu to figure it out, has bitten me quite painfully.
For years I have been in the habit of using qcow2 with backing chains
without specifying the backing file format explicitly, as information
like the following was enough both for me and for qemu to figure out
that the backing file was in qcow2 format:
# qemu-img info sles15-gm-cc.qcow2
image: sles15-gm-cc.qcow2
file format: qcow2
virtual size: 20 GiB (21474836480 bytes)
disk size: 407 MiB
cluster_size: 65536
backing file: /mnt/vms/sles15-gm-base.qcow2
But if libvirt encounters an image like this, it passes the
backing file to qemu with "driver":"raw":
-blockdev '{"driver":"file","filename":"/mnt/vms/sles15-gm-base.qcow2","node-name":"libvirt-4-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-4-format","read-only":true,"driver":"raw","file":"libvirt-4-storage"}' \
The effect was that some of my VMs would refuse to start with a "no
bootable disk" error message from the VM's BIOS. Others did boot (I
believe this was because the boot sector had been modified in the top
image in the backing file chain), and I'm quite grateful I did not
attempt to boot them fully, because god knows what might have happened
if the OS had later encountered garbage data from the backing file at
some random point.
It took me half a day to figure out that this effect had been caused by
the recent libvirt update.
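For anyone who wants to check their own images: an affected image shows a
"backing file:" line in the "qemu-img info" output without an accompanying
"backing file format:" line. Below is a minimal sketch of such a check. The
function name backing_format_status and the directory in the usage comment
are my own choices for illustration, and it only looks at the human-readable
output of "qemu-img info".

```shell
#!/bin/sh
# Reads "qemu-img info" output on stdin.
# Prints "missing" if a backing file is listed without a backing file
# format, and "ok" otherwise (including images with no backing file).
backing_format_status() {
    awk '
        /^backing file format:/ { have_fmt = 1; next }
        /^backing file:/        { have_backing = 1 }
        END {
            if (have_backing && !have_fmt)
                print "missing"
            else
                print "ok"
        }
    '
}

# Usage (hypothetical image directory):
# for img in /mnt/vms/*.qcow2; do
#     printf '%s: ' "$img"
#     qemu-img info "$img" | backing_format_status
# done
```

Fed the "qemu-img info" output shown above, this reports the image as
"missing" the format tag.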
I'm aware that documentation about this has been added recently
(https://libvirt.org/kbase/backing_chains.html (*)). Also, I believe
that current libvirt master (unlike 5.10.0) would refuse to start such
images in the first place
(https://bugzilla.redhat.com/show_bug.cgi?id=1588373), perhaps giving
users a better clue than before about what was going wrong.
Meanwhile, I've fixed my VM images by adding the backing file format
tag, as suggested in the documentation. However, I still think that this
was quite a disruptive change and highly unexpected for users. IMO the
default behavior shouldn't have been switched like this without
appropriate warnings.
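For reference, the fix amounts to a metadata-only rebase of the overlay. A
minimal sketch, assuming that both the overlay and its backing file are
qcow2 and using the paths from the example above. The helper name rebase_cmd
is mine, and it only prints the command so that it can be reviewed before
being run.

```shell
#!/bin/sh
# Print the qemu-img command that records the backing file format in the
# header of overlay $1, for backing file $2 of format $3 (default: qcow2).
# "-u" (unsafe mode) rewrites only the header and does not touch any data,
# so the backing file and format given here must be the correct ones.
# Note: paths are printed unquoted; quote them by hand if they contain spaces.
rebase_cmd() {
    printf 'qemu-img rebase -u -f qcow2 -F %s -b %s %s\n' \
        "${3:-qcow2}" "$2" "$1"
}

rebase_cmd sles15-gm-cc.qcow2 /mnt/vms/sles15-gm-base.qcow2
```

After running the printed command with the VM shut down, "qemu-img info"
should list a "backing file format:" line for the overlay.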
The rationale given for not autodetecting the file format is "a
malicious guest could rewrite the header of the disk leading to access
of host files". I suppose a guest would need to manipulate a raw image
to look like qcow2, qed or similar for this to happen (and set the
backing file to "/etc/shadow", maybe?). Still the malicious guest would
need to find a way to manipulate the data *on the backing store*,
because the format of the topmost image is explicit anyway.
Modifying the backing store could be difficult for the guest, because
it's normally read-only and changes go only to the top layer. Or am I
missing something? The opposite (manipulating qcow2 to look like raw)
shouldn't be possible, IMO.
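To illustrate the attack as I understand it: format probing decides based on
the content of the file, and a qcow2 file starts with the four magic bytes
'Q', 'F', 'I', 0xfb. The sketch below performs a much simplified check of
that kind. It is not qemu's probing code, and the function name
looks_like_qcow2 is mine. A guest that can write arbitrary bytes to the
start of a raw disk could make a check like this succeed.

```shell
#!/bin/sh
# Succeeds if the first four bytes of file $1 are the qcow2 magic
# ("QFI" followed by 0xfb). Illustration only; real probing checks more.
looks_like_qcow2() {
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "514649fb" ]
}

# Usage:
# looks_like_qcow2 /mnt/vms/sles15-gm-base.qcow2 && echo "qcow2 magic found"
```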
While I can't deny that an attack like this might be feasible, I am
still wondering why this hasn't been an issue in past years (with qemu
auto-detecting the backing file format).
More importantly, perhaps the disruption caused by this change could be
mitigated by allowing autodetection in certain cases (e.g. if the file
name of the backing file indicates it's qcow2, as in the example
above), or by providing a configuration option to enable it in
environments where evil guests are very unlikely (like mine, a
developer test environment).
Best regards,
Martin
(*) This page is quite hard to find; googling for "libvirt backing
chain" does not pick it up prominently just yet. Actually, I only found
this information by running "git grep" on the libvirt git repo.
--
Dr. Martin Wilck <mwilck(a)suse.com>, Tel. +49 (0)911 74053 2107
SUSE Software Solutions Germany GmbH
HRB 36809, AG Nürnberg GF: Felix Imendörffer