On Thu, Jan 09, 2020 at 14:34:08 +0000, Daniel Berrange wrote:
On Wed, Jan 08, 2020 at 03:46:59PM +0100, Peter Krempa wrote:
> On Wed, Jan 08, 2020 at 13:34:14 +0000, Martin Wilck wrote:
> > The recent change in libvirt to pass storage arguments to qemu via
> > "-blockdev", explicity passing backing file chain information rather
> > than relying on qemu to figure it out, has bitten me quite painfully.
>
> I'm sorry for causing this. I'm also sorry for going to sound too
> dismissive later on.
snip
> > While I can't deny that an attack like this might be feasible, I am
> > still wondering why this hasn't been an issue in past years (with qemu
> > auto-detecting the backing file format).
>
> Well, for distros using selinux this attack is mitigated by selinux
> usage. That was also the reason why "assuming raw" was good enough. It
> was also 'good enough' because there wasn't any other option.
>
> It additionally created a whole lot of users which were already bitten
> by this many years ago and now use the format correctly since we'd not
> allow access otherwise when selinux is used.
IIUC there are three scenarios at play here
1. qcow2 file, first level raw backing
2. qcow2 file, first level qcow2 backing, no backing
3. qcow2 file, first level qcow2 backing, with second level backing
Historically with the SELinux driver, if no backing_fmt is set,
then we blindly assume that scenario (1) is in effect.
We cannot tell QEMU that we treated it as scenario (1) though,
so QEMU will still probe format and potentially open the
first level backing as qcow2 still.
IOW, despite our SELinux & QEMU driver assumption for scenario (1),
scenario (2) would still suceed in the past. Scenario (3) would
always have failed, because SELinux won't have labelled the second
level backing.
Essentially scenario (2) worked by accident, not design, but
IIUC we have not been warning users about this.
Exactly.
With new libvirt and -blockdev we still assume the backing_fmt
is raw if not set in the SELinux driver, but we now have the
ability to tell QEMU about our assumption via -blockdev. As
such we've not ensured scenario (2) fails.
Users who were silently relying on scenario (2) are now broken
and have three options IIUC
- Run qemu-img rebase to set the backing_fmt
or
- Update the guest XML to set the <backingStore> format value
or
- Update the /etc/libvirt/qemu.conf to set capability_filters
to disable "blockdev"
This thing is only for experimenting though. I don't think we should
suggest it as a solution as it may cause problems in the future.
Always assuming the format is raw certainly avoids the security
danger, but is unhelpful to users who relied on scenario (2).
I'm inclined to say we should make scenario (2)/(3) into a hard
They are a hard error now:
commit 3615e8b39badf2a526996a69dc91a92b04cf262e
Author: Peter Krempa <pkrempa(a)redhat.com>
Date: Tue Dec 17 17:04:04 2019 +0100
util: storage: Don't treat files with missing backing store format as
'raw'
Assuming that the backing image format is raw is wrong when doing image
detection:
1) In -drive mode qemu will still probe the image format of the backing
image. This means it will try to open a backing file of the image
which will fail if a more advanced security model is in use.
2) In blockdev mode the image will be opened as raw actually which is
wrong since it might be qcow. Not opening the backing images will
also end up in the guest seeing corrupted data.
Rather than attempt to solve various corner cases when us assuming the
storage file being raw and actually being right forbid startup when the
guest image doesn't have the format specified in the metadata.
error. Only allow scenario (1) to run normally.
but that makes also scenario 1 an error.
ie, we should probe the disk format for the backing file, and
if it is *not* raw, then report an immediate error, with the
My only grief here is that we'd actually have to run format detection at
all. That code was deleted from the startup path long time ago and I
don't feel particularly good about re-adding it.
error message pointing to the kbase page explaining how to
fix this. This will help the 99% common case identify the
fix more quickly, with no obvious downside that I see.
This would be the first error message to contain URI. Or do you have any
other suggestion how to word it?