On 10/11/19 9:45 AM, Ján Tomko wrote:
On Mon, Oct 07, 2019 at 05:49:14PM -0400, Cole Robinson wrote:
> This series is the first steps to teaching libvirt about qcow2
> data_file support, aka external data files or qcow2 external metadata.
>
> A bit about the feature: it was added in qemu 4.0. It essentially
> creates a two part image file: a qcow2 layer that just tracks the
> image metadata, and a separate data file which stores the VM
> disk contents. AFAICT the driving use case is to keep a fully coherent
> raw disk image on disk, and only use qcow2 as an intermediate metadata
> layer when necessary, for things like incremental backup support.
>
> The original qemu patch posting is here:
>
> https://lists.gnu.org/archive/html/qemu-devel/2019-02/msg07496.html
>
> For testing, you can create a new qcow2+raw data_file image from an
> existing image, like:
>
> qemu-img convert -O qcow2 \
> -o data_file=NEW.raw,data_file_raw=yes \
> EXISTING.raw NEW.qcow2
>
> The goal of this series is to teach libvirt enough about this case
> so that we can correctly relabel the data_file on VM startup/shutdown.
> The main functional changes are
>
> * Teach storagefile how to parse out data_file from the qcow2 header
> * Store the raw string as virStorageSource->externalDataStoreRaw
> * Track that as its own virStorageSource in externalDataStore
> * dac/selinux relabel externalDataStore as needed
>
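To make the first bullet concrete, here is a minimal Python sketch of pulling data_file out of a qcow2 header. It follows the published qcow2 format description (header extension type 0x44415441 and incompatible feature bit 2) and is not the libvirt storagefile code. The names `parse_data_file` and `make_test_header` are made up for this illustration, and the test header is synthetic rather than a real image.

```python
import struct

QCOW2_MAGIC = b"QFI\xfb"
EXT_END = 0x00000000
EXT_DATA_FILE = 0x44415441      # "DATA": external data file name string
INCOMPAT_DATA_FILE = 1 << 2     # incompatible feature bit 2
V3_HEADER_MIN = 104             # minimum length of a version 3 header


def parse_data_file(buf):
    """Return the data_file name stored in a qcow2 header, or None."""
    if len(buf) < 8 or buf[:4] != QCOW2_MAGIC:
        raise ValueError("not a qcow2 image")
    (version,) = struct.unpack_from(">I", buf, 4)
    if version < 3 or len(buf) < V3_HEADER_MIN:
        return None  # feature bits only exist in version 3 headers
    (incompat,) = struct.unpack_from(">Q", buf, 72)
    (header_length,) = struct.unpack_from(">I", buf, 100)
    if not incompat & INCOMPAT_DATA_FILE:
        return None
    # Header extensions start right after the header, 8-byte aligned.
    offset = (header_length + 7) & ~7
    while offset + 8 <= len(buf):
        ext_type, ext_len = struct.unpack_from(">II", buf, offset)
        offset += 8
        if ext_type == EXT_END:
            break
        if ext_type == EXT_DATA_FILE:
            return buf[offset:offset + ext_len].decode("utf-8")
        offset += (ext_len + 7) & ~7  # extension data is padded to 8 bytes
    return None


def make_test_header(name):
    """Build a synthetic, minimal v3 header carrying a data_file extension."""
    hdr = bytearray(V3_HEADER_MIN)
    hdr[0:4] = QCOW2_MAGIC
    struct.pack_into(">I", hdr, 4, 3)                    # version
    struct.pack_into(">Q", hdr, 72, INCOMPAT_DATA_FILE)  # incompatible bits
    struct.pack_into(">I", hdr, 100, V3_HEADER_MIN)      # header_length
    raw = name.encode("utf-8")
    ext = struct.pack(">II", EXT_DATA_FILE, len(raw)) + raw
    ext += b"\0" * (-len(raw) % 8)
    return bytes(hdr) + ext + struct.pack(">II", EXT_END, 0)


print(parse_data_file(make_test_header("NEW.raw")))  # prints NEW.raw
```

The returned string corresponds to what the series stores as externalDataStoreRaw. Resolving it to a path relative to the qcow2 file is a separate step that this sketch leaves out.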
> From libvirt's perspective, externalDataStore is conceptually pretty
> close to a backingStore, but the main difference is its read/write
> permissions should match its parent image, rather than being readonly
> like backingStore.
>
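The permission rule above can be sketched as a small tree walk. This is only an illustration of the rule as described in this thread, not the dac/selinux driver code. `StorageSource` is a hypothetical stand-in for the few virStorageSource fields that matter here, and `label_plan` is an invented name.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass
class StorageSource:
    """Hypothetical stand-in for the virStorageSource fields used here."""
    path: str
    backing_store: Optional["StorageSource"] = None
    external_data_store: Optional["StorageSource"] = None


def label_plan(src: StorageSource,
               readonly: bool = False) -> Iterator[Tuple[str, bool]]:
    """Yield (path, readonly) for every file in the chain."""
    yield src.path, readonly
    if src.external_data_store is not None:
        # The data file gets the same access mode as the qcow2 that owns it.
        yield from label_plan(src.external_data_store, readonly)
    if src.backing_store is not None:
        # Backing images are read-only, and so are their own data files.
        yield from label_plan(src.backing_store, True)


base = StorageSource("base.qcow2",
                     external_data_store=StorageSource("base.raw"))
top = StorageSource("top.qcow2", backing_store=base,
                    external_data_store=StorageSource("top.raw"))
for path, ro in label_plan(top):
    print(path, "ro" if ro else "rw")
```

For a writable top image this lists top.qcow2 and top.raw as read/write and base.qcow2 and base.raw as read-only.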
> This series has only been tested on top of the -blockdev enablement
> series, but I don't think it actually interacts with that work at
> the moment.
>
>
> Future work:
> * Exposing this in the runtime XML. We need to figure out an XML
This also belongs in the persistent XML.
Agreed
> schema. It will reuse virStorageSource obviously, but the main
> thing to figure out is probably 1) what the top element name
> should be ('dataFile' maybe?), 2) where it sits in the XML
> hierarchy (under <disk> or under <source> I guess)
>
<metadataStore> maybe?
The way this code is structured, we have
src->path = FOO.qcow2
src->externalDataStore->path = FOO.raw
FOO.raw contains the disk/OS contents, FOO.qcow2 just the qcow2
metadata. If we reflect that layout in the XML, we have
<disk>
<source file='FOO.qcow2'>
<externalDataStore>
<source file='FOO.raw'/>
</externalDataStore>
</source>
</disk>
If we called it metadataStore it sounds like the layout is inverted.
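To show how that nesting would map back onto the two paths, here is a minimal Python sketch that reads the layout. The element name externalDataStore and its placement under <source> are only the proposal from this mail, not an existing libvirt schema, and `read_layout` is a made-up helper.

```python
import xml.etree.ElementTree as ET

# XML layout as proposed in this thread; not an existing libvirt schema.
DISK_XML = """
<disk>
  <source file='FOO.qcow2'>
    <externalDataStore>
      <source file='FOO.raw'/>
    </externalDataStore>
  </source>
</disk>
"""


def read_layout(disk_xml):
    """Return (metadata_path, data_path) from the sketched disk XML."""
    disk = ET.fromstring(disk_xml.strip())
    source = disk.find("source")
    data = source.find("externalDataStore/source")
    data_path = None if data is None else data.get("file")
    return source.get("file"), data_path


print(read_layout(DISK_XML))
```

The outer <source> stays the qcow2 metadata file and the nested one is the raw data, which mirrors src->path and src->externalDataStore.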
> * Exposing this on the qemu -blockdev command line. Similar to how
> in the blockdev world we are explicitly putting the disk backing
> chain on the command line, we can do that for data_file too.
Historically, not being explicit on the command line and letting QEMU
do the right thing has bitten us, so yes, we have to do it for data_file
too.
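As a rough illustration of what being explicit could look like, the Python sketch below builds a -blockdev JSON argument that names the data file instead of leaving QEMU to read it from the qcow2 header. It assumes the qcow2 driver's `data-file` option as I understand it from the QAPI schema. The helper name `qcow2_blockdev` and the inline file nodes are simplifications for the example and are not what qemu_block.c generates.

```python
import json


def qcow2_blockdev(node_name, qcow2_path, data_path=None):
    """Build a -blockdev JSON argument for a qcow2 node (sketch only)."""
    opts = {
        "driver": "qcow2",
        "node-name": node_name,
        "file": {"driver": "file", "filename": qcow2_path},
    }
    if data_path is not None:
        # Spell out the data file rather than relying on the header.
        opts["data-file"] = {"driver": "file", "filename": data_path}
    return json.dumps(opts)


print("-blockdev", "'" + qcow2_blockdev("disk0", "FOO.qcow2", "FOO.raw") + "'")
```

Passing a different `data_path` is the per-run override that the persistent XML would make possible.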
> Then like persistent <backingStore> XML the user will have the power
> to overwrite the data_file location for an individual VM run.
>
If the point of the thin qcow2 layer is to contain the dirty bitmaps for
incremental backup, then when running with a different data_file you might
as well use a different metadata_file? Otherwise the metadata won't match
the actual data.
I'm not sure I follow this part, but maybe that's due to data_file
naming mixup.
OTOH, I can imagine throwing away the metadata file and starting
over.
Yes this is one of the main drivers I think. That the qcow2 layer gives
qcow2 native features like dirty bitmaps, but if it ever comes to it,
the data is still in raw format which simplifies processing the image
with other tools. Plus raw is less of a bogeyman than qcow2 for some
people, so I think there's some marketing opportunity behind it to say
'see your data is still there in FOO.raw'.
There are probably cases where the user would want to ditch the top level
layer and use the raw data layer directly, but similar to writing to a
backing image, it invalidates the top layer, and there's no rebase
operation for data_file AFAICT. But the persistent XML will allow
configuring that if someone wanted it.
> * Figure out how we expect ovirt/rhev to be using this at runtime.
> Possibly taking a running VM using a raw image, doing blockdev-*
> magic to pivot it to qcow2+raw data_file, so it can initiate
> incremental backup on top of a previously raw only VM?
>
>
> Known issues:
> * In the qemu driver, the qcow2 image metadata is only parsed
> in -blockdev world if no <backingStore> is specified in the
> persistent XML. So basically if there's a <backingStore> listed,
> we never parse the qcow2 header and detect the presence of
> data_file. Fixable I'm sure but I didn't look into it much yet.
This will be fixed by introducing an XML element for it.
It's part of the fix I think. We will still need to change qemu_block.c
logic to accommodate this in some way. Right now, whether we probe the
qcow2 file metadata depends only on whether <backingStore> is in the
persistent XML. But now the probing provides info on both
backingStore and externalDataStore, so tying probing only to the presence
of backingStore XML isn't sufficient.
I'm thinking of extending the storage_file.c entry points to have options
like 'skipBackingStore' and 'skipExternalDataStore' or similar, so we
only probe what we want, and probing is skipped entirely only if both
backingStore and externalDataStore are in the XML. That's just an idea,
I'll look into it more next week and if there's no clear answer I'll
start a separate thread.
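The probing idea can be sketched as a small decision function. This is only a Python illustration of the logic described above, not a proposed storage_file.c signature. `probe_plan` is an invented name, and the two flags reuse the placeholder option names from this mail.

```python
def probe_plan(has_backing_store_xml, has_external_data_store_xml):
    """Decide what qcow2 header probing still has to provide.

    Anything already present in the XML is skipped. Probing is
    skipped entirely only when both elements are present.
    """
    skip_backing_store = has_backing_store_xml
    skip_external_data_store = has_external_data_store_xml
    return {
        "probe": not (skip_backing_store and skip_external_data_store),
        "skipBackingStore": skip_backing_store,
        "skipExternalDataStore": skip_external_data_store,
    }


for backing in (False, True):
    for data in (False, True):
        print(backing, data, probe_plan(backing, data))
```

The case that is broken today is a <backingStore> with no externalDataStore in the XML. Under this scheme the header is still probed there, but only for data_file.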
Thanks,
Cole