On 03/13/2015 02:02 AM, Peter Krempa wrote:
> @@ -16172,8 +16169,12 @@ qemuDomainBlockJobImpl(virDomainObjPtr
vm,
> }
>
> qemuDomainObjEnterMonitor(driver, vm);
> - ret = qemuMonitorBlockJob(priv->mon, device, basePath, backingPath,
> - speed, mode, async);
> + if (baseSource)
> + basePath = qemuMonitorDiskNameLookup(priv->mon, device, disk->src,
I remember that at some point accessing of domain definition while in
the monitor was not okay for some reason, but I can't now remember why
nor whether it was fixed.
Oh, right. You're thinking of CVE-2013-6458. That problem was that as
soon as we enter the monitor, we drop locks. If we do not already own a
block job, then some other parallel API could be hot-unplugging a disk
before we regain control, freeing 'disk' before we dereference it. But
we fixed that problem by guaranteeing that we always own the job early
enough (no other thread can hot-unplug the disk as long as we own the
job), so it is not an issue for this patch.
> - * Copyright (C) 2006-2014 Red Hat, Inc.
> + * Copyright (C) 2006-2015 Red Hat, Inc.
Shouldn't we employ something as in gnulib, where copyrights would be
bumped at once everywhere?
Might be nice, but one wrinkle. Gnulib has a single copyright holder
(FSF), so they can afford to bump all files at once (the bump is also
owned by FSF, so FSF adding another year to its copyright is
appropriate). But libvirt is split among multiple copyright holders -
Red Hat can't claim copyright over all files, so it wouldn't be wise to
bump all files, just the ones that Red Hat has already touched.
Personally, I've just got an emacs hook that checks if any file I touch
has an up-to-date copyright line.
> +static char *
> +qemuMonitorJSONDiskNameLookupOne(virJSONValuePtr image,
> + virStorageSourcePtr top,
> + virStorageSourcePtr target)
> +{
> + virJSONValuePtr backing;
> + char *ret;
> +
> + if (!top)
> + return NULL;
In case the backing chain as remembered by libvirt is shorter than what
qemu sees you don't report error. Since the caller checks whether an
error was set and if not then adds one, please state this fact in a
comment here as it's not obvious until you follow the call chain.
Will do.
> + if (top != target) {
> + backing = virJSONValueObjectGet(image, "backing-image");
> + return qemuMonitorJSONDiskNameLookupOne(backing, top->backingStore,
> + target);
Also the recursion doesn't take into account that for some reason qemu
might report a shorter chain than libvirt thinks, which would crash
here.
Oh, good catch (and looks like it explains what Shanzhi reported).
> + if (!dev || dev->type != VIR_JSON_TYPE_OBJECT) {
[1]
> + virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> + _("block info device entry was not in expected
format"));
> + goto cleanup;
> + }
> +
> + if ((thisdev = virJSONValueObjectGetString(dev, "device")) ==
NULL) {
You are mixing styles of cheching of the pointer to be non-null within a
few lines ([1])
Copy-and-paste from another recursive parser of "query-block"
information, but I can make it more consistent.
ACK if you add the comment and fix the potential crash. I'm currently OK
with accessing domain definition while it's unlocked (but guarded via
the domain job) as I don't have an counter example where it wouldn't
work correctly.
I'm still a bit worried by Shanzhi's report of a crash; maybe I still
have a race condition. That is, we change libvirt's notion of the chain
length after a commit based on response to a qemu event rather than a
user command - I was thinking that libvirt's chain and qemu's chain will
always be the same length, but since Shanzhi provided a stack trace
where it is not true, I'm wondering if the qemu chain being shorter than
the libvirt chain might mean that we have some sort of window where a
qemu event happens at the wrong moment when repeatedly hammering on
consecutive commits. So I'll post a v3 after more testing rather than
just blindly going on this ACK.
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org