[libvirt] [PATCH 0/5] qemu: Fix job usage in several APIs

When fixing https://bugzilla.redhat.com/show_bug.cgi?id=1043069 I realized
qemuDomainBlockStats is not the only API that does not acquire a job early
enough. Generally, every API that is going to begin a job should do that
before fetching data from vm->def. The following 5 APIs failed to do so and
moreover used the data fetched early from vm->def after starting a job. In
some circumstances this can lead to a crash.

Jiri Denemark (5):
  qemu: Do not access stale data in virDomainBlockStats
  qemu: Avoid using stale data in virDomainGetBlockInfo
  qemu: Fix job usage in qemuDomainBlockJobImpl
  qemu: Fix job usage in qemuDomainBlockCopy
  qemu: Fix job usage in virDomainGetBlockIoTune

 src/qemu/qemu_driver.c | 92 ++++++++++++++++++++++++--------------------------
 1 file changed, 44 insertions(+), 48 deletions(-)

--
1.8.5.2
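Stated as code, the rule the series enforces looks roughly like this. This is
a minimal sketch for orientation only: qemuDomainSomeBlockAPI is a
hypothetical stand-in and error reporting is elided; the helper calls are the
real ones used in the patches below, and the only point is the ordering of
qemuDomainObjBeginJob relative to the vm->def lookups.

static int
qemuDomainSomeBlockAPI(virQEMUDriverPtr driver,
                       virDomainObjPtr vm,
                       const char *path)
{
    virDomainDiskDefPtr disk;
    int idx;
    int ret = -1;

    /* Acquire the job first; this may unlock @vm while waiting... */
    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0)
        goto cleanup;

    /* ...so only now is it safe to check liveness and fetch pointers
     * from vm->def; anything looked up before BeginJob could have been
     * freed by a concurrent device detach. */
    if (!virDomainObjIsActive(vm))
        goto endjob;
    if ((idx = virDomainDiskIndexByName(vm->def, path, false)) < 0)
        goto endjob;
    disk = vm->def->disks[idx];

    /* ... use disk while still holding the job ... */
    ret = 0;

 endjob:
    if (!qemuDomainObjEndJob(driver, vm))
        vm = NULL;
 cleanup:
    return ret;
}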

https://bugzilla.redhat.com/show_bug.cgi?id=1043069

When virDomainDetachDeviceFlags is called concurrently with
virDomainBlockStats, libvirtd may crash because qemuDomainBlockStats
finds a disk in vm->def before getting a job on a domain and uses the
disk pointer after getting the job. However, the domain is unlocked
while waiting on a job condition, and thus the data behind the disk
pointer may disappear. This happens when thread 1 runs
virDomainDetachDeviceFlags and enters the monitor to actually remove
the disk. Then another thread starts running virDomainBlockStats,
finds the disk in vm->def, and while it's waiting on the job condition
(owned by the first thread), the first thread finishes the disk
removal. When the second thread gets the job, the memory pointed to by
the disk pointer is already gone.

That said, every API that is going to begin a job should do that
before fetching data from vm->def.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
---
 src/qemu/qemu_driver.c | 17 ++++++-----------
 1 file changed, 6 insertions(+), 11 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 45d11cd..8823187 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -9020,34 +9020,29 @@ qemuDomainBlockStats(virDomainPtr dom,
     if (virDomainBlockStatsEnsureACL(dom->conn, vm->def) < 0)
         goto cleanup;
 
+    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0)
+        goto cleanup;
+
     if (!virDomainObjIsActive(vm)) {
         virReportError(VIR_ERR_OPERATION_INVALID,
                        "%s", _("domain is not running"));
-        goto cleanup;
+        goto endjob;
     }
 
     if ((idx = virDomainDiskIndexByName(vm->def, path, false)) < 0) {
         virReportError(VIR_ERR_INVALID_ARG,
                        _("invalid path: %s"), path);
-        goto cleanup;
+        goto endjob;
     }
     disk = vm->def->disks[idx];
 
     if (!disk->info.alias) {
         virReportError(VIR_ERR_INTERNAL_ERROR,
                        _("missing disk device alias name for %s"), disk->dst);
-        goto cleanup;
+        goto endjob;
     }
 
     priv = vm->privateData;
-    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0)
-        goto cleanup;
-
-    if (!virDomainObjIsActive(vm)) {
-        virReportError(VIR_ERR_OPERATION_INVALID,
-                       "%s", _("domain is not running"));
-        goto endjob;
-    }
 
     qemuDomainObjEnterMonitor(driver, vm);
     ret = qemuMonitorGetBlockStatsInfo(priv->mon,
--
1.8.5.2
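The interleaving described in the commit message, spelled out step by step
(an illustrative timeline, not code from the tree):

/*
 * Thread A (virDomainDetachDeviceFlags)   Thread B (virDomainBlockStats)
 * -------------------------------------   ------------------------------
 * begins a job, enters the monitor
 * to remove the disk
 *                                          looks up the disk in vm->def
 *                                          and keeps the raw pointer
 *                                          waits on the job condition;
 *                                          the vm lock is dropped here
 * disk removal finishes; the
 * virDomainDiskDef is freed
 *                                          acquires the job and uses the
 *                                          now-freed pointer -> crash
 */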

On 12/20/2013 02:36 PM, Jiri Denemark wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1043069
When virDomainDetachDeviceFlags is called concurrently with virDomainBlockStats, libvirtd may crash because qemuDomainBlockStats finds a disk in vm->def before getting a job on a domain and uses the disk pointer after getting the job. However, the domain is unlocked while waiting on a job condition, and thus the data behind the disk pointer may disappear. This happens when thread 1 runs virDomainDetachDeviceFlags and enters the monitor to actually remove the disk. Then another thread starts running virDomainBlockStats, finds the disk in vm->def, and while it's waiting on the job condition (owned by the first thread), the first thread finishes the disk removal. When the second thread gets the job, the memory pointed to by the disk pointer is already gone.
That said, every API that is going to begin a job should do that before fetching data from vm->def.
Bummer, we'll need a CVE for this. It's been present since at least the 0.9.x days.

ACK.

--
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Generally, every API that is going to begin a job should do that before
fetching data from vm->def. However, qemuDomainGetBlockInfo does not
know whether it will have to start a job or not before checking
vm->def. To avoid using a disk alias that might have been freed while
we were waiting for a job, we use a copy of it. In case the disk was
removed in the meantime, we will fail with a "cannot find statistics
for device '...'" error message.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
---
 src/qemu/qemu_driver.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 8823187..9db0c49 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -9780,10 +9780,12 @@ cleanup:
 }
 
 
-static int qemuDomainGetBlockInfo(virDomainPtr dom,
-                                  const char *path,
-                                  virDomainBlockInfoPtr info,
-                                  unsigned int flags) {
+static int
+qemuDomainGetBlockInfo(virDomainPtr dom,
+                       const char *path,
+                       virDomainBlockInfoPtr info,
+                       unsigned int flags)
+{
     virQEMUDriverPtr driver = dom->conn->privateData;
     virDomainObjPtr vm;
     int ret = -1;
@@ -9795,6 +9797,7 @@ static int qemuDomainGetBlockInfo(virDomainPtr dom,
     int idx;
     int format;
     virQEMUDriverConfigPtr cfg = NULL;
+    char *alias = NULL;
 
     virCheckFlags(0, -1);
 
@@ -9901,13 +9904,16 @@ static int qemuDomainGetBlockInfo(virDomainPtr dom,
         virDomainObjIsActive(vm)) {
         qemuDomainObjPrivatePtr priv = vm->privateData;
 
+        if (VIR_STRDUP(alias, disk->info.alias) < 0)
+            goto cleanup;
+
         if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0)
             goto cleanup;
 
         if (virDomainObjIsActive(vm)) {
             qemuDomainObjEnterMonitor(driver, vm);
             ret = qemuMonitorGetBlockExtent(priv->mon,
-                                            disk->info.alias,
+                                            alias,
                                             &info->allocation);
             qemuDomainObjExitMonitor(driver, vm);
         } else {
@@ -9921,6 +9927,7 @@ static int qemuDomainGetBlockInfo(virDomainPtr dom,
     }
 
 cleanup:
+    VIR_FREE(alias);
     virStorageFileFreeMetadata(meta);
     VIR_FORCE_CLOSE(fd);
     if (vm)
--
1.8.5.2
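The essence of the fix, reduced to a sketch (variable names follow the patch
above; the surrounding function is elided):

char *alias = NULL;

/* Copy the alias while the disk pointer is still known-good... */
if (VIR_STRDUP(alias, disk->info.alias) < 0)
    goto cleanup;

/* ...because BeginJob may drop the vm lock while waiting, after which
 * disk (and thus disk->info.alias) may already have been freed. */
if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0)
    goto cleanup;

/* From here on, only the private copy is used; the monitor looks the
 * device up by name, so a stale alias yields an error, not a crash. */
ret = qemuMonitorGetBlockExtent(priv->mon, alias, &info->allocation);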

On 12/20/2013 02:36 PM, Jiri Denemark wrote:
Generally, every API that is going to begin a job should do that before fetching data from vm->def. However, qemuDomainGetBlockInfo does not know whether it will have to start a job or not before checking vm->def. To avoid using a disk alias that might have been freed while we were waiting for a job, we use a copy of it. In case the disk was removed in the meantime, we will fail with a "cannot find statistics for device '...'" error message.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
---
 src/qemu/qemu_driver.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)
This fixes the potential for a crash, but isn't this related to the same problem that John has been working on? That is, when one thread is migrating a domain and the other is getting block info, we have a race where qemu will quit before we get a chance to probe for the block info.

https://www.redhat.com/archives/libvir-list/2013-December/msg00984.html

Would changing when we grab the job (and grabbing it unconditionally, even if we end up not needing to query) help solve both issues?

--
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On Tue, Dec 24, 2013 at 07:22:38 -0700, Eric Blake wrote:
On 12/20/2013 02:36 PM, Jiri Denemark wrote:
Generally, every API that is going to begin a job should do that before fetching data from vm->def. However, qemuDomainGetBlockInfo does not know whether it will have to start a job or not before checking vm->def. To avoid using a disk alias that might have been freed while we were waiting for a job, we use a copy of it. In case the disk was removed in the meantime, we will fail with a "cannot find statistics for device '...'" error message.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
---
 src/qemu/qemu_driver.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)
This fixes the potential for a crash, but isn't this related to the same problem that John has been working on? That is, when one thread is migrating a domain and the other is getting block info, we have a race where qemu will quit before we get a chance to probe for the block info. https://www.redhat.com/archives/libvir-list/2013-December/msg00984.html
Would changing when we grab the job (and grabbing it unconditionally, even if we end up not needing to query) help solve both issues?
It might help with what John was trying to do, but I don't think this patch alone would solve the issue.

May I consider this patch ACKed? :-)

Jirka

On 01/07/2014 07:22 AM, Jiri Denemark wrote:
On Tue, Dec 24, 2013 at 07:22:38 -0700, Eric Blake wrote:
On 12/20/2013 02:36 PM, Jiri Denemark wrote:
Generally, every API that is going to begin a job should do that before fetching data from vm->def. However, qemuDomainGetBlockInfo does not know whether it will have to start a job or not before checking vm->def. To avoid using a disk alias that might have been freed while we were waiting for a job, we use a copy of it. In case the disk was removed in the meantime, we will fail with a "cannot find statistics for device '...'" error message.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
---
 src/qemu/qemu_driver.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)
This fixes the potential for a crash, but isn't this related to the same problem that John has been working on? That is, when one thread is migrating a domain and the other is getting block info, we have a race where qemu will quit before we get a chance to probe for the block info. https://www.redhat.com/archives/libvir-list/2013-December/msg00984.html
Would changing when we grab the job (and grabbing it unconditionally, even if we end up not needing to query) help solve both issues?
It might help with what John was trying to do, but I don't think this patch alone would solve the issue. May I consider this patch ACKed? :-)
I was debating whether a single patch could solve both issues at once, to avoid rebase churn; but at this point, given that it solves a CVE, I'm okay giving ACK to your patch as-is.

--
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On 01/07/2014 10:01 AM, Eric Blake wrote:
On 01/07/2014 07:22 AM, Jiri Denemark wrote:
On Tue, Dec 24, 2013 at 07:22:38 -0700, Eric Blake wrote:
On 12/20/2013 02:36 PM, Jiri Denemark wrote:
Generally, every API that is going to begin a job should do that before fetching data from vm->def. However, qemuDomainGetBlockInfo does not know whether it will have to start a job or not before checking vm->def. To avoid using a disk alias that might have been freed while we were waiting for a job, we use a copy of it. In case the disk was removed in the meantime, we will fail with a "cannot find statistics for device '...'" error message.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
---
 src/qemu/qemu_driver.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)
This fixes the potential for a crash, but isn't this related to the same problem that John has been working on? That is, when one thread is migrating a domain and the other is getting block info, we have a race where qemu will quit before we get a chance to probe for the block info. https://www.redhat.com/archives/libvir-list/2013-December/msg00984.html
Would changing when we grab the job (and grabbing it unconditionally, even if we end up not needing to query) help solve both issues?
It might help with what John was trying to do, but I don't think this patch alone would solve the issue. May I consider this patch ACKed? :-)
I was debating whether a single patch could solve both issues at once, to avoid rebase churn; but at this point, given that it solves a CVE, I'm okay giving ACK to your patch as-is.
I tagged this before break meaning to get back to it and respond, but other things got in the way... Thanks for the prod...

Not sure this change will affect what's seen in the issue I looked at before the break. In that case, the code returns success (eg, 0 - zero) when virDomainObjIsActive() fails, and that leaves "info->allocation = info->physical;", which ends up being unexpected from the caller's POV. That case wanted a way to "detect" (via failure) that although the guest is not marked as shutoff/down, it's not returning the "real" allocation value because we've hit a point in time during the migration where we cannot get the answer from QEMU.

John
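For concreteness, the fallback John is objecting to looks roughly like this
(paraphrased from his description and the patch context above, not copied
from the tree):

if (virDomainObjIsActive(vm)) {
    qemuDomainObjEnterMonitor(driver, vm);
    ret = qemuMonitorGetBlockExtent(priv->mon, alias,
                                    &info->allocation);
    qemuDomainObjExitMonitor(driver, vm);
} else {
    /* The guest went away (e.g. migration completed) while we waited
     * for the job: report the physical size as the allocation and
     * return success, leaving the caller no way to tell that the
     * "real" allocation could not be queried. */
    info->allocation = info->physical;
    ret = 0;
}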

Every API that is going to begin a job should do that before fetching
data from vm->def.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
---
 src/qemu/qemu_driver.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 9db0c49..dad8450 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -14241,16 +14241,25 @@ qemuDomainBlockJobImpl(virDomainObjPtr vm,
         goto cleanup;
     }
 
+    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
+        goto cleanup;
+
+    if (!virDomainObjIsActive(vm)) {
+        virReportError(VIR_ERR_OPERATION_INVALID, "%s",
+                       _("domain is not running"));
+        goto endjob;
+    }
+
     device = qemuDiskPathToAlias(vm, path, &idx);
     if (!device)
-        goto cleanup;
+        goto endjob;
     disk = vm->def->disks[idx];
 
     if (mode == BLOCK_JOB_PULL && disk->mirror) {
         virReportError(VIR_ERR_BLOCK_COPY_ACTIVE,
                        _("disk '%s' already in active block copy job"),
                        disk->dst);
-        goto cleanup;
+        goto endjob;
     }
     if (mode == BLOCK_JOB_ABORT &&
         (flags & VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT) &&
@@ -14258,15 +14267,6 @@ qemuDomainBlockJobImpl(virDomainObjPtr vm,
         virReportError(VIR_ERR_OPERATION_INVALID,
                        _("pivot of disk '%s' requires an active copy job"),
                        disk->dst);
-        goto cleanup;
-    }
-
-    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
-        goto cleanup;
-
-    if (!virDomainObjIsActive(vm)) {
-        virReportError(VIR_ERR_OPERATION_INVALID, "%s",
-                       _("domain is not running"));
         goto endjob;
     }
 
--
1.8.5.2

On 12/20/2013 02:36 PM, Jiri Denemark wrote:
Every API that is going to begin a job should do that before fetching data from vm->def.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
---
 src/qemu/qemu_driver.c | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)
ACK.

--
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Every API that is going to begin a job should do that before fetching
data from vm->def.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
---
 src/qemu/qemu_driver.c | 25 ++++++++++---------------
 1 file changed, 10 insertions(+), 15 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index dad8450..2fb643a 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -14427,7 +14427,7 @@ qemuDomainBlockCopy(virDomainObjPtr vm,
     virQEMUDriverPtr driver = conn->privateData;
     qemuDomainObjPrivatePtr priv;
     char *device = NULL;
-    virDomainDiskDefPtr disk;
+    virDomainDiskDefPtr disk = NULL;
     int ret = -1;
     int idx;
     struct stat st;
@@ -14442,29 +14442,32 @@ qemuDomainBlockCopy(virDomainObjPtr vm,
     priv = vm->privateData;
     cfg = virQEMUDriverGetConfig(driver);
 
+    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
+        goto cleanup;
+
     if (!virDomainObjIsActive(vm)) {
         virReportError(VIR_ERR_OPERATION_INVALID, "%s",
                        _("domain is not running"));
-        goto cleanup;
+        goto endjob;
     }
 
     device = qemuDiskPathToAlias(vm, path, &idx);
     if (!device) {
-        goto cleanup;
+        goto endjob;
     }
     disk = vm->def->disks[idx];
     if (disk->mirror) {
         virReportError(VIR_ERR_BLOCK_COPY_ACTIVE,
                        _("disk '%s' already in active block copy job"),
                        disk->dst);
-        goto cleanup;
+        goto endjob;
     }
 
     if (!(virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_DRIVE_MIRROR) &&
           virQEMUCapsGet(priv->qemuCaps, QEMU_CAPS_BLOCKJOB_ASYNC))) {
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
                        _("block copy is not supported with this QEMU binary"));
-        goto cleanup;
+        goto endjob;
     }
     if (vm->persistent) {
         /* XXX if qemu ever lets us start a new domain with mirroring
@@ -14473,17 +14476,9 @@ qemuDomainBlockCopy(virDomainObjPtr vm,
          * this on persistent domains. */
         virReportError(VIR_ERR_OPERATION_INVALID, "%s",
                        _("domain is not transient"));
-        goto cleanup;
-    }
-
-    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
-        goto cleanup;
-
-    if (!virDomainObjIsActive(vm)) {
-        virReportError(VIR_ERR_OPERATION_INVALID, "%s",
-                       _("domain is not running"));
         goto endjob;
     }
+
     if (qemuDomainDetermineDiskChain(driver, disk, false) < 0)
         goto endjob;
 
@@ -14573,7 +14568,7 @@ qemuDomainBlockCopy(virDomainObjPtr vm,
 endjob:
     if (need_unlink && unlink(dest))
         VIR_WARN("unable to unlink just-created %s", dest);
-    if (ret < 0)
+    if (ret < 0 && disk)
         disk->mirrorFormat = VIR_STORAGE_FILE_NONE;
     VIR_FREE(mirror);
     if (!qemuDomainObjEndJob(driver, vm))
--
1.8.5.2

On 12/20/2013 02:36 PM, Jiri Denemark wrote:
Every API that is going to begin a job should do that before fetching data from vm->def.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
---
 src/qemu/qemu_driver.c | 25 ++++++++++---------------
 1 file changed, 10 insertions(+), 15 deletions(-)
ACK

--
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org

Every API that is going to begin a job should do that before fetching
data from vm->def.

Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
---
 src/qemu/qemu_driver.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 2fb643a..b54aa36 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -15056,12 +15056,6 @@ qemuDomainGetBlockIoTune(virDomainPtr dom,
         goto cleanup;
     }
 
-    device = qemuDiskPathToAlias(vm, disk, NULL);
-
-    if (!device) {
-        goto cleanup;
-    }
-
     if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
         goto cleanup;
 
@@ -15069,6 +15063,11 @@ qemuDomainGetBlockIoTune(virDomainPtr dom,
                                         &persistentDef) < 0)
         goto endjob;
 
+    device = qemuDiskPathToAlias(vm, disk, NULL);
+    if (!device) {
+        goto endjob;
+    }
+
     if (flags & VIR_DOMAIN_AFFECT_LIVE) {
         priv = vm->privateData;
         qemuDomainObjEnterMonitor(driver, vm);
--
1.8.5.2

On 12/20/2013 02:36 PM, Jiri Denemark wrote:
Every API that is going to begin a job should do that before fetching data from vm->def.
Signed-off-by: Jiri Denemark <jdenemar@redhat.com>
---
 src/qemu/qemu_driver.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)
ACK.

--
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On 12/20/2013 02:36 PM, Jiri Denemark wrote:
When fixing https://bugzilla.redhat.com/show_bug.cgi?id=1043069 I realized qemuDomainBlockStats is not the only API that does not acquire a job early enough. Generally, every API that is going to begin a job should do that before fetching data from vm->def. The following 5 APIs failed to do so and moreover used the data fetched early from vm->def after starting a job. In some circumstances this can lead to a crash.
This series has been assigned CVE-2013-6458. I ran out of time today to review the rest of the series and start the backports; but hopefully we can get progress on it before 2014.
Jiri Denemark (5):
  qemu: Do not access stale data in virDomainBlockStats
  qemu: Avoid using stale data in virDomainGetBlockInfo
  qemu: Fix job usage in qemuDomainBlockJobImpl
  qemu: Fix job usage in qemuDomainBlockCopy
  qemu: Fix job usage in virDomainGetBlockIoTune
 src/qemu/qemu_driver.c | 92 ++++++++++++++++++++++++--------------------------
 1 file changed, 44 insertions(+), 48 deletions(-)
--
Eric Blake   eblake redhat com   +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On Mon, Dec 23, 2013 at 23:09:12 -0700, Eric Blake wrote:
On 12/20/2013 02:36 PM, Jiri Denemark wrote:
When fixing https://bugzilla.redhat.com/show_bug.cgi?id=1043069 I realized qemuDomainBlockStats is not the only API that does not acquire a job early enough. Generally, every API that is going to begin a job should do that before fetching data from vm->def. The following 5 APIs failed to do so and moreover used the data fetched early from vm->def after starting a job. In some circumstances this can lead to a crash.
This series has been assigned CVE-2013-6458. I ran out of time today to review the rest of the series and start the backports; but hopefully we can get progress on it before 2014.
The series is pushed now. Thanks for the review.

Jirka