On 1/29/24 2:25 AM, Peter Krempa wrote:
On Thu, Jan 25, 2024 at 14:52:11 -0600, Jonathon Jongsma wrote:
> Previously we were only starting or stopping nbdkit when the guest was
> started or stopped or when hotplugging/unplugging a disk. But when doing
> block operations, the disk backing store sources can also be be added or
> removed independently of the disk device. When this happens the nbdkit
> backend was not being handled properly. For example, when doing a
> blockcopy from a nbdkit-backed disk to a new disk and pivoting to that
> new location, the nbdkit process did not get cleaned up properly. Add
> some functionality to qemuDomainStorageSourceAccessModify() to handle
> this scenario.
>
> Since we're now potentially starting nbdkit from another place in the
> code, add a check to qemuNbdkitProcessStart() to report an error if we
> are trying to start nbdkit for a disk source that already has a running
> nbdkit process. This should never happen. If it does, it indicates an
> error in another part of our code.
>
> Signed-off-by: Jonathon Jongsma <jjongsma(a)redhat.com>
> ---
> src/qemu/qemu_domain.c | 10 ++++++++++
> src/qemu/qemu_nbdkit.c | 7 +++++++
> 2 files changed, 17 insertions(+)
>
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index de36641137..973a9af0b6 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
[...]
> @@ -8059,6 +8061,11 @@ qemuDomainStorageSourceAccessModify(virQEMUDriver *driver,
>
> revoke_nvme = true;
>
> + if (qemuNbdkitStartStorageSource(driver, vm, src, chain) < 0)
> + goto revoke;
> +
> + revoke_nbdkit = true;
> +
Note that qemuDomainStorageSourceAccessModify can be called on already
existing 'virStorageSource' to e.g. change from read-only to read-write
mode or back.
That's precisely why I put it inside of this block:
if (!(flags & QEMU_DOMAIN_STORAGE_SOURCE_ACCESS_MODIFY_ACCESS)) {
And it seems that the MODIFY_ACCESS flag is always set unless we are
adding a new source since this is also the way that new nvme devices are
set up.
[...]
> if (qemuDomainNamespaceSetupDisk(vm, src, &revoke_namespace) < 0)
> goto revoke;
> }
> @@ -8113,6 +8120,9 @@ qemuDomainStorageSourceAccessModify(virQEMUDriver *driver,
> VIR_WARN("Unable to release lock on %s", srcstr);
> }
>
> + if (revoke_nbdkit)
> + qemuNbdkitStopStorageSource(src, vm, chain);
> +
> cleanup:
> src->readonly = was_readonly;
> virErrorRestore(&orig_err);
> diff --git a/src/qemu/qemu_nbdkit.c b/src/qemu/qemu_nbdkit.c
> index edbe2137d0..73dc90af3e 100644
> --- a/src/qemu/qemu_nbdkit.c
> +++ b/src/qemu/qemu_nbdkit.c
> @@ -1195,6 +1195,13 @@ qemuNbdkitProcessStart(qemuNbdkitProcess *proc,
> struct nbd_handle *nbd = NULL;
> #endif
>
> + /* don't try to start nbdkit again if we've already started it */
> + if (proc->pid > 0) {
> + virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> + _("Attempting to start nbdkit twice"));
> + return -1;
> + }
[...] Which means that this WILL get hit multiple times. Additionally
don't forget that qemuDomainStorageSourceAccessModify is called from.
qemuDomainStorageSourceChainAccessAllow.
Which is e.g. called from qemuDomainAttachDeviceDiskLiveInternal where
we have this code:
if (!virStorageSourceIsEmpty(disk->src)) {
if (qemuDomainPrepareDiskSource(disk, priv, cfg) < 0)
goto cleanup;
if (qemuDomainDetermineDiskChain(driver, vm, disk, NULL) < 0)
goto cleanup;
if (qemuDomainStorageSourceChainAccessAllow(driver, vm, disk->src) < 0)
^^^^^^^^^^^^^^^^^^^
goto cleanup;
releaseSeclabel = true;
if (qemuProcessPrepareHostStorageDisk(vm, disk) < 0)
goto cleanup;
if (qemuHotplugAttachManagedPR(vm, disk->src, VIR_ASYNC_JOB_NONE) < 0)
goto cleanup;
if (qemuNbdkitStartStorageSource(driver, vm, disk->src, true) < 0)
^^^^^^^^^^^^^^^^^
goto cleanup;
}
As you point out, I didn't catch this case. It seems to me that the
right approach here is to just remove the call to
qemuNbdkitStartStorageSource() from here and instead rely on the
qemuDomainStorageSourceChainAccessAllow() call above to initialize nbdkit.
> +
> if (!(cmd = qemuNbdkitProcessBuildCommand(proc)))
> return -1;
Also conceptually the qemuDomainStorageSourceAccessModify function deals
with security and permissions of image files. I don't think that the
NBDKit helper process code belongs into it.
NACK
Fair point, but we'll still need to do something to start the nbdkit
process in the same basic code path as these calls to
AccessAllow()/AccessRevoke() if we want to support nbdkit-backed disks
in these paths. And IMO it's not all that dissimilar to the work that
we're doing here to detach the nvme devices from the host, etc. Or is
there a better spot to handle this?
Jonathon