[PATCH 0/8] qemu: Add option to preserve shutdown VM during backup job and job/reconnection fixes
This series first fixes a few job handling, reconnection and backup job bugs
and then implements an option to keep the VM process around when the guest OS
shuts down during backup, so that the backup doesn't need to be restarted.

Peter Krempa (8):
  virDomainNestedJobAllowed: Allow VIR_JOB_MODIFY_MIGRATION_SAFE if
    VIR_JOB_MODIFY is allowed
  qemuProcessReconnect: Continue reconnection if VM undergoes fake-reboot
  qemu: backup: Don't attempt to stop the NBD server twice
  qemuBlockJobProcessEventConcludedBackup: Notify the backup job later
  lib: Introduce VIR_DOMAIN_EVENT_SUSPENDED_GUEST_SHUTDOWN event reason
  lib: Introduce VIR_DOMAIN_BACKUP_BEGIN_PRESERVE_SHUTDOWN_DOMAIN flag
  qemu: backup: Add support for
    VIR_DOMAIN_BACKUP_BEGIN_PRESERVE_SHUTDOWN_DOMAIN
  kbase: Add note about preserving VM on shutdown to backup article

 docs/kbase/live_full_disk_backup.rst |  18 +++++
 docs/manpages/virsh.rst              |   6 ++
 examples/c/misc/event-test.c         |   3 +
 include/libvirt/libvirt-domain.h     |   7 +-
 src/conf/backup_conf.h               |   4 ++
 src/conf/virdomainjob.c              |   1 +
 src/libvirt-domain.c                 |   5 ++
 src/qemu/qemu_backup.c               |  61 +++++++++++++---
 src/qemu/qemu_backup.h               |   4 ++
 src/qemu/qemu_blockjob.c             |   7 +-
 src/qemu/qemu_driver.c               |   2 +-
 src/qemu/qemu_process.c              | 101 +++++++++++++++++++++++++--
 src/qemu/qemu_process.h              |   3 +-
 tools/virsh-backup.c                 |   7 ++
 tools/virsh-domain-event.c           |   3 +-
 15 files changed, 210 insertions(+), 22 deletions(-)

--
2.51.1
From: Peter Krempa <pkrempa@redhat.com>

The VIR_JOB_MODIFY_MIGRATION_SAFE is supposed to be a subset of _MODIFY jobs
which are allowed during migration. Now with async jobs which allow
VIR_JOB_MODIFY (namely the backup job) it shouldn't be required to explicitly
mention VIR_JOB_MODIFY_MIGRATION_SAFE since we already allow everything.

Adjust the logic in virDomainNestedJobAllowed to accept
VIR_JOB_MODIFY_MIGRATION_SAFE if VIR_JOB_MODIFY is allowed so that other
places can simply allow the latter.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 src/conf/virdomainjob.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/conf/virdomainjob.c b/src/conf/virdomainjob.c
index 99c362d593..c2e7d33097 100644
--- a/src/conf/virdomainjob.c
+++ b/src/conf/virdomainjob.c
@@ -257,6 +257,7 @@ virDomainNestedJobAllowed(virDomainJobObj *jobs, virDomainJob newJob)
 {
     return !jobs->asyncJob ||
            newJob == VIR_JOB_NONE ||
+           (newJob == VIR_JOB_MODIFY_MIGRATION_SAFE && jobs->mask & JOB_MASK(VIR_JOB_MODIFY)) ||
            (jobs->mask & JOB_MASK(newJob));
 }

--
2.51.1
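For readers who want to see the effect of the added condition in isolation, here is a minimal standalone sketch. It is not libvirt code: the `Job` enum, `JobState` struct, `JOB_MASK` macro and `nestedJobAllowed` function are simplified, hypothetical stand-ins for the real job types, and only the shape of the boolean expression follows the hunk above.

```c
#include <stdbool.h>

/* Hypothetical, simplified job identifiers; the real enum has more values. */
typedef enum {
    JOB_NONE = 0,
    JOB_QUERY,
    JOB_MODIFY,
    JOB_MODIFY_MIGRATION_SAFE,
} Job;

/* One bit per job type; JOB_NONE maps to an empty mask. */
#define JOB_MASK(job) ((job) == JOB_NONE ? 0u : 1u << ((job) - 1))

typedef struct {
    bool asyncJobActive;   /* stands in for 'jobs->asyncJob' */
    unsigned int mask;     /* job types allowed while the async job runs */
} JobState;

/* Mirrors the shape of the patched condition: a MIGRATION_SAFE job is
 * accepted whenever plain MODIFY jobs are accepted. */
static bool
nestedJobAllowed(const JobState *jobs, Job newJob)
{
    return !jobs->asyncJobActive ||
           newJob == JOB_NONE ||
           (newJob == JOB_MODIFY_MIGRATION_SAFE &&
            (jobs->mask & JOB_MASK(JOB_MODIFY))) ||
           (jobs->mask & JOB_MASK(newJob));
}
```

With a mask that only lists `JOB_QUERY` and `JOB_MODIFY`, the migration-safe job is now accepted, while the reverse (a mask listing only the migration-safe variant) still rejects plain `JOB_MODIFY`.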
From: Peter Krempa <pkrempa@redhat.com>

'qemuProcessShutdownOrReboot' may or may not kill the VM. In
'qemuProcessReconnect', if we decided that the VM was in a state requiring
'qemuProcessShutdownOrReboot' to be called, we'd stop the reconnection
unconditionally.

If the VM ought to undergo a fake reboot we really need to reconnect to the
process, because the process will be kept around for much longer.

Make 'qemuProcessShutdownOrReboot' return whether it killed the VM and
continue the reconnection if it didn't.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 src/qemu/qemu_driver.c  |  2 +-
 src/qemu/qemu_process.c | 25 +++++++++++++++++++++----
 src/qemu/qemu_process.h |  3 ++-
 3 files changed, 24 insertions(+), 6 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 1f7e587f61..d99528724e 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -3557,7 +3557,7 @@ processGuestPanicEvent(virQEMUDriver *driver,

     case VIR_DOMAIN_LIFECYCLE_ACTION_RESTART:
         qemuDomainSetFakeReboot(vm, true);
-        qemuProcessShutdownOrReboot(vm);
+        ignore_value(qemuProcessShutdownOrReboot(vm));
         break;

     case VIR_DOMAIN_LIFECYCLE_ACTION_PRESERVE:
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 45fc32a663..bbd9859ef4 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -597,7 +597,16 @@ qemuProcessFakeReboot(void *opaque)
 }


-void
+/**
+ * qemuProcessShutdownOrReboot:
+ * @vm: domain object
+ *
+ * Perform the appropriate action when the guest OS shuts down. This can be
+ * either fake reboot (the VM is reset and started again) or the VM is terminated.
+ *
+ * The function returns true if the VM was terminated.
+ */
+bool
 qemuProcessShutdownOrReboot(virDomainObj *vm)
 {
     qemuDomainObjPrivate *priv = vm->privateData;
@@ -620,9 +629,14 @@ qemuProcessShutdownOrReboot(virDomainObj *vm)
             qemuDomainSetFakeReboot(vm, false);
             virObjectUnref(vm);
         }
+
+        return false;
     } else {
         ignore_value(qemuProcessKill(vm, VIR_QEMU_PROCESS_KILL_NOWAIT));
+        return true;
     }
+
+    return false;
 }


@@ -714,7 +728,7 @@ qemuProcessHandleShutdown(qemuMonitor *mon G_GNUC_UNUSED,
     if (priv->agent)
         qemuAgentNotifyEvent(priv->agent, QEMU_AGENT_EVENT_SHUTDOWN);

-    qemuProcessShutdownOrReboot(vm);
+    ignore_value(qemuProcessShutdownOrReboot(vm));

  unlock:
     virObjectUnlock(vm);
@@ -9705,8 +9719,11 @@ qemuProcessReconnect(void *opaque)
          reason == VIR_DOMAIN_PAUSED_USER)) {
         VIR_DEBUG("Finishing shutdown sequence for domain %s",
                   obj->def->name);
-        qemuProcessShutdownOrReboot(obj);
-        goto cleanup;
+        /* qemuProcessShutdownOrReboot returns 'true' if the VM was terminated.
+         * If the VM is kept (e.g. for fake reboot) we need to continue the
+         * reconnection */
+        if (qemuProcessShutdownOrReboot(obj))
+            goto cleanup;
     }

     /* if domain requests security driver we haven't loaded, report error, but
diff --git a/src/qemu/qemu_process.h b/src/qemu/qemu_process.h
index 9f783790ac..426e11d79e 100644
--- a/src/qemu/qemu_process.h
+++ b/src/qemu/qemu_process.h
@@ -192,7 +192,8 @@ int qemuProcessKill(virDomainObj *vm, unsigned int flags);

 int qemuProcessFakeRebootViaRecreate(virDomainObj *vm, bool locked);

-void qemuProcessShutdownOrReboot(virDomainObj *vm);
+bool qemuProcessShutdownOrReboot(virDomainObj *vm)
+    G_GNUC_WARN_UNUSED_RESULT;

 void qemuProcessAutoDestroy(virDomainObj *dom,
                             virConnectPtr conn);
--
2.51.1
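The control-flow change in the reconnect path can be illustrated with a small standalone model. This is only a sketch under simplifying assumptions: the `Vm` struct and the `shutdownOrReboot` and `reconnect` functions are hypothetical, and the real fake reboot is performed asynchronously in a separate thread rather than being a flag check.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for the domain object. */
typedef struct {
    bool fakeReboot;     /* shutdown is to be turned into a reset + restart */
    bool processAlive;   /* whether the hypervisor process still exists */
} Vm;

/* Returns true if the process was terminated, false if it was kept. */
static bool
shutdownOrReboot(Vm *vm)
{
    if (vm->fakeReboot)
        return false;          /* process stays around for the reboot */

    vm->processAlive = false;  /* stand-in for killing the process */
    return true;
}

/* Returns true if the reconnection ran to completion. */
static bool
reconnect(Vm *vm, bool shutdownPending)
{
    /* Only bail out when the process is really gone. */
    if (shutdownPending && shutdownOrReboot(vm))
        return false;

    /* ... the remaining reconnection steps would run here ... */
    return true;
}
```

Before the patch the equivalent of `reconnect` returned early in both cases, which left a VM undergoing a fake reboot without a completed reconnection.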
From: Peter Krempa <pkrempa@redhat.com>

When notifying the backup code about termination of a block job which is
part of a backup operation, the code attempts to terminate the NBD server.
This is done for every blockjob, so we could attempt to terminate the NBD
server multiple times. That doesn't cause problems but generates spurious
errors.

Add a flag recording that the NBD server was stopped and do it just once.
Don't bother storing the flag in the status XML as it's just for the
shutdown phase.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 src/conf/backup_conf.h |  4 ++++
 src/qemu/qemu_backup.c | 19 +++++++++++--------
 2 files changed, 15 insertions(+), 8 deletions(-)

diff --git a/src/conf/backup_conf.h b/src/conf/backup_conf.h
index 9c3532a546..f90a4dcaee 100644
--- a/src/conf/backup_conf.h
+++ b/src/conf/backup_conf.h
@@ -99,6 +99,10 @@ struct _virDomainBackupDef {
     char *errmsg; /* error message of failed sub-blockjob */

     unsigned int apiFlags; /* original flags used when starting the job */
+
+    bool nbdStopped; /* The NBD server for a pull-mode backup was stopped. This
+                        flag is deliberately not stored in the status XML as
+                        it's related only to termination of the backup. */
 };

 typedef enum {
diff --git a/src/qemu/qemu_backup.c b/src/qemu/qemu_backup.c
index 3b4fe54854..9832c186a8 100644
--- a/src/qemu/qemu_backup.c
+++ b/src/qemu/qemu_backup.c
@@ -1006,14 +1006,17 @@ qemuBackupNotifyBlockjobEnd(virDomainObj *vm,
         return;

     if (backup->type == VIR_DOMAIN_BACKUP_TYPE_PULL) {
-        if (qemuDomainObjEnterMonitorAsync(vm, asyncJob) < 0)
-            return;
-        ignore_value(qemuMonitorNBDServerStop(priv->mon));
-        if (backup->tlsAlias)
-            ignore_value(qemuMonitorDelObject(priv->mon, backup->tlsAlias, false));
-        if (backup->tlsSecretAlias)
-            ignore_value(qemuMonitorDelObject(priv->mon, backup->tlsSecretAlias, false));
-        qemuDomainObjExitMonitor(vm);
+        if (!backup->nbdStopped) {
+            if (qemuDomainObjEnterMonitorAsync(vm, asyncJob) < 0)
+                return;
+            ignore_value(qemuMonitorNBDServerStop(priv->mon));
+            if (backup->tlsAlias)
+                ignore_value(qemuMonitorDelObject(priv->mon, backup->tlsAlias, false));
+            if (backup->tlsSecretAlias)
+                ignore_value(qemuMonitorDelObject(priv->mon, backup->tlsSecretAlias, false));
+            qemuDomainObjExitMonitor(vm);
+            backup->nbdStopped = true;
+        }

         /* update the final statistics with the current job's data */
         backup->pull_tmp_used += cur;
--
2.51.1
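The pattern used here is a plain "run once" guard. A minimal sketch, assuming a hypothetical `Backup` struct where a counter stands in for the monitor command that stops the NBD server:

```c
#include <stdbool.h>

/* Hypothetical stand-in for the backup definition. */
typedef struct {
    bool nbdStopped;   /* set once the server stop has been issued */
    int stopCalls;     /* counts simulated "stop NBD server" commands */
} Backup;

/* Called once per finished block job of the backup operation. */
static void
notifyBlockjobEnd(Backup *backup)
{
    if (!backup->nbdStopped) {
        backup->stopCalls++;        /* stand-in for the monitor command */
        backup->nbdStopped = true;
    }

    /* ... per-blockjob statistics would be updated here ... */
}
```

However many block jobs report their end, the stop command in this model is issued exactly once, which is what avoids the spurious errors described in the commit message.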
From: Peter Krempa <pkrempa@redhat.com>

Move the notification to the backup job after finishing the cleanup of the
current block job the backup operation consists of. Currently the
termination of the blockjob would e.g. delete the scratch files before they
are detached from qemu.

In later patches the termination of the backup job may cause the qemu
process to be killed (if the guest OS shut down but the qemu process was
being kept alive to finish the backup), which would cause errors in the
monitor commands for dismissing the block job.

Since the NBD server still needs to be terminated first, as otherwise the
scratch files can't be unplugged from qemu, we need to split the operation
into two. First the NBD server is terminated, then the current block job is
finalized and then the backup job is notified.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 src/qemu/qemu_backup.c   | 41 +++++++++++++++++++++++++++-------------
 src/qemu/qemu_backup.h   |  4 ++++
 src/qemu/qemu_blockjob.c |  7 +++++--
 3 files changed, 37 insertions(+), 15 deletions(-)

diff --git a/src/qemu/qemu_backup.c b/src/qemu/qemu_backup.c
index 9832c186a8..5eed35b471 100644
--- a/src/qemu/qemu_backup.c
+++ b/src/qemu/qemu_backup.c
@@ -981,6 +981,33 @@ qemuBackupGetXMLDesc(virDomainObj *vm,
 }


+void
+qemuBackupNotifyBlockjobEndStopNBD(virDomainObj *vm,
+                                   int asyncJob)
+{
+    qemuDomainObjPrivate *priv = vm->privateData;
+    virDomainBackupDef *backup = priv->backup;
+
+    VIR_DEBUG("vm: '%s'", vm->def->name);
+
+    if (!backup ||
+        backup->type != VIR_DOMAIN_BACKUP_TYPE_PULL ||
+        backup->nbdStopped)
+        return;
+
+    if (qemuDomainObjEnterMonitorAsync(vm, asyncJob) < 0)
+        return;
+    ignore_value(qemuMonitorNBDServerStop(priv->mon));
+    if (backup->tlsAlias)
+        ignore_value(qemuMonitorDelObject(priv->mon, backup->tlsAlias, false));
+    if (backup->tlsSecretAlias)
+        ignore_value(qemuMonitorDelObject(priv->mon, backup->tlsSecretAlias, false));
+    qemuDomainObjExitMonitor(vm);
+
+    backup->nbdStopped = true;
+}
+
+
 void
 qemuBackupNotifyBlockjobEnd(virDomainObj *vm,
                             const char *diskdst,
@@ -1005,20 +1032,8 @@ qemuBackupNotifyBlockjobEnd(virDomainObj *vm,
     if (!backup)
         return;

+    /* update the final statistics with the current job's data */
     if (backup->type == VIR_DOMAIN_BACKUP_TYPE_PULL) {
-        if (!backup->nbdStopped) {
-            if (qemuDomainObjEnterMonitorAsync(vm, asyncJob) < 0)
-                return;
-            ignore_value(qemuMonitorNBDServerStop(priv->mon));
-            if (backup->tlsAlias)
-                ignore_value(qemuMonitorDelObject(priv->mon, backup->tlsAlias, false));
-            if (backup->tlsSecretAlias)
-                ignore_value(qemuMonitorDelObject(priv->mon, backup->tlsSecretAlias, false));
-            qemuDomainObjExitMonitor(vm);
-            backup->nbdStopped = true;
-        }
-
-        /* update the final statistics with the current job's data */
         backup->pull_tmp_used += cur;
         backup->pull_tmp_total += end;
     } else {
diff --git a/src/qemu/qemu_backup.h b/src/qemu/qemu_backup.h
index 768da6cbef..c259883bca 100644
--- a/src/qemu/qemu_backup.h
+++ b/src/qemu/qemu_backup.h
@@ -34,6 +34,10 @@ qemuBackupJobCancelBlockjobs(virDomainObj *vm,
                              bool terminatebackup,
                              int asyncJob);

+void
+qemuBackupNotifyBlockjobEndStopNBD(virDomainObj *vm,
+                                   int asyncJob);
+
 void
 qemuBackupNotifyBlockjobEnd(virDomainObj *vm,
                             const char *diskdst,
diff --git a/src/qemu/qemu_blockjob.c b/src/qemu/qemu_blockjob.c
index 315b742053..b54a5b3811 100644
--- a/src/qemu/qemu_blockjob.c
+++ b/src/qemu/qemu_blockjob.c
@@ -1392,8 +1392,7 @@ qemuBlockJobProcessEventConcludedBackup(virQEMUDriver *driver,
     if (job->disk)
         diskdst = job->disk->dst;

-    qemuBackupNotifyBlockjobEnd(vm, diskdst, newstate, job->errmsg,
-                                progressCurrent, progressTotal, asyncJob);
+    qemuBackupNotifyBlockjobEndStopNBD(vm, asyncJob);

     if (job->data.backup.store &&
         !(backend = qemuBlockStorageSourceDetachPrepare(job->data.backup.store)))
@@ -1415,6 +1414,10 @@ qemuBlockJobProcessEventConcludedBackup(virQEMUDriver *driver,

     if (job->data.backup.store)
         qemuDomainStorageSourceAccessRevoke(driver, vm, job->data.backup.store);
+
+    qemuBackupNotifyBlockjobEnd(vm, diskdst, newstate, job->errmsg,
+                                progressCurrent, progressTotal, asyncJob);
+
 }


--
2.51.1
From: Peter Krempa <pkrempa@redhat.com>

Upcoming patches will introduce the possibility for the domain to be kept
paused after the guest OS shuts itself down. This allows jobs such as
backup, which e.g. in the qemu driver require the qemu process, to finish.

Add the appropriate reason for the VIR_DOMAIN_EVENT_SUSPENDED lifecycle
event.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 examples/c/misc/event-test.c     | 3 +++
 include/libvirt/libvirt-domain.h | 1 +
 tools/virsh-domain-event.c       | 3 ++-
 3 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/examples/c/misc/event-test.c b/examples/c/misc/event-test.c
index 347ec44682..2ce82ca9e0 100644
--- a/examples/c/misc/event-test.c
+++ b/examples/c/misc/event-test.c
@@ -180,6 +180,9 @@ eventDetailToString(int event,
             case VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY_FAILED:
                 return "Post-copy Error";

+            case VIR_DOMAIN_EVENT_SUSPENDED_GUEST_SHUTDOWN:
+                return "guest OS shutdown";
+
             case VIR_DOMAIN_EVENT_SUSPENDED_LAST:
                 break;
             }
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index 56bd085ef5..a2cf762e1a 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -5401,6 +5401,7 @@ typedef enum {
     VIR_DOMAIN_EVENT_SUSPENDED_API_ERROR = 6, /* Some APIs (e.g., migration, snapshot) internally need to suspend a domain. This event detail is used when resume operation at the end of such API fails. (Since: 1.0.1) */
     VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY = 7, /* suspended for post-copy migration (Since: 1.3.3) */
     VIR_DOMAIN_EVENT_SUSPENDED_POSTCOPY_FAILED = 8, /* suspended after failed post-copy (Since: 1.3.3) */
+    VIR_DOMAIN_EVENT_SUSPENDED_GUEST_SHUTDOWN = 9, /* suspended after guest os shut-down (a long running job is preserving the VM until completion) (Since: 11.10.0) */

 # ifdef VIR_ENUM_SENTINELS
     VIR_DOMAIN_EVENT_SUSPENDED_LAST /* (Since: 0.9.10) */
diff --git a/tools/virsh-domain-event.c b/tools/virsh-domain-event.c
index a47fdfc7fd..b9d1cdf019 100644
--- a/tools/virsh-domain-event.c
+++ b/tools/virsh-domain-event.c
@@ -85,7 +85,8 @@ VIR_ENUM_IMPL(virshDomainEventSuspended,
               N_("Snapshot"),
               N_("API error"),
               N_("Post-copy"),
-              N_("Post-copy Error"));
+              N_("Post-copy Error"),
+              N_("guest shutdown"));

 VIR_ENUM_DECL(virshDomainEventResumed);
 VIR_ENUM_IMPL(virshDomainEventResumed,
--
2.51.1
From: Peter Krempa <pkrempa@redhat.com>

This flag will instruct the hypervisor driver to keep the VM around while
the backup is running if the guest OS decides to shut down, so that the
backup can be finished.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 docs/manpages/virsh.rst          | 6 ++++++
 include/libvirt/libvirt-domain.h | 6 ++++--
 src/libvirt-domain.c             | 5 +++++
 tools/virsh-backup.c             | 7 +++++++
 4 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/docs/manpages/virsh.rst b/docs/manpages/virsh.rst
index 73263ffc7f..a9d691824e 100644
--- a/docs/manpages/virsh.rst
+++ b/docs/manpages/virsh.rst
@@ -2186,6 +2186,7 @@ backup-begin
 ::

    backup-begin domain [backupxml] [checkpointxml] [--reuse-external]
+      [--preserve-domain-on-shutdown]

 Begin a new backup job. If *backupxml* is omitted, this defaults to a full
 backup using a push model to filenames generated by libvirt; supplying XML
@@ -2199,6 +2200,11 @@ libvirt. For more information on backup XML, see:
 If *--reuse-external* is used it instructs libvirt to reuse temporary
 and output files provided by the user in *backupxml*.

+When the *--preserve-domain-on-shutdown* flag is used libvirt will not
+terminate the VM if the guest OS shuts down while the backup is running. The VM
+will be instead kept in VIR_DOMAIN_PAUSED state until the backup job finishes.
+The VM can also be resumed in order to boot again.
+
 If *checkpointxml* is specified, a second file with a top-level
 element of *domaincheckpoint* is used to create a simultaneous
 checkpoint, for doing a later incremental backup relative to the time
diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index a2cf762e1a..ad25ed14e1 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -8518,8 +8518,10 @@ int virDomainAgentSetResponseTimeout(virDomainPtr domain,
  * Since: 6.0.0
  */
 typedef enum {
-    VIR_DOMAIN_BACKUP_BEGIN_REUSE_EXTERNAL = (1 << 0), /* reuse separately
-                                                          provided images (Since: 6.0.0) */
+    /* reuse separately provided images (Since: 6.0.0) */
+    VIR_DOMAIN_BACKUP_BEGIN_REUSE_EXTERNAL = (1 << 0),
+    /* preserve the domain if the guest OS shuts down while the backup is running (Since: 11.10.0) */
+    VIR_DOMAIN_BACKUP_BEGIN_PRESERVE_SHUTDOWN_DOMAIN = (1 << 1),
 } virDomainBackupBeginFlags;

 int virDomainBackupBegin(virDomainPtr domain,
diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c
index ca110bdf85..74c70a0a43 100644
--- a/src/libvirt-domain.c
+++ b/src/libvirt-domain.c
@@ -13682,6 +13682,11 @@ virDomainAgentSetResponseTimeout(virDomainPtr domain,
  * temporary files described by the @backupXML document were created by the
  * caller with correct format and size to hold the backup or temporary data.
  *
+ * When the VIR_DOMAIN_BACKUP_BEGIN_PRESERVE_SHUTDOWN_DOMAIN flag is used
+ * libvirt will not terminate the VM if the guest OS shuts down while the
+ * backup is running. The VM will be kept in the VIR_DOMAIN_PAUSED state
+ * until the backup job finishes or until it's resumed via virDomainResume.
+ *
  * The creation of a new checkpoint allows for future incremental backups.
  * Note that some hypervisors may require a particular disk format, such as
  * qcow2, in order to take advantage of checkpoints, while allowing arbitrary
diff --git a/tools/virsh-backup.c b/tools/virsh-backup.c
index 39e62f9ba9..1d009a3f2c 100644
--- a/tools/virsh-backup.c
+++ b/tools/virsh-backup.c
@@ -48,6 +48,10 @@ static const vshCmdOptDef opts_backup_begin[] = {
      .type = VSH_OT_BOOL,
      .help = N_("reuse files provided by caller"),
     },
+    {.name = "preserve-domain-on-shutdown",
+     .type = VSH_OT_BOOL,
+     .help = N_("avoid shutdown of the domain while the backup is running"),
+    },
     {.name = NULL}
 };

@@ -65,6 +69,9 @@ cmdBackupBegin(vshControl *ctl,
     if (vshCommandOptBool(cmd, "reuse-external"))
         flags |= VIR_DOMAIN_BACKUP_BEGIN_REUSE_EXTERNAL;

+    if (vshCommandOptBool(cmd, "preserve-domain-on-shutdown"))
+        flags |= VIR_DOMAIN_BACKUP_BEGIN_PRESERVE_SHUTDOWN_DOMAIN;
+
     if (!(dom = virshCommandOptDomain(ctl, cmd, NULL)))
         return false;

--
2.51.1
From: Peter Krempa <pkrempa@redhat.com>

Implement the support for VIR_DOMAIN_BACKUP_BEGIN_PRESERVE_SHUTDOWN_DOMAIN
which will keep the qemu process around while the backup is still running.

The above is achieved by avoiding killing the qemu process in the shutdown
qemu monitor event handlers. Instead the 'system_reset' QMP command is
issued and the domain object is transitioned into the _PAUSED state in sync
with what qemu does.

Once the backup job finishes (or is cancelled, e.g. for pull mode backups)
the backup job termination code re-assesses whether the qemu process needs
to be killed or whether the VM was re-started by un-pausing.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 src/qemu/qemu_backup.c  | 23 ++++++++++++-
 src/qemu/qemu_process.c | 76 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 98 insertions(+), 1 deletion(-)

diff --git a/src/qemu/qemu_backup.c b/src/qemu/qemu_backup.c
index 5eed35b471..c3566bcd57 100644
--- a/src/qemu/qemu_backup.c
+++ b/src/qemu/qemu_backup.c
@@ -27,6 +27,7 @@
 #include "qemu_checkpoint.h"
 #include "qemu_command.h"
 #include "qemu_security.h"
+#include "qemu_process.h"
 #include "storage_source.h"
 #include "storage_source_conf.h"

@@ -559,6 +560,8 @@ qemuBackupJobTerminate(virDomainObj *vm,
 {
     qemuDomainObjPrivate *priv = vm->privateData;
     g_autoptr(virQEMUDriverConfig) cfg = NULL;
+    /* some flags need to be probed after the private data is freed */
+    unsigned int apiFlags = priv->backup->apiFlags;
     size_t i;

     for (i = 0; i < priv->backup->ndisks; i++) {
@@ -623,6 +626,23 @@ qemuBackupJobTerminate(virDomainObj *vm,

     if (vm->job->asyncJob == VIR_ASYNC_JOB_BACKUP)
         virDomainObjEndAsyncJob(vm);
+
+    /* Users can request that the VM is preserved after a guest OS shutdown for
+     * the duration of the backup. This is the place where we need to check if
+     * this happened and optionally terminate the VM if the guest OS is still
+     * shut down */
+    if (apiFlags & VIR_DOMAIN_BACKUP_BEGIN_PRESERVE_SHUTDOWN_DOMAIN) {
+        int reason = -1;
+        virDomainState state = virDomainObjGetState(vm, &reason);
+
+        VIR_DEBUG("state: '%u', reason:'%d'", state, reason);
+
+        if (state == VIR_DOMAIN_SHUTDOWN ||
+            (state == VIR_DOMAIN_PAUSED && reason == VIR_DOMAIN_PAUSED_SHUTTING_DOWN)) {
+            VIR_DEBUG("backup job finished terminating the previously shutdown VM");
+            ignore_value(qemuProcessShutdownOrReboot(vm));
+        }
+    }
 }


@@ -766,7 +786,8 @@ qemuBackupBegin(virDomainObj *vm,
     int ret = -1;
     g_autoptr(qemuFDPassDirect) fdpass = NULL;

-    virCheckFlags(VIR_DOMAIN_BACKUP_BEGIN_REUSE_EXTERNAL, -1);
+    virCheckFlags(VIR_DOMAIN_BACKUP_BEGIN_REUSE_EXTERNAL |
+                  VIR_DOMAIN_BACKUP_BEGIN_PRESERVE_SHUTDOWN_DOMAIN, -1);

     if (!(def = virDomainBackupDefParseString(backupXML, priv->driver->xmlopt, 0)))
         return -1;
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index bbd9859ef4..4769d6694d 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -597,6 +597,51 @@ qemuProcessFakeReboot(void *opaque)
 }


+static void
+qemuProcessResetPreservedDomain(void *opaque)
+{
+    virDomainObj *vm = opaque;
+    qemuDomainObjPrivate *priv = vm->privateData;
+    virQEMUDriver *driver = priv->driver;
+    virObjectEvent *event = NULL;
+    int rc;
+
+    VIR_DEBUG("vm=%p", vm);
+
+    virObjectLock(vm);
+    if (virDomainObjBeginJob(vm, VIR_JOB_MODIFY) < 0)
+        goto cleanup;
+
+    if (!virDomainObjIsActive(vm)) {
+        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                       _("guest unexpectedly quit"));
+        goto endjob;
+    }
+
+    qemuDomainObjEnterMonitor(vm);
+    rc = qemuMonitorSystemReset(priv->mon);
+    qemuDomainObjExitMonitor(vm);
+
+    /* When a guest-initiated OS shutdown completes, qemu pauses the CPUs,
+     * thus we need to also update the state */
+    virDomainObjSetState(vm, VIR_DOMAIN_PAUSED, VIR_DOMAIN_PAUSED_SHUTTING_DOWN);
+    event = virDomainEventLifecycleNewFromObj(vm,
+                                              VIR_DOMAIN_EVENT_SUSPENDED,
+                                              VIR_DOMAIN_EVENT_SUSPENDED_GUEST_SHUTDOWN);
+
+    if (rc < 0)
+        goto endjob;
+
+ endjob:
+    virDomainObjEndJob(vm);
+
+ cleanup:
+    qemuDomainSaveStatus(vm);
+    virDomainObjEndAPI(&vm);
+    virObjectEventStateQueue(driver->domainEventState, event);
+}
+
+
 /**
  * qemuProcessShutdownOrReboot:
  * @vm: domain object
@@ -630,6 +675,37 @@ qemuProcessShutdownOrReboot(virDomainObj *vm)
             virObjectUnref(vm);
         }

+        return false;
+    } else if (priv->backup && priv->backup->apiFlags & VIR_DOMAIN_BACKUP_BEGIN_PRESERVE_SHUTDOWN_DOMAIN) {
+        /* The users can request that while the 'backup' job is active (and
+         * possibly also other block jobs in the future) the qemu process will
+         * be kept around even when the guest OS shuts down, even when the
+         * requested action is to terminate the VM.
+         *
+         * In such case we'll reset the VM and keep it paused with proper state
+         * so that users can re-start it if needed.
+         *
+         * Terminating of the qemu process once the backup job is
+         * completed/terminated (unless the guest was unpaused/restarted) is
+         * then done in qemuBackupJobTerminate by invoking this function once
+         * again.
+         */
+        g_autofree char *name = g_strdup_printf("reset-%s", vm->def->name);
+        virThread th;
+
+        VIR_DEBUG("preserving qemu process while backup job is running");
+
+        virObjectRef(vm);
+        if (virThreadCreateFull(&th,
+                                false,
+                                qemuProcessResetPreservedDomain,
+                                name,
+                                false,
+                                vm) < 0) {
+            VIR_WARN("Failed to create thread to reset shutdown VM");
+            virObjectUnref(vm);
+        }
+
         return false;
     } else {
         ignore_value(qemuProcessKill(vm, VIR_QEMU_PROCESS_KILL_NOWAIT));
--
2.51.1
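The check added to qemuBackupJobTerminate boils down to a small predicate over the API flags and the domain state. The sketch below models it in isolation; the `State` and `Reason` enums, the `BACKUP_PRESERVE_SHUTDOWN_DOMAIN` constant and the `needsShutdownHandling` function are hypothetical simplifications, not the libvirt definitions.

```c
#include <stdbool.h>

/* Hypothetical, reduced versions of the domain state and paused reason. */
typedef enum { STATE_RUNNING, STATE_PAUSED, STATE_SHUTDOWN } State;
typedef enum { REASON_NONE, REASON_USER, REASON_SHUTTING_DOWN } Reason;

/* Same bit position as the new flag in the public header, under a
 * made-up name. */
#define BACKUP_PRESERVE_SHUTDOWN_DOMAIN (1u << 1)

/* True if, after the backup job ended, the shutdown handling has to be
 * invoked again because the guest OS is still shut down. */
static bool
needsShutdownHandling(unsigned int apiFlags, State state, Reason reason)
{
    if (!(apiFlags & BACKUP_PRESERVE_SHUTDOWN_DOMAIN))
        return false;

    return state == STATE_SHUTDOWN ||
           (state == STATE_PAUSED && reason == REASON_SHUTTING_DOWN);
}
```

A VM that the user resumed in the meantime is running (or paused for a different reason), so the predicate is false and the process is left alone.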
From: Peter Krempa <pkrempa@redhat.com>

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
---
 docs/kbase/live_full_disk_backup.rst | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/docs/kbase/live_full_disk_backup.rst b/docs/kbase/live_full_disk_backup.rst
index be95d9d2e2..2b3b19772e 100644
--- a/docs/kbase/live_full_disk_backup.rst
+++ b/docs/kbase/live_full_disk_backup.rst
@@ -84,6 +84,24 @@ This requires libvirt-7.2.0 and QEMU-4.2, or higher versions.
       15M -rw-r--r--. 1 qemu qemu 15M May 10 12:22 vm1.qcow2
       21M -rw-------. 1 root root 21M May 10 12:23 vm1.qcow2.1620642185

+Shutdown of the guest OS during backup
+--------------------------------------
+
+The backup job is a long running job, potentially copying a lot of data, which
+requires the VM to be active (The backup is done by the qemu process) and
+can't be continued if the VM shuts down. This includes shut down initiated by
+the guest OS itself.
+
+Starting from ``libvirt-11.10`` the ``virDomainBackupBegin()`` supports the
+``VIR_DOMAIN_BACKUP_BEGIN_PRESERVE_SHUTDOWN_DOMAIN`` flag
+(``virsh backup-begin --preserve-domain-on-shutdown``) which instructs libvirt
+to avoid termination of the VM if the guest OS shuts down while the backup is
+still running. The VM is in that scenario reset and paused instead of terminated
+allowing the backup to finish. Once the backup finishes the VM process is
+terminated. Users can resume the VM (e.g. ``virsh resume``) which causes it
+to boot normally using the exsiting VM process and will continue to run after
+completion of the backup job.
+
 Full backup with older libvirt versions
 =======================================

--
2.51.1
On a Wednesday in 2025, Peter Krempa via Devel wrote:
> From: Peter Krempa <pkrempa@redhat.com>
>
> Signed-off-by: Peter Krempa <pkrempa@redhat.com>
> ---
>  docs/kbase/live_full_disk_backup.rst | 18 ++++++++++++++++++
>  1 file changed, 18 insertions(+)
>
> diff --git a/docs/kbase/live_full_disk_backup.rst b/docs/kbase/live_full_disk_backup.rst
> index be95d9d2e2..2b3b19772e 100644
> --- a/docs/kbase/live_full_disk_backup.rst
> +++ b/docs/kbase/live_full_disk_backup.rst
> @@ -84,6 +84,24 @@ This requires libvirt-7.2.0 and QEMU-4.2, or higher versions.
>        15M -rw-r--r--. 1 qemu qemu 15M May 10 12:22 vm1.qcow2
>        21M -rw-------. 1 root root 21M May 10 12:23 vm1.qcow2.1620642185
>
> +Shutdown of the guest OS during backup
> +--------------------------------------
> +
> +The backup job is a long running job, potentially copying a lot of data, which
> +requires the VM to be active (The backup is done by the qemu process) and
> +can't be continued if the VM shuts down. This includes shut down initiated by
> +the guest OS itself.
> +
> +Starting from ``libvirt-11.10`` the ``virDomainBackupBegin()`` supports the

Can you use the :since: annotation for this?

> +``VIR_DOMAIN_BACKUP_BEGIN_PRESERVE_SHUTDOWN_DOMAIN`` flag
> +(``virsh backup-begin --preserve-domain-on-shutdown``) which instructs libvirt
> +to avoid termination of the VM if the guest OS shuts down while the backup is
> +still running. The VM is in that scenario reset and paused instead of terminated
> +allowing the backup to finish. Once the backup finishes the VM process is
> +terminated. Users can resume the VM (e.g. ``virsh resume``) which causes it
> +to boot normally using the exsiting VM process and will continue to run after

*existing

Jano

> +completion of the backup job.
> +
>  Full backup with older libvirt versions
>  =======================================
>
> --
> 2.51.1
On Wed, Nov 19, 2025 at 11:26:47 +0100, Ján Tomko wrote:
> On a Wednesday in 2025, Peter Krempa via Devel wrote:
> > From: Peter Krempa <pkrempa@redhat.com>
> >
> > Signed-off-by: Peter Krempa <pkrempa@redhat.com>
> > ---
> >  docs/kbase/live_full_disk_backup.rst | 18 ++++++++++++++++++
> >  1 file changed, 18 insertions(+)
> >
> > diff --git a/docs/kbase/live_full_disk_backup.rst b/docs/kbase/live_full_disk_backup.rst
> > index be95d9d2e2..2b3b19772e 100644
> > --- a/docs/kbase/live_full_disk_backup.rst
> > +++ b/docs/kbase/live_full_disk_backup.rst
> > @@ -84,6 +84,24 @@ This requires libvirt-7.2.0 and QEMU-4.2, or higher versions.
> >        15M -rw-r--r--. 1 qemu qemu 15M May 10 12:22 vm1.qcow2
> >        21M -rw-------. 1 root root 21M May 10 12:23 vm1.qcow2.1620642185
> >
> > +Shutdown of the guest OS during backup
> > +--------------------------------------
> > +
> > +The backup job is a long running job, potentially copying a lot of data, which
> > +requires the VM to be active (The backup is done by the qemu process) and
> > +can't be continued if the VM shuts down. This includes shut down initiated by
> > +the guest OS itself.
> > +
> > +Starting from ``libvirt-11.10`` the ``virDomainBackupBegin()`` supports the
>
> Can you use the :since: annotation for this?

Yes, provided that I also add the definition of the 'since' role at the
beginning of the file:

.. role:: since

> > +``VIR_DOMAIN_BACKUP_BEGIN_PRESERVE_SHUTDOWN_DOMAIN`` flag
> > +(``virsh backup-begin --preserve-domain-on-shutdown``) which instructs libvirt
> > +to avoid termination of the VM if the guest OS shuts down while the backup is
> > +still running. The VM is in that scenario reset and paused instead of terminated
> > +allowing the backup to finish. Once the backup finishes the VM process is
> > +terminated. Users can resume the VM (e.g. ``virsh resume``) which causes it
> > +to boot normally using the exsiting VM process and will continue to run after
>
> *existing
>
> Jano
>
> > +completion of the backup job.
> > +
> >  Full backup with older libvirt versions
> >  =======================================
> >
> > --
> > 2.51.1
On a Wednesday in 2025, Peter Krempa via Devel wrote:
> This series first fixes few job handling, reconnection and backup job bugs
> and then implements an option to keep the VM process around when the guest OS
> shuts down during backup, so that the backup doesn't need to be restarted.
>
> Peter Krempa (8):
>   virDomainNestedJobAllowed: Allow VIR_JOB_MODIFY_MIGRATION_SAFE if
>     VIR_JOB_MODIFY is allowed
>   qemuProcessReconnect: Continue reconnection if VM untergoes fake-reboot
>   qemu: backup: Don't attempt to stop the NBD server twice
>   qemuBlockJobProcessEventConcludedBackup: Notify the backup job later
>   lib: Introduce VIR_DOMAIN_EVENT_SUSPENDED_GUEST_SHUTDOWN event reason
>   lib: Introduce VIR_DOMAIN_BACKUP_BEGIN_PRESERVE_SHUTDOWN_DOMAIN flag
>   qemu: backup: Add support for
>     VIR_DOMAIN_BACKUP_BEGIN_PRESERVE_SHUTDOWN_DOMAIN
>   kbase: Add note about preserving VM on shutdown to backup article
>
>  docs/kbase/live_full_disk_backup.rst |  18 +++++
>  docs/manpages/virsh.rst              |   6 ++
>  examples/c/misc/event-test.c         |   3 +
>  include/libvirt/libvirt-domain.h     |   7 +-
>  src/conf/backup_conf.h               |   4 ++
>  src/conf/virdomainjob.c              |   1 +
>  src/libvirt-domain.c                 |   5 ++
>  src/qemu/qemu_backup.c               |  61 +++++++++++++---
>  src/qemu/qemu_backup.h               |   4 ++
>  src/qemu/qemu_blockjob.c             |   7 +-
>  src/qemu/qemu_driver.c               |   2 +-
>  src/qemu/qemu_process.c              | 101 +++++++++++++++++++++++++--
>  src/qemu/qemu_process.h              |   3 +-
>  tools/virsh-backup.c                 |   7 ++
>  tools/virsh-domain-event.c           |   3 +-
>  15 files changed, 210 insertions(+), 22 deletions(-)

Reviewed-by: Ján Tomko <jtomko@redhat.com>

Jano