[libvirt] [PATCH 1/3] snapshot: allow deletion of just snapshot metadata

A future patch will make it impossible to remove a domain if it would leave behind any libvirt-tracked metadata about snapshots, since stale metadata interferes with a new domain by the same name. But requiring snaphot contents to be deleted before removing a domain is harsh; with qemu, qemu-img can still make use of the contents after the libvirt domain is gone. Therefore, we need an option to get rid of libvirt tracking information, but not the actual contents. For hypervisors that do not track any metadata in libvirt, the implementation is trivial; all remaining hypervisors (really, just qemu) will be dealt with separately. * include/libvirt/libvirt.h.in (VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY): New flag. * src/libvirt.c (virDomainSnapshotDelete): Document it. * src/esx/esx_driver.c (esxDomainSnapshotDelete): Trivially supported when there is no libvirt metadata. * src/vbox/vbox_tmpl.c (vboxDomainSnapshotDelete): Likewise. --- Progress towards implementing my RFC. Patch 2/3 took a lot more of my time to diagnose and fix than I was expecting. include/libvirt/libvirt.h.in | 1 + src/esx/esx_driver.c | 10 +++++++++- src/libvirt.c | 10 +++++++--- src/vbox/vbox_tmpl.c | 10 +++++++++- 4 files changed, 26 insertions(+), 5 deletions(-) diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index a625479..3aec4b0 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -2580,6 +2580,7 @@ int virDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, /* Delete a snapshot */ typedef enum { VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN = (1 << 0), + VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY = (2 << 0), } virDomainSnapshotDeleteFlags; int virDomainSnapshotDelete(virDomainSnapshotPtr snapshot, diff --git a/src/esx/esx_driver.c b/src/esx/esx_driver.c index c097651..beeafbd 100644 --- a/src/esx/esx_driver.c +++ b/src/esx/esx_driver.c @@ -4543,7 +4543,8 @@ esxDomainSnapshotDelete(virDomainSnapshotPtr snapshot, unsigned int flags) esxVI_TaskInfoState taskInfoState; char *taskInfoErrorMessage = NULL; - virCheckFlags(VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN, -1); + virCheckFlags(VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN | + VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY, -1); if (esxVI_EnsureSession(priv->primary) < 0) { return -1; @@ -4561,6 +4562,13 @@ esxDomainSnapshotDelete(virDomainSnapshotPtr snapshot, unsigned int flags) goto cleanup; } + /* ESX snapshots do not require any libvirt metadata, making this + * flag trivial once we know we have a valid snapshot. */ + if (flags & VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY) { + result = 0; + goto cleanup; + } + if (esxVI_RemoveSnapshot_Task(priv->primary, snapshotTree->snapshot, removeChildren, &task) < 0 || esxVI_WaitForTaskCompletion(priv->primary, task, snapshot->domain->uuid, diff --git a/src/libvirt.c b/src/libvirt.c index c8af3e1..8ee9e96 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -15792,14 +15792,18 @@ error: /** * virDomainSnapshotDelete: * @snapshot: a domain snapshot object - * @flags: flag parameters + * @flags: bitwise-or of supported virDomainSnapshotDeleteFlags * * Delete the snapshot. * * If @flags is 0, then just this snapshot is deleted, and changes from * this snapshot are automatically merged into children snapshots. If - * flags is VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN, then this snapshot - * and any children snapshots are deleted. + * @flags includes VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN, then this snapshot + * and any children snapshots are deleted. If @flags includes + * VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY, then any snapshot metadata + * tracked by libvirt is removed while keeping the snapshot contents + * intact; if a hypervisor does not require any libvirt metadata to + * track snapshots, then this flag results in no changes. * * Returns 0 if the snapshot was successfully deleted, -1 on error. */ diff --git a/src/vbox/vbox_tmpl.c b/src/vbox/vbox_tmpl.c index 822e899..8de2bae 100644 --- a/src/vbox/vbox_tmpl.c +++ b/src/vbox/vbox_tmpl.c @@ -6361,7 +6361,8 @@ vboxDomainSnapshotDelete(virDomainSnapshotPtr snapshot, PRUint32 state; nsresult rc; - virCheckFlags(VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN, -1); + virCheckFlags(VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN | + VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY, -1); vboxIIDFromUUID(&domiid, dom->uuid); rc = VBOX_OBJECT_GET_MACHINE(domiid.value, &machine); @@ -6382,6 +6383,13 @@ vboxDomainSnapshotDelete(virDomainSnapshotPtr snapshot, goto cleanup; } + /* VBOX snapshots do not require any libvirt metadata, making this + * flag trivial once we know we have a valid snapshot. */ + if (flags & VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY) { + ret = 0; + goto cleanup; + } + if (state >= MachineState_FirstOnline && state <= MachineState_LastOnline) { vboxError(VIR_ERR_OPERATION_INVALID, "%s", -- 1.7.4.4

This one's nasty. Ever since we fixed virHashForEach to prevent nested hash iterations for safety reasons, virDomainSnapshotDelete with VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN has been broken for qemu: it deletes children, while leaving grandchildren intact but pointing to a no-longer-present parent. But even before then, the code would often appear to succeed to clean up grandchildren, but risked memory corruption if you have a large and deep hierarchy of snapshots. For acting on just children, a single virHashForEach is sufficient. But for acting on an entire subtree, it requires iteration; and since we declared recursion as invalid, we have to switch to a while loop. Doing this correctly requires quite a bit of overhaul, so I added a new helper function to isolate the algorithm from the actions, so that callers do not have to reinvent the iteration. * src/conf/domain_conf.h (_virDomainSnapshotDef): Add mark. (virDomainSnapshotForEachDescendant): New prototype. * src/libvirt_private.syms (domain_conf.h): Export it. * src/conf/domain_conf.c (virDomainSnapshotMarkDescendant) (virDomainSnapshotActOnDescendant) (virDomainSnapshotForEachDescendant): New functions. * src/qemu/qemu_driver.c (qemuDomainSnapshotDiscardChildren): Replace... (qemuDomainSnapshotDiscardDescenent): ...with callback that doesn't nest hash traversal. (qemuDomainSnapshotDelete): Use new function. --- src/conf/domain_conf.c | 103 ++++++++++++++++++++++++++++++++++++++++++++++ src/conf/domain_conf.h | 9 ++++- src/libvirt_private.syms | 1 + src/qemu/qemu_driver.c | 35 ++++++---------- 4 files changed, 125 insertions(+), 23 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 010ce57..b3770c4 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -11233,6 +11233,109 @@ int virDomainSnapshotHasChildren(virDomainSnapshotObjPtr snap, return children.number; } +typedef enum { + MARK_NONE, /* No relation determined yet */ + MARK_DESCENDANT, /* Descendant of target */ + MARK_OTHER, /* Not a descendant of target */ +} snapshot_mark; + +struct snapshot_mark_descendant { + const char *name; /* Parent's name on round 1, NULL on other rounds. */ + virDomainSnapshotObjListPtr snapshots; + bool marked; /* True if descendants were found in this round */ +}; + +/* To be called in a loop until no more descendants are found. + * Additionally marking known unrelated snapshots reduces the number + * of total hash searches needed. */ +static void +virDomainSnapshotMarkDescendant(void *payload, + const void *name ATTRIBUTE_UNUSED, + void *data) +{ + virDomainSnapshotObjPtr obj = payload; + struct snapshot_mark_descendant *curr = data; + virDomainSnapshotObjPtr parent = NULL; + + /* Learned on a previous pass. */ + if (obj->mark) + return; + + if (curr->name) { + /* First round can only find root nodes and direct children. */ + if (!obj->def->parent) { + obj->mark = MARK_OTHER; + } else if (STREQ(obj->def->parent, curr->name)) { + obj->mark = MARK_DESCENDANT; + curr->marked = true; + } + } else { + /* All remaining rounds propagate marks from parents to children. */ + parent = virDomainSnapshotFindByName(curr->snapshots, obj->def->parent); + if (!parent) { + VIR_WARN("snapshot hash table is inconsistent!"); + obj->mark = MARK_OTHER; + return; + } + if (parent->mark) { + obj->mark = parent->mark; + if (obj->mark == MARK_DESCENDANT) + curr->marked = true; + } + } +} + +struct snapshot_act_on_descendant { + int number; + virHashIterator iter; + void *data; +}; + +static void +virDomainSnapshotActOnDescendant(void *payload, + const void *name, + void *data) +{ + virDomainSnapshotObjPtr obj = payload; + struct snapshot_act_on_descendant *curr = data; + + if (obj->mark == MARK_DESCENDANT) { + curr->number++; + (curr->iter)(payload, name, curr->data); + } + obj->mark = MARK_NONE; +} + +/* Run iter(data) on all descendants of snapshot, while ignoring all + * other entries in snapshots. Return the number of descendants + * visited. No particular ordering is guaranteed. */ +int +virDomainSnapshotForEachDescendant(virDomainSnapshotObjListPtr snapshots, + virDomainSnapshotObjPtr snapshot, + virHashIterator iter, + void *data) +{ + struct snapshot_mark_descendant mark; + struct snapshot_act_on_descendant act; + + /* virHashForEach does not support nested traversal, so we must + * instead iterate until no more snapshots get marked. We + * guarantee that on exit, all marks have been cleared again. */ + mark.name = snapshot->def->name; + mark.snapshots = snapshots; + mark.marked = true; + while (mark.marked) { + mark.marked = false; + virHashForEach(snapshots->objs, virDomainSnapshotMarkDescendant, &mark); + mark.name = NULL; + } + act.number = 0; + act.iter = iter; + act.data = data; + virHashForEach(snapshots->objs, virDomainSnapshotActOnDescendant, &act); + + return act.number; +} int virDomainChrDefForeach(virDomainDefPtr def, bool abortOnError, diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index abf9cbd..373a88a 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -1140,13 +1140,16 @@ struct _virDomainSnapshotDef { long long creationTime; /* in seconds */ int state; - long active; + long active; /* XXX make this internal use only to identify current snap */ }; typedef struct _virDomainSnapshotObj virDomainSnapshotObj; typedef virDomainSnapshotObj *virDomainSnapshotObjPtr; struct _virDomainSnapshotObj { virDomainSnapshotDefPtr def; + + /* Internal use only */ + int mark; /* Used in identifying descendents. */ }; typedef struct _virDomainSnapshotObjList virDomainSnapshotObjList; @@ -1176,6 +1179,10 @@ void virDomainSnapshotObjListRemove(virDomainSnapshotObjListPtr snapshots, virDomainSnapshotObjPtr snapshot); int virDomainSnapshotHasChildren(virDomainSnapshotObjPtr snap, virDomainSnapshotObjListPtr snapshots); +int virDomainSnapshotForEachDescendant(virDomainSnapshotObjListPtr snapshots, + virDomainSnapshotObjPtr snapshot, + virHashIterator iter, + void *data); typedef struct _virDomainVcpuPinDef virDomainVcpuPinDef; typedef virDomainVcpuPinDef *virDomainVcpuPinDefPtr; diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 830222b..b8ea369 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -384,6 +384,7 @@ virDomainSnapshotDefFormat; virDomainSnapshotDefFree; virDomainSnapshotDefParseString; virDomainSnapshotFindByName; +virDomainSnapshotForEachDescendant; virDomainSnapshotHasChildren; virDomainSnapshotObjListGetNames; virDomainSnapshotObjListNum; diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index a1e89ba..eeee9dd 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -9008,31 +9008,21 @@ cleanup: struct snap_remove { struct qemud_driver *driver; virDomainObjPtr vm; - char *parent; int err; }; -static void qemuDomainSnapshotDiscardChildren(void *payload, - const void *name ATTRIBUTE_UNUSED, - void *data) +static void +qemuDomainSnapshotDiscardDescendant(void *payload, + const void *name ATTRIBUTE_UNUSED, + void *data) { virDomainSnapshotObjPtr snap = payload; struct snap_remove *curr = data; - struct snap_remove this; - - if (snap->def->parent && STREQ(snap->def->parent, curr->parent)) { - this.driver = curr->driver; - this.vm = curr->vm; - this.parent = snap->def->name; - this.err = 0; - virHashForEach(curr->vm->snapshots.objs, - qemuDomainSnapshotDiscardChildren, &this); - - if (this.err) - curr->err = this.err; - else - this.err = qemuDomainSnapshotDiscard(curr->driver, curr->vm, snap); - } + int err; + + err = qemuDomainSnapshotDiscard(curr->driver, curr->vm, snap); + if (err && !curr->err) + curr->err = err; } struct snap_reparent { @@ -9108,10 +9098,11 @@ static int qemuDomainSnapshotDelete(virDomainSnapshotPtr snapshot, if (flags & VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN) { rem.driver = driver; rem.vm = vm; - rem.parent = snap->def->name; rem.err = 0; - virHashForEach(vm->snapshots.objs, qemuDomainSnapshotDiscardChildren, - &rem); + virDomainSnapshotForEachDescendant(&vm->snapshots, + snap, + qemuDomainSnapshotDiscardDescendant, + &rem); if (rem.err < 0) goto endjob; } else { -- 1.7.4.4

Adding this was trivial compared to the previous patch for fixing qemu snapshot deletion in the first place. * src/qemu/qemu_driver.c (qemuDomainSnapshotDiscard): Add parameter. (qemuDomainSnapshotDiscardDescendant, qemuDomainSnapshotDelete): Update callers. --- Most of the diff is indentation. src/qemu/qemu_driver.c | 78 ++++++++++++++++++++++++++--------------------- 1 files changed, 43 insertions(+), 35 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index eeee9dd..c5b3965 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -8923,7 +8923,8 @@ cleanup: static int qemuDomainSnapshotDiscard(struct qemud_driver *driver, virDomainObjPtr vm, - virDomainSnapshotObjPtr snap) + virDomainSnapshotObjPtr snap, + bool metadata_only) { const char *qemuimgarg[] = { NULL, "snapshot", "-d", NULL, NULL, NULL }; char *snapFile = NULL; @@ -8932,41 +8933,43 @@ static int qemuDomainSnapshotDiscard(struct qemud_driver *driver, qemuDomainObjPrivatePtr priv; virDomainSnapshotObjPtr parentsnap; - if (!virDomainObjIsActive(vm)) { - qemuimgarg[0] = qemuFindQemuImgBinary(); - if (qemuimgarg[0] == NULL) - /* qemuFindQemuImgBinary set the error */ - goto cleanup; - - qemuimgarg[3] = snap->def->name; - - for (i = 0; i < vm->def->ndisks; i++) { - /* FIXME: we also need to handle LVM here */ - if (vm->def->disks[i]->device == VIR_DOMAIN_DISK_DEVICE_DISK) { - if (!vm->def->disks[i]->driverType || - STRNEQ(vm->def->disks[i]->driverType, "qcow2")) { - /* we continue on even in the face of error, since other - * disks in this VM may have this snapshot in place - */ - continue; - } - - qemuimgarg[4] = vm->def->disks[i]->src; + if (!metadata_only) { + if (!virDomainObjIsActive(vm)) { + qemuimgarg[0] = qemuFindQemuImgBinary(); + if (qemuimgarg[0] == NULL) + /* qemuFindQemuImgBinary set the error */ + goto cleanup; - if (virRun(qemuimgarg, NULL) < 0) { - /* we continue on even in the face of error, since other - * disks in this VM may have this snapshot in place - */ - continue; + qemuimgarg[3] = snap->def->name; + + for (i = 0; i < vm->def->ndisks; i++) { + /* FIXME: we also need to handle LVM here */ + if (vm->def->disks[i]->device == VIR_DOMAIN_DISK_DEVICE_DISK) { + if (!vm->def->disks[i]->driverType || + STRNEQ(vm->def->disks[i]->driverType, "qcow2")) { + /* we continue on even in the face of error, since other + * disks in this VM may have this snapshot in place + */ + continue; + } + + qemuimgarg[4] = vm->def->disks[i]->src; + + if (virRun(qemuimgarg, NULL) < 0) { + /* we continue on even in the face of error, since other + * disks in this VM may have this snapshot in place + */ + continue; + } } } + } else { + priv = vm->privateData; + qemuDomainObjEnterMonitorWithDriver(driver, vm); + /* we continue on even in the face of error */ + qemuMonitorDeleteSnapshot(priv->mon, snap->def->name); + qemuDomainObjExitMonitorWithDriver(driver, vm); } - } else { - priv = vm->privateData; - qemuDomainObjEnterMonitorWithDriver(driver, vm); - /* we continue on even in the face of error */ - qemuMonitorDeleteSnapshot(priv->mon, snap->def->name); - qemuDomainObjExitMonitorWithDriver(driver, vm); } if (snap == vm->current_snapshot) { @@ -9008,6 +9011,7 @@ cleanup: struct snap_remove { struct qemud_driver *driver; virDomainObjPtr vm; + bool metadata_only; int err; }; @@ -9020,7 +9024,8 @@ qemuDomainSnapshotDiscardDescendant(void *payload, struct snap_remove *curr = data; int err; - err = qemuDomainSnapshotDiscard(curr->driver, curr->vm, snap); + err = qemuDomainSnapshotDiscard(curr->driver, curr->vm, snap, + curr->metadata_only); if (err && !curr->err) curr->err = err; } @@ -9072,8 +9077,10 @@ static int qemuDomainSnapshotDelete(virDomainSnapshotPtr snapshot, char uuidstr[VIR_UUID_STRING_BUFLEN]; struct snap_remove rem; struct snap_reparent rep; + bool metadata_only = !!(flags & VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY); - virCheckFlags(VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN, -1); + virCheckFlags(VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN | + VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY, -1); qemuDriverLock(driver); virUUIDFormat(snapshot->domain->uuid, uuidstr); @@ -9098,6 +9105,7 @@ static int qemuDomainSnapshotDelete(virDomainSnapshotPtr snapshot, if (flags & VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN) { rem.driver = driver; rem.vm = vm; + rem.metadata_only = metadata_only; rem.err = 0; virDomainSnapshotForEachDescendant(&vm->snapshots, snap, @@ -9116,7 +9124,7 @@ static int qemuDomainSnapshotDelete(virDomainSnapshotPtr snapshot, goto endjob; } - ret = qemuDomainSnapshotDiscard(driver, vm, snap); + ret = qemuDomainSnapshotDiscard(driver, vm, snap, metadata_only); endjob: if (qemuDomainObjEndJob(driver, vm) == 0) -- 1.7.4.4

On 08/12/2011 07:30 AM, Eric Blake wrote:
A future patch will make it impossible to remove a domain if it would leave behind any libvirt-tracked metadata about snapshots, since stale metadata interferes with a new domain by the same name. But requiring snaphot contents to be deleted before removing a domain is harsh; with qemu, qemu-img can still make use of the contents after the libvirt domain is gone. Therefore, we need an option to get rid of libvirt tracking information, but not the actual contents. For hypervisors that do not track any metadata in libvirt, the implementation is trivial; all remaining hypervisors (really, just qemu) will be dealt with separately.
+++ b/include/libvirt/libvirt.h.in @@ -2580,6 +2580,7 @@ int virDomainRevertToSnapshot(virDomainSnapshotPtr snapshot, /* Delete a snapshot */ typedef enum { VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN = (1<< 0), + VIR_DOMAIN_SNAPSHOT_DELETE_METADATA_ONLY = (2<< 0),
Don't know what came over me today :) '2 << 0' happens to be the same as '1 << 1', but it is the wrong pattern (since by continuing the pattern, 3 << 0 is _not_ the same as 1 << 2). -- Eric Blake eblake@redhat.com +1-801-349-2682 Libvirt virtualization library http://libvirt.org
participants (1)
-
Eric Blake