[libvirt] [RFC PATCH 0/5] Refactor storage file metadata operations when dealing with snapshots.

Now that we track storage source as a pointer we might use this. Note that this is currently work in progress as I've didn't check all code paths and whether it's safe to modify the backing chains as I did. Peter Krempa (5): storage: encryption: Add deep copy function for storage encryption util: seclabel: Add deep copy function for device labels util: storagefile: Introduce helper to free storage source perms util: storagefile: Add deep copy for struct virStorageSource qemu: snapshot: Improve approach to deal with snapshot metadata src/libvirt_private.syms | 1 + src/qemu/qemu_driver.c | 108 +++++--------------------- src/util/virseclabel.c | 22 ++++++ src/util/virseclabel.h | 5 +- src/util/virstorageencryption.c | 39 ++++++++++ src/util/virstorageencryption.h | 2 + src/util/virstoragefile.c | 166 +++++++++++++++++++++++++++++++++++++++- src/util/virstoragefile.h | 2 + 8 files changed, 251 insertions(+), 94 deletions(-) -- 1.9.3

--- src/util/virstorageencryption.c | 39 +++++++++++++++++++++++++++++++++++++++ src/util/virstorageencryption.h | 2 ++ 2 files changed, 41 insertions(+) diff --git a/src/util/virstorageencryption.c b/src/util/virstorageencryption.c index 1306490..f750642 100644 --- a/src/util/virstorageencryption.c +++ b/src/util/virstorageencryption.c @@ -67,6 +67,45 @@ virStorageEncryptionFree(virStorageEncryptionPtr enc) } static virStorageEncryptionSecretPtr +virStorageEncryptionSecretCopy(const virStorageEncryptionSecret *src) +{ + virStorageEncryptionSecretPtr ret; + + if (VIR_ALLOC(ret) < 0) + return NULL; + + memcpy(ret, src, sizeof(*src)); + + return ret; +} + +virStorageEncryptionPtr +virStorageEncryptionCopy(const virStorageEncryption *src) +{ + virStorageEncryptionPtr ret; + size_t i; + + if (VIR_ALLOC(ret) < 0) + return NULL; + + ret->nsecrets = src->nsecrets; + + if (VIR_ALLOC_N(ret->secrets, ret->nsecrets) < 0) + goto error; + + for (i = 0; i < src->nsecrets; i++) { + if (!(ret->secrets[i] = virStorageEncryptionSecretCopy(src->secrets[i]))) + goto error; + } + + return ret; + + error: + virStorageEncryptionFree(ret); + return NULL; +} + +static virStorageEncryptionSecretPtr virStorageEncryptionSecretParse(xmlXPathContextPtr ctxt, xmlNodePtr node) { diff --git a/src/util/virstorageencryption.h b/src/util/virstorageencryption.h index f63c9ee..313d7ac 100644 --- a/src/util/virstorageencryption.h +++ b/src/util/virstorageencryption.h @@ -61,6 +61,8 @@ struct _virStorageEncryption { virStorageEncryptionSecretPtr *secrets; }; +virStorageEncryptionPtr virStorageEncryptionCopy(const virStorageEncryption *src); + void virStorageEncryptionFree(virStorageEncryptionPtr enc); virStorageEncryptionPtr virStorageEncryptionParseNode(xmlDocPtr xml, -- 1.9.3

On 06/12/2014 09:02 AM, Peter Krempa wrote:
--- src/util/virstorageencryption.c | 39 +++++++++++++++++++++++++++++++++++++++ src/util/virstorageencryption.h | 2 ++ 2 files changed, 41 insertions(+)
I've definitely been wishing for this; as our struct gets more complicated, piece-wise transfers into a temporary will not be robust, while copying into a temporary can be. Did you need to export the new symbol in libvirt_private.syms?
+virStorageEncryptionPtr +virStorageEncryptionCopy(const virStorageEncryption *src) +{ + virStorageEncryptionPtr ret; + size_t i; + + if (VIR_ALLOC(ret) < 0) + return NULL; + + ret->nsecrets = src->nsecrets;
Swap this line...
+ + if (VIR_ALLOC_N(ret->secrets, ret->nsecrets) < 0) + goto error;
...here, and use src->nsecrets instead of ret->nsecrets in the VIR_ALLOC_N. Why? Because if VIR_ALLOC_N fails, the error label calls virStorageEncryptionFree(ret), but that function blindly assumes that ret->nsecrets is valid and tries to dereference memory. You forgot: ret->format = src->format;
+++ b/src/util/virstorageencryption.h @@ -61,6 +61,8 @@ struct _virStorageEncryption { virStorageEncryptionSecretPtr *secrets; };
+virStorageEncryptionPtr virStorageEncryptionCopy(const virStorageEncryption *src);
add ATTRIBUTE_NONNULL(1), since we blindly dereference src. ACK with problems fixed. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 06/12/14 23:19, Eric Blake wrote:
On 06/12/2014 09:02 AM, Peter Krempa wrote:
--- src/util/virstorageencryption.c | 39 +++++++++++++++++++++++++++++++++++++++ src/util/virstorageencryption.h | 2 ++ 2 files changed, 41 insertions(+)
I've definitely been wishing for this; as our struct gets more complicated, piece-wise transfers into a temporary will not be robust, while copying into a temporary can be.
Did you need to export the new symbol in libvirt_private.syms?
Currently it isn't needed as it's only used in the "src/util/" subdir.
...
ACK with problems fixed.
I'll hold of pushing this until the next revision when I'll figure out whether the approach is reasonable. Peter

--- src/util/virseclabel.c | 22 ++++++++++++++++++++++ src/util/virseclabel.h | 5 ++++- 2 files changed, 26 insertions(+), 1 deletion(-) diff --git a/src/util/virseclabel.c b/src/util/virseclabel.c index 5a4d78e..93c12cc 100644 --- a/src/util/virseclabel.c +++ b/src/util/virseclabel.c @@ -80,3 +80,25 @@ virSecurityDeviceLabelDefNew(const char *model) return seclabel; } + +virSecurityDeviceLabelDefPtr +virSecurityDeviceLabelDefCopy(const virSecurityDeviceLabelDef *src) +{ + virSecurityDeviceLabelDefPtr ret; + + if (VIR_ALLOC(ret) < 0) + return NULL; + + ret->norelabel = src->norelabel; + ret->labelskip = src->labelskip; + + if (VIR_STRDUP(ret->model, src->model) < 0 || + VIR_STRDUP(ret->label, src->label) < 0) + goto error; + + return ret; + + error: + virSecurityDeviceLabelDefFree(ret); + return NULL; +} diff --git a/src/util/virseclabel.h b/src/util/virseclabel.h index 9e970a5..993fd81 100644 --- a/src/util/virseclabel.h +++ b/src/util/virseclabel.h @@ -45,7 +45,7 @@ struct _virSecurityLabelDef { }; -/* Security configuration for domain */ +/* Security configuration for device */ typedef struct _virSecurityDeviceLabelDef virSecurityDeviceLabelDef; typedef virSecurityDeviceLabelDef *virSecurityDeviceLabelDefPtr; struct _virSecurityDeviceLabelDef { @@ -61,6 +61,9 @@ virSecurityLabelDefNew(const char *model); virSecurityDeviceLabelDefPtr virSecurityDeviceLabelDefNew(const char *model); +virSecurityDeviceLabelDefPtr +virSecurityDeviceLabelDefCopy(const virSecurityDeviceLabelDef *src); + void virSecurityLabelDefFree(virSecurityLabelDefPtr def); void virSecurityDeviceLabelDefFree(virSecurityDeviceLabelDefPtr def); -- 1.9.3

On 06/12/2014 09:02 AM, Peter Krempa wrote:
--- src/util/virseclabel.c | 22 ++++++++++++++++++++++ src/util/virseclabel.h | 5 ++++- 2 files changed, 26 insertions(+), 1 deletion(-)
+++ b/src/util/virseclabel.c @@ -80,3 +80,25 @@ virSecurityDeviceLabelDefNew(const char *model)
return seclabel; } + +virSecurityDeviceLabelDefPtr
Nit: 2 blank lines between functions.
@@ -61,6 +61,9 @@ virSecurityLabelDefNew(const char *model); virSecurityDeviceLabelDefPtr virSecurityDeviceLabelDefNew(const char *model);
+virSecurityDeviceLabelDefPtr +virSecurityDeviceLabelDefCopy(const virSecurityDeviceLabelDef *src);
ATTRIBUTE_NONNULL(1) ACK with nits fixed. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

It will also be reused later. --- src/util/virstoragefile.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c index 0792dd8..67c1471 100644 --- a/src/util/virstoragefile.c +++ b/src/util/virstoragefile.c @@ -1444,6 +1444,17 @@ virStorageNetHostDefFree(size_t nhosts, } +static void +virStoragePermsFree(virStoragePermsPtr def) +{ + if (!def) + return; + + VIR_FREE(def->label); + VIR_FREE(def); +} + + virStorageNetHostDefPtr virStorageNetHostDefCopy(size_t nhosts, virStorageNetHostDefPtr hosts) @@ -1557,10 +1568,7 @@ virStorageSourceClear(virStorageSourcePtr def) virSecurityDeviceLabelDefFree(def->seclabels[i]); VIR_FREE(def->seclabels); } - if (def->perms) { - VIR_FREE(def->perms->label); - VIR_FREE(def->perms); - } + virStoragePermsFree(def->perms); VIR_FREE(def->timestamps); virStorageNetHostDefFree(def->nhosts, def->hosts); -- 1.9.3

On 06/12/2014 09:02 AM, Peter Krempa wrote:
It will also be reused later. --- src/util/virstoragefile.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
+static void +virStoragePermsFree(virStoragePermsPtr def) +{ + if (!def) + return; + + VIR_FREE(def->label); + VIR_FREE(def);
VIR_FREE sets the local variable def to NULL...
+} + + virStorageNetHostDefPtr virStorageNetHostDefCopy(size_t nhosts, virStorageNetHostDefPtr hosts) @@ -1557,10 +1568,7 @@ virStorageSourceClear(virStorageSourcePtr def) virSecurityDeviceLabelDefFree(def->seclabels[i]); VIR_FREE(def->seclabels); } - if (def->perms) { - VIR_FREE(def->perms->label); - VIR_FREE(def->perms); - } + virStoragePermsFree(def->perms);
...but does not change def->perms. If virStorageSourceClear is ever used in a context where the storage remains live (that is, we are clearing the object with an intent to reuse it, rather than clearing it because we are about to free its container), then we risk a stale pointer being left around. Explicitly setting def->perms = NULL would avoid that risk. That said...
VIR_FREE(def->timestamps);
virStorageNetHostDefFree(def->nhosts, def->hosts);
...we already have other places that don't bother to clean up stale pointers, and a quick audit of the source shows that all existing callers of virStorageSourceClear immediately free the containing object. If it were worth the paranoia to set things to NULL, we should be doing it everywhere and not just on the code you are changing here. So: ACK as-is. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

Now that we have pointers to store disk source information and thus can easily exchange the structs behind we need a function to copy all the data. --- src/libvirt_private.syms | 1 + src/util/virstoragefile.c | 150 ++++++++++++++++++++++++++++++++++++++++++++++ src/util/virstoragefile.h | 2 + 3 files changed, 153 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 122c572..08e67b2 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -1889,6 +1889,7 @@ virStorageNetProtocolTypeToString; virStorageSourceAuthClear; virStorageSourceClear; virStorageSourceClearBackingStore; +virStorageSourceCopy; virStorageSourceFree; virStorageSourceGetActualType; virStorageSourceNewFromBacking; diff --git a/src/util/virstoragefile.c b/src/util/virstoragefile.c index 67c1471..8998807 100644 --- a/src/util/virstoragefile.c +++ b/src/util/virstoragefile.c @@ -1489,6 +1489,156 @@ virStorageNetHostDefCopy(size_t nhosts, } +static virStorageTimestampsPtr +virStorageTimestampsCopy(const virStorageTimestamps *src) +{ + virStorageTimestampsPtr ret; + + if (VIR_ALLOC(ret) < 0) + return NULL; + + memcpy(ret, src, sizeof(*src)); + + return ret; +} + + +static virStoragePermsPtr +virStoragePermsCopy(const virStoragePerms *src) +{ + virStoragePermsPtr ret; + + if (VIR_ALLOC(ret) < 0) + return NULL; + + ret->mode = src->mode; + ret->uid = src->uid; + ret->gid = src->gid; + + if (VIR_STRDUP(ret->label, src->label)) + goto error; + + return ret; + + error: + virStoragePermsFree(ret); + return NULL; +} + + +static virStorageSourcePoolDefPtr +virStorageSourcePoolDefCopy(const virStorageSourcePoolDef *src) +{ + virStorageSourcePoolDefPtr ret; + + if (VIR_ALLOC(ret) < 0) + return NULL; + + ret->voltype = src->voltype; + ret->pooltype = src->pooltype; + ret->actualtype = src->actualtype; + ret->mode = src->mode; + + if (VIR_STRDUP(ret->pool, src->pool) < 0 || + VIR_STRDUP(ret->volume, src->volume) < 0) + goto error; + + return ret; + + error: + virStorageSourcePoolDefFree(ret); + return NULL; +} + + +virStorageSourcePtr +virStorageSourceCopy(const virStorageSource *src, + bool backingChain) +{ + size_t i; + virStorageSourcePtr ret = NULL; + + if (VIR_ALLOC(ret) < 0) + return NULL; + + ret->type = src->type; + ret->protocol = src->protocol; + ret->format = src->format; + ret->allocation = src->allocation; + ret->capacity = src->capacity; + ret->backingRelative = src->backingRelative; + + /* storage driver metadata are not copied */ + ret->drv = NULL; + + if (VIR_STRDUP(ret->path, src->path) < 0 || + VIR_STRDUP(ret->volume, src->volume) < 0 || + VIR_STRDUP(ret->driverName, src->driverName) < 0 || + VIR_STRDUP(ret->relPath, src->relPath) < 0 || + VIR_STRDUP(ret->relDir, src->relDir) < 0 || + VIR_STRDUP(ret->backingStoreRaw, src->backingStoreRaw) < 0 || + VIR_STRDUP(ret->compat, src->compat) < 0 || + VIR_STRDUP(ret->auth.username, src->auth.username) < 0) + goto error; + + if (!(ret->hosts = virStorageNetHostDefCopy(src->nhosts, src->hosts))) + goto error; + ret->nhosts = src->nhosts; + + if (!(ret->srcpool = virStorageSourcePoolDefCopy(src->srcpool))) + goto error; + + if (!(ret->features = virBitmapNewCopy(src->features))) + goto error; + + if (!(ret->encryption = virStorageEncryptionCopy(src->encryption))) + goto error; + + if (!(ret->perms = virStoragePermsCopy(src->perms))) + goto error; + + if (!(ret->timestamps = virStorageTimestampsCopy(src->timestamps))) + goto error; + + ret->auth.secretType = src->auth.secretType; + switch ((virStorageSecretType) src->auth.secretType) { + case VIR_STORAGE_SECRET_TYPE_NONE: + case VIR_STORAGE_SECRET_TYPE_LAST: + break; + + case VIR_STORAGE_SECRET_TYPE_UUID: + memcpy(ret->auth.secret.uuid, src->auth.secret.uuid, VIR_UUID_BUFLEN); + break; + + case VIR_STORAGE_SECRET_TYPE_USAGE: + if (VIR_STRDUP(ret->auth.secret.usage, src->auth.secret.usage) < 0) + goto error; + break; + } + + if (backingChain && src->backingStore) { + if (!(ret->backingStore = virStorageSourceCopy(src->backingStore, + true))) + goto error; + } + + ret->nseclabels = src->nseclabels; + if (VIR_ALLOC_N(ret->seclabels, ret->nseclabels) < 0) + goto error; + + for (i = 0; i < ret->nseclabels; i++) { + if (!(ret->seclabels[i] = virSecurityDeviceLabelDefCopy(src->seclabels[i]))) + goto error; + } + + return ret; + + error: + virStorageSourceFree(ret); + return NULL; +} + + void virStorageSourcePoolDefFree(virStorageSourcePoolDefPtr def) { diff --git a/src/util/virstoragefile.h b/src/util/virstoragefile.h index 34b3625..cda9e76 100644 --- a/src/util/virstoragefile.h +++ b/src/util/virstoragefile.h @@ -324,6 +324,8 @@ int virStorageSourceGetActualType(virStorageSourcePtr def); void virStorageSourceFree(virStorageSourcePtr def); void virStorageSourceClearBackingStore(virStorageSourcePtr def); virStorageSourcePtr virStorageSourceNewFromBacking(virStorageSourcePtr parent); +virStorageSourcePtr virStorageSourceCopy(const virStorageSource *src, + bool backingChain); #endif /* __VIR_STORAGE_FILE_H__ */ -- 1.9.3

On 06/12/2014 09:02 AM, Peter Krempa wrote:
Now that we have pointers to store disk source information and thus can easily exchange the structs behind we need a function to copy all the data. --- src/libvirt_private.syms | 1 + src/util/virstoragefile.c | 150 ++++++++++++++++++++++++++++++++++++++++++++++ src/util/virstoragefile.h | 2 + 3 files changed, 153 insertions(+)
+virStorageSourcePtr +virStorageSourceCopy(const virStorageSource *src, + bool backingChain) +{ + size_t i; + virStorageSourcePtr ret = NULL; + + if (VIR_ALLOC(ret) < 0) + return NULL; + + ret->type = src->type; + ret->protocol = src->protocol; + ret->format = src->format; + ret->allocation = src->allocation; + ret->capacity = src->capacity; + ret->backingRelative = src->backingRelative;
If I'm not mistaken, backingRelative disappears in your other series; I'm guessing you plan on rebasing that on top of this to pick up on the easier semantics.
+ + /* storage driver metadata are not copied */ + ret->drv = NULL;
Seems okay, particularly since doing a deep copy of that would require callback functions in the storage drivers.
+ + if (VIR_STRDUP(ret->path, src->path) < 0 || + VIR_STRDUP(ret->volume, src->volume) < 0 || + VIR_STRDUP(ret->driverName, src->driverName) < 0 || + VIR_STRDUP(ret->relPath, src->relPath) < 0 || + VIR_STRDUP(ret->relDir, src->relDir) < 0 || + VIR_STRDUP(ret->backingStoreRaw, src->backingStoreRaw) < 0 || + VIR_STRDUP(ret->compat, src->compat) < 0 || + VIR_STRDUP(ret->auth.username, src->auth.username) < 0) + goto error; ... + + ret->auth.secretType = src->auth.secretType; + switch ((virStorageSecretType) src->auth.secretType) { + case VIR_STORAGE_SECRET_TYPE_NONE: + case VIR_STORAGE_SECRET_TYPE_LAST: + break;
It might be worth adding an intermediate patch to break out the embedded auth struct into a formal top-level struct, then add a function that copies an auth struct, rather than having to inline all the nested struct work here. But for now I can live with it.
+ if (backingChain && src->backingStore) { + if (!(ret->backingStore = virStorageSourceCopy(src->backingStore, + true))) + goto error; + }
Nice - we have the choice of shallow or deep chain copy. Random thought - I wonder if it would be better to make virStorageSourcePtr inherit from virObject, and make it reference-counted, so that we can share a single object among multiple users rather than copying users right and left. But even if it is a good idea, it's enough refactoring that I don't think we need to tackle it right away.
+ + ret->nseclabels = src->nseclabels; + if (VIR_ALLOC_N(ret->seclabels, ret->nseclabels) < 0) + goto error;
Maybe consider swapping those two lines. Unlike my comment in patch 1/5, in _this_ case, we were careful enough that virStorageSourceClear checks seclabels for non-NULL before paying attention to nseclabels. But as we are not consistently careful in the clear/free functions, it's more robust to just get in the habit of setting the count after the allocation is successful without having to check a particular free function. Hmm - do I dare suggest simplifying virStorageSourceClear to unconditionally iterate over nseclabels instead of hiding it in an 'if (def->seclabels)' block? Or is it better form to suggest that all cleanup functions be audited to be robust to non-zero count but NULL arrays?
+++ b/src/util/virstoragefile.h @@ -324,6 +324,8 @@ int virStorageSourceGetActualType(virStorageSourcePtr def); void virStorageSourceFree(virStorageSourcePtr def); void virStorageSourceClearBackingStore(virStorageSourcePtr def); virStorageSourcePtr virStorageSourceNewFromBacking(virStorageSourcePtr parent); +virStorageSourcePtr virStorageSourceCopy(const virStorageSource *src, + bool backingChain);
ATTRIBUTE_NONNULL(1) ACK -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 06/13/14 00:38, Eric Blake wrote:
On 06/12/2014 09:02 AM, Peter Krempa wrote:
Now that we have pointers to store disk source information and thus can easily exchange the structs behind we need a function to copy all the data. --- src/libvirt_private.syms | 1 + src/util/virstoragefile.c | 150 ++++++++++++++++++++++++++++++++++++++++++++++ src/util/virstoragefile.h | 2 + 3 files changed, 153 insertions(+)
+virStorageSourcePtr +virStorageSourceCopy(const virStorageSource *src, + bool backingChain) +{ + size_t i; + virStorageSourcePtr ret = NULL; + + if (VIR_ALLOC(ret) < 0) + return NULL; + + ret->type = src->type; + ret->protocol = src->protocol; + ret->format = src->format; + ret->allocation = src->allocation; + ret->capacity = src->capacity; + ret->backingRelative = src->backingRelative;
If I'm not mistaken, backingRelative disappears in your other series; I'm guessing you plan on rebasing that on top of this to pick up on the easier semantics.
+ + /* storage driver metadata are not copied */ + ret->drv = NULL;
Seems okay, particularly since doing a deep copy of that would require callback functions in the storage drivers.
Actually I'm planning on adding reference counting to the storage driver access data so that we will be able to initialize and de-initialize it in a nested way. This place would also benefit from that as you'd gain a initialized file. Not sure though whether that's feasible yet as we don't always initialize it. For now I'll leave it this way and in the future we might want to change it.

Until now we were changing information about the disk source via multiple steps of copying data. Now that we changed to a pointer to store the disk source we might use it to change the approach to track the data. Additionally this will allow proper tracking of the backing chain. --- src/qemu/qemu_driver.c | 108 +++++++++---------------------------------------- 1 file changed, 19 insertions(+), 89 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index e147d28..92abdc9 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -12842,14 +12842,10 @@ qemuDomainSnapshotCreateSingleDiskActive(virQEMUDriverPtr driver, enum qemuDomainAsyncJob asyncJob) { qemuDomainObjPrivatePtr priv = vm->privateData; + virStorageSourcePtr tmp; char *device = NULL; char *source = NULL; - char *newsource = NULL; - virStorageNetHostDefPtr newhosts = NULL; - virStorageNetHostDefPtr persistHosts = NULL; - int format = snap->src->format; const char *formatStr = NULL; - char *persistSource = NULL; int ret = -1; int fd = -1; bool need_unlink = false; @@ -12863,26 +12859,12 @@ qemuDomainSnapshotCreateSingleDiskActive(virQEMUDriverPtr driver, if (virAsprintf(&device, "drive-%s", disk->info.alias) < 0) goto cleanup; - /* XXX Here, we know we are about to alter disk->src->backingStore if - * successful, so we nuke the existing chain so that future commands will - * recompute it. Better would be storing the chain ourselves rather than - * reprobing, but this requires modifying domain_conf and our XML to fully - * track the chain across libvirtd restarts. */ - virStorageSourceClearBackingStore(disk->src); - if (virStorageFileInit(snap->src) < 0) goto cleanup; if (qemuGetDriveSourceString(snap->src, NULL, &source) < 0) goto cleanup; - if (VIR_STRDUP(newsource, snap->src->path) < 0) - goto cleanup; - - if (persistDisk && - VIR_STRDUP(persistSource, snap->src->path) < 0) - goto cleanup; - switch ((virStorageType)snap->src->type) { case VIR_STORAGE_TYPE_BLOCK: reuse = true; @@ -12910,15 +12892,6 @@ qemuDomainSnapshotCreateSingleDiskActive(virQEMUDriverPtr driver, case VIR_STORAGE_TYPE_NETWORK: switch (snap->src->protocol) { case VIR_STORAGE_NET_PROTOCOL_GLUSTER: - if (!(newhosts = virStorageNetHostDefCopy(snap->src->nhosts, - snap->src->hosts))) - goto cleanup; - - if (persistDisk && - !(persistHosts = virStorageNetHostDefCopy(snap->src->nhosts, - snap->src->hosts))) - goto cleanup; - break; default: @@ -12969,33 +12942,18 @@ qemuDomainSnapshotCreateSingleDiskActive(virQEMUDriverPtr driver, /* Update vm in place to match changes. */ need_unlink = false; - VIR_FREE(disk->src->path); - virStorageNetHostDefFree(disk->src->nhosts, disk->src->hosts); - - disk->src->path = newsource; - disk->src->format = format; - disk->src->type = snap->src->type; - disk->src->protocol = snap->src->protocol; - disk->src->nhosts = snap->src->nhosts; - disk->src->hosts = newhosts; + if (!(tmp = virStorageSourceCopy(snap->src, false))) + goto cleanup; - newsource = NULL; - newhosts = NULL; + tmp->backingStore = disk->src; + disk->src = tmp; if (persistDisk) { - VIR_FREE(persistDisk->src->path); - virStorageNetHostDefFree(persistDisk->src->nhosts, - persistDisk->src->hosts); - - persistDisk->src->path = persistSource; - persistDisk->src->format = format; - persistDisk->src->type = snap->src->type; - persistDisk->src->protocol = snap->src->protocol; - persistDisk->src->nhosts = snap->src->nhosts; - persistDisk->src->hosts = persistHosts; + if (!(tmp = virStorageSourceCopy(snap->src, false))) + goto cleanup; - persistSource = NULL; - persistHosts = NULL; + tmp->backingStore = persistDisk->src; + persistDisk->src = tmp; } cleanup: @@ -13004,10 +12962,6 @@ qemuDomainSnapshotCreateSingleDiskActive(virQEMUDriverPtr driver, virStorageFileDeinit(snap->src); VIR_FREE(device); VIR_FREE(source); - VIR_FREE(newsource); - VIR_FREE(persistSource); - virStorageNetHostDefFree(snap->src->nhosts, newhosts); - virStorageNetHostDefFree(snap->src->nhosts, persistHosts); return ret; } @@ -13017,21 +12971,15 @@ qemuDomainSnapshotCreateSingleDiskActive(virQEMUDriverPtr driver, static void qemuDomainSnapshotUndoSingleDiskActive(virQEMUDriverPtr driver, virDomainObjPtr vm, - virDomainDiskDefPtr origdisk, virDomainDiskDefPtr disk, virDomainDiskDefPtr persistDisk, bool need_unlink) { - char *source = NULL; - char *persistSource = NULL; + virStorageSourcePtr tmp; struct stat st; ignore_value(virStorageFileInit(disk->src)); - if (VIR_STRDUP(source, origdisk->src->path) < 0 || - (persistDisk && VIR_STRDUP(persistSource, source) < 0)) - goto cleanup; - qemuDomainPrepareDiskChainElement(driver, vm, disk, disk->src, VIR_DISK_CHAIN_NO_ACCESS); if (need_unlink && @@ -13039,35 +12987,18 @@ qemuDomainSnapshotUndoSingleDiskActive(virQEMUDriverPtr driver, virStorageFileUnlink(disk->src) < 0) VIR_WARN("Unable to remove just-created %s", disk->src->path); + virStorageFileDeinit(disk->src); + /* Update vm in place to match changes. */ - VIR_FREE(disk->src->path); - disk->src->path = source; - source = NULL; - disk->src->format = origdisk->src->format; - disk->src->type = origdisk->src->type; - disk->src->protocol = origdisk->src->protocol; - virStorageNetHostDefFree(disk->src->nhosts, disk->src->hosts); - disk->src->nhosts = origdisk->src->nhosts; - disk->src->hosts = virStorageNetHostDefCopy(origdisk->src->nhosts, - origdisk->src->hosts); + tmp = disk->src->backingStore; + virStorageSourceFree(disk->src); + disk->src = tmp; + if (persistDisk) { - VIR_FREE(persistDisk->src->path); - persistDisk->src->path = persistSource; - persistSource = NULL; - persistDisk->src->format = origdisk->src->format; - persistDisk->src->type = origdisk->src->type; - persistDisk->src->protocol = origdisk->src->protocol; - virStorageNetHostDefFree(persistDisk->src->nhosts, - persistDisk->src->hosts); - persistDisk->src->nhosts = origdisk->src->nhosts; - persistDisk->src->hosts = virStorageNetHostDefCopy(origdisk->src->nhosts, - origdisk->src->hosts); + tmp = persistDisk->src->backingStore; + virStorageSourceFree(persistDisk->src); + persistDisk->src = tmp; } - - cleanup: - virStorageFileDeinit(disk->src); - VIR_FREE(source); - VIR_FREE(persistSource); } /* The domain is expected to be locked and active. */ @@ -13170,7 +13101,6 @@ qemuDomainSnapshotCreateDiskActive(virQEMUDriverPtr driver, } qemuDomainSnapshotUndoSingleDiskActive(driver, vm, - snap->def->dom->disks[i], vm->def->disks[i], persistDisk, need_unlink); -- 1.9.3

[warning - long email ahead. tl;dr: this patch is not ready yet, because I found a problem in the design] On 06/12/2014 09:02 AM, Peter Krempa wrote:
Until now we were changing information about the disk source via multiple steps of copying data. Now that we changed to a pointer to store the disk source we might use it to change the approach to track the data.
Additionally this will allow proper tracking of the backing chain. --- src/qemu/qemu_driver.c | 108 +++++++++---------------------------------------- 1 file changed, 19 insertions(+), 89 deletions(-)
It's also a nice diffstat :)
@@ -12863,26 +12859,12 @@ qemuDomainSnapshotCreateSingleDiskActive(virQEMUDriverPtr driver, if (virAsprintf(&device, "drive-%s", disk->info.alias) < 0) goto cleanup;
- /* XXX Here, we know we are about to alter disk->src->backingStore if - * successful, so we nuke the existing chain so that future commands will - * recompute it. Better would be storing the chain ourselves rather than - * reprobing, but this requires modifying domain_conf and our XML to fully - * track the chain across libvirtd restarts. */ - virStorageSourceClearBackingStore(disk->src); -
Yay - moving the code in the direction that the comments hinted we should be moving.
if (virStorageFileInit(snap->src) < 0) goto cleanup;
if (qemuGetDriveSourceString(snap->src, NULL, &source) < 0) goto cleanup;
- if (VIR_STRDUP(newsource, snap->src->path) < 0) - goto cleanup; - - if (persistDisk && - VIR_STRDUP(persistSource, snap->src->path) < 0) - goto cleanup;
One thing this code was trying to do is hoist all allocation up front, so that if we hit OOM, we have not made any permanent changes and can therefore safely roll back.
@@ -12969,33 +12942,18 @@ qemuDomainSnapshotCreateSingleDiskActive(virQEMUDriverPtr driver, /* Update vm in place to match changes. */ need_unlink = false;
- VIR_FREE(disk->src->path); - virStorageNetHostDefFree(disk->src->nhosts, disk->src->hosts); - - disk->src->path = newsource; - disk->src->format = format; - disk->src->type = snap->src->type; - disk->src->protocol = snap->src->protocol; - disk->src->nhosts = snap->src->nhosts; - disk->src->hosts = newhosts; + if (!(tmp = virStorageSourceCopy(snap->src, false))) + goto cleanup;
But here, you've delayed the allocation until after we've already made some changes. You'll need to hoist the call to virStorageSourceCopy() to occur in the same place as the code it is replacing.
- newsource = NULL; - newhosts = NULL; + tmp->backingStore = disk->src; + disk->src = tmp;
This line looks like it is in the correct location - we are merely manipulating the linked list in place to insert the newly-created wrapper. I do see a potential problem, though. Lots of graphics below, in three major demo sections: ================== In the _existing_ code, we are doing in-place modification of _portions_ of the storage source, then calling the security/audit/lease manager on the modified source. Why? Because the portions of the disk that we do NOT modify include the labeling information, and that's in part because historically we had multiple pieces of labeling information stored only once at the disk level. Graphically, our current behavior has a lifecycle something like this, when booting a domain with a chain of 2 and snapshotting it into a chain of 3: at parse time, we only learn the top level, and the fact that we need to generate multiple labels: disk: dev = vda src = +-------------------+ + path = mid + + rwlabel = NULL + + rolabel = NULL + + backing = NULL + + mid is unlabeled + +-------------------+ at domain start time, we then probe the whole chain, which learns more file names, but does not populate label information in the backing files; we also perform labeling: disk: dev = vda src = +-------------------+ +--------------------+ + path = mid + + path = base + + rwlabel = foo + + rwlabel = NULL + + rolabel = bar + + rolabel = NULL + + backing = ---------->+ backing = NULL + + mid has label foo + + base has label bar + +-------------------+ +--------------------+ Now the snapshot comes along, and we are adding a new layer: snapshot: dev = vda src = +-------------------+ + path = top + + rwlabel = NULL + + rolabel = NULL + + backing = NULL + + top is unlabeled + +-------------------+ so we label it - but with what label? The label was tied to 'mid'. Hence, the in-place swapping, so that we have: disk: dev = vda src = +-------------------+ + path = top + + rwlabel = foo + + rolabel = bar + + backing = NULL + + top has label foo + +-------------------+ long enough to give us a label. If successful, the code takes a shortcut at this point, and just reprobes the chain, finally leaving us with the desired chain (if unsuccessful, we undo the temporary changes by restoring src to mid, and backing to point to base). But note that the reprobe does NOT know what label backing elements actually have; what's more, mid is still labeled 'foo' for read-write privileges, even though after the snapshot we could have safely downgraded it to label 'bar' for read-only privileges: disk: dev = vda src = +-------------------+ +-------------------+ +--------------------+ + src = top + + src = mid + + src = base + + rwlabel = foo + + rwlabel = NULL + + rwlabel = NULL + + rolabel = bar + + rolabel = NULL + + rolabel = NULL + + backing = ---------->+ backing = ---------->+ backing = NULL + + top has label foo + + mid has label ?? + + base has label ?? + +-------------------+ +-------------------+ +--------------------+ =============== But with your code, merely inserting (a copy of) the snapshot source at the front of the chain would leave us with: disk: dev = vda src = +-------------------+ +-------------------+ +--------------------+ + src = top + + src = mid + + src = base + + rwlabel = NULL + + rwlabel = foo + + rwlabel = NULL + + rolabel = NULL + + rolabel = bar + + rolabel = NULL + + backing = ---------->+ backing = ---------->+ backing = NULL + + top has label foo + + mid has label foo + + base has label bar + +-------------------+ +-------------------+ +--------------------+ it has the benefit of remembering what we have previously labeled things, but has the absolute failure of forgetting what labels we have generated for the <disk> (since those labels are buried in the backing chain instead of the top element of the chain). For your code to work, you have to ALSO transfer the security information into the top of the chain, probably best done via a new helper function. =============== Here's what I envision that we should be doing instead. The existing code is tracking two different labels at the top level source, and none at any other layer. Instead, we should be tracking both read-write and read-only label templates at the <disk> level, then for any given virStorageSourcePtr, have a notion only of a current label. The security manager only needs access to the disk label templates (the generated read-write and read-only labels don't change over the life of the disk), as well as the current label per disk, along with the ability to swap that label (we basically have three label states: unlabeled, read-only, and read-write - and use the security manager to switch between those states). So the same lifecycle now looks more like: original parse: disk: dev = vda rwlabel = NULL rolabel = NULL src = +-------------------+ + path = mid + + backing = NULL + + mid is unlabeled + +-------------------+ probe the backing chain, and generate labels: disk: dev = vda rwlabel = foo rolabel = bar src = +-------------------+ +--------------------+ + path = mid + + path = base + + backing = ---------->+ backing = NULL + + mid has label foo + + base has label bar + +-------------------+ +--------------------+ Parse the snapshot, and pass THAT storage source pointer to the security code, but reusing the label template from the disk, so that we get the correct label in-place: snapshot: dev = vda src = +-------------------+ + path = top + + backing = NULL + + top has label foo + +-------------------+ Wire up the chain: disk: dev = vda rwlabel = foo rolabel = bar src = +-------------------+ +-------------------+ +--------------------+ + src = top + + src = mid + + src = base + + backing = ---------->+ backing = ---------->+ backing = NULL + + top has label foo + + mid has label foo + + base has label bar + +-------------------+ +-------------------+ +--------------------+ and finally revoke the read-write label on mid: disk: dev = vda rwlabel = foo rolabel = bar src = +-------------------+ +-------------------+ +--------------------+ + src = top + + src = mid + + src = base + + backing = ---------->+ backing = ---------->+ backing = NULL + + top has label foo + + mid has label bar + + base has label bar + +-------------------+ +-------------------+ +--------------------+ and if a read-only backing file is shared by more than one chain (or even across more than one <domain>), now you can see why reference counting becomes important, for knowing when it is safe to revoke read-write or even to forbid granting read-write. Furthermore, once we start allowing backing chain as user input instead of crawling the chain ourselves, the per-source label can be overridden for any element of the chain, instead of the current limitation of only having an override label at the top level. I don't think we are going to get to the idea setup of security management overnight, so in the meantime, we need some crutches to make sure that your code is not botching things. ================= So, with that explanation out of the way, I'm resuming the review. I think we can get away with your method in the short term (the 2nd of the 3 sections above) IF you also manage to transfer labeling information to the active layer of the chain, but this patch needs a respin because in its current state it does not do that.
if (persistDisk) { - VIR_FREE(persistDisk->src->path); - virStorageNetHostDefFree(persistDisk->src->nhosts, - persistDisk->src->hosts); - - persistDisk->src->path = persistSource; - persistDisk->src->format = format; - persistDisk->src->type = snap->src->type; - persistDisk->src->protocol = snap->src->protocol; - persistDisk->src->nhosts = snap->src->nhosts; - persistDisk->src->hosts = persistHosts; + if (!(tmp = virStorageSourceCopy(snap->src, false))) + goto cleanup;
Again, this is a late allocation; we risk OOM after making changes that can't be undone. I'd rather have two temporary virStorageSourcePtr objects allocated up front, and then plugged in at the end on success, rather than risk late allocation failure.
@@ -13017,21 +12971,15 @@ qemuDomainSnapshotCreateSingleDiskActive(virQEMUDriverPtr driver, static void qemuDomainSnapshotUndoSingleDiskActive(virQEMUDriverPtr driver, virDomainObjPtr vm, - virDomainDiskDefPtr origdisk, virDomainDiskDefPtr disk, virDomainDiskDefPtr persistDisk, bool need_unlink)
Arggh - I ran out of time to review the rest. But hopefully this gives you some ideas to think about. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org
participants (2)
-
Eric Blake
-
Peter Krempa