[libvirt] [PATCH v2 0/3] Honour sparse files on migration

When the storage is pre-created on the destination, it might have been allocated as sparse on the source. Let's keep it that way if we can. diff to v1: - in 3/3 I've reworked the handling of @allocation attribute as John suggested Michal Privoznik (3): qemuMigrationPrecreateStorage: Fix debug message qemuMigrationCookieNBD: Turn some items into their own struct qemuMigrationPrecreateDisk: Preserve sparse files src/qemu/qemu_migration.c | 48 ++++++++++++++++++++++++++++++++++++----------- 1 file changed, 37 insertions(+), 11 deletions(-) -- 2.0.5

When pre-creating storage for domains, we need to find corresponding disk in the XML on the destination (domain XML may differ there, e.g. disk is accessible under different path). For better debugging, I'm printing all info I received on a disk. But there was a typo when printing the disk capacity: "%lluu" instead of "%llu". Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_migration.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 29f5173..3a05346 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1581,7 +1581,7 @@ qemuMigrationPrecreateStorage(virConnectPtr conn, int indx; const char *diskSrcPath; - VIR_DEBUG("Looking up disk target '%s' (capacity=%lluu)", + VIR_DEBUG("Looking up disk target '%s' (capacity=%llu)", nbd->disks[i].target, nbd->disks[i].capacity); if ((indx = virDomainDiskIndexByName(vm->def, -- 2.0.5

On Fri, Apr 10, 2015 at 17:25:39 +0200, Michal Privoznik wrote:
When pre-creating storage for domains, we need to find corresponding disk in the XML on the destination (domain XML may differ there, e.g. disk is accessible under different path). For better debugging, I'm printing all info I received on a disk. But there was a typo when printing the disk capacity: "%lluu" instead of "%llu".
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_migration.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
ACK, Peter

It will be more future proof if we just pass a pointer to a single NBD disk representation to the qemuMigrationPrecreateDisk() instead of copying every single item onto the stack. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_migration.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 3a05346..d4757e4 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -139,16 +139,20 @@ struct _qemuMigrationCookieNetwork { qemuMigrationCookieNetDataPtr net; }; +typedef struct _qemuMigrationCookieNBDDisk qemuMigrationCookieNBDDisk; +typedef qemuMigrationCookieNBDDisk *qemuMigrationCookieNBDDiskPtr; +struct _qemuMigrationCookieNBDDisk { + char *target; /* Disk target */ + unsigned long long capacity; /* And its capacity */ +}; + typedef struct _qemuMigrationCookieNBD qemuMigrationCookieNBD; typedef qemuMigrationCookieNBD *qemuMigrationCookieNBDPtr; struct _qemuMigrationCookieNBD { int port; /* on which port does NBD server listen for incoming data */ size_t ndisks; /* Number of items in @disk array */ - struct { - char *target; /* Disk target */ - unsigned long long capacity; /* And its capacity */ - } *disks; + qemuMigrationCookieNBDDiskPtr disks; }; typedef struct _qemuMigrationCookie qemuMigrationCookie; @@ -1460,7 +1464,7 @@ qemuMigrationRestoreDomainState(virConnectPtr conn, virDomainObjPtr vm) static int qemuMigrationPrecreateDisk(virConnectPtr conn, virDomainDiskDefPtr disk, - unsigned long long capacity) + qemuMigrationCookieNBDDiskPtr nbd) { int ret = -1; virStoragePoolPtr pool = NULL; @@ -1536,7 +1540,7 @@ qemuMigrationPrecreateDisk(virConnectPtr conn, virBufferAddLit(&buf, "<volume>\n"); virBufferAdjustIndent(&buf, 2); virBufferEscapeString(&buf, "<name>%s</name>\n", volName); - virBufferAsprintf(&buf, "<capacity>%llu</capacity>\n", capacity); + virBufferAsprintf(&buf, "<capacity>%llu</capacity>\n", nbd->capacity); virBufferAddLit(&buf, "<target>\n"); virBufferAdjustIndent(&buf, 2); virBufferAsprintf(&buf, "<format type='%s'/>\n", format); @@ -1603,7 +1607,7 @@ qemuMigrationPrecreateStorage(virConnectPtr conn, VIR_DEBUG("Proceeding with disk source %s", NULLSTR(diskSrcPath)); - if (qemuMigrationPrecreateDisk(conn, disk, nbd->disks[i].capacity) < 0) + if (qemuMigrationPrecreateDisk(conn, disk, nbd->disks + i) < 0) goto cleanup; } -- 2.0.5

https://bugzilla.redhat.com/show_bug.cgi?id=817700 When pre-creating a disk on the destination, a volume XML is constructed. The XML is then passed to virStorageVolCreateXML() which does the work. But, since there's no <allocation/> in the XML, the disk are fully allocated. This possibly breaks sparse allocation user has on the migration source. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_migration.c | 30 ++++++++++++++++++++++++++---- 1 file changed, 26 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index d4757e4..7a40548 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -144,6 +144,7 @@ typedef qemuMigrationCookieNBDDisk *qemuMigrationCookieNBDDiskPtr; struct _qemuMigrationCookieNBDDisk { char *target; /* Disk target */ unsigned long long capacity; /* And its capacity */ + unsigned long long allocation; /* And its allocation */ }; typedef struct _qemuMigrationCookieNBD qemuMigrationCookieNBD; @@ -593,6 +594,7 @@ qemuMigrationCookieAddNBD(qemuMigrationCookiePtr mig, disk->dst) < 0) goto cleanup; mig->nbd->disks[mig->nbd->ndisks].capacity = entry->capacity; + mig->nbd->disks[mig->nbd->ndisks].allocation = entry->wr_highest_offset; mig->nbd->ndisks++; } @@ -833,8 +835,12 @@ qemuMigrationCookieXMLFormat(virQEMUDriverPtr driver, for (i = 0; i < mig->nbd->ndisks; i++) { virBufferEscapeString(buf, "<disk target='%s'", mig->nbd->disks[i].target); - virBufferAsprintf(buf, " capacity='%llu'/>\n", + virBufferAsprintf(buf, " capacity='%llu'", mig->nbd->disks[i].capacity); + if (mig->nbd->disks[i].capacity != mig->nbd->disks[i].allocation) + virBufferAsprintf(buf, " allocation='%llu'", + mig->nbd->disks[i].allocation); + virBufferAddLit(buf, "/>\n"); } virBufferAdjustIndent(buf, -2); virBufferAddLit(buf, "</nbd>\n"); @@ -972,7 +978,7 @@ static qemuMigrationCookieNBDPtr qemuMigrationCookieNBDXMLParse(xmlXPathContextPtr ctxt) { qemuMigrationCookieNBDPtr ret = NULL; - char *port = NULL, *capacity = NULL; + char *port = NULL, *capacity = NULL, *allocation = NULL; size_t i; int n; xmlNodePtr *disks = NULL; @@ -1000,6 +1006,7 @@ qemuMigrationCookieNBDXMLParse(xmlXPathContextPtr ctxt) for (i = 0; i < n; i++) { ctxt->node = disks[i]; VIR_FREE(capacity); + VIR_FREE(allocation); if (!(ret->disks[i].target = virXPathString("string(./@target)", ctxt))) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", @@ -1016,12 +1023,24 @@ qemuMigrationCookieNBDXMLParse(xmlXPathContextPtr ctxt) NULLSTR(capacity)); goto error; } + + allocation = virXPathString("string(./@allocation)", ctxt); + if (allocation) { + if (virStrToLong_ull(allocation, NULL, 10, + &ret->disks[i].allocation) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("Malformed disk allocation: '%s'"), + allocation); + goto error; + } + } } } cleanup: VIR_FREE(port); VIR_FREE(capacity); + VIR_FREE(allocation); VIR_FREE(disks); ctxt->node = save_ctxt; return ret; @@ -1541,6 +1560,8 @@ qemuMigrationPrecreateDisk(virConnectPtr conn, virBufferAdjustIndent(&buf, 2); virBufferEscapeString(&buf, "<name>%s</name>\n", volName); virBufferAsprintf(&buf, "<capacity>%llu</capacity>\n", nbd->capacity); + if (nbd->allocation) + virBufferAsprintf(&buf, "<allocation>%llu</allocation>\n", nbd->allocation); virBufferAddLit(&buf, "<target>\n"); virBufferAdjustIndent(&buf, 2); virBufferAsprintf(&buf, "<format type='%s'/>\n", format); @@ -1585,8 +1606,9 @@ qemuMigrationPrecreateStorage(virConnectPtr conn, int indx; const char *diskSrcPath; - VIR_DEBUG("Looking up disk target '%s' (capacity=%llu)", - nbd->disks[i].target, nbd->disks[i].capacity); + VIR_DEBUG("Looking up disk target '%s' (capacity=%llu allocation=%llu)", + nbd->disks[i].target, nbd->disks[i].capacity, + nbd->disks[i].allocation); if ((indx = virDomainDiskIndexByName(vm->def, nbd->disks[i].target, false)) < 0) { -- 2.0.5

On Fri, Apr 10, 2015 at 17:25:41 +0200, Michal Privoznik wrote:
This patch certainly does not resolve this bug. See below ...
When pre-creating a disk on the destination, a volume XML is constructed. The XML is then passed to virStorageVolCreateXML() which does the work. But, since there's no <allocation/> in the XML, the disk are fully allocated. This possibly breaks sparse allocation user has on the migration source.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/qemu/qemu_migration.c | 30 ++++++++++++++++++++++++++---- 1 file changed, 26 insertions(+), 4 deletions(-)
diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index d4757e4..7a40548 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -144,6 +144,7 @@ typedef qemuMigrationCookieNBDDisk *qemuMigrationCookieNBDDiskPtr; struct _qemuMigrationCookieNBDDisk { char *target; /* Disk target */ unsigned long long capacity; /* And its capacity */ + unsigned long long allocation; /* And its allocation */ };
typedef struct _qemuMigrationCookieNBD qemuMigrationCookieNBD; @@ -593,6 +594,7 @@ qemuMigrationCookieAddNBD(qemuMigrationCookiePtr mig, disk->dst) < 0) goto cleanup; mig->nbd->disks[mig->nbd->ndisks].capacity = entry->capacity; + mig->nbd->disks[mig->nbd->ndisks].allocation = entry->wr_highest_offset;
This "allocation" value works only for thin-provisioned LVM with qcow2 inside. Qemu docs state the following: - "wr_highest_offset": Highest offset of a sector written since the BlockDriverState has been opened (json-int) As the documentation hints this information doesn't make much sense for regular files. The file may (and certainly will) be sparse in between of the start and the highest offset. Additionally since the docs state that the value is "since it has been opened" the number may actually be lower than the count of blocks that are used.
mig->nbd->ndisks++; }
...
@@ -1541,6 +1560,8 @@ qemuMigrationPrecreateDisk(virConnectPtr conn, virBufferAdjustIndent(&buf, 2); virBufferEscapeString(&buf, "<name>%s</name>\n", volName); virBufferAsprintf(&buf, "<capacity>%llu</capacity>\n", nbd->capacity); + if (nbd->allocation) + virBufferAsprintf(&buf, "<allocation>%llu</allocation>\n", nbd->allocation);
You add this to the snippet that is used to pre-create the file, but ..
virBufferAddLit(&buf, "<target>\n"); virBufferAdjustIndent(&buf, 2); virBufferAsprintf(&buf, "<format type='%s'/>\n", format);
The @flags variable in this function is never touched for raw files. For qcow2 files, full preallocation is not supported. This means that the @allocation info is never used.
@@ -1585,8 +1606,9 @@ qemuMigrationPrecreateStorage(virConnectPtr conn, int indx; const char *diskSrcPath;
- VIR_DEBUG("Looking up disk target '%s' (capacity=%llu)", - nbd->disks[i].target, nbd->disks[i].capacity); + VIR_DEBUG("Looking up disk target '%s' (capacity=%llu allocation=%llu)", + nbd->disks[i].target, nbd->disks[i].capacity, + nbd->disks[i].allocation);
if ((indx = virDomainDiskIndexByName(vm->def, nbd->disks[i].target, false)) < 0) {
The bugreport states that the user actually pre-created the file on the destination as sparse so everything in this patch actually won't fix the original bug. The problem is that the block-copy job in qemu copies the entire disk even with blocks that were not touched. For the reporter the fix would probably be to use qcow2 as it should only copy the blocks that were touched by the VM since qemu has the information. As for this patch: NACK, the code you add doesn't do anything useful. Peter
participants (2)
-
Michal Privoznik
-
Peter Krempa