May 2021 - Devel - libvirt List Archives

[PATCH v4] libxl: adjust handling of libxl_device_nic objects
by Olaf Hering 27 May '21

27 May '21

libxl objects are supposed to be initialized and disposed. Adjust libxlMakeNic to use an already initialized object, it is owned by the caller. Adjust libxlMakeNicList to initialize the list of objects, before they are filled by libxlMakeNic. In case of error the objects are disposed by libxl_domain_config_dispose. Signed-off-by: Olaf Hering <olaf(a)aepfle.de> --- src/libxl/libxl_conf.c | 11 ++--------- 1 file changed, 2 insertions(+), 9 deletions(-) diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c index 2ecbcf6911..c672bafbe9 100644 --- a/src/libxl/libxl_conf.c +++ b/src/libxl/libxl_conf.c @@ -1335,8 +1335,6 @@ libxlMakeNic(virDomainDef *def, return -1; } - libxl_device_nic_init(x_nic); - virMacAddrGetRaw(&l_nic->mac, x_nic->mac); /* @@ -1531,8 +1529,9 @@ libxlMakeNicList(virDomainDef *def, libxl_domain_config *d_config) if (virDomainNetGetActualType(l_nics[i]) == VIR_DOMAIN_NET_TYPE_HOSTDEV) continue; + libxl_device_nic_init(&x_nics[nvnics]); if (libxlMakeNic(def, l_nics[i], &x_nics[nvnics], false)) - goto error; + return -1; /* * The devid (at least right now) will not get initialized by * libxl in the setup case but is required for starting the @@ -1549,12 +1548,6 @@ libxlMakeNicList(virDomainDef *def, libxl_domain_config *d_config) d_config->num_nics = nvnics; return 0; - - error: - for (i = 0; i < nnics; i++) - libxl_device_nic_dispose(&x_nics[i]); - VIR_FREE(x_nics); - return -1; } int

2 2

[PATCH RESEND v3 0/8] cleanup meson checks for runtime binaries
by Pavel Hrdina 26 May '21

26 May '21

Recent attempt to add a lot of meson options to specify different runtime paths motivated me enough to cleanup this from meson. Changes in v3: - some patches were already pushed - removed patch that moved virFindFileInPath from testutilsqemu.c Changes in v2: - split and rework patch 16/17 to address review comments - added a new patch to cleanup libvirt.spec.in file Pavel Hrdina (8): virfile: introduce virFindFileInPathFull() qemu_conf: use virFindFileInPathFull for runtime binaries meson: drop check for runtime binary dependencies meson: move iscsiadm check into storage_iscsi condition meson: stop setting runtime binaries defines during compilation meson: use runtime binaries to only resolve features with "auto" value meson: optional_programs should be used only for building libvirt libvirt.spec: drop no longer required build dependencies libvirt.spec.in | 31 ---- meson.build | 214 +++++++++---------------- src/bhyve/bhyve_command.c | 4 + src/libvirt_private.syms | 2 +- src/locking/lock_driver_lockd.c | 12 +- src/network/bridge_driver.c | 2 + src/node_device/node_device_driver.c | 2 + src/qemu/qemu_conf.c | 23 ++- src/storage/storage_backend_logical.c | 13 ++ src/storage/storage_backend_sheepdog.c | 2 + src/storage/storage_backend_zfs.c | 3 + src/storage/storage_util.c | 2 + src/storage/storage_util.h | 6 + src/util/virdnsmasq.c | 1 + src/util/virfile.c | 16 +- src/util/virfile.h | 5 +- src/util/virfirewall.h | 4 + src/util/viriscsi.h | 2 + src/util/virkmod.h | 3 + src/util/virnetdevbandwidth.h | 2 + src/util/virnetdevip.c | 2 + src/util/virnetdevmidonet.c | 2 + src/util/virnetdevopenvswitch.c | 2 + src/util/virnuma.c | 1 + src/util/virsysinfo.c | 1 + src/util/virutil.c | 2 + tests/testutilsqemu.c | 3 +- tests/virfirewallmock.c | 3 +- 28 files changed, 173 insertions(+), 192 deletions(-) -- 2.31.1

4 21

[PATCH] meson.build: Fix the -Wvla-larger-than flag
by Thomas Huth 26 May '21

26 May '21

It's "...-than=..." and not "...-then=...". Fixes: 8dd259d0c6 ("meson: add manywarnings") Signed-off-by: Thomas Huth <thuth(a)redhat.com> --- meson.build | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/meson.build b/meson.build index 4f23f9104e..8ee0416700 100644 --- a/meson.build +++ b/meson.build @@ -390,7 +390,7 @@ cc_flags += [ '-Wvariadic-macros', '-Wvector-operation-performance', '-Wvla', - '-Wvla-larger-then=4031', + '-Wvla-larger-than=4031', '-Wvolatile-register-var', '-Wwrite-strings', ] -- 2.27.0

2 1

availability of cluster_size support (re. [libvirt PATCH v2 0/2] storage_driver: add support for cluster_size QCOW2 option)
by Geoff Macartney1 25 May '21

25 May '21

1 0

[PATCH v3] libxl: adjust handling of libxl_device_nic objects
by Olaf Hering 25 May '21

25 May '21

libxl objects are supposed to be initialized and disposed. Adjust libxlMakeNic to use an already initialized object, it is owned by the caller. Adjust libxlMakeNicList to initialize the list of objects, before they are filled by libxlMakeNic. In case of error the objects are disposed by libxl_domain_config_dispose. Signed-off-by: Olaf Hering <olaf(a)aepfle.de> --- src/libxl/libxl_conf.c | 34 +++++++++++++--------------------- 1 file changed, 13 insertions(+), 21 deletions(-) diff --git a/src/libxl/libxl_conf.c b/src/libxl/libxl_conf.c index f29a28a841..8dc7e26cea 100644 --- a/src/libxl/libxl_conf.c +++ b/src/libxl/libxl_conf.c @@ -1339,8 +1339,6 @@ libxlMakeNic(virDomainDef *def, return -1; } - libxl_device_nic_init(x_nic); - virMacAddrGetRaw(&l_nic->mac, x_nic->mac); /* @@ -1526,39 +1524,33 @@ libxlMakeNicList(virDomainDef *def, libxl_domain_config *d_config) { virDomainNetDef **l_nics = def->nets; size_t nnics = def->nnets; - libxl_device_nic *x_nics; size_t i, nvnics = 0; - x_nics = g_new0(libxl_device_nic, nnics); - for (i = 0; i < nnics; i++) { if (virDomainNetGetActualType(l_nics[i]) == VIR_DOMAIN_NET_TYPE_HOSTDEV) continue; + nvnics++; + } + if (!nvnics) + return 0; - if (libxlMakeNic(def, l_nics[i], &x_nics[nvnics], false)) - goto error; + d_config->nics = g_new0(libxl_device_nic, nvnics); + d_config->num_nics = nvnics; + + for (i = 0; i < nvnics; i++) { + libxl_device_nic_init(&d_config->nics[i]); + if (libxlMakeNic(def, l_nics[i], &d_config->nics[i], false)) + return -1; /* * The devid (at least right now) will not get initialized by * libxl in the setup case but is required for starting the * device-model. */ - if (x_nics[nvnics].devid < 0) - x_nics[nvnics].devid = nvnics; - - nvnics++; + if (d_config->nics[i].devid < 0) + d_config->nics[i].devid = i; } - VIR_SHRINK_N(x_nics, nnics, nnics - nvnics); - d_config->nics = x_nics; - d_config->num_nics = nvnics; - return 0; - - error: - for (i = 0; i < nnics; i++) - libxl_device_nic_dispose(&x_nics[i]); - VIR_FREE(x_nics); - return -1; } int

2 2

[PATCH] spec: Fix %endif indentation
by Michal Privoznik 25 May '21

25 May '21

In recent commit f772c1fd2a a misaligned %endif sneaked in which upsets syntax-check. Align it properly. Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> --- Pushed as trivial and build breaker fix. libvirt.spec.in | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/libvirt.spec.in b/libvirt.spec.in index cdca7d8db6..7920599498 100644 --- a/libvirt.spec.in +++ b/libvirt.spec.in @@ -73,7 +73,7 @@ %else %ifnarch %{arches_qemu_kvm} %define with_storage_gluster 0 - %endif + %endif %endif %endif -- 2.26.3

1 0

[PATCH for 7.4.0 v2 0/3] Avoid double indentation of <metadata/> element
by Michal Privoznik 25 May '21

25 May '21

v2 of: https://listman.redhat.com/archives/libvir-list/2021-May/msg00767.html diff to v1: - Deduplicated the code per Pavel's suggestion Michal Prívozník (3): virxml: Introduce and use virXMLFormatMetadata() virxml: Report error if virXMLFormatMetadata() fails virxml: Avoid double indentation of <metadata/> element src/conf/domain_conf.c | 23 ++---------------- src/conf/network_conf.c | 23 ++---------------- src/libvirt_private.syms | 1 + src/util/virxml.c | 51 ++++++++++++++++++++++++++++++++++++++++ src/util/virxml.h | 3 +++ 5 files changed, 59 insertions(+), 42 deletions(-) -- 2.26.3

3 6

[PATCH for 7.4.0] conf: Avoid double indentation of <metadata/> element
by Michal Privoznik 25 May '21

25 May '21

There was a recent change in libxml2 that caused a trouble for us. To us, <metadata/> in domain or network XMLs are just opaque value where management application can store whatever data it finds fit. At XML parser/formatter level, we just make a copy of the element during parsing and then format it back. For formatting we use xmlNodeDump() which allows caller to specify level of indentation. Previously, the indentation was not applied onto the very first line, but as of v2.9.12-2-g85b1792e libxml2 is applying indentation also on the first line. This does not work well with out virBuffer because as soon as we call virBufferAsprintf() to append <metadata/> element, virBufferAsprintf() will apply another level of indentation. Instead of version checking, let's skip any indentation added by libxml2 before virBufferAsprintf() is called. Note, the problem is only when telling xmlNodeDump() to use indentation, i.e. level argument is not zero. Therefore, virXMLNodeToString() which also calls xmlNodeDump() is safe as it passes zero. Signed-off-by: Michal Privoznik <mprivozn(a)redhat.com> --- I've raised this issue with libxml2 team: https://gitlab.gnome.org/GNOME/libxml2/-/commit/85b1792e37b131e7a51af98a37f… but I'm not sure it'll be fixed. Unfortunately, the patch that's causing us trouble is being backported all over the place, because it's supposedly fixing a regression. src/conf/domain_conf.c | 9 ++++++++- src/conf/network_conf.c | 9 ++++++++- 2 files changed, 16 insertions(+), 2 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 4f78b7b43d..84a8c269be 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -27808,6 +27808,7 @@ virDomainDefFormatInternalSetRootName(virDomainDef *def, if (def->metadata) { g_autoptr(xmlBuffer) xmlbuf = NULL; + const char *xmlbufContent = NULL; int oldIndentTreeOutput = xmlIndentTreeOutput; /* Indentation on output requires that we previously set @@ -27824,7 +27825,13 @@ virDomainDefFormatInternalSetRootName(virDomainDef *def, xmlIndentTreeOutput = oldIndentTreeOutput; return -1; } - virBufferAsprintf(buf, "%s\n", (char *) xmlBufferContent(xmlbuf)); + + /* After libxml2-v2.9.12-2-g85b1792e even the first line is indented. + * But virBufferAsprintf() also adds indentation. Skip one of them. */ + xmlbufContent = (const char *) xmlBufferContent(xmlbuf); + virSkipSpaces(&xmlbufContent); + + virBufferAsprintf(buf, "%s\n", xmlbufContent); xmlIndentTreeOutput = oldIndentTreeOutput; } diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index a9eadff29c..20c6dc091a 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -2488,6 +2488,7 @@ virNetworkDefFormatBuf(virBuffer *buf, if (def->metadata) { g_autoptr(xmlBuffer) xmlbuf = NULL; + const char *xmlbufContent = NULL; int oldIndentTreeOutput = xmlIndentTreeOutput; /* Indentation on output requires that we previously set @@ -2504,7 +2505,13 @@ virNetworkDefFormatBuf(virBuffer *buf, xmlIndentTreeOutput = oldIndentTreeOutput; return -1; } - virBufferAsprintf(buf, "%s\n", (char *) xmlBufferContent(xmlbuf)); + + /* After libxml2-v2.9.12-2-g85b1792e even the first line is indented. + * But virBufferAsprintf() also adds indentation. Skip one of them. */ + xmlbufContent = (const char *) xmlBufferContent(xmlbuf); + virSkipSpaces(&xmlbufContent); + + virBufferAsprintf(buf, "%s\n", xmlbufContent); xmlIndentTreeOutput = oldIndentTreeOutput; } -- 2.26.3

5 6

RFC: Qemu backup interface plans
by Vladimir Sementsov-Ogievskiy 25 May '21

25 May '21

Hi all! I'd like to share and discuss some plans on Qemu backup interface I have. (actually, most of this I've presented on KVM Forum long ago.. But now I'm a lot closer to realization:) I'd start with a reminder about image fleecing: We have image fleecing scheme to export point-in-time state of active disk (iotest 222): backup(sync=none) ┌───────────────────────────────────────┐ ▼ │ ┌────────────┐ ┌────────────────┐ backing ┌─────────────┐ │ NBD export │ ─── │ temp qcow2 img │ ───────────────────▶ │ active disk │ └────────────┘ └────────────────┘ └─────────────┘ ▲ ┌────────────┐ │ │ guest blk │ ──────────────────────────────────────────────┘ └────────────┘ Actually, backup job inserts a backup-top filter, so in detail it looks like: backup(sync=none) ┌───────────────────────────────────────┐ ▼ │ ┌────────────┐ ┌────────────────┐ backing ┌─────────────┐ │ NBD export │ ─── │ temp qcow2 img │ ───────────────────▶ │ active disk │ └────────────┘ └────────────────┘ └─────────────┘ ▲ ▲ │ target │ │ │ ┌────────────┐ ┌────────────────┐ backing │ │ guest blk │ ──▶ │ backup-top │ ───────────────────────┘ └────────────┘ └────────────────┘ This scheme is also called external backup or pull backup. It allows some external tool to write data to actual backup, and Qemu only provides this data. We support also incremental external backup: Qemu can manage dirty bitmaps in any way user wants, and we can export bitmaps through NBD protocol. So, client of NBD export can get the bitmap, and read only "dirty" regions of exported image. What we lack in this scheme: 1. handling dirty bitmap in backup-top filter: backup-top does copy-before-write operation on any guest write, when actually we are interested only in "dirty" regions for incremental backup Probable solution would allowing specifying bitmap for sync=none mode of backup, but I think what I propose below is better. 2. [actually it's a tricky part of 1]: possibility to not do copy-before-write operations for regions that was already copied to final backup. With normal Qemu backup job, this is achieved by the fact that block-copy state with its internal bitmap is shared between backup job and copy-before-write filter. 3. Not a real problem but fact: backup block-job does nothing in the scheme, the whole job is done by filter. So, it would be interesting to have a possibility to simply insert/remove the filter, and avoid block-job creation and managing at all for external backup. (and I'd like to send another RFC on how to insert/remove filters, let's not discuss it here). Next. Think about internal backup. It has one drawback too: 4. If target is remote with slow connection, copy-before-write operations will slow down guest writes appreciably. It may be solved with help of image fleecing: we create temporary qcow2 image, setup fleecing scheme, and instead of exporting temp image through NBD we start a second backup with source = temporary image and target would be real backup target (NBD for example). Still, with such solution there are same [1,2] problems, 3 becomes worse: 5. We'll have two jobs and two automatically inserted filters, when actually one filter and one job are enough (as first job is needed only to insert a filter, second job doesn't need a filter at all). Note also, that this (starting two backup jobs to make push backup with fleecing) doesn't work now, op-blockers will be against. It's simple to fix (and in Virtuozzo we live with downstream-only patch, which allows push backup with fleecing, based on starting two backup jobs).. But I never send a patch, as I have better plan, which will solve all listed problems. So, what I propose: 1. We make backup-top filter public, so that it could be inserted/removed where user wants through QMP (how to properly insert/remove filter I'll post another RFC, as backup-top is not the only filter that can be usefully inserted somewhere). For this first step I've sent a series today: subject: [PATCH 00/21] block: publish backup-top filter id: <20210517064428.16223-1-vsementsov(a)virtuozzo.com> patchew: https://patchew.org/QEMU/20210517064428.16223-1-vsementsov@virtuozzo.com/ (note, that one of things in this series is rename s/backup-top/copy-before-write/, still, I call it backup-top in this letter) This solves [3]. [4, 5] are solved partly: we still have one extra filter, created by backup block jobs, and also I didn't test does this work, probably some op-blockers or permissions should be tuned. So, let it be step 2: 2. Test, that we can start backup job with source = (target of backup-top filter), so that we have "push backup with fleecing". Make an option for backup to start without a filter, when we don't need copy-before-write operations, to not create extra superfluous filter. 3. Support bitmap in backup-top filter, to solve [1] 3.1 and make it possible to modify the bitmap externally, so that consumer of fleecing can say to backup-top filter: I've already copied these blocks, don't bother with copying them to temp image". This is to solve [2]. Still, how consumer of fleecing will reset shared bitmap after copying blocks? I have the following idea: we make a "discard-bitmap-filter" filter driver, that own some bitmap and on discard request unset corresponding bits. Also, on read, if read from the region with unset bits the EINVAL returned immediately. This way both consumers (backup job and NBD client) are able to use this interface: Backup job can simply call discard on source, we can add an option for this. External backup tool will send TRIM request after reading some region. This way disk space will be freed and no extra copy-before-write operations will be done. I also have a side idea that we can implement READ_ONCE flag, so that READ and TRIM can be done in one NBD command. But this works only for clients that don't want to implement any kind of retrying. So, finally, how will it look (here I call backup-top with a new name, and "file" child is used instead of "backing", as this change I propose in "[PATCH 00/21] block: publish backup-top filter"): Pull backup: ┌────────────────────────────────────┐ │ NBD export │ └────────────────────────────────────┘ │ │ ┌────────────────────────────────────┐ file ┌───────────────────────────────────────┐ backing ┌─────────────┐ │ discard-bitmap filter (bitmap=bm0) │ ──────▶ │ temp qcow2 img │ ─────────▶ │ active disk │ └────────────────────────────────────┘ └───────────────────────────────────────┘ └─────────────┘ ▲ ▲ │ target │ │ │ ┌────────────────────────────────────┐ ┌───────────────────────────────────────┐ file │ │ guest blk │ ──────▶ │ copy-before-write filter (bitmap=bm0) │ ─────────────┘ └────────────────────────────────────┘ └───────────────────────────────────────┘ Operations: - Create temp qcow2 image - blockdev-add to add the new image, setup its backing, and add two filters - some command to actually set backup-top filter as child of guest blk. That's a "point-in-time" of backup. Should be done during fs-freeze. - start NBD export on top of discard filter (and we can export bitmap bm0 as well, for the client) Now NBD client (our external backup tool) can: - import the bitmap - READ the data - send DISCARD requests on already handled areas to save disk space and avoid extra copy-before-write operations on host node Push backup with fleecing: ┌─────────────────────┐ │ final backup target │ └─────────────────────┘ ▲ │ backup job (bitmap=bm0, insert-filter=False, discard-source=True) │ ┌────────────────────────────────────┐ file ┌───────────────────────────────────────┐ backing ┌─────────────┐ │ discard-bitmap filter (bitmap=bm0) │ ──────▶ │ temp qcow2 img │ ─────────▶ │ active disk │ └────────────────────────────────────┘ └───────────────────────────────────────┘ └─────────────┘ ▲ ▲ │ target │ │ │ ┌────────────────────────────────────┐ ┌───────────────────────────────────────┐ file │ │ guest blk │ ──────▶ │ copy-before-write filter (bitmap=bm0) │ ─────────────┘ └────────────────────────────────────┘ └───────────────────────────────────────┘ Note, that the whole fleecing part is the same, we only need to run backup job instead of NBD export. Additional future idea. Why we need push backup with fleecing? To handle cases with slow connection to backup target. In any case when writing to remote target is slower than writing to local file, push-backup-with-fleecing will less disturb the running guest than simple backup job. But this is not free: 1. We need additional disk space on source. No way to fix that (that's a core idea of the scheme, store data locally), still discard-source option for backup job will help 2. If connection is not too slow than probably part of CBW (copy before write) operations could go to final target immediately. But with the scheme above all CBW operations go to qcow2 temporary image. This can be solved with help of ram-cache format driver (to be implemented, of course): ┌─────────────────────┐ │ final backup target │ └─────────────────────┘ ▲ │ backup job (bitmap=bm0, insert-filter=False, discard-source=True) │ ┌────────────────────────────────────┐ ┌───────────────────────────────────────┐ backing ┌─────────────┐ │ discard-bitmap filter (bitmap=bm0) │ │ temp qcow2 img │ ─────────▶ │ active disk │ └────────────────────────────────────┘ └───────────────────────────────────────┘ └─────────────┘ │ ▲ ▲ │ │ file │ │ │ │ │ file ┌───────────┐ │ └───────────▶ │ ram-cache │ │ └───────────┘ │ ▲ │ │ target │ │ │ ┌────────────────────────────────────┐ ┌───────────────────────────────────────┐ file │ │ guest blk │ ──────▶ │ copy-before-write filter (bitmap=bm0) │ ─────────────┘ └────────────────────────────────────┘ └───────────────────────────────────────┘ This way data from copy-before-write filter goes first to ram-cache, and backup job could read it from ram. ram-cache will automatically flush data to temp qcow2 image, when ram-usage limit is reached. We'll also need a way to say backup-job that it should first read clusters that are cached in ram, and only then other clusters. So, we'll have a priory for clusters to be copied by block-copy: 1. clusters in ram-cache 2. clusters not in temp img (to avoid copy-before-write operations in future) 3. clusters in temp img. This will be a kind of block_status() thing, that allows a block driver to give recommendations on sequence of reading to be effective. Not also, that there is another benefit of such thing: we'll implement this callback in qcow2 driver, so that backup will read clusters not in guest cluster order, but in host cluster order, to read more sequentially, which should bring better performance on rotating disks. -- Best regards, Vladimir

3 8

[libvirt PATCH 0/4] adjust max memlock limit when hotplugging a vDPA device
by Laine Stump 25 May '21

25 May '21

see patch 4 for the why. Found during testing of https://bugzilla.redhat.com/1939776 Laine Stump (4): qemu_hotplug.c: don't skip cleanup on failures of qemuDomainAttachNetDevice conf: new function virDomainNetRemoveByObj() qemu_hotplug.c: add net devices to the domain list earlier qemu: adjust the maxmemlock limit when hotplugging a vDPA device src/conf/domain_conf.c | 16 ++++++++++++++++ src/conf/domain_conf.h | 1 + src/libvirt_private.syms | 1 + src/qemu/qemu_hotplug.c | 41 +++++++++++++++++++++++++++++++--------- 4 files changed, 50 insertions(+), 9 deletions(-) -- 2.31.1

2 5