[libvirt PATCH v3] cgroup/LXC: Do not condition availability of v2 by controllers
by Eric van Blokland
systemd in hybrid mode uses v1 hierarchies for controllers and v2 for
process tracking.
The LXC code uses virCgroupAddMachineProcess() to move processes into
the appropriate cgroup by manipulating cgroupfs directly. (Note that it
does so even though libvirt also supports talking to systemd directly
via the org.freedesktop.machine1 API.)
If this path is taken, libvirt/lxc must convince systemd that processes
really belong to the new cgroup, i.e. the tracking v2 hierarchy must
undergo migration too.
The current check evaluates the v2 backend as unavailable in hybrid
mode (because there are no available controllers). Simplify the
condition and consider a mounted cgroup2 as sufficient to touch the v2
hierarchy.
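To make the difference between the two checks concrete, here is a minimal
standalone sketch (not libvirt code). The names cgroup2_probe() and
enum cgroup2_state are hypothetical, and the mount table path is a
parameter so the function can be pointed at a test file instead of
/proc/mounts. Treating an unreadable cgroup.controllers as "tracking
only" is an assumption of this sketch. The old check corresponds to
requiring CGROUP2_WITH_CONTROLLERS, the simplified one to accepting
anything other than CGROUP2_ABSENT.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical classification of what a mount table says about cgroup v2. */
enum cgroup2_state {
    CGROUP2_ABSENT,           /* no cgroup2 mount found */
    CGROUP2_TRACKING_ONLY,    /* mounted, but no controllers listed (hybrid) */
    CGROUP2_WITH_CONTROLLERS, /* mounted and at least one controller listed */
};

static enum cgroup2_state
cgroup2_probe(const char *mounts_path)
{
    struct mntent entry;
    char buf[4096];
    enum cgroup2_state state = CGROUP2_ABSENT;
    FILE *mounts;

    assert(mounts_path != NULL);

    mounts = setmntent(mounts_path, "r");
    if (!mounts)
        return CGROUP2_ABSENT;

    while (getmntent_r(mounts, &entry, buf, sizeof(buf)) != NULL) {
        char path[4096 + 32];
        FILE *ctl;

        if (strcmp(entry.mnt_type, "cgroup2") != 0)
            continue;

        /* A cgroup2 mount exists; now see whether it exposes controllers. */
        state = CGROUP2_TRACKING_ONLY;
        snprintf(path, sizeof(path), "%s/cgroup.controllers", entry.mnt_dir);

        ctl = fopen(path, "r");
        if (ctl) {
            int c = fgetc(ctl);

            /* An empty file (or a bare newline) means no controllers. */
            if (c != EOF && c != '\n')
                state = CGROUP2_WITH_CONTROLLERS;
            fclose(ctl);
        }
        break;
    }

    endmntent(mounts);
    return state;
}
```

On a hybrid systemd host one would expect cgroup2_probe("/proc/mounts")
to report CGROUP2_TRACKING_ONLY, which is exactly the case the old check
treated as "v2 not available".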
This in turn creates an issue with binding the V2 mount. In hybrid
mode the V2 filesystem may be mounted upon the V1 filesystem. Reversing
the order in which backends are mounted in virCgroupBindMount
circumvents this problem.
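The reversed iteration can be sketched in isolation as follows. This is
an illustration only: the enum, struct backend, struct mount_log and
bind_mount_reverse() are hypothetical stand-ins, and the assumption that
the V2 backend has the lower index (so that reverse order means "V1
first, then V2 on top") is mine. The loop counts i upwards from 1 and
indexes with LAST - i, which keeps an unsigned size_t counter from ever
wrapping below zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical backend indices; V2 is assumed to come first. */
enum backend_type { BACKEND_V2 = 0, BACKEND_V1, BACKEND_LAST };

/* Records the order in which backends were "mounted". */
struct mount_log {
    enum backend_type order[BACKEND_LAST];
    size_t n;
};

struct backend {
    enum backend_type type;
    int (*bind_mount)(const struct backend *self, struct mount_log *log);
};

/* Stand-in for a real bind mount: just remember who was called. */
static int
record_mount(const struct backend *self, struct mount_log *log)
{
    log->order[log->n++] = self->type;
    return 0;
}

static int
bind_mount_reverse(const struct backend *backends[BACKEND_LAST],
                   struct mount_log *log)
{
    size_t i;

    assert(log != NULL);

    /* i runs from 1 to BACKEND_LAST, so BACKEND_LAST - i never underflows. */
    for (i = 1; i <= BACKEND_LAST; i++) {
        const struct backend *b = backends[BACKEND_LAST - i];

        /* Absent backends are skipped; the first failure aborts the loop. */
        if (b && b->bind_mount(b, log) < 0)
            return -1;
    }

    return 0;
}
```

A signed counter running from BACKEND_LAST - 1 down to 0 would work
equally well; the variant above merely avoids changing the counter's
type.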
Fixes: #182
Signed-off-by: Eric van Blokland <mail(a)ericvanblokland.nl>
---
src/util/vircgroup.c | 8 +++++---
src/util/vircgroupv2.c | 12 ------------
2 files changed, 5 insertions(+), 15 deletions(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c
index a6a409af3d..48fbcf625a 100644
--- a/src/util/vircgroup.c
+++ b/src/util/vircgroup.c
@@ -2924,9 +2924,11 @@ virCgroupBindMount(virCgroup *group, const char *oldroot,
size_t i;
virCgroup *parent = virCgroupGetNested(group);
- for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) {
- if (parent->backends[i] &&
- parent->backends[i]->bindMount(parent, oldroot, mountopts) < 0) {
+ /* In hybrid environments, V2 may be mounted over V1.
+ * Mount the backends in reverse order. */
+ for (i = 1; i <= VIR_CGROUP_BACKEND_TYPE_LAST; i++) {
+ if (parent->backends[VIR_CGROUP_BACKEND_TYPE_LAST - i] &&
+ parent->backends[VIR_CGROUP_BACKEND_TYPE_LAST - i]->bindMount(parent, oldroot, mountopts) < 0) {
return -1;
}
}
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c
index 4c110940cf..0e0c61d466 100644
--- a/src/util/vircgroupv2.c
+++ b/src/util/vircgroupv2.c
@@ -75,22 +75,10 @@ virCgroupV2Available(void)
if (STRNEQ(entry.mnt_type, "cgroup2"))
continue;
- /* Systemd uses cgroup v2 for process tracking but no controller is
- * available. We should consider this configuration as cgroup v2 is
- * not available. */
- contFile = g_strdup_printf("%s/cgroup.controllers", entry.mnt_dir);
-
- if (virFileReadAll(contFile, 1024 * 1024, &contStr) < 0)
- goto cleanup;
-
- if (STREQ(contStr, ""))
- continue;
-
ret = true;
break;
}
- cleanup:
VIR_FORCE_FCLOSE(mounts);
return ret;
}
--
2.35.3
2 years, 1 month
[PATCH 0/5] node_device: Tiny code cleanups
by Michal Privoznik
My aim is to move the virNodeDeviceDriver declaration into
node_device_driver.c, eventually. BUT that's going to take more patches,
and as I continue my work on that I've noticed a couple of almost trivial
patches that can be merged regardless. Even during freeze ;-)
Michal Prívozník (5):
node_device_udev.h: Drop unused macro
node_device: Move DMI_DEVPATH into node_device_udev.c
node_device_udev.h: Drop include of libudev.h
node_device: Move fwd declaration of udevNodeRegister() into correct
header file
node_device_driver.h: Drop nodeDeviceLock() and nodeDeviceUnlock() fwd
declarations
src/node_device/node_device_driver.c | 3 +++
src/node_device/node_device_driver.h | 12 ------------
src/node_device/node_device_udev.c | 2 ++
src/node_device/node_device_udev.h | 6 ++----
4 files changed, 7 insertions(+), 16 deletions(-)
--
2.37.4
2 years, 1 month
[libvirt RFC 00/24] basic snapshot delete implementation
by Pavel Hrdina
I'm sending this as an RFC even though it's more or less complete and
works; it probably needs some documentation and most likely unit testing.
This implements the virDomainSnapshotDelete API for external
snapshots. The support doesn't include the flags
VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN and
VIR_DOMAIN_SNAPSHOT_DELETE_CHILDREN_ONLY as they would add more complexity
and IMHO these flags should not exist at all.
The last patch is just here to show how we could support deleting an
external snapshot if all children are internal only. Without this patch
the user would have to call children-only and then, with another call,
delete the external snapshot itself.
There are some limitations that will need the mentioned
documentation. If the parent snapshot is internal, the external snapshot
cannot be deleted; the workaround is to delete any internal parent
snapshots, after which the external one can be deleted.
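The parent limitation can be pictured with a small validation sketch.
This is not the code from the series: struct snapshot and
external_snapshot_deletable() are hypothetical, and the sketch assumes
that only the direct parent needs to be inspected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified snapshot metadata. */
struct snapshot {
    const char *name;
    bool external;                 /* true: external, false: internal */
    const struct snapshot *parent; /* NULL for a root snapshot */
};

/* Returns true if this sketch would allow deleting the external snapshot. */
static bool
external_snapshot_deletable(const struct snapshot *snap)
{
    assert(snap != NULL);

    /* Only external snapshots are covered by this sketch. */
    if (!snap->external)
        return false;

    /* An internal parent blocks deletion until it is deleted first. */
    if (snap->parent && !snap->parent->external)
        return false;

    return true;
}
```

With an internal parent the check fails; once that parent has been
deleted (and the snapshot reparented, or left as a root), the same
snapshot passes.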
Pavel Hrdina (24):
qemu_block: extract block commit code to separate function
qemu_block: move qemuDomainBlockPivot out of qemu_driver
qemu_block: extract qemuBlockCommit impl to separate function
qemu_block: add sync option to qemuBlockCommitImpl
qemu_monitor: introduce qemuMonitorJobFinalize
qemu_monitor: allow setting autofinalize for block commit
qemu_block: introduce qemuBlockFinalize
qemu_blockjob: process QEMU_MONITOR_JOB_STATUS_PENDING signal
qemu_snapshot: refactor qemuSnapshotDelete
qemu_snapshot: extract single snapshot deletion to separate function
qemu_snapshot: extract children snapshot deletion to separate function
qemu_snapshot: rework snapshot children deletion
qemu_snapshot: move snapshot discard out of qemu_domain.c
qemu_snapshot: introduce qemuSnapshotDiscardMetadata
qemu_snapshot: call qemuSnapshotDiscardMetadata from
qemuSnapshotDiscard
qemu_snapshot: pass update_parent into qemuSnapshotDiscardMetadata
qemu_snapshot: move metadata changes to qemuSnapshotDiscardMetadata
qemu_snapshot: introduce qemuSnapshotDeleteValidate function
qemu_snapshot: refactor validation of snapshot delete
qemu_snapshot: prepare data for external snapshot deletion
qemu_snapshot: implement deletion of external snapshot
qemu_snapshot: update metadata when deleting snapshots
qemu_snapshot: when deleting snapshot invalidate parent snapshot
qemu_snapshot: allow deletion of external snapshot with internal
snapshot children
src/conf/snapshot_conf.c | 5 +
src/conf/snapshot_conf.h | 1 +
src/qemu/qemu_backup.c | 1 +
src/qemu/qemu_block.c | 356 ++++++++++++++++
src/qemu/qemu_block.h | 30 ++
src/qemu/qemu_blockjob.c | 13 +-
src/qemu/qemu_blockjob.h | 1 +
src/qemu/qemu_domain.c | 95 +----
src/qemu/qemu_domain.h | 9 -
src/qemu/qemu_driver.c | 306 +-------------
src/qemu/qemu_monitor.c | 21 +-
src/qemu/qemu_monitor.h | 8 +-
src/qemu/qemu_monitor_json.c | 26 +-
src/qemu/qemu_monitor_json.h | 8 +-
src/qemu/qemu_snapshot.c | 764 +++++++++++++++++++++++++++++++----
src/qemu/qemu_snapshot.h | 4 +
tests/qemumonitorjsontest.c | 2 +-
17 files changed, 1151 insertions(+), 499 deletions(-)
--
2.37.2
2 years, 1 month
[libvirt PATCH v2] Revert "cgroup/LXC: Do not condition availability of v2 by controllers"
by Pavel Hrdina
This reverts commit e49313b54ed2a149c71f9073659222742ff3ffb0.
This reverts commit a0f37232b9c4296ca16955cc625f75eb848ace39.
Revert them together to not break build.
This fix for the issue is incorrect and breaks usage of the other
controllers that systemd creates in hybrid mode, specifically the devices
and cpuacct controllers, as they are now assumed to be part of the
cgroup v2 topology, which is not true.
We need to find a different solution to the issue.
Signed-off-by: Pavel Hrdina <phrdina(a)redhat.com>
---
src/util/vircgroup.c | 6 ++----
src/util/vircgroupv2.c | 15 +++++++++++++++
2 files changed, 17 insertions(+), 4 deletions(-)
diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c
index 49ebd37ded..a6a409af3d 100644
--- a/src/util/vircgroup.c
+++ b/src/util/vircgroup.c
@@ -2921,12 +2921,10 @@ int
virCgroupBindMount(virCgroup *group, const char *oldroot,
const char *mountopts)
{
- ssize_t i;
+ size_t i;
virCgroup *parent = virCgroupGetNested(group);
- /* In hybrid environments, V2 may be mounted over V1.
- * Mount the backends in reverse order. */
- for (i = VIR_CGROUP_BACKEND_TYPE_LAST - 1; i >= 0; i--) {
+ for (i = 0; i < VIR_CGROUP_BACKEND_TYPE_LAST; i++) {
if (parent->backends[i] &&
parent->backends[i]->bindMount(parent, oldroot, mountopts) < 0) {
return -1;
diff --git a/src/util/vircgroupv2.c b/src/util/vircgroupv2.c
index bf6bd11fef..4c110940cf 100644
--- a/src/util/vircgroupv2.c
+++ b/src/util/vircgroupv2.c
@@ -69,13 +69,28 @@ virCgroupV2Available(void)
return false;
while (getmntent_r(mounts, &entry, buf, sizeof(buf)) != NULL) {
+ g_autofree char *contFile = NULL;
+ g_autofree char *contStr = NULL;
+
if (STRNEQ(entry.mnt_type, "cgroup2"))
continue;
+ /* Systemd uses cgroup v2 for process tracking but no controller is
+ * available. We should consider this configuration as cgroup v2 is
+ * not available. */
+ contFile = g_strdup_printf("%s/cgroup.controllers", entry.mnt_dir);
+
+ if (virFileReadAll(contFile, 1024 * 1024, &contStr) < 0)
+ goto cleanup;
+
+ if (STREQ(contStr, ""))
+ continue;
+
ret = true;
break;
}
+ cleanup:
VIR_FORCE_FCLOSE(mounts);
return ret;
}
--
2.37.3
2 years, 1 month
[libvirt PATCH 0/2] revert attempt fixing lxc with hybrid systemd cgroups
by Pavel Hrdina
Pavel Hrdina (2):
Revert "vircgroup: Remove unused variables in virCgroupV2Available"
Revert "cgroup/LXC: Do not condition availability of v2 by
controllers"
src/util/vircgroup.c | 6 ++----
src/util/vircgroupv2.c | 15 +++++++++++++++
2 files changed, 17 insertions(+), 4 deletions(-)
--
2.37.3
2 years, 1 month
Interface changed after 'started' event in hook
by Christopher Pereira
Hi,
We have a libvirt-qemu hook script that intercepts the "started" event
and configures the virtual network interface (we set a private IP,
remove the interface from the virtual bridge and set some custom iptables
rules).
After upgrading from libvirt 3.2 to 4.5 we noticed that the interface is
configured *after* the 'started' event is triggered and thus our custom
configuration is overwritten (the IP we set is removed).
As a workaround we added a "sleep 5" to our scripts, which works, but we
wonder what the correct way is to prevent libvirt from changing the
virtual interface after the 'started' event is triggered, so that our
custom configuration is kept.
We have currently this setting:
<interface type='bridge'>
<mac address='*********'/>
<source bridge='virbr0'/>
<model type='virtio'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03'
function='0x0'/>
</interface>
What would be the correct setting to prevent libvirt from changing the
interface after the hook is triggered?
2 years, 1 month
[PATCH] node_device: fix missing return from function nodedevRegister
by jcfaracco@gmail.com
From: Julio Faracco <jcfaracco(a)gmail.com>
The function nodedevRegister() (like all register functions) must return
an integer. It does not return a value when WITH_UDEV is not defined.
This commit adds a generic return for that specific case.
Signed-off-by: Julio Faracco <jcfaracco(a)gmail.com>
---
src/node_device/node_device_driver.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/node_device/node_device_driver.c b/src/node_device/node_device_driver.c
index 8e93b0dd6f..4d51851cfd 100644
--- a/src/node_device/node_device_driver.c
+++ b/src/node_device/node_device_driver.c
@@ -1584,6 +1584,8 @@ nodedevRegister(void)
{
#ifdef WITH_UDEV
return udevNodeRegister();
+#else
+ return 0;
#endif
}
--
2.37.3
2 years, 1 month
need directions for filing a bug report
by alexandru.iancu@gmail.com
Hi everybody,
I'm running a KVM Linux hypervisor and a Windows guest. I'm passing
through the GPU. From time to time, the guest crashes.
I'd like to file a bug report.
a) I'm aware of the many GitLab projects, but I'm not sure which one can
address this issue or should take a first look at it and dispatch it.
b) The Windows dump is big (2G). I'm probably going to need to put it
somewhere for inspection.
Best regards,
Alex
2 years, 1 month
[PATCH v3 0/5] network: firewalld: fix routed network
by Eric Garver
This series fixes routed networks when a newer firewalld (>= 1.0.0) is
present [1]. Firewalld 1.0.0 included a change that disallows implicit
forwarding between zones [2]. libvirt was relying on this behavior to
allow routed networks to function.
Firewalld policies are added. Policies have been supported since
firewalld 0.9.0. If the running firewall does not support policies, then
it will fall back to the current zone-only behavior.
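The fallback can be sketched as a simple probe over the list of policy
names that firewalld reports. This is an illustration, not the code from
the series: policy_exists() and zone_for_routed_network() are
hypothetical helpers, the real probe talks to firewalld over D-Bus, and
both the policy names (taken from the file names in the diffstat) and
the choice of "libvirt" as the fallback zone are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `name` is among the policy names reported by firewalld. */
static bool
policy_exists(const char *const *policies, size_t npolicies, const char *name)
{
    size_t i;

    assert(name != NULL);

    for (i = 0; i < npolicies; i++) {
        if (strcmp(policies[i], name) == 0)
            return true;
    }

    return false;
}

/* Picks the zone for a routed network: use the routed zone only when the
 * policies it relies on are present, otherwise keep the existing zone. */
static const char *
zone_for_routed_network(const char *const *policies, size_t npolicies)
{
    if (policy_exists(policies, npolicies, "libvirt-routed-in") &&
        policy_exists(policies, npolicies, "libvirt-routed-out"))
        return "libvirt-routed";

    return "libvirt";
}
```

An older firewalld without policy support would report an empty list
here, so the sketch falls through to the existing zone.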
My goal is to get libvirt to a fully native firewalld backend; no
iptables rules. This series is phase 1 of that effort. The next steps
are:
1. introduce a "libvirt-nat" zone and policies
- the current "libvirt" zone will become obsolete
2. go full native firewalld, do not use iptables directly
- currently a hybrid of iptables + firewalld is used
v3:
- rebase, retest, resend
v2:
- keep existing libvirt zone as is
- remove "<forward />" in libvirt-routed zone because this feature
requires firewalld >= 0.9.0. Has no impact since the added policies
allow forwarding libvirt-routed <--> ANY zone (including itself).
- add probe for policies: virFirewallDGetPolicies(),
virFirewallDPolicyExists()
[1]: https://bugzilla.redhat.com/show_bug.cgi?id=2055706
[2]: https://github.com/firewalld/firewalld/issues/177
Eric Garver (5):
util: add virFirewallDGetPolicies()
util: add virFirewallDPolicyExists()
network: firewalld: add zone for routed networks
network: firewalld: add policies for routed networks
network: firewalld: add support for routed networks
src/libvirt_private.syms | 2 +
src/network/bridge_driver_linux.c | 11 +++-
src/network/libvirt-routed-in.policy | 11 ++++
src/network/libvirt-routed-out.policy | 12 +++++
src/network/libvirt-routed.zone | 10 ++++
src/network/libvirt-to-host.policy | 20 ++++++++
src/network/meson.build | 20 ++++++++
src/util/virfirewalld.c | 72 +++++++++++++++++++++++++++
src/util/virfirewalld.h | 2 +
9 files changed, 159 insertions(+), 1 deletion(-)
create mode 100644 src/network/libvirt-routed-in.policy
create mode 100644 src/network/libvirt-routed-out.policy
create mode 100644 src/network/libvirt-routed.zone
create mode 100644 src/network/libvirt-to-host.policy
--
2.35.3
2 years, 1 month
[libvirt PATCH 00/28] Synchronize x86 cpu features from qemu
by Tim Wiederhake
libvirt is missing support for some x86 cpu features supported by qemu. This
series adds the missing cpu features to libvirt's feature map and adds a script
to ease the detection of new features in the future.
Support for amx-int8, amx-tile, and amx-bf16 is deliberately not included, as
they were already proposed on the list[1].
[1] https://listman.redhat.com/archives/libvir-list/2022-September/234292.html
Tim Wiederhake (28):
cpu-data.py: Allow for more than child in feature nodes
cpu_x86: Ignore alias names
cpu: make x86 feature alias names machine readable
cpu_map: Add script to sync from QEMU i386 cpu features
cpu_map: Rename sync_qemu_i386.py
cpu_map: Add missing x86 feature alias names
cpu_map: Add missing x86 feature "sgx"
cpu_map: Add missing x86 feature "sgxlc"
cpu_map: Add missing x86 feature "sgx-exinfo"
cpu_map: Add missing x86 feature "sgx1"
cpu_map: Add missing x86 feature "sgx2"
cpu_map: Add missing x86 feature "sgx-debug"
cpu_map: Add missing x86 feature "sgx-mode64"
cpu_map: Add missing x86 feature "sgx-provisionkey"
cpu_map: Add missing x86 feature "sgx-tokenkey"
cpu_map: Add missing x86 feature "sgx-kss"
cpu_map: Add missing x86 feature "bus-lock-detect"
cpu_map: Add missing x86 feature "pks"
cpu_map: Add missing x86 feature "avx512-vp2intersect"
cpu_map: Add missing x86 feature "avx512-fp16"
cpu_map: Add missing x86 feature "serialize"
cpu_map: Add missing x86 feature "tsx-ldtrk"
cpu_map: Add missing x86 feature "arch-lbr"
cpu_map: Add missing x86 feature "xfd"
cpu_map: Add missing x86 feature "intel-pt-lip"
cpu_map: Add missing x86 feature "avic"
cpu_map: Add missing x86 feature "v-vmsave-vmload"
cpu_map: Add missing x86 feature "vgif"
src/cpu/cpu_x86.c | 10 +-
src/cpu_map/sync_qemu_features_i386.py | 278 ++++++++++++++++++
..._qemu_i386.py => sync_qemu_models_i386.py} | 0
src/cpu_map/x86_features.xml | 133 +++++++--
tests/cputestdata/cpu-data.py | 11 +-
.../x86_64-cpuid-Atom-P5362-disabled.xml | 1 +
.../x86_64-cpuid-Atom-P5362-guest.xml | 1 +
.../x86_64-cpuid-Atom-P5362-host.xml | 1 +
.../x86_64-cpuid-Core-i7-7600U-disabled.xml | 2 +-
.../x86_64-cpuid-Core-i7-7600U-guest.xml | 1 +
.../x86_64-cpuid-Core-i7-7600U-host.xml | 1 +
.../x86_64-cpuid-Core-i7-7700-disabled.xml | 2 +-
.../x86_64-cpuid-Core-i7-7700-guest.xml | 1 +
.../x86_64-cpuid-Core-i7-7700-host.xml | 1 +
.../x86_64-cpuid-Core-i7-8550U-disabled.xml | 2 +-
.../x86_64-cpuid-Core-i7-8550U-guest.xml | 1 +
.../x86_64-cpuid-Core-i7-8550U-host.xml | 1 +
.../x86_64-cpuid-Core-i7-8700-disabled.xml | 2 +-
.../x86_64-cpuid-Core-i7-8700-guest.xml | 2 +
.../x86_64-cpuid-Core-i7-8700-host.xml | 2 +
...86_64-cpuid-EPYC-7502-32-Core-disabled.xml | 2 +-
.../x86_64-cpuid-EPYC-7502-32-Core-guest.xml | 3 +
.../x86_64-cpuid-EPYC-7502-32-Core-host.xml | 3 +
...86_64-cpuid-EPYC-7601-32-Core-disabled.xml | 2 +-
.../x86_64-cpuid-EPYC-7601-32-Core-guest.xml | 3 +
.../x86_64-cpuid-EPYC-7601-32-Core-host.xml | 3 +
...-cpuid-EPYC-7601-32-Core-ibpb-disabled.xml | 2 +-
..._64-cpuid-EPYC-7601-32-Core-ibpb-guest.xml | 3 +
...6_64-cpuid-EPYC-7601-32-Core-ibpb-host.xml | 3 +
...-cpuid-Hygon-C86-7185-32-core-disabled.xml | 2 +-
..._64-cpuid-Hygon-C86-7185-32-core-guest.xml | 3 +
...6_64-cpuid-Hygon-C86-7185-32-core-host.xml | 3 +
.../x86_64-cpuid-Ice-Lake-Server-disabled.xml | 2 +-
.../x86_64-cpuid-Ice-Lake-Server-guest.xml | 2 +
.../x86_64-cpuid-Ice-Lake-Server-host.xml | 2 +
...puid-Ryzen-7-1800X-Eight-Core-disabled.xml | 2 +-
...4-cpuid-Ryzen-7-1800X-Eight-Core-guest.xml | 3 +
...64-cpuid-Ryzen-7-1800X-Eight-Core-host.xml | 3 +
...4-cpuid-Ryzen-9-3900X-12-Core-disabled.xml | 2 +-
...6_64-cpuid-Ryzen-9-3900X-12-Core-guest.xml | 3 +
...86_64-cpuid-Ryzen-9-3900X-12-Core-host.xml | 3 +
.../x86_64-cpuid-Xeon-E3-1225-v5-disabled.xml | 2 +-
.../x86_64-cpuid-Xeon-E3-1225-v5-guest.xml | 1 +
.../x86_64-cpuid-Xeon-E3-1225-v5-host.xml | 1 +
.../x86_64-cpuid-Xeon-E3-1245-v5-disabled.xml | 2 +-
.../x86_64-cpuid-Xeon-E3-1245-v5-guest.xml | 1 +
.../x86_64-cpuid-Xeon-E3-1245-v5-host.xml | 1 +
.../domaincapsdata/qemu_6.0.0-tcg.x86_64.xml | 1 +
.../domaincapsdata/qemu_6.1.0-tcg.x86_64.xml | 1 +
.../domaincapsdata/qemu_6.2.0-tcg.x86_64.xml | 2 +
.../domaincapsdata/qemu_7.0.0-tcg.x86_64.xml | 2 +
.../domaincapsdata/qemu_7.1.0-tcg.x86_64.xml | 2 +
52 files changed, 487 insertions(+), 36 deletions(-)
create mode 100755 src/cpu_map/sync_qemu_features_i386.py
rename src/cpu_map/{sync_qemu_i386.py => sync_qemu_models_i386.py} (100%)
--
2.36.1
2 years, 1 month