[libvirt-users] [virtual interface] detach interface during boot succeed with no changes
by Yalan Zhang
Hi guys,
When I detach an interface from a VM during boot (before the guest has finished booting), it always fails silently: the command reports success, but the interface is still present. I'm not sure whether there is an existing bug for this. I have confirmed with someone that disks show similar behavior; is this considered acceptable?
# virsh destroy rhel7.2; virsh start rhel7.2 ;sleep 2; virsh
detach-interface rhel7.2 network 52:54:00:98:c4:a0; sleep 2; virsh
dumpxml rhel7.2 |grep /interface -B9
Domain rhel7.2 destroyed
Domain rhel7.2 started
Interface detached successfully
<address type='pci' domain='0x0000' bus='0x00' slot='0x06'
function='0x0'/>
</controller>
<interface type='network'>
<mac address='52:54:00:98:c4:a0'/>
<source network='default' bridge='virbr0'/>
<target dev='vnet0'/>
<model type='rtl8139'/>
<alias name='net0'/>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03'
function='0x0'/>
</interface>
When I detach after the VM has finished booting (extending the sleep time to 10 seconds), it succeeds.
# virsh destroy rhel7.2; virsh start rhel7.2 ;sleep 10; virsh
detach-interface rhel7.2 network 52:54:00:98:c4:a0; sleep 2; virsh
dumpxml rhel7.2 |grep /interface -B9
Domain rhel7.2 destroyed
Domain rhel7.2 started
Interface detached successfully
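For what it's worth, instead of a fixed sleep one could poll the live XML until the interface is really gone (my understanding is that the unplug needs the guest OS to acknowledge it, which it cannot do before it has booted, so the request just stays pending). A rough sketch, reusing the domain name and MAC from above; the 30-second cap is arbitrary:

# wait until the interface disappears from the live XML, or give up after ~30s
for i in $(seq 1 30); do
    if ! virsh dumpxml rhel7.2 | grep -q '52:54:00:98:c4:a0'; then
        echo "interface is gone"
        break
    fi
    sleep 1
done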
-------
Best Regards,
Yalan Zhang
IRC: yalzhang
Internal phone: 8389413
[libvirt-users] Question about disabling UFO on guest
by Bao Nguyen
Hello everyone,
I would like to ask a question about disabling UFO on a virtio vNIC in my guest. I have read the documentation at https://libvirt.org/formatdomain.html:
*host*
The csum, gso, tso4, tso6, ecn and ufo attributes with possible
values on and off can be used to turn off host offloading options. By
default, the supported offloads are enabled by QEMU. *Since 1.2.9 (QEMU
only)* The mrg_rxbuf attribute can be used to control mergeable rx buffers
on the host side. Possible values are on (default) and off. *Since 1.2.13
(QEMU only)*
*guest*
The csum, tso4, tso6, ecn and ufo attributes with possible
values on and off can be used to turn off guest offloading options. By
default, the supported offloads are enabled by QEMU. *Since 1.2.9 (QEMU only)*
Then I disabled UFO on the vNIC of my guest with the following configuration:
<devices>
  <interface type='network'>
    <source network='default'/>
    <target dev='vnet1'/>
    <model type='virtio'/>
    <driver name='vhost' txmode='iothread' ioeventfd='on' event_idx='off' queues='5' rx_queue_size='256' tx_queue_size='256'>
      *<host gso='off' ufo='off' />*
      *<guest ufo='off'/>*
    </driver>
  </interface>
</devices>
Then I rebooted my node for the change to take effect, and it works. However, can I disable UFO without touching the host OS, or does it always have to be disabled on both host and guest like this?
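As a side note, one way to see whether the guest-side knob took effect is to check the offload flags from inside the guest (assuming the virtio NIC shows up as eth0 there):

# inside the guest
ethtool -k eth0 | grep -E 'udp-fragmentation-offload|generic-segmentation-offload'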
Thanks,
Brs,
Natsu
[libvirt-users] Add support for vhost-user-scsi-pci/vhost-user-blk-pci
by Li Feng
Hi Guys,
I want to add vhost-user-scsi-pci/vhost-user-blk-pci support to libvirt.
The usage in QEMU looks like this:
Vhost-SCSI
-chardev socket,id=char0,path=/var/tmp/vhost.0
-device vhost-user-scsi-pci,id=scsi0,chardev=char0
Vhost-BLK
-chardev socket,id=char1,path=/var/tmp/vhost.1
-device vhost-user-blk-pci,id=blk0,chardev=char1
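For completeness, vhost-user devices also need the guest RAM to be shareable with the external backend process; a rough QEMU sketch under that assumption (paths, sizes and IDs are made up):

qemu-system-x86_64 -m 4G \
  -object memory-backend-file,id=mem0,size=4G,mem-path=/dev/shm,share=on \
  -numa node,memdev=mem0 \
  -chardev socket,id=char0,path=/var/tmp/vhost.0 \
  -device vhost-user-scsi-pci,id=scsi0,chardev=char0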
Which type of XML should I add to libvirt?
Type1:
<hostdev mode='subsystem' type='vhost-user'>
  <source protocol='vhost-user-scsi' path='/tmp/vhost-scsi.sock'></source>
  <alias name="vhost-user-scsi-disk1"/>
</hostdev>
Type2:
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <source protocol='vhost-user' path='/tmp/vhost-scsi.sock'>
  </source>
  <target dev='sdb' bus='vhost-user-scsi'/>
  <boot order='3'/>
  <alias name='scsi0-0-0-1'/>
  <address type='drive' controller='0' bus='0' target='0' unit='1'/>
</disk>
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <source protocol='vhost-user' path='/tmp/vhost-blk.sock'>
  </source>
  <target dev='vda' bus='vhost-user-blk'/>
  <boot order='1'/>
  <alias name='virtio-disk0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</disk>
Could anyone give some suggestions?
Thanks,
Feng Li
--
The SmartX email address is only for business purpose. Any sent message
that is not related to the business is not authorized or permitted by
SmartX.
This mailbox is the work mailbox of Beijing SmartX Inc. (SmartX). Any email sent from this mailbox that is not work-related has not received any express or implied authorization from the company.
[libvirt-users] About vhost-user-blk support
by Su Hua
Hi everyone, I have a question: which libvirt version fully supports the QEMU device type implemented in hw/block/vhost-user-blk.c? If it is supported, what should the format of the XML file look like?
Regards
Su
[libvirt-users] Recover snapshots from qcow images
by Petr Stodulka
Hi guys,
I had to move to a new laptop a week ago and I messed up the migration of my virtual machines. I recovered the virtual machines on the new laptop (virsh define) using the backed-up XML files, but I am missing the files with the snapshot metadata. The original storage has already been wiped, so I cannot retrieve those files anymore.
Using qemu-img info I can see my snapshots inside the qcow images, but libvirt doesn't know about them:
###########################################
# virsh snapshot-list rhlvm
Name Creation Time State
-------------------------------
# qemu-img info rhlvm.qcow2
image: rhlvm.qcow2
file format: qcow2
virtual size: 25G (26843545600 bytes)
disk size: 2.9G
cluster_size: 65536
Snapshot list:
ID TAG VM SIZE DATE VM CLOCK
1 prepared 0 2018-09-05 11:06:06 00:00:00.000
Format specific information:
compat: 1.1
lazy refcounts: true
refcount bits: 16
corrupt: false
###########################################
Is there any nice way to regenerate the snapshot metadata for libvirt from the data inside the qcow images? I have a bunch of VMs, so if there is a nice way to recover this data, you will make me really happy :)
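For what it's worth, internal-snapshot metadata can usually be re-registered with "virsh snapshot-create --redefine" from hand-written snapshot XML. A rough sketch for the 'prepared' snapshot above (the <state> value is a guess based on "VM SIZE 0", and newer libvirt may additionally want the full <domain> XML embedded in the snapshot XML):

# rebuild minimal snapshot metadata and hand it back to libvirt
SNAP_TIME=$(date -d '2018-09-05 11:06:06' +%s)   # creation time reported by qemu-img
cat > prepared-snap.xml <<EOF
<domainsnapshot>
  <name>prepared</name>
  <state>shutoff</state>
  <creationTime>${SNAP_TIME}</creationTime>
</domainsnapshot>
EOF
virsh snapshot-create rhlvm prepared-snap.xml --redefine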
Thanks,
--
Petr Stodulka
OS & Application Modernization
IRC nicks: pstodulk, skytak
Software Engineer
Red Hat Czech s.r.o.
[libvirt-users] virsh list -all zombies
by ofoerster@posteo.de
Hello Community,
I have a problem with two deleted KVMs. So far we have done everything
that is necessary:
virsh undefine [kvm-name] --managed-save --snapshots-metadata
--remove-all-storage --nvram
virsh destroy [kvm-name]
and we deleted the XML file and the .img file.
If we now call "virsh list -all", then the machines are also gone.
However, after a restart of libvirtd they reappear in the list, like
zombies.
What did we do wrong? Or is that even a bug?
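Just a guess at the cause, but it may be worth checking whether stale persistent definitions or saved state are still on disk where libvirtd re-reads them at startup, e.g.:

# persistent domain definitions the qemu driver reloads on libvirtd restart
ls -l /etc/libvirt/qemu/*.xml
# managed-save images, if any
ls -l /var/lib/libvirt/qemu/save/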
Kind regards,
Oliver
[libvirt-users] [libvirtd] qemu_process: reset CPU affinity to all enabled CPUs, when runs in custom cpuset
by Valentina Krasnobaeva
Hello All,
Since version 4.5.0-23.el7 (Red Hat 7.7), when I launch a pinned VM, libvirtd resets the CPU affinity to all CPUs enabled on the host if it runs in a custom cpuset.
I cannot reproduce this behavior with 4.5.0-10.el7_6.12 on the same kernel version (Red Hat 7.7).
Libvirtd runs in a custom cpuset 'libvirt', where the set of available CPUs is restricted to 0,2,4,6,8.
This 'libvirt' cpuset was created on a system with 40 CPUs in total (all CPUs enabled in the BIOS), and /sys/fs/cgroup/cpuset/cpuset.cpus contains the range '0-39'.
When a VM with pinned vCPUs is launched (the full VM XML config is in the attachment):
<vcpu placement='static'>2</vcpu>
<vcpupin vcpu='0' cpuset='4'/>
<vcpupin vcpu='1' cpuset='6'/>
I have the following error:
# virsh create /tmp/vm1_2vms-2cores_host1.xml
error: Failed to create domain from /tmp/vm1_2vms-2cores_host1.xml
error: Unable to write to
'/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/emulator/cpuset.cpus':
Permission denied
In the debug log I can see that /sys/fs/cgroup/cpuset/machine.slice was created with the proper cpuset list: 0,2,4,6,8.
At first this list was successfully inherited and set in
/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/emulator/cpuset.cpus,
but then libvirtd tries to overwrite the inherited cpus and set "0-39" there:
2019-08-28 15:11:03.357+0000: 25536: debug : virCgroupDetect:747 :
Detected mount/mapping 0:cpu at /sys/fs/cgroup/cpu,cpuacct in
/machine.slice/machine-qemu\x2d4\x2dvm12vms\x2d2coreshost1.scope/emulator
for pid -1
2019-08-28 15:11:03.357+0000: 25536: debug : virCgroupDetect:747 :
Detected mount/mapping 1:cpuacct at /sys/fs/cgroup/cpu,cpuacct in
/machine.slice/machine-qemu\x2d4\x2dvm12vms\x2d2coreshost1.scope/emulator
for pid -1
2019-08-28 15:11:03.357+0000: 25536: debug : virCgroupDetect:747 :
Detected mount/mapping 2:cpuset at /sys/fs/cgroup/cpuset in
/machine.slice/machine-qemu\x2d4\x2dvm12vms\x2d2coreshost1.scope/emulator
for pid -1
2019-08-28 15:11:03.357+0000: 25536: debug : virCgroupMakeGroup:1049 :
Make group /machine.slice/machine-qemu\x2d4\x2dvm12vms\x2d2coreshost1.scope/emulator
2019-08-28 15:11:03.357+0000: 25536: debug : virCgroupMakeGroup:1073 :
Make controller
/sys/fs/cgroup/cpu,cpuacct/machine.slice/machine-qemu\x2d4\x2dvm12vms\x2d2coreshost1.scope/emulator/
2019-08-28 15:11:03.357+0000: 25536: debug : virCgroupMakeGroup:1073 :
Make controller
/sys/fs/cgroup/cpu,cpuacct/machine.slice/machine-qemu\x2d4\x2dvm12vms\x2d2coreshost1.scope/emulator/
2019-08-28 15:11:03.357+0000: 25536: debug : virCgroupMakeGroup:1073 :
Make controller
/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d4\x2dvm12vms\x2d2coreshost1.scope/emulator/
2019-08-28 15:11:03.357+0000: 25536: debug :
virCgroupCpuSetInherit:989 : Setting up inheritance
/machine.slice/machine-qemu\x2d4\x2dvm12vms\x2d2coreshost1.scope ->
/machine.slice/machine-qemu\x2d4\x2dvm12vms\x2d2coreshost1.scope/emulator
2019-08-28 15:11:03.357+0000: 25536: debug : virCgroupGetValueStr:832
: Get value /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d4\x2dvm12vms\x2d2coreshost1.scope/cpuset.cpus
2019-08-28 15:11:03.357+0000: 25536: debug : virFileClose:111 : Closed fd 27
2019-08-28 15:11:03.357+0000: 25536: debug :
virCgroupCpuSetInherit:999 : Inherit cpuset.cpus = 0,2,4,6,8
2019-08-28 15:11:03.357+0000: 25536: debug : virCgroupSetValueStr:796
: Set value '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d4\x2dvm12vms\x2d2coreshost1.scope/emulator/cpuset.cpus'
to '0,2,4,6,8'
2019-08-28 15:11:03.358+0000: 25536: debug : virFileClose:111 : Closed fd 27
2019-08-28 15:11:03.358+0000: 25536: debug : virCgroupGetValueStr:832
: Get value /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d4\x2dvm12vms\x2d2coreshost1.scope/cpuset.mems
2019-08-28 15:11:03.358+0000: 25536: debug : virFileClose:111 : Closed fd 27
2019-08-28 15:11:03.358+0000: 25536: debug :
virCgroupCpuSetInherit:999 : Inherit cpuset.mems = 0-1
2019-08-28 15:11:03.358+0000: 25536: debug : virCgroupSetValueStr:796
: Set value '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d4\x2dvm12vms\x2d2coreshost1.scope/emulator/cpuset.mems'
to '0-1'
2019-08-28 15:11:03.358+0000: 25536: debug : virFileClose:111 : Closed fd 27
2019-08-28 15:11:03.358+0000: 25536: debug : virCgroupGetValueStr:832
: Get value /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d4\x2dvm12vms\x2d2coreshost1.scope/cpuset.memory_migrate
2019-08-28 15:11:03.358+0000: 25536: debug : virFileClose:111 : Closed fd 27
2019-08-28 15:11:03.358+0000: 25536: debug :
virCgroupCpuSetInherit:999 : Inherit cpuset.memory_migrate = 1
2019-08-28 15:11:03.358+0000: 25536: debug : virCgroupSetValueStr:796
: Set value '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d4\x2dvm12vms\x2d2coreshost1.scope/emulator/cpuset.memory_migrate'
to '1'
2019-08-28 15:11:03.358+0000: 25536: debug : virFileClose:111 : Closed fd 27
2019-08-28 15:11:03.358+0000: 25536: debug : virCgroupMakeGroup:1062 :
Skipping unmounted controller memory
2019-08-28 15:11:03.358+0000: 25536: debug : virCgroupMakeGroup:1062 :
Skipping unmounted controller devices
2019-08-28 15:11:03.358+0000: 25536: debug : virCgroupMakeGroup:1062 :
Skipping unmounted controller freezer
2019-08-28 15:11:03.358+0000: 25536: debug : virCgroupMakeGroup:1062 :
Skipping unmounted controller blkio
2019-08-28 15:11:03.358+0000: 25536: debug : virCgroupMakeGroup:1062 :
Skipping unmounted controller net_cls
2019-08-28 15:11:03.358+0000: 25536: debug : virCgroupMakeGroup:1062 :
Skipping unmounted controller perf_event
2019-08-28 15:11:03.358+0000: 25536: debug : virCgroupMakeGroup:1055 :
Not creating systemd controller group
2019-08-28 15:11:03.358+0000: 25536: debug : virCgroupMakeGroup:1126 :
Done making controllers for group
2019-08-28 15:11:03.358+0000: 25536: debug : virCgroupSetValueStr:796
: Set value '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d4\x2dvm12vms\x2d2coreshost1.scope/emulator/cpuset.cpus'
to '0-39'
2019-08-28 15:11:03.358+0000: 25536: debug : virFileClose:111 : Closed fd 27
2019-08-28 15:11:03.358+0000: 25536: error : virCgroupSetValueStr:806
: Unable to write to
'/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d4\x2dvm12vms\x2d2coreshost1.scope/emulator/cpuset.cpus':
Permission denied
The same log snippet with version 4.5.0-10.el7_6.12, where everything works well:
2019-08-28 16:13:22.837+0000: 26937: debug : virCgroupDetect:747 :
Detected mount/mapping 0:cpu at /sys/fs/cgroup/cpu,cpuacct in
/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/vcpu1
for pid -1
2019-08-28 16:13:22.837+0000: 26937: debug : virCgroupDetect:747 :
Detected mount/mapping 1:cpuacct at /sys/fs/cgroup/cpu,cpuacct in
/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/vcpu1
for pid -1
2019-08-28 16:13:22.837+0000: 26937: debug : virCgroupDetect:747 :
Detected mount/mapping 2:cpuset at /sys/fs/cgroup/cpuset in
/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/vcpu1
for pid -1
2019-08-28 16:13:22.837+0000: 26937: debug : virCgroupMakeGroup:1049 :
Make group /machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/vcpu1
2019-08-28 16:13:22.837+0000: 26937: debug : virCgroupMakeGroup:1073 :
Make controller
/sys/fs/cgroup/cpu,cpuacct/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/vcpu1/
2019-08-28 16:13:22.837+0000: 26937: debug : virCgroupMakeGroup:1073 :
Make controller
/sys/fs/cgroup/cpu,cpuacct/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/vcpu1/
2019-08-28 16:13:22.837+0000: 26937: debug : virCgroupMakeGroup:1073 :
Make controller
/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/vcpu1/
2019-08-28 16:13:22.838+0000: 26937: debug :
virCgroupCpuSetInherit:989 : Setting up inheritance
/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope ->
/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/vcpu1
2019-08-28 16:13:22.838+0000: 26937: debug : virCgroupGetValueStr:832
: Get value /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/cpuset.cpus
2019-08-28 16:13:22.838+0000: 26937: debug : virFileClose:111 : Closed fd 29
2019-08-28 16:13:22.838+0000: 26937: debug :
virCgroupCpuSetInherit:999 : Inherit cpuset.cpus = 0,2,4,6,8
2019-08-28 16:13:22.838+0000: 26937: debug : virCgroupSetValueStr:796
: Set value '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/vcpu1/cpuset.cpus'
to '0,2,4,6,8'
2019-08-28 16:13:22.838+0000: 26937: debug : virFileClose:111 : Closed fd 29
2019-08-28 16:13:22.838+0000: 26937: debug : virCgroupGetValueStr:832
: Get value /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/cpuset.mems
2019-08-28 16:13:22.838+0000: 26937: debug : virFileClose:111 : Closed fd 29
2019-08-28 16:13:22.838+0000: 26937: debug :
virCgroupCpuSetInherit:999 : Inherit cpuset.mems = 0-1
2019-08-28 16:13:22.838+0000: 26937: debug : virCgroupSetValueStr:796
: Set value '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/vcpu1/cpuset.mems'
to '0-1'
2019-08-28 16:13:22.838+0000: 26937: debug : virFileClose:111 : Closed fd 29
2019-08-28 16:13:22.838+0000: 26937: debug : virCgroupGetValueStr:832
: Get value /sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/cpuset.memory_migrate
2019-08-28 16:13:22.838+0000: 26937: debug : virFileClose:111 : Closed fd 29
2019-08-28 16:13:22.838+0000: 26937: debug :
virCgroupCpuSetInherit:999 : Inherit cpuset.memory_migrate = 1
2019-08-28 16:13:22.838+0000: 26937: debug : virCgroupSetValueStr:796
: Set value '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/vcpu1/cpuset.memory_migrate'
to '1'
2019-08-28 16:13:22.838+0000: 26937: debug : virFileClose:111 : Closed fd 29
2019-08-28 16:13:22.838+0000: 26937: debug : virCgroupMakeGroup:1062 :
Skipping unmounted controller memory
2019-08-28 16:13:22.838+0000: 26937: debug : virCgroupMakeGroup:1062 :
Skipping unmounted controller devices
2019-08-28 16:13:22.838+0000: 26937: debug : virCgroupMakeGroup:1062 :
Skipping unmounted controller freezer
2019-08-28 16:13:22.838+0000: 26937: debug : virCgroupMakeGroup:1062 :
Skipping unmounted controller blkio
2019-08-28 16:13:22.838+0000: 26937: debug : virCgroupMakeGroup:1062 :
Skipping unmounted controller net_cls
2019-08-28 16:13:22.838+0000: 26937: debug : virCgroupMakeGroup:1062 :
Skipping unmounted controller perf_event
2019-08-28 16:13:22.838+0000: 26937: debug : virCgroupMakeGroup:1055 :
Not creating systemd controller group
2019-08-28 16:13:22.838+0000: 26937: debug : virCgroupMakeGroup:1126 :
Done making controllers for group
2019-08-28 16:13:22.838+0000: 26937: debug : virCgroupSetValueStr:796
: Set value '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/vcpu1/cpuset.mems'
to '0'
2019-08-28 16:13:22.838+0000: 26937: debug : virFileClose:111 : Closed fd 29
2019-08-28 16:13:22.838+0000: 26937: debug : virCgroupSetValueStr:796
: Set value '/sys/fs/cgroup/cpu,cpuacct/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/vcpu1/tasks'
to '27045'
Could someone please give me a hint whether we need to change the NUMA or vCPU settings in the VM XML config according to the latest changes in libvirtd?
Please, find a bug report about this issue here:
https://bugzilla.redhat.com/show_bug.cgi?id=1746517
I suppose that this may be the impact of one of the following patches applied to the 4.5.0-10.el7_6.12 source code:
libvirt-qemu-Rework-setting-process-affinity.patch
libvirt-qemu-Fix-qemuProcessInitCpuAffinity.patch
libvirt-qemu-Fix-NULL-pointer-access-in-qemuProcessInitCpuAffinity.patch
libvirt-qemu-Fix-leak-in-qemuProcessInitCpuAffinity.patch
How reproducible: Always
Steps to Reproduce:
1. create a cpuset with a reduced CPU list and launch libvirtd in it (see the sketch after these steps)
2. prepare VM XML with pinned vcpus and <numatune> setting
3. virsh create vm.xml
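For step 1, a rough sketch of how such a restricted cpuset can be prepared with cgroup v1 (mount point and CPU/memory-node lists taken from the report above; adjust to your system):

# create the 'libvirt' cpuset and restrict it
mkdir -p /sys/fs/cgroup/cpuset/libvirt
echo 2,4,6,8,10 > /sys/fs/cgroup/cpuset/libvirt/cpuset.cpus
echo 0-1 > /sys/fs/cgroup/cpuset/libvirt/cpuset.mems
# move the libvirtd main process into the cpuset
echo "$(pidof libvirtd)" > /sys/fs/cgroup/cpuset/libvirt/tasks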
Actual results:
error: Failed to create domain from /tmp/vm1_2vms-2cores_host1.xml
error: Unable to write to
'/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d1\x2dvm12vms\x2d2coreshost1.scope/emulator/cpuset.cpus':
Permission denied
Expected results:
It should work as with the previous version, 4.5.0-10.el7_6.12.
Additional info:
# ps auxf | grep libvirt
root 24571 2.3 0.0 1895564 28052 ? Ssl 16:52 0:00
/usr/sbin/libvirtd
# grep Cpus_allowed /proc/24571/task/*/status
/proc/24571/task/24571/status:Cpus_allowed: 00,00000554
/proc/24571/task/24571/status:Cpus_allowed_list: 2,4,6,8,10
/proc/24571/task/24572/status:Cpus_allowed: 00,00000554
/proc/24571/task/24572/status:Cpus_allowed_list: 2,4,6,8,10
/proc/24571/task/24573/status:Cpus_allowed: 00,00000554
/proc/24571/task/24573/status:Cpus_allowed_list: 2,4,6,8,10
/proc/24571/task/24574/status:Cpus_allowed: 00,00000554
/proc/24571/task/24574/status:Cpus_allowed_list: 2,4,6,8,10
/proc/24571/task/24575/status:Cpus_allowed: 00,00000554
..
for all threads
Processor: "Intel(R) Xeon(R) CPU E5-2630 v4 @ 2.20GHz"
Sockets: 2, Cores: 20, HyperThreads: 40
socket 0 socket 1
+---------------------+ +---------------------+
| c0 c1 | | c0 c1 |
| +-------+ +-------+ | | +-------+ +-------+ |
| | 0| 20| | 2| 22| | | | 1| 21| | 3| 23| |
| +-------+ +-------+ | | +-------+ +-------+ |
| c2 c3 | | c2 c3 |
| +-------+ +-------+ | | +-------+ +-------+ |
| | 4| 24| | 6| 26| | | | 5| 25| | 7| 27| |
| +-------+ +-------+ | | +-------+ +-------+ |
| c4 c8 | | c4 c8 |
| +-------+ +-------+ | | +-------+ +-------+ |
| | 8| 28| | 10| 30| | | | 9| 29| | 11| 31| |
| +-------+ +-------+ | | +-------+ +-------+ |
| c9 c10 | | c9 c10 |
| +-------+ +-------+ | | +-------+ +-------+ |
| | 12| 32| | 14| 34| | | | 13| 33| | 15| 35| |
| +-------+ +-------+ | | +-------+ +-------+ |
| c11 c12 | | c11 c12 |
| +-------+ +-------+ | | +-------+ +-------+ |
| | 16| 36| | 18| 38| | | | 17| 37| | 19| 39| |
| +-------+ +-------+ | | +-------+ +-------+ |
+---------------------+ +---------------------+
# cat /sys/fs/cgroup/cpuset/cpuset.cpus
0-39
# cat /sys/fs/cgroup/cpuset/libvirt/cpuset.cpus
2,4,6,8,10
# cat /sys/fs/cgroup/cpuset/machine.slice/cpuset.cpus
0,2,4,6,8
[libvirt-users] RLIMIT_MEMLOCK in container environment
by Ihar Hrachyshka
Hi all,
KubeVirt uses libvirtd to manage qemu VMs represented as Kubernetes
API resources. In this case, libvirtd is running inside an
unprivileged pod, with some host mounts / capabilities added to the
pod, needed by libvirtd and other services.
One of the capabilities libvirtd requires for successful startup
inside a pod is SYS_RESOURCE. This capability is used to adjust the RLIMIT_MEMLOCK ulimit value depending on the devices attached to the managed guest, both on startup and during hotplug. AFAIU, the need to lock the memory is to avoid pages being pushed out from RAM into swap.
In KubeVirt world, several libvirtd assumptions do not apply:
1. In Kubernetes environments, swap is usually disabled. (For example, the official kubeadm deployment tool won't even initialize a cluster until you disable it.) This is documented in lots of places, e.g.:
https://docs.platform9.com/kubernetes/disabling-swap-kubernetes-node/
(note: while these are vendor docs, it is nonetheless a well-known community recommendation.)
2. Hotplug is not supported; the domain definition is stable throughout its whole lifetime.
We are working on a series of patches that would remove the need for
SYS_RESOURCE capability from the pod running libvirtd:
https://github.com/kubevirt/kubevirt/pull/2584
We achieve this by having another, *privileged* component set RLIMIT_MEMLOCK for the libvirtd process using the prlimit() syscall, with a value that is higher than the final value libvirtd sets with setrlimit() (the Linux kernel allows lowering the value without the capability). Since the formula used to calculate the actual MEMLOCK value is embedded in libvirt and is not simple to reproduce outside of it, we pick the upper limit for the libvirtd process quite conservatively, even though ideally we would use exactly the same value libvirtd would. The estimation code is here:
https://github.com/kubevirt/kubevirt/pull/2584/files#diff-6edccf5f0d11c09...
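Purely as an illustration of the mechanism (not the actual KubeVirt code), the privileged side can do this with the util-linux prlimit tool; the 2 GiB figure below is just a placeholder for the conservative upper bound mentioned above:

# raise RLIMIT_MEMLOCK (soft:hard, in bytes) of the running libvirtd process
LIBVIRTD_PID=$(pidof libvirtd)
prlimit --pid "$LIBVIRTD_PID" --memlock=2147483648:2147483648
# verify the new limit
grep 'Max locked memory' /proc/"$LIBVIRTD_PID"/limits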
While the solution works, there are some drawbacks:
1. the value we use for prlimit() is not exactly equal to the final
value used by libvirtd;
2. we are doing all this work in an environment that is not prone to these issues in the first place, because swap is disabled.
I believe we would benefit from one of the following features on
libvirt side (or both):
a) expose the memory lock value calculated by libvirtd through the libvirt API so that we can use it when calling prlimit() on the libvirtd process;
b) allow disabling setrlimit() calls via a libvirtd config file knob or the domain definition.
Do you think it would be acceptable to have one of these enhancements
in libvirtd, or perhaps both, for degenerate cases like KubeVirt?
Thanks for attention,
Ihar