[libvirt] Problem with setting up KVM guests to use HugePages

I configured hugepages and then started a virtual guest via "virsh start". However, the guest failed to use hugepages even though they are configured.

The initial usage of system memory:

  [root@local ~]# free
                total        used        free      shared  buff/cache   available
  Mem:      263767352     1900628   261177892        9344      688832   261431872
  Swap:       4194300           0     4194300

After configuring hugepages:

  [root@local ~]# cat /proc/meminfo | grep uge
  AnonHugePages:      2048 kB
  HugePages_Total:     140
  HugePages_Free:      140
  HugePages_Rsvd:        0
  HugePages_Surp:        0
  Hugepagesize:    1048576 kB

  [root@local ~]# mount | grep huge
  cgroup on /sys/fs/cgroup/hugetlb type cgroup (rw,nosuid,nodev,noexec,relatime,hugetlb)
  hugetlbfs on /dev/hugepages type hugetlbfs (rw,relatime)

On the guest side, its config xml contains:

  <domain type='kvm'>
    <name>test-01</name>
    <uuid>e1b72349-4a0b-4b91-aedc-fd34e92251e4</uuid>
    <description>test-01</description>
    <memory unit='KiB'>134217728</memory>
    <currentMemory unit='KiB'>134217728</currentMemory>
    <memoryBacking>
      <hugepages/>
      <nosharepages/>
    </memoryBacking>

I tried to "virsh start" the guest, but it failed even though the hugepage memory was sufficient (140 x 1 GiB reserved vs. 128 GiB requested):

  virsh # start test-01
  error: Failed to start domain test-01
  error: internal error: early end of file from monitor: possible problem:
  Cannot set up guest memory 'pc.ram': Cannot allocate memory

However, if I decrease the number of hugepages (say, to 100) so that the memory left outside the hugepage pool is more than the guest requires, the guest can start, but its memory is not allocated from hugepages:

  [root@local ~]# cat /proc/meminfo | grep uge
  AnonHugePages:  134254592 kB
  HugePages_Total:     100
  HugePages_Free:      100
  HugePages_Rsvd:        0
  HugePages_Surp:        0
  Hugepagesize:    1048576 kB

In summary, although hugepages are configured, the guest does not seem to be instructed to use them; instead, it is allocated memory from the pool left over by hugepages.
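[For reference, a minimal sketch of how a 1 GiB hugepage pool like the one above is typically reserved and verified; the counts mirror the output above, and the exact boot procedure is an assumption about this host, not something stated in the thread:

  # 1 GiB pages usually must be reserved at boot on x86; kernel cmdline:
  #   default_hugepagesz=1G hugepagesz=1G hugepages=140
  # Verify the pool from the host:
  cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
  # Recent libvirt can also report free pages per page size:
  virsh freepages --all
]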

On Tue, Jun 09, 2015 at 05:52:04AM +0000, Vivi L wrote:
> Configure hugepages and then start virtual guest via "virsh start".
> However, virtual guest failed to use hugepages although it's configured.
>
> [...]
>
> On the guest side, its config xml contains
>
>   <memory unit='KiB'>134217728</memory>
>   <currentMemory unit='KiB'>134217728</currentMemory>
>   <memoryBacking>
>     <hugepages/>
You might want to re-test by explicitly setting the 'page' element and
'size' attribute. From my test, I had something like this:

  $ virsh dumpxml f21-vm | grep hugepages -B3 -A2
    <memory unit='KiB'>2000896</memory>
    <currentMemory unit='KiB'>2000000</currentMemory>
    <memoryBacking>
      <hugepages>
        <page size='2048' unit='KiB' nodeset='0'/>
      </hugepages>
    </memoryBacking>
    <vcpu placement='static'>8</vcpu>

I haven't tested this exhaustively, but some basic test notes are here:

  https://kashyapc.fedorapeople.org/virt/test-hugepages-with-libvirt.txt
>     <nosharepages/>
>   </memoryBacking>
>
> [...]
-- /kashyap
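[A quick way to confirm whether a running guest actually got hugepage backing (a sketch; the qemu process name and pgrep pattern vary by distro and are assumptions here):

  # HugePages_Free should drop by roughly the guest size once it is running:
  grep -E 'HugePages_(Total|Free|Rsvd)' /proc/meminfo
  # hugetlbfs-backed mappings of the qemu process are flagged in numa_maps:
  grep huge /proc/$(pgrep -f qemu | head -1)/numa_maps
]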

Kashyap Chamarthy <kchamart <at> redhat.com> writes:
> You might want to re-test by explicitly setting the 'page' element and
> 'size' attribute. From my test, I had something like this:
>
> [...]
Current QEMU does not support setting the <page> element. Could it be the
cause of my aforementioned problem?

  unsupported configuration: huge pages per NUMA node are not supported
  with this QEMU
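[Libvirt reports that error when the QEMU binary lacks the memory-backend-file object, which, to the best of my knowledge, was only added in QEMU 2.1; the pc-i440fx-rhel7.0.0 machine type seen later suggests RHEL 7.0's qemu-kvm 1.5.x, which predates it. A quick sanity check (assumed binary path):

  /usr/libexec/qemu-kvm -version
]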

On 10.06.2015 01:05, Vivi L wrote:
> Kashyap Chamarthy <kchamart <at> redhat.com> writes:
>
>> You might want to re-test by explicitly setting the 'page' element and
>> 'size' attribute. [...]
>
> Current QEMU does not support setting the <page> element. Could it be
> the cause of my aforementioned problem?
>
>   unsupported configuration: huge pages per NUMA node are not supported
>   with this QEMU
So this is the explanation why the memory for your guest is not backed by
hugepages. Can you please share the qemu command line that libvirt
generated for your guest?

Michal

Michal Privoznik <mprivozn <at> redhat.com> writes:
> On 10.06.2015 01:05, Vivi L wrote:
>> Kashyap Chamarthy <kchamart <at> redhat.com> writes:
>>> [...]
>>
>> Current QEMU does not support setting the <page> element. Could it be
>> the cause of my aforementioned problem?
>>
>>   unsupported configuration: huge pages per NUMA node are not
>>   supported with this QEMU
>
> So this is the explanation why the memory for your guest is not backed
> by hugepages.
I thought setting hugepages per NUMA node is a nice-to-have feature. Is it required to enable the use of hugepages for the guest?
> Can you please share the qemu command line that libvirt generated for
> your guest?
With 100 hugepages set, the guest can start while not using hugepages at
all. In this case:

  [root@local ~]# ps -ef | grep qemu
  qemu 3403 1 99 17:42 ? 00:36:42 /usr/libexec/qemu-kvm -name qvpc-di-03-sf -S \
    -machine pc-i440fx-rhel7.0.0,accel=kvm,usb=off -cpu host -m 131072 \
    -realtime mlock=off -smp 32,sockets=2,cores=16,threads=1 \
    -numa node,nodeid=0,cpus=0-15,mem=65536 \
    -numa node,nodeid=1,cpus=16-31,mem=65536 \
    -uuid e1b72349-4a0b-4b91-aedc-fd34e92251e4 -smbios type=1,serial=SCALE-SLOT-03 \
    -no-user-config -nodefaults \
    -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/qvpc-di-03-sf.monitor,server,nowait \
    -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc -no-shutdown \
    -boot strict=on -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 \
    -drive file=/var/lib/libvirt/images/asr5700/qvpc-di-03-sf-hda.img,if=none,id=drive-ide0-0-0,format=qcow2 \
    -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=2 \
    -drive if=none,id=drive-ide0-1-0,readonly=on,format=raw \
    -device ide-cd,bus=ide.1,unit=0,drive=drive-ide0-1-0,id=ide0-1-0,bootindex=1 \
    -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 \
    -chardev pty,id=charserial1 -device isa-serial,chardev=charserial1,id=serial1 \
    -vnc 127.0.0.1:0 -vga cirrus \
    -device i6300esb,id=watchdog0,bus=pci.0,addr=0x3 -watchdog-action reset \
    -device vfio-pci,host=08:00.0,id=hostdev0,bus=pci.0,addr=0x5 \
    -device vfio-pci,host=09:00.0,id=hostdev1,bus=pci.0,addr=0x6 \
    -device vfio-pci,host=0a:00.0,id=hostdev2,bus=pci.0,addr=0x7 \
    -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4 -msg timestamp=on
> Michal
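[One tell-tale sign in that command line: there is no hugepage backing at all. When QEMU guest RAM is backed by hugetlbfs, one would expect either the legacy -mem-path flag or memory-backend-file objects, roughly like this (a sketch; the paths and the 64 GiB per-node size are illustrative, matching the -numa options above):

  # Legacy whole-VM hugepage backing (what plain <hugepages/> generates):
  -mem-prealloc -mem-path /dev/hugepages/libvirt/qemu

  # Per-NUMA-node backing (needs memory-backend-file, QEMU >= 2.1):
  -object memory-backend-file,id=ram-node0,prealloc=yes,mem-path=/dev/hugepages/libvirt/qemu,size=68719476736 \
  -numa node,nodeid=0,cpus=0-15,memdev=ram-node0
]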

On Wed, Jun 10, 2015 at 09:20:40PM +0000, Vivi L wrote:
> Michal Privoznik <mprivozn <at> redhat.com> writes:
>> On 10.06.2015 01:05, Vivi L wrote:
>>> [...]
>>> Current QEMU does not support setting the <page> element. Could it
>>> be the cause of my aforementioned problem?
>>>
>>>   unsupported configuration: huge pages per NUMA node are not
>>>   supported with this QEMU
>>
>> So this is the explanation why the memory for your guest is not
>> backed by hugepages.
>
> I thought setting hugepages per NUMA node is a nice-to-have feature.
> Is it required to enable the use of hugepages for the guest?
No, it should not be mandatory. You should be able to use

  <memoryBacking>
    <hugepages/>
  </memoryBacking>

with pretty much any KVM/QEMU version that exists. If that's broken then
it's a libvirt bug.

Regards,
Daniel

--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 11.06.2015 10:13, Daniel P. Berrange wrote:
> On Wed, Jun 10, 2015 at 09:20:40PM +0000, Vivi L wrote:
>> [...]
>> I thought setting hugepages per NUMA node is a nice-to-have feature.
>> Is it required to enable the use of hugepages for the guest?
>
> No, it should not be mandatory. You should be able to use
>
>   <memoryBacking>
>     <hugepages/>
>   </memoryBacking>
>
> with pretty much any KVM/QEMU version that exists. If that's broken
> then it's a libvirt bug.
Unless hugepages are requested for guest NUMA nodes. In that case the
memory-backend-file object is required. From my investigation, this seems
to be the case here.

Michal
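[For context, a domain configured like the following makes libvirt generate memory-backend-file objects, and thus fails on QEMU older than 2.1. This is a sketch inferred from the -numa options in the qemu command line above (two cells of 64 GiB, i.e. 67108864 KiB each), not the poster's full XML:

  <memoryBacking>
    <hugepages/>
  </memoryBacking>
  <cpu>
    <numa>
      <cell id='0' cpus='0-15' memory='67108864'/>
      <cell id='1' cpus='16-31' memory='67108864'/>
    </numa>
  </cpu>
]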

On Thu, Jun 11, 2015 at 10:27:05AM +0200, Michal Privoznik wrote:
> On 11.06.2015 10:13, Daniel P. Berrange wrote:
>> [...]
>> No, it should not be mandatory. You should be able to use
>>
>>   <memoryBacking>
>>     <hugepages/>
>>   </memoryBacking>
>>
>> with pretty much any KVM/QEMU version that exists. If that's broken
>> then it's a libvirt bug.
>
> Unless hugepages are requested for guest NUMA nodes. In that case the
> memory-backend-file object is required. From my investigation, this
> seems to be the case here.
memory-backend-file should only be required if trying to set up different
hugepage configs for each guest NUMA node, or if trying to pin each guest
NUMA node to a different host node. If they just want hugepages across the
whole VM and no pinning, shouldn't the traditional setup work?

Regards,
Daniel
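[For the traditional whole-VM setup, libvirt must know where hugetlbfs is mounted; this is normally autodetected, but it can be pinned explicitly in qemu.conf (a sketch; the mount path is the one from the original report):

  # /etc/libvirt/qemu.conf
  hugetlbfs_mount = "/dev/hugepages"

  # then restart libvirtd and check that the guest gets -mem-path:
  systemctl restart libvirtd
  ps -ef | grep qemu-kvm | grep -o 'mem-path [^ ]*'
]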

On 11.06.2015 10:48, Daniel P. Berrange wrote:
> On Thu, Jun 11, 2015 at 10:27:05AM +0200, Michal Privoznik wrote:
>> [...]
>> Unless hugepages are requested for guest NUMA nodes. In that case the
>> memory-backend-file object is required. From my investigation, this
>> seems to be the case here.
>
> memory-backend-file should only be required if trying to set up
> different hugepage configs for each guest NUMA node, or if trying to
> pin each guest NUMA node to a different host node. If they just want
> hugepages across the whole VM and no pinning, shouldn't the
> traditional setup work?
Vivi L, now you see why you should never ever drop the list from CC. He
has sent me additional info with this snippet in the domain:

  <numatune>
    <memory mode='strict' nodeset='0-1'/>
  </numatune>

Therefore the memory object is required: we can't guarantee the placement
without it.

Michal
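[Summarizing the thread's diagnosis, as I read it: strict <numatune> placement combined with guest NUMA cells forces libvirt to use memory-backend-file, which this QEMU lacks, so the hugepage request is silently unusable. A plausible workaround on old QEMU, untested here, is to drop the per-node constraints so that plain <hugepages/> maps to the legacy -mem-path backing:

  <!-- remove <numatune> (and any per-node <page> elements), keep: -->
  <memoryBacking>
    <hugepages/>
  </memoryBacking>
  <!-- or upgrade to QEMU >= 2.1 so libvirt can emit memory-backend-file
       and honor the strict NUMA placement together with hugepages -->
]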
participants (4):
- Daniel P. Berrange
- Kashyap Chamarthy
- Michal Privoznik
- Vivi L