After adding the memoryBacking element to the domain XML as shown below (in addition to the other XML changes needed to add the nvdimm), virsh could allocate AD memory larger than the system RAM, and the VM started successfully.
<memoryBacking>
<access mode='shared'/>
<discard/>
</memoryBacking>
This adds share=yes to the qemu command line:
-object memory-backend-file,id=memnvdimm0,prealloc=yes,mem-path=/mnt/pmem0/file1,share=yes,size=414464344064 -device nvdimm,node=0,label-size=131072,memdev=memnvdimm0,id=nvdimm0,slot=0
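The size= and label-size= values above are plain byte counts, which makes them hard to compare by eye with the size=386G used in the hand-written command line. The following is only a quick arithmetic sketch; the variable names are made up for illustration, and the file size is the byte count that dd reported in the quoted message further down.

```python
GIB = 2**30
MIB = 2**20
KIB = 2**10

backend_size = 414464344064  # size= of the memory-backend-file object
label_size = 131072          # label-size= of the nvdimm device
file_size = 414982057984     # bytes dd reported writing to /mnt/pmem0/file1

backend_gib = backend_size / GIB               # backend size in GiB
label_kib = label_size / KIB                   # label area in KiB
slack_mib = (file_size - backend_size) / MIB   # file space left beyond the backend

print(f"backend: {backend_gib} GiB, label: {label_kib} KiB, slack: {slack_mib:.1f} MiB")
```

So the backend generated by libvirt is exactly 386 GiB, the same as size=386G, and it fits inside the file that dd created with roughly half a GiB to spare.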
For reference, here is the qemu command line with which the VM starts quickly:
qemu-system-x86_64 \
-name qemu-gold29 \
-drive file=/var/lib/libvirt/images/gold29-ad.qcow2,format=qcow2,index=0,media=disk \
-m 2G,slots=4,maxmem=428G \
-smp 2 \
-machine pc,accel=kvm,nvdimm=on \
-enable-kvm \
-object memory-backend-file,id=pmem1,share=on,mem-path=/mnt/pmem0/file1,size=386G,align=4K \
-device nvdimm,memdev=pmem1,id=nv1 \
-daemonize
Qemu command line generated by virsh (please note that the VM now starts with this command line, which has share=yes):
/usr/bin/qemu-system-x86_64 -machine accel=kvm -name guest=mix-test,debug-threads=on -S -object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-18-mix-test/master-key.aes -machine pc-i440fx-3.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off,nvdimm=on -cpu Skylake-Server-IBRS,hypervisor=on -m size=2097152k,slots=16,maxmem=419430400k -realtime mlock=off -smp 2,sockets=2,cores=1,threads=1 -object memory-backend-file,id=ram-node0,mem-path=/var/lib/libvirt/qem /ram/libvirt/qemu/18-mix-test/ram-node0,discard-data=yes,share=yes,size=2147483648 -numa node,nodeid=0,cpus=0-1,memdev=ram-node0 -object memory-backend-file,id=memnvdimm0,prealloc=yes,mem-path=/mnt/pmem0/file1,share=yes,size=414464344064 -device nvdimm,node=0,label-size=131072,memdev=memnvdimm0,id=nvdimm0,slot=0 -uuid 318c0529-0330-460b-8d0a-3b253e9decdd -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=32,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5 -device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1 -device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2 -device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive file=/var/lib/libvirt/images/mix-test.qcow2,format=qcow2,if=none,id=drive-virtio-disk0 -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1 -drive if=none,id=drive-ide0-0-0,readonly=on -device ide-cd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev tap,fd=34,id=hostnet0 -device e1000,netdev=hostnet0,id=net0,mac=52:54:00:06:db:55,bus=pci.0,addr=0xa -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -chardev 
socket,id=charchannel0,fd=35,server,nowait -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0 -chardev spicevmc,id=charchannel1,name=vdagent -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0 -device usb-tablet,id=input0,bus=usb.0,port=1 -spice port=5901,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on -device qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2 -device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev spicevmc,id=charredir0,name=usbredir -device usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev spicevmc,id=charredir1,name=usbredir -device usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -object rng-random,id=objrng0,filename=/dev/urandom -device virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x9 -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on
The VM still takes longer to start through virsh than with the plain qemu command line. Any thoughts? Why is prealloc=yes on by default for nvdimm? Are there any other important deltas?
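One way to look for deltas is to compare the memory-backend-file properties of the two command lines mechanically. The sketch below is only an illustration: the helper names (parse_size, backend_props, normalize) are made up, the two strings are just the NVDIMM-related excerpts copied from the command lines above, and it assumes that on/yes and off/no are equivalent spellings and that size suffixes are binary units.

```python
import shlex

# Binary size suffixes, as used in size=386G and align=4K above.
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def parse_size(text):
    """Convert '386G' or '414464344064' to a byte count."""
    suffix = text[-1].upper()
    if suffix in UNITS:
        return int(text[:-1]) * UNITS[suffix]
    return int(text)


def backend_props(cmdline, backend_id):
    """Return the key=value properties of one memory-backend-file object."""
    args = shlex.split(cmdline)
    for flag, value in zip(args, args[1:]):
        if flag != "-object" or not value.startswith("memory-backend-file,"):
            continue
        props = dict(item.split("=", 1) for item in value.split(",")[1:])
        if props.get("id") == backend_id:
            return props
    raise KeyError(backend_id)


def normalize(props):
    """Make equivalent spellings comparable (on/yes, 386G/bytes); drop the id."""
    out = {}
    for key, value in props.items():
        if key == "id":
            continue
        if key in ("size", "align"):
            out[key] = parse_size(value)
        elif value in ("on", "yes"):
            out[key] = True
        elif value in ("off", "no"):
            out[key] = False
        else:
            out[key] = value
    return out


manual = ("-object memory-backend-file,id=pmem1,share=on,"
          "mem-path=/mnt/pmem0/file1,size=386G,align=4K "
          "-device nvdimm,memdev=pmem1,id=nv1")
libvirt = ("-object memory-backend-file,id=memnvdimm0,prealloc=yes,"
           "mem-path=/mnt/pmem0/file1,share=yes,size=414464344064 "
           "-device nvdimm,node=0,label-size=131072,memdev=memnvdimm0,"
           "id=nvdimm0,slot=0")

a = normalize(backend_props(manual, "pmem1"))
b = normalize(backend_props(libvirt, "memnvdimm0"))
delta = {k: (a.get(k), b.get(k))
         for k in sorted(set(a) | set(b)) if a.get(k) != b.get(k)}
print(delta)
```

For these two excerpts the backends differ only in align (set only in the hand-written line) and prealloc (set only by libvirt); share, mem-path and size match once normalized. The sketch does not look at the -device nvdimm options, where libvirt additionally passes node=0, label-size=131072 and slot=0.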
On 8/31/19 7:33 PM, Seema Pandit wrote:
> Hi Michal,
> Thank you for the reply.
> I was having issues compiling the qemu code on Fedora 29. So instead of
> dropping prealloc in virsh, I tried adding prealloc=yes to the qemu command
> line. prealloc=yes works. It does not lead to using more system memory when
> using DAX.
> +Dan
> Here are the steps:
>
> ndctl create-namespace -t pmem -m fsdax --align=4k -s 400G
>
> mkfs.ext4 /dev/pmem0
>
> mount -o dax /dev/pmem0 /mnt/pmem0
>
> dd if=/dev/zero of=/mnt/pmem0/file1 bs=4k count=104857600
>
> [root@system-name]# dd if=/dev/zero of=/mnt/pmem0/file1 bs=4k count=104857600
>
> dd: error writing '/mnt/pmem0/file1': No space left on device
>
> 101313980+0 records in
>
> 101313979+0 records out
>
> 414982057984 bytes (415 GB, 386 GiB) copied, 946.495 s, 438 MB/s
>
>
> The file created is slightly smaller than requested.
>
> [root@system-name]# du -sh
>
> 387G .
>
>
> Sample qemu command line which works:
>
> qemu-system-x86_64 \
>
> -name test \
>
> -drive file=/var/lib/libvirt/images/test-ad.qcow2,format=qcow2,index=0,media=disk \
> -m 2G,slots=4,maxmem=428G \
> -smp 2 \
> -machine pc,accel=kvm,nvdimm=on \
> -enable-kvm \
> -object memory-backend-file,id=pmem1,prealloc=yes,share=on,mem-path=/mnt/pmem0/file1,size=386G,align=4K \
> -device nvdimm,memdev=pmem1,id=nv1 \
> -daemonize
Looks like the only difference between the cmd line generated by libvirt
and this one is align=4K, then. To confirm that, can you please share the
full qemu cmd line as generated by libvirt?
Libvirt does not do anything special with guest memory, so this is a
matter of the qemu cmd line, and that's why we need to see what's
different, what works and what doesn't. Then we will have some lead to
understand the problem, IMO.
Michal