There is still an issue here. Starting another VM, and so using even
more pmem, now fails with a different error, copied below.
# virsh start manual_clone
error: Failed to start domain manual_clone
error: internal error: qemu unexpectedly closed the monitor: ftruncate:
Invalid argument
2019-09-03T21:28:41.031924Z qemu-system-x86_64: -object
memory-backend-file,id=memnvdimm0,prealloc=yes,mem-path=/dev/dax1.1,share=yes,size=73014444032:
os_mem_prealloc: Insufficient free host memory pages available to allocate
guest RAM
The old error was:
# virsh start test
error: Failed to start domain test
error: internal error: qemu unexpectedly closed the monitor: ftruncate:
Invalid argument
2019-08-22T04:16:08.744402Z qemu-system-x86_64: -object
memory-backend-file,id=memnvdimm0,prealloc=yes,mem-path=/dev/dax1.0,size=403726925824:
unable to map backing store for guest RAM: Cannot allocate memory
On Tue, Sep 3, 2019 at 9:53 AM Dan Williams <dan.j.williams(a)intel.com>
wrote:
On Mon, Sep 2, 2019 at 10:10 AM Seema Pandit
<pan.blr.17(a)gmail.com> wrote:
>
> After adding the memoryBacking tag to the XML as below (in addition to the
> other XML changes that add the nvdimm), virsh could allocate AD memory larger
> than the system RAM, and the VMs started successfully.
>
> <memoryBacking>
>   <access mode='shared'/>
>   <discard/>
> </memoryBacking>
>
> This adds share=yes to the command line:
>
> -object memory-backend-file,id=memnvdimm0,prealloc=yes,mem-path=/mnt/pmem0/file1,share=yes,size=414464344064
> -device nvdimm,node=0,label-size=131072,memdev=memnvdimm0,id=nvdimm0,slot=0
>
> For reference qemu command line where VM starts quickly:
>
> qemu-system-x86_64 \
>   -name qemu-gold29 \
>   -drive file=/var/lib/libvirt/images/gold29-ad.qcow2,format=qcow2,index=0,media=disk \
>   -m 2G,slots=4,maxmem=428G \
>   -smp 2 \
>   -machine pc,accel=kvm,nvdimm=on \
>   -enable-kvm \
>   -object memory-backend-file,id=pmem1,share=on,mem-path=/mnt/pmem0/file1,size=386G,align=4K \
>   -device nvdimm,memdev=pmem1,id=nv1 \
>   -daemonize
>
> Qemu command line generated from virsh (please note the VM now starts with
> this command line, which has share=yes):
>
> /usr/bin/qemu-system-x86_64 -machine accel=kvm -name
guest=mix-test,debug-threads=on -S -object
secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-18-mix-test/master-key.aes
-machine
pc-i440fx-3.0,accel=kvm,usb=off,vmport=off,dump-guest-core=off,nvdimm=on
-cpu Skylake-Server-IBRS,hypervisor=on -m
size=2097152k,slots=16,maxmem=419430400k -realtime mlock=off -smp
2,sockets=2,cores=1,threads=1 -object
memory-backend-file,id=ram-node0,mem-path=/var/lib/libvirt/qem
/ram/libvirt/qemu/18-mix-test/ram-node0,discard-data=yes,share=yes,size=2147483648
-numa node,nodeid=0,cpus=0-1,memdev=ram-node0 -object
memory-backend-file,id=memnvdimm0,prealloc=yes,mem-path=/mnt/pmem0/file1,share=yes,size=414464344064
-device nvdimm,node=0,label-size=131072,memdev=memnvdimm0,id=nvdimm0,slot=0
-uuid 318c0529-0330-460b-8d0a-3b253e9decdd -no-user-config -nodefaults
-chardev socket,id=charmonitor,fd=32,server,nowait -mon
chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew
-global kvm-pit.lost_tick_policy=delay -no-hpet -no-shutdown -global
PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1 -boot strict=on -device
ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7 -device
ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5
-device ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1
-device ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2
-device virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x6 -drive
file=/var/lib/libvirt/images/mix-test.qcow2,format=qcow2,if=none,id=drive-virtio-disk0
-device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x7,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-drive if=none,id=drive-ide0-0-0,readonly=on -device
ide-cd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0 -netdev
tap,fd=34,id=hostnet0 -device
e1000,netdev=hostnet0,id=net0,mac=52:54:00:06:db:55,bus=pci.0,addr=0xa
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -chardev
socket,id=charchannel0,fd=35,server,nowait -device
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0
-chardev spicevmc,id=charchannel1,name=vdagent -device
virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,id=channel1,name=com.redhat.spice.0
-device usb-tablet,id=input0,bus=usb.0,port=1 -spice
port=5901,addr=127.0.0.1,disable-ticketing,image-compression=off,seamless-migration=on
-device
qxl-vga,id=video0,ram_size=67108864,vram_size=67108864,vram64_size_mb=0,vgamem_mb=16,max_outputs=1,bus=pci.0,addr=0x2
-device intel-hda,id=sound0,bus=pci.0,addr=0x4 -device
hda-duplex,id=sound0-codec0,bus=sound0.0,cad=0 -chardev
spicevmc,id=charredir0,name=usbredir -device
usb-redir,chardev=charredir0,id=redir0,bus=usb.0,port=2 -chardev
spicevmc,id=charredir1,name=usbredir -device
usb-redir,chardev=charredir1,id=redir1,bus=usb.0,port=3 -device
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x8 -object
rng-random,id=objrng0,filename=/dev/urandom -device
virtio-rng-pci,rng=objrng0,id=rng0,bus=pci.0,addr=0x9 -sandbox
on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny
-msg timestamp=on
>
>
> VM start takes longer than with the plain qemu command line. Any thoughts?
> Why is prealloc=yes on by default for nvdimm? Any other important deltas?
>
The "prealloc" option is a "pay me now" vs "pay me later" type of
decision. If the guest workload is ok to absorb fault latency at run
time, then disable prealloc. If it would prefer more predictable
latency and pay all the fault penalty up front, then specify prealloc.
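
As a rough illustration of that trade-off (not a description of how
QEMU implements prealloc), the sketch below maps anonymous memory and
either leaves the pages to be faulted in on first use or touches every
page up front. The function name map_guest_ram and its parameters are
hypothetical, chosen only for this example.

```python
import mmap


def map_guest_ram(size, prealloc):
    """Hypothetical helper: map `size` bytes of anonymous memory.

    With prealloc=False the kernel allocates pages lazily, so the first
    access to each page pays the fault cost at run time ("pay me later").
    With prealloc=True every page is written once before returning, so
    all fault cost is paid up front ("pay me now").
    """
    region = mmap.mmap(-1, size)  # anonymous mapping, readable and writable
    touched = 0
    if prealloc:
        for offset in range(0, size, mmap.PAGESIZE):
            region[offset] = 0  # one write per page forces allocation now
            touched += 1
    return region, touched


if __name__ == "__main__":
    size = 16 * mmap.PAGESIZE
    for prealloc in (False, True):
        region, touched = map_guest_ram(size, prealloc)
        print(f"prealloc={prealloc}: touched {touched} pages before use")
        region.close()
```

With a backing store of several hundred GB, the up-front loop is where
the extra start-up time would go under this model.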
The "share" parameter, if it means MAP_SHARED vs MAP_PRIVATE, must be
set to "yes" if the guest expects persistence. MAP_PRIVATE is
otherwise unacceptable for emulating persistent memory because writes
to private memory are discarded and, as you have seen, require volatile
DRAM backing even in the DAX case.
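
The MAP_SHARED vs MAP_PRIVATE difference can be seen with a small
sketch. It assumes, as speculated above, that share=yes corresponds to
MAP_SHARED, and it uses an ordinary temporary file rather than a DAX
device or pmem-backed file. The helper names write_through_mapping and
demo are made up for this example.

```python
import mmap
import os
import tempfile


def write_through_mapping(path, flags, payload=b"guest-data"):
    """Map `path` with the given mmap flags, write `payload` at offset 0,
    unmap, and return what the file itself holds afterwards."""
    with open(path, "r+b") as f:
        region = mmap.mmap(f.fileno(), 0, flags=flags,
                           prot=mmap.PROT_READ | mmap.PROT_WRITE)
        region[:len(payload)] = payload
        if flags == mmap.MAP_SHARED:
            region.flush()  # push dirty pages back to the backing file
        region.close()
    with open(path, "rb") as f:
        return f.read(len(payload))


def demo():
    """Write the same payload through a private and a shared mapping."""
    results = {}
    for name, flags in (("private", mmap.MAP_PRIVATE),
                        ("shared", mmap.MAP_SHARED)):
        fd, path = tempfile.mkstemp()
        try:
            os.write(fd, b"\0" * mmap.PAGESIZE)  # one zero-filled page
            os.close(fd)
            results[name] = write_through_mapping(path, flags)
        finally:
            os.unlink(path)
    return results


if __name__ == "__main__":
    for name, content in demo().items():
        print(name, content)
```

In the private case the write lands in copy-on-write pages that never
reach the file, so the file still reads back as zeros. In the shared
case the payload is in the file after the mapping is closed.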