On Thu, May 12, 2016 at 04:00:29PM +0000, Mooney, Sean K wrote:
> > Today it is possible to use libvirt to spawn a VM without hugepage
> > memory and with a file descriptor backed memdev via the use of the
> > qemu:commandline element.
> >
> > <qemu:commandline>
> > <qemu:arg value='-object'/>
> > <qemu:arg value='memory-backend-file,id=mem,size=1024M,mem-path=/var/lib/libvirt/qemu,share=on'/>
> > <qemu:arg value='-numa'/>
> > <qemu:arg value='node,memdev=mem'/>
> > <qemu:arg value='-mem-prealloc'/>
> > </qemu:commandline>
> >
> > I created a proof of concept patch to Nova to demonstrate that this
> > works; however, to support this use case in Nova a new XML element is
> > required.
> >
> > https://review.openstack.org/#/c/309565/1
> >
> > I would like to propose the introduction of a new subelement to the
> > memoryBacking element to request file descriptor backed memory:
> >
> > <memoryBacking>
> > <filedescriptor size_mb="1024" path="/var/lib/libvirt/qemu"
> > prealloc="true" shared="on"/>
> > </memoryBacking>
>
> Specifying a size is not required - we already know how big memory must
> be for the guest.
>
> We already have a memAccess='shared' attribute against the <numa>
> element that is used to determine if the underlying memory should be
> setup as shared. We could define a further element that lets us control
> memory access mode for guests without NUMA topology specified.
[Mooney, Sean K] Hi, yes, the reason I added the shared attribute was to cater for
the case of guests without NUMA topology. For guests with NUMA topology I agree that
using memAccess='shared' on the cell is better, for consistency with hugepage
memory.
> <memoryBacking>
> <access mode="shared"/>
> </memoryBacking>
>
> For huge pages it seems we unconditionally pass --mem-prealloc. I'm
> thinking we could perhaps make that configurable via an element
>
>
> <memoryBacking>
> <allocation mode="immediate|ondemand"/>
> </memoryBacking>
>
> to control use of -mem-prealloc or not.
[Mooney, Sean K] For the vhost-user case mem-prealloc is required,
because you are basically doing DMA, so you really want the memory to be allocated.
Generically, though, from a libvirt point of view I do think it makes sense for this
to be configurable, to allow oversubscription of memory for higher density.
>
> So all that remains is a way to request file based backing of RAM. As
> with huge pages, I think we should hide the actual path from the user.
> We should just use /dev/shm as the backing for non-hugepage RAM. For
> this we could define something like
>
> <memoryBacking>
> <source type="file|anonymous"/>
> </memoryBacking>
>
[Mooney, Sean K] /dev/shm was my first choice, but for some reason when I used it
I could only boot one instance at a time. Maybe we would have to create a file per
instance under /dev/shm to make it work.

QEMU should create the file itself - in fact it's no different to our use
of hugetlbfs. Possibly you hit a limit on the amount of memory allowed
to be used via /dev/shm - IIRC the mount point is limited to 50% of RAM
by default.

If you use /var/lib/libvirt/ as the location you get a real file
backed by disk, so akin to putting the VM on swap, IIUC!
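
As a rough way to check whether that limit is what was hit, the sketch
below compares the RAM requested for a set of guests against the space
left on the filesystem that backs a given path. The function names and
the per-guest sizes are made up for illustration, and it only looks at
free space, not at any other limit that might apply.

```python
import os


def backing_capacity(path):
    """Return (total_bytes, free_bytes) for the filesystem holding `path`."""
    st = os.statvfs(path)
    return st.f_frsize * st.f_blocks, st.f_frsize * st.f_bavail


def guests_fit(path, guest_sizes_mib):
    """Return True if the guests' RAM (sizes in MiB) fits in the free space."""
    _, free_bytes = backing_capacity(path)
    needed_bytes = sum(guest_sizes_mib) * 1024 * 1024
    return needed_bytes <= free_bytes
```

For example, guests_fit("/dev/shm", [1024, 1024]) would tell you whether
two 1024M guests can both be backed by /dev/shm on that host.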
> Putting that all together, to get what you want we'd have
>
> <memoryBacking>
> <source type="file"/>
> <access mode="shared"/>
> <allocation mode="immediate"/>
> </memoryBacking>
>
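
To make the intended mapping concrete, here is a minimal sketch of how
those three elements could translate into the QEMU arguments from the
original qemu:commandline example. It is only an illustration under
assumptions: the real conversion would be C code in libvirt, the
function name is hypothetical, and the guest size and the /dev/shm
path are passed in by hand rather than taken from the domain config.

```python
import xml.etree.ElementTree as ET


def memory_backing_to_argv(xml_text, size_mib, mem_path="/dev/shm"):
    """Map the proposed <memoryBacking> elements to QEMU arguments (sketch)."""
    root = ET.fromstring(xml_text)
    source = root.find("source")
    access = root.find("access")
    allocation = root.find("allocation")

    argv = []
    # <source type="file"/> selects a file backed memdev.
    if source is not None and source.get("type") == "file":
        shared = access is not None and access.get("mode") == "shared"
        share = "on" if shared else "off"
        argv += [
            "-object",
            f"memory-backend-file,id=mem,size={size_mib}M,"
            f"mem-path={mem_path},share={share}",
            "-numa",
            "node,memdev=mem",
        ]
    # <allocation mode="immediate"/> asks for preallocated memory.
    if allocation is not None and allocation.get("mode") == "immediate":
        argv.append("-mem-prealloc")
    return argv
```

Feeding it the XML above with size_mib=1024 gives the same -object,
-numa and -mem-prealloc arguments as the qemu:commandline example,
except that mem-path points at /dev/shm.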
[Mooney, Sean K]
Yes, this seems like it would be a clean way to address this use case.
Can you gauge how small or large a change this would be? It's been
a while since I worked with C directly, but if you could point me in the
right direction in the libvirt codebase I would be happy to look at
creating an RFC patch.

First there's defining the XML extensions - needs docs/schemas/domaincommon.rng
and src/conf/domain_conf.{c,h} to be changed.
Then there's wiring up QEMU XML -> ARGV conversion - src/qemu/qemu_command.c
and adding test cases in tests/qemuxml2argvtest.c

From the Nova side, assuming libvirt was extended for this feature, should
I open a blueprint to extend the existing guest memory backing support
in parallel to the libvirt implementation, or wait until after it is
supported in libvirt to start the Nova discussion? In either case I think
we agree that any support in Nova would depend on libvirt support to be
accepted in upstream Nova.

You're going to hit the deadline for approval of Newton specs in Nova
fairly soon, and unless the libvirt impl is done before then, I think
it is unlikely you'd get a spec approved. So by all means work on this
in parallel, but be realistic about chances of approval in Nova for
this cycle.
Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|