On Tue, Aug 24, 2010 at 11:53:27AM +0530, Nikunj A. Dadhania wrote:
Subject: [RFC] Memory controller exploitation in libvirt
Memory CGroup is a kernel feature that can be exploited effectively in the
current libvirt/qemu driver. Here is a shot at that.
At present, QEMU uses the memory ballooning feature, where the memory can be
inflated/deflated as and when needed, co-operatively between the host and
the guest. There should be some mechanism where the host can have more
control over the guest's memory usage. Memory CGroup provides features such
as hard-limit and soft-limit for memory, and hard-limit for swap area.
Exposing the tunables is nice, but there is another related problem.
We don't provide apps enough information to effectively use them.
E.g., they configure a guest with 500 MB of RAM. How much RAM does
QEMU actually use? 500 MB + X MB more. We need to give apps an
indication of what the 'X' overhead is. Some of it comes from the
video RAM. Some is pure QEMU emulation overhead.
Design 1: Provide new API and XML changes for resource management
=================================================================
Not all of the memory controller tunables are supported by the current
abstractions provided by the libvirt API. libvirt works on various OSes. This
new API will support GNU/Linux initially, and as and when other platforms
start supporting memory tunables, the interface could be enabled for
them. This adds the following two function pointers to the virDriver interface.
1) domainSetMemoryParameters: which would take one or more name-value
pairs. This makes the API extensible, and agnostic to the kind of
parameters supported by various Hypervisors.
2) domainGetMemoryParameters: For getting current memory parameters
Corresponding libvirt public API:
int virDomainSetMemoryParameters (virDomainPtr domain,
                                  virMemoryParameterPtr params,
                                  unsigned int nparams);
int virDomainGetMemoryParameters (virDomainPtr domain,
                                  virMemoryParameterPtr params,
                                  unsigned int nparams);
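To illustrate the name-value idea, here is a minimal, self-contained sketch.
It does not use libvirt headers: MemParam, MemTunables, lookup_field and
set_memory_parameters are hypothetical stand-ins for whatever
virMemoryParameterPtr and the driver callback end up looking like. The only
point is that unknown names can be rejected up front and new names can be
added without changing the function signature.

```c
#include <string.h>

/* Hypothetical name-value pair; a stand-in for the proposed parameter type. */
typedef struct {
    const char *name;
    unsigned long long value;   /* bytes */
} MemParam;

/* Hypothetical per-domain state that a driver might keep. */
typedef struct {
    unsigned long long hard_limit;
    unsigned long long soft_limit;
    unsigned long long swap_hard_limit;
} MemTunables;

/* Map a parameter name to its field, or NULL if the name is unknown. */
static unsigned long long *lookup_field(MemTunables *t, const char *name)
{
    if (strcmp(name, "MemoryHardLimits") == 0)
        return &t->hard_limit;
    if (strcmp(name, "MemorySoftLimits") == 0)
        return &t->soft_limit;
    if (strcmp(name, "SwapHardLimits") == 0)
        return &t->swap_hard_limit;
    return NULL;
}

/* Apply all pairs. Returns 0 on success, or -1 if any name is unknown,
 * in which case *t is left untouched. */
int set_memory_parameters(MemTunables *t, const MemParam *params,
                          unsigned int nparams)
{
    unsigned int i;

    /* First pass: validate every name before changing anything. */
    for (i = 0; i < nparams; i++) {
        if (lookup_field(t, params[i].name) == NULL)
            return -1;
    }
    /* Second pass: all names are known, so apply the values. */
    for (i = 0; i < nparams; i++)
        *lookup_field(t, params[i].name) = params[i].value;
    return 0;
}
```

A hypervisor that supports only a subset of the names would simply return
NULL from its own lookup for the rest, which is what keeps the public entry
point agnostic to the parameter set.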
Parameter list supported:
MemoryHardLimits (memory.limit_in_bytes) - Maximum memory
MemorySoftLimits (memory.soft_limit_in_bytes) - Desired memory
MemoryMinimumGuarantee - Minimum memory required (without this amount of
memory, VM should not be started)
SwapHardLimits (memory.memsw.limit_in_bytes) - Maximum swap
SwapSoftLimits (Currently not supported by kernel) - Desired swap space
Tunables memory.limit_in_bytes, memory.soft_limit_in_bytes and
memory.memsw.limit_in_bytes are provided by the memory controller in the
Linux kernel.
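For reference, setting one of these tunables amounts to writing a decimal
byte count into a file under the group's directory in the memory controller
hierarchy. A minimal sketch follows; write_mem_tunable is a hypothetical
helper, not existing libvirt code, and the cgroup directory is passed in by
the caller because the mount point and group layout vary between hosts.

```c
#include <stdio.h>

/* Hypothetical helper: write `bytes` to <cgroup_dir>/<tunable>.
 * Returns 0 on success, -1 on any error. */
int write_mem_tunable(const char *cgroup_dir, const char *tunable,
                      unsigned long long bytes)
{
    char path[4096];
    FILE *fp;
    int n;

    /* Build the file path and refuse it if it would be truncated. */
    n = snprintf(path, sizeof(path), "%s/%s", cgroup_dir, tunable);
    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;

    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;

    if (fprintf(fp, "%llu\n", bytes) < 0) {
        fclose(fp);
        return -1;
    }
    /* Write errors can surface at flush time, so check fclose as well. */
    return fclose(fp) == 0 ? 0 : -1;
}
```

With a group directory in hand, a hard limit of 500 MB would be
write_mem_tunable(dir, "memory.limit_in_bytes", 500ULL * 1024 * 1024).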
I am not an expert here, so just listing what new elements need to be added
to the XML schema:
<define name="resource">
  <element name="memory">
    <element name="memoryHardLimit"/>
    <element name="memorySoftLimit"/>
    <element name="memoryMinGuarantee"/>
    <element name="swapHardLimit"/>
    <element name="swapSoftLimit"/>
  </element>
</define>
Pros:
* Support all the tunables exported by the kernel
* More tunables can be added as and when required
Cons:
* Code changes would touch various levels
Not a problem.
* Might need to redefine (change the scope of) the existing memory
API. Currently, domainSetMemory is used to set limit_in_bytes in LXC and
memory ballooning in QEMU, while domainSetMaxMemory is not defined in
QEMU and in the case of LXC it sets the internal object's maxmem
variable.
Yep, might need to clarify LXC a little bit.
Future:
* Later on, CPU/IO/Network controllers related tunables can be
added/enhanced along with the APIs/XML elements:
CPUHardLimit
CPUSoftLimit
CPUShare
CPUPercentage
IO_BW_Softlimit
IO_BW_Hardlimit
IO_BW_percentage
We have APIs to cope with CPU tunables, but no persistent XML
representation. We have nothing for IO.
* libvirt-cim support for resource management
Design 2: Reuse the current memory APIs in libvirt
==================================================
Use memory.limit_in_bytes to tweak memory hard limits
Init - Set the memory.limit_in_bytes to maximum mem.
Claiming memory from guest:
a) Reduce balloon size
b) If the guest does not co-operate (how do we know?), reduce
memory.limit_in_bytes.
Allocating memory more than max memory: How to solve this? As we have
already set the max balloon size. We can only play within this!
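The reclaim sequence in Design 2 can be sketched as a small decision
function. This is only an illustration under an assumption the proposal
leaves open: that "the guest does not co-operate" is detected by comparing
the group's observed memory usage with the target after a grace period. The
names ReclaimAction and next_reclaim_action, and the grace period itself,
are hypothetical.

```c
/* Hypothetical outcomes of one check during reclaim. */
typedef enum {
    RECLAIM_DONE,              /* usage is already at or below the target */
    RECLAIM_WAIT,              /* balloon request issued, still in grace period */
    RECLAIM_LOWER_HARD_LIMIT   /* guest did not shrink in time: reduce
                                  memory.limit_in_bytes */
} ReclaimAction;

/* Decide the next step after the balloon target has been reduced.
 * Assumption: non-cooperation means usage is still above the target
 * once grace_secs have elapsed. */
ReclaimAction next_reclaim_action(unsigned long long target_bytes,
                                  unsigned long long usage_bytes,
                                  unsigned int waited_secs,
                                  unsigned int grace_secs)
{
    if (usage_bytes <= target_bytes)
        return RECLAIM_DONE;
    if (waited_secs < grace_secs)
        return RECLAIM_WAIT;
    return RECLAIM_LOWER_HARD_LIMIT;
}
```

Note that this covers only shrinking a guest; it does nothing for the
"more than max memory" case raised above.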
Pros:
* Few changes
* Is not intrusive
Cons:
* SetMemory and SetMaxMemory usage is confusing.
* SetMemory is too generic a name, it does not cover all the tunables.
* Does not support memory softlimit
* Does not have support to reserve the memory swap region
* This solution is not extensible
IMO, "Design 1" is more generic and extensible for various memory
tunables.
Agreed, the current approach to memory is not flexible enough. It only
really fits into control of the overall memory allocation + balloon
level. In things like LXC we've rather twisted the meaning. Design 1
will clear up a lot of this mess.
Daniel
--
|: Red Hat, Engineering, London    -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org        -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|