[libvirt] [RFC] memory settings interface for containers

Hi, everyone.

I plan to add a means of configuring memory settings for vz containers, and I am having trouble doing it through the libvirt interface. The current interface fits VM memory management well, but it is not clear how to use it with containers. First, let's set aside memory hotplugging, which is obviously not suitable for containers. That leaves a memory interface represented by two parameters: total_memory and cur_balloon. For VMs, total_memory can't be changed at runtime, and cur_balloon can't be greater than total_memory. But for containers the memory model is different. We have only one parameter and it can be changed for running domains.

So the question is how to map this model onto the existing interface (a new interface for this case is unlikely). I plan to give both parameters the same meaning and keep them equal for containers, and to update the virsh, API and XML model documentation accordingly.

I'd be happy to hear the core developers' opinions on this topic.

On Thu, 2015-11-12 at 11:11 +0300, Nikolay Shirokovskiy wrote:
> [snip]
> We have only one parameter and it can be changed for running domains.

It is not only one parameter: there are many parameters that can be tuned with the memory cgroup. But at least the physical memory limit (memory.limit_in_bytes) and the memory-plus-swap limit (memory.memsw.limit_in_bytes) make sense.

> [snip]

--
Dmitry Guryanov

On 13.11.2015 13:55, Dmitry Guryanov wrote:
> [snip]

Does anyone have any ideas?

On Fri, Nov 13, 2015 at 01:55:15PM +0300, Dmitry Guryanov wrote:
> [snip]
> It is not only one parameter: there are many parameters that can be tuned with the memory cgroup. But at least the physical memory limit (memory.limit_in_bytes) and the memory-plus-swap limit (memory.memsw.limit_in_bytes) make sense.

The memory.limit_in_bytes value maps to the current max memory limit. For LXC we also support setting the soft limit and the swap limit.

Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
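
As a rough illustration of the mapping discussed above, the following Python sketch turns a hard limit, an optional soft limit and an optional memory-plus-swap limit into values for the cgroup v1 memory controller files. The function and parameter names are hypothetical and are not libvirt API; the sketch assumes sizes are given in KiB (libvirt's default memory unit) and that the memory-plus-swap limit must not be below the hard limit.

```python
KIB = 1024  # bytes per KiB; sizes are assumed to arrive in KiB


def cgroup_v1_memory_values(hard_limit_kib, soft_limit_kib=None,
                            swap_hard_limit_kib=None):
    """Return {cgroup v1 file name: value in bytes} for the given limits.

    Hypothetical helper for illustration only; it computes the values
    but does not write anything to the cgroup filesystem.
    """
    if hard_limit_kib <= 0:
        raise ValueError("hard limit must be positive")

    values = {"memory.limit_in_bytes": hard_limit_kib * KIB}

    if soft_limit_kib is not None:
        values["memory.soft_limit_in_bytes"] = soft_limit_kib * KIB

    if swap_hard_limit_kib is not None:
        # memory.memsw.limit_in_bytes limits memory plus swap together,
        # so it cannot be smaller than the plain memory limit.
        if swap_hard_limit_kib < hard_limit_kib:
            raise ValueError("memory+swap limit is below the memory limit")
        values["memory.memsw.limit_in_bytes"] = swap_hard_limit_kib * KIB

    return values
```

For example, `cgroup_v1_memory_values(1048576)` describes a 1 GiB hard limit with no soft or swap limit set.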

On Thu, Nov 12, 2015 at 11:11:31AM +0300, Nikolay Shirokovskiy wrote:
> [snip]

So from the VM POV, 'total_memory' represents the initial populated memory map. This is traditionally fixed at boot and cannot be changed while the VM is running.

'cur_balloon' represents the current memory after balloon adjustments and must be less than or equal to total_memory.

When the VM is shut off, both values can be changed, but if you want to increase 'cur_balloon' you must first increase 'total_memory'.

With LXC we essentially ignore 'cur_balloon' and just set the cgroups memory.limit_in_bytes to the 'total_memory' value.

For reasons that escape me, we forbid changes to 'total_memory' in the LXC driver, despite the fact that we could trivially allow them. We should fix that.

In the virDomainInfo struct, things are a little different. For VMs we report 'current' as being the current balloon level. For LXC we report 'current' as being the current container usage, as reported by the memory.usage_in_bytes cgroup field.

I agree we should be more explicit about all this in the docs. For the initial XML config, we should just raise an error if both <memory> and <currentMemory> are present and have different values, or possibly just clamp <currentMemory> to match <memory>.

In virDomainSetMemoryFlags() we should document that with container based virt, you cannot change the CURRENT_MEMORY setting, and with machine based virt you cannot change the MAX_MEMORY setting but containers should allow it.

Regards,
Daniel
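
The VM-side rules described above can be written down as a small model. This is only a sketch in Python with hypothetical class and method names, not libvirt code. Rejecting a total_memory below the current balloon value is a choice made for the sketch, since the thread does not say what should happen in that case.

```python
from dataclasses import dataclass


@dataclass
class VmMemory:
    """Toy model of the VM memory rules; sizes are in KiB."""
    total_memory: int
    cur_balloon: int
    running: bool = False

    def __post_init__(self):
        if self.cur_balloon > self.total_memory:
            raise ValueError("cur_balloon exceeds total_memory")

    def set_total_memory(self, kib):
        # The populated memory map is fixed at boot.
        if self.running:
            raise RuntimeError("total_memory is fixed while the VM runs")
        # Sketch-only assumption: refuse to go below the balloon value.
        if kib < self.cur_balloon:
            raise ValueError("total_memory would drop below cur_balloon")
        self.total_memory = kib

    def set_cur_balloon(self, kib):
        # The balloon can move at runtime, but never above total_memory,
        # so raising it further requires raising total_memory first.
        if kib > self.total_memory:
            raise ValueError("cur_balloon exceeds total_memory")
        self.cur_balloon = kib
```

In this model a running VM can have its balloon adjusted up to total_memory, while any attempt to change total_memory is refused until the VM is shut off.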

On 14.01.2016 16:01, Daniel P. Berrange wrote:
> [snip]
> I agree we should be more explicit about all this in the docs. For the initial XML config, we should just raise an error if both <memory> and <currentMemory> are present and have different values, or possibly just clamp <currentMemory> to match <memory>.

Hmm. And what if a user wants a VM to start with an inflated balloon and to be able to deflate it later, adding memory to the VM at runtime? If libvirt raised an error whenever <memory> and <currentMemory> are both present and differ, the user would not have that option.

> [snip]

On Thu, Jan 14, 2016 at 04:14:49PM +0300, Maxim Nestratov wrote:
> [snip]
> Hmm. And what if a user wants a VM to start with an inflated balloon and to be able to deflate it later, adding memory to the VM at runtime? If libvirt raised an error whenever <memory> and <currentMemory> are both present and differ, the user would not have that option.

Sorry, I should clarify: we'd only clamp them together for container based virt.

Regards,
Daniel

On 14.01.2016 16:16, Daniel P. Berrange wrote:
> [snip]
> Sorry, I should clarify: we'd only clamp them together for container based virt.

Ahh, I see. I should have used the subject as a global context. Anyway, thank you for the clarification.

Best,
Maxim

On 14.01.2016 16:01, Daniel P. Berrange wrote:
> [snip]
> With LXC we essentially ignore 'cur_balloon' and just set the cgroups memory.limit_in_bytes to the 'total_memory' value. [1]
> [snip]
> In virDomainSetMemoryFlags() we should document that with container based virt, you cannot change the CURRENT_MEMORY setting, and with machine based virt you cannot change the MAX_MEMORY setting but containers should allow it. [2]

Thanks for the answer, Daniel.

So the actual [1] and proposed [2] behaviours differ, if by 'cannot change' in [2] you mean 'fail'. Wouldn't changing [1] to [2] then be a regression? It is a case of really lame libvirt usage, but nevertheless. The same question arises for the XML config: could adding checks break somebody?

On Thu, Jan 14, 2016 at 04:23:59PM +0300, Nikolay Shirokovskiy wrote:
> [snip]
> So the actual [1] and proposed [2] behaviours differ, if by 'cannot change' in [2] you mean 'fail'. Wouldn't changing [1] to [2] then be a regression? It is a case of really lame libvirt usage, but nevertheless. The same question arises for the XML config: could adding checks break somebody?

Actually, looking again, the current impl is worse than I thought :-(

At initial startup, we set the limit based on <memory> and ignore <currentMemory>. This is good. Clamping <currentMemory> to <memory> would be fine in this respect.

For the virDomainSetMemory() API, it seems we deny the option to change max_memory, but allow setting current memory and use it to set the cgroups. This is kind of crazy, as it is inconsistent with what we do during startup :-(

So based on that current impl, I think that for virDomainSetMemory we'll have to declare that MEMORY & MAX_MEMORY are considered to be identical, and make both have the same effect on cgroups. Otherwise you're right that we'd risk breaking apps. IOW, whenever we update one setting, we should update the other to match.

Regards,
Daniel
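
The "update one, update the other" rule proposed above for containers can be sketched as follows. This is a toy Python model with hypothetical names, not the LXC or vz driver code. The `max_memory` argument stands in for the caller's choice between the MEMORY and MAX_MEMORY settings, and the returned byte count is what such a driver would presumably write to memory.limit_in_bytes.

```python
from dataclasses import dataclass


@dataclass
class ContainerMemory:
    """Toy model of a container's two memory settings; sizes in KiB."""
    total_memory: int
    cur_balloon: int

    def set_memory(self, kib, max_memory=False):
        """Apply a new limit, treating MEMORY and MAX_MEMORY as identical.

        The max_memory argument is accepted so that both kinds of request
        succeed, but it has no distinct effect: whichever setting is
        updated, the other is updated to match.
        """
        if kib <= 0:
            raise ValueError("memory limit must be positive")
        self.total_memory = kib
        self.cur_balloon = kib
        return kib * 1024  # bytes, for the cgroup memory limit
```

Because both request kinds take the same path, an application that only ever set the current memory on a running container keeps working, which is the compatibility concern raised earlier in the thread.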

On 14.01.2016 16:31, Daniel P. Berrange wrote:
> [snip]
> Actually, looking again, the current impl is worse than I thought :-(
> At initial startup, we set the limit based on <memory> and ignore <currentMemory>. This is good. Clamping <currentMemory> to <memory> would be fine in this respect.

Oh, now it is clear. Clamping means ignoring the given value and taking the value from a different node. So, in short, the decision you describe later for the API makes <memory> and <currentMemory> synonyms. But in the XML we can be given both elements with different values simultaneously. So, to stay backward compatible, instead of failing we let <memory> silently overrule. OK. I plan to document it, fix it in lxc and implement it in vz. Thanks.

> For the virDomainSetMemory() API, it seems we deny the option to change max_memory, but allow setting current memory and use it to set the cgroups. This is kind of crazy, as it is inconsistent with what we do during startup :-(
> So based on that current impl, I think that for virDomainSetMemory we'll have to declare that MEMORY & MAX_MEMORY are considered to be identical, and make both have the same effect on cgroups. Otherwise you're right that we'd risk breaking apps. IOW, whenever we update one setting, we should update the other to match.
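
The XML rule agreed on in this message, where <memory> silently overrules <currentMemory> for container domains, might look like the Python sketch below. It uses only the standard library XML parser; the function name is hypothetical, and copying the unit attribute along with the value is an assumption of the sketch, made so that the two elements cannot disagree through differing units.

```python
import xml.etree.ElementTree as ET


def clamp_current_memory(domain_xml):
    """Return the domain XML with <currentMemory> forced to match <memory>.

    Hypothetical helper for illustration; a <currentMemory> element that
    is absent is left absent.
    """
    root = ET.fromstring(domain_xml)
    memory = root.find("memory")
    if memory is None:
        raise ValueError("domain XML has no <memory> element")

    current = root.find("currentMemory")
    if current is not None:
        # Ignore whatever was given and take the value from <memory>.
        current.text = memory.text
        unit = memory.get("unit")
        if unit is None:
            current.attrib.pop("unit", None)
        else:
            current.set("unit", unit)

    return ET.tostring(root, encoding="unicode")
```

No error is raised when the two values differ, so an existing configuration that carries both elements keeps loading.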
participants (4)
- Daniel P. Berrange
- Dmitry Guryanov
- Maxim Nestratov
- Nikolay Shirokovskiy