[libvirt] [RFC] cgroup settings and systemd daemon-reload conflict

Hi, all.

It turns out that systemd daemon-reload resets settings that are manageable through the 'systemctl set-property' interface.

    virsh schedinfo tst3 | grep global_quota
    global_quota : -1
    virsh schedinfo tst3 --set global_quota=50000 | grep global_quota
    global_quota : 50000
    systemctl daemon-reload
    virsh schedinfo tst3 | grep global_quota
    global_quota : -1

This behaviour is not limited to the cpu controller; the same happens for blkio, for example. I checked different versions of systemd (219 from Feb 2015, and the quite recent 236 from Dec 2017) to make sure it is not a bug specific to an old version. So systemd does not play well with direct writes to cgroup parameters that are manageable through systemd. It looks like libvirtd needs to use systemd's dbus interface to change all such parameters.

I only wonder how this could go unnoticed for so long (cgroups for domains have been created through systemd since Jul 2013), as daemon-reload is called upon libvirtd package update. Maybe I am missing something?

Nikolay
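
For illustration, here is a minimal sketch of what "going through systemd" could look like from the shell, instead of writing to the cgroup files directly. It assumes the CFS period is 100000 us (the transcript above does not show it) and that the limit is expressed via systemd's CPUQuota= property, which takes a percentage. The helper names and the unit name are hypothetical; set_property_cmd only prints the command rather than running it. An empty CPUQuota= value is used here for "no limit", which is an assumption about how systemd treats the reset case.

```shell
#!/bin/sh
# Convert a CFS quota/period pair (both in microseconds) into the
# percentage form used by systemd's CPUQuota= property.
# A negative quota (libvirt reports -1) means "no limit"; this sketch
# maps it to an empty value (assumption: empty resets the property).
cpu_quota_percent() {
    quota=$1
    period=$2
    if [ "$quota" -lt 0 ]; then
        echo ""
        return 0
    fi
    # Integer arithmetic; fractions of a percent are truncated.
    echo "$(( quota * 100 / period ))%"
}

# Print (not run) the systemctl command that would apply the limit.
# The unit name is supplied by the caller; nothing here looks it up.
set_property_cmd() {
    unit=$1
    quota=$2
    period=$3
    printf 'systemctl set-property --runtime %s CPUQuota=%s\n' \
        "$unit" "$(cpu_quota_percent "$quota" "$period")"
}

# Hypothetical usage (the scope name is made up):
#   set_property_cmd 'machine-qemu\x2d1\x2dtst3.scope' 50000 100000
```

Printing the command keeps the sketch safe to try on a host with running guests; the real fix discussed in this thread would use the dbus API from libvirtd rather than systemctl.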

ping

On 30.01.2018 10:34, Nikolay Shirokovskiy wrote:
> [...]

On Tue, Jan 30, 2018 at 10:34:14AM +0300, Nikolay Shirokovskiy wrote:
> [...]

I guess the reason it went unnoticed is that using the global_* constants is relatively uncommon. Traditionally we had the per-vCPU tunables for quota, and the global stuff was a newer addition. Also, for a long time I didn't think systemd even supported the quota+period settings in unit files at all.

It is incredibly frustrating that systemd would just reset the cgroups settings on daemon reload - something which ostensibly is supposed to be used to reload config files on disk. Since this behaviour has existed for a long time though, I guess we have little choice but to cope with it now.

We kind of need to use the systemd dbus API more in order to support cgroups v2 properly too.

None of these will be pleasant changes to make though, as this area of code is fairly complex. Also, I don't think there's any nice way to introspect what properties a given version of systemd supports, which makes it hard to know when to set directly vs when to set via dbus. I think we would have to create the machine via dbus, then read back the auto-generated unit file for the machine to see what properties are listed as existing; then we can see which ones we now need to set via dbus.

Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
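
As a rough sketch of the "see which properties are listed" step: the helper below checks whether a property name appears in KEY=VALUE output read from stdin. Note that it works on `systemctl show` style output rather than on the generated unit file described above, which is a substitution made for this example, not what the message proposes. The function name and the unit name in the usage comment are hypothetical.

```shell
#!/bin/sh
# Succeed if property NAME appears in "systemctl show"-style output
# on stdin (one KEY=VALUE pair per line); fail otherwise.
has_unit_property() {
    name=$1
    grep -q "^${name}="
}

# Hypothetical usage against a live system (the scope name is made up):
#   if systemctl show 'machine-qemu\x2d1\x2dtst3.scope' \
#          | has_unit_property CPUQuotaPerSecUSec; then
#       echo "set the quota via systemd"
#   else
#       echo "fall back to a direct cgroup write"
#   fi
```

Reading from stdin keeps the check independent of how the property list was obtained, so the same helper could be pointed at a unit file dump if that turns out to be the more reliable source.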

On 14.02.2018 13:34, Daniel P. Berrangé wrote:
> [...]

Worse than that: systemd hardcodes cpu.cfs_period_us to 100ms, so even if we set cpu.cfs_quota_us via systemd we will still have cpu.cfs_period_us reset on daemon-reload.

The best solution I see is to teach systemd not to store these runtime settings in unit files but rather to consult the cgroup itself.

By the way, some versions of systemd behave a bit differently. For example, with the recent CentOS 7 version one needs to start another domain through libvirt after daemon-reload to get the described reset.

Nikolay
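
A small worked example of why the fixed period matters, under the assumption that systemd always applies a 100000 us period as described above. A quota set against a custom period keeps its CPU share only if it is rescaled to the 100ms period; the custom period itself cannot be preserved. The function and variable names are hypothetical.

```shell
#!/bin/sh
# The CFS period systemd applies, in microseconds (100ms), per the
# message above.
SYSTEMD_CFS_PERIOD_US=100000

# Rescale a quota that was set against a custom period so that it
# yields the same CPU share against the 100ms period.
# Integer arithmetic; the result is truncated.
rescale_quota() {
    quota=$1
    period=$2
    echo "$(( quota * SYSTEMD_CFS_PERIOD_US / period ))"
}

# Hypothetical usage: 25000us of quota per 50000us period is half a CPU.
#   rescale_quota 25000 50000
```

If the period were reset to 100000 while the quota stayed at 25000, the guest would be left with a quarter of a CPU instead of half, which is the kind of silent change this thread is about.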

On 18.05.2018 16:04, Nikolay Shirokovskiy wrote:
> [...]

Given all these difficulties, I decided not to fix the problem on the libvirt side but instead to patch downstream systemd, so that cases where a VM's scope cgroup is tuned only via libvirtd are fixed. Fortunately there is an upstream systemd patch [1] that helps, at least for the systemd version in CentOS 7.

Nikolay

[1] commit 00e4959506b36a0b3c990846dd46e5c3b9ba4267
    Author: Franck Bui <fbui@suse.com>
    Date:   Mon Mar 27 18:00:54 2017 +0200

        core: when deserializing a unit, fully restore its cgroup state

participants (2)
- Daniel P. Berrangé
- Nikolay Shirokovskiy