[libvirt-users] RHEL6 cgroup error after a few days of uptime

I have a RHEL6 host that runs many KVM virtual machines. It has been working fine for a couple of years. I apply errata updates about once a week.

In the last couple of weeks, I've run into a bug where virtual machines fail to start with a cgroup error message. If I reboot the host (very disruptive), things work normally for a few days.

Can I configure qemu/libvirt not to use cgroups at all as a temporary workaround? How would I do that?

Here is an example of trying to start a VM (one that hasn't been started since boot) after the problem manifests itself.

[root@virthost ~]# find /cgroup | grep r04s14
[root@virthost ~]# virsh start r04s14
error: Failed to start domain r04s14
error: Unable to create cgroup for r04s14: No such file or directory

[root@virthost ~]# find /cgroup | grep r04s14
/cgroup/cpu/libvirt/qemu/r04s14
/cgroup/cpu/libvirt/qemu/r04s14/cpu.rt_period_us
/cgroup/cpu/libvirt/qemu/r04s14/cpu.rt_runtime_us
/cgroup/cpu/libvirt/qemu/r04s14/cpu.stat
/cgroup/cpu/libvirt/qemu/r04s14/cpu.cfs_period_us
/cgroup/cpu/libvirt/qemu/r04s14/cpu.cfs_quota_us
/cgroup/cpu/libvirt/qemu/r04s14/cpu.shares
/cgroup/cpu/libvirt/qemu/r04s14/cgroup.event_control
/cgroup/cpu/libvirt/qemu/r04s14/notify_on_release
/cgroup/cpu/libvirt/qemu/r04s14/cgroup.procs
/cgroup/cpu/libvirt/qemu/r04s14/tasks
/cgroup/cpuacct/libvirt/qemu/r04s14
/cgroup/cpuacct/libvirt/qemu/r04s14/cpuacct.stat
/cgroup/cpuacct/libvirt/qemu/r04s14/cpuacct.usage_percpu
/cgroup/cpuacct/libvirt/qemu/r04s14/cpuacct.usage
/cgroup/cpuacct/libvirt/qemu/r04s14/cgroup.event_control
/cgroup/cpuacct/libvirt/qemu/r04s14/notify_on_release
/cgroup/cpuacct/libvirt/qemu/r04s14/cgroup.procs
/cgroup/cpuacct/libvirt/qemu/r04s14/tasks
[root@virthost ~]#

On Thu, Dec 13, 2012 at 01:31:27PM -0700, Dax Kelson wrote:
> I have a RHEL6 host that runs many KVM virtual machines. It has been working fine for a couple of years. I apply errata updates about once a week.
> In the last couple of weeks, I've run into a bug where virtual machines fail to start with a cgroup error message. If I reboot the host (very disruptive), things work normally for a few days.

Sounds like something has either unmounted your cgroups or deleted the directories that libvirt created. I wonder whether the cgconfig initscript is doing it, perhaps triggered from a %post script in an RPM. In any case, you should not need to reboot the host: restarting libvirtd will get it to re-create its cgroups.

> Can I configure qemu/libvirt not to use cgroups at all as a temporary workaround? How would I do that?

Simply don't mount any cgroups on the host and libvirt won't use them. On RHEL6, the 'cgconfig' initscript is what mounts them at boot.

Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
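Daniel's two suspects (controllers unmounted, or libvirt's directories deleted) can be told apart with a quick check the next time the error appears. The following is only a rough sketch, not something from the thread: it assumes each controller is mounted at /cgroup/<controller> and that libvirt keeps its directories under libvirt/qemu, as the find output in the original post suggests. The function name and the CGROUP_ROOT and MOUNTS_FILE variables are made up for illustration.

```shell
#!/bin/sh
# Sketch: for each controller named on the command line, report whether it
# is mounted and whether libvirt's qemu directory still exists under it.
# CGROUP_ROOT and MOUNTS_FILE are hypothetical overrides so the check can
# be tried against copies of the data; the defaults match a RHEL6 host.
check_libvirt_cgroups() {
    root="${CGROUP_ROOT:-/cgroup}"
    mounts="${MOUNTS_FILE:-/proc/mounts}"
    for ctrl in "$@"; do
        # /proc/mounts fields: device, mount point, fs type, options, ...
        if awk -v mp="$root/$ctrl" \
            '$2 == mp && $3 == "cgroup" { found = 1 } END { exit !found }' \
            "$mounts"; then
            mounted=yes
        else
            mounted=no
        fi
        # Assumed layout: <root>/<controller>/libvirt/qemu/<domain>
        if [ -d "$root/$ctrl/libvirt/qemu" ]; then
            dir=present
        else
            dir=missing
        fi
        echo "$ctrl mounted=$mounted libvirt_dir=$dir"
    done
}
```

For example, `check_libvirt_cgroups cpu cpuacct memory devices` prints one line per controller. A controller reported as mounted=no points at something unmounting the hierarchy, while mounted=yes with libvirt_dir=missing points at the directories being deleted underneath libvirtd.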

What I did on our test environment:

cggroup clear
/etc/init.d/cgred stop
/etc/init.d/cgconfig stop
/etc/init.d/libvirtd restart

And everything got back to normal.

David
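The recovery steps David lists can be wrapped in a small script that only prints what it would do unless told otherwise. This is a hedged sketch rather than a tested procedure: it assumes "cggroup clear" refers to libcgroup's cgclear command, and the function names and the APPLY variable are invented here. David reported these steps for a test environment only.

```shell
#!/bin/sh
# Dry-run by default: each step is printed instead of executed.
# Set APPLY=1 (hypothetical switch) to really run the commands as root.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

recover_libvirt_cgroups() {
    # Assumption: "cggroup clear" in the message means libcgroup's cgclear.
    run cgclear
    run /etc/init.d/cgred stop
    run /etc/init.d/cgconfig stop
    # Restarting libvirtd is the step Daniel suggests instead of a reboot.
    run /etc/init.d/libvirtd restart
}
```

Calling `recover_libvirt_cgroups` prints the four steps in order; calling it with APPLY=1 set in the environment executes them. Stopping cgconfig this way does not survive a reboot, since the initscript mounts the cgroups again at boot unless the service is also disabled.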

Thanks for the tips on how to recover without rebooting. At the time getting it working again as quickly as possible was the priority, so the box was rebooted. The next time things go south I'll try this.

Today was the 3rd time in two weeks that this problem appeared. As I said earlier, this box has been running for months without incident. Any ideas about the root cause?

Thanks,
Dax Kelson
participants (3)
- Daniel P. Berrange
- David Cruz
- Dax Kelson