Yes, this is simply a race condition. It is not LXC; it is the QEMU driver,
and it is not using systemd for cgroup management.
(Running hypervisor: QEMU 1.5.1)
We can run these two commands first to clear the 'machine' cgroup directory:
service cgconfig restart
service cgred restart
Then we start multiple VMs with 'virsh start' at the same time.
Stack 1:
#0 qemuInitCgroup (driver=0x7f13a0d01570, vm=0x7f138c0116a0, startup=true)
at qemu/qemu_cgroup.c:742
#1 0x00007f1394595b7d in qemuSetupCgroup (driver=0x7f13a0d01570,
vm=0x7f138c0116a0, nodemask=0x0) at qemu/qemu_cgroup.c:857
#2 0x00007f13945b40c5 in qemuProcessStart (conn=0x7f13a0d92870,
driver=0x7f13a0d01570, vm=0x7f138c0116a0, migrateFrom=0x0, stdin_fd=-1,
stdin_path=0x0, snapshot=0x0, vmop=VIR_NETDEV_VPORT_PROFILE_OP_CREATE,
flags=1) at qemu/qemu_process.c:3828
#3 0x00007f1394606c2c in qemuDomainObjStart (conn=0x7f13a0d92870,
driver=0x7f13a0d01570, vm=0x7f138c0116a0, flags=0)
at qemu/qemu_driver.c:5852
#4 0x00007f1394606e82 in qemuDomainCreateWithFlags (dom=0x7f138c01c490,
flags=0) at qemu/qemu_driver.c:5904
#5 0x00007f1394606f0b in qemuDomainCreate (dom=0x7f138c01c490)
at qemu/qemu_driver.c:5922
#6 0x00007f139ffc6e51 in virDomainCreate (domain=0x7f138c01c490)
at libvirt.c:9357
#7 0x00007f13a0a46ca2 in remoteDispatchDomainCreate (server=0x7f13a0cd9500,
client=0x7f138c01b060, msg=0x7f138c014920, rerr=0x7f13995d0b10,
args=0x7f138c01c4d0) at remote_dispatch.h:2931
#8 0x00007f13a0a46da4 in remoteDispatchDomainCreateHelper (
Stack 2:
#0 virCgroupMakeGroup (parent=0x7f138c01e1a0, group=0x7f138c0080f0,
create=true, flags=0) at util/vircgroup.c:751
#1 0x00007f139fec4f89 in virCgroupNewPartition (
path=0x7f138c008ee0 "/machine", create=true, controllers=-1,
group=0x7f13995d00f8) at util/vircgroup.c:1286
#2 0x00007f1394593888 in qemuInitCgroup (driver=0x7f13a0d01570,
vm=0x7f138c0116a0, startup=true) at qemu/qemu_cgroup.c:779
Thread A and Thread B start VMs at the same time.
In virCgroupMakeGroup:
    if (!virFileExists(path)) {    /* Threads A and B both run this test at
                                    * the same time, and both pass. */
        if (!create ||
            mkdir(path, 0755) < 0) {    /* If thread A succeeds here, thread B
                                         * fails with "File exists". B then
                                         * runs its cleanup and removes the
                                         * 'machine' directory, so A may later
                                         * fail to find the directory that B
                                         * removed. */
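To make the idea concrete, here is a minimal sketch of a create step that tolerates losing the race. It is not the libvirt code: ensure_dir is a hypothetical helper name, and the extra stat() check is my own assumption about what a careful caller would want. The point is only that mkdir() is attempted directly and EEXIST is treated as success, instead of testing for existence first and creating second.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not a libvirt function): create a directory and
 * treat "already exists" as success, so that two threads racing to create
 * the same cgroup directory can both continue.
 * Returns 0 on success, -1 on failure with errno set.
 */
int
ensure_dir(const char *path, mode_t mode)
{
    struct stat sb;

    if (mkdir(path, mode) == 0)
        return 0;               /* this caller created the directory */

    if (errno != EEXIST)
        return -1;              /* real failure, e.g. ENOENT or EACCES */

    /* Another caller got there first; check it really is a directory. */
    if (stat(path, &sb) < 0)
        return -1;
    if (!S_ISDIR(sb.st_mode)) {
        errno = ENOTDIR;
        return -1;
    }
    return 0;
}
```

With this shape, thread B no longer takes the error path when thread A wins the mkdir(), so B has no reason to run the cleanup that removes the directory under A.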
________________________________________
From: Daniel P. Berrange [berrange(a)redhat.com]
Sent: 21 March 2014 1:16
To: Michal Privoznik
Cc: Wangyufei (James); libvir-list(a)redhat.com; Moyuxiang; Zhaoyanbin (A); Wangrui (K)
Subject: Re: [libvirt] [PATCH] cgroup: Fix start VMs coincidently failed
On Thu, Mar 20, 2014 at 05:04:13PM +0100, Michal Privoznik wrote:
On 20.03.2014 08:24, Wangyufei (James) wrote:
>From 0163328efa67da1d63e504c86e323db5affa378f Mon Sep 17 00:00:00 2001
>From: Wang Yufei <james.wangyufei(a)huawei.com>
>Date: Thu, 20 Mar 2014 07:14:01 +0000
>Subject: [PATCH] cgroup: Fix start VMs coincidently failed
>When I start multi VMs coincidently and any of the cgroup directories
>named machine doesn't exist. There's a chance that VM start failed because
>of creating directory failed:
>Unable to initialize /machine cgroup: File exists
>When the errno returned by mkdir in virCgroupMakeGroup is EEXIST,
>we should pass it through and continue to start the VM.
>Signed-off-by: Wang Yufei <james.wangyufei(a)huawei.com>
>---
> src/util/vircgroup.c | 4 ++++
> 1 file changed, 4 insertions(+)
>diff --git a/src/util/vircgroup.c b/src/util/vircgroup.c
>index c5925b1..a10d6f6 100644
>--- a/src/util/vircgroup.c
>+++ b/src/util/vircgroup.c
>@@ -924,6 +924,10 @@ virCgroupMakeGroup(virCgroupPtr parent,
> if (!virFileExists(path)) {
> if (!create ||
> mkdir(path, 0755) < 0) {
>+ if (errno == EEXIST) {
>+ VIR_FREE(path);
>+ continue;
>+ }
> /* With a kernel that doesn't support multi-level directory
> * for blkio controller, libvirt will fail and disable all
> * other controllers even though they are available. So
>
NACK. Prior to starting a domain we make sure that no historical
cgroup is lying around. So if we don't remove the cgroup there
that's the actual bug and this just shadows it. We can't guarantee
anything if the old cgroup is not removed and the new one is created
by us. However, we are not removing the stale cgroup in the case of LXC,
only in QEMU. Is it LXC that you are seeing this error on?
I think there is actually a genuine race condition here, at least when
using systemd for cgroup management.
When we invoke "CreateMachine" in the systemd-machined DBus API, it will
only create the directory hierarchy /sys/fs/cgroup/systemd/some/sub/dir/guestname
If the other resource controllers are mounted separately, libvirt then has
to manually create dirs /sys/fs/cgroup/{cpu,cpuacct,blkio,...}/some/sub/dir/guestname
I believe it is thus entirely possible for there to be a race in creating
the intermediate nodes in this tree (i.e. the /some/sub/dir part) which may
be common to many guests.
When not using systemd, we require that the admin has pre-created the
/some/sub/dir part for all resource controllers, so we shouldn't have
a race in that non-systemd case.
Regards,
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|