[libvirt] [PATCH 1/5] Fix formatting of XML for an inactive guest

If the virDomainDefPtr object has an 'id' of -1, then forcably set the VIR_DOMAIN_XML_INACTIVE flag to ensure generated XML does not include any cruft from the previously running guest such as console PTY path, or VNC port. * src/conf/domain_conf.c: Set VIR_DOMAIN_XML_INACTIVE if def->id is -1. Replace checks for def->id == -1 with check against flags & VIR_DOMAIN_XML_INACTIVE. --- src/conf/domain_conf.c | 10 ++++++---- 1 files changed, 6 insertions(+), 4 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 9e37452..c2c07ca 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4407,7 +4407,6 @@ virDomainInputDefFormat(virConnectPtr conn, static int virDomainGraphicsDefFormat(virConnectPtr conn, virBufferPtr buf, - virDomainDefPtr vm, virDomainGraphicsDefPtr def, int flags) { @@ -4424,7 +4423,7 @@ virDomainGraphicsDefFormat(virConnectPtr conn, switch (def->type) { case VIR_DOMAIN_GRAPHICS_TYPE_VNC: if (def->data.vnc.port && - (!def->data.vnc.autoport || vm->id != -1)) + (!def->data.vnc.autoport || !(flags & VIR_DOMAIN_XML_INACTIVE))) virBufferVSprintf(buf, " port='%d'", def->data.vnc.port); else if (def->data.vnc.autoport) @@ -4579,7 +4578,10 @@ char *virDomainDefFormat(virConnectPtr conn, goto cleanup; } - if (def->id >= 0) + if (def->id == -1) + flags |= VIR_DOMAIN_XML_INACTIVE; + + if (!(flags & VIR_DOMAIN_XML_INACTIVE)) virBufferVSprintf(&buf, "<domain type='%s' id='%d'>\n", type, def->id); else virBufferVSprintf(&buf, "<domain type='%s'>\n", type); @@ -4770,7 +4772,7 @@ char *virDomainDefFormat(virConnectPtr conn, goto cleanup; for (n = 0 ; n < def->ngraphics ; n++) - if (virDomainGraphicsDefFormat(conn, &buf, def, def->graphics[n], flags) < 0) + if (virDomainGraphicsDefFormat(conn, &buf, def->graphics[n], flags) < 0) goto cleanup; } -- 1.6.2.5

There is a race condition in HAL driver startup where the callback can get triggered before we have finished startup. This then causes a deadlock in the driver. * src/node_device/node_device_hal.c: RElease driver lock before registering DBus callbacks --- src/node_device/node_device_hal.c | 14 ++++++++++---- 1 files changed, 10 insertions(+), 4 deletions(-) diff --git a/src/node_device/node_device_hal.c b/src/node_device/node_device_hal.c index 818c7d6..918a3a9 100644 --- a/src/node_device/node_device_hal.c +++ b/src/node_device/node_device_hal.c @@ -742,6 +742,16 @@ static int halDeviceMonitorStartup(int privileged ATTRIBUTE_UNUSED) goto failure; } + /* Populate with known devices */ + driverState->privateData = hal_ctx; + + /* We need to unlock state now, since setting these callbacks cause + * a dbus RPC call, and while this call is waiting for the reply, + * a signal may already arrive, triggering the callback and thus + * requiring the lock ! + */ + nodeDeviceUnlock(driverState); + /* Register HAL event callbacks */ if (!libhal_ctx_set_device_added(hal_ctx, device_added) || !libhal_ctx_set_device_removed(hal_ctx, device_removed) || @@ -753,10 +763,6 @@ static int halDeviceMonitorStartup(int privileged ATTRIBUTE_UNUSED) goto failure; } - /* Populate with known devices */ - driverState->privateData = hal_ctx; - - nodeDeviceUnlock(driverState); udi = libhal_get_all_devices(hal_ctx, &num_devs, &err); if (udi == NULL) { VIR_ERROR0("libhal_get_all_devices failed\n"); -- 1.6.2.5

On Thu, Nov 12, 2009 at 03:00:48PM +0000, Daniel P. Berrange wrote:
There is a race condition in HAL driver startup where the callback can get triggered before we have finished startup. This then causes a deadlock in the driver.
* src/node_device/node_device_hal.c: RElease driver lock before registering DBus callbacks --- src/node_device/node_device_hal.c | 14 ++++++++++---- 1 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/src/node_device/node_device_hal.c b/src/node_device/node_device_hal.c index 818c7d6..918a3a9 100644 --- a/src/node_device/node_device_hal.c +++ b/src/node_device/node_device_hal.c @@ -742,6 +742,16 @@ static int halDeviceMonitorStartup(int privileged ATTRIBUTE_UNUSED) goto failure; }
+ /* Populate with known devices */ + driverState->privateData = hal_ctx; + + /* We need to unlock state now, since setting these callbacks cause + * a dbus RPC call, and while this call is waiting for the reply, + * a signal may already arrive, triggering the callback and thus + * requiring the lock ! + */ + nodeDeviceUnlock(driverState); + /* Register HAL event callbacks */ if (!libhal_ctx_set_device_added(hal_ctx, device_added) || !libhal_ctx_set_device_removed(hal_ctx, device_removed) || @@ -753,10 +763,6 @@ static int halDeviceMonitorStartup(int privileged ATTRIBUTE_UNUSED) goto failure; }
- /* Populate with known devices */ - driverState->privateData = hal_ctx; - - nodeDeviceUnlock(driverState); udi = libhal_get_all_devices(hal_ctx, &num_devs, &err); if (udi == NULL) { VIR_ERROR0("libhal_get_all_devices failed\n");
Tricky, ACK, Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

In the scenario where the cgroups were mounted but the particular group did not exist, and the caller had not requested auto-creation, the code would fail to return an error condition. This caused the lxc_controller to think the cgroup existed, and it then later failed when attempting to use it * src/util/cgroup.c: Raise an error if the cgroup path does not exist --- src/util/cgroup.c | 17 ++++++++--------- 1 files changed, 8 insertions(+), 9 deletions(-) diff --git a/src/util/cgroup.c b/src/util/cgroup.c index bdd4eb6..c80cf50 100644 --- a/src/util/cgroup.c +++ b/src/util/cgroup.c @@ -442,7 +442,7 @@ static int virCgroupCpuSetInherit(virCgroupPtr parent, virCgroupPtr group) return rc; } -static int virCgroupMakeGroup(virCgroupPtr parent, virCgroupPtr group) +static int virCgroupMakeGroup(virCgroupPtr parent, virCgroupPtr group, int create) { int i; int rc = 0; @@ -461,7 +461,8 @@ static int virCgroupMakeGroup(virCgroupPtr parent, virCgroupPtr group) VIR_DEBUG("Make controller %s", path); if (access(path, F_OK) != 0) { - if (mkdir(path, 0755) < 0) { + if (!create || + mkdir(path, 0755) < 0) { rc = -errno; VIR_FREE(path); break; @@ -548,7 +549,7 @@ static int virCgroupAppRoot(int privileged, if (rc != 0) goto cleanup; - rc = virCgroupMakeGroup(rootgrp, *group); + rc = virCgroupMakeGroup(rootgrp, *group, 1); cleanup: virCgroupFree(&rootgrp); @@ -647,9 +648,8 @@ int virCgroupForDriver(const char *name, rc = virCgroupNew(path, group); VIR_FREE(path); - if (rc == 0 && - create) { - rc = virCgroupMakeGroup(rootgrp, *group); + if (rc == 0) { + rc = virCgroupMakeGroup(rootgrp, *group, create); if (rc != 0) virCgroupFree(group); } @@ -695,9 +695,8 @@ int virCgroupForDomain(virCgroupPtr driver, rc = virCgroupNew(path, group); VIR_FREE(path); - if (rc == 0 && - create) { - rc = virCgroupMakeGroup(driver, *group); + if (rc == 0) { + rc = virCgroupMakeGroup(driver, *group, create); if (rc != 0) virCgroupFree(group); } -- 1.6.2.5

On Fri, Nov 13, 2009 at 12:00 AM, Daniel P. Berrange <berrange@redhat.com> wrote:
In the scenario where the cgroups were mounted but the particular group did not exist, and the caller had not requested auto-creation, the code would fail to return an error condition. This caused the lxc_controller to think the cgroup existed, and it then later failed when attempting to use it
* src/util/cgroup.c: Raise an error if the cgroup path does not exist --- src/util/cgroup.c | 17 ++++++++--------- 1 files changed, 8 insertions(+), 9 deletions(-)
diff --git a/src/util/cgroup.c b/src/util/cgroup.c index bdd4eb6..c80cf50 100644 --- a/src/util/cgroup.c +++ b/src/util/cgroup.c @@ -442,7 +442,7 @@ static int virCgroupCpuSetInherit(virCgroupPtr parent, virCgroupPtr group) return rc; }
-static int virCgroupMakeGroup(virCgroupPtr parent, virCgroupPtr group) +static int virCgroupMakeGroup(virCgroupPtr parent, virCgroupPtr group, int create) { int i; int rc = 0; @@ -461,7 +461,8 @@ static int virCgroupMakeGroup(virCgroupPtr parent, virCgroupPtr group)
VIR_DEBUG("Make controller %s", path); if (access(path, F_OK) != 0) { - if (mkdir(path, 0755) < 0) { + if (!create || + mkdir(path, 0755) < 0) {
Got it. if creation is not requested, then errno of access(2) is returned before attempting mkdir(2). The code looks correct to me, ACK. ozaki-r
rc = -errno; VIR_FREE(path); break; @@ -548,7 +549,7 @@ static int virCgroupAppRoot(int privileged, if (rc != 0) goto cleanup;
- rc = virCgroupMakeGroup(rootgrp, *group); + rc = virCgroupMakeGroup(rootgrp, *group, 1);
cleanup: virCgroupFree(&rootgrp); @@ -647,9 +648,8 @@ int virCgroupForDriver(const char *name, rc = virCgroupNew(path, group); VIR_FREE(path);
- if (rc == 0 && - create) { - rc = virCgroupMakeGroup(rootgrp, *group); + if (rc == 0) { + rc = virCgroupMakeGroup(rootgrp, *group, create); if (rc != 0) virCgroupFree(group); } @@ -695,9 +695,8 @@ int virCgroupForDomain(virCgroupPtr driver, rc = virCgroupNew(path, group); VIR_FREE(path);
- if (rc == 0 && - create) { - rc = virCgroupMakeGroup(driver, *group); + if (rc == 0) { + rc = virCgroupMakeGroup(driver, *group, create); if (rc != 0) virCgroupFree(group); } -- 1.6.2.5
-- Libvir-list mailing list Libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

The wrong variable was being passed in with the LXC event callback resulting in a later deadlock or crash * src/lxc/lxc_driver.c: Pass 'vm' instead of 'driver' to event callback --- src/lxc/lxc_driver.c | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-) diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c index 2baff65..7c78df2 100644 --- a/src/lxc/lxc_driver.c +++ b/src/lxc/lxc_driver.c @@ -941,7 +941,8 @@ static void lxcMonitorEvent(int watch, } cleanup: - virDomainObjUnlock(vm); + if (vm) + virDomainObjUnlock(vm); if (event) { lxcDriverLock(driver); lxcDomainEventQueue(driver, event); @@ -1225,7 +1226,7 @@ static int lxcVmStart(virConnectPtr conn, vm->monitor, VIR_EVENT_HANDLE_ERROR | VIR_EVENT_HANDLE_HANGUP, lxcMonitorEvent, - driver, NULL)) < 0) { + vm, NULL)) < 0) { lxcVmTerminate(conn, driver, vm, 0); goto cleanup; } -- 1.6.2.5

On Fri, Nov 13, 2009 at 12:00 AM, Daniel P. Berrange <berrange@redhat.com> wrote:
The wrong variable was being passed in with the LXC event callback resulting in a later deadlock or crash
* src/lxc/lxc_driver.c: Pass 'vm' instead of 'driver' to event callback --- src/lxc/lxc_driver.c | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c index 2baff65..7c78df2 100644 --- a/src/lxc/lxc_driver.c +++ b/src/lxc/lxc_driver.c @@ -941,7 +941,8 @@ static void lxcMonitorEvent(int watch, }
cleanup: - virDomainObjUnlock(vm); + if (vm) + virDomainObjUnlock(vm);
Hem, if vm is possible to be NULL, above virDomainObjLock(vm) likely fails prior to here. So we also need non-NULL check before virDomainObjLock?
if (event) { lxcDriverLock(driver); lxcDomainEventQueue(driver, event); @@ -1225,7 +1226,7 @@ static int lxcVmStart(virConnectPtr conn, vm->monitor, VIR_EVENT_HANDLE_ERROR | VIR_EVENT_HANDLE_HANGUP, lxcMonitorEvent, - driver, NULL)) < 0) { + vm, NULL)) < 0) {
Oh, horrible ;< ACK to this one anyway. ozaki-r
lxcVmTerminate(conn, driver, vm, 0); goto cleanup; } -- 1.6.2.5
-- Libvir-list mailing list Libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On Fri, Nov 13, 2009 at 06:01:46AM +0900, Ryota Ozaki wrote:
On Fri, Nov 13, 2009 at 12:00 AM, Daniel P. Berrange <berrange@redhat.com> wrote:
The wrong variable was being passed in with the LXC event callback resulting in a later deadlock or crash
* src/lxc/lxc_driver.c: Pass 'vm' instead of 'driver' to event callback --- src/lxc/lxc_driver.c | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c index 2baff65..7c78df2 100644 --- a/src/lxc/lxc_driver.c +++ b/src/lxc/lxc_driver.c @@ -941,7 +941,8 @@ static void lxcMonitorEvent(int watch, }
cleanup: - virDomainObjUnlock(vm); + if (vm) + virDomainObjUnlock(vm);
Hem, if vm is possible to be NULL, above virDomainObjLock(vm) likely fails prior to here. So we also need non-NULL check before virDomainObjLock?
No, it is only needed in the cleanup path, due to this bit of code: if (!vm->persistent) { virDomainRemoveInactive(&driver->domains, vm); vm = NULL; } cleanup: virDomainObjUnlock(vm); Regards, Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On Fri, Nov 13, 2009 at 8:49 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
On Fri, Nov 13, 2009 at 06:01:46AM +0900, Ryota Ozaki wrote:
On Fri, Nov 13, 2009 at 12:00 AM, Daniel P. Berrange <berrange@redhat.com> wrote:
The wrong variable was being passed in with the LXC event callback resulting in a later deadlock or crash
* src/lxc/lxc_driver.c: Pass 'vm' instead of 'driver' to event callback --- src/lxc/lxc_driver.c | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-)
diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c index 2baff65..7c78df2 100644 --- a/src/lxc/lxc_driver.c +++ b/src/lxc/lxc_driver.c @@ -941,7 +941,8 @@ static void lxcMonitorEvent(int watch, }
cleanup: - virDomainObjUnlock(vm); + if (vm) + virDomainObjUnlock(vm);
Hem, if vm is possible to be NULL, above virDomainObjLock(vm) likely fails prior to here. So we also need non-NULL check before virDomainObjLock?
No, it is only needed in the cleanup path, due to this bit of code:
if (!vm->persistent) { virDomainRemoveInactive(&driver->domains, vm); vm = NULL; }
cleanup: virDomainObjUnlock(vm);
Oh, I missed that... You're right, the fix is correct. ozaki-r
Regards, Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

The 'virsh console' command did not check if the domain was already running before attempting to fetch the XML and extract the console PTY path. This caused a slightly unhelpful / misleading error message for the user. The explicit check ensures the user gets an explicit 'domain is not running' message. * tools/virsh.c: Validate that state != VIR_DOMAIN_SHUTOFF in virsh console command --- tools/virsh.c | 11 +++++++++++ 1 files changed, 11 insertions(+), 0 deletions(-) diff --git a/tools/virsh.c b/tools/virsh.c index 0d0ebca..9faac35 100644 --- a/tools/virsh.c +++ b/tools/virsh.c @@ -523,6 +523,7 @@ cmdRunConsole(vshControl *ctl, virDomainPtr dom) char *doc; char *thatHost = NULL; char *thisHost = NULL; + virDomainInfo dominfo; if (!(thisHost = virGetHostname(ctl->conn))) { vshError(ctl, "%s", _("Failed to get local hostname")); @@ -539,6 +540,16 @@ cmdRunConsole(vshControl *ctl, virDomainPtr dom) goto cleanup; } + if (virDomainGetInfo(dom, &dominfo) < 0) { + vshError(ctl, "%s", _("Unable to get domain status")); + goto cleanup; + } + + if (dominfo.state == VIR_DOMAIN_SHUTOFF) { + vshError(ctl, "%s", _("The domain is not running")); + goto cleanup; + } + doc = virDomainGetXMLDesc(dom, 0); if (!doc) goto cleanup; -- 1.6.2.5

On Thu, Nov 12, 2009 at 03:00:51PM +0000, Daniel P. Berrange wrote:
The 'virsh console' command did not check if the domain was already running before attempting to fetch the XML and extract the console PTY path. This caused a slightly unhelpful / misleading error message for the user. The explicit check ensures the user gets an explicit 'domain is not running' message.
* tools/virsh.c: Validate that state != VIR_DOMAIN_SHUTOFF in virsh console command --- tools/virsh.c | 11 +++++++++++ 1 files changed, 11 insertions(+), 0 deletions(-)
diff --git a/tools/virsh.c b/tools/virsh.c index 0d0ebca..9faac35 100644 --- a/tools/virsh.c +++ b/tools/virsh.c @@ -523,6 +523,7 @@ cmdRunConsole(vshControl *ctl, virDomainPtr dom) char *doc; char *thatHost = NULL; char *thisHost = NULL; + virDomainInfo dominfo;
if (!(thisHost = virGetHostname(ctl->conn))) { vshError(ctl, "%s", _("Failed to get local hostname")); @@ -539,6 +540,16 @@ cmdRunConsole(vshControl *ctl, virDomainPtr dom) goto cleanup; }
+ if (virDomainGetInfo(dom, &dominfo) < 0) { + vshError(ctl, "%s", _("Unable to get domain status")); + goto cleanup; + } + + if (dominfo.state == VIR_DOMAIN_SHUTOFF) { + vshError(ctl, "%s", _("The domain is not running")); + goto cleanup; + } + doc = virDomainGetXMLDesc(dom, 0); if (!doc) goto cleanup;
Ah, right, ACK ! Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/

On Thu, Nov 12, 2009 at 03:00:47PM +0000, Daniel P. Berrange wrote:
If the virDomainDefPtr object has an 'id' of -1, then forcably set the VIR_DOMAIN_XML_INACTIVE flag to ensure generated XML does not include any cruft from the previously running guest such as console PTY path, or VNC port.
* src/conf/domain_conf.c: Set VIR_DOMAIN_XML_INACTIVE if def->id is -1. Replace checks for def->id == -1 with check against flags & VIR_DOMAIN_XML_INACTIVE. --- src/conf/domain_conf.c | 10 ++++++---- 1 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 9e37452..c2c07ca 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -4407,7 +4407,6 @@ virDomainInputDefFormat(virConnectPtr conn, static int virDomainGraphicsDefFormat(virConnectPtr conn, virBufferPtr buf, - virDomainDefPtr vm, virDomainGraphicsDefPtr def, int flags) { @@ -4424,7 +4423,7 @@ virDomainGraphicsDefFormat(virConnectPtr conn, switch (def->type) { case VIR_DOMAIN_GRAPHICS_TYPE_VNC: if (def->data.vnc.port && - (!def->data.vnc.autoport || vm->id != -1)) + (!def->data.vnc.autoport || !(flags & VIR_DOMAIN_XML_INACTIVE))) virBufferVSprintf(buf, " port='%d'", def->data.vnc.port); else if (def->data.vnc.autoport) @@ -4579,7 +4578,10 @@ char *virDomainDefFormat(virConnectPtr conn, goto cleanup; }
- if (def->id >= 0) + if (def->id == -1) + flags |= VIR_DOMAIN_XML_INACTIVE; + + if (!(flags & VIR_DOMAIN_XML_INACTIVE)) virBufferVSprintf(&buf, "<domain type='%s' id='%d'>\n", type, def->id); else virBufferVSprintf(&buf, "<domain type='%s'>\n", type); @@ -4770,7 +4772,7 @@ char *virDomainDefFormat(virConnectPtr conn, goto cleanup;
for (n = 0 ; n < def->ngraphics ; n++) - if (virDomainGraphicsDefFormat(conn, &buf, def, def->graphics[n], flags) < 0) + if (virDomainGraphicsDefFormat(conn, &buf, def->graphics[n], flags) < 0) goto cleanup; }
ACK Daniel -- Daniel Veillard | libxml Gnome XML XSLT toolkit http://xmlsoft.org/ daniel@veillard.com | Rpmfind RPM search engine http://rpmfind.net/ http://veillard.com/ | virtualization library http://libvirt.org/
participants (3)
-
Daniel P. Berrange
-
Daniel Veillard
-
Ryota Ozaki