[libvirt] [RFC PATCH 0/8] Make loading domains with invalid XML possible

We always had to deal with new parsing errors in a weird way. All of them needed to go into functions starting the domains. That messes up the code, it's confusing to newcomers and so on. I once had an idea that we can handle this stuff. We know what failed, we have the XML that failed parsing and if the problem is not in the domain name nor UUID, we can create a domain object out of that as well. This way we are able to do something we weren't with this series applied. Example follows. Assume "dummy" is a domain with invalid XML (I just modified the file). Now, just for the purpose of this silly example, let's say we allowed domains with archtecture x86_*, but now we've realized that x86_64 is the only one we want to allow, but there already is a domain defined that has <type arch='x86_256' .../>. With the current master, the domain would be lost, so we would need to modify the funstion starting the domain (e.g. qemuProcessStart for qemu domain). However, with this series, it behaves like this: # virsh list --all Id Name State ---------------------------------------------------- - dummy shut off # virsh domstate --reason dummy shut off (invalid XML) # virsh start dummy error: Failed to start domain dummy error: XML error: domain 'dummy' was not loaded due to an XML error (unsupported configuration: Unknown architecture x86_256), please redefine it # VISUAL='sed -i s/x86_256/x86_64/' virsh edit dummy Domain dummy XML configuration edited. # virsh domstate --reason dummy shut off (unknown) # virsh start dummy Domain dummy started This is a logical next step for us to clean and separate parsing and starting, getting rid of some old code without sacrifying compatibility and maybe generating parser code in the future. Why is this an RFC? - I can certainly easily imagine some people who might not like this - It doesn't work for all drivers (yet) - It does handle status XMLs, but only slightly. All the handling is done in a commit message that says something along the lines of "beware, we should still be careful, but not as much as before". - The error type used for the message is one that is already used for something else and we might want a new type of error for exactly this one error message, although it might be enough for some to have the possibility of checking this by querying the domain state and checking the reason. - Just so you know I came up with something new for which there's no BZ (or at least yet) =) Martin Kletzander (8): conf, virsh: Add new domain shutoff reason qemu: Few whitespace cleanups conf: Extract name-parsing into its own function conf: Extract UUID parsing into its own function conf: Optionally keep domains with invalid XML qemu: Don't lookup invalid domains unless specified otherwise qemu: Prepare basic APIs to handle invalid defs qemu: Load domains with invalid XML on start include/libvirt/libvirt-domain.h | 2 + src/bhyve/bhyve_driver.c | 2 + src/conf/domain_conf.c | 180 +++++++++++++++++++++++++++++++++------ src/conf/domain_conf.h | 5 ++ src/libxl/libxl_driver.c | 3 + src/lxc/lxc_driver.c | 3 + src/qemu/qemu_driver.c | 64 +++++++++++--- src/uml/uml_driver.c | 2 + tools/virsh-domain-monitor.c | 3 +- 9 files changed, 223 insertions(+), 41 deletions(-) -- 2.5.3

This new reason means the domain is shutoff because parsing of the XML failed while daemon was starting and from user's point of view, there's nothing else to do with it other than re-defining it. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- include/libvirt/libvirt-domain.h | 2 ++ src/conf/domain_conf.c | 3 ++- tools/virsh-domain-monitor.c | 3 ++- 3 files changed, 6 insertions(+), 2 deletions(-) diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h index a1ea6a5d0786..37aa6db85cda 100644 --- a/include/libvirt/libvirt-domain.h +++ b/include/libvirt/libvirt-domain.h @@ -142,6 +142,8 @@ typedef enum { VIR_DOMAIN_SHUTOFF_FAILED = 6, /* domain failed to start */ VIR_DOMAIN_SHUTOFF_FROM_SNAPSHOT = 7, /* restored from a snapshot which was * taken while domain was shutoff */ + VIR_DOMAIN_SHUTOFF_INVALID_XML = 8, /* Domain XML failed to load for some + * reason, it needs to be re-defined */ # ifdef VIR_ENUM_SENTINELS VIR_DOMAIN_SHUTOFF_LAST # endif diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index a3b3ccb0d7cc..564f66de25bb 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -710,7 +710,8 @@ VIR_ENUM_IMPL(virDomainShutoffReason, VIR_DOMAIN_SHUTOFF_LAST, "migrated", "saved", "failed", - "from snapshot") + "from snapshot", + "invalid XML") VIR_ENUM_IMPL(virDomainCrashedReason, VIR_DOMAIN_CRASHED_LAST, "unknown", diff --git a/tools/virsh-domain-monitor.c b/tools/virsh-domain-monitor.c index d4e500b21225..cdcee3cbb267 100644 --- a/tools/virsh-domain-monitor.c +++ b/tools/virsh-domain-monitor.c @@ -218,7 +218,8 @@ VIR_ENUM_IMPL(virshDomainShutoffReason, N_("migrated"), N_("saved"), N_("failed"), - N_("from snapshot")) + N_("from snapshot"), + N_("invalid XML")) VIR_ENUM_DECL(virshDomainCrashedReason) VIR_ENUM_IMPL(virshDomainCrashedReason, -- 2.5.3

Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_driver.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index fc3b60d01c01..8197f0238fb7 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -7043,9 +7043,9 @@ qemuDomainObjRestore(virConnectPtr conn, } -static char -*qemuDomainGetXMLDesc(virDomainPtr dom, - unsigned int flags) +static char * +qemuDomainGetXMLDesc(virDomainPtr dom, + unsigned int flags) { virQEMUDriverPtr driver = dom->conn->privateData; virDomainObjPtr vm; @@ -7449,7 +7449,10 @@ qemuDomainCreate(virDomainPtr dom) return qemuDomainCreateWithFlags(dom, 0); } -static virDomainPtr qemuDomainDefineXMLFlags(virConnectPtr conn, const char *xml, unsigned int flags) +static virDomainPtr +qemuDomainDefineXMLFlags(virConnectPtr conn, + const char *xml, + unsigned int flags) { virQEMUDriverPtr driver = conn->privateData; virDomainDefPtr def = NULL; -- 2.5.3

Create virDomainDefParseName that parses only the name from XML definition. This will be used in future patches. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/conf/domain_conf.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 564f66de25bb..c2aeca2e52a6 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -14579,6 +14579,19 @@ virDomainVcpuParse(virDomainDefPtr def, return ret; } +static int +virDomainDefParseName(virDomainDefPtr def, + xmlXPathContextPtr ctxt) +{ + if (!(def->name = virXPathString("string(./name[1])", ctxt))) { + virReportError(VIR_ERR_NO_NAME, NULL); + return -1; + } + + return 0; +} + + static virDomainDefPtr virDomainDefParseXML(xmlDocPtr xml, xmlNodePtr root, @@ -14703,11 +14716,8 @@ virDomainDefParseXML(xmlDocPtr xml, VIR_FREE(capsdata); } - /* Extract domain name */ - if (!(def->name = virXPathString("string(./name[1])", ctxt))) { - virReportError(VIR_ERR_NO_NAME, NULL); + if (virDomainDefParseName(def, ctxt) < 0) goto error; - } /* Extract domain uuid. If both uuid and sysinfo/system/entry/uuid * exist, they must match; and if only the latter exists, it can -- 2.5.3

Create virDomainDefParseUUID that parses only the UUID from XML definition. This will be used in future patch(es). Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/conf/domain_conf.c | 64 +++++++++++++++++++++++++++++++++++--------------- 1 file changed, 45 insertions(+), 19 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index c2aeca2e52a6..230800542f8c 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -14579,6 +14579,49 @@ virDomainVcpuParse(virDomainDefPtr def, return ret; } +/* + * Parse domain's UUID, optionally generating it if missing. + */ +static int +virDomainDefParseUUID(virDomainDefPtr def, + xmlXPathContextPtr ctxt, + bool required, + bool *generated) +{ + char *tmp = virXPathString("string(./uuid[1])", ctxt); + + if (generated) + *generated = false; + + if (tmp) { + if (virUUIDParse(tmp, def->uuid) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("malformed uuid element")); + VIR_FREE(tmp); + return -1; + } + VIR_FREE(tmp); + return 0; + } + + if (required) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Missing required UUID")); + return -1; + } + + if (virUUIDGenerate(def->uuid)) { + virReportError(VIR_ERR_INTERNAL_ERROR, + "%s", _("Failed to generate UUID")); + return -1; + } + + if (generated) + *generated = true; + + return 0; +} + static int virDomainDefParseName(virDomainDefPtr def, xmlXPathContextPtr ctxt) @@ -14719,25 +14762,8 @@ virDomainDefParseXML(xmlDocPtr xml, if (virDomainDefParseName(def, ctxt) < 0) goto error; - /* Extract domain uuid. If both uuid and sysinfo/system/entry/uuid - * exist, they must match; and if only the latter exists, it can - * also serve as the uuid. */ - tmp = virXPathString("string(./uuid[1])", ctxt); - if (!tmp) { - if (virUUIDGenerate(def->uuid)) { - virReportError(VIR_ERR_INTERNAL_ERROR, - "%s", _("Failed to generate UUID")); - goto error; - } - uuid_generated = true; - } else { - if (virUUIDParse(tmp, def->uuid) < 0) { - virReportError(VIR_ERR_INTERNAL_ERROR, - "%s", _("malformed uuid element")); - goto error; - } - VIR_FREE(tmp); - } + if (virDomainDefParseUUID(def, ctxt, false, &uuid_generated) < 0) + goto error; /* Extract short description of domain (title) */ def->title = virXPathString("string(./title[1])", ctxt); -- 2.5.3

Add new parameter to virDomainObjListLoadConfig() and virDomainObjListLoadAllConfigs() that controls whether domains with invalid XML (which could not be parsed) should be kept in order not to lose track of them. For now, the parameter is set to false in all callers. Each driver can switch it to true when it is prepared to deal with such domains. For the domain object to be created add virDomainDefParseMinimal() that parses only name and UUID from the XML definition. UUID must be present, it will not be generated. The purpose of this function is to be used when all else fails, but we still want a domain object to work with. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/bhyve/bhyve_driver.c | 2 + src/conf/domain_conf.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++-- src/conf/domain_conf.h | 5 +++ src/libxl/libxl_driver.c | 3 ++ src/lxc/lxc_driver.c | 3 ++ src/qemu/qemu_driver.c | 3 ++ src/uml/uml_driver.c | 2 + 7 files changed, 109 insertions(+), 4 deletions(-) diff --git a/src/bhyve/bhyve_driver.c b/src/bhyve/bhyve_driver.c index 7f365b1f2446..abf6789c2f9c 100644 --- a/src/bhyve/bhyve_driver.c +++ b/src/bhyve/bhyve_driver.c @@ -1219,6 +1219,7 @@ bhyveStateInitialize(bool privileged, NULL, 1, bhyve_driver->caps, bhyve_driver->xmlopt, + false, NULL, NULL) < 0) goto cleanup; @@ -1227,6 +1228,7 @@ bhyveStateInitialize(bool privileged, BHYVE_AUTOSTART_DIR, 0, bhyve_driver->caps, bhyve_driver->xmlopt, + false, NULL, NULL) < 0) goto cleanup; diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 230800542f8c..6850e79396bb 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -22676,6 +22676,39 @@ virDomainSaveStatus(virDomainXMLOptionPtr xmlopt, return ret; } +static virDomainDefPtr +virDomainDefParseMinimal(const char *filename, + const char *xmlStr) +{ + xmlXPathContextPtr ctxt = NULL; + virDomainDefPtr def = NULL; + xmlDocPtr xml = NULL; + + if (!(xml = virXMLParse(filename, xmlStr, _("(domain_definition)")))) + goto cleanup; + + ctxt = xmlXPathNewContext(xml); + if (ctxt == NULL) { + virReportOOMError(); + goto cleanup; + } + + ctxt->node = xmlDocGetRootElement(xml); + + if (!(def = virDomainDefNew())) + goto cleanup; + + def->id = -1; + + if (virDomainDefParseName(def, ctxt) < 0 || + virDomainDefParseUUID(def, ctxt, true, NULL) < 0) + virDomainDefFree(def); + + cleanup: + xmlFreeDoc(xml); + xmlXPathFreeContext(ctxt); + return def; +} static virDomainObjPtr virDomainObjListLoadConfig(virDomainObjListPtr doms, @@ -22684,6 +22717,7 @@ virDomainObjListLoadConfig(virDomainObjListPtr doms, const char *configDir, const char *autostartDir, const char *name, + bool keep_invalid, virDomainLoadConfigNotify notify, void *opaque) { @@ -22692,13 +22726,57 @@ virDomainObjListLoadConfig(virDomainObjListPtr doms, virDomainObjPtr dom; int autostart; virDomainDefPtr oldDef = NULL; + char *xmlStr = NULL; + char *xmlErr = NULL; if ((configFile = virDomainConfigFile(configDir, name)) == NULL) goto error; - if (!(def = virDomainDefParseFile(configFile, caps, xmlopt, - VIR_DOMAIN_DEF_PARSE_INACTIVE | - VIR_DOMAIN_DEF_PARSE_SKIP_OSTYPE_CHECKS))) - goto error; + + def = virDomainDefParseFile(configFile, caps, xmlopt, + VIR_DOMAIN_DEF_PARSE_INACTIVE | + VIR_DOMAIN_DEF_PARSE_SKIP_OSTYPE_CHECKS); + if (!def) { + char *tmp = NULL; + + if (!keep_invalid) + goto error; + + if (VIR_STRDUP(xmlErr, virGetLastErrorMessage()) < 0) + goto error; + + if (virFileReadAll(configFile, 1024*1024*10, &xmlStr) < 0) + goto error; + + if (!(def = virDomainDefParseMinimal(NULL, xmlStr))) + goto error; + + /* + * Remove the comment with a warning from the top. Don't fail + * if we can't copy it or find it. + */ + tmp = strstr(xmlStr, "-->"); + + if (tmp) + tmp += strlen("-->"); + else + tmp = xmlStr; + + if (virAsprintf(&def->xmlStr, + "<!-- %s\n\n%s\n-->%s", + _("WARNING: The following XML failed to load!" + "\n\n" + "In order for it to be loaded properly, " + "it needs to be fixed.\n" + "The error that was reported while loading " + "is provided below for your convenience:"), + xmlErr, tmp) < 0) { + def->xmlStr = xmlStr; + xmlStr = NULL; + } + + def->parseError = xmlErr; + xmlErr = NULL; + } if ((autostartLink = virDomainConfigFile(autostartDir, name)) == NULL) goto error; @@ -22711,6 +22789,11 @@ virDomainObjListLoadConfig(virDomainObjListPtr doms, dom->autostart = autostart; + if (def->parseError) { + virDomainObjSetState(dom, VIR_DOMAIN_SHUTOFF, + VIR_DOMAIN_SHUTOFF_INVALID_XML); + } + if (notify) (*notify)(dom, oldDef == NULL, opaque); @@ -22720,6 +22803,8 @@ virDomainObjListLoadConfig(virDomainObjListPtr doms, return dom; error: + VIR_FREE(xmlErr); + VIR_FREE(xmlStr); VIR_FREE(configFile); VIR_FREE(autostartLink); virDomainDefFree(def); @@ -22790,6 +22875,7 @@ virDomainObjListLoadAllConfigs(virDomainObjListPtr doms, int liveStatus, virCapsPtr caps, virDomainXMLOptionPtr xmlopt, + bool keep_invalid, virDomainLoadConfigNotify notify, void *opaque) { @@ -22837,6 +22923,7 @@ virDomainObjListLoadAllConfigs(virDomainObjListPtr doms, configDir, autostartDir, entry->d_name, + keep_invalid, notify, opaque); if (dom) { diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 8be390b7811a..fa6449461047 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2195,6 +2195,10 @@ struct _virDomainDef { char *title; char *description; + /* Possible error and string that failed parsing */ + char *xmlStr; + const char *parseError; + virDomainBlkiotune blkio; virDomainMemtune mem; @@ -2897,6 +2901,7 @@ int virDomainObjListLoadAllConfigs(virDomainObjListPtr doms, int liveStatus, virCapsPtr caps, virDomainXMLOptionPtr xmlopt, + bool keep_invalid, virDomainLoadConfigNotify notify, void *opaque); diff --git a/src/libxl/libxl_driver.c b/src/libxl/libxl_driver.c index 8087c2779795..53ea64b9bdd1 100644 --- a/src/libxl/libxl_driver.c +++ b/src/libxl/libxl_driver.c @@ -694,6 +694,7 @@ libxlStateInitialize(bool privileged, 1, cfg->caps, libxl_driver->xmlopt, + false, NULL, NULL) < 0) goto error; @@ -706,6 +707,7 @@ libxlStateInitialize(bool privileged, 0, cfg->caps, libxl_driver->xmlopt, + false, NULL, NULL) < 0) goto error; @@ -746,6 +748,7 @@ libxlStateReload(void) 1, cfg->caps, libxl_driver->xmlopt, + false, NULL, libxl_driver); virDomainObjListForEach(libxl_driver->domains, libxlAutostartDomain, diff --git a/src/lxc/lxc_driver.c b/src/lxc/lxc_driver.c index 71be9c7de8a6..e185d785de8c 100644 --- a/src/lxc/lxc_driver.c +++ b/src/lxc/lxc_driver.c @@ -1659,6 +1659,7 @@ static int lxcStateInitialize(bool privileged, NULL, 1, caps, lxc_driver->xmlopt, + false, NULL, NULL) < 0) goto cleanup; @@ -1670,6 +1671,7 @@ static int lxcStateInitialize(bool privileged, cfg->autostartDir, 0, caps, lxc_driver->xmlopt, + false, NULL, NULL) < 0) goto cleanup; @@ -1734,6 +1736,7 @@ lxcStateReload(void) cfg->autostartDir, 0, caps, lxc_driver->xmlopt, + false, lxcNotifyLoadDomain, lxc_driver); virObjectUnref(caps); virObjectUnref(cfg); diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 8197f0238fb7..e7b47eaa944c 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -902,6 +902,7 @@ qemuStateInitialize(bool privileged, NULL, 1, qemu_driver->caps, qemu_driver->xmlopt, + false, NULL, NULL) < 0) goto error; @@ -924,6 +925,7 @@ qemuStateInitialize(bool privileged, cfg->autostartDir, 0, qemu_driver->caps, qemu_driver->xmlopt, + false, NULL, NULL) < 0) goto error; @@ -1004,6 +1006,7 @@ qemuStateReload(void) cfg->configDir, cfg->autostartDir, 0, caps, qemu_driver->xmlopt, + false, qemuNotifyLoadDomain, qemu_driver); cleanup: virObjectUnref(cfg); diff --git a/src/uml/uml_driver.c b/src/uml/uml_driver.c index c3c5fa7d9ac5..cb652838fe19 100644 --- a/src/uml/uml_driver.c +++ b/src/uml/uml_driver.c @@ -578,6 +578,7 @@ umlStateInitialize(bool privileged, uml_driver->autostartDir, 0, uml_driver->caps, uml_driver->xmlopt, + false, NULL, NULL) < 0) goto error; @@ -646,6 +647,7 @@ umlStateReload(void) uml_driver->autostartDir, 0, uml_driver->caps, uml_driver->xmlopt, + false, umlNotifyLoadDomain, uml_driver); umlDriverUnlock(uml_driver); -- 2.5.3

On 22.09.2015 14:15, Martin Kletzander wrote:
Add new parameter to virDomainObjListLoadConfig() and virDomainObjListLoadAllConfigs() that controls whether domains with invalid XML (which could not be parsed) should be kept in order not to lose track of them. For now, the parameter is set to false in all callers. Each driver can switch it to true when it is prepared to deal with such domains.
For the domain object to be created add virDomainDefParseMinimal() that parses only name and UUID from the XML definition. UUID must be present, it will not be generated. The purpose of this function is to be used when all else fails, but we still want a domain object to work with.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/bhyve/bhyve_driver.c | 2 + src/conf/domain_conf.c | 95 ++++++++++++++++++++++++++++++++++++++++++++++-- src/conf/domain_conf.h | 5 +++ src/libxl/libxl_driver.c | 3 ++ src/lxc/lxc_driver.c | 3 ++ src/qemu/qemu_driver.c | 3 ++ src/uml/uml_driver.c | 2 + 7 files changed, 109 insertions(+), 4 deletions(-)
diff --git a/src/bhyve/bhyve_driver.c b/src/bhyve/bhyve_driver.c index 7f365b1f2446..abf6789c2f9c 100644 --- a/src/bhyve/bhyve_driver.c +++ b/src/bhyve/bhyve_driver.c @@ -1219,6 +1219,7 @@ bhyveStateInitialize(bool privileged, NULL, 1, bhyve_driver->caps, bhyve_driver->xmlopt, + false, NULL, NULL) < 0) goto cleanup;
@@ -1227,6 +1228,7 @@ bhyveStateInitialize(bool privileged, BHYVE_AUTOSTART_DIR, 0, bhyve_driver->caps, bhyve_driver->xmlopt, + false, NULL, NULL) < 0) goto cleanup;
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 230800542f8c..6850e79396bb 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -22676,6 +22676,39 @@ virDomainSaveStatus(virDomainXMLOptionPtr xmlopt, return ret; }
+static virDomainDefPtr +virDomainDefParseMinimal(const char *filename, + const char *xmlStr) +{ + xmlXPathContextPtr ctxt = NULL; + virDomainDefPtr def = NULL; + xmlDocPtr xml = NULL; + + if (!(xml = virXMLParse(filename, xmlStr, _("(domain_definition)")))) + goto cleanup; + + ctxt = xmlXPathNewContext(xml); + if (ctxt == NULL) { + virReportOOMError(); + goto cleanup; + } + + ctxt->node = xmlDocGetRootElement(xml); + + if (!(def = virDomainDefNew())) + goto cleanup; + + def->id = -1; + + if (virDomainDefParseName(def, ctxt) < 0 || + virDomainDefParseUUID(def, ctxt, true, NULL) < 0) + virDomainDefFree(def); + + cleanup: + xmlFreeDoc(xml); + xmlXPathFreeContext(ctxt); + return def; +}
static virDomainObjPtr virDomainObjListLoadConfig(virDomainObjListPtr doms, @@ -22684,6 +22717,7 @@ virDomainObjListLoadConfig(virDomainObjListPtr doms, const char *configDir, const char *autostartDir, const char *name, + bool keep_invalid, virDomainLoadConfigNotify notify, void *opaque) { @@ -22692,13 +22726,57 @@ virDomainObjListLoadConfig(virDomainObjListPtr doms, virDomainObjPtr dom; int autostart; virDomainDefPtr oldDef = NULL; + char *xmlStr = NULL; + char *xmlErr = NULL;
if ((configFile = virDomainConfigFile(configDir, name)) == NULL) goto error; - if (!(def = virDomainDefParseFile(configFile, caps, xmlopt, - VIR_DOMAIN_DEF_PARSE_INACTIVE | - VIR_DOMAIN_DEF_PARSE_SKIP_OSTYPE_CHECKS))) - goto error; + + def = virDomainDefParseFile(configFile, caps, xmlopt, + VIR_DOMAIN_DEF_PARSE_INACTIVE | + VIR_DOMAIN_DEF_PARSE_SKIP_OSTYPE_CHECKS); + if (!def) { + char *tmp = NULL; + + if (!keep_invalid) + goto error; + + if (VIR_STRDUP(xmlErr, virGetLastErrorMessage()) < 0) + goto error; + + if (virFileReadAll(configFile, 1024*1024*10, &xmlStr) < 0) + goto error; + + if (!(def = virDomainDefParseMinimal(NULL, xmlStr))) + goto error; + + /* + * Remove the comment with a warning from the top. Don't fail + * if we can't copy it or find it. + */ + tmp = strstr(xmlStr, "-->"); + + if (tmp) + tmp += strlen("-->"); + else + tmp = xmlStr; + + if (virAsprintf(&def->xmlStr, + "<!-- %s\n\n%s\n-->%s", + _("WARNING: The following XML failed to load!" + "\n\n" + "In order for it to be loaded properly, " + "it needs to be fixed.\n" + "The error that was reported while loading " + "is provided below for your convenience:"), + xmlErr, tmp) < 0) { + def->xmlStr = xmlStr; + xmlStr = NULL; + } + + def->parseError = xmlErr; + xmlErr = NULL; + }
if ((autostartLink = virDomainConfigFile(autostartDir, name)) == NULL) goto error; @@ -22711,6 +22789,11 @@ virDomainObjListLoadConfig(virDomainObjListPtr doms,
dom->autostart = autostart;
+ if (def->parseError) { + virDomainObjSetState(dom, VIR_DOMAIN_SHUTOFF, + VIR_DOMAIN_SHUTOFF_INVALID_XML); + } + if (notify) (*notify)(dom, oldDef == NULL, opaque);
@@ -22720,6 +22803,8 @@ virDomainObjListLoadConfig(virDomainObjListPtr doms, return dom;
error: + VIR_FREE(xmlErr); + VIR_FREE(xmlStr); VIR_FREE(configFile); VIR_FREE(autostartLink); virDomainDefFree(def); @@ -22790,6 +22875,7 @@ virDomainObjListLoadAllConfigs(virDomainObjListPtr doms, int liveStatus, virCapsPtr caps, virDomainXMLOptionPtr xmlopt, + bool keep_invalid, virDomainLoadConfigNotify notify, void *opaque) { @@ -22837,6 +22923,7 @@ virDomainObjListLoadAllConfigs(virDomainObjListPtr doms, configDir, autostartDir, entry->d_name, + keep_invalid, notify, opaque); if (dom) { diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index 8be390b7811a..fa6449461047 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -2195,6 +2195,10 @@ struct _virDomainDef { char *title; char *description;
+ /* Possible error and string that failed parsing */ + char *xmlStr; + const char *parseError;
Why is this ^^ const if we are allocating the string ourselves? Moreover, I think these would be leaked, so I'd suggest to free them in virDomainDefFree().
+ virDomainBlkiotune blkio; virDomainMemtune mem;
Michal

Change qemuDomObjFromDomain() to qemuDomObjFromDomainInternal() with additional parameter that controls how to deal with invalid domains. New qemuDomObjFromDomain() then follows the safe path wo we don't have to change its callers from all APIs and qemuDomObjFromDomainInvalid() is added as a new function that can be called from APIs that are prepared to handle invalid definitions as well. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_driver.c | 28 +++++++++++++++++++++++++++- 1 file changed, 27 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index e7b47eaa944c..0a671500134f 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -209,15 +209,22 @@ struct qemuAutostartData { /** * qemuDomObjFromDomain: * @domain: Domain pointer that has to be looked up + * @invalid: Whether to also return invalid definitions * * This function looks up @domain and returns the appropriate virDomainObjPtr * that has to be released by calling virDomainObjEndAPI(). * + * If @invalid is true, this function can return domain definition with only + * name and uuid set, so beware as that can lead to NULL pointer dereference. + * This function should be called with @invalid == true only from APIs that are + * needed to fix the domain definition (e.g. those needed for looking it up or + * providing a new definition. + * * Returns the domain object with incremented reference counter which is locked * on success, NULL otherwise. */ static virDomainObjPtr -qemuDomObjFromDomain(virDomainPtr domain) +qemuDomObjFromDomainInternal(virDomainPtr domain, bool invalid) { virDomainObjPtr vm; virQEMUDriverPtr driver = domain->conn->privateData; @@ -231,10 +238,29 @@ qemuDomObjFromDomain(virDomainPtr domain) uuidstr, domain->name); return NULL; } + if (!invalid && vm->def->parseError) { + virReportError(VIR_ERR_XML_ERROR, + _("domain '%s' was not loaded due to an XML error " + "(%s), please redefine it"), + vm->def->name, vm->def->parseError); + virDomainObjEndAPI(&vm); + } return vm; } +static inline virDomainObjPtr +qemuDomObjFromDomain(virDomainPtr domain) +{ + return qemuDomObjFromDomainInternal(domain, false); +} + +static inline virDomainObjPtr +qemuDomObjFromDomainInvalid(virDomainPtr domain) +{ + return qemuDomObjFromDomainInternal(domain, true); +} + /* Looks up the domain object from snapshot and unlocks the * driver. The returned domain object is locked and ref'd and the * caller must call virDomainObjEndAPI() on it. */ -- 2.5.3

Hello. Is there a way to manually manage (acquire/release/info) locks in virtlockd for running qemu via libvirt? I haven't found any virsh command for that. Possibly via virtlockd's unix socket? What to send through socat? I'm trying to workarund bug (live external disk-only snapshot deadlocks with virtlockd): https://bugzilla.redhat.com/show_bug.cgi?id=1191901 Best regards -- Piotr Rybicki, Prezes Zarządu InnerVision Sp. z o.o. http://www.innervision.pl

On Tue, Sep 22, 2015 at 03:26:14PM +0200, Piotr Rybicki wrote:
Hello.
Is there a way to manually manage (acquire/release/info) locks in virtlockd for running qemu via libvirt?
I haven't found any virsh command for that.
Possibly via virtlockd's unix socket? What to send through socat?
No, there's no supportable way to talk to virtlockd directly. It is considered an internal only service at the current time. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

W dniu 2015-09-22 o 15:34, Daniel P. Berrange pisze:
Possibly via virtlockd's unix socket? What to send through socat?
No, there's no supportable way to talk to virtlockd directly. It is considered an internal only service at the current time.
I see. Thanks for information. I'll try libvirtd hooks then, to implement own locking mechanism. Best regards Piotr Rybicki

In order for the user to be able to fix broken domains function qemuDomainGetXMLDesc() needs to be able to lookup invalid domain definitions and handle them properly. When redefined, function qemuDomainDefineXMLFlags() must clear the 'invalid XML' reason. As a nice addition, qemuDomainGetState() can lookup such domains without any other change and that allows virsh not only to get their status, but also to list them. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_driver.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 0a671500134f..881c5c4c1984 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2706,7 +2706,7 @@ qemuDomainGetState(virDomainPtr dom, virCheckFlags(0, -1); - if (!(vm = qemuDomObjFromDomain(dom))) + if (!(vm = qemuDomObjFromDomainInvalid(dom))) goto cleanup; if (virDomainGetStateEnsureACL(dom->conn, vm->def) < 0) @@ -7082,19 +7082,23 @@ qemuDomainGetXMLDesc(virDomainPtr dom, /* Flags checked by virDomainDefFormat */ - if (!(vm = qemuDomObjFromDomain(dom))) + if (!(vm = qemuDomObjFromDomainInvalid(dom))) goto cleanup; if (virDomainGetXMLDescEnsureACL(dom->conn, vm->def, flags) < 0) goto cleanup; - if (qemuDomainUpdateCurrentMemorySize(driver, vm) < 0) - goto cleanup; + if (vm->def->parseError) { + ignore_value(VIR_STRDUP(ret, vm->def->xmlStr)); + } else { + if (qemuDomainUpdateCurrentMemorySize(driver, vm) < 0) + goto cleanup; - if ((flags & VIR_DOMAIN_XML_MIGRATABLE)) - flags |= QEMU_DOMAIN_FORMAT_LIVE_FLAGS; + if ((flags & VIR_DOMAIN_XML_MIGRATABLE)) + flags |= QEMU_DOMAIN_FORMAT_LIVE_FLAGS; - ret = qemuDomainFormatXML(driver, vm, flags); + ret = qemuDomainFormatXML(driver, vm, flags); + } cleanup: virDomainObjEndAPI(&vm); @@ -7557,6 +7561,9 @@ qemuDomainDefineXMLFlags(virConnectPtr conn, goto cleanup; } + if (oldDef && oldDef->parseError) + virDomainObjSetState(vm, virDomainObjGetState(vm, NULL), 0); + event = virDomainEventLifecycleNewFromObj(vm, VIR_DOMAIN_EVENT_DEFINED, !oldDef ? -- 2.5.3

On Tue, Sep 22, 2015 at 02:15:53PM +0200, Martin Kletzander wrote:
In order for the user to be able to fix broken domains function qemuDomainGetXMLDesc() needs to be able to lookup invalid domain definitions and handle them properly. When redefined, function qemuDomainDefineXMLFlags() must clear the 'invalid XML' reason. As a nice addition, qemuDomainGetState() can lookup such domains without any other change and that allows virsh not only to get their status, but also to list them.
Hmm, that's an interesting approach to the problem. I wonder though if we could do things slightly differently such that we don't need to change so many APIs. eg, just have a 'bool error' field in virDomainDefPtr. When loading the XML fails, populate a virDomainObjPtr/DefPtr as normal, but set the error field. Now we merely need to change the qemuDomainStart method, so it refuses to launch a VM with the 'error' flag set. All the other APIs could be essentially unchanged. Sure it would not be useful to allow things like virDomainAttachDevice, etc on such broken domains, but for sake of simplicity we can avoid touching all the methods except start. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Tue, Sep 22, 2015 at 01:48:01PM +0100, Daniel P. Berrange wrote:
On Tue, Sep 22, 2015 at 02:15:53PM +0200, Martin Kletzander wrote:
In order for the user to be able to fix broken domains function qemuDomainGetXMLDesc() needs to be able to lookup invalid domain definitions and handle them properly. When redefined, function qemuDomainDefineXMLFlags() must clear the 'invalid XML' reason. As a nice addition, qemuDomainGetState() can lookup such domains without any other change and that allows virsh not only to get their status, but also to list them.
Hmm, that's an interesting approach to the problem. I wonder though if we could do things slightly differently such that we don't need to change so many APIs.
eg, just have a 'bool error' field in virDomainDefPtr. When loading the XML fails, populate a virDomainObjPtr/DefPtr as normal, but set the error field. Now we merely need to change the qemuDomainStart method, so it refuses to launch a VM with the 'error' flag set. All the other APIs could be essentially unchanged. Sure it would not be useful to allow things like virDomainAttachDevice, etc on such broken domains, but for sake of simplicity we can avoid touching all the methods except start.
To be honest, I'm a afraid that we might forget some API that needs to be blocked as well. And we would have to go through all the APIs just to see whether it accesses something that might be missing. Moreover, how would you decide what to skip at each error during parsing? If, for example, the <numatune> has some faulty attribute in one subelement should we skip all the elements or just the one or just skip that one particular attribute? We would also not format the faulty attribute to the XML being dumped, so the user wouldn't see what's missing, which is even worse when there are two problems and they fix only one, so we skip the other one. I thought about your approach as well, but that would require complete rewrite of the whole parsing code (maybe changing it to generated one based on RNG schema and some minor additional info) and it would touch way more APIs. The approach I'm suggesting here keeps almost all APIs untouched and it doesn't have an effect on anything, unless explicitly specified. Check last two patches to see how much qemu driver needs to be changed, for others it will be similar, probably a little bit less.. The patches are 17+/9- when combined. I don't think you can do much better than that. And that doesn't show the improvements that can be made in the starting and parsing code after this series is applied.
Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Tue, Sep 22, 2015 at 03:12:14PM +0200, Martin Kletzander wrote:
On Tue, Sep 22, 2015 at 01:48:01PM +0100, Daniel P. Berrange wrote:
On Tue, Sep 22, 2015 at 02:15:53PM +0200, Martin Kletzander wrote:
In order for the user to be able to fix broken domains function qemuDomainGetXMLDesc() needs to be able to lookup invalid domain definitions and handle them properly. When redefined, function qemuDomainDefineXMLFlags() must clear the 'invalid XML' reason. As a nice addition, qemuDomainGetState() can lookup such domains without any other change and that allows virsh not only to get their status, but also to list them.
Hmm, that's an interesting approach to the problem. I wonder though if we could do things slightly differently such that we don't need to change so many APIs.
eg, just have a 'bool error' field in virDomainDefPtr. When loading the XML fails, populate a virDomainObjPtr/DefPtr as normal, but set the error field. Now we merely need to change the qemuDomainStart method, so it refuses to launch a VM with the 'error' flag set. All the other APIs could be essentially unchanged. Sure it would not be useful to allow things like virDomainAttachDevice, etc on such broken domains, but for sake of simplicity we can avoid touching all the methods except start.
To be honest, I'm a afraid that we might forget some API that needs to be blocked as well. And we would have to go through all the APIs just to see whether it accesses something that might be missing. Moreover, how would you decide what to skip at each error during parsing? If, for example, the <numatune> has some faulty attribute in one subelement should we skip all the elements or just the one or just skip that one particular attribute? We would also not format the faulty attribute to the XML being dumped, so the user wouldn't see what's missing, which is even worse when there are two problems and they fix only one, so we skip the other one.
Oh, I wasn't suggesting changing the parser. I just mean that we would create a virDomainDefPtr instance which only contains the name and UUID, and ID field (if the guest is running from domain status XML). We'd leave all the rest of the config blank - our code ought to be able to deal with that, since essentially all config is optional aside from the name/uuid anyway.
I thought about your approach as well, but that would require complete rewrite of the whole parsing code (maybe changing it to generated one based on RNG schema and some minor additional info) and it would touch way more APIs. The approach I'm suggesting here keeps almost all APIs untouched and it doesn't have an effect on anything, unless explicitly specified.
Check last two patches to see how much qemu driver needs to be changed, for others it will be similar, probably a little bit less.. The patches are 17+/9- when combined. I don't think you can do much better than that. And that doesn't show the improvements that can be made in the starting and parsing code after this series is applied.
Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Tue, Sep 22, 2015 at 02:17:14PM +0100, Daniel P. Berrange wrote:
On Tue, Sep 22, 2015 at 03:12:14PM +0200, Martin Kletzander wrote:
On Tue, Sep 22, 2015 at 01:48:01PM +0100, Daniel P. Berrange wrote:
On Tue, Sep 22, 2015 at 02:15:53PM +0200, Martin Kletzander wrote:
In order for the user to be able to fix broken domains function qemuDomainGetXMLDesc() needs to be able to lookup invalid domain definitions and handle them properly. When redefined, function qemuDomainDefineXMLFlags() must clear the 'invalid XML' reason. As a nice addition, qemuDomainGetState() can lookup such domains without any other change and that allows virsh not only to get their status, but also to list them.
Hmm, that's an interesting approach to the problem. I wonder though if we could do things slightly differently such that we don't need to change so many APIs.
eg, just have a 'bool error' field in virDomainDefPtr. When loading the XML fails, populate a virDomainObjPtr/DefPtr as normal, but set the error field. Now we merely need to change the qemuDomainStart method, so it refuses to launch a VM with the 'error' flag set. All the other APIs could be essentially unchanged. Sure it would not be useful to allow things like virDomainAttachDevice, etc on such broken domains, but for sake of simplicity we can avoid touching all the methods except start.
To be honest, I'm a afraid that we might forget some API that needs to be blocked as well. And we would have to go through all the APIs just to see whether it accesses something that might be missing. Moreover, how would you decide what to skip at each error during parsing? If, for example, the <numatune> has some faulty attribute in one subelement should we skip all the elements or just the one or just skip that one particular attribute? We would also not format the faulty attribute to the XML being dumped, so the user wouldn't see what's missing, which is even worse when there are two problems and they fix only one, so we skip the other one.
Oh, I wasn't suggesting changing the parser. I just mean that we would create a virDomainDefPtr instance which only contains the name and UUID, and ID field (if the guest is running from domain status XML).
We'd leave all the rest of the config blank - our code ought to be able to deal with that, since essentially all config is optional aside from the name/uuid anyway.
Then probably I misunderstood because that's exactly what this series is doing.
I thought about your approach as well, but that would require complete rewrite of the whole parsing code (maybe changing it to generated one based on RNG schema and some minor additional info) and it would touch way more APIs. The approach I'm suggesting here keeps almost all APIs untouched and it doesn't have an effect on anything, unless explicitly specified.
Check last two patches to see how much qemu driver needs to be changed, for others it will be similar, probably a little bit less.. The patches are 17+/9- when combined. I don't think you can do much better than that. And that doesn't show the improvements that can be made in the starting and parsing code after this series is applied.
Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Tue, Sep 22, 2015 at 03:26:26PM +0200, Martin Kletzander wrote:
On Tue, Sep 22, 2015 at 02:17:14PM +0100, Daniel P. Berrange wrote:
On Tue, Sep 22, 2015 at 03:12:14PM +0200, Martin Kletzander wrote:
On Tue, Sep 22, 2015 at 01:48:01PM +0100, Daniel P. Berrange wrote:
On Tue, Sep 22, 2015 at 02:15:53PM +0200, Martin Kletzander wrote:
In order for the user to be able to fix broken domains function qemuDomainGetXMLDesc() needs to be able to lookup invalid domain definitions and handle them properly. When redefined, function qemuDomainDefineXMLFlags() must clear the 'invalid XML' reason. As a nice addition, qemuDomainGetState() can lookup such domains without any other change and that allows virsh not only to get their status, but also to list them.
Hmm, that's an interesting approach to the problem. I wonder though if we could do things slightly differently such that we don't need to change so many APIs.
eg, just have a 'bool error' field in virDomainDefPtr. When loading the XML fails, populate a virDomainObjPtr/DefPtr as normal, but set the error field. Now we merely need to change the qemuDomainStart method, so it refuses to launch a VM with the 'error' flag set. All the other APIs could be essentially unchanged. Sure it would not be useful to allow things like virDomainAttachDevice, etc on such broken domains, but for sake of simplicity we can avoid touching all the methods except start.
To be honest, I'm a afraid that we might forget some API that needs to be blocked as well. And we would have to go through all the APIs just to see whether it accesses something that might be missing. Moreover, how would you decide what to skip at each error during parsing? If, for example, the <numatune> has some faulty attribute in one subelement should we skip all the elements or just the one or just skip that one particular attribute? We would also not format the faulty attribute to the XML being dumped, so the user wouldn't see what's missing, which is even worse when there are two problems and they fix only one, so we skip the other one.
Oh, I wasn't suggesting changing the parser. I just mean that we would create a virDomainDefPtr instance which only contains the name and UUID, and ID field (if the guest is running from domain status XML).
We'd leave all the rest of the config blank - our code ought to be able to deal with that, since essentially all config is optional aside from the name/uuid anyway.
Then probably I misunderstood because that's exactly what this series is doing.
Right, but I mean without converting all the drivers to use this new qemuDomObjFromDomainInvalid() method. I'm suggesting just having a check in the start method only. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Tue, Sep 22, 2015 at 02:36:54PM +0100, Daniel P. Berrange wrote:
On Tue, Sep 22, 2015 at 03:26:26PM +0200, Martin Kletzander wrote:
On Tue, Sep 22, 2015 at 02:17:14PM +0100, Daniel P. Berrange wrote:
On Tue, Sep 22, 2015 at 03:12:14PM +0200, Martin Kletzander wrote:
On Tue, Sep 22, 2015 at 01:48:01PM +0100, Daniel P. Berrange wrote:
On Tue, Sep 22, 2015 at 02:15:53PM +0200, Martin Kletzander wrote:
In order for the user to be able to fix broken domains function qemuDomainGetXMLDesc() needs to be able to lookup invalid domain definitions and handle them properly. When redefined, function qemuDomainDefineXMLFlags() must clear the 'invalid XML' reason. As a nice addition, qemuDomainGetState() can lookup such domains without any other change and that allows virsh not only to get their status, but also to list them.
Hmm, that's an interesting approach to the problem. I wonder though if we could do things slightly differently such that we don't need to change so many APIs.
eg, just have a 'bool error' field in virDomainDefPtr. When loading the XML fails, populate a virDomainObjPtr/DefPtr as normal, but set the error field. Now we merely need to change the qemuDomainStart method, so it refuses to launch a VM with the 'error' flag set. All the other APIs could be essentially unchanged. Sure it would not be useful to allow things like virDomainAttachDevice, etc on such broken domains, but for sake of simplicity we can avoid touching all the methods except start.
To be honest, I'm a afraid that we might forget some API that needs to be blocked as well. And we would have to go through all the APIs just to see whether it accesses something that might be missing. Moreover, how would you decide what to skip at each error during parsing? If, for example, the <numatune> has some faulty attribute in one subelement should we skip all the elements or just the one or just skip that one particular attribute? We would also not format the faulty attribute to the XML being dumped, so the user wouldn't see what's missing, which is even worse when there are two problems and they fix only one, so we skip the other one.
Oh, I wasn't suggesting changing the parser. I just mean that we would create a virDomainDefPtr instance which only contains the name and UUID, and ID field (if the guest is running from domain status XML).
We'd leave all the rest of the config blank - our code ought to be able to deal with that, since essentially all config is optional aside from the name/uuid anyway.
Then probably I misunderstood because that's exactly what this series is doing.
Right, but I mean without converting all the drivers to use this new qemuDomObjFromDomainInvalid() method. I'm suggesting just having a check in the start method only.
I don't think virDomainDef with only name, id and uuid set would not cause problems, but to be honest I haven't tried that. I just remember how sometimes we rely on the fact that whatever we have in virDomainDef went through parsing code. I believe that if we remove the safeguard and only add a check to *CreateXML*() functions, we'll find some flaws. Mainly information gathering functions that work with enums that have no default values (e.g. virDomainVirtType). Well, even without those, we will report false information. No pinning, no tuning, everything would return success with all information lost. So we'd add few checks here and there. Then we'd realize that (offline) migration will fail weirdly since qemuMigrationIsAllowed() will return true and the first thing we will do after that is formatting the XML. We'd probably define way different domain on the destination if that worked, I'm pretty sure parsing would fail there. Anyway, I think we'll end up adding more explicit checks to the code in your case. Way more than three calls to a differently named function per driver. Also, this can lead to a bigger cleanup, if we do some wrapper around our APIs, we might move ACL checks, domain object gathering, job creation and common API cleanups (*EndAPI()) and all other stuff in it, so it won't be visible anyway. But that's a future that won't happen until I have way to much free time.
Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 22.09.2015 15:17, Daniel P. Berrange wrote:
On Tue, Sep 22, 2015 at 03:12:14PM +0200, Martin Kletzander wrote:
On Tue, Sep 22, 2015 at 01:48:01PM +0100, Daniel P. Berrange wrote:
On Tue, Sep 22, 2015 at 02:15:53PM +0200, Martin Kletzander wrote:
In order for the user to be able to fix broken domains function qemuDomainGetXMLDesc() needs to be able to lookup invalid domain definitions and handle them properly. When redefined, function qemuDomainDefineXMLFlags() must clear the 'invalid XML' reason. As a nice addition, qemuDomainGetState() can lookup such domains without any other change and that allows virsh not only to get their status, but also to list them.
Hmm, that's an interesting approach to the problem. I wonder though if we could do things slightly differently such that we don't need to change so many APIs.
eg, just have a 'bool error' field in virDomainDefPtr. When loading the XML fails, populate a virDomainObjPtr/DefPtr as normal, but set the error field. Now we merely need to change the qemuDomainStart method, so it refuses to launch a VM with the 'error' flag set. All the other APIs could be essentially unchanged. Sure it would not be useful to allow things like virDomainAttachDevice, etc on such broken domains, but for sake of simplicity we can avoid touching all the methods except start.
To be honest, I'm a afraid that we might forget some API that needs to be blocked as well. And we would have to go through all the APIs just to see whether it accesses something that might be missing. Moreover, how would you decide what to skip at each error during parsing? If, for example, the <numatune> has some faulty attribute in one subelement should we skip all the elements or just the one or just skip that one particular attribute? We would also not format the faulty attribute to the XML being dumped, so the user wouldn't see what's missing, which is even worse when there are two problems and they fix only one, so we skip the other one.
Oh, I wasn't suggesting changing the parser. I just mean that we would create a virDomainDefPtr instance which only contains the name and UUID, and ID field (if the guest is running from domain status XML).
We'd leave all the rest of the config blank - our code ought to be able to deal with that, since essentially all config is optional aside from the name/uuid anyway.
So if I understand it correctly, if there's an error during XML parsing, we would: virDomainDefFree(def); /* becasue @def is parsed just partially */ def = virDomainDefNewFull(name, uuid, -1); def->error = true; /* or something */ But where would be the original (invalid) XML be stored? We don't want to lost it. I rather prefer to giving it back to the user (in dumpXML api) and thus giving him chance to fix the problem that caused error in the first place. Otherwise this seems pointless to me. And I don't expect that many APIs need fixing. It's just a few of them that should work with broken XML: dumpXML and probably GetState (so that we can report domain is shut off due to invalid XML). Michal

On Tue, Oct 06, 2015 at 03:35:36PM +0200, Michal Privoznik wrote:
On 22.09.2015 15:17, Daniel P. Berrange wrote:
On Tue, Sep 22, 2015 at 03:12:14PM +0200, Martin Kletzander wrote:
On Tue, Sep 22, 2015 at 01:48:01PM +0100, Daniel P. Berrange wrote:
On Tue, Sep 22, 2015 at 02:15:53PM +0200, Martin Kletzander wrote:
In order for the user to be able to fix broken domains function qemuDomainGetXMLDesc() needs to be able to lookup invalid domain definitions and handle them properly. When redefined, function qemuDomainDefineXMLFlags() must clear the 'invalid XML' reason. As a nice addition, qemuDomainGetState() can lookup such domains without any other change and that allows virsh not only to get their status, but also to list them.
Hmm, that's an interesting approach to the problem. I wonder though if we could do things slightly differently such that we don't need to change so many APIs.
eg, just have a 'bool error' field in virDomainDefPtr. When loading the XML fails, populate a virDomainObjPtr/DefPtr as normal, but set the error field. Now we merely need to change the qemuDomainStart method, so it refuses to launch a VM with the 'error' flag set. All the other APIs could be essentially unchanged. Sure it would not be useful to allow things like virDomainAttachDevice, etc on such broken domains, but for sake of simplicity we can avoid touching all the methods except start.
To be honest, I'm a afraid that we might forget some API that needs to be blocked as well. And we would have to go through all the APIs just to see whether it accesses something that might be missing. Moreover, how would you decide what to skip at each error during parsing? If, for example, the <numatune> has some faulty attribute in one subelement should we skip all the elements or just the one or just skip that one particular attribute? We would also not format the faulty attribute to the XML being dumped, so the user wouldn't see what's missing, which is even worse when there are two problems and they fix only one, so we skip the other one.
Oh, I wasn't suggesting changing the parser. I just mean that we would create a virDomainDefPtr instance which only contains the name and UUID, and ID field (if the guest is running from domain status XML).
We'd leave all the rest of the config blank - our code ought to be able to deal with that, since essentially all config is optional aside from the name/uuid anyway.
So if I understand it correctly, if there's an error during XML parsing, we would:
virDomainDefFree(def); /* becasue @def is parsed just partially */ def = virDomainDefNewFull(name, uuid, -1); def->error = true; /* or something */
But where would be the original (invalid) XML be stored? We don't want to lost it. I rather prefer to giving it back to the user (in dumpXML api) and thus giving him chance to fix the problem that caused error in the first place. Otherwise this seems pointless to me.
Hmm, good point.
And I don't expect that many APIs need fixing. It's just a few of them that should work with broken XML: dumpXML and probably GetState (so that we can report domain is shut off due to invalid XML).
I don't feel particularly strongly about the approach, so if you want to go the route you originally described its fine. BTW, I guess we want this for all object types, not just domains. ie storage, network, etc. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 22.09.2015 14:15, Martin Kletzander wrote:
In order for the user to be able to fix broken domains function qemuDomainGetXMLDesc() needs to be able to lookup invalid domain definitions and handle them properly. When redefined, function qemuDomainDefineXMLFlags() must clear the 'invalid XML' reason. As a nice addition, qemuDomainGetState() can lookup such domains without any other change and that allows virsh not only to get their status, but also to list them.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_driver.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 0a671500134f..881c5c4c1984 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2706,7 +2706,7 @@ qemuDomainGetState(virDomainPtr dom,
virCheckFlags(0, -1);
- if (!(vm = qemuDomObjFromDomain(dom))) + if (!(vm = qemuDomObjFromDomainInvalid(dom))) goto cleanup;
if (virDomainGetStateEnsureACL(dom->conn, vm->def) < 0) @@ -7082,19 +7082,23 @@ qemuDomainGetXMLDesc(virDomainPtr dom,
/* Flags checked by virDomainDefFormat */
- if (!(vm = qemuDomObjFromDomain(dom))) + if (!(vm = qemuDomObjFromDomainInvalid(dom))) goto cleanup;
if (virDomainGetXMLDescEnsureACL(dom->conn, vm->def, flags) < 0) goto cleanup;
- if (qemuDomainUpdateCurrentMemorySize(driver, vm) < 0) - goto cleanup; + if (vm->def->parseError) { + ignore_value(VIR_STRDUP(ret, vm->def->xmlStr)); + } else { + if (qemuDomainUpdateCurrentMemorySize(driver, vm) < 0) + goto cleanup;
- if ((flags & VIR_DOMAIN_XML_MIGRATABLE)) - flags |= QEMU_DOMAIN_FORMAT_LIVE_FLAGS; + if ((flags & VIR_DOMAIN_XML_MIGRATABLE)) + flags |= QEMU_DOMAIN_FORMAT_LIVE_FLAGS;
- ret = qemuDomainFormatXML(driver, vm, flags); + ret = qemuDomainFormatXML(driver, vm, flags); + }
cleanup: virDomainObjEndAPI(&vm); @@ -7557,6 +7561,9 @@ qemuDomainDefineXMLFlags(virConnectPtr conn, goto cleanup; }
+ if (oldDef && oldDef->parseError) + virDomainObjSetState(vm, virDomainObjGetState(vm, NULL), 0); +
I don't understand this bit. Why is this needed?
event = virDomainEventLifecycleNewFromObj(vm, VIR_DOMAIN_EVENT_DEFINED, !oldDef ?
Otherwise looking good. Michal

On Tue, Oct 13, 2015 at 12:10:05PM +0200, Michal Privoznik wrote:
On 22.09.2015 14:15, Martin Kletzander wrote:
In order for the user to be able to fix broken domains function qemuDomainGetXMLDesc() needs to be able to lookup invalid domain definitions and handle them properly. When redefined, function qemuDomainDefineXMLFlags() must clear the 'invalid XML' reason. As a nice addition, qemuDomainGetState() can lookup such domains without any other change and that allows virsh not only to get their status, but also to list them.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_driver.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 0a671500134f..881c5c4c1984 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2706,7 +2706,7 @@ qemuDomainGetState(virDomainPtr dom,
virCheckFlags(0, -1);
- if (!(vm = qemuDomObjFromDomain(dom))) + if (!(vm = qemuDomObjFromDomainInvalid(dom))) goto cleanup;
if (virDomainGetStateEnsureACL(dom->conn, vm->def) < 0) @@ -7082,19 +7082,23 @@ qemuDomainGetXMLDesc(virDomainPtr dom,
/* Flags checked by virDomainDefFormat */
- if (!(vm = qemuDomObjFromDomain(dom))) + if (!(vm = qemuDomObjFromDomainInvalid(dom))) goto cleanup;
if (virDomainGetXMLDescEnsureACL(dom->conn, vm->def, flags) < 0) goto cleanup;
- if (qemuDomainUpdateCurrentMemorySize(driver, vm) < 0) - goto cleanup; + if (vm->def->parseError) { + ignore_value(VIR_STRDUP(ret, vm->def->xmlStr)); + } else { + if (qemuDomainUpdateCurrentMemorySize(driver, vm) < 0) + goto cleanup;
- if ((flags & VIR_DOMAIN_XML_MIGRATABLE)) - flags |= QEMU_DOMAIN_FORMAT_LIVE_FLAGS; + if ((flags & VIR_DOMAIN_XML_MIGRATABLE)) + flags |= QEMU_DOMAIN_FORMAT_LIVE_FLAGS;
- ret = qemuDomainFormatXML(driver, vm, flags); + ret = qemuDomainFormatXML(driver, vm, flags); + }
cleanup: virDomainObjEndAPI(&vm); @@ -7557,6 +7561,9 @@ qemuDomainDefineXMLFlags(virConnectPtr conn, goto cleanup; }
+ if (oldDef && oldDef->parseError) + virDomainObjSetState(vm, virDomainObjGetState(vm, NULL), 0); +
I don't understand this bit. Why is this needed?
This resets the domain state reason to "unknown" (while keeping the state) which is the same one as the properly-loaded domains have. Otherwise it would keep the INVALID_XML reason -- virshdomstate --reason would output shutoff (invalid xml).
event = virDomainEventLifecycleNewFromObj(vm, VIR_DOMAIN_EVENT_DEFINED, !oldDef ?
Otherwise looking good.
Michal

On 13.10.2015 15:08, Martin Kletzander wrote:
On Tue, Oct 13, 2015 at 12:10:05PM +0200, Michal Privoznik wrote:
On 22.09.2015 14:15, Martin Kletzander wrote:
In order for the user to be able to fix broken domains function qemuDomainGetXMLDesc() needs to be able to lookup invalid domain definitions and handle them properly. When redefined, function qemuDomainDefineXMLFlags() must clear the 'invalid XML' reason. As a nice addition, qemuDomainGetState() can lookup such domains without any other change and that allows virsh not only to get their status, but also to list them.
Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_driver.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-)
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 0a671500134f..881c5c4c1984 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2706,7 +2706,7 @@ qemuDomainGetState(virDomainPtr dom,
virCheckFlags(0, -1);
- if (!(vm = qemuDomObjFromDomain(dom))) + if (!(vm = qemuDomObjFromDomainInvalid(dom))) goto cleanup;
if (virDomainGetStateEnsureACL(dom->conn, vm->def) < 0) @@ -7082,19 +7082,23 @@ qemuDomainGetXMLDesc(virDomainPtr dom,
/* Flags checked by virDomainDefFormat */
- if (!(vm = qemuDomObjFromDomain(dom))) + if (!(vm = qemuDomObjFromDomainInvalid(dom))) goto cleanup;
if (virDomainGetXMLDescEnsureACL(dom->conn, vm->def, flags) < 0) goto cleanup;
- if (qemuDomainUpdateCurrentMemorySize(driver, vm) < 0) - goto cleanup; + if (vm->def->parseError) { + ignore_value(VIR_STRDUP(ret, vm->def->xmlStr)); + } else { + if (qemuDomainUpdateCurrentMemorySize(driver, vm) < 0) + goto cleanup;
- if ((flags & VIR_DOMAIN_XML_MIGRATABLE)) - flags |= QEMU_DOMAIN_FORMAT_LIVE_FLAGS; + if ((flags & VIR_DOMAIN_XML_MIGRATABLE)) + flags |= QEMU_DOMAIN_FORMAT_LIVE_FLAGS;
- ret = qemuDomainFormatXML(driver, vm, flags); + ret = qemuDomainFormatXML(driver, vm, flags); + }
cleanup: virDomainObjEndAPI(&vm); @@ -7557,6 +7561,9 @@ qemuDomainDefineXMLFlags(virConnectPtr conn, goto cleanup; }
+ if (oldDef && oldDef->parseError) + virDomainObjSetState(vm, virDomainObjGetState(vm, NULL), 0); +
I don't understand this bit. Why is this needed?
This resets the domain state reason to "unknown" (while keeping the state) which is the same one as the properly-loaded domains have. Otherwise it would keep the INVALID_XML reason -- virshdomstate --reason would output shutoff (invalid xml).
Ah, okay. ACK then. Michal

If we load such domains, we don't need to handle invalid XML checking in qemuProcessStart, but we can move it to qemuDomainDefPostParse() or even into the parsing functions if all goes well. So the only thing we'll need to worry about after this is XML parsing code that would error out for running domains or non-qemu drivers. The latter will be fixed in following patches, hence making the active XML parsing the only problematic part. Signed-off-by: Martin Kletzander <mkletzan@redhat.com> --- src/qemu/qemu_driver.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 881c5c4c1984..c4df44c9f71f 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -312,6 +312,7 @@ qemuAutostartDomain(virDomainObjPtr vm, virObjectRef(vm); virResetLastError(); if (vm->autostart && + !vm->def->parseError && !virDomainObjIsActive(vm)) { if (qemuDomainObjBeginJob(data->driver, vm, QEMU_JOB_MODIFY) < 0) { @@ -951,7 +952,7 @@ qemuStateInitialize(bool privileged, cfg->autostartDir, 0, qemu_driver->caps, qemu_driver->xmlopt, - false, + true, NULL, NULL) < 0) goto error; @@ -1032,7 +1033,7 @@ qemuStateReload(void) cfg->configDir, cfg->autostartDir, 0, caps, qemu_driver->xmlopt, - false, + true, qemuNotifyLoadDomain, qemu_driver); cleanup: virObjectUnref(cfg); -- 2.5.3

On 22.09.2015 14:15, Martin Kletzander wrote:
We always had to deal with new parsing errors in a weird way. All of them needed to go into functions starting the domains. That messes up the code, it's confusing to newcomers and so on.
I once had an idea that we can handle this stuff. We know what failed, we have the XML that failed parsing and if the problem is not in the domain name nor UUID, we can create a domain object out of that as well. This way we are able to do something we weren't with this series applied. Example follows.
Assume "dummy" is a domain with invalid XML (I just modified the file). Now, just for the purpose of this silly example, let's say we allowed domains with archtecture x86_*, but now we've realized that x86_64 is the only one we want to allow, but there already is a domain defined that has <type arch='x86_256' .../>. With the current master, the domain would be lost, so we would need to modify the funstion starting the domain (e.g. qemuProcessStart for qemu domain). However, with this series, it behaves like this:
# virsh list --all Id Name State ---------------------------------------------------- - dummy shut off
# virsh domstate --reason dummy shut off (invalid XML)
# virsh start dummy error: Failed to start domain dummy error: XML error: domain 'dummy' was not loaded due to an XML error (unsupported configuration: Unknown architecture x86_256), please redefine it
# VISUAL='sed -i s/x86_256/x86_64/' virsh edit dummy Domain dummy XML configuration edited.
# virsh domstate --reason dummy shut off (unknown)
# virsh start dummy Domain dummy started
This is a logical next step for us to clean and separate parsing and starting, getting rid of some old code without sacrifying compatibility and maybe generating parser code in the future.
Why is this an RFC? - I can certainly easily imagine some people who might not like this - It doesn't work for all drivers (yet) - It does handle status XMLs, but only slightly. All the handling is done in a commit message that says something along the lines of "beware, we should still be careful, but not as much as before". - The error type used for the message is one that is already used for something else and we might want a new type of error for exactly this one error message, although it might be enough for some to have the possibility of checking this by querying the domain state and checking the reason. - Just so you know I came up with something new for which there's no BZ (or at least yet) =)
Martin Kletzander (8): conf, virsh: Add new domain shutoff reason qemu: Few whitespace cleanups conf: Extract name-parsing into its own function conf: Extract UUID parsing into its own function conf: Optionally keep domains with invalid XML qemu: Don't lookup invalid domains unless specified otherwise qemu: Prepare basic APIs to handle invalid defs qemu: Load domains with invalid XML on start
include/libvirt/libvirt-domain.h | 2 + src/bhyve/bhyve_driver.c | 2 + src/conf/domain_conf.c | 180 +++++++++++++++++++++++++++++++++------ src/conf/domain_conf.h | 5 ++ src/libxl/libxl_driver.c | 3 + src/lxc/lxc_driver.c | 3 + src/qemu/qemu_driver.c | 64 +++++++++++--- src/uml/uml_driver.c | 2 + tools/virsh-domain-monitor.c | 3 +- 9 files changed, 223 insertions(+), 41 deletions(-)
So now that we have an agreement here on the design, I went through the patches and they basically look okay. Except for a few places I've pointed out. ACK series then except for 7/8 where some explanation is needed. Michal
participants (4)
-
Daniel P. Berrange
-
Martin Kletzander
-
Michal Privoznik
-
Piotr Rybicki