[PATCH 0/5] be consistent about error checking xmlNodeGetContent() return

Awhile back I noticed that calls to xmlNodeGetContent() from libvirt code were inconsistent in their handling of the returned pointer. Sometimes we would assume the return was always non-NULL (dereferencing with wild abandon without concern for the consequences), sometimes we would interpret NULL as "", and sometimes as OOM. I sent mail about this to the list last week, wondering (and doubting) if we could assume that a NULL return would always mean OOM: https://www.redhat.com/archives/libvir-list/2020-July/msg00333.html After looking at the libxml code, danpb's determination was that a NULL return from xmlNodeGetContent *might* mean OOM, but it might also mean some odd XML that we weren't expecting, so we can't just always exit on a NULL return. He also agreed with (and expanded on) my suspicion that there really is no reliable way to tell the reason for a NULL return from xmlNodeGetContent, and suggested that possibly we could just examing the xmlNode directly to learn the content instead of calling xmlNodeGetContent(). This series is a followup to that discussion. The first 4 patches clean up the code with the result that: 1) a libvirt wrapper function is always called instead of calling xmlNodeGetContent() directly. 2) that wrapper function logs an error when it gets back NULL from xmlNodeGetContent(). 3) All the callers check for a NULL return, and do a "silent error return" themselves when there is a NULL. In the final patch, I try out Dan's idea of looking at the xmlNode object directly to get the content. It turns out it's not as straightforward as you would think from just looking at the layout of the object - a full explanation is in patch 5. I'm mainly sending that patch as an "FYI" (one step back from an "RFC"), since really all it changes is that libvirt will exit on OOM, and log (different, but not any more informative) error messages when the problem isn't OOM. Unless someone has a strong opinion otherwise, I think just the first 4 patches should be applied, and users can just "deal" with the ambiguity in case of error. Laine Stump (5): conf: refactor virDomainBlkioDeviceParseXML to reduce calls to xmlNodeGetContent util: replace all calls to xmlNodeGetContent with virXMLNodeContentString util: log an error if virXMLNodeContentString will return NULL treat all NULL returns from virXMLNodeContentString() as an error util: open code virXMLNodeContentString to access the node object directly src/conf/domain_conf.c | 194 ++++++++++++++++++++---------------- src/conf/network_conf.c | 7 +- src/conf/node_device_conf.c | 6 +- src/util/virxml.c | 53 +++++++++- 4 files changed, 169 insertions(+), 91 deletions(-) -- 2.25.4

virDomainBlkioDeviceParseXML() calls xmlNodeGetContent() multiple times in a loop, but can easily be refactored to call it once for all element nodes, and then use the result of that one call in each of the (mutually exclusive) blocks that previously each had their own call to xmlNodeGetContent. This is being done in order to reduce the number of changes needed in an upcoming patch that will eliminate the lack of checking for NULL on return from xmlNodeGetContent(). As part of the simplification, the while() loop has been changed into a for() so that we can use "continue" without bypassing the "node = node->next". Signed-off-by: Laine Stump <laine@redhat.com> Change from V1: turned into for() loop and log error rather than ignoring NULL. Jano had suggested we might be able to set dev->path directly instead of using a temporary var, but doing that would require keeping the error: label and its cleanup of dev->path, rather than just relying on g_autofree. Signed-off-by: Laine Stump <laine@redhat.com> --- src/conf/domain_conf.c | 128 ++++++++++++++++++++++------------------- 1 file changed, 70 insertions(+), 58 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 7ecd2818b9..ade8c13914 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -1635,73 +1635,85 @@ virDomainBlkioDeviceParseXML(xmlNodePtr root, virBlkioDevicePtr dev) { xmlNodePtr node; - g_autofree char *c = NULL; - - node = root->children; - while (node) { - if (node->type == XML_ELEMENT_NODE) { - if (virXMLNodeNameEqual(node, "path") && !dev->path) { - dev->path = (char *)xmlNodeGetContent(node); - } else if (virXMLNodeNameEqual(node, "weight")) { - c = (char *)xmlNodeGetContent(node); - if (virStrToLong_ui(c, NULL, 10, &dev->weight) < 0) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("could not parse weight %s"), - c); - goto error; - } - VIR_FREE(c); - } else if (virXMLNodeNameEqual(node, "read_bytes_sec")) { - c = (char *)xmlNodeGetContent(node); - if (virStrToLong_ull(c, NULL, 10, &dev->rbps) < 0) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("could not parse read bytes sec %s"), - c); - goto error; - } - VIR_FREE(c); - } else if (virXMLNodeNameEqual(node, "write_bytes_sec")) { - c = (char *)xmlNodeGetContent(node); - if (virStrToLong_ull(c, NULL, 10, &dev->wbps) < 0) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("could not parse write bytes sec %s"), - c); - goto error; - } - VIR_FREE(c); - } else if (virXMLNodeNameEqual(node, "read_iops_sec")) { - c = (char *)xmlNodeGetContent(node); - if (virStrToLong_ui(c, NULL, 10, &dev->riops) < 0) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("could not parse read iops sec %s"), - c); - goto error; - } - VIR_FREE(c); - } else if (virXMLNodeNameEqual(node, "write_iops_sec")) { - c = (char *)xmlNodeGetContent(node); - if (virStrToLong_ui(c, NULL, 10, &dev->wiops) < 0) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, - _("could not parse write iops sec %s"), - c); - goto error; - } - VIR_FREE(c); + g_autofree char *path = NULL; + + for (node = root->children; node != NULL; node = node->next) { + g_autofree char *c = NULL; + + if (node->type != XML_ELEMENT_NODE) + continue; + + c = (char *)xmlNodeGetContent(node); + + if (virXMLNodeNameEqual(node, "path")) { + /* To avoid the need for explicit cleanup on failure, + * don't set dev->path until we're assured of + * success. Until then, store it in an autofree pointer. + */ + if (!path) + path = g_steal_pointer(&c); + continue; + } + + if (virXMLNodeNameEqual(node, "weight")) { + if (virStrToLong_ui(c, NULL, 10, &dev->weight) < 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("could not parse weight %s"), + c); + return -1; } + continue; + } + + if (virXMLNodeNameEqual(node, "read_bytes_sec")) { + if (virStrToLong_ull(c, NULL, 10, &dev->rbps) < 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("could not parse read bytes sec %s"), + c); + return -1; + } + continue; + } + + if (virXMLNodeNameEqual(node, "write_bytes_sec")) { + if (virStrToLong_ull(c, NULL, 10, &dev->wbps) < 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("could not parse write bytes sec %s"), + c); + return -1; + } + continue; + } + + if (virXMLNodeNameEqual(node, "read_iops_sec")) { + if (virStrToLong_ui(c, NULL, 10, &dev->riops) < 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("could not parse read iops sec %s"), + c); + return -1; + } + continue; + } + + if (virXMLNodeNameEqual(node, "write_iops_sec")) { + if (virStrToLong_ui(c, NULL, 10, &dev->wiops) < 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("could not parse write iops sec %s"), + c); + return -1; + } + continue; } - node = node->next; } - if (!dev->path) { + + if (!path) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", _("missing per-device path")); return -1; } + dev->path = g_steal_pointer(&path); return 0; - - error: - VIR_FREE(dev->path); - return -1; } -- 2.25.4

No functional change Signed-off-by: Laine Stump <laine@redhat.com> --- src/conf/domain_conf.c | 20 ++++++++++---------- src/conf/network_conf.c | 2 +- src/conf/node_device_conf.c | 4 ++-- 3 files changed, 13 insertions(+), 13 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index ade8c13914..cb69c97a8e 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -1643,7 +1643,7 @@ virDomainBlkioDeviceParseXML(xmlNodePtr root, if (node->type != XML_ELEMENT_NODE) continue; - c = (char *)xmlNodeGetContent(node); + c = virXMLNodeContentString(node); if (virXMLNodeNameEqual(node, "path")) { /* To avoid the need for explicit cleanup on failure, @@ -9373,10 +9373,10 @@ virDomainLeaseDefParseXML(xmlNodePtr node) while (cur != NULL) { if (cur->type == XML_ELEMENT_NODE) { if (!key && virXMLNodeNameEqual(cur, "key")) { - key = (char *)xmlNodeGetContent(cur); + key = virXMLNodeContentString(cur); } else if (!lockspace && virXMLNodeNameEqual(cur, "lockspace")) { - lockspace = (char *)xmlNodeGetContent(cur); + lockspace = virXMLNodeContentString(cur); } else if (!path && virXMLNodeNameEqual(cur, "target")) { path = virXMLPropString(cur, "path"); @@ -10595,16 +10595,16 @@ virDomainDiskDefParseXML(virDomainXMLOptionPtr xmlopt, } else if (!serial && virXMLNodeNameEqual(cur, "serial")) { - serial = (char *)xmlNodeGetContent(cur); + serial = virXMLNodeContentString(cur); } else if (!wwn && virXMLNodeNameEqual(cur, "wwn")) { - wwn = (char *)xmlNodeGetContent(cur); + wwn = virXMLNodeContentString(cur); if (!virValidateWWN(wwn)) goto error; } else if (!vendor && virXMLNodeNameEqual(cur, "vendor")) { - vendor = (char *)xmlNodeGetContent(cur); + vendor = virXMLNodeContentString(cur); if (strlen(vendor) > VENDOR_LEN) { virReportError(VIR_ERR_XML_ERROR, "%s", @@ -10619,7 +10619,7 @@ virDomainDiskDefParseXML(virDomainXMLOptionPtr xmlopt, } } else if (!product && virXMLNodeNameEqual(cur, "product")) { - product = (char *)xmlNodeGetContent(cur); + product = virXMLNodeContentString(cur); if (strlen(product) > PRODUCT_LEN) { virReportError(VIR_ERR_XML_ERROR, "%s", @@ -13513,7 +13513,7 @@ virDomainSmartcardDefParseXML(virDomainXMLOptionPtr xmlopt, "exactly three certificates")); goto error; } - def->data.cert.file[i] = (char *)xmlNodeGetContent(cur); + def->data.cert.file[i] = virXMLNodeContentString(cur); if (!def->data.cert.file[i]) { virReportOOMError(); goto error; @@ -13522,7 +13522,7 @@ virDomainSmartcardDefParseXML(virDomainXMLOptionPtr xmlopt, } else if (cur->type == XML_ELEMENT_NODE && virXMLNodeNameEqual(cur, "database") && !def->data.cert.database) { - def->data.cert.database = (char *)xmlNodeGetContent(cur); + def->data.cert.database = virXMLNodeContentString(cur); if (!def->data.cert.database) { virReportOOMError(); goto error; @@ -19875,7 +19875,7 @@ virDomainLoaderDefParseXML(xmlNodePtr node, if (!fwAutoSelect) { readonly_str = virXMLPropString(node, "readonly"); type_str = virXMLPropString(node, "type"); - loader->path = (char *) xmlNodeGetContent(node); + loader->path = virXMLNodeContentString(node); if (STREQ_NULLABLE(loader->path, "")) VIR_FREE(loader->path); } diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index 0fd68a7d66..0a32f57188 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -720,7 +720,7 @@ virNetworkDNSHostDefParseXML(const char *networkName, if (cur->type == XML_ELEMENT_NODE && virXMLNodeNameEqual(cur, "hostname")) { if (cur->children != NULL) { - g_autofree char *name = (char *) xmlNodeGetContent(cur); + g_autofree char *name = virXMLNodeContentString(cur); if (!name) { virReportError(VIR_ERR_XML_DETAIL, diff --git a/src/conf/node_device_conf.c b/src/conf/node_device_conf.c index c54015336a..7bbe709c5d 100644 --- a/src/conf/node_device_conf.c +++ b/src/conf/node_device_conf.c @@ -1988,10 +1988,10 @@ virNodeDeviceDefParseXML(xmlXPathContextPtr ctxt, switch ((virNodeDevDevnodeType)val) { case VIR_NODE_DEV_DEVNODE_DEV: - def->devnode = (char*)xmlNodeGetContent(node); + def->devnode = virXMLNodeContentString(node); break; case VIR_NODE_DEV_DEVNODE_LINK: - def->devlinks[m++] = (char*)xmlNodeGetContent(node); + def->devlinks[m++] = virXMLNodeContentString(node); break; case VIR_NODE_DEV_DEVNODE_LAST: break; -- 2.25.4

Many of our calls to xmlNodeGetContent() (which are now all via virXMLNodeContentString() are failing to check for a NULL return. We need to remedy that, but in order to make the remedy simpler, let's log an error in virXMLNodeContentString(), so that the callers don't all individually need to (since it would be the same error message for all of them anyway). Signed-off-by: Laine Stump <laine@redhat.com> --- src/util/virxml.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-) diff --git a/src/util/virxml.c b/src/util/virxml.c index 27d22598ee..5315d4ff6f 100644 --- a/src/util/virxml.c +++ b/src/util/virxml.c @@ -538,7 +538,23 @@ virXMLPropStringLimit(xmlNodePtr node, char * virXMLNodeContentString(xmlNodePtr node) { - return (char *)xmlNodeGetContent(node); + char *ret = (char *)xmlNodeGetContent(node); + + if (node->type != XML_ELEMENT_NODE) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("node '%s' has unexpected type %d"), + node->name, node->type); + return NULL; + } + + if (!ret) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("node '%s' has unexpected NULL content. This could be caused by malformed input, or a memory allocation failure"), + node->name); + return NULL; + } + + return ret; } -- 2.25.4

and stop erroneously equating NULL with "". The latter means that the element has empty content, while the former means there was an error during parsing (either internal with the parser, or the content of the XML was bad). Signed-off-by: Laine Stump <laine@redhat.com> --- src/conf/domain_conf.c | 68 ++++++++++++++++++++++--------------- src/conf/network_conf.c | 5 ++- src/conf/node_device_conf.c | 6 ++-- 3 files changed, 48 insertions(+), 31 deletions(-) diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index cb69c97a8e..c377fd74aa 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -1643,7 +1643,8 @@ virDomainBlkioDeviceParseXML(xmlNodePtr root, if (node->type != XML_ELEMENT_NODE) continue; - c = virXMLNodeContentString(node); + if (!(c = virXMLNodeContentString(node))) + return -1; if (virXMLNodeNameEqual(node, "path")) { /* To avoid the need for explicit cleanup on failure, @@ -9373,10 +9374,12 @@ virDomainLeaseDefParseXML(xmlNodePtr node) while (cur != NULL) { if (cur->type == XML_ELEMENT_NODE) { if (!key && virXMLNodeNameEqual(cur, "key")) { - key = virXMLNodeContentString(cur); + if (!(key = virXMLNodeContentString(cur))) + goto error; } else if (!lockspace && virXMLNodeNameEqual(cur, "lockspace")) { - lockspace = virXMLNodeContentString(cur); + if (!(lockspace = virXMLNodeContentString(cur))) + goto error; } else if (!path && virXMLNodeNameEqual(cur, "target")) { path = virXMLPropString(cur, "path"); @@ -10595,16 +10598,19 @@ virDomainDiskDefParseXML(virDomainXMLOptionPtr xmlopt, } else if (!serial && virXMLNodeNameEqual(cur, "serial")) { - serial = virXMLNodeContentString(cur); + if (!(serial = virXMLNodeContentString(cur))) + goto error; } else if (!wwn && virXMLNodeNameEqual(cur, "wwn")) { - wwn = virXMLNodeContentString(cur); + if (!(wwn = virXMLNodeContentString(cur))) + goto error; if (!virValidateWWN(wwn)) goto error; } else if (!vendor && virXMLNodeNameEqual(cur, "vendor")) { - vendor = virXMLNodeContentString(cur); + if (!(vendor = virXMLNodeContentString(cur))) + goto error; if (strlen(vendor) > VENDOR_LEN) { virReportError(VIR_ERR_XML_ERROR, "%s", @@ -10619,7 +10625,8 @@ virDomainDiskDefParseXML(virDomainXMLOptionPtr xmlopt, } } else if (!product && virXMLNodeNameEqual(cur, "product")) { - product = virXMLNodeContentString(cur); + if (!(product = virXMLNodeContentString(cur))) + goto error; if (strlen(product) > PRODUCT_LEN) { virReportError(VIR_ERR_XML_ERROR, "%s", @@ -13513,20 +13520,16 @@ virDomainSmartcardDefParseXML(virDomainXMLOptionPtr xmlopt, "exactly three certificates")); goto error; } - def->data.cert.file[i] = virXMLNodeContentString(cur); - if (!def->data.cert.file[i]) { - virReportOOMError(); + if (!(def->data.cert.file[i] = virXMLNodeContentString(cur))) goto error; - } + i++; } else if (cur->type == XML_ELEMENT_NODE && virXMLNodeNameEqual(cur, "database") && !def->data.cert.database) { - def->data.cert.database = virXMLNodeContentString(cur); - if (!def->data.cert.database) { - virReportOOMError(); + if (!(def->data.cert.database = virXMLNodeContentString(cur))) goto error; - } + if (*def->data.cert.database != '/') { virReportError(VIR_ERR_XML_ERROR, _("expecting absolute path: %s"), @@ -15638,8 +15641,10 @@ virSysinfoOEMStringsParseXML(xmlNodePtr node, goto cleanup; def->nvalues = nstrings; - for (i = 0; i < nstrings; i++) - def->values[i] = virXMLNodeContentString(strings[i]); + for (i = 0; i < nstrings; i++) { + if (!(def->values[i] = virXMLNodeContentString(strings[i]))) + goto cleanup; + } *oem = g_steal_pointer(&def); ret = 0; @@ -15767,7 +15772,9 @@ virSysinfoParseFWCfgDef(virSysinfoDefPtr def, return -1; } - value = virXMLNodeContentString(nodes[i]); + if (!(value = virXMLNodeContentString(nodes[i]))) + return -1; + file = virXMLPropString(nodes[i], "file"); if (virStringIsEmpty(value)) @@ -19875,8 +19882,10 @@ virDomainLoaderDefParseXML(xmlNodePtr node, if (!fwAutoSelect) { readonly_str = virXMLPropString(node, "readonly"); type_str = virXMLPropString(node, "type"); - loader->path = virXMLNodeContentString(node); - if (STREQ_NULLABLE(loader->path, "")) + if (!(loader->path = virXMLNodeContentString(node))) + return -1; + + if (STREQ(loader->path, "")) VIR_FREE(loader->path); } @@ -20093,14 +20102,15 @@ virDomainVcpuParse(virDomainDefPtr def, vcpus = maxvcpus = 1; if ((vcpuNode = virXPathNode("./vcpu[1]", ctxt))) { - if ((tmp = virXMLNodeContentString(vcpuNode))) { - if (virStrToLong_ui(tmp, NULL, 10, &maxvcpus) < 0) { - virReportError(VIR_ERR_XML_ERROR, "%s", - _("maximum vcpus count must be an integer")); - return -1; - } - VIR_FREE(tmp); + if (!(tmp = virXMLNodeContentString(vcpuNode))) + return -1; + + if (virStrToLong_ui(tmp, NULL, 10, &maxvcpus) < 0) { + virReportError(VIR_ERR_XML_ERROR, "%s", + _("maximum vcpus count must be an integer")); + return -1; } + VIR_FREE(tmp); if ((tmp = virXMLPropString(vcpuNode, "current"))) { if (virStrToLong_ui(tmp, NULL, 10, &vcpus) < 0) { @@ -20361,7 +20371,9 @@ virDomainDefParseBootOptions(virDomainDefPtr def, if (STREQ_NULLABLE(tmp, "slic")) { VIR_FREE(tmp); - tmp = virXMLNodeContentString(nodes[0]); + if (!(tmp = virXMLNodeContentString(nodes[0]))) + return -1; + def->os.slic_table = virFileSanitizePath(tmp); } else { virReportError(VIR_ERR_XML_ERROR, diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c index 0a32f57188..54a723be64 100644 --- a/src/conf/network_conf.c +++ b/src/conf/network_conf.c @@ -722,7 +722,10 @@ virNetworkDNSHostDefParseXML(const char *networkName, if (cur->children != NULL) { g_autofree char *name = virXMLNodeContentString(cur); - if (!name) { + if (!name) + goto error; + + if (!name[0]) { virReportError(VIR_ERR_XML_DETAIL, _("Missing hostname in network '%s' DNS HOST record"), networkName); diff --git a/src/conf/node_device_conf.c b/src/conf/node_device_conf.c index 7bbe709c5d..31c81a3b85 100644 --- a/src/conf/node_device_conf.c +++ b/src/conf/node_device_conf.c @@ -1988,10 +1988,12 @@ virNodeDeviceDefParseXML(xmlXPathContextPtr ctxt, switch ((virNodeDevDevnodeType)val) { case VIR_NODE_DEV_DEVNODE_DEV: - def->devnode = virXMLNodeContentString(node); + if (!(def->devnode = virXMLNodeContentString(node))) + goto error; break; case VIR_NODE_DEV_DEVNODE_LINK: - def->devlinks[m++] = virXMLNodeContentString(node); + if (!(def->devlinks[m++] = virXMLNodeContentString(node))) + goto error; break; case VIR_NODE_DEV_DEVNODE_LAST: break; -- 2.25.4

(I am *NOT* advocating that we apply this patch. Just providing it for informational purposes, since we had previously discussed this possibility on the list) Since it's impossible to determine whether xmlNodeContent has returned a NULL due to OOM, or due to badly formed / evil XML, this patch open codes virXMLNodeContentString to get the content string directly from the node. This turns out to not be so easy as it seemed at first glance when it was suggested - the "content" member of the element node itself does not contain the content string for the node. The content string that we want can be found (at least for our uses of libxml) by looking for a child node of the element node - if that child node is of type XML_TEXT_NODE, then the content member of *that* node is the string we're looking for. If there is no child node, then the element has no content, so we return "". Likewise, if the child node is type XML_TEXT_NODE but has no content, we also return "". In all other cases, we log an error and return because this is some case that hasn't been encountered in our test cases, so either someone is sending bad XML, or our assumptions about the layout of the XML node object list are incorrect. Note that while calling virXMLNodeContentString() would return NULL from an OOM situation, this new code will exit the process on OOM (since it is calling glib for memory allocation). Signed-off-by: Laine Stump <laine@redhat.com> --- src/util/virxml.c | 43 ++++++++++++++++++++++++++++++++++++++----- 1 file changed, 38 insertions(+), 5 deletions(-) diff --git a/src/util/virxml.c b/src/util/virxml.c index 5315d4ff6f..b2298d74c8 100644 --- a/src/util/virxml.c +++ b/src/util/virxml.c @@ -538,7 +538,17 @@ virXMLPropStringLimit(xmlNodePtr node, char * virXMLNodeContentString(xmlNodePtr node) { - char *ret = (char *)xmlNodeGetContent(node); + /* We specifically avoid using virXMLNodeContentString() here, because + * when NULL is returned, it is difficult/impossible to + * distinguish between 1) OOM, 2) NULL content, 3) some other error. + */ + + /* for elements used the way libvirt uses them, the xmlNode object + * for an element will have a type of XML_ELEMENT_NODE, and if the + * node has any content, it will be in the content field of a + * child node of that object which is itself of type + * XML_TEXT_NODE. + */ if (node->type != XML_ELEMENT_NODE) { virReportError(VIR_ERR_INTERNAL_ERROR, @@ -547,15 +557,38 @@ virXMLNodeContentString(xmlNodePtr node) return NULL; } - if (!ret) { + /* no children --> empty element node */ + if (!node->children) + return g_strdup(""); + + /* if the child isn't text, or there is more than a single node + * hanging off "children", our assumptions have been wrong + */ + if (node->children->type != XML_TEXT_NODE) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("child of element node '%s' has unexpected name '%s', type %d"), + node->name, node->children->name, node->children->type); + return NULL; + } + if (node->children->next) { virReportError(VIR_ERR_INTERNAL_ERROR, - _("node '%s' has unexpected NULL content. This could be caused by malformed input, or a memory allocation failure"), + _("child of element node '%s' is type XML_TEXT_NODE, but is a list"), + node->name); + return NULL; + } + if (node->children->children) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("child of element node '%s' is type XML_TEXT_NODE, but has children"), node->name); return NULL; } - return ret; -} + /* if content is NULL, return "" instead */ + if (!node->children->content) + return g_strdup(""); + + return g_strdup((char *)node->children->content); + } /** -- 2.25.4

On 7/20/20 10:48 PM, Laine Stump wrote:
(I am *NOT* advocating that we apply this patch. Just providing it for informational purposes, since we had previously discussed this possibility on the list)
Since it's impossible to determine whether xmlNodeContent has returned a NULL due to OOM, or due to badly formed / evil XML, this patch open codes virXMLNodeContentString to get the content string directly from the node.
This turns out to not be so easy as it seemed at first glance when it was suggested - the "content" member of the element node itself does not contain the content string for the node. The content string that we want can be found (at least for our uses of libxml) by looking for a child node of the element node - if that child node is of type XML_TEXT_NODE, then the content member of *that* node is the string we're looking for. If there is no child node, then the element has no content, so we return "". Likewise, if the child node is type XML_TEXT_NODE but has no content, we also return "". In all other cases, we log an error and return because this is some case that hasn't been encountered in our test cases, so either someone is sending bad XML, or our assumptions about the layout of the XML node object list are incorrect.
Note that while calling virXMLNodeContentString() would return NULL from an OOM situation, this new code will exit the process on OOM (since it is calling glib for memory allocation).
Signed-off-by: Laine Stump <laine@redhat.com> --- src/util/virxml.c | 43 ++++++++++++++++++++++++++++++++++++++----- 1 file changed, 38 insertions(+), 5 deletions(-)
diff --git a/src/util/virxml.c b/src/util/virxml.c index 5315d4ff6f..b2298d74c8 100644 --- a/src/util/virxml.c +++ b/src/util/virxml.c @@ -538,7 +538,17 @@ virXMLPropStringLimit(xmlNodePtr node, char * virXMLNodeContentString(xmlNodePtr node) { - char *ret = (char *)xmlNodeGetContent(node); + /* We specifically avoid using virXMLNodeContentString() here, because + * when NULL is returned, it is difficult/impossible to + * distinguish between 1) OOM, 2) NULL content, 3) some other error. + */
s/virXMLNodeContentString/xmlNodeGetContent/ This patch makes sense to me. I'll leave it up to you whether you merge it or not. Michal

On 7/20/20 10:48 PM, Laine Stump wrote:
Awhile back I noticed that calls to xmlNodeGetContent() from libvirt code were inconsistent in their handling of the returned pointer. Sometimes we would assume the return was always non-NULL (dereferencing with wild abandon without concern for the consequences), sometimes we would interpret NULL as "", and sometimes as OOM. I sent mail about this to the list last week, wondering (and doubting) if we could assume that a NULL return would always mean OOM:
https://www.redhat.com/archives/libvir-list/2020-July/msg00333.html
After looking at the libxml code, danpb's determination was that a NULL return from xmlNodeGetContent *might* mean OOM, but it might also mean some odd XML that we weren't expecting, so we can't just always exit on a NULL return. He also agreed with (and expanded on) my suspicion that there really is no reliable way to tell the reason for a NULL return from xmlNodeGetContent, and suggested that possibly we could just examing the xmlNode directly to learn the content instead of calling xmlNodeGetContent().
This series is a followup to that discussion. The first 4 patches clean up the code with the result that:
1) a libvirt wrapper function is always called instead of calling xmlNodeGetContent() directly.
2) that wrapper function logs an error when it gets back NULL from xmlNodeGetContent().
3) All the callers check for a NULL return, and do a "silent error return" themselves when there is a NULL.
In the final patch, I try out Dan's idea of looking at the xmlNode object directly to get the content. It turns out it's not as straightforward as you would think from just looking at the layout of the object - a full explanation is in patch 5. I'm mainly sending that patch as an "FYI" (one step back from an "RFC"), since really all it changes is that libvirt will exit on OOM, and log (different, but not any more informative) error messages when the problem isn't OOM. Unless someone has a strong opinion otherwise, I think just the first 4 patches should be applied, and users can just "deal" with the ambiguity in case of error.
Laine Stump (5): conf: refactor virDomainBlkioDeviceParseXML to reduce calls to xmlNodeGetContent util: replace all calls to xmlNodeGetContent with virXMLNodeContentString util: log an error if virXMLNodeContentString will return NULL treat all NULL returns from virXMLNodeContentString() as an error util: open code virXMLNodeContentString to access the node object directly
src/conf/domain_conf.c | 194 ++++++++++++++++++++---------------- src/conf/network_conf.c | 7 +- src/conf/node_device_conf.c | 6 +- src/util/virxml.c | 53 +++++++++- 4 files changed, 169 insertions(+), 91 deletions(-)
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> Michal
participants (2)
-
Laine Stump
-
Michal Privoznik