lxml is a popular Python XML processing library. It uses libxml2
behind the scenes and registers custom callbacks via
xmlSetExternalEntityLoader. However, this can cause crashes if an
app uses both lxml and libxml2 together in the same process.

This is a known limitation of lxml and libxml2 generally. It also
prevents us from using lxml in virt-manager:

https://bugzilla.redhat.com/show_bug.cgi?id=1544019

However, it's easy enough to work around in libvirt by resetting the
external entity loader to a known state before we ask libxml2 to
parse a file from disk, and restoring the original loader afterwards.
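
The save/reset/restore pattern can be sketched in isolation. This is
only a minimal model, not libxml2 code: entity_loader_fn, get_loader,
set_loader, default_loader, foreign_loader and parse_file are all
hypothetical stand-ins for a process-global hook such as the external
entity loader, a loader installed by another library, and the parse
call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a process-global hook type; not libxml2 API. */
typedef int (*entity_loader_fn)(const char *url);

/* The known-safe loader we want active while we parse. */
int default_loader(const char *url) { (void)url; return 0; }

/* A loader some other library in the process may have installed. */
int foreign_loader(const char *url) { (void)url; return 1; }

/* The single process-wide hook, shared by every user of the library. */
static entity_loader_fn current_loader = default_loader;

entity_loader_fn get_loader(void) { return current_loader; }
void set_loader(entity_loader_fn fn) { current_loader = fn; }

/* Stand-in for the parse call: reports which loader was active. */
int parse_file(const char *filename)
{
    return current_loader(filename);
}

/* Save the current hook, force the known-safe one for the duration of
 * the parse, then hand the caller's hook back untouched. */
int parse_file_with_default_loader(const char *filename)
{
    entity_loader_fn orig = get_loader();
    int ret;

    assert(filename != NULL);
    set_loader(default_loader);
    ret = parse_file(filename);
    set_loader(orig);
    return ret;
}
```

In this model the wrapper never runs the foreign hook, and the foreign
hook is still installed once the wrapper returns. Because the hook is a
single process-wide value, the swap is not protected against another
thread parsing at the same moment.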
Signed-off-by: Cole Robinson <crobinso(a)redhat.com>
---
src/util/virxml.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/src/util/virxml.c b/src/util/virxml.c
index 6e87605ea..3e01794f9 100644
--- a/src/util/virxml.c
+++ b/src/util/virxml.c
@@ -810,9 +810,14 @@ virXMLParseHelper(int domcode,
     pctxt->sax->error = catchXMLError;
     if (filename) {
+        /* Reset any libxml2 file callbacks, other libs (like python lxml)
+         * may have set their own which can get crashy */
+        xmlExternalEntityLoader origloader = xmlGetExternalEntityLoader();
+        xmlSetExternalEntityLoader(xmlNoNetExternalEntityLoader);
         xml = xmlCtxtReadFile(pctxt, filename, NULL,
                               XML_PARSE_NONET |
                               XML_PARSE_NOWARNING);
+        xmlSetExternalEntityLoader(origloader);
     } else {
         xml = xmlCtxtReadDoc(pctxt, BAD_CAST xmlStr, url, NULL,
                              XML_PARSE_NONET |
--
2.14.3