[libvirt] [PATCH] Detailed XML errors when XML is not well-formed

This patch changes the implementations lots of functions which parse XML documents, so that if the XML document is not well-formed then we get detailed error messages. The general form of the change is: static void catchXMLError (void *ctx, const char *msg ATTRIBUTE_UNUSED, ...) { // a callback which calls virDomainReportError } virDomainDefPtr virDomainDefParseString(virConnectPtr conn, virCapsPtr caps, const char *xmlStr) { xmlParserCtxtPtr pctxt; pctxt = xmlNewParserCtxt (); pctxt->sax->error = catchXMLError; xml = xmlCtxtReadDoc (pctxt, //...) etc. There are some unavoidable shortcomings: (1) There is no place to stash user pointers during the callback (the suggestively named pctxt->userData field is already used for something else), so we cannot pass the virConnectPtr to the error function. As a result, virterror will store the error in a global variable, and callers will probably not be able to access it. (2) The XML parser routinely produces multiple error messages, and virterror throws away all but the last one. The errors do, however, get printed to stderr. You can test this by using 'virsh define', 'virsh net-define', 'virsh pool-define', etc. on not-well-formed XML files. Rich. -- Richard Jones, Emerging Technologies, Red Hat http://et.redhat.com/~rjones virt-p2v converts physical machines to virtual machines. Boot with a live CD or over the network (PXE) and turn machines into Xen guests. http://et.redhat.com/~rjones/virt-p2v

On Wed, Jul 30, 2008 at 03:49:17PM +0100, Richard W.M. Jones wrote:
This patch changes the implementations lots of functions which parse XML documents, so that if the XML document is not well-formed then we get detailed error messages.
The general form of the change is:
static void catchXMLError (void *ctx, const char *msg ATTRIBUTE_UNUSED, ...) { // a callback which calls virDomainReportError }
virDomainDefPtr virDomainDefParseString(virConnectPtr conn, virCapsPtr caps, const char *xmlStr) { xmlParserCtxtPtr pctxt;
pctxt = xmlNewParserCtxt (); pctxt->sax->error = catchXMLError;
xml = xmlCtxtReadDoc (pctxt, //...) etc.
There are some unavoidable shortcomings:
(1) There is no place to stash user pointers during the callback (the suggestively named pctxt->userData field is already used for something else), so we cannot pass the virConnectPtr to the error function. As a result, virterror will store the error in a global variable, and callers will probably not be able to access it.
Yes, that makes it rather limited & not really any benefit as is. We can get around this though with some cleverness of thread locals. Either we can annotate a global variable with __thread, or use the pthread_once_t/pthread_key_t to store it. So before starting the parse, initialize these to 'empty' or some other defined state. Do the parse, and the error handler can store the problems. Once finished we can read the details out (& clear any state to avoid mem leak) and put them into our virConnectPtr error object. Since you're using thread locals this is all nice & safe from race conditions with multiple parses taking place in parallel.
(2) The XML parser routinely produces multiple error messages, and virterror throws away all but the last one.
It'd probably be more use to only report the first one, since the latter problems may well be caused by earlier problems.
The errors do, however, get printed to stderr.
You won't see those except with the 'test' or 'xen' drivers run locally. Anything going via the daemon will be output on the daemon's stderr (/dev/null) and not on the client. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

On Wed, Jul 30, 2008 at 03:57:57PM +0100, Daniel P. Berrange wrote:
On Wed, Jul 30, 2008 at 03:49:17PM +0100, Richard W.M. Jones wrote:
This patch changes the implementations lots of functions which parse XML documents, so that if the XML document is not well-formed then we get detailed error messages.
The general form of the change is:
static void catchXMLError (void *ctx, const char *msg ATTRIBUTE_UNUSED, ...) { // a callback which calls virDomainReportError }
virDomainDefPtr virDomainDefParseString(virConnectPtr conn, virCapsPtr caps, const char *xmlStr) { xmlParserCtxtPtr pctxt;
pctxt = xmlNewParserCtxt (); pctxt->sax->error = catchXMLError;
xml = xmlCtxtReadDoc (pctxt, //...) etc.
There are some unavoidable shortcomings:
(1) There is no place to stash user pointers during the callback (the suggestively named pctxt->userData field is already used for something else), so we cannot pass the virConnectPtr to the error function. As a result, virterror will store the error in a global variable, and callers will probably not be able to access it.
void *_private; /* For user data, libxml won't touch it */ just set _private in the parser context, you should get the parser context back in the SAX handler, then you can get to your _private data. that should work and avoid the mess of global variables and thread stuff...
(2) The XML parser routinely produces multiple error messages, and virterror throws away all but the last one.
It'd probably be more use to only report the first one, since the latter problems may well be caused by earlier problems.
Likely but not always correct, maybe the solution is to build a big string concatenating all the errors and passing it back
You won't see those except with the 'test' or 'xen' drivers run locally. Anything going via the daemon will be output on the daemon's stderr (/dev/null) and not on the client.
yeah, unfortunately we have to assume a model where each errors can't be reported individually, Daniel -- Red Hat Virtualization group http://redhat.com/virtualization/ Daniel Veillard | virtualization library http://libvirt.org/ veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/ http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
participants (3)
-
Daniel P. Berrange
-
Daniel Veillard
-
Richard W.M. Jones