[libvirt] PATCH: Fix escaping for 8-bit high characters

The virBufferEscapeString method has a broken test

  *cur >= 0x20

which causes all 8-bit high characters to be lost as 'cur' is a signed char. This obviously dooms anyone using UTF-8 characters outside the boring old ASCII-7 range, like this guy

  https://bugzilla.redhat.com/show_bug.cgi?id=479517

Daniel

diff --git a/src/buf.c b/src/buf.c
index 259175d..c802aa2 100644
--- a/src/buf.c
+++ b/src/buf.c
@@ -304,7 +304,7 @@ virBufferEscapeString(const virBufferPtr buf, const char *format, const char *st
             *out++ = 'o';
             *out++ = 's';
             *out++ = ';';
-        } else if ((*cur >= 0x20) || (*cur == '\n') || (*cur == '\t') ||
+        } else if (((unsigned char)*cur >= 0x20) || (*cur == '\n') || (*cur == '\t') ||
                    (*cur == '\r')) {
             /*
              * default case, just copy !

-- 
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
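To illustrate why the cast matters: where plain char is signed, any byte at or above 0x80 (every byte of a multi-byte UTF-8 sequence) holds a negative value, so it fails the >= 0x20 comparison and is never copied. The following is a minimal sketch of the two tests in isolation, not the actual libvirt code; the helper names copy_verbatim_signed and copy_verbatim_unsigned are hypothetical, and the first one forces a signed char so the result does not depend on the platform's default.

```c
#include <stdbool.h>

/* Hypothetical helper mirroring the pre-patch test. The byte is forced to
 * signed char here to model a platform where plain char is signed. */
static bool copy_verbatim_signed(char c)
{
    signed char sc = (signed char)c;
    return sc >= 0x20 || c == '\n' || c == '\t' || c == '\r';
}

/* Hypothetical helper mirroring the patched test: the byte is compared as an
 * unsigned value, so 0x80..0xFF are treated as ordinary copyable bytes. */
static bool copy_verbatim_unsigned(char c)
{
    return (unsigned char)c >= 0x20 || c == '\n' || c == '\t' || c == '\r';
}
```

With these helpers, the first byte of a UTF-8 encoded "é" (0xC3) is rejected by copy_verbatim_signed and accepted by copy_verbatim_unsigned, while plain ASCII such as 'A' passes both.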

On Tue, Aug 04, 2009 at 04:00:49PM +0100, Daniel P. Berrange wrote:
> The virBufferEscapeString method has a broken test
>
>   *cur >= 0x20
>
> which causes all 8-bit high characters to be lost as 'cur' is a signed char. This obviously dooms anyone using UTF-8 characters outside the boring old ASCII-7 range, like this guy
>
>   https://bugzilla.redhat.com/show_bug.cgi?id=479517
>
> Daniel
>
> diff --git a/src/buf.c b/src/buf.c
> index 259175d..c802aa2 100644
> --- a/src/buf.c
> +++ b/src/buf.c
> @@ -304,7 +304,7 @@ virBufferEscapeString(const virBufferPtr buf, const char *format, const char *st
>              *out++ = 'o';
>              *out++ = 's';
>              *out++ = ';';
> -        } else if ((*cur >= 0x20) || (*cur == '\n') || (*cur == '\t') ||
> +        } else if (((unsigned char)*cur >= 0x20) || (*cur == '\n') || (*cur == '\t') ||
>                     (*cur == '\r')) {
>              /*
>               * default case, just copy !
The original code from libxml2 used

  const xmlChar *cur

which is an unsigned char..., ACK

W.r.t. the comment, another way would be to check whether the non-ASCII bytes form a UTF-8 char sequence and, if not, export them as char references like '&#NNN;'. But it's painful and I don't think we can have the problem, at least as long as libvirt is used for the management: all definitions are input as XML, and libxml2 will only expose them as UTF-8, so all strings coming out of a definition (domain or otherwise) should be UTF-8.

Daniel

-- 
Daniel Veillard      | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
daniel@veillard.com  | Rpmfind RPM search engine http://rpmfind.net/
http://veillard.com/ | virtualization library  http://libvirt.org/
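For reference, the alternative mentioned above (validate non-ASCII bytes as UTF-8, fall back to char references otherwise) could look roughly like the sketch below. This is not libvirt or libxml2 code: utf8_seq_len and escape_non_utf8 are hypothetical names, the validity check is deliberately simplified (byte patterns only, no rejection of overlong 3/4-byte forms or surrogates), stray bytes are assumed to map to the code point of the same value, and the usual escaping of '<', '>' and '&' is left out.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Length of the UTF-8 sequence starting at s, or 0 if the bytes do not form
 * one. Simplified check: lead and continuation byte patterns only. */
static size_t utf8_seq_len(const unsigned char *s)
{
    size_t len, i;

    if (s[0] < 0x80)
        return 1;
    else if (s[0] >= 0xC2 && s[0] <= 0xDF)
        len = 2;
    else if (s[0] >= 0xE0 && s[0] <= 0xEF)
        len = 3;
    else if (s[0] >= 0xF0 && s[0] <= 0xF4)
        len = 4;
    else
        return 0;

    for (i = 1; i < len; i++)
        if ((s[i] & 0xC0) != 0x80)  /* a NUL terminator also fails here */
            return 0;
    return len;
}

/* Copy in to out, replacing each byte that is not part of a UTF-8 sequence
 * with a numeric char reference. Returns 0 on success, -1 if out is too
 * small. */
static int escape_non_utf8(const char *in, char *out, size_t outsz)
{
    const unsigned char *cur = (const unsigned char *)in;
    size_t used = 0;

    while (*cur) {
        size_t len = utf8_seq_len(cur);
        if (len > 0) {
            if (used + len >= outsz)
                return -1;
            memcpy(out + used, cur, len);
            used += len;
            cur += len;
        } else {
            int n = snprintf(out + used, outsz - used, "&#%u;", (unsigned)*cur);
            if (n < 0 || (size_t)n >= outsz - used)
                return -1;
            used += (size_t)n;
            cur++;
        }
    }
    if (used >= outsz)
        return -1;
    out[used] = '\0';
    return 0;
}
```

Under these assumptions a well-formed string passes through unchanged, while a lone Latin-1 byte such as 0xE9 followed by ASCII comes out as a char reference. As noted above, this should not be needed as long as all input arrives through libxml2.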