On Mon, Mar 30, 2015 at 07:06:45AM -0600, Eric Blake wrote:
On 03/30/2015 05:02 AM, Ján Tomko wrote:
> These cannot be represented in XML.
Yes they can, via entities. DV would know for sure, but I think that
&#1; is the entity for the C byte '\1'.
no they can't :-)
A character must match the Char production, even when using a CharRef:
http://www.w3.org/TR/REC-xml/#NT-Char
[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
           | [#x10000-#x10FFFF]
    /* any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. */
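For illustration, the Char production quoted above can be written as a
simple range check. This is only a sketch: the function name
xml_char_is_legal is hypothetical and is not a libxml2 or libvirt API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the Unicode code point cp matches the XML 1.0
 * Char production [2]. Hypothetical helper for illustration only. */
static bool
xml_char_is_legal(uint32_t cp)
{
    return cp == 0x9 || cp == 0xA || cp == 0xD ||
           (cp >= 0x20 && cp <= 0xD7FF) ||
           (cp >= 0xE000 && cp <= 0xFFFD) ||
           (cp >= 0x10000 && cp <= 0x10FFFF);
}
```

Under this check the code point 0x1 is rejected, and the Legal Character
constraint applies the same production to character references.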
>
> We have been stripping them, but only if the string had
> characters that needed escaping: <>"'&
>
> Extend the strcspn check to include control codes, and strip
> them even if we don't do any escaping.
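A rough sketch of the stripping approach described above, assuming the
input is a NUL-terminated byte string. This is not the actual libvirt
patch; the names xml_illegal_ctrl and strip_xml_illegal_ctrl are
hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Bytes 0x01-0x1F that the XML Char production excludes, i.e. every
 * C0 control code except tab (0x09), LF (0x0A) and CR (0x0D). */
static const char xml_illegal_ctrl[] =
    "\x01\x02\x03\x04\x05\x06\x07\x08\x0B\x0C"
    "\x0E\x0F\x10\x11\x12\x13\x14\x15\x16\x17"
    "\x18\x19\x1A\x1B\x1C\x1D\x1E\x1F";

/* Returns a newly allocated copy of str with those bytes removed, or
 * NULL on allocation failure. The caller frees the result. */
static char *
strip_xml_illegal_ctrl(const char *str)
{
    char *out = malloc(strlen(str) + 1);
    char *dst = out;

    if (!out)
        return NULL;

    while (*str) {
        /* Length of the leading run that contains no illegal byte. */
        size_t ok = strcspn(str, xml_illegal_ctrl);

        memcpy(dst, str, ok);
        dst += ok;
        str += ok;
        if (*str)
            str++; /* skip one illegal control byte */
    }
    *dst = '\0';
    return out;
}
```

The same strcspn call can serve as the up-front check: if it returns the
full string length, there is nothing to strip.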
NACK. Stripping control codes from a volume name represents the wrong
name. We need to escape the problematic bytes, rather than strip them.
You definitely can't escape them with a CharRef:
http://www.w3.org/TR/REC-xml/#wf-Legalchar
Characters referred to using character references must match the
production for Char.
This time Ján is right :-)
Daniel
--
Daniel Veillard | Open Source and Standards, Red Hat
veillard(a)redhat.com | libxml Gnome XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | virtualization library http://libvirt.org/