Eric Blake <eblake@redhat.com> wrote on 09/27/2010 05:07:47 PM:

>
> On 09/27/2010 12:40 PM, Stefan Berger wrote:
> > Extend the nwfilter.rng schema to accept comment attributes for all protocol
> > types.
> >
> > Signed-off-by: Stefan Berger <stefanb@us.ibm.com>
> >
> >
> > +  <define name="comment-attribute">
> > +    <interleave>
> > +      <optional>
> > +        <attribute name="comment">
> > +          <ref name="comment-type"/>
> > +        </attribute>
> > +      </optional>
> > +    </interleave>
> > +  </define>
>
> Maybe I'm not understanding rng, but what is being interleaved here?  Do
> things still validate if "comment-attribute" does not contain an
> <interleave>?


It's not necessary from what I can see, so I removed it.

>
> > +
> > +  <define name='comment-type'>
> > +    <data type="string">
> > +      <param name="pattern">[a-zA-Z0-9`~\^!@#$%\-_+=|\\:";,./ \(\)\[\]\{\}&quot;&amp;&lt;&gt;&apos;]*</param>
>
> Since we are enforcing a maximum comment length of 256, would it make
> sense to use {0,256} rather than * (or is it \{0,256\} for this flavor
> of regular expression)?  This explicitly leaves out tabs; I guess that's


Correct.

> okay.  It also leaves out 8-bit bytes - could that be a problem for i18n
> where people want comments with native-language accented characters?

> That is, are we being too strict here?  Maybe a better pattern would be
> to reject specific non-printing ASCII bytes we want to avoid, assuming
> you can use escape sequences like [^\001]?


Looking at

http://www.asciitable.com/

I should probably include 0x20-0x7E as well as 128-175 and 224-238 (decimal), maybe even more? The regex would then become

[&#x20;-&#x7E;&#128;-&#175;&#224;-&#238;]{0,256}
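
A quick way to sanity-check a class like this outside the schema is a small Python sketch. It is only an approximation of what the RNG validator does: the names COMMENT_RE and is_valid_comment are made up for illustration and are not part of libvirt. It relies on two things: XML character references such as &#128; denote Unicode code points (so they are written as \x80 and so on), and a pattern facet has to match the whole value, which fullmatch() imitates.

```python
import re

# Python rendering of the proposed class:
#   &#x20;-&#x7E;  -> \x20-\x7E  (printable ASCII, no tab)
#   &#128;-&#175;  -> \x80-\xAF
#   &#224;-&#238;  -> \xE0-\xEE
# COMMENT_RE is a hypothetical name used only in this sketch.
COMMENT_RE = re.compile(r"[\x20-\x7E\x80-\xAF\xE0-\xEE]{0,256}")


def is_valid_comment(text: str) -> bool:
    """Return True if the whole string fits the class and length limit."""
    return COMMENT_RE.fullmatch(text) is not None


if __name__ == "__main__":
    samples = ["allow ssh", "a\tb", "x" * 257, "\u00e9", "\u00fc"]
    for s in samples:
        print(repr(s[:20]), is_valid_comment(s))
```

Under these assumptions a tab and a 257-character string are rejected, U+00E9 (233) is accepted, and U+00FC (252) is not, so the upper ranges may well need to be widened.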

   Stefan