On 11/06/20 10:09, Erik Skultety wrote:
On Wed, Jun 10, 2020 at 03:55:14PM +0200, Paulo de Rezende Pinatti
wrote:
>
> On 08/06/20 16:35, Erik Skultety wrote:
>> On Fri, May 29, 2020 at 12:10:03PM +0200, Paulo de Rezende Pinatti wrote:
>>> Introduce two utility functions to parse a kernel command
>>> line string according to the kernel code parsing rules in
>>> order to enable the caller to perform operations such as
>>> verifying whether certain argument=value combinations are
>>> present or retrieving an argument's value.
>>>
>>> Signed-off-by: Paulo de Rezende Pinatti <ppinatti(a)linux.ibm.com>
>>> Signed-off-by: Boris Fiuczynski <fiuczy(a)linux.ibm.com>
>>> ---
>>> src/libvirt_private.syms | 2 +
>>> src/util/virutil.c | 169 +++++++++++++++++++++++++++++++++++++++
>>> src/util/virutil.h | 17 ++++
>>> tests/utiltest.c | 141 ++++++++++++++++++++++++++++++++
>>> 4 files changed, 329 insertions(+)
>>>
>>> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
>>> index a6af44fe1c..a206a943c5 100644
>>> --- a/src/libvirt_private.syms
>>> +++ b/src/libvirt_private.syms
>>> @@ -3433,6 +3433,8 @@ virHostGetDRMRenderNode;
>>> virHostHasIOMMU;
>>> virIndexToDiskName;
>>> virIsDevMapperDevice;
>>> +virKernelCmdlineMatchParam;
>>> +virKernelCmdlineNextParam;
>>> virMemoryLimitIsSet;
>>> virMemoryLimitTruncate;
>>> virMemoryMaxValue;
>>> diff --git a/src/util/virutil.c b/src/util/virutil.c
>>> index fb46501142..749c9d7116 100644
>>> --- a/src/util/virutil.c
>>> +++ b/src/util/virutil.c
>>> @@ -1725,6 +1725,175 @@ virHostGetDRMRenderNode(void)
>>> return ret;
>>> }
>>>
>>> +
>>> +static const char *virKernelCmdlineSkipDbQuote(const char *cmdline,
>>
>> minor nitpick: we can call ^this simply ...SkipQuote, we don't need to be so
>> specific about it being a double quotation mark, do we?
>>
>
> Sure, changed to SkipQuote.
>
>>> + bool *is_quoted)
>>> +{
>>> + if (cmdline[0] == '"') {
>>> + *is_quoted = !(*is_quoted);
>>> + cmdline++;
>>> + }
>>> + return cmdline;
>>> +}
>>> +
>>> +
>>> +static size_t virKernelCmdlineSearchForward(const char *cmdline,
>>> + bool *is_quoted,
>>> + bool include_equal)
>>
>> Hmm, what if instead we tried to find and return the index of the '='
character
>> but iterated all the way until the next applicable (i.e. taking quotation into
>> account) space and saved that end of arg/arg=val parameter into **res. The
>> caller of this function would then continue directly from *res with the next
>> arg/arg=val parameter.
>> We could then call this something like virKernelCmdlineFindEqual and return -1
>> if there is no '=' sign, indicating that it's a standalone parameter
with no
>> value.
>>
>
> Yes, we can do that. It would look like this:
>
> size_t index;
nitpick: ^this is a standard iteration variable, so naming it "i" might look
more "expected"
Changed.
> size_t equal_index = 0;
>
> for (index = 0; cmdline[index]; index++) {
> if (!(is_quoted) && g_ascii_isspace(cmdline[index]))
> break;
> if (equal_index == 0 && cmdline[index] == '=') {
> equal_index = index;
> continue;
> }
> virKernelCmdlineSkipQuote(cmdline + index, &is_quoted);
> }
> *res = cmdline + index;
> return equal_index;
>
>
> I found it more convenient to use 0 rather than -1 so that we can also
> handle the case when the argument itself starts with the equal sign.
Yes, that's fine, but please provide a doc string for this function mentioning
this.
Will do.
...
>
> So the function would look like this now:
>
> virSkipSpaces(&cmdline);
> cmdline = virKernelCmdlineSkipQuote(cmdline, &is_quoted);
> equal_index = virKernelCmdlineFindEqual(cmdline, is_quoted, &next);
>
> if (next == cmdline)
> return next;
>
> /* param has no value */
> if (equal_index == 0) {
> if (is_quoted && next[-1] == '"')
> *param = g_strndup(cmdline, next - cmdline - 1);
> else
> *param = g_strndup(cmdline, next - cmdline);
> return next;
> }
>
> *param = g_strndup(cmdline, equal_index);
>
> if (cmdline[equal_index + 1] == '"') {
> is_quoted = true;
> equal_index++;
> }
>
> if (is_quoted && next[-1] == '"')
> *val = g_strndup(cmdline + equal_index + 1,
> next - cmdline - equal_index - 2);
> else
> *val = g_strndup(cmdline + equal_index + 1,
> next - cmdline - equal_index - 1);
> return next;
>
> There's still a bit of handling required due to the kernel's quoting rules.
> I took the liberty of using 'next' in place of 'res' as I think it
conveys
> better its purpose. Let me know if that looks good to you.
That is fine, looks good.
>
>>> +}
>>> +
>>> +
>>> +#define VIR_CMDLINE_STR_CMP(kernel_val, caller_val, flags) \
>>> + (((flags & VIR_KERNEL_CMDLINE_FLAGS_CMP_EQ) && \
>>> + STREQ(kernel_val, caller_val)) || ((flags &
VIR_KERNEL_CMDLINE_FLAGS_CMP_PREFIX) && \
>>> + STRPREFIX(kernel_val,
caller_val)))
>>> +
>>> +
>>> +/*
>>> + * Try to match the provided kernel cmdline string with the provided @arg
>>> + * and the list @values of possible values according to the matching
strategy
>>> + * defined in @flags. Possible options include:
>>> + * - VIR_KERNEL_CMDLINE_FLAGS_CMP_PREFIX: do a substring comparison of
values
>>
>> I know you used it in the following patches to match against the accepted
>> values, but do we really need to match with a prefix, I'd be fine with
simple
>> stirng equality matching in all cases and not mix the matching strategy for
>> both values and kernel arguments.
>>
>
> The prefix based comparison is to make sure libvirt matches exactly like the
> kernel code does for the parameter prot_virt (see arch/s390/kernel/uv.c ->
> prot_virt_setup -> kstrtobool) which performs prefix based matching. Using
> equality matching should work for most cases but there would be certain
> corner cases missed (i.e. prot_virt=yfoo will be recognized as 'activate' by
> the kernel but not by libvirt).
That..is...interesting...
>
> If you think such cases are not worth the trouble I can remove the PREFIX
> flag, let me know what you prefer.
Well, given that kernel supposedly recognizes "yfoo" as "on" just
because of
the prefix, we obviously can't neglect the PREFIX matching can we? If we did
and someone indeed used such a horrible example of a cmdline,
virt-host-validate would lie about the feature not being enabled (even though
it would be) because libvirt would only do an equality match, so it has to stay.
<reprove>bad kernel, bad kernel</reprove>
Just to re-assure myself, that's a generic behavior used for all option parsing
in the kernel not just prot_virt, right? If so, then contrary to my previous
response, we should always use and default to prefix matching, because given
the situation equality matching would clearly not be a satisfactory behaviour.
No, actually other parameters might use equality matching, this is the
case with 'mem_encrypt' (see arch/x86/mm/mem_encrypt_identity.c ->
sme_enable -> strncmp at line 575) which was being checked for SEV in
the first version of this series. I think we are fine with the current
approach as we allow the caller to choose which matching strategy to use
by specifying the appropriate flag.
...
>
>
>>> + * - VIR_KERNEL_CMDLINE_FLAGS_SEARCH_STICKY: first positive match satifies
search
>>
>> is "sticky searching" some terminus technicus? If not, we probably
should name
>> this FLAGS_MATCH_FIRST and FLAGS_MATCH_LAST respectively.
>>
>
> I guess the flag explanation was not clear enough, sticky actually means
> "stop as soon as you find a match, no matter the order", so
Well, isn't "stop as soon as you find a match" == match first occurrence?
I suppose one can also see that way :) I will thus use SEARCH_FIRST as
requested.
> FLAGS_SEARCH_FIRST doesn't really fit here. A better name
could be
> FLAGS_SEARCH_ANY. Let me know if you agree.
>
...
>>>
>>> +struct testKernelCmdlineNextParamData
>>> +{
>>> + const char *cmdline;
>>> + const char *param;
>>> + const char *val;
>>> + const char *next;
>>> +};
>>> +
>>> +static struct testKernelCmdlineNextParamData kEntries[] = {
>>> + { "arg1 arg2 arg3=val1",
"arg1", NULL, " arg2 arg3=val1" },
>>> + { "arg1=val1 arg2 arg3=val3 arg4",
"arg1", "val1", " arg2 arg3=val3
arg4" },
>>> + { "arg3=val3 ",
"arg3", "val3", " " },
>>> + { "arg3=val3",
"arg3", "val3", "" },
>>> + { "arg-3=val3 arg4",
"arg-3", "val3", " arg4" },
>>> + { "arg_3=val3 arg4",
"arg-3", "val3", " arg4" },
>>> + { " arg_3=val3 arg4",
"arg-3", "val3", " arg4" },
>>> + { " arg-3=val3 arg4",
"arg-3", "val3", " arg4" },
>>> + { "arg2=\"value with spaces\" arg3=val3",
"arg2", "value with spaces", " arg3=val3" },
>>> + { " arg2=\"value with spaces\" arg3=val3",
"arg2", "value with spaces", " arg3=val3" },
>>> + { " \"arg2=value with spaces\" arg3=val3",
"arg2", "value with spaces", " arg3=val3" },
>>> + { "arg2=\"val\"ue arg3",
"arg2", "val\"ue", " arg3" },
>>> + { " arg3\" escaped=val2\"",
"arg3\" escaped", "val2", "" },
>>
>> ^Is this even valid for the kernel itself? Looking at [1], they clearly
don't
>> allow escaped \" in the arg/value.
>>
>> [1]
https://github.com/torvalds/linux/blob/db54615e21419c3cb4d699a0b0aa16cc44...
>>
>
> I guess the word "escaped" in this test is a bit misleading; it's
actually
> escaping the blank space, not the quote itself. This is valid for the
> kernel. In order to assure our parsing results would match those of the
> kernel I executed the code in cmdline.c in a standalone file to generate the
> reference values for the test.
>
> I'll change "arg3\" escaped" to "arg3\" with
spaces" to clearly state the
> intention here.
Sure, that will help, although I wasn't relating to the word "escaped" in
my
reply. Nevermind though, my bad, I just couldn't wrap my head around seeing
"val\"ue" and somehow could not make sense out of it, too much QE to
process I
guess, it's fine.
Regards,
Erik
--
Best regards,
Paulo de Rezende Pinatti