Re: [libvirt] [Qemu-devel] [PATCH] vl.c: Support multiple CPU ranges on -numa option

26 Feb 2013


      On Tue, Feb 26, 2013 at 10:53:07AM +0100, Markus Armbruster wrote:
...
Eduardo Habkost <ehabkost@redhat.com> writes:
...
On Thu, Feb 21, 2013 at 09:23:22PM +0100, Markus Armbruster wrote:
...
Eduardo Habkost <ehabkost@redhat.com> writes:
...
This allows "," to be used a separator between each CPU range.  Note
that commas inside key=value command-line options have to be escaped
using ",,", so the command-line will look like:
-numa node,cpus=A,,B,,C,,D
This is really, really ugly, and an embarrassment to document.  Which
you didn't ;)
I was trying to have an intermediate solution using the current -numa
parser. I have patches in my queue that will change the code to properly
use QemuOpts later.
It would be interesting to support the "A,B,C,D" format in config files,
though, as it is simple and straighforward when no escaping is required.
Our config file syntax is in a Windows INI dialect: key=value lines
grouped into sections.  Our dialect requires values to be enclosed in
quotes.  Commonly, the quotes are optional.  Could be fixed.  It
supports multi-valued keys the common INI way: multiple key=value lines
for the same key, one per value
key = "A,B,C" works when the A, B, C can't contain commas.  Fine for a
list of numbers.  For long lists, we'd probably want to add a line
continuation feature.
Strings can contain commas, so you'd have to do something like key =
"A", "B", "C".  Whether that's still Windows INI is debatable.  More so
since there's already a common way to do it: one line per value.
I was only thinking about the -numa option problem. Having a more
generic solution would surely be even better.
...
If we decide INI doesn't meet our needs or desires for pretty syntax, we
should not extend it beyond its limits into QEMU's very own
configuration syntax.  We should switch to a common syntax that serves
our needs and desires.  For what it's worth, we already parse JSON.
I completely agree. But by now I just want to know what we should do
while we don't have a generic parser/syntax that can handle lists in a
pretty way. So:
...
For me, the INI way to do multi-valued keys is still fine.
Having multiple-valued keys (cpus=A,cpus=B,cpus=C) seems like the best
intermediate solution while we don't have a decent generic syntax.
Except that Anthony doesn't like it.

Anthony, care to explain why exactly you don't want it?
...
...
...
What about
-numa node,cpus=A,cpus=B,cpus=C,cpus=D
Looks better for the command-line usage, at least. I will give it a try.
...
Yes, QemuOpts lets you do that. Getting all the values isn't as easy as
it could be (unless you use Laszlo's opt-visitor), but that could be
improved.
Guess what: -numa doesn't even use QemuOpts, and I am not sure the
current format of -numa will allow QemuOpts to be used easily. I expect
the proper solution using QemuOpts to involve having a
standards-compliant "numa-node" config section instead of this weird
"-numa <type>,..." format where the only valid <type> that ever existed
was "node".
This is the current -numa syntax, as far as I can tell:
-numa node,KEY=VALUE,...
Recognized KEY=VALUE:
nodeid=UINT
    mem=SIZE
    cpus=[|UINT|UINT-UINT]
Unrecognized KEYs are silently ignored.
This should fit into QemuOpts just fine.  Sketch:
static QemuOptsList qemu_numa_opts = {
    .name = "numa",
    .implied_opt_name = "type"
    .head = QTAILQ_HEAD_INITIALIZER(qemu_rtc_numa.head),
    .desc = {
        {
            .name = "type",
            .type = QEMU_OPT_STRING,
            .help = "node type"
        },
The "node" part is not a "node type", it is an "numa option type", and
the only valid "option type" today is "node" (which is what makes the
current syntax seem weird to me).

I would simply drop the "numa" part from the command-line argument and
name the new config section "numa-node". I will send patches to do that,
later.
...
{
            .name = "nodeid",
            .type = QEMU_OPT_NUMBER,
            .help = "node ID"
        }, {
            .name = "mem",
            .type = QEMU_OPT_SIZE,
            .help = "memory size"
        }, {
I need to double-check that QEMU_OPT_SIZE has exactly the same behavior
of the ad-hoc parser, first.
...
.name = "cpus",
            .type = QEMU_OPT_STRING,
            .help = "CPU range"
        },
        { /* end of list */ }
    },
};
type = qemu_opt_get(opts);
    if (!type || strcmp(type, "node)) {
        // error
    }
    // get and record nodeid, mem
    // get, parse and record cpus
This rejects unrecognized keys, unlike the current code.  Declare bug
fix ;)
Good.  :-)
...
To support discontinuous CPU sets, simply get all values of key "cpus".
I think I have an unfinished work branch that did that. But Paolo also
have a similar patch on his tree that does the conversion to QemuOpts in
a much simpler way.

In either case, first I need to check if QemuOpts will match the ad-hoc
parser behavior, because they use different integer/size parser
functions.
...
...
But I believe it will be feasible to allow "cpus=A,cpus=B" using the
current parser, before we convert to a proper QemuOpts-based
implementaiton.
...
...
Note that the following format, currently used by libvirt:
-numa nodes,cpus=A,B,C,D
will _not_ work yet, as "," is the option separator for the command-line
option parser, and it will require changing the -numa option parsing
code to handle "cpus" as a special case.
No way.
Agreed.  :-)
The bad news is that libvirt uses this format since forever, this format
never worked, and nobody ever noticed that this was broken.
The good news is that it never worked, which simplifies our backward
compatibility worries.
Note that only the multiple-CPU-ranges use-case is broken. It works if
all NUMA nodes have simple "A-B" CPU ranges.

-- 
Eduardo