On 2012-10-15 14:22, Hu Tao wrote:
On Fri, Oct 12, 2012 at 01:27:27PM +0800, Osier Yang wrote:
> These 3 elements conflict with each other, either in the doc
> or in the underlying code. This is to propose a solution.
>
> Before writing any code, I want to see if the principle is
> correct; any advice is welcome.
>
> Current problems:
>
> Problem 1:
>
> The doc shouldn't simply say "These settings are superseded
> by CPU tuning." for element <vcpu>. Besides the tuning, <vcpu>
> also specifies the current and maximum vcpu number. Apart from that,
> <vcpu> also allows specifying the placement as "auto", which binds
> the domain process to the advisory nodeset from numad.
>
> Problem 2:
>
> Doc for <vcpu> says its "cpuset" specifies the physical CPUs
> that the vcpus can be pinned to. But that's not true, as
> it actually only pins the domain process to the specified physical
> CPUs. So it's either a documentation bug or a code bug.
>
> Problem 3:
>
> Doc for <vcpupin> says it supersedes "cpuset" of <vcpu>. That's
> not quite correct, as each <vcpupin> specifies the pinning policy
> for only one vcpu. What about the ones which don't have
> <vcpupin> specified? It says the vcpu will be pinned to all
> available physical CPUs, but what's the meaning of attribute
> "cpuset" of <vcpu> then?
>
> Problem 4:
>
> Doc for <emulatorpin> says it pins the emulator threads (the domain
> process in other contexts; perhaps a follow-up patch to
> clean up the inconsistency is needed) to the physical CPUs
> specified by its attribute "cpuset". This conflicts with
> <vcpu>'s "cpuset". And actually, in the underlying code,
> the affinity for the domain process is set twice if both
> "cpuset" for <vcpu> and <emulatorpin> are specified,
> and <emulatorpin>'s pinning will override <vcpu>'s.
>
> Problem 5:
>
> When "placement" of <vcpu> is "auto" (i.e. numad is used to
> get the advisory nodeset to which the domain process is
> pinned), it will also be overridden by <emulatorpin>.
>
> This patch is trying to sort out the conflicts or bugs by:
>
> 1) Don't say <vcpu> is superseded by <cputune>
You mean in the documentation of XML format?
Yes.
Actually the VCPU placement settings of <vcpu> will be overridden
by those of <cputune>. So I think it's better to keep the wording in
the doc to make users aware of this.
The problem is that <vcpu> doesn't only define the vcpu affinities. And
in the new design (see below), "cpuset" of <vcpu> defines
the **default** placement for both the domain process (emulator threads
in emulatorpin context) and the vcpu threads. Vcpus which don't
have <vcpupin> specified still inherit the default placement.
The domain process will also be pinned to the default placement if
<emulatorpin> is not specified.
>
> 2) Keep the semantics for "cpuset" of <vcpu> (i.e. still say it
> specifies the physical CPUs the virtual CPUs can be pinned to), but
> modify it to mention that it also sets the pinning policy for the
> domain process. The CPU placement of the domain process specified by
> "cpuset" of <vcpu> will be ignored if <emulatorpin> is specified, and
> similarly, the CPU placement of a vcpu thread will be ignored
> if it has <vcpupin> specified. A vcpu which doesn't have
> <vcpupin> specified inherits "cpuset" of <vcpu>.
OK.
>
> 3) Don't say <vcpu> is superseded by <vcpupin>. If neither <vcpupin>
> nor "cpuset" of <vcpu> is specified, the vcpu will be pinned
> to all available pCPUs.
OK.
>
> 4) If neither <emulatorpin> nor "cpuset" of <vcpu> is specified,
> the domain process (emulator threads in the context) will be
> pinned to all available pCPUs.
OK.
>
> 5) If "placement" of <vcpu> is "auto", <emulatorpin> is not allowed.
Conflicts with 2). Why not just override the emulator part? For vcpu
threads the "placement" is still "auto".
In the final patch, <emulatorpin> is ignored if vcpu placement
is "auto" when parsing.
>
> 6) Hotplugged vcpus will also inherit "cpuset" of <vcpu>
OK.
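
The rules 2), 3), 4) and 6) above can be sketched as a small resolution
function. This is only an illustrative Python model, not libvirt code:
the name resolve_pinning and its parameters are hypothetical, CPU sets are
modeled as plain frozensets, and the "auto" placement case is deliberately
left out.

```python
from typing import Dict, FrozenSet, Optional, Tuple

CpuSet = FrozenSet[int]


def resolve_pinning(
    host_cpus: CpuSet,
    n_vcpus: int,
    vcpu_cpuset: Optional[CpuSet] = None,
    vcpupin: Optional[Dict[int, CpuSet]] = None,
    emulatorpin: Optional[CpuSet] = None,
) -> Tuple[Dict[int, CpuSet], CpuSet]:
    """Return (per-vcpu pinning, domain process pinning).

    Hypothetical model of the proposed rules, static placement only.
    """
    vcpupin = vcpupin or {}
    # Default placement: "cpuset" of <vcpu>, else all available pCPUs.
    default = vcpu_cpuset if vcpu_cpuset is not None else host_cpus
    # A vcpu with its own <vcpupin> uses it; the others inherit the default.
    vcpus = {i: vcpupin.get(i, default) for i in range(n_vcpus)}
    # <emulatorpin> takes precedence over the default for the domain process.
    emulator = emulatorpin if emulatorpin is not None else default
    return vcpus, emulator


if __name__ == "__main__":
    host = frozenset(range(8))
    vcpus, emulator = resolve_pinning(
        host,
        n_vcpus=4,
        vcpu_cpuset=frozenset({0, 1, 2, 3}),
        vcpupin={0: frozenset({0})},
        emulatorpin=frozenset({4, 5}),
    )
    for idx in sorted(vcpus):
        print(f"vcpu {idx}: {sorted(vcpus[idx])}")
    print(f"emulator: {sorted(emulator)}")
```

Under this model a hotplugged vcpu is just a larger n_vcpus: the new index
has no <vcpupin> entry, so it falls back to the default placement.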
>
> Code changes that the above document changes will cause:
>
> 1) Inherit def->cpumask for each vcpu which doesn't have <vcpupin>
> specified, during parsing.
OK.
>
> 2) ping the vcpu which doesn't have <vcpupin> specified to def->cpumask
s/ping/pin/
> either by cgroup or sched_setaffinity(2), which is actually done
> by 1).
Pin each vcpu according to this_vcpu->cpumask, since the cpumask is either
inherited from def->cpumask (<vcpu>) or set by <vcpupin>.
Yeah, but this is already done by either cgroup or sched_setaffinity
when the domain starts. So no new code is needed.
>
> 3) Error out if "placement" == "auto", and <emulatorpin> is specified.
> Otherwise, <emulatorpin> is honored, and "cpuset" of <cpuset> is
> ignored.
You mean "cpuset" of<vcpu> here?
Right, a typo.
But I still don't understand why "placement" = "auto" and <emulatorpin>
cannot both exist, with the latter overriding the former.
I think we have agreement on the principle: we must use either
"placement == auto" or <emulatorpin> to set the affinity for the
domain process, but not both, right?
Based on that agreement, there are two ways: one is to ignore one
of them inside the driver, the other is to ignore it when parsing the conf.
The latter is better, as other drivers could support "placement"
and "emulatorpin" in the future; doing the ignoring inside
each driver is duplicate work, and we need a general rule
about the relationship between them for the doc.
In case we don't have agreement that we can't use both of them
to set the affinities, the reason is:
numad is likely to manage the affinity for the domain process
dynamically in the future, and it's unpredictable to override the
advisory affinity from numad with cgroup afterwards.
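
To illustrate the "ignore when parsing" option, here is a minimal sketch
in Python. It is not the libvirt parser (which is written in C): the
function name drop_conflicting_emulatorpin is hypothetical, and the sample
XML only uses the <vcpu>, <cputune>, <vcpupin> and <emulatorpin> elements
discussed in this thread.

```python
import xml.etree.ElementTree as ET


def drop_conflicting_emulatorpin(domain_xml: str) -> str:
    """Drop <emulatorpin> when <vcpu placement='auto'> is set.

    Hypothetical parse-time normalization, done once in the generic
    conf layer so each driver doesn't have to repeat it.
    """
    root = ET.fromstring(domain_xml)
    vcpu = root.find("vcpu")
    cputune = root.find("cputune")
    if (
        vcpu is not None
        and cputune is not None
        and vcpu.get("placement") == "auto"
    ):
        # numad's advisory nodeset decides the domain process affinity,
        # so any explicit <emulatorpin> is ignored.
        for pin in cputune.findall("emulatorpin"):
            cputune.remove(pin)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    sample = (
        "<domain>"
        "<vcpu placement='auto'>2</vcpu>"
        "<cputune>"
        "<vcpupin vcpu='0' cpuset='0'/>"
        "<emulatorpin cpuset='1-3'/>"
        "</cputune>"
        "</domain>"
    )
    print(drop_conflicting_emulatorpin(sample))
```

With placement set to anything other than "auto", the sketch leaves
<emulatorpin> untouched, and <vcpupin> is kept in both cases.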
Regards,
Osier