On 11/01/2012 02:05 AM, Viktor Mihajlovski wrote:
> You were asking about VIR_NODEINFO_MAXCPUS:
>
> #define VIR_NODEINFO_MAXCPUS(nodeinfo) \
>   ((nodeinfo).nodes*(nodeinfo).sockets*(nodeinfo).cores*(nodeinfo).threads)
>
> I can confirm that virNodeGetInfo misbehaves if enough trailing cpus are
> offline, and therefore agree that we need to fix that API:
>
> Actually, this is where I initially started off, see the thread at
> https://www.redhat.com/archives/libvir-list/2012-October/msg00399.html
> and specifically Daniel's comments on compatibility.
> Obviously my first attempt was flawed/naive since it was breaking binary
> compatibility. One possible attempt to fix virNodeGetInfo would have
> been to change the meaning of the virNodeInfo.cpus field.

No, we can't change the meaning of the cpus field - it must remain the
number of active CPUs.

> But that would
> be semantically incorrect as the field is denominated as the number of
> active CPUs. Fixing the core/socket/thread detection doesn't seem
> possible using the sysfs interfaces.

Why not? We just proved with nodeGetCPUCount that it is possible to
determine the number of possible cpus even when some of the
cores/threads are offline. That just means our core/socket/thread
detection code needs to be aware of offline cpus, even if it can't
determine their complete topology, so that it at least doesn't
underestimate the number of possible cores.
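
To make that concrete, here is a rough sketch of one way to get an upper
bound that does not depend on which cpus are online. It assumes the
kernel exposes a cpu list such as "0-3,6,8-11" (for example the sysfs
file /sys/devices/system/cpu/possible on Linux); the helper name
max_cpus_from_list is made up for illustration and is not libvirt code.

```c
#include <ctype.h>
#include <stdlib.h>

/*
 * Hypothetical helper: given a cpu list such as "0-3,6,8-11", return
 * the highest cpu id plus one, or -1 if the list is empty or malformed.
 * Offline cpus still appear in a "possible"-style list, so the result
 * does not shrink when trailing cpus are taken offline.
 */
static int max_cpus_from_list(const char *list)
{
    const char *p = list;
    int max_id = -1;

    while (*p && *p != '\n') {
        char *end;
        long first, last;

        if (!isdigit((unsigned char)*p))
            return -1;
        first = strtol(p, &end, 10);
        last = first;
        p = end;

        if (*p == '-') {            /* a range "first-last" */
            p++;
            if (!isdigit((unsigned char)*p))
                return -1;
            last = strtol(p, &end, 10);
            p = end;
        }
        if (last < first)
            return -1;
        if (last > max_id)
            max_id = (int)last;

        if (*p == ',')              /* another entry follows */
            p++;
        else if (*p && *p != '\n')  /* unexpected character */
            return -1;
    }
    return max_id < 0 ? -1 : max_id + 1;
}
```

With this, a list of "0-7" gives 8 regardless of how many of those cpus
are currently offline, which is the lower bound the topology product
should never fall below.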

> Now, the next straightforward thing might have been a virNodeGetInfo2
> or similar but I thought an API for the host CPU map might be more
> versatile in the long run.

Both APIs are useful. We're a bit constrained with virNodeGetInfo, but
I still think that it should return the right value for
VIR_NODEINFO_MAXCPUS.
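
To illustrate why the product matters, here is a small stand-alone
sketch. The struct and macro below are stand-ins that mirror the
topology fields used by VIR_NODEINFO_MAXCPUS; the names and the numbers
are assumptions picked for illustration, not measurements from a real
host.

```c
/* Hypothetical stand-in for the topology fields of virNodeInfo. */
struct topo {
    unsigned int nodes;
    unsigned int sockets;
    unsigned int cores;
    unsigned int threads;
};

/* Same arithmetic as VIR_NODEINFO_MAXCPUS. */
#define TOPO_MAXCPUS(t) \
    ((t).nodes * (t).sockets * (t).cores * (t).threads)

/* Function wrapper around the macro, convenient for callers. */
static unsigned int topo_maxcpus(struct topo t)
{
    return TOPO_MAXCPUS(t);
}

/*
 * A client that sizes a cpu map from the product can only address
 * cpu ids strictly below it.
 */
static int cpu_id_fits(struct topo t, unsigned int cpu_id)
{
    return cpu_id < TOPO_MAXCPUS(t);
}
```

Suppose a host really has 1 node, 1 socket, 4 cores and 2 threads, so
the product is 8. If detection only counts 2 cores because the trailing
cpus are offline, the product drops to 4, and a client sizing a map
from it cannot represent cpu 7 even though that cpu could come back
online.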

> All that said ... it seems that we have to live with the flawed
> semantics of VIR_NODEINFO_MAXCPUS.

I'm not convinced of that.

> This series should alleviate this
> problem while still retaining the old semantics (by means of fallback)
> even if a new client talks to an old server.

Implementation-wise, a newer client KNOWS that if the new interface is
present, the cpu count is accurate (well, that won't be true until we
fix the new interface to use fallbacks on RHEL 5, so getting that to
work on RHEL 5 is my priority today before 1.0.0 is cut). We also know
that if the new interface is not present, then virNodeGetInfo might
underestimate the number of cpus, but there is nothing we can do about
it, because it is an older libvirt; it is the best we can do, so we
must use the fallback.
However, for older clients connecting to newer libvirt, where the older
client only knows how to use the older interface, we might as well make
newer libvirt give them correct information.
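
The client-side decision could be sketched roughly as follows. This is
only an outline under stated assumptions: host_cpu_count and its
parameters are hypothetical, map_result stands for whatever the newer
cpu-map query returned (negative meaning the server does not support
it), and legacy_max stands for the VIR_NODEINFO_MAXCPUS product
computed from virNodeGetInfo.

```c
/*
 * Hypothetical client-side selection between the two sources.
 *
 * map_result: cpu count from the newer cpu-map interface, or a
 *             negative value if the server lacks that interface.
 * legacy_max: VIR_NODEINFO_MAXCPUS-style product from the older
 *             interface; may underestimate on an older server.
 * exact:      if non-NULL, set to 1 when the newer interface supplied
 *             the count, 0 when the fallback was used.
 */
static int host_cpu_count(int map_result, int legacy_max, int *exact)
{
    if (map_result >= 0) {
        /* New interface present: trust its count. */
        if (exact)
            *exact = 1;
        return map_result;
    }

    /* Older server: the legacy product is the best we can do. */
    if (exact)
        *exact = 0;
    return legacy_max;
}
```

The exact flag is there only so a caller can tell which path was taken;
whether a real client needs that distinction is a design choice I am
not making here.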
--
Eric Blake eblake(a)redhat.com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org