[libvirt] [RFC] New API to retrieve node CPU map

Hi, in order to use the APIs listed below it is necessary for a client to know the maximum number of node CPUs which is passed via the maplen argument. virDomainGetEmulatorPinInfo virDomainGetVcpuPinInfo virDomainGetVcpus virDomainPinEmulator virDomainPinVcpu virDomainPinVcpuFlags The current approach uses virNodeGetInfo to determine the maximum CPU number. This can lead to incorrect results if not all node CPUs are online. The maximum CPU number should always be the number of CPUs present on the host, regardless of their online/offline state. The following example illustrates the issue: Host has 3 logical CPUs, 1 socket, 3 cores, 1 thread. Guest has 1 virtual CPU and is started while all 3 host CPUs are online. $ virsh vcpuinfo guest VCPU: 0 CPU: 0 State: running CPU time: 35.4s CPU Affinity: yyy $ echo 0 > /sys/devices/system/cpu/cpu1/online $ virsh vcpuinfo guest VCPU: 0 CPU: 0 State: running CPU time: 35.5s CPU Affinity: y- The correct display for CPU affinity would have been y-y, as the guest continues to use CPUs 0 and 2. This is not a display problem only, because it is also not possible to explicitly pin the virtual CPU to host CPUs 0 and 2, due to the truncated CPU mask. PROPOSAL: To help solve the issue above I suggest two new public API functions: int virNodeGetCpuNum(virConnectPtr conn); returning the number of present CPUs on the host or -1 upon failure and int virNodeGetCpuMap(virConnectPtr conn, unsigned char * cpumap, int maplen); returning the number of present CPUs or -1 on failure and storing a bit map of real CPUs as described in virDomainPinVcpu in cpumap. The bits in the bit map are set to 1 for online CPUs and set to 0 for offline CPUs. Implementation is facilitated by the function nodeGetCPUmap in nodeinfo.c. Clients can use virNodeGetCpuNum to properly determine the maximum number of node CPUs and the online/offline information. Thanks for your comments. -- Mit freundlichen Grüßen/Kind Regards Viktor Mihajlovski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On 10/10/2012 09:18 AM, Viktor Mihajlovski wrote:
Hi,
in order to use the APIs listed below it is necessary for a client to know the maximum number of node CPUs which is passed via the maplen argument.
virDomainGetEmulatorPinInfo virDomainGetVcpuPinInfo virDomainGetVcpus virDomainPinEmulator virDomainPinVcpu virDomainPinVcpuFlags
The current approach uses virNodeGetInfo to determine the maximum CPU number. This can lead to incorrect results if not all node CPUs are online. The maximum CPU number should always be the number of CPUs present on the host, regardless of their online/offline state.
This is not a display problem only, because it is also not possible to explicitly pin the virtual CPU to host CPUs 0 and 2, due to the truncated CPU mask.
Indeed, offline host cpus introduce lots of oddities. Your idea of a new API makes sense, although there's still some details to work out.
PROPOSAL:
To help solve the issue above I suggest two new public API functions:
int virNodeGetCpuNum(virConnectPtr conn);
returning the number of present CPUs on the host or -1 upon failure and
int virNodeGetCpuMap(virConnectPtr conn, unsigned char * cpumap, int maplen);
Lately, we have been favoring APIs that auto-allocate, instead of making the user call two separate APIs and risk a race where the answer from the first call is no longer matching reality during the second call. That is, if I hot[un]plug a CPU in between the two proposed API calls, will my maplen of the second call as learned from the first call still be accurate? And why should I have to call two functions? I think it might be better to write: int virNodeGetCpuMap(virConnectPtr conn, unsigned char **cpumap); where *cpumap will be malloc'd by libvirt and the return value be the maplen (also allow cpumap==NULL on entry to just query the map len without also allocating). Then the result is -1/maplen instead of -1/0 for failure/success.
returning the number of present CPUs or -1 on failure and storing a bit map of real CPUs as described in virDomainPinVcpu in cpumap. The bits in the bit map are set to 1 for online CPUs and set to 0 for offline CPUs.
Implementation is facilitated by the function nodeGetCPUmap in nodeinfo.c.
Clients can use virNodeGetCpuNum to properly determine the maximum number of node CPUs and the online/offline information.
Thanks for your comments.
Are you interested in writing patches to implement this new API, if others agree with my alternate single-API signature? -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 2012年10月11日 05:23, Eric Blake wrote:
On 10/10/2012 09:18 AM, Viktor Mihajlovski wrote:
Hi,
in order to use the APIs listed below it is necessary for a client to know the maximum number of node CPUs which is passed via the maplen argument.
virDomainGetEmulatorPinInfo virDomainGetVcpuPinInfo virDomainGetVcpus virDomainPinEmulator virDomainPinVcpu virDomainPinVcpuFlags
The current approach uses virNodeGetInfo to determine the maximum CPU number. This can lead to incorrect results if not all node CPUs are online. The maximum CPU number should always be the number of CPUs present on the host, regardless of their online/offline state.
This is not a display problem only, because it is also not possible to explicitly pin the virtual CPU to host CPUs 0 and 2, due to the truncated CPU mask.
Indeed, offline host cpus introduce lots of oddities. Your idea of a new API makes sense, although there's still some details to work out.
PROPOSAL:
To help solve the issue above I suggest two new public API functions:
int virNodeGetCpuNum(virConnectPtr conn);
returning the number of present CPUs on the host or -1 upon failure and
int virNodeGetCpuMap(virConnectPtr conn, unsigned char * cpumap, int maplen);
Lately, we have been favoring APIs that auto-allocate, instead of making the user call two separate APIs and risk a race where the answer from the first call is no longer matching reality during the second call. That is, if I hot[un]plug a CPU in between the two proposed API calls, will my maplen of the second call as learned from the first call still be accurate? And why should I have to call two functions?
I think it might be better to write:
int virNodeGetCpuMap(virConnectPtr conn, unsigned char **cpumap);
where *cpumap will be malloc'd by libvirt and the return value be the maplen (also allow cpumap==NULL on entry to just query the map len without also allocating). Then the result is -1/maplen instead of -1/0 for failure/success.
returning the number of present CPUs or -1 on failure and storing a bit map of real CPUs as described in virDomainPinVcpu in cpumap. The bits in the bit map are set to 1 for online CPUs and set to 0 for offline CPUs.
Implementation is facilitated by the function nodeGetCPUmap in nodeinfo.c.
Clients can use virNodeGetCpuNum to properly determine the maximum number of node CPUs and the online/offline information.
Thanks for your comments.
Are you interested in writing patches to implement this new API, if others agree with my alternate single-API signature?
Agreed. A single-API and allocating the cpumap inside libvirt without the need to known the maplen avoids the race. But a flags support won't bite us, assuming someone just want to get the cpumap of the onlined CPUs, or offlined CPUs. So how about: int virNodeGetCpuMap(virConnectPtr conn, unsigned char **cpumap, unsigned int flags); Regards, Osier

On 10/10/2012 10:02 PM, Osier Yang wrote:
Are you interested in writing patches to implement this new API, if others agree with my alternate single-API signature?
Agreed. A single-API and allocating the cpumap inside libvirt without the need to known the maplen avoids the race.
But a flags support won't bite us, assuming someone just want to get the cpumap of the onlined CPUs, or offlined CPUs.
Indeed, shame on me for not remember a flags argument for any possible future extensions.
So how about:
int virNodeGetCpuMap(virConnectPtr conn, unsigned char **cpumap, unsigned int flags);
Yes, that's better, even if flags must always be 0 for now. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 10/11/2012 06:06 AM, Eric Blake wrote:
On 10/10/2012 10:02 PM, Osier Yang wrote:
Are you interested in writing patches to implement this new API, if others agree with my alternate single-API signature?
Agreed. A single-API and allocating the cpumap inside libvirt without the need to known the maplen avoids the race.
But a flags support won't bite us, assuming someone just want to get the cpumap of the onlined CPUs, or offlined CPUs.
Indeed, shame on me for not remember a flags argument for any possible future extensions.
So how about:
int virNodeGetCpuMap(virConnectPtr conn, unsigned char **cpumap, unsigned int flags);
Yes, that's better, even if flags must always be 0 for now.
No problem to include flags for future extensions to be on the safe side. However, the map will contain all CPUs with the bits representing the online/offline state. If the intention is to quickly get the number of offline/online CPUS the following signature would be better IMO int virNodeGetCpuMap(virConnectPtr conn, unsigned char **cpumap, unsigned int *online, unsigned int *offline, unsigned int flags); -- Mit freundlichen Grüßen/Kind Regards Viktor Mihajlovski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On 10/11/2012 01:01 AM, Viktor Mihajlovski wrote:
No problem to include flags for future extensions to be on the safe side. However, the map will contain all CPUs with the bits representing the online/offline state. If the intention is to quickly get the number of offline/online CPUS the following signature would be better IMO
int virNodeGetCpuMap(virConnectPtr conn, unsigned char **cpumap, unsigned int *online, unsigned int *offline, unsigned int flags);
That starts to feel rather complex. On the other hand, you have a point that returning maplen (number of bytes allocated in the map) does NOT tell you how many cpus are online without doing a population count of the result; worse, it cannot tell you how many cpus are offline (a maplen of 1 is returned whether I have 4 cpus or 8 cpus, so if the map contains the value 0xb, does that mean I had a 4 cpu machine with cpu2 off, or an 8 cpu machine with cpu2 and cpu4-7 off?). So allowing the user to optionally ask how many cpus are online or offline might indeed also be useful, in addition to the maplen needed to return the bitmap. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 10/11/2012 02:27 PM, Eric Blake wrote:
That starts to feel rather complex.
well :-)
On the other hand, you have a point that returning maplen (number of bytes allocated in the map) does NOT tell you how many cpus are online without doing a population count of the result; worse, it cannot tell you how many cpus are offline (a maplen of 1 is returned whether I have 4 cpus or 8 cpus, so if the map contains the value 0xb, does that mean I had a 4 cpu machine with cpu2 off, or an 8 cpu machine with cpu2 and cpu4-7 off?). So allowing the user to optionally ask how many cpus are online or offline might indeed also be useful, in addition to the maplen needed to return the bitmap.
I actually thought of returning the number of CPUs, i.e. relevant bits which of course makes either 'online' or 'offline' redundant. The maplen - if of interest - can be computed by VIR_CPU_MAPLEN. -- Mit freundlichen Grüßen/Kind Regards Viktor Mihajlovski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On 10/11/2012 03:15 PM, Viktor Mihajlovski wrote:
On 10/11/2012 02:27 PM, Eric Blake wrote:
That starts to feel rather complex.
well :-)
On the other hand, you have a point that returning maplen (number of bytes allocated in the map) does NOT tell you how many cpus are online without doing a population count of the result; worse, it cannot tell you how many cpus are offline (a maplen of 1 is returned whether I have 4 cpus or 8 cpus, so if the map contains the value 0xb, does that mean I had a 4 cpu machine with cpu2 off, or an 8 cpu machine with cpu2 and cpu4-7 off?). So allowing the user to optionally ask how many cpus are online or offline might indeed also be useful, in addition to the maplen needed to return the bitmap.
I actually thought of returning the number of CPUs, i.e. relevant bits which of course makes either 'online' or 'offline' redundant. The maplen - if of interest - can be computed by VIR_CPU_MAPLEN.
I have a process question as the implementation I have posted on the list at https://www.redhat.com/archives/libvir-list/2012-October/msg00803.html would need a rebase because of upstream conflicts. Should I send out new rebased versions once in a while, even if I have no other changes or should I wait for initial feedback before producing a new patch version? Thanks. -- Mit freundlichen Grüßen/Kind Regards Viktor Mihajlovski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On 10/10/2012 11:23 PM, Eric Blake wrote:
Are you interested in writing patches to implement this new API, if others agree with my alternate single-API signature?
actually, I started to write up my proposal with the signature you suggested :-). A second thought was that having a function to retrieve the number only might be convenient here and there. And yes, I want to implement the outcome of this discussion. -- Mit freundlichen Grüßen/Kind Regards Viktor Mihajlovski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Martin Jetter Geschäftsführung: Dirk Wittkopp Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On 10/11/2012 12:46 AM, Viktor Mihajlovski wrote:
On 10/10/2012 11:23 PM, Eric Blake wrote:
Are you interested in writing patches to implement this new API, if others agree with my alternate single-API signature?
actually, I started to write up my proposal with the signature you suggested :-). A second thought was that having a function to retrieve the number only might be convenient here and there.
Retrieving only the number is useful, but it is also easy with a single API - have the API return the number always, and allow the user to pass NULL instead of the location where libvirt would normally malloc() the result. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 2012年10月10日 23:18, Viktor Mihajlovski wrote:
Hi,
in order to use the APIs listed below it is necessary for a client to know the maximum number of node CPUs which is passed via the maplen argument.
virDomainGetEmulatorPinInfo virDomainGetVcpuPinInfo virDomainGetVcpus virDomainPinEmulator virDomainPinVcpu virDomainPinVcpuFlags
The current approach uses virNodeGetInfo to determine the maximum CPU number. This can lead to incorrect results if not all node CPUs are online. The maximum CPU number should always be the number of CPUs present on the host, regardless of their online/offline state.
Agree. It doesn't make sense to show online CPU number on PowerPC either. It's better to show the CPU number of online and offline. When SMT is disabled on PowerPC, node info is not right either. And there is no relationship as x86 as the following: nodeinfo->nodes * nodeinfo->sockets * nodeinfo->cores * nodeinfo->threads = nodeinfo->cpus + offline It should be for PowerPC : nodeinfo->sockets * nodeinfo->cores * nodeinfo->threads = nodeinfo->cpus + offline
The following example illustrates the issue: Host has 3 logical CPUs, 1 socket, 3 cores, 1 thread. Guest has 1 virtual CPU and is started while all 3 host CPUs are online.
$ virsh vcpuinfo guest VCPU: 0 CPU: 0 State: running CPU time: 35.4s CPU Affinity: yyy
$ echo 0 > /sys/devices/system/cpu/cpu1/online
$ virsh vcpuinfo guest VCPU: 0 CPU: 0 State: running CPU time: 35.5s CPU Affinity: y-
The correct display for CPU affinity would have been y-y, as the guest continues to use CPUs 0 and 2. This is not a display problem only, because it is also not possible to explicitly pin the virtual CPU to host CPUs 0 and 2, due to the truncated CPU mask.
How is CPU mask truncated? CPU 2 is still online, right?
PROPOSAL:
To help solve the issue above I suggest two new public API functions:
int virNodeGetCpuNum(virConnectPtr conn);
returning the number of present CPUs on the host or -1 upon failure and
int virNodeGetCpuMap(virConnectPtr conn, unsigned char * cpumap, int maplen);
returning the number of present CPUs or -1 on failure and storing a bit map of real CPUs as described in virDomainPinVcpu in cpumap. The bits in the bit map are set to 1 for online CPUs and set to 0 for offline CPUs.
Implementation is facilitated by the function nodeGetCPUmap in nodeinfo.c.
Clients can use virNodeGetCpuNum to properly determine the maximum number of node CPUs and the online/offline information.
Thanks for your comments.
participants (4)
-
Eric Blake
-
Li Zhang
-
Osier Yang
-
Viktor Mihajlovski