On Thu, 2016-06-25 at 12:03 +0530, Shivaprasad G Bhat wrote:
Hi Shivaprasad,
here are a few comments about your patch.
[...]
> +# if HAVE_LINUX_KVM_H
> +# include <linux/kvm.h
This line should be moved to the top of the file, with all the other
#include directives.
> + kvmfd = open("/dev/kvm", O_RDONLY);
> + if (kvmfd >= 0) {
> +# ifdef KVM_CAP_PPC_SMT
> + valid_ppc64_kvmhost_mode = true;
> + /* For Phyp and KVM based guests the ioctl for
> KVM_CAP_PPC_SMT
> + * returns zero and both primary and secondary threads
> will be
> + * online. Dont try to alter or interpret anything here.
> + */
> + threads_per_subcore = ioctl(kvmfd, KVM_CHECK_EXTENSION,
> KVM_CAP_PPC_SMT);
> + if (threads_per_subcore == 0)
> + valid_ppc64_kvmhost_mode = false;
> +# endif
> + }
> + VIR_FORCE_CLOSE(kvmfd);
> +# endif
> + rewinddir(cpudir);
> + lastonline = -1;
> + while (valid_ppc64_kvmhost_mode &&
> + ((direrr = virDirRead(cpudir, &cpudirent, node))
> 0)) {
> + if (sscanf(cpudirent->d_name, "cpu%u", &cpu) != 1)
> + continue;
> + if ((online = virNodeGetCpuValue(node, cpu, "online",
> 1)) < 0)
> + goto cleanup;
> +
> + if (online) {
> + if (lastonline != -1 &&
> + (cpu - lastonline) < threads_per_subcore) {
> + valid_ppc64_kvmhost_mode = false; /* if any of
> secondaries
> + * online */
If a comment is long enough to span two lines, it should probably not
start at the end of a statement but rather be above it.
+ break;
+ }
+ lastonline = cpu;
+ }
+ }
+ }
+
Enclosing this whole block within the #if and #ifdef directives would
make
the code cleaner; on the other hand, since the skeleton of the loop is
identical to the one a few lines above, maybe you could move the part
reading from /dev/kvm closer to the top and merge the two loops
together?
> /* allocate cpu maps for each socket */
> if (VIR_ALLOC_N(core_maps, sock_max) < 0)
> goto cleanup;
> @@ -473,6 +539,7 @@ virNodeParseNode(const char *node,
> /* iterate over all CPU's in the node */
> rewinddir(cpudir);
> + lastonline = -1;
> while ((direrr = virDirRead(cpudir, &cpudirent, node)) > 0) {
> if (sscanf(cpudirent->d_name, "cpu%u", &cpu) != 1)
> continue;
> @@ -480,6 +547,17 @@ virNodeParseNode(const char *node,
> if ((online = virNodeGetCpuValue(node, cpu, "online", 1)) <
> 0)
> goto cleanup;
> + if (ARCH_IS_PPC64(arch) &&
valid_ppc64_kvmhost_mode) {
valid_ppc64_kvmhost_mode can be true iff ARCH_IS_PPC64(arch), so this
condition is kinda redundant.
+ if (online) {
+ lastonline = cpu;
+ } else if (lastonline != -1 &&
+ (cpu-lastonline) < threads_per_subcore) {
+ processors++; /* Count only those secondaries whose
primary
+ subcore is online */
Same as above.
> + continue;
> + }
> + }
> +
> if (!online) {
> (*offline)++;
> continue;
> @@ -528,6 +606,13 @@ virNodeParseNode(const char *node,
> *cores = core;
> }
> + if (ARCH_IS_PPC64(arch) &&
valid_ppc64_kvmhost_mode) {
Same as above.
> + /* The actual host threads per core is
> + * multiple of the subcores_per_core(i.e variable *threads
> in this case)
> + * and threads_per_subcore.
> + */
> + *threads = *threads * threads_per_subcore;
> + }
> ret = processors;
> cleanup:
Overall, the code makes sense and seems to work from my limited
testing.
However, this change completely breaks the topology information
reported
by `virsh capabilities`. Here's the output on a host with
subcores_per_core=1, without the patch:
<topology
<cells num='4'
<cell id='0'
<memory unit='KiB'>67108864</memory
<pages unit='KiB' size='64'>1048576</pages
<pages unit='KiB'
size='16384'>0</pages
<pages unit='KiB' size='16777216'>0</pages
<distances
<sibling id='0' value='10'/
<sibling id='1' value='20'/
<sibling id='16' value='40'/
<sibling id='17' value='40'/
</distances
<cpus
num='5'
<cpu id='0'
socket_id='0' core_id='32' siblings='0'/
<cpu id='8' socket_id='0'
core_id='40' siblings='8'/
<cpu id='16' socket_id='0' core_id='48'
siblings='16'/
<cpu id='24'
socket_id='0' core_id='96' siblings='24'/
<cpu id='32' socket_id='0'
core_id='104' siblings='32'/
</cpus
</cell
<cell id='1'
<memory unit='KiB'>67108864</memory
<pages unit='KiB' size='64'>1048576</pages
<pages unit='KiB'
size='16384'>0</pages
<pages unit='KiB' size='16777216'>0</pages
<distances
<sibling id='0' value='20'/
<sibling id='1' value='10'/
<sibling id='16' value='40'/
<sibling id='17' value='40'/
</distances
<cpus
num='5'
<cpu id='40'
socket_id='1' core_id='160' siblings='40'/
<cpu id='48' socket_id='1'
core_id='168' siblings='48'/
<cpu id='56' socket_id='1' core_id='176'
siblings='56'/
<cpu id='64'
socket_id='1' core_id='224' siblings='64'/
<cpu id='72' socket_id='1'
core_id='232' siblings='72'/
</cpus
</cell
<cell id='16'
<memory unit='KiB'>67108864</memory
<pages unit='KiB' size='64'>1048576</pages
<pages unit='KiB'
size='16384'>0</pages
<pages unit='KiB' size='16777216'>0</pages
<distances
<sibling id='0' value='40'/
<sibling id='1' value='40'/
<sibling id='16' value='10'/
<sibling id='17' value='20'/
</distances
<cpus
num='5'
<cpu id='80'
socket_id='16' core_id='2088' siblings='80'/
<cpu id='88' socket_id='16'
core_id='2096' siblings='88'/
<cpu id='96' socket_id='16' core_id='2144'
siblings='96'/
<cpu id='104'
socket_id='16' core_id='2152'
siblings='104'/
<cpu id='112'
socket_id='16' core_id='2160'
siblings='112'/
</cpus
</cell
<cell id='17'
<memory
unit='KiB'>67108864</memory
<pages unit='KiB' size='64'>1048576</pages
<pages unit='KiB'
size='16384'>0</pages
<pages unit='KiB' size='16777216'>0</pages
<distances
<sibling id='0' value='40'/
<sibling id='1' value='40'/
<sibling id='16' value='20'/
<sibling id='17' value='10'/
</distances
<cpus
num='5'
<cpu id='120'
socket_id='17' core_id='2208'
siblings='120'/
<cpu id='128'
socket_id='17' core_id='2216'
siblings='128'/
<cpu id='136'
socket_id='17' core_id='2272'
siblings='136'/
<cpu id='144'
socket_id='17' core_id='2280'
siblings='144'/
<cpu id='152'
socket_id='17' core_id='2288'
siblings='152'/
</cpus
</cell
</cells
</topology
With the patch:
<topology
<cells num='1'
<cell id='0'
<memory unit='KiB'>263635136</memory
<cpus num='160'
<cpu id='0'
socket_id='0' core_id='0' siblings='0'/
<cpu id='8' socket_id='0'
core_id='1' siblings='8'/
<cpu id='16' socket_id='0' core_id='2'
siblings='16'/
<cpu id='24'
socket_id='0' core_id='3' siblings='24'/
<cpu id='32' socket_id='0'
core_id='4' siblings='32'/
<cpu id='0'/
<!-- The above line
is repeated 155 times --
</cpus
</cell
</cells
</topology
This issue needs to be addressed before the changes can be considered
for inclusion. I'd also highly recommend writing tests cases in order
to prevent regressions in the future.
Cheers.
--
Andrea Bolognani
Software Engineer - Virtualization Team