Hi Andrea,
On 07/20/2017 09:16 PM, Andrea Bolognani wrote:
On Mon, 2017-07-17 at 21:29 +0530, Shivaprasad G Bhat wrote:
> @@ -3786,6 +3786,11 @@
> part of the specified NUMA node (it is up to the user of the
> libvirt API to attach host devices to the correct
> pci-expander-bus when assigning them to the domain).
> + On PPC64, the PCI devices can be specified to be part of a NUMA
> + node using only the pci-root controller with an optional
> + <code><node></code> subelement within the
> + <code><target></code> subelement. The PCI devices
on the
> + given pci-root controller will be part of the specified NUMA node.
Instead of adding an entire new sentence here, it would make
more sense to rephrase what's already present, something like:
Some PCI controllers (pci-expander-bus for the pc machine
type, pcie-expander-bus for the q35 machine type and
pci-root for the pseries machine type) can have an optional
<node> subelement [...]
> @@ -9457,6 +9457,12 @@ virDomainControllerDefParseXML(xmlNodePtr node,
> goto error;
> }
> }
> + if (def->idx == 0 && numaNode >= 0) {
> + virReportError(VIR_ERR_XML_ERROR, "%s",
> + _("Only the PCI controller with index != 0 can
"
> + "have NUMA node property specified"));
> + goto error;
> + }
> if (numaNode >= 0)
> def->opts.pciopts.numaNode = numaNode;
The check you're adding can be merged with the one below it.
The error message should also be reworded, something like:
The PCI controller with index=0 can't be associated with
a NUMA node.
> @@ -3458,9 +3458,14 @@ qemuDomainControllerDefPostParse(virDomainControllerDefPtr
cont,
> * that NUMA node is configured in the guest <cpu><numa>
> * array. NUMA cell id's in this array are numbered
> * from 0 .. size-1.
> + *
> + * On PSeries, the NUMA node is set at the PHB.
As above, reworking the existing comment would work better
than merely appending to it.
> */
> - if ((cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_EXPANDER_BUS ||
> - cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCIE_EXPANDER_BUS)
&&
> + if (((qemuDomainIsPSeries(def) &&
> + cont->type == VIR_DOMAIN_CONTROLLER_TYPE_PCI &&
> + cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_ROOT) ||
> + (cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCI_EXPANDER_BUS ||
> + cont->model == VIR_DOMAIN_CONTROLLER_MODEL_PCIE_EXPANDER_BUS))
&&
> (int) virDomainNumaGetNodeCount(def->numa)
> <= cont->opts.pciopts.numaNode) {
I actually don't see any point in the condition being this
complex: it could just be
if (cont->opts.pciopts.numaNode >= 0 &&
cont->opts.pciopts.numaNode >=
(int) virDomainNumaGetNodeCount(def->numa))
because we've already made sure, while parsing, that numaNode
is only set for controllers that allow it.
> +++ b/tests/qemuxml2argvdata/qemuxml2argv-spapr-pci-host-bridge-numa-node.xml
> @@ -0,0 +1,54 @@
> +<domain type='qemu'>
> + <name>QEMUGuest1</name>
> + <uuid>87eedafe-eedc-4336-8130-ed9fe5dc90c8</uuid>
> + <memory unit='KiB'>2097152</memory>
> + <currentMemory unit='MiB'>2048</currentMemory>
You don't need <currentMemory>
> + <vcpu placement='static'>8</vcpu>
> + <numatune>
> + <memory mode='strict' nodeset='1'/>
> + </numatune>
> + <cpu>
> + <topology sockets='3' cores='1' threads='8'/>
> + <numa>
> + <cell id='0' cpus='0-3' memory='1048576'
unit='KiB'/>
> + <cell id='1' cpus='4-7' memory='1048576'
unit='KiB'/>
> + </numa>
> + </cpu>
> + <os>
> + <type arch='ppc64' machine='pseries'>hvm</type>
> + <boot dev='hd'/>
<boot> is unnecessary.
> + </os>
> + <clock offset='utc'/>
> + <on_poweroff>destroy</on_poweroff>
> + <on_reboot>restart</on_reboot>
> + <on_crash>destroy</on_crash>
<clock> and <on_*> can be removed.
> + <devices>
> + <emulator>/usr/bin/qemu-system-ppc64</emulator>
> + <disk type='block' device='disk'>
> + <driver name='qemu' type='raw'/>
> + <source dev='/dev/HostVG/QEMUGuest1'/>
> + <target dev='hda' bus='scsi'/>
> + <address type='drive' controller='0' bus='0'
target='0' unit='0'/>
> + </disk>
<disk> is irrelevant for the test case at hand.
> + <controller type='usb' index='0'/>
The model should be 'none'.
> + <controller type='scsi' index='0'/>
The SCSI controller is also not useful here.
> + <controller type='pci' index='0'
model='pci-root'>
> + <target index='0'/>
> + </controller>
> + <controller type='pci' index='1'
model='pci-root'>
> + <target index='1'>
> + <node>1</node>
> + </target>
> + </controller>
> + <controller type='pci' index='2'
model='pci-root'>
> + <target index='2'/>
> + </controller>
> + <controller type='pci' index='3'
model='pci-root'>
> + <target index='3'>
> + <node>0</node>
> + </target>
> + </controller>
> + <memballoon model='none'/>
> + <panic model='pseries'/>
You can omit <panic>.
> +++ b/tests/qemuxml2argvtest.c
> @@ -2739,6 +2739,9 @@ mymain(void)
> DO_TEST_PARSE_ERROR("cpu-cache-emulate-l2", QEMU_CAPS_KVM);
> DO_TEST_PARSE_ERROR("cpu-cache-passthrough3", QEMU_CAPS_KVM);
> DO_TEST_PARSE_ERROR("cpu-cache-passthrough-l3", QEMU_CAPS_KVM);
> + DO_TEST("spapr-pci-host-bridge-numa-node", QEMU_CAPS_NUMA,
> + QEMU_CAPS_DEVICE_SPAPR_PCI_HOST_BRIDGE,
> + QEMU_CAPS_SPAPR_PCI_HOST_BRIDGE_NUMA_NODE);
This should be moved up to be close to the pserie-phb* test
cases, and renamed to something like 'pseries-phb-numa-node'.
There should be one capability per line.
You also need to have an identical test case in qemuxml2xml,
and at least one negative test to show that trying to
assign the default PHB to a NUMA node will result in failure.
Agree to all your
comments. Fixing them all.
As I relook at my test case, I realize the numatune should be for the
memnode
instead of memory for this use case. Fixing that as well. For
pci-expander-bus
test cases I see no numatune in the xml, will fix them later.
--
Andrea Bolognani / Red Hat / Virtualization