Hi. Sorry for the late reply.

On 10/11/18 9:03 PM, John Ferlan wrote:

On 10/4/18 9:14 AM, Daniel Henrique Barboza wrote:
This patch makes two quality of life changes for non-x86 guests. The
first one is a maxCpus validation at qemuDomainDefValidate. The check
is made by the same function used to do that at qemuProcessStartValidateXML,
qemuValidateCpuCount. This ensures that the user doesn't goes over the
board with the maxCpus value when editing the XML, only to be faced with a
runtime error when starting it.

To do that, the following changes were made:

- qemuValidateCpuCount was made public. It was renamed to
qemuProcessValidateCpuCount to be compliant with the other public methods
at qemu_process.h;

- the method signature was slightly adapted to fit the const 'def'
variable used in qemuDomainDefValidate. This change has no downside in
in its original usage at qemuProcessStartValidateXML.

The second QoL change is adding the maxCpus value in the error message
of the now qemuProcessValidateCpuCount. This simple change allows the
user to quickly edit the XML to comply with the acceptable limit without
having to know QEMU internals. x86 guests, that might have been created
prior to the x86 qemuDomainDefValidate maxCpus check code, will also
benefit from this change.

After this patch, this is the expect result running a Power 9 guest with
a maxCpus of 40000:

$ ./virsh start dhb
error: Failed to start domain dhb
error: unsupported configuration: Maximum CPUs greater than specified machine type limit 240

And this is the result when trying to edit maxCpus to an invalid value:

$ ./virsh edit dhb
error: unsupported configuration: Maximum CPUs greater than specified machine type limit 240
Failed. Try again? [y,n,i,f,?]:
error: unsupported configuration: Maximum CPUs greater than specified machine type limit 240

Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
---
 src/qemu/qemu_domain.c  |  5 +++++
 src/qemu/qemu_process.c | 13 +++++++------
 src/qemu/qemu_process.h |  3 +++
 3 files changed, 15 insertions(+), 6 deletions(-)

When you write a commit message that says this much, start thinking - I
might need to split this up into multiple patches to reach my final
goal. I think there are 3 patches here:

1. Changing the name from qemuValidateCpuCount to
qemuProcessValidateCpuCount, alter the @def variable name, and add to
qemu_process.h.

2. Alter the error message.

3. The qemu_domain change.

I can split it in 3 patches as you suggested, no problem.



      
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 939b2a3da2..5a39b11da7 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -4100,6 +4100,11 @@ qemuDomainDefValidate(const virDomainDef *def,
         }
     }
 
+    if (!ARCH_IS_X86(def->os.arch) &&
+        qemuProcessValidateCpuCount(def, qemuCaps) < 0) {
+        goto cleanup;
+    }
+
So this really is the crux of the fix... To perform this validate during
the define processing, which seems to be a good idea; however, I'm
wondering why you're filtering "!ARCH_IS_X86". That same filter isn't
done at start time. Yes, there is an X86 specific message just before:

    unsupported configuration: more than 255 vCPUs are only supported on
q35-based machine type


My idea here was to keep the x86 verification unchanged, with its own
error/warning message, while adding a vCPU verification for all other
archs. This is why I've used !ARCH_IS_X86.

So, for q35-based machines, they'd have to wait for start to get the
message?  Considering there's lots of roadblocks to define a q35 with >
255 vCPUs, I think we'd be safe.

Still part of me wonders why we don't just move the
qemuProcessStartValidateXML call to qemuProcessValidateCpuCount rather
than having it both places.

I think when it was first added where it was it was done because the
DefValidate processing didn't exist (see [1] for some history). Back
then all we had was PostParse processing and if XML range validation was
done there guests may "disappear" when libvirtd restarted. So adding to
start processing was the only answer.

Should I move qemuProcessStartValidadeXML to qemuProcessValidadeCpuCount
in the next spin then?


     if (def->nresctrls &&
         def->virtType != VIR_DOMAIN_VIRT_KVM) {
         virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 29b0ba1590..0de5b503d6 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -3901,9 +3901,9 @@ qemuProcessSPICEAllocatePorts(virQEMUDriverPtr driver,
 }
 
 
-static int
-qemuValidateCpuCount(virDomainDefPtr def,
-                     virQEMUCapsPtr qemuCaps)
+int
+qemuProcessValidateCpuCount(const virDomainDef *def,
+                            virQEMUCapsPtr qemuCaps)
Going back through history commit ff968889 added this logic

 {
     unsigned int maxCpus = virQEMUCapsGetMachineMaxCpus(qemuCaps, def->os.machine);
 
@@ -3914,8 +3914,9 @@ qemuValidateCpuCount(virDomainDefPtr def,
     }
 
     if (maxCpus > 0 && virDomainDefGetVcpusMax(def) > maxCpus) {
-        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
-                       _("Maximum CPUs greater than specified machine type limit"));
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                       _("Maximum CPUs greater than specified machine type limit %u"),
+                       maxCpus);
Change this to

        virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
                       _("Maximum CPUs greater than specified machine "
                         "type limit: %u"), maxCpus);

It's a long line thing...

Ok!


         return -1;
     }
 
@@ -5165,7 +5166,7 @@ qemuProcessStartValidateXML(virQEMUDriverPtr driver,
      * If back compat isn't a concern, XML validation should probably
      * be done at parse time.
      */
[1]
Looking at the history of the above comment finds commit 5938f2d0b which
was added before commit b394af162 to introduce virDomainDefValidate.
Then eventually commit d071d292c added a call to virDomainDefValidate
for VIR_QEMU_PROCESS_START_NEW processing. Commit 05eab1bf9a further
altered that code to just call virDomainDefValidate

-    if (qemuValidateCpuCount(vm->def, qemuCaps) < 0)
+    if (qemuProcessValidateCpuCount(vm->def, qemuCaps) < 0)
         return -1;
 
     /* checks below should not be executed when starting a qemu process for a
diff --git a/src/qemu/qemu_process.h b/src/qemu/qemu_process.h
index c2f7c2b5d2..1716230475 100644
--- a/src/qemu/qemu_process.h
+++ b/src/qemu/qemu_process.h
@@ -215,4 +215,7 @@ int qemuProcessStartManagedPRDaemon(virDomainObjPtr vm);
 
 void qemuProcessKillManagedPRDaemon(virDomainObjPtr vm);
 
+int qemuProcessValidateCpuCount(const virDomainDef *def,
+                                virQEMUCapsPtr qemuCaps);
+
Not that it matters, but I'd like to see this in the same logical order
after qemuProcessDestroyMemoryBackingPath and before
qemuProcessAutostartAll. If you look at the source code that's about
it's location. Of course qemuProcessAutostartAll doesn't exist, but I
will fix that after I send this. It seems commit aad362f93 moved
qemuProcessReconnectAll, but didn't change the .h file <sigh>...

Ok!


John

 #endif /* __QEMU_PROCESS_H__ */