On Thu, 30 Aug 2018 17:51:13 +0200
Laszlo Ersek <lersek(a)redhat.com> wrote:
> +Drew
>
> On 08/30/18 14:08, Igor Mammedov wrote:
>> If VM has VCPUs plugged sparselly (for example a VM started with
>> 3 VCPUs (cpu0, cpu1 and cpu2) and then cpu1 was hotunplugged so
>> only cpu0 and cpu2 are present), QGA will rise a error
>> error: internal error: unable to execute QEMU agent command
'guest-get-vcpus':
>> open("/sys/devices/system/cpu/cpu1/"): No such file or directory
>> when
>> virsh vcpucount FOO --guest
>> is executed.
>> Fix it by ignoring non present CPUs when fetching CPUs status from sysfs.
>>
>> Signed-off-by: Igor Mammedov <imammedo(a)redhat.com>
>> ---
>> qga/commands-posix.c | 4 +++-
>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/qga/commands-posix.c b/qga/commands-posix.c
>> index 37e8a2d..2929872 100644
>> --- a/qga/commands-posix.c
>> +++ b/qga/commands-posix.c
>> @@ -2044,7 +2044,9 @@ static void transfer_vcpu(GuestLogicalProcessor *vcpu, bool
sys2vcpu,
>> vcpu->logical_id);
>> dirfd = open(dirpath, O_RDONLY | O_DIRECTORY);
>> if (dirfd == -1) {
>> - error_setg_errno(errp, errno, "open(\"%s\")",
dirpath);
>> + if (!(sys2vcpu && errno == ENOENT)) {
>> + error_setg_errno(errp, errno, "open(\"%s\")",
dirpath);
>> + }
>> } else {
>> static const char fn[] = "online";
>> int fd;
>>
>
> Originally these guest agent commands (both getting and setting) were
> meant to be used in the absence of real VCPU hot[un]plug, as a fallback
> / place-holder.
>
> If the latter (= real VCPU hot(un)plug) works, then these guest agent
> commands shouldn't be used at all.
Technically there isn't reasons for "get" not to work in sparse usecase
hence the patch.
> Drew, do I remember correctly? ... The related RHBZ is
> <
https://bugzilla.redhat.com/show_bug.cgi?id=924684>. (It's a private
> one, and I'm not at liberty to open it up, so my apologies to non-RH folks.)
>
> Anyway, given that "set" should be a subset of the "get" return
value
> (as documented in the command schema), if we fix up "get" to work with
> sparse topologies, then "set" should work at once.
>
> However... as far as I understand, this change will allow
> qmp_guest_get_vcpus() to produce a GuestLogicalProcessor object for the
> missing (hot-unplugged) VCPU, with the following contents:
> - @logical-id: populated by the loop,
> - @online: false (from g_malloc0()),
> - @can-offline: present (from the loop), and false (from g_malloc0()).
>
> The smaller problem with this might be that "online==false &&
> can-offline==false" is nonsensical and has never been returned before. I
> don't know how higher level apps will react.
>
> The larger problem might be that a higher level app could simply copy
> this output structure into the next "set" call unchanged, and then that
> "set" call will fail.
Libvirt it seems that survives such outrageous output
> I wonder if, instead of this patch, we should rework
> qmp_guest_get_vcpus(), to silently skip processors for which this
> dirpath ENOENT condition arises (i.e., return a shorter list of
> GuestLogicalProcessor objects).
> But, again, I wouldn't mix this guest agent command with real VCPU
> hot(un)plug in the first place. The latter is much-much better, so if
> it's available, use that exclusively?
Agreed,
Maybe we can block invalid usecase on libvirt side with a more clear
error message as libvirt sort of knows that sparse cpus are supported.
OK -- I see libvir-list is CC'd, so let's hear what they prefer :)
Thanks
Laszlo