[libvirt] [PATCH] qemu: fix migration failure of an auto-placement VM after memory is attached to it

This patch fixes this condition:
  - the VM has the "auto" placement in <vcpu>
  - memory is hot-plugged with source node "1-3" through the attach-device command
  - the VM is migrated to a host with only 2 NUMA nodes

And the migration will fail with the error:
"error: unsupported configuration: NUMA node 2 is unavailable"

Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Xi Xu <xu.xi8@zte.com.cn>
---
 src/qemu/qemu_process.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 7b708be..dcc564c 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -5259,6 +5259,16 @@ qemuProcessPrepareDomain(virConnectPtr conn,
             goto cleanup;
     }
 
+    VIR_DEBUG("Updating memory source nodes");
+    for (i = 0; i < vm->def->nmems; i++) {
+        virDomainMemoryDefPtr mem = vm->def->mems[i];
+        if (priv->autoNodeset && mem && mem->sourceNodes) {
+            virBitmapFree(mem->sourceNodes);
+            if (!(mem->sourceNodes = virBitmapNewCopy(priv->autoNodeset)))
+                goto cleanup;
+        }
+    }
+
     /* Whether we should use virtlogd as stdio handler for character
      * devices source backend. */
     if (cfg->stdioLogD &&
--
1.8.3.1

On Sat, Jul 22, 2017 at 05:45:59 -0400, Yi Wang wrote:
> This patch fixes this condition:
>   - the VM has the "auto" placement in <vcpu>
>   - memory is hot-plugged with source node "1-3" through the attach-device command
>   - the VM is migrated to a host with only 2 NUMA nodes
>
> And the migration will fail with the error:
> "error: unsupported configuration: NUMA node 2 is unavailable"
>
> Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
> Signed-off-by: Xi Xu <xu.xi8@zte.com.cn>
> ---
>  src/qemu/qemu_process.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> index 7b708be..dcc564c 100644
> --- a/src/qemu/qemu_process.c
> +++ b/src/qemu/qemu_process.c
> @@ -5259,6 +5259,16 @@ qemuProcessPrepareDomain(virConnectPtr conn,
>              goto cleanup;
>      }
>
> +    VIR_DEBUG("Updating memory source nodes");
> +    for (i = 0; i < vm->def->nmems; i++) {
> +        virDomainMemoryDefPtr mem = vm->def->mems[i];
> +        if (priv->autoNodeset && mem && mem->sourceNodes) {
> +            virBitmapFree(mem->sourceNodes);
> +            if (!(mem->sourceNodes = virBitmapNewCopy(priv->autoNodeset)))
> +                goto cleanup;
This is not correct. This code will be executed even during normal startup and it would remove any manual pinning the user set up.

I think the problem might be that the XML retains the nodesets of automatically placed memory modules while formatting the migratable XML.
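A minimal sketch of the direction Peter hints at, for illustration only: clear the auto-assigned nodesets while producing the migratable XML instead of rewriting the live definition. The helper name, its call site, and the exact-match heuristic are assumptions, not existing libvirt code:

    /* Hypothetical helper: run only while building the migratable XML, so the
     * live definition (and any manual pinning the user configured) is untouched. */
    static void
    qemuDomainDefClearAutoMemoryNodes(virDomainDefPtr def,
                                      virBitmapPtr autoNodeset)
    {
        size_t i;

        if (!autoNodeset)
            return;

        for (i = 0; i < def->nmems; i++) {
            virDomainMemoryDefPtr mem = def->mems[i];

            /* Drop only nodesets that match the automatic placement exactly;
             * explicit user pinning is left alone. */
            if (mem->sourceNodes &&
                virBitmapEqual(mem->sourceNodes, autoNodeset)) {
                virBitmapFree(mem->sourceNodes);
                mem->sourceNodes = NULL;
            }
        }
    }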


Please configure your e-mail client so that it does not break threads and also that it does not add spurious newlines between lines of the reply. I've trimmed the spurious newlines in the reply.

On Mon, Jul 24, 2017 at 17:11:50 +0800, wang.yi59@zte.com.cn wrote:
> > On Sat, Jul 22, 2017 at 05:45:59 -0400, Yi Wang wrote:
> > > This patch fixes this condition:
> > >   - the VM has the "auto" placement in <vcpu>
> > >   - memory is hot-plugged with source node "1-3" through the attach-device command
> > >   - the VM is migrated to a host with only 2 NUMA nodes
> > >
> > > And the migration will fail with the error:
> > > "error: unsupported configuration: NUMA node 2 is unavailable"
> > >
> > > Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
> > > Signed-off-by: Xi Xu <xu.xi8@zte.com.cn>
> > >
> > >  src/qemu/qemu_process.c | 10 ++++++++++
> > >  1 file changed, 10 insertions(+)
> > >
> > > diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> > > index 7b708be..dcc564c 100644
> > > --- a/src/qemu/qemu_process.c
> > > +++ b/src/qemu/qemu_process.c
> > > @@ -5259,6 +5259,16 @@ qemuProcessPrepareDomain(virConnectPtr conn,
> > >              goto cleanup;
> > >      }
> > >
> > > +    VIR_DEBUG("Updating memory source nodes");
> > > +    for (i = 0; i < vm->def->nmems; i++) {
> > > +        virDomainMemoryDefPtr mem = vm->def->mems[i];
> > > +        if (priv->autoNodeset && mem && mem->sourceNodes) {
> > > +            virBitmapFree(mem->sourceNodes);
> > > +            if (!(mem->sourceNodes = virBitmapNewCopy(priv->autoNodeset)))
> > > +                goto cleanup;
> > This is not correct. This code will be executed even during normal startup
> > and it would remove any manual pinning the user set up.
>
> What you said is right, this is indeed a problem that my patch ignored.
> > I think the problem might be that the XML retains the nodesets of
> > automatically placed memory modules while formatting the migratable XML.
>
> The problem is that the source nodemask specified in the "virsh attach-device"
> XML may not exist on the destination host. For example:
>
>     <memory model='dimm'>
>       <target>
>         <size unit='MiB'>1024</size>
>         <node>0</node>
>       </target>
>       <source>
>         <nodemask>1-3</nodemask>
>       </source>
>     </memory>
>
> The source of the problem is that we must set the source nodemask when
> hot-plugging a memory dimm while the VM has auto placement, or it fails:
>
> # virsh attach-device centos mem_hp.xml
> error: Failed to attach device from mem_hp.xml
> error: internal error: Advice from numad is needed in case of automatic numa placement
The 'numad' advice is no longer valid if you try to add more memory to the guest, so you really need to configure things externally with memory hotplug. While starting the VM, automatic placement for the memory devices works since we are able to ask numad to accommodate them.
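To illustrate the kind of external configuration this implies, the hot-plug XML can carry a nodemask chosen by the admin to be valid on every host involved; the node IDs below are placeholders and must exist on both the source and the destination:

    <memory model='dimm'>
      <target>
        <size unit='MiB'>1024</size>
        <node>0</node>
      </target>
      <source>
        <nodemask>0-1</nodemask>
      </source>
    </memory>

The available nodes on each host can be checked beforehand with, e.g., "virsh nodeinfo" or "numactl --hardware".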
> And after we set the source nodemask, it might fail to migrate.
>
> So, a better way to work around this problem would be to add "auto" support to <source>?
This would mean that the whole VM would need to be re-pinned to the new nodeset provided by numad, which I don't think is desirable. Adding 'auto' really won't be easy in this case. You can explore this path though, since it's the only way to do it automatically.
> Any other suggestion?
Since memory hotplug requires manual setup, you really need to provide a destination XML on migration, where you can change the nodeset on the target. Another option is to use manual pinning since that's the only one that allows enough flexibility. The automatic placement is really meant only for basic cases. Or don't use pinning at all.
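A rough sketch of the destination-XML route described above; the domain name is taken from the earlier example, while the file name, destination URI and edited nodeset are placeholders:

    # Dump the migratable XML and edit the <nodemask> to one that exists on the
    # destination (e.g. 0-1 on a two-node host), then hand it to the migration:
    virsh dumpxml --migratable centos > centos-dest.xml
    virsh migrate --live --persistent centos qemu+ssh://dest-host/system --xml centos-dest.xml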

On 07/24/2017 05:37 AM, Peter Krempa wrote:
> Please configure your e-mail client so that it does not break threads
In particular, your email client is failing to include an "In-Reply-To:" header (which should contain the message-id from the parent message's "Message-Id:" header); this is different from the "References:" header, which your email client does include. If your client has no way to turn on In-Reply-To: headers (it should be enabled by default), then you need to file a bug against the email client, and (if the developers don't fix it right away) seriously consider moving to a different client.

(I'm also Cc'ing lu.zhipeng@zte.com.cn, because their email client is doing the same thing (failing to add In-Reply-To:), and also is adding "答复:" (which I guess is the Mandarin equivalent of "Re:") to all responses, causing the subject line to change with each response message, and thus breaking the possibility of an email client threading based on subject.)

Small details like this seem trivial and unimportant, but they make a big difference when trying to dig into the history of a patch in order to make an informed reply (especially when there are hundreds of messages and dozens of threads every day), so any improvements would be greatly appreciated :-)
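For reference, a correctly threaded reply carries headers along these lines (the message-ids below are made-up placeholders, not values from this thread):

    Message-Id: <reply-id@example.com>
    In-Reply-To: <parent-id@example.com>
    References: <parent-id@example.com>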
> and also that it does not add spurious newlines between lines of the reply.
>
> I've trimmed the spurious newlines in the reply.
participants (4):
  - Laine Stump
  - Peter Krempa
  - wang.yi59@zte.com.cn
  - Yi Wang