[libvirt] [PATCH 0/8] logically memory hotplug via guest agent

Logically hotplug memory via the guest agent by enabling/disabling memory blocks. The corresponding qga commands are 'guest-get-memory-blocks', 'guest-set-memory-blocks' and 'guest-get-memory-block-info'.

Detailed flow:
1. Get the memory block list; each member has 'phys-index', 'online' and 'can-offline' parameters.
2. Get the memory block size, normally 128MB or 256MB for most OSes.
3. Convert the target memory size into a number of memory blocks and check whether there are enough memory blocks to set online/offline.
4. Update the memory block list info and let the guest agent set the memory blocks online/offline.

Note that because we hotplug memory logically by onlining/offlining MEMORY BLOCKS, and each memory block is much larger than a KiB, the result can deviate from the requested size by up to one block, i.e. within the range (0, block_size). block_size may be 128MB, 256MB, etc.; it differs between OSes.

Zhang Bo (8):
  lifecycle: add flag VIR_DOMAIN_MEM_GUEST for virDomainSetMemoryFlags
  qemu: agent: define structure of qemuAgentMemblockInfo
  qemu: agent: implement qemuAgentGetMemblocks
  qemu: agent: implement qemuAgentGetMemblockGeneralInfo
  qemu: agent: implement qemuAgentUpdateMemblocks
  qemu: agent: implement function qemuAgentSetMemblocks
  qemu: memory: logically hotplug memory with guest agent
  virsh: support memory hotplug with guest agent in virsh

 include/libvirt/libvirt-domain.h |   1 +
 src/libvirt-domain.c             |   7 +
 src/qemu/qemu_agent.c            | 307 +++++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_agent.h            |  22 +++
 src/qemu/qemu_driver.c           |  46 +++++-
 tools/virsh-domain.c             |  10 +-
 tools/virsh.pod                  |   7 +-
 7 files changed, 396 insertions(+), 4 deletions(-)

--
1.7.12.4
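To illustrate step 3, here is a minimal sketch of the size-to-block conversion; the helper name and numbers are made up for illustration, and the rounding up to whole blocks is exactly where the (0, block_size) deviation comes from:

    /* Illustration only: convert a requested size in MB into a count of
     * memory blocks.  Rounding up to whole blocks means the effective size
     * can exceed the request by up to one block_size. */
    static unsigned long long
    exampleTargetBlockCount(unsigned long long target_mb,
                            unsigned long long block_size_mb)
    {
        return (target_mb + block_size_mb - 1) / block_size_mb;
    }

For example, with 128MB blocks a request for 1000MB maps to 8 blocks, i.e. 1024MB, a 24MB deviation.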

just add the flag and description for function virDomainSetMemoryFlags().

Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Li Bin <binlibin.li@huawei.com>
---
 include/libvirt/libvirt-domain.h | 1 +
 src/libvirt-domain.c             | 4 ++++
 src/qemu/qemu_driver.c           | 3 ++-
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
index d851225..103266a 100644
--- a/include/libvirt/libvirt-domain.h
+++ b/include/libvirt/libvirt-domain.h
@@ -1163,6 +1163,7 @@ typedef enum {
     /* Additionally, these flags may be bitwise-OR'd in. */
     VIR_DOMAIN_MEM_MAXIMUM = (1 << 2), /* affect Max rather than current */
+    VIR_DOMAIN_MEM_GUEST = (1 << 3), /* logically change memory size in the guest */
 } virDomainMemoryModFlags;

diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c
index 7e6d749..155fb92 100644
--- a/src/libvirt-domain.c
+++ b/src/libvirt-domain.c
@@ -1945,6 +1945,10 @@ virDomainSetMemory(virDomainPtr domain, unsigned long memory)
  * on whether just live or both live and persistent state is changed.
  * If VIR_DOMAIN_MEM_MAXIMUM is set, the change affects domain's maximum memory
  * size rather than current memory size.
+ * If VIR_DOMAIN_MEM_GUEST is set, it changes the domain's memory size inside
+ * the guest instead of the hypervisor. This flag can only be used with live guests.
+ * The usage of this flag may require a guest agent configured.
+ *
  * Not all hypervisors can support all flag combinations.
  *
  * Returns 0 in case of success, -1 in case of failure.

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 34e5581..580cd60 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -2310,7 +2310,8 @@ static int qemuDomainSetMemoryFlags(virDomainPtr dom, unsigned long newmem,
     virCheckFlags(VIR_DOMAIN_AFFECT_LIVE |
                   VIR_DOMAIN_AFFECT_CONFIG |
-                  VIR_DOMAIN_MEM_MAXIMUM, -1);
+                  VIR_DOMAIN_MEM_MAXIMUM |
+                  VIR_DOMAIN_MEM_GUEST, -1);

     if (!(vm = qemuDomObjFromDomain(dom)))
         goto cleanup;
--
1.7.12.4
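As a usage sketch (not part of the patch), a management application could request a guest-side resize roughly like this; the domain name and target size are made up, and error handling is trimmed:

    #include <libvirt/libvirt.h>

    /* Sketch: logically resize a running guest's memory via the guest agent.
     * 2097152 KiB == 2 GiB; the guest agent path rounds to whole memory blocks. */
    int
    exampleSetGuestMemory(virConnectPtr conn)
    {
        virDomainPtr dom = virDomainLookupByName(conn, "demo-guest"); /* hypothetical domain */
        int ret = -1;

        if (dom) {
            ret = virDomainSetMemoryFlags(dom, 2097152,
                                          VIR_DOMAIN_AFFECT_LIVE |
                                          VIR_DOMAIN_MEM_GUEST);
            virDomainFree(dom);
        }
        return ret;
    }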

add the definition of qemuAgentMemblockInfo, according to the json format:

  { 'struct': 'GuestMemoryBlock',
    'data': {'phys-index': 'uint64',
             'online': 'bool',
             '*can-offline': 'bool'} }

Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com>
Signed-off-by: Li Bin <binlibin.li@huawei.com>
---
 src/qemu/qemu_agent.h | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/src/qemu/qemu_agent.h b/src/qemu/qemu_agent.h
index 7cbf8eb..425ee87 100644
--- a/src/qemu/qemu_agent.h
+++ b/src/qemu/qemu_agent.h
@@ -103,6 +103,15 @@ int qemuAgentUpdateCPUInfo(unsigned int nvcpus,
                            qemuAgentCPUInfoPtr cpuinfo,
                            int ncpuinfo);

+typedef struct _qemuAgentMemblockInfo qemuAgentMemblockInfo;
+typedef qemuAgentMemblockInfo *qemuAgentMemblockInfoPtr;
+struct _qemuAgentMemblockInfo {
+    unsigned long long id; /* arbitrary guest-specific unique identifier of the MEMORY BLOCK */
+    bool online;           /* true if the MEMORY BLOCK is enabled in the guest */
+    bool offlinable;       /* true if the MEMORY BLOCK can be offlined */
+};
+
+
 int qemuAgentGetTime(qemuAgentPtr mon,
                      long long *seconds,
                      unsigned int *nseconds);
--
1.7.12.4

implement function qemuAgentGetMemblocks(). behaviour example: input: '{"execute":"guest-get-memory-blocks"}' output: { "return": [ { "can-offline": false, "online": true, "phys-index": 0 }, { "can-offline": false, "online": true, "phys-index": 1 }, .......... ] } please refer to http://git.qemu.org/?p=qemu.git;a=log;h=0dd38a03f5e1498aabf7d053a9fab792a5ee... for more information. Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com> Signed-off-by: Li Bin <binlibin.li@huawei.com> --- src/qemu/qemu_agent.c | 73 +++++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_agent.h | 1 + 2 files changed, 74 insertions(+) diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c index 043695b..95daf7a 100644 --- a/src/qemu/qemu_agent.c +++ b/src/qemu/qemu_agent.c @@ -1654,6 +1654,79 @@ qemuAgentUpdateCPUInfo(unsigned int nvcpus, return 0; } +int +qemuAgentGetMemblocks(qemuAgentPtr mon, + qemuAgentMemblockInfoPtr *info) +{ + int ret = -1; + size_t i; + virJSONValuePtr cmd = NULL; + virJSONValuePtr reply = NULL; + virJSONValuePtr data = NULL; + int ndata; + + if (!(cmd = qemuAgentMakeCommand("guest-get-memory-blocks", NULL))) + return -1; + + if (qemuAgentCommand(mon, cmd, &reply, true, + VIR_DOMAIN_QEMU_AGENT_COMMAND_BLOCK) < 0) + goto cleanup; + + if (!(data = virJSONValueObjectGet(reply, "return"))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("guest-get-memory-blocks reply was missing return data")); + goto cleanup; + } + + if (data->type != VIR_JSON_TYPE_ARRAY) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("guest-get-memory-blocks return information was not an array")); + goto cleanup; + } + + ndata = virJSONValueArraySize(data); + + if (VIR_ALLOC_N(*info, ndata) < 0) + goto cleanup; + + for (i = 0; i < ndata; i++) { + virJSONValuePtr entry = virJSONValueArrayGet(data, i); + qemuAgentMemblockInfoPtr in = *info + i; + + if (!entry) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("array element missing in guest-get-memory-blocks return " + "value")); + goto cleanup; + } + + if (virJSONValueObjectGetNumberUint(entry, "phys-index", &in->id) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("'phys-index' missing in reply of guest-get-memory-blocks")); + goto cleanup; + } + + if (virJSONValueObjectGetBoolean(entry, "online", &in->online) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("'online' missing in reply of guest-get-memory-blocks")); + goto cleanup; + } + + if (virJSONValueObjectGetBoolean(entry, "can-offline", + &in->offlinable) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("'can-offline' missing in reply of guest-get-memory-blocks")); + goto cleanup; + } + } + + ret = ndata; + + cleanup: + virJSONValueFree(cmd); + virJSONValueFree(reply); + return ret; +} int qemuAgentGetTime(qemuAgentPtr mon, diff --git a/src/qemu/qemu_agent.h b/src/qemu/qemu_agent.h index 425ee87..61ba038 100644 --- a/src/qemu/qemu_agent.h +++ b/src/qemu/qemu_agent.h @@ -111,6 +111,7 @@ struct _qemuAgentMemblockInfo { bool offlinable; /* true if the MEMORY BLOCK can be offlined */ }; +int qemuAgentGetMemblocks(qemuAgentPtr mon, qemuAgentMemblockInfoPtr *info); int qemuAgentGetTime(qemuAgentPtr mon, long long *seconds, -- 1.7.12.4

qemuAgentGetMemblockGeneralInfo() is implememted, according to the qga command 'guest-get-memory-block-info'. the difference between this command and 'guest-get-memory-blocks' is that the latter one gets a list of infos for each memory block, and this command just returns general attributes for the guest memory blocks. Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com> Signed-off-by: Li Bin <binlibin.li@huawei.com> --- src/qemu/qemu_agent.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++++- src/qemu/qemu_agent.h | 7 +++++++ 2 files changed, 56 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c index 95daf7a..3481354 100644 --- a/src/qemu/qemu_agent.c +++ b/src/qemu/qemu_agent.c @@ -1700,7 +1700,7 @@ qemuAgentGetMemblocks(qemuAgentPtr mon, goto cleanup; } - if (virJSONValueObjectGetNumberUint(entry, "phys-index", &in->id) < 0) { + if (virJSONValueObjectGetNumberUlong(entry, "phys-index", &in->id) < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", _("'phys-index' missing in reply of guest-get-memory-blocks")); goto cleanup; @@ -1729,6 +1729,54 @@ qemuAgentGetMemblocks(qemuAgentPtr mon, } int +qemuAgentGetMemblockGeneralInfo(qemuAgentPtr mon, + qemuAgentMemblockGeneralInfoPtr info) +{ + int ret = -1; + unsigned long long json_size = 0; + virJSONValuePtr cmd = NULL; + virJSONValuePtr reply = NULL; + virJSONValuePtr data = NULL; + + if (!info) { + VIR_ERROR(_("NULL info")); + return ret; + } + + cmd = qemuAgentMakeCommand("guest-get-memory-block-info", + NULL); + if (!cmd) + return ret; + + if (qemuAgentCommand(mon, cmd, &reply, true, + VIR_DOMAIN_QEMU_AGENT_COMMAND_BLOCK) < 0) + goto cleanup; + + if (!(data = virJSONValueObjectGet(reply, "return"))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("guest-get-memory-block-info reply was missing return data")); + goto cleanup; + } + + if (virJSONValueObjectGetNumberUlong(data, "size", &json_size) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("'size' missing in reply of guest-get-memory-block-info")); + goto cleanup; + } + + /* guest agent returns the size in Bytes, + * we change it into MB here */ + info->blockSize = json_size >> 20; + ret = 0; + + cleanup: + virJSONValueFree(cmd); + virJSONValueFree(reply); + return ret; +} + + +int qemuAgentGetTime(qemuAgentPtr mon, long long *seconds, unsigned int *nseconds) diff --git a/src/qemu/qemu_agent.h b/src/qemu/qemu_agent.h index 61ba038..9a9b859 100644 --- a/src/qemu/qemu_agent.h +++ b/src/qemu/qemu_agent.h @@ -111,7 +111,14 @@ struct _qemuAgentMemblockInfo { bool offlinable; /* true if the MEMORY BLOCK can be offlined */ }; +typedef struct _qemuAgentMemblockGeneralInfo qemuAgentMemblockGeneralInfo; +typedef qemuAgentMemblockGeneralInfo *qemuAgentMemblockGeneralInfoPtr; +struct _qemuAgentMemblockGeneralInfo { + unsigned long long blockSize; +}; + int qemuAgentGetMemblocks(qemuAgentPtr mon, qemuAgentMemblockInfoPtr *info); +int qemuAgentGetMemblockGeneralInfo(qemuAgentPtr mon, qemuAgentMemblockGeneralInfoPtr info); int qemuAgentGetTime(qemuAgentPtr mon, long long *seconds, -- 1.7.12.4
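For reference, the exchange this parser expects would look roughly like the following (the concrete size is only an example; 134217728 bytes is 128MiB, which the code above converts to MB by shifting right by 20 bits):

    {"execute": "guest-get-memory-block-info"}
    {"return": {"size": 134217728}}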

function qemuAgentUpdateMemblocks() checks whether it needs to plug/unplug memory blocks to reach the target memory. it's similar to qemuAgentUpdateCPUInfo(). Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com> Signed-off-by: Li Bin <binlibin.li@huawei.com> --- src/qemu/qemu_agent.c | 69 +++++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_agent.h | 4 +++ 2 files changed, 73 insertions(+) diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c index 3481354..2c3a5ba 100644 --- a/src/qemu/qemu_agent.c +++ b/src/qemu/qemu_agent.c @@ -1775,6 +1775,75 @@ qemuAgentGetMemblockGeneralInfo(qemuAgentPtr mon, return ret; } +int +qemuAgentUpdateMemblocks(unsigned long long memory, + qemuAgentMemblockInfoPtr info, + int nblock, + unsigned long long blocksize) +{ + size_t i; + int nonline = 0; + int nofflinable = 0; + unsigned long long ntarget = 0; + + if (memory % blocksize) { + ntarget = (int)((memory / blocksize) + 1); + }else { + ntarget = (int)(memory / blocksize); + } + + /* count the active and offlinable memory blocks */ + for (i = 0; i < nblock; i++) { + if (info[i].online) + nonline++; + + if (info[i].offlinable && info[i].online) + nofflinable++; + + /* This shouldn't happen, but we can't trust the guest agent */ + if (!info[i].online && !info[i].offlinable) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Invalid data provided by guest agent")); + return -1; + } + } + + /* the guest agent reported less memory than requested */ + if (ntarget > nblock) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("guest agent reports less memory than requested")); + return -1; + } + + /* not enough offlinable memory blocks to support the request */ + if (ntarget < (nonline - nofflinable)) { + virReportError(VIR_ERR_INVALID_ARG, "%s", + _("Cannot offline enough memory blocks")); + return -1; + } + + for (i = 0; i < nblock; i++) { + if (ntarget < nonline) { + /* unplug */ + if (info[i].offlinable && info[i].online) { + info[i].online = false; + nonline--; + } + } else if (ntarget > nonline) { + /* plug */ + if (!info[i].online) { + info[i].online = true; + nonline++; + } + } else { + /* done */ + break; + } + } + + return 0; + +} int qemuAgentGetTime(qemuAgentPtr mon, diff --git a/src/qemu/qemu_agent.h b/src/qemu/qemu_agent.h index 9a9b859..3ba6deb 100644 --- a/src/qemu/qemu_agent.h +++ b/src/qemu/qemu_agent.h @@ -119,6 +119,10 @@ struct _qemuAgentMemblockGeneralInfo { int qemuAgentGetMemblocks(qemuAgentPtr mon, qemuAgentMemblockInfoPtr *info); int qemuAgentGetMemblockGeneralInfo(qemuAgentPtr mon, qemuAgentMemblockGeneralInfoPtr info); +int qemuAgentUpdateMemblocks(unsigned long long memory, + qemuAgentMemblockInfoPtr info, + int nblock, + unsigned long long blocksize); int qemuAgentGetTime(qemuAgentPtr mon, long long *seconds, -- 1.7.12.4
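A worked example of the block arithmetic above, with assumed numbers:

    /* Assumed: block size 128MB, 16 blocks total, all 16 online, 10 offlinable.
     * Request 1536MB -> ntarget = 12: 12 <= 16 blocks and 12 >= 16 - 10, so
     *   four offlinable blocks get marked offline.
     * Request 2176MB -> ntarget = 17 > 16 blocks: fails with "guest agent
     *   reports less memory than requested". */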

qemuAgetSetMemblocks() is implemented, according to the qga command: 'guest-set-memory-blocks'. It asks the guest agent to set memory blocks online/offline according to the updated MemblockInfo. If all the blocks were setted successfully, the function returns with success, otherwise, fails. Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com> Signed-off-by: Li Bin <binlibin.li@huawei.com> --- src/qemu/qemu_agent.c | 117 ++++++++++++++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_agent.h | 1 + 2 files changed, 118 insertions(+) diff --git a/src/qemu/qemu_agent.c b/src/qemu/qemu_agent.c index 2c3a5ba..1945fae 100644 --- a/src/qemu/qemu_agent.c +++ b/src/qemu/qemu_agent.c @@ -1846,6 +1846,123 @@ qemuAgentUpdateMemblocks(unsigned long long memory, } int +qemuAgentSetMemblocks(qemuAgentPtr mon, + qemuAgentMemblockInfoPtr info, + int nblocks) +{ + int ret = -1; + virJSONValuePtr cmd = NULL; + virJSONValuePtr reply = NULL; + virJSONValuePtr memblocks = NULL; + virJSONValuePtr block = NULL; + virJSONValuePtr data = NULL; + int size = -1; + size_t i; + + /* create the key data array */ + if (!(memblocks = virJSONValueNewArray())) + goto cleanup; + + for (i = 0; i < nblocks; i++) { + qemuAgentMemblockInfoPtr in = &info[i]; + + /* create single memory block object */ + if (!(block = virJSONValueNewObject())) + goto cleanup; + + if (virJSONValueObjectAppendNumberInt(block, "phys-index", in->id) < 0) + goto cleanup; + + if (virJSONValueObjectAppendBoolean(block, "online", in->online) < 0) + goto cleanup; + + if (virJSONValueArrayAppend(memblocks, block) < 0) + goto cleanup; + + block = NULL; + } + + if (!(cmd = qemuAgentMakeCommand("guest-set-memory-blocks", + "a:mem-blks", memblocks, + NULL))) + goto cleanup; + + memblocks = NULL; + + if (qemuAgentCommand(mon, cmd, &reply, true, + VIR_DOMAIN_QEMU_AGENT_COMMAND_BLOCK) < 0) + goto cleanup; + + if (!(data = virJSONValueObjectGet(reply, "return"))) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("guest-set-memory-blocks reply was missing return data")); + goto cleanup; + } + + if (data->type != VIR_JSON_TYPE_ARRAY) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("guest-set-memory-blocks returned information was not " + "an array")); + goto cleanup; + } + + if ((size = virJSONValueArraySize(data)) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("qemu agent didn't return an array of results")); + goto cleanup; + } + + for (i = 0; i < size; i++) { + virJSONValuePtr tmp_res = virJSONValueArrayGet(data, i); + unsigned long long id = 0; + const char *response = NULL; + int error_code = 0; + + if (!tmp_res) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("qemu agent reply missing result entry in array")); + goto cleanup; + } + + if (virJSONValueObjectGetNumberUlong(tmp_res, "phys-index", &id) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("qemu agent didn't provide 'phys-index' correctly")); + goto cleanup; + } + + if (!(response = virJSONValueObjectGetString(tmp_res, "response"))) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("qemu agent didn't provide 'response'" + " field for memory block %llu"), id); + goto cleanup; + } + + if (STRNEQ(response, "success")) { + virReportError(VIR_ERR_INTERNAL_ERROR, + _("qemu agent failed to set memory block %llu: %s"), id, response); + if (virJSONValueObjectGetNumberInt(tmp_res, "error-code", &error_code) < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("qemu agent didn't provide 'error-code' in response")); + goto cleanup; + } + + virReportError(VIR_ERR_INTERNAL_ERROR, 
_("errno-code is %d"), error_code); + goto cleanup; + } + } + + ret = 0; + + cleanup: + virJSONValueFree(cmd); + virJSONValueFree(reply); + virJSONValueFree(block); + virJSONValueFree(memblocks); + return ret; +} + + +int qemuAgentGetTime(qemuAgentPtr mon, long long *seconds, unsigned int *nseconds) diff --git a/src/qemu/qemu_agent.h b/src/qemu/qemu_agent.h index 3ba6deb..9707510 100644 --- a/src/qemu/qemu_agent.h +++ b/src/qemu/qemu_agent.h @@ -123,6 +123,7 @@ int qemuAgentUpdateMemblocks(unsigned long long memory, qemuAgentMemblockInfoPtr info, int nblock, unsigned long long blocksize); +int qemuAgentSetMemblocks(qemuAgentPtr mon, qemuAgentMemblockInfoPtr info, int nblocks); int qemuAgentGetTime(qemuAgentPtr mon, long long *seconds, -- 1.7.12.4
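For reference, the request built above and the per-block reply it parses would look roughly like this (indexes, values and the failure entry are illustrative):

    {"execute": "guest-set-memory-blocks",
     "arguments": {"mem-blks": [{"phys-index": 8, "online": false},
                                {"phys-index": 9, "online": false}]}}
    {"return": [{"phys-index": 8, "response": "success"},
                {"phys-index": 9, "response": "operation-failed", "error-code": 16}]}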

hotplug memory with guest agent. It 1 get memory block list, each member has 'phy-index', 'online' and 'can-offline' parameters 2 get memory block size, normally 128MB or 256MB for most OSes 3 convert the target memory size to memory block number, and see if there's enough memory blocks to be set online/offline. 4 update the memory block list info, and let guest agent to set memory blocks online/offline. note: because we hotplug memory logically by online/offline MEMORY BLOCKS, and each memory block has a size much bigger than KiB, there's a deviation with the range of (0, block_size). Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com> Signed-off-by: Li Bin <binlibin.li@huawei.com> --- src/qemu/qemu_driver.c | 42 +++++++++++++++++++++++++++++++++++++++++- 1 file changed, 41 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 580cd60..2a20bef 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2307,6 +2307,10 @@ static int qemuDomainSetMemoryFlags(virDomainPtr dom, unsigned long newmem, virDomainDefPtr persistentDef; int ret = -1, r; virQEMUDriverConfigPtr cfg = NULL; + qemuAgentMemblockInfoPtr memblocks = NULL; + int nblocks = 0; + qemuAgentMemblockGeneralInfoPtr meminfo = NULL; + unsigned long long newmem_MB = newmem >> 10; virCheckFlags(VIR_DOMAIN_AFFECT_LIVE | VIR_DOMAIN_AFFECT_CONFIG | @@ -2368,6 +2372,41 @@ static int qemuDomainSetMemoryFlags(virDomainPtr dom, unsigned long newmem, /* resize the current memory */ unsigned long oldmax = 0; + priv = vm->privateData; + + if (flags & VIR_DOMAIN_MEM_GUEST) { + if (!qemuDomainAgentAvailable(vm, true)) + goto endjob; + + if (VIR_ALLOC(meminfo)) { + virReportOOMError(); + goto endjob; + } + + qemuDomainObjEnterAgent(vm); + nblocks = qemuAgentGetMemblocks(priv->agent, &memblocks); + qemuDomainObjExitAgent(vm); + + if (nblocks < 0) + goto endjob; + + qemuDomainObjEnterAgent(vm); + ret = qemuAgentGetMemblockGeneralInfo(priv->agent, meminfo); + qemuDomainObjExitAgent(vm); + + if (ret < 0) + goto endjob; + + if (qemuAgentUpdateMemblocks(newmem_MB, memblocks, nblocks, meminfo->blockSize)) + goto endjob; + + qemuDomainObjEnterAgent(vm); + ret = qemuAgentSetMemblocks(priv->agent, memblocks, nblocks); + qemuDomainObjExitAgent(vm); + + goto endjob; + } + if (def) oldmax = virDomainDefGetMemoryActual(def); if (persistentDef) { @@ -2382,7 +2421,6 @@ static int qemuDomainSetMemoryFlags(virDomainPtr dom, unsigned long newmem, } if (def) { - priv = vm->privateData; qemuDomainObjEnterMonitor(driver, vm); r = qemuMonitorSetBalloon(priv->mon, newmem); if (qemuDomainObjExitMonitor(driver, vm) < 0) @@ -2415,6 +2453,8 @@ static int qemuDomainSetMemoryFlags(virDomainPtr dom, unsigned long newmem, cleanup: virDomainObjEndAPI(&vm); virObjectUnref(cfg); + VIR_FREE(meminfo); + VIR_FREE(memblocks); return ret; } -- 1.7.12.4

support memory hotplug with the arg --guest in virsh command 'setmem'. fix a little bug in qemu_driver.c at the meanwhile. Signed-off-by: Zhang Bo <oscar.zhangbo@huawei.com> Signed-off-by: Li Bin <binlibin.li@huawei.com> --- src/libvirt-domain.c | 5 ++++- src/qemu/qemu_driver.c | 3 ++- tools/virsh-domain.c | 10 +++++++++- tools/virsh.pod | 7 ++++++- 4 files changed, 21 insertions(+), 4 deletions(-) diff --git a/src/libvirt-domain.c b/src/libvirt-domain.c index 155fb92..a1250b6 100644 --- a/src/libvirt-domain.c +++ b/src/libvirt-domain.c @@ -1947,7 +1947,10 @@ virDomainSetMemory(virDomainPtr domain, unsigned long memory) * size rather than current memory size. * If VIR_DOMAIN_MEM_GUEST is set, it changes the domain's memory size inside * the guest instead of the hypervisor. This flag can only be used with live guests. - * The usage of this flag may require a guest agent configured. + * The usage of this flag may require a guest agent configured. Note that because we + * hotplug memory logically by online/offline MEMORY BLOCKS, and each memory block has + * a size much bigger than KiB, there's a deviation with the range of (0, block_size). + * block_size may be 128MB or 256MB or etc., it differs on different OSes. * * Not all hypervisors can support all flag combinations. * diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 2a20bef..e96465c 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -2397,7 +2397,8 @@ static int qemuDomainSetMemoryFlags(virDomainPtr dom, unsigned long newmem, if (ret < 0) goto endjob; - if (qemuAgentUpdateMemblocks(newmem_MB, memblocks, nblocks, meminfo->blockSize)) + ret = qemuAgentUpdateMemblocks(newmem_MB, memblocks, nblocks, meminfo->blockSize); + if (ret < 0) goto endjob; qemuDomainObjEnterAgent(vm); diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c index a25b7ba..ddb1cf9 100644 --- a/tools/virsh-domain.c +++ b/tools/virsh-domain.c @@ -8333,6 +8333,10 @@ static const vshCmdOptDef opts_setmem[] = { .type = VSH_OT_BOOL, .help = N_("affect current domain") }, + {.name = "guest", + .type = VSH_OT_BOOL, + .help = N_("use guest agent based hotplug, by enabling/disabling memory blocks") + }, {.name = NULL} }; @@ -8347,17 +8351,21 @@ cmdSetmem(vshControl *ctl, const vshCmd *cmd) bool config = vshCommandOptBool(cmd, "config"); bool live = vshCommandOptBool(cmd, "live"); bool current = vshCommandOptBool(cmd, "current"); + bool guest = vshCommandOptBool(cmd, "guest"); unsigned int flags = VIR_DOMAIN_AFFECT_CURRENT; VSH_EXCLUSIVE_OPTIONS_VAR(current, live); VSH_EXCLUSIVE_OPTIONS_VAR(current, config); + VSH_EXCLUSIVE_OPTIONS_VAR(guest, config); if (config) flags |= VIR_DOMAIN_AFFECT_CONFIG; if (live) flags |= VIR_DOMAIN_AFFECT_LIVE; + if (guest) + flags |= VIR_DOMAIN_MEM_GUEST; /* none of the options were specified */ - if (!current && !live && !config) + if (!current && flags == 0) flags = -1; if (!(dom = vshCommandOptDomain(ctl, cmd, NULL))) diff --git a/tools/virsh.pod b/tools/virsh.pod index 4e3f82a..534cc5e 100644 --- a/tools/virsh.pod +++ b/tools/virsh.pod @@ -1988,7 +1988,7 @@ B<Examples> virsh send-process-signal myguest 1 SIG_HUP =item B<setmem> I<domain> B<size> [[I<--config>] [I<--live>] | -[I<--current>]] +[I<--current>]] [I<--guest>] Change the memory allocation for a guest domain. If I<--live> is specified, perform a memory balloon of a running guest. @@ -1997,6 +1997,11 @@ If I<--current> is specified, affect the current guest state. Both I<--live> and I<--config> flags may be given, but I<--current> is exclusive. 
If no flag is specified, behavior is different depending on hypervisor. +If I<--guest> is specified, it use guest agent based hotplug, by +enabling/disabling memory blocks. Note that because we hotplug memory logically +by online/offline MEMORY BLOCKS, and each memory block has a size much bigger +than KiB, there's a deviation with the range of (0, block_size). block_size +may be 128MB or 256MB or etc., it differs on different OSes. I<size> is a scaled integer (see B<NOTES> above); it defaults to kibibytes (blocks of 1024 bytes) unless you provide a suffix (and the older option -- 1.7.12.4
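As a usage sketch, an invocation with the new option could look like this (the domain name is made up; 2097152 KiB is 2GiB, and the result is rounded to whole memory blocks):

    virsh setmem demo-guest 2097152 --live --guest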

On Tue, Jun 09, 2015 at 05:33:24PM +0800, Zhang Bo wrote:
Logically memory hotplug via guest agent, by enabling/disabling memory blocks. The corresponding qga commands are: 'guest-get-memory-blocks', 'guest-set-memory-blocks' and 'guest-get-memory-block-info'.
detailed flow: 1 get memory block list, each member has 'phy-index', 'online' and 'can-offline' parameters 2 get memory block size, normally 128MB or 256MB for most OSes 3 convert the target memory size to memory block number, and see if there's enough memory blocks to be set online/offline. 4 update the memory block list info, and let guest agent to set memory blocks online/offline.
Note that because we hotplug memory logically by online/offline MEMORY BLOCKS, and each memory block has a size much bigger than KiB, there's a deviation with the range of (0, block_size). block_size may be 128MB or 256MB or etc., it differs on different OSes.
There are a lot of questions about this feature that are unclear to me. This appears to operate entirely via guest agent commands. How does this then correspond to increased/decreased allocation in the host-side QEMU? What are the upper/lower bounds on adding/removing blocks, e.g. what prevents a malicious guest from asking for more memory to be added to itself than we wish to allow? How is this better/worse than adjusting memory via the balloon driver? How does this relate to the recently added DIMM hot add/remove feature on the host side, if at all? Are the changes made synchronously or asynchronously - i.e. does the API block while the guest OS releases the memory from the blocks that are released, or is it totally in the background like the balloon driver?

From a design POV, we're reusing the existing virDomainSetMemory API but adding a restriction that the size has to be a multiple of the block size, which the mgmt app has no way of knowing upfront. It feels like this is information we need to be able to expose to the app in some manner.

Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Tue, Jun 09, 2015 at 11:05:16 +0100, Daniel Berrange wrote:
On Tue, Jun 09, 2015 at 05:33:24PM +0800, Zhang Bo wrote:
Logically memory hotplug via guest agent, by enabling/disabling memory blocks. The corresponding qga commands are: 'guest-get-memory-blocks', 'guest-set-memory-blocks' and 'guest-get-memory-block-info'.
detailed flow: 1 get memory block list, each member has 'phy-index', 'online' and 'can-offline' parameters 2 get memory block size, normally 128MB or 256MB for most OSes 3 convert the target memory size to memory block number, and see if there's enough memory blocks to be set online/offline. 4 update the memory block list info, and let guest agent to set memory blocks online/offline.
Note that because we hotplug memory logically by online/offline MEMORY BLOCKS, and each memory block has a size much bigger than KiB, there's a deviation with the range of (0, block_size). block_size may be 128MB or 256MB or etc., it differs on different OSes.
So thre's alot of questions about this feature that are unclear to me..
This appears to be entirely operating via guest agent commands. How does this then correspond to increased/decreased allocation in the host side QEMU ? What are the upper/lower bounds on adding/removing blocks. eg what prevents a malicous guest from asking for more memory to be added too itself than we wish to allow ? How is this better / worse than adjusting memory via the balloon driver ? How does this relate to the
There are two possibilities where this could be advantageous: 1) This could be better than ballooning (if it actually returned the memory to the host, which it doesn't), since you will probably be able to disable memory regions in specific NUMA nodes, which is not possible with the current balloon driver (memory is taken randomly). 2) The guest OS sometimes needs to enable the memory region after ACPI memory hotplug. The GA would be able to online such memory. For this option we don't need to go through a different API, though, since it can be compounded using a flag.
recently added DIMM hot add/remove feature on the host side, if at all ? Are the changes made synchronously or asynchronously - ie does the API block while the guest OS releases the memory from the blocks that re released, or is it totally in the backgrond like the balloon driver..
On a design POV, we're reusing the existing virDomainSetMemory API but adding a restriction that it has to be in multiples of the block size, which the mgmt app has no way of knowing upfront. It feels like this is information we need to be able to expose to the app in some manner.
Since this feature would not actually release any host resources, in contrast with agent-based vCPU unplug, I don't think it's worth exposing the memory region manipulation APIs via libvirt. The only sane use I can think of is to enable the memory regions after hotplug. Peter

On Tue, Jun 09, 2015 at 01:22:49PM +0200, Peter Krempa wrote:
On Tue, Jun 09, 2015 at 11:05:16 +0100, Daniel Berrange wrote:
On Tue, Jun 09, 2015 at 05:33:24PM +0800, Zhang Bo wrote:
Logically memory hotplug via guest agent, by enabling/disabling memory blocks. The corresponding qga commands are: 'guest-get-memory-blocks', 'guest-set-memory-blocks' and 'guest-get-memory-block-info'.
detailed flow: 1 get memory block list, each member has 'phy-index', 'online' and 'can-offline' parameters 2 get memory block size, normally 128MB or 256MB for most OSes 3 convert the target memory size to memory block number, and see if there's enough memory blocks to be set online/offline. 4 update the memory block list info, and let guest agent to set memory blocks online/offline.
Note that because we hotplug memory logically by online/offline MEMORY BLOCKS, and each memory block has a size much bigger than KiB, there's a deviation with the range of (0, block_size). block_size may be 128MB or 256MB or etc., it differs on different OSes.
So thre's alot of questions about this feature that are unclear to me..
This appears to be entirely operating via guest agent commands. How does this then correspond to increased/decreased allocation in the host side QEMU ? What are the upper/lower bounds on adding/removing blocks. eg what prevents a malicous guest from asking for more memory to be added too itself than we wish to allow ? How is this better / worse than adjusting memory via the balloon driver ? How does this relate to the
There are two possibilities where this could be advantageous:
1) This could be better than ballooning (given that it would actually return the memory to the host, which it doesn't) since you probably will be able to disable memory regions in certain NUMA nodes which is not possible with the current balloon driver (memory is taken randomly).
2) The guest OS sometimes needs to enable the memory region after ACPI memory hotplug. The GA would be able to online such memory. For this option we don't need to go through a different API though since it can be compounded using a flag.
So, are you saying that we should not be adding this to the virDomainSetMemory API as done in this series, and we should instead be able to request automatic enabling/disabling of the regions when we do the original DIMM hotplug ? Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Tue, Jun 09, 2015 at 12:46:27 +0100, Daniel Berrange wrote:
On Tue, Jun 09, 2015 at 01:22:49PM +0200, Peter Krempa wrote:
On Tue, Jun 09, 2015 at 11:05:16 +0100, Daniel Berrange wrote:
On Tue, Jun 09, 2015 at 05:33:24PM +0800, Zhang Bo wrote:
Logically memory hotplug via guest agent, by enabling/disabling memory blocks. The corresponding qga commands are: 'guest-get-memory-blocks', 'guest-set-memory-blocks' and 'guest-get-memory-block-info'.
detailed flow: 1 get memory block list, each member has 'phy-index', 'online' and 'can-offline' parameters 2 get memory block size, normally 128MB or 256MB for most OSes 3 convert the target memory size to memory block number, and see if there's enough memory blocks to be set online/offline. 4 update the memory block list info, and let guest agent to set memory blocks online/offline.
Note that because we hotplug memory logically by online/offline MEMORY BLOCKS, and each memory block has a size much bigger than KiB, there's a deviation with the range of (0, block_size). block_size may be 128MB or 256MB or etc., it differs on different OSes.
So thre's alot of questions about this feature that are unclear to me..
This appears to be entirely operating via guest agent commands. How does this then correspond to increased/decreased allocation in the host side QEMU ? What are the upper/lower bounds on adding/removing blocks. eg what prevents a malicous guest from asking for more memory to be added too itself than we wish to allow ? How is this better / worse than adjusting memory via the balloon driver ? How does this relate to the
There are two possibilities where this could be advantageous:
1) This could be better than ballooning (given that it would actually return the memory to the host, which it doesn't) since you probably will be able to disable memory regions in certain NUMA nodes which is not possible with the current balloon driver (memory is taken randomly).
2) The guest OS sometimes needs to enable the memory region after ACPI memory hotplug. The GA would be able to online such memory. For this option we don't need to go through a different API though since it can be compounded using a flag.
So, are you saying that we should not be adding this to the virDomainSetMemory API as done in this series, and we should instead be able to request automatic enabling/disabling of the regions when we do the original DIMM hotplug ?
Well, that's the only place where using the memory region GA APIs would make sense for libvirt. Whether we should do it is not that clear. Windows does online the regions automatically, and I was told that some Linux distros do it via udev rules. Peter

On Tue, Jun 09, 2015 at 02:03:13PM +0200, Peter Krempa wrote:
On Tue, Jun 09, 2015 at 12:46:27 +0100, Daniel Berrange wrote:
On Tue, Jun 09, 2015 at 01:22:49PM +0200, Peter Krempa wrote:
On Tue, Jun 09, 2015 at 11:05:16 +0100, Daniel Berrange wrote:
On Tue, Jun 09, 2015 at 05:33:24PM +0800, Zhang Bo wrote:
Logically memory hotplug via guest agent, by enabling/disabling memory blocks. The corresponding qga commands are: 'guest-get-memory-blocks', 'guest-set-memory-blocks' and 'guest-get-memory-block-info'.
detailed flow: 1 get memory block list, each member has 'phy-index', 'online' and 'can-offline' parameters 2 get memory block size, normally 128MB or 256MB for most OSes 3 convert the target memory size to memory block number, and see if there's enough memory blocks to be set online/offline. 4 update the memory block list info, and let guest agent to set memory blocks online/offline.
Note that because we hotplug memory logically by online/offline MEMORY BLOCKS, and each memory block has a size much bigger than KiB, there's a deviation with the range of (0, block_size). block_size may be 128MB or 256MB or etc., it differs on different OSes.
So thre's alot of questions about this feature that are unclear to me..
This appears to be entirely operating via guest agent commands. How does this then correspond to increased/decreased allocation in the host side QEMU ? What are the upper/lower bounds on adding/removing blocks. eg what prevents a malicous guest from asking for more memory to be added too itself than we wish to allow ? How is this better / worse than adjusting memory via the balloon driver ? How does this relate to the
There are two possibilities where this could be advantageous:
1) This could be better than ballooning (given that it would actually return the memory to the host, which it doesn't) since you probably will be able to disable memory regions in certain NUMA nodes which is not possible with the current balloon driver (memory is taken randomly).
2) The guest OS sometimes needs to enable the memory region after ACPI memory hotplug. The GA would be able to online such memory. For this option we don't need to go through a different API though since it can be compounded using a flag.
So, are you saying that we should not be adding this to the virDomainSetMemory API as done in this series, and we should instead be able to request automatic enabling/disabling of the regions when we do the original DIMM hotplug ?
Well, that's the only place where using the memory region GA apis would make sense for libvirt.
Whether we should do it is not that clear. Windows does online the regions automatically and I was told that some linux distros do it via udev rules.
What do we do in the case of hotunplug currently? Are we expecting the guest admin to have manually offlined the regions before doing hotunplug on the host? Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Tue, Jun 09, 2015 at 13:05:35 +0100, Daniel Berrange wrote:
On Tue, Jun 09, 2015 at 02:03:13PM +0200, Peter Krempa wrote:
On Tue, Jun 09, 2015 at 12:46:27 +0100, Daniel Berrange wrote:
On Tue, Jun 09, 2015 at 01:22:49PM +0200, Peter Krempa wrote:
On Tue, Jun 09, 2015 at 11:05:16 +0100, Daniel Berrange wrote:
On Tue, Jun 09, 2015 at 05:33:24PM +0800, Zhang Bo wrote:
...
2) The guest OS sometimes needs to enable the memory region after ACPI memory hotplug. The GA would be able to online such memory. For this option we don't need to go through a different API though since it can be compounded using a flag.
So, are you saying that we should not be adding this to the virDomainSetMemory API as done in this series, and we should instead be able to request automatic enabling/disabling of the regions when we do the original DIMM hotplug ?
Well, that's the only place where using the memory region GA apis would make sense for libvirt.
Whether we should do it is not that clear. Windows does online the regions automatically and I was told that some linux distros do it via udev rules.
What do we do in the case of hotunplug currently ? Are we expectig the guest admin to have manually offlined the regions before doing hotunplug on the host ?
You don't need to offline them prior to unplug. The guest OS handles that automatically when it receives the request.

On Tue, Jun 09, 2015 at 02:12:39PM +0200, Peter Krempa wrote:
On Tue, Jun 09, 2015 at 13:05:35 +0100, Daniel Berrange wrote:
On Tue, Jun 09, 2015 at 02:03:13PM +0200, Peter Krempa wrote:
On Tue, Jun 09, 2015 at 12:46:27 +0100, Daniel Berrange wrote:
On Tue, Jun 09, 2015 at 01:22:49PM +0200, Peter Krempa wrote:
On Tue, Jun 09, 2015 at 11:05:16 +0100, Daniel Berrange wrote:
On Tue, Jun 09, 2015 at 05:33:24PM +0800, Zhang Bo wrote:
...
2) The guest OS sometimes needs to enable the memory region after ACPI memory hotplug. The GA would be able to online such memory. For this option we don't need to go through a different API though since it can be compounded using a flag.
So, are you saying that we should not be adding this to the virDomainSetMemory API as done in this series, and we should instead be able to request automatic enabling/disabling of the regions when we do the original DIMM hotplug ?
Well, that's the only place where using the memory region GA apis would make sense for libvirt.
Whether we should do it is not that clear. Windows does online the regions automatically and I was told that some linux distros do it via udev rules.
What do we do in the case of hotunplug currently ? Are we expectig the guest admin to have manually offlined the regions before doing hotunplug on the host ?
You don't need to offline them prior to unplug. The guest OS handles that automatically when it receives the request.
Hmm, so if the guest can offline and online DIMMs automatically on hotplug/unplug, then I'm puzzled about what value this patch series really adds. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 2015/6/9 20:47, Daniel P. Berrange wrote:
On Tue, Jun 09, 2015 at 02:12:39PM +0200, Peter Krempa wrote:
On Tue, Jun 09, 2015 at 13:05:35 +0100, Daniel Berrange wrote:
On Tue, Jun 09, 2015 at 02:03:13PM +0200, Peter Krempa wrote:
On Tue, Jun 09, 2015 at 12:46:27 +0100, Daniel Berrange wrote:
On Tue, Jun 09, 2015 at 01:22:49PM +0200, Peter Krempa wrote:
On Tue, Jun 09, 2015 at 11:05:16 +0100, Daniel Berrange wrote:
On Tue, Jun 09, 2015 at 05:33:24PM +0800, Zhang Bo wrote:
...
2) The guest OS sometimes needs to enable the memory region after ACPI memory hotplug. The GA would be able to online such memory. For this option we don't need to go through a different API though since it can be compounded using a flag.
So, are you saying that we should not be adding this to the virDomainSetMemory API as done in this series, and we should instead be able to request automatic enabling/disabling of the regions when we do the original DIMM hotplug ?
Well, that's the only place where using the memory region GA apis would make sense for libvirt.
Whether we should do it is not that clear. Windows does online the regions automatically and I was told that some linux distros do it via udev rules.
What do we do in the case of hotunplug currently ? Are we expectig the guest admin to have manually offlined the regions before doing hotunplug on the host ?
You don't need to offline them prior to unplug. The guest OS handles that automatically when it receives the request.
Hmm, so if the guest can offline and online DIMMS automatically on hotplug/unplug, then I'm puzzelled what value this patch series really adds.
Regards, Daniel
Thank you for your reply. Before this patch, we needed to manually online memory blocks inside the guest after DIMM memory hotplug for most *nix OSes. (Windows guests bring their memory blocks online automatically after hotplugging.) That is to say, we need to LOGICALLY hotplug memory after the PHYSICAL hotplug; this patch series does the LOGICAL part. With this patch, we no longer need to get into the guest to manually online the blocks, which is often not even possible for host administrators. -- Oscar oscar.zhangbo@huawei.com
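For context, the manual step being avoided is, on a typical Linux guest, writing to sysfs for every block that comes up offline; roughly:

    # inside the guest; memory40 is an arbitrary example block
    echo online > /sys/devices/system/memory/memory40/state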

2015-06-10 5:28 GMT+03:00 zhang bo <oscar.zhangbo@huawei.com>:
Thank you for your reply. Before this patch, we needed to manually online memory blocks inside the guest, after dimm memory hotplug for most *nix OSes. (Windows guests automatically get their memory blocks online after hotplugging) That is to say, we need to LOGICALLY hotplug memory after PHYSICAL hotplug. This patch did the LOGICAL part. With this patch, we don't need to get into the guest to manually online them anymore, which is even impossible for most host administrators.
As I remember, this online step can easily be automated via udev rules. -- Vasiliy Tolstov, e-mail: v.tolstov@selfip.ru

On 2015/6/10 13:40, Vasiliy Tolstov wrote:
2015-06-10 5:28 GMT+03:00 zhang bo <oscar.zhangbo@huawei.com>:
Thank you for your reply. Before this patch, we needed to manually online memory blocks inside the guest, after dimm memory hotplug for most *nix OSes. (Windows guests automatically get their memory blocks online after hotplugging) That is to say, we need to LOGICALLY hotplug memory after PHYSICAL hotplug. This patch did the LOGICAL part. With this patch, we don't need to get into the guest to manually online them anymore, which is even impossible for most host administrators.
As i remember this online step easy can be automate via udev rules.
Logically that's true, but adding udev rules means: 1) you have to get into the guest, and 2) you have to be familiar with udev rules. That is not as convenient as just calling a libvirt API to do it. -- Oscar oscar.zhangbo@huawei.com

On Wed, Jun 10, 2015 at 02:05:16PM +0800, zhang bo wrote:
On 2015/6/10 13:40, Vasiliy Tolstov wrote:
2015-06-10 5:28 GMT+03:00 zhang bo <oscar.zhangbo@huawei.com>:
Thank you for your reply. Before this patch, we needed to manually online memory blocks inside the guest, after dimm memory hotplug for most *nix OSes. (Windows guests automatically get their memory blocks online after hotplugging) That is to say, we need to LOGICALLY hotplug memory after PHYSICAL hotplug. This patch did the LOGICAL part. With this patch, we don't need to get into the guest to manually online them anymore, which is even impossible for most host administrators.
As i remember this online step easy can be automate via udev rules.
Logically that's true, but adding udev rules means: 1 you have to get into the guest 2 you have to be familar with udev rules.
Not convenient enough compared to just calling libvirt API to do so.
The udev rules are really something the OS vendor should set up, so that it "just works". Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|
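For illustration, the vendor-shipped rule in question is commonly a one-liner along these lines (exact rule files vary between distros):

    SUBSYSTEM=="memory", ACTION=="add", ATTR{state}=="offline", ATTR{state}="online"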

2015-06-10 11:37 GMT+03:00 Daniel P. Berrange <berrange@redhat.com>:
The udev rules are really something the OS vendor should setup, so that it "just works"
I think so; vCPU hotplug is also covered by udev. Maybe we need something for hot-removing memory and CPUs, because in the guest they need to be offlined first. -- Vasiliy Tolstov, e-mail: v.tolstov@selfip.ru

On 2015/6/10 16:39, Vasiliy Tolstov wrote:
2015-06-10 11:37 GMT+03:00 Daniel P. Berrange <berrange@redhat.com>:
The udev rules are really something the OS vendor should setup, so that it "just works"
I think so, also for vcpu hotplug this also covered by udev. May be we need something to hot remove memory and cpu, because in guest we need offline firstly.
In fact, we also have a --guest option for the 'virsh setvcpus' command, which also uses qga commands to do the logical hotplug/unplug job, although udev rules seem to cover logical vCPU hotplug:

virsh # help setvcpus
  .........................
  --guest          modify cpu state in the guest

BTW: we haven't seen OSes with udev rules for memory hotplug events set up by vendors, and adding such rules means that we have to *interfere inside the guest*, which doesn't seem like a good option. -- Oscar oscar.zhangbo@huawei.com

On Wed, Jun 10, 2015 at 05:24:50PM +0800, zhang bo wrote:
On 2015/6/10 16:39, Vasiliy Tolstov wrote:
2015-06-10 11:37 GMT+03:00 Daniel P. Berrange <berrange@redhat.com>:
The udev rules are really something the OS vendor should setup, so that it "just works"
I think so, also for vcpu hotplug this also covered by udev. May be we need something to hot remove memory and cpu, because in guest we need offline firstly.
In fact ,we also have --guest option for 'virsh sevvcpus' command, which also uses qga commands to do the logical hotplug/unplug jobs, although udev rules seems to cover the vcpu logical hotplug issue.
virsh # help setvcpus ......................... --guest modify cpu state in the guest
BTW: we didn't see OSes with udev rules for memory-hotplug-event setted by vendors, and adding such rules means that we have to *interfere within the guest*, It seems not a good option.
I was suggesting that an RFE be filed with any vendor who doesn't do it to add this capability, not that we add udev rules ourselves. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On Wed, Jun 10, 2015 at 10:28:08AM +0100, Daniel P. Berrange wrote:
On Wed, Jun 10, 2015 at 05:24:50PM +0800, zhang bo wrote:
On 2015/6/10 16:39, Vasiliy Tolstov wrote:
2015-06-10 11:37 GMT+03:00 Daniel P. Berrange <berrange@redhat.com>:
The udev rules are really something the OS vendor should setup, so that it "just works"
I think so, also for vcpu hotplug this also covered by udev. May be we need something to hot remove memory and cpu, because in guest we need offline firstly.
In fact ,we also have --guest option for 'virsh sevvcpus' command, which also uses qga commands to do the logical hotplug/unplug jobs, although udev rules seems to cover the vcpu logical hotplug issue.
virsh # help setvcpus ......................... --guest modify cpu state in the guest
BTW: we didn't see OSes with udev rules for memory-hotplug-event setted by vendors, and adding such rules means that we have to *interfere within the guest*, It seems not a good option.
I was suggesting that an RFE be filed with any vendor who doesn't do it to add this capability, not that we add udev rules ourselves.
Or actually, it probably is sufficient to just send a patch to the upstream systemd project to add the desired rule to udev. That way all Linux distros will inherit the feature when they update to new udev. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 2015/6/10 17:31, Daniel P. Berrange wrote:
On Wed, Jun 10, 2015 at 10:28:08AM +0100, Daniel P. Berrange wrote:
On Wed, Jun 10, 2015 at 05:24:50PM +0800, zhang bo wrote:
On 2015/6/10 16:39, Vasiliy Tolstov wrote:
2015-06-10 11:37 GMT+03:00 Daniel P. Berrange <berrange@redhat.com>:
The udev rules are really something the OS vendor should setup, so that it "just works"
I think so, also for vcpu hotplug this also covered by udev. May be we need something to hot remove memory and cpu, because in guest we need offline firstly.
In fact ,we also have --guest option for 'virsh sevvcpus' command, which also uses qga commands to do the logical hotplug/unplug jobs, although udev rules seems to cover the vcpu logical hotplug issue.
virsh # help setvcpus ......................... --guest modify cpu state in the guest
BTW: we didn't see OSes with udev rules for memory-hotplug-event setted by vendors, and adding such rules means that we have to *interfere within the guest*, It seems not a good option.
I was suggesting that an RFE be filed with any vendor who doesn't do it to add this capability, not that we add udev rules ourselves.
Or actually, it probably is sufficient to just send a patch to the upstream systemd project to add the desired rule to udev. That way all Linux distros will inherit the feature when they update to new udev.
Then here comes the question: how do we deal with guests that are already in use? I think it's better to operate on them from the host side without getting into the guest. That's the advantage of the qemu-guest-agent; why not take advantage of it? -- Oscar oscar.zhangbo@huawei.com

On Thu, Jun 11, 2015 at 09:38:24 +0800, zhang bo wrote:
On 2015/6/10 17:31, Daniel P. Berrange wrote:
On Wed, Jun 10, 2015 at 10:28:08AM +0100, Daniel P. Berrange wrote:
On Wed, Jun 10, 2015 at 05:24:50PM +0800, zhang bo wrote:
On 2015/6/10 16:39, Vasiliy Tolstov wrote:
2015-06-10 11:37 GMT+03:00 Daniel P. Berrange <berrange@redhat.com>:
The udev rules are really something the OS vendor should setup, so that it "just works"
I think so, also for vcpu hotplug this also covered by udev. May be we need something to hot remove memory and cpu, because in guest we need offline firstly.
In fact ,we also have --guest option for 'virsh sevvcpus' command, which also uses qga commands to do the logical hotplug/unplug jobs, although udev rules seems to cover the vcpu logical hotplug issue.
virsh # help setvcpus ......................... --guest modify cpu state in the guest
BTW: we didn't see OSes with udev rules for memory-hotplug-event setted by vendors, and adding such rules means that we have to *interfere within the guest*, It seems not a good option.
I was suggesting that an RFE be filed with any vendor who doesn't do it to add this capability, not that we add udev rules ourselves.
Or actually, it probably is sufficient to just send a patch to the upstream systemd project to add the desired rule to udev. That way all Linux distros will inherit the feature when they update to new udev.
Then, here comes the question: how to deal with the guests that are already in use? I think it's better to operate them at the host side without getting into the guest. That's the advantage of qemu-guest-agent, why not take advantage of it?
Such guests would need an updated qemu-guest-agent anyway, and installing a new version of qemu-guest-agent is not any easier than installing an updated udev or a new udev rule. That is, I don't think the qemu-guest-agent way has any benefits over a udev rule. It's rather the opposite. Jirka

2015-06-11 11:42 GMT+03:00 Jiri Denemark <jdenemar@redhat.com>:
Such guests would need an update qemu-guest-agent anyway. And installing a new version of qemu-guest-agent is not any easier than installing an updated udev or a new udev rule. That is, I don't think the qemu-guest-agent way has any benefits over a udev rule. It's rather the opposite.
Maybe, as a workaround for old OSes, ship udev rules for CPU/memory hotplug along with qemu-ga? Then the udev rules do all the work, and packagers can enable/disable installing the rules. -- Vasiliy Tolstov, e-mail: v.tolstov@selfip.ru
participants (6)
- Daniel P. Berrange
- Jiri Denemark
- Peter Krempa
- Vasiliy Tolstov
- Zhang Bo
- zhang bo