[libvirt] [PATCH 0/2 v3] use qemu's dump-guest-memory when vm uses host device

Currently, we use the migrate command to dump the guest's memory. The migrate command has one restriction: device state must be stored in qemu, because it has to be passed to the target machine. If we pass through a host device to the guest, the device state is stored in the real device, so the migrate command fails.

We usually dump when the guest has panicked, so there is no need to store device state in the vmcore.

qemu will have a new monitor command, dump-guest-memory, to dump guest memory, but it does not support async operation yet (it will once the common async API is implemented). So in this patchset I use dump-guest-memory only when the guest uses a host device.

Note: the patchset for qemu is still queued. Luiz has acked it, but he is waiting for an ACK from Jan and/or Anthony, who are too busy and have not replied yet.

Changes from v2 to v3:
1. qemu supports an fd that is associated with a pipe, socket, or FIFO, so pass a pipe fd to qemu; O_DIRECT can work now.

Changes from v1 to v2:
1. remove the implementation for text mode.

Wen Congyang (2):
  qemu: implement qemu's dump-guest-memory
  qemu: try to use qemu's dump-guest-memory when vm uses host device

 src/qemu/qemu_domain.c       |    1 +
 src/qemu/qemu_domain.h       |    1 +
 src/qemu/qemu_driver.c       |   39 +++++++++++++++++++++++++++++++++++++--
 src/qemu/qemu_monitor.c      |   36 ++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_monitor.h      |   13 +++++++++++++
 src/qemu/qemu_monitor_json.c |   42 ++++++++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_monitor_json.h |    7 +++++++
 7 files changed, 137 insertions(+), 2 deletions(-)

---
 src/qemu/qemu_monitor.c      |   36 ++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_monitor.h      |   13 +++++++++++++
 src/qemu/qemu_monitor_json.c |   42 ++++++++++++++++++++++++++++++++++++++++++
 src/qemu/qemu_monitor_json.h |    7 +++++++
 4 files changed, 98 insertions(+), 0 deletions(-)

diff --git a/src/qemu/qemu_monitor.c b/src/qemu/qemu_monitor.c
index e1a8d4c..81bdf07 100644
--- a/src/qemu/qemu_monitor.c
+++ b/src/qemu/qemu_monitor.c
@@ -2018,6 +2018,42 @@ int qemuMonitorMigrateCancel(qemuMonitorPtr mon)
     return ret;
 }
 
+/* Return 0 on success, -1 on failure, or -2 if not supported. */
+int qemuMonitorDumpToFd(qemuMonitorPtr mon,
+                        unsigned int flags,
+                        int fd,
+                        unsigned long long begin,
+                        unsigned long long length)
+{
+    int ret;
+    VIR_DEBUG("mon=%p fd=%d flags=%x begin=%llx length=%llx",
+              mon, fd, flags, begin, length);
+
+    if (!mon) {
+        qemuReportError(VIR_ERR_INVALID_ARG, "%s",
+                        _("monitor must not be NULL"));
+        return -1;
+    }
+
+    if (!mon->json)
+        /* dump-guest-memory is supported after qemu-1.0, and we always use json
+         * if qemu's version is >= 0.15. So if we use text mode, the qemu is
+         * old, and it does not support dump-guest-memory.
+         */
+        return -2;
+
+    if (qemuMonitorSendFileHandle(mon, "dump", fd) < 0)
+        return -1;
+
+    ret = qemuMonitorJSONDump(mon, flags, "fd:dump", begin, length);
+
+    if (ret < 0) {
+        if (qemuMonitorCloseFileHandle(mon, "dump") < 0)
+            VIR_WARN("failed to close dumping handle");
+    }
+
+    return ret;
+}
 
 int qemuMonitorGraphicsRelocate(qemuMonitorPtr mon,
                                 int type,
diff --git a/src/qemu/qemu_monitor.h b/src/qemu/qemu_monitor.h
index b480966..315cb9e 100644
--- a/src/qemu/qemu_monitor.h
+++ b/src/qemu/qemu_monitor.h
@@ -379,6 +379,19 @@ int qemuMonitorMigrateToUnix(qemuMonitorPtr mon,
 
 int qemuMonitorMigrateCancel(qemuMonitorPtr mon);
 
+typedef enum {
+    QEMU_MONITOR_DUMP_HAVE_FILTER = 1 << 0,
+    QEMU_MONITOR_DUMP_PAGING      = 1 << 1,
+    QEMU_MONITOR_DUMP_FLAGS_LAST
+} QEMU_MONITOR_DUMP;
+
+/* Return 0 on success, -1 on failure, or -2 if not supported. */
+int qemuMonitorDumpToFd(qemuMonitorPtr mon,
+                        unsigned int flags,
+                        int fd,
+                        unsigned long long begin,
+                        unsigned long long length);
+
 int qemuMonitorGraphicsRelocate(qemuMonitorPtr mon,
                                 int type,
                                 const char *hostname,
diff --git a/src/qemu/qemu_monitor_json.c b/src/qemu/qemu_monitor_json.c
index d09d779..6709abf 100644
--- a/src/qemu/qemu_monitor_json.c
+++ b/src/qemu/qemu_monitor_json.c
@@ -2419,6 +2419,48 @@ int qemuMonitorJSONMigrateCancel(qemuMonitorPtr mon)
     return ret;
 }
 
+/* Return 0 on success, -1 on failure, or -2 if not supported. */
+int qemuMonitorJSONDump(qemuMonitorPtr mon,
+                        unsigned int flags,
+                        const char *protocol,
+                        unsigned long long begin,
+                        unsigned long long length)
+{
+    int ret;
+    virJSONValuePtr cmd = NULL;
+    virJSONValuePtr reply = NULL;
+
+    if (flags & QEMU_MONITOR_DUMP_HAVE_FILTER)
+        cmd = qemuMonitorJSONMakeCommand("dump-guest-memory",
+                                         "b:paging", flags & QEMU_MONITOR_DUMP_PAGING ? 1 : 0,
+                                         "s:protocol", protocol,
+                                         "U:begin", begin,
+                                         "U:length", length,
+                                         NULL);
+    else
+        cmd = qemuMonitorJSONMakeCommand("dump-guest-memory",
+                                         "b:paging", flags & QEMU_MONITOR_DUMP_PAGING ? 1 : 0,
+                                         "s:protocol", protocol,
+                                         NULL);
+    if (!cmd)
+        return -1;
+
+    ret = qemuMonitorJSONCommand(mon, cmd, &reply);
+
+    if (ret == 0) {
+        if (qemuMonitorJSONHasError(reply, "CommandNotFound")) {
+            ret = -2;
+            goto cleanup;
+        }
+
+        ret = qemuMonitorJSONCheckError(cmd, reply);
+    }
+
+cleanup:
+    virJSONValueFree(cmd);
+    virJSONValueFree(reply);
+    return ret;
+}
 
 int qemuMonitorJSONGraphicsRelocate(qemuMonitorPtr mon,
                                     int type,
diff --git a/src/qemu/qemu_monitor_json.h b/src/qemu/qemu_monitor_json.h
index a0f67aa..a8b70d4 100644
--- a/src/qemu/qemu_monitor_json.h
+++ b/src/qemu/qemu_monitor_json.h
@@ -136,6 +136,13 @@ int qemuMonitorJSONMigrate(qemuMonitorPtr mon,
 
 int qemuMonitorJSONMigrateCancel(qemuMonitorPtr mon);
 
+/* Return 0 on success, -1 on failure, or -2 if not supported. */
+int qemuMonitorJSONDump(qemuMonitorPtr mon,
+                        unsigned int flags,
+                        const char *protocol,
+                        unsigned long long begin,
+                        unsigned long long length);
+
 int qemuMonitorJSONGraphicsRelocate(qemuMonitorPtr mon,
                                     int type,
                                     const char *hostname,
-- 
1.7.1
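
For readers who want to see what the JSON monitor code above ends up sending, here is a minimal sketch of the request text, assuming the usual QMP "execute"/"arguments" envelope. The helper format_dump_command() and the shortened flag names are hypothetical and not part of libvirt; the real code builds the request with qemuMonitorJSONMakeCommand(), and the exact bytes on the wire may differ (for example, libvirt adds its own bookkeeping fields).

```c
#include <assert.h>
#include <stdio.h>

/* Same bit positions as QEMU_MONITOR_DUMP_* in the patch; the short
 * names are illustrative only. */
enum {
    DUMP_HAVE_FILTER = 1 << 0, /* begin/length are meaningful */
    DUMP_PAGING      = 1 << 1  /* ask qemu to do guest paging */
};

/* Hypothetical helper: format the dump-guest-memory request that matches
 * the arguments qemuMonitorJSONDump() passes along. Assumes protocol
 * contains no characters that need JSON escaping.
 * Returns the number of characters written, or -1 if buf is too small. */
int format_dump_command(char *buf, size_t len, unsigned int flags,
                        const char *protocol,
                        unsigned long long begin,
                        unsigned long long length)
{
    const char *paging = (flags & DUMP_PAGING) ? "true" : "false";
    int n;

    assert(buf != NULL && protocol != NULL);

    if (flags & DUMP_HAVE_FILTER)
        /* begin/length are only sent when a filter was requested */
        n = snprintf(buf, len,
                     "{\"execute\":\"dump-guest-memory\",\"arguments\":"
                     "{\"paging\":%s,\"protocol\":\"%s\","
                     "\"begin\":%llu,\"length\":%llu}}",
                     paging, protocol, begin, length);
    else
        n = snprintf(buf, len,
                     "{\"execute\":\"dump-guest-memory\",\"arguments\":"
                     "{\"paging\":%s,\"protocol\":\"%s\"}}",
                     paging, protocol);

    return (n < 0 || (size_t)n >= len) ? -1 : n;
}
```

With flags = 0 and protocol "fd:dump", which is how qemuMonitorDumpToFd() is called from the second patch, the sketch produces a request with paging set to false and no begin/length members.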

---
 src/qemu/qemu_domain.c |    1 +
 src/qemu/qemu_domain.h |    1 +
 src/qemu/qemu_driver.c |   39 +++++++++++++++++++++++++++++++++++++--
 3 files changed, 39 insertions(+), 2 deletions(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 69d9e6e..e3a668a 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -158,6 +158,7 @@ qemuDomainObjResetAsyncJob(qemuDomainObjPrivatePtr priv)
     job->phase = 0;
     job->mask = DEFAULT_JOB_MASK;
     job->start = 0;
+    job->qemu_dump = false;
     memset(&job->info, 0, sizeof(job->info));
 }
 
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index adccfed..f1ab0e6 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -97,6 +97,7 @@ struct qemuDomainJobObj {
     int phase;                          /* Job phase (mainly for migrations) */
     unsigned long long mask;            /* Jobs allowed during async job */
     unsigned long long start;           /* When the async job started */
+    bool qemu_dump;                     /* use qemu dump to do dump */
     virDomainJobInfo info;              /* Async job progress data */
 };
 
diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index d9e35be..651e9b8 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -2953,6 +2953,31 @@ cleanup:
     return ret;
 }
 
+/* Return 0 on success, -1 on failure, or -2 if not supported. */
+static int qemuDumpToFd(struct qemud_driver *driver, virDomainObjPtr vm,
+                        int fd, enum qemuDomainAsyncJob asyncJob)
+{
+    qemuDomainObjPrivatePtr priv = vm->privateData;
+    int ret = -1;
+
+    if (!qemuCapsGet(priv->qemuCaps, QEMU_CAPS_MIGRATE_QEMU_FD))
+        return -2;
+
+    if (virSecurityManagerSetImageFDLabel(driver->securityManager, vm->def,
+                                          fd) < 0)
+        return -1;
+
+    priv->job.qemu_dump = true;
+
+    if (qemuDomainObjEnterMonitorAsync(driver, vm, asyncJob) < 0)
+        return -1;
+
+    ret = qemuMonitorDumpToFd(priv->mon, 0, fd, 0, 0);
+    qemuDomainObjExitMonitorWithDriver(driver, vm);
+
+    return ret;
+}
+
 static int
 doCoreDump(struct qemud_driver *driver,
            virDomainObjPtr vm,
@@ -2987,6 +3012,16 @@ doCoreDump(struct qemud_driver *driver,
     if (!(wrapperFd = virFileWrapperFdNew(&fd, path, flags)))
         goto cleanup;
 
+    if (vm->def->nhostdevs > 0) {
+        /*
+         * If the guest uses host devices, migrate command will fail. So we
+         * should use dump command.
+         */
+        ret = qemuDumpToFd(driver, vm, fd, QEMU_ASYNC_JOB_DUMP);
+        if (ret != -2)
+            goto cleanup;
+    }
+
     if (qemuMigrationToFile(driver, vm, fd, 0, path,
                             qemuCompressProgramName(compress), false,
                             QEMU_ASYNC_JOB_DUMP) < 0)
@@ -9332,7 +9367,7 @@ static int qemuDomainGetJobInfo(virDomainPtr dom,
     priv = vm->privateData;
 
     if (virDomainObjIsActive(vm)) {
-        if (priv->job.asyncJob) {
+        if (priv->job.asyncJob && !priv->job.qemu_dump) {
             memcpy(info, &priv->job.info, sizeof(*info));
 
             /* Refresh elapsed time again just to ensure it
@@ -9390,7 +9425,7 @@ static int qemuDomainAbortJob(virDomainPtr dom) {
 
     priv = vm->privateData;
 
-    if (!priv->job.asyncJob) {
+    if (!priv->job.asyncJob || priv->job.qemu_dump) {
         qemuReportError(VIR_ERR_OPERATION_INVALID, "%s",
                         _("no job is active on the domain"));
         goto endjob;
-- 
1.7.1
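
The control flow that this patch adds to doCoreDump() relies on the 0 / -1 / -2 return convention: only "not supported" (-2) falls through to the migrate path. A minimal sketch of that decision follows, assuming both dump paths can be modelled as callbacks that take an fd. The names core_dump, dump_fn and the stub_* functions are hypothetical stand-ins, not libvirt code.

```c
#include <assert.h>
#include <stddef.h>

/* Return-code convention used throughout the patch. */
enum { DUMP_OK = 0, DUMP_FAILED = -1, DUMP_UNSUPPORTED = -2 };

typedef int (*dump_fn)(int fd);

/* Hypothetical stand-in for the branch added to doCoreDump():
 * with host devices present, try the memory-only dump first and fall
 * back to migrate-to-file only when it reports "not supported". */
int core_dump(size_t nhostdevs, int fd,
              dump_fn memory_dump, dump_fn migrate_to_file)
{
    assert(memory_dump != NULL && migrate_to_file != NULL);

    if (nhostdevs > 0) {
        int ret = memory_dump(fd);
        if (ret != DUMP_UNSUPPORTED)
            return ret; /* success or hard failure: no fallback */
    }

    return migrate_to_file(fd) < 0 ? DUMP_FAILED : DUMP_OK;
}

/* Stubs that stand in for the real dump paths. */
int stub_ok(int fd)          { (void)fd; return DUMP_OK; }
int stub_fail(int fd)        { (void)fd; return DUMP_FAILED; }
int stub_unsupported(int fd) { (void)fd; return DUMP_UNSUPPORTED; }
```

Note that a hard failure (-1) from the memory-only dump is reported as-is rather than retried through migrate, which matches the `if (ret != -2) goto cleanup;` test in the patch.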

On Thu, Apr 19, 2012 at 09:03:16AM +0800, Wen Congyang wrote:
> Currently, we use the migrate command to dump the guest's memory. The migrate command has one restriction: device state must be stored in qemu, because it has to be passed to the target machine.
> If we pass through a host device to the guest, the device state is stored in the real device, so the migrate command fails.
> We usually dump when the guest has panicked, so there is no need to store device state in the vmcore.
> qemu will have a new monitor command, dump-guest-memory, to dump guest memory, but it does not support async operation yet (it will once the common async API is implemented).
> So in this patchset I use dump-guest-memory only when the guest uses a host device.

Hmm, doesn't the new command generate dump files in a totally different format to the existing command? If so, I don't think it is nice behaviour to silently switch between 2 different dump formats depending on the XML config.

I think we'd want some kind of flag to specify what format is desired.

Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

At 04/19/2012 04:38 PM, Daniel P. Berrange Wrote:
> On Thu, Apr 19, 2012 at 09:03:16AM +0800, Wen Congyang wrote:
>> Currently, we use the migrate command to dump the guest's memory. The migrate command has one restriction: device state must be stored in qemu, because it has to be passed to the target machine.
>> If we pass through a host device to the guest, the device state is stored in the real device, so the migrate command fails.
>> We usually dump when the guest has panicked, so there is no need to store device state in the vmcore.
>> qemu will have a new monitor command, dump-guest-memory, to dump guest memory, but it does not support async operation yet (it will once the common async API is implemented).
>> So in this patchset I use dump-guest-memory only when the guest uses a host device.
>
> Hmm, doesn't the new command generate dump files in a totally different format to the existing command? If so, I don't think it is nice behaviour to silently switch between 2 different dump formats depending on the XML config. I think we'd want some kind of flag to specify what format is desired.

I agree. But the new command is not an async command, and it will block other operations. So I only use it when the vm uses a host device (the migrate command will fail in this case).

The new command will be converted to an async command later (after the async API is implemented). I think we can allow the user to specify the format once the new command is async.

Thanks
Wen Congyang

> Daniel

On 04/19/2012 02:45 AM, Wen Congyang wrote:
> At 04/19/2012 04:38 PM, Daniel P. Berrange Wrote:
>> On Thu, Apr 19, 2012 at 09:03:16AM +0800, Wen Congyang wrote:
>>> Currently, we use the migrate command to dump the guest's memory. The migrate command has one restriction: device state must be stored in qemu, because it has to be passed to the target machine.
>>> If we pass through a host device to the guest, the device state is stored in the real device, so the migrate command fails.
>>> We usually dump when the guest has panicked, so there is no need to store device state in the vmcore.
>>> qemu will have a new monitor command, dump-guest-memory, to dump guest memory, but it does not support async operation yet (it will once the common async API is implemented).
>>> So in this patchset I use dump-guest-memory only when the guest uses a host device.
>>
>> Hmm, doesn't the new command generate dump files in a totally different format to the existing command? If so, I don't think it is nice behaviour to silently switch between 2 different dump formats depending on the XML config. I think we'd want some kind of flag to specify what format is desired.
>
> I agree. But the new command is not an async command, and it will block other operations. So I only use it when the vm uses a host device (the migrate command will fail in this case).
>
> The new command will be converted to an async command later (after the async API is implemented). I think we can allow the user to specify the format once the new command is async.

You missed the point. We need 2 flags - one now that says whether to dump via migrate-to-file or via dumping memory (I'd name it VIR_DOMAIN_CORE_DUMP_MEMORY_ONLY), and another flag later to state whether to block the dump or whether to do the dump asynchronously (similar to the recent block-job-cancel conversion, I'd name it VIR_DOMAIN_CORE_DUMP_ASYNC).

In other words, the user should be able to choose which format they get, for 3 of the 4 combinations of user's XML vs. available qemu commands:

                      hostdev              no hostdev
flag = 0              error                migrate-to-file
flag = MEMORY_ONLY    dump-guest-memory    dump-guest-memory

and the choice of async or blocking is orthogonal to the above choice.

-- 
Eric Blake   eblake@redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org
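
The table in Eric's reply can be read as a small decision function. Below is a minimal sketch under stated assumptions: the flag name follows the one proposed in the mail, but its numeric value, the dump_method enum, the domain_def struct and choose_dump_method() are all hypothetical and only illustrate the proposed behaviour. The async/blocking choice is deliberately left out, since the mail treats it as orthogonal.

```c
#include <assert.h>
#include <stddef.h>

/* Flag name as proposed in the mail; the value is a placeholder. */
#define VIR_DOMAIN_CORE_DUMP_MEMORY_ONLY (1u << 0)

typedef enum {
    METHOD_ERROR,            /* no usable dump method for this request */
    METHOD_MIGRATE_TO_FILE,  /* existing migrate-based dump */
    METHOD_DUMP_GUEST_MEMORY /* qemu's dump-guest-memory command */
} dump_method;

/* Hypothetical, trimmed-down stand-in for the domain definition. */
struct domain_def {
    size_t nhostdevs; /* number of passed-through host devices */
};

/* Encode the table: the caller's flag picks the format; host devices
 * only matter when the migrate-based format was requested. */
dump_method choose_dump_method(const struct domain_def *def,
                               unsigned int flags)
{
    assert(def != NULL);

    if (flags & VIR_DOMAIN_CORE_DUMP_MEMORY_ONLY)
        return METHOD_DUMP_GUEST_MEMORY;

    return def->nhostdevs > 0 ? METHOD_ERROR : METHOD_MIGRATE_TO_FILE;
}
```

The point of this shape is that the output format depends only on what the caller asked for, never silently on the domain XML; a guest with host devices and flag = 0 gets an error instead of a different file format.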
participants (3)
- Daniel P. Berrange
- Eric Blake
- Wen Congyang