
On Tue, Feb 04, 2014 at 04:42:02PM +0000, Daniel P. Berrange wrote:
On Tue, Feb 04, 2014 at 11:32:47AM -0500, Marcelo Tosatti wrote:
Add an element named "strict-hugepages" to control whether to refuse guest initialization in case hugepage allocation cannot be performed.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index ff50214..e79f5e6 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -632,6 +632,9 @@ <dt><code>hugepages</code></dt> <dd>This tells the hypervisor that the guest should have its memory allocated using hugepages instead of the normal native page size.</dd> + <dt><code>strict-hugepages</code></dt> + <dd>This tells the hypervisor that the guest should refuse to start + in case of failure to allocate guest memory with hugepages</dd>
Huh, we already supply the -mem-prealloc command line flag alongside the -mem-path flag which should cause QEMU to exit if it cannot allocate all memory upfront.
-mem-path path Allocate guest RAM from a temporarily created file in path. -mem-prealloc Preallocate memory when using -mem-path. "should cause QEMU to exit if it cannot allocate all memory upfront" => no it does not. See more below.
+/* + * Returns bool: whether to fail guest initialization. + * + */ +static bool qemuValidateStrictHugepage(virDomainObjPtr vm, virQEMUDriverConfigPtr cfg) +{ + bool ret = false; + char **maps = NULL; + int i; + char *buf; + + if (!vm->def->mem.strict_hugepages) + return ret; + + ret = true; + + if (!vm->def->mem.hugepage_backed || !cfg->hugepagePath) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("strict huge pages depends on huge pages")); + return ret; + } + + buf = malloc(strlen(cfg->hugepagePath) + 50); + + /* The parser requires /proc/pid, which only exists on platforms + * like Linux where pid_t fits in int. */ + if ((int) vm->pid != vm->pid || + qemuParseProcFileStrings(vm->pid, "maps", &maps) < 0) + goto cleanup; + + for (i = 0; maps && maps[i]; i++) { + char *endptr; + unsigned long start, end; + const char *map = maps[i]; + bool found = false; + + sprintf(buf, "%s/qemu_back_mem.pc.ram.", cfg->hugepagePath); + if (strstr(map,buf) != NULL) + found = true; + + sprintf(buf, "%s/kvm.", cfg->hugepagePath); + if (strstr(map,buf) != NULL) + found = true; + + if (!found) + continue; + + errno = 0; + start = strtol(map, &endptr, 16); + if ((errno == ERANGE && (start == LONG_MAX || start == LONG_MIN)) + || (errno != 0 && start == 0)) { + continue; + } + + if (endptr && *endptr == '-') + endptr++; + + if (!*endptr) + continue; + + errno = 0; + end = strtol(endptr, NULL, 16); + if ((errno == ERANGE && (end == LONG_MAX || end == LONG_MIN)) + || (errno != 0 && end == 0)) { + continue; + } + + if (end-start >= vm->def->mem.max_balloon * 1024) { + ret = false; + break; + } + }
This code is really very nasty. It is both Linux specific and exposes
Hugetlbfs is Linux specific. So all Hugetlbfs code in libvirt is already Linux specific.
libvirt to the private implementation detail of QEMU's backing file
Right. I am going to add a comment to QEMU saying libvirt interprets the string (so that changing it breaks libvirt). Other tools already do this.
naming. On top of which I don't see why we need this when we're already setting -mem-prealloc to enforce exactly this.
Because there is no guarantee with -mem-prealloc. For instance, if the hugepage path is not actually hugetlbfs backed, QEMU falls back to malloc(). A new element StrictHugePageSize=N has been requested, where it is necessary to fail in case hugepagesize != N.