
On Tue, Feb 04, 2014 at 04:42:02PM +0000, Daniel P. Berrange wrote:
On Tue, Feb 04, 2014 at 11:32:47AM -0500, Marcelo Tosatti wrote:
Add an element named "strict-hugepages" to control whether to refuse guest initialization in case hugepage allocation cannot be performed.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index ff50214..e79f5e6 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -632,6 +632,9 @@ <dt><code>hugepages</code></dt> <dd>This tells the hypervisor that the guest should have its memory allocated using hugepages instead of the normal native page size.</dd> + <dt><code>strict-hugepages</code></dt> + <dd>This tells the hypervisor that the guest should refuse to start + in case of failure to allocate guest memory with hugepages</dd>
Huh, we already supply the -mem-prealloc command line flag alongside the -mem-path flag which should cause QEMU to exit if it cannot allocate all memory upfront.
+/* + * Returns bool: whether to fail guest initialization. + * + */ +static bool qemuValidateStrictHugepage(virDomainObjPtr vm, virQEMUDriverConfigPtr cfg) +{ + bool ret = false; + char **maps = NULL; + int i; + char *buf; + + if (!vm->def->mem.strict_hugepages) + return ret; + + ret = true; + + if (!vm->def->mem.hugepage_backed || !cfg->hugepagePath) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("strict huge pages depends on huge pages")); + return ret; + } + + buf = malloc(strlen(cfg->hugepagePath) + 50); + + /* The parser requires /proc/pid, which only exists on platforms + * like Linux where pid_t fits in int. */ + if ((int) vm->pid != vm->pid || + qemuParseProcFileStrings(vm->pid, "maps", &maps) < 0) + goto cleanup; + + for (i = 0; maps && maps[i]; i++) { + char *endptr; + unsigned long start, end; + const char *map = maps[i]; + bool found = false; + + sprintf(buf, "%s/qemu_back_mem.pc.ram.", cfg->hugepagePath); + if (strstr(map,buf) != NULL) + found = true; + + sprintf(buf, "%s/kvm.", cfg->hugepagePath); + if (strstr(map,buf) != NULL) + found = true; + + if (!found) + continue; + + errno = 0; + start = strtol(map, &endptr, 16); + if ((errno == ERANGE && (start == LONG_MAX || start == LONG_MIN)) + || (errno != 0 && start == 0)) { + continue; + } + + if (endptr && *endptr == '-') + endptr++; + + if (!*endptr) + continue; + + errno = 0; + end = strtol(endptr, NULL, 16); + if ((errno == ERANGE && (end == LONG_MAX || end == LONG_MIN)) + || (errno != 0 && end == 0)) { + continue; + } + + if (end-start >= vm->def->mem.max_balloon * 1024) { + ret = false; + break; + } + }
This code is really very nasty. It is both Linux specific and exposes libvirt to the private implementation detail of QEMU's backing file naming. On top of which I don't see why we need this when we're already setting -mem-prealloc to enforce exactly this.
BTW, you already do /proc parsing and already do hugetlbfs which is Linux specific. So if you can detail in what new way this code is nasty (accordingly to your judgement above), i can look into improving it.