On Tue, Feb 04, 2014 at 04:42:02PM +0000, Daniel P. Berrange wrote:
On Tue, Feb 04, 2014 at 11:32:47AM -0500, Marcelo Tosatti wrote:
>
> Add an element named "strict-hugepages" to control whether to
> refuse guest initialization in case hugepage allocation cannot
> be performed.
>
> Signed-off-by: Marcelo Tosatti <mtosatti(a)redhat.com>
>
> diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> index ff50214..e79f5e6 100644
> --- a/docs/formatdomain.html.in
> +++ b/docs/formatdomain.html.in
> @@ -632,6 +632,9 @@
> <dt><code>hugepages</code></dt>
> <dd>This tells the hypervisor that the guest should have its memory
> allocated using hugepages instead of the normal native page
size.</dd>
> + <dt><code>strict-hugepages</code></dt>
> + <dd>This tells the hypervisor that the guest should refuse to start
> + in case of failure to allocate guest memory with hugepages</dd>
Huh, we already supply the -mem-prealloc command line flag alongside
the -mem-path flag which should cause QEMU to exit if it cannot allocate
all memory upfront.
> +/*
> + * Returns bool: whether to fail guest initialization.
> + *
> + */
> +static bool qemuValidateStrictHugepage(virDomainObjPtr vm, virQEMUDriverConfigPtr
cfg)
> +{
> + bool ret = false;
> + char **maps = NULL;
> + int i;
> + char *buf;
> +
> + if (!vm->def->mem.strict_hugepages)
> + return ret;
> +
> + ret = true;
> +
> + if (!vm->def->mem.hugepage_backed || !cfg->hugepagePath) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> + _("strict huge pages depends on huge pages"));
> + return ret;
> + }
> +
> + buf = malloc(strlen(cfg->hugepagePath) + 50);
> +
> + /* The parser requires /proc/pid, which only exists on platforms
> + * like Linux where pid_t fits in int. */
> + if ((int) vm->pid != vm->pid ||
> + qemuParseProcFileStrings(vm->pid, "maps", &maps) < 0)
> + goto cleanup;
> +
> + for (i = 0; maps && maps[i]; i++) {
> + char *endptr;
> + unsigned long start, end;
> + const char *map = maps[i];
> + bool found = false;
> +
> + sprintf(buf, "%s/qemu_back_mem.pc.ram.", cfg->hugepagePath);
> + if (strstr(map,buf) != NULL)
> + found = true;
> +
> + sprintf(buf, "%s/kvm.", cfg->hugepagePath);
> + if (strstr(map,buf) != NULL)
> + found = true;
> +
> + if (!found)
> + continue;
> +
> + errno = 0;
> + start = strtol(map, &endptr, 16);
> + if ((errno == ERANGE && (start == LONG_MAX || start == LONG_MIN))
> + || (errno != 0 && start == 0)) {
> + continue;
> + }
> +
> + if (endptr && *endptr == '-')
> + endptr++;
> +
> + if (!*endptr)
> + continue;
> +
> + errno = 0;
> + end = strtol(endptr, NULL, 16);
> + if ((errno == ERANGE && (end == LONG_MAX || end == LONG_MIN))
> + || (errno != 0 && end == 0)) {
> + continue;
> + }
> +
> + if (end-start >= vm->def->mem.max_balloon * 1024) {
> + ret = false;
> + break;
> + }
> + }
This code is really very nasty. It is both Linux specific and exposes
libvirt to the private implementation detail of QEMU's backing file
naming. On top of which I don't see why we need this when we're already
setting -mem-prealloc to enforce exactly this.
BTW, you already do /proc parsing and already do hugetlbfs which is
Linux specific.
So if you can detail in what new way this code is nasty (accordingly to
your judgement above), i can look into improving it.