On Tue, Feb 04, 2014 at 04:42:02PM +0000, Daniel P. Berrange wrote:
On Tue, Feb 04, 2014 at 11:32:47AM -0500, Marcelo Tosatti wrote:
>
> Add an element named "strict-hugepages" to control whether to
> refuse guest initialization in case hugepage allocation cannot
> be performed.
>
> Signed-off-by: Marcelo Tosatti <mtosatti(a)redhat.com>
>
> diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
> index ff50214..e79f5e6 100644
> --- a/docs/formatdomain.html.in
> +++ b/docs/formatdomain.html.in
> @@ -632,6 +632,9 @@
> <dt><code>hugepages</code></dt>
> <dd>This tells the hypervisor that the guest should have its memory
> allocated using hugepages instead of the normal native page
size.</dd>
> + <dt><code>strict-hugepages</code></dt>
> + <dd>This tells the hypervisor that the guest should refuse to start
> + in case of failure to allocate guest memory with hugepages</dd>
Huh, we already supply the -mem-prealloc command line flag alongside
the -mem-path flag which should cause QEMU to exit if it cannot allocate
all memory upfront.
-mem-path path
Allocate guest RAM from a temporarily created file in path.
-mem-prealloc
Preallocate memory when using -mem-path.
"should cause QEMU to exit if it cannot allocate all memory upfront" =>
no it does not. See more below.
> +/*
> + * Returns bool: whether to fail guest initialization.
> + *
> + */
> +static bool qemuValidateStrictHugepage(virDomainObjPtr vm, virQEMUDriverConfigPtr
cfg)
> +{
> + bool ret = false;
> + char **maps = NULL;
> + int i;
> + char *buf;
> +
> + if (!vm->def->mem.strict_hugepages)
> + return ret;
> +
> + ret = true;
> +
> + if (!vm->def->mem.hugepage_backed || !cfg->hugepagePath) {
> + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
> + _("strict huge pages depends on huge pages"));
> + return ret;
> + }
> +
> + buf = malloc(strlen(cfg->hugepagePath) + 50);
> +
> + /* The parser requires /proc/pid, which only exists on platforms
> + * like Linux where pid_t fits in int. */
> + if ((int) vm->pid != vm->pid ||
> + qemuParseProcFileStrings(vm->pid, "maps", &maps) < 0)
> + goto cleanup;
> +
> + for (i = 0; maps && maps[i]; i++) {
> + char *endptr;
> + unsigned long start, end;
> + const char *map = maps[i];
> + bool found = false;
> +
> + sprintf(buf, "%s/qemu_back_mem.pc.ram.", cfg->hugepagePath);
> + if (strstr(map,buf) != NULL)
> + found = true;
> +
> + sprintf(buf, "%s/kvm.", cfg->hugepagePath);
> + if (strstr(map,buf) != NULL)
> + found = true;
> +
> + if (!found)
> + continue;
> +
> + errno = 0;
> + start = strtol(map, &endptr, 16);
> + if ((errno == ERANGE && (start == LONG_MAX || start == LONG_MIN))
> + || (errno != 0 && start == 0)) {
> + continue;
> + }
> +
> + if (endptr && *endptr == '-')
> + endptr++;
> +
> + if (!*endptr)
> + continue;
> +
> + errno = 0;
> + end = strtol(endptr, NULL, 16);
> + if ((errno == ERANGE && (end == LONG_MAX || end == LONG_MIN))
> + || (errno != 0 && end == 0)) {
> + continue;
> + }
> +
> + if (end-start >= vm->def->mem.max_balloon * 1024) {
> + ret = false;
> + break;
> + }
> + }
This code is really very nasty. It is both Linux specific and exposes
Hugetlbfs is Linux specific. So all Hugetlbfs code in libvirt is already
Linux specific.
libvirt to the private implementation detail of QEMU's backing
file
Right. I am going to add a comment to QEMU saying libvirt interprets the
string (so that changing it breaks libvirt). Other tools already do
this.
naming. On top of which I don't see why we need this when
we're already
setting -mem-prealloc to enforce exactly this.
Because there is no guarantee with -mem-prealloc. For instance, if the
hugepage path is not actually hugetlbfs backed, QEMU falls back to
malloc().
A new element StrictHugePageSize=N has been requested, where it is
necessary to fail in case hugepagesize != N.