[libvirt] [PATCH] add option to enforce minimal pagesize for hugetlbfs backed guests

Require a minimal pagesize for hugetlbfs backed guests. Fail
guest initialization if hugetlbfs mount is configured with
smaller page size.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in
index fd02864..e28d182 100644
--- a/docs/formatdomain.html.in
+++ b/docs/formatdomain.html.in
@@ -632,6 +632,10 @@
       <dt><code>hugepages</code></dt>
       <dd>This tells the hypervisor that the guest should have its memory
       allocated using hugepages instead of the normal native page size.</dd>
+      <dt><code>pagesize</code></dt>
+      <dd>This tells the hypervisor that the guest should refuse to start
+      in case of failure to allocate guest memory with hugepages equal
+      to or larger than the specified size</dd>
       <dt><code>nosharepages</code></dt>
       <dd>Instructs hypervisor to disable shared pages (memory merge, KSM) for
         this domain. <span class="since">Since 1.0.6</span></dd>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 28e24f9..babb745 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -11274,6 +11274,10 @@ virDomainDefParseXML(xmlDocPtr xml,
                              &def->mem.swap_hard_limit, false) < 0)
         goto error;
 
+    if (virDomainParseMemory("./memoryBacking/hugepages/pagesize[1]", ctxt,
+                             &def->mem.page_size, false) < 0)
+        goto error;
+
     n = virXPathULong("string(./vcpu[1])", ctxt, &count);
     if (n == -2) {
         virReportError(VIR_ERR_XML_ERROR, "%s",
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index d8f2e49..03a900d 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -1984,6 +1984,7 @@ struct _virDomainDef {
         unsigned long long soft_limit; /* in kibibytes */
         unsigned long long min_guarantee; /* in kibibytes */
         unsigned long long swap_hard_limit; /* in kibibytes */
+        unsigned long long page_size; /* in kibibytes */
     } mem;
     unsigned short vcpus;
     unsigned short maxvcpus;
diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
index 8bcd98e..cd5e1c8 100644
--- a/src/qemu/qemu_process.c
+++ b/src/qemu/qemu_process.c
@@ -3570,6 +3570,33 @@ cleanup:
     return ret;
 }
 
+#ifdef __linux__
+
+#include <sys/vfs.h>
+
+#define HUGETLBFS_MAGIC       0x958458f6
+
+static long gethugepagesize(const char *path)
+{
+    struct statfs fs;
+    int ret;
+
+    do {
+        ret = statfs(path, &fs);
+    } while (ret != 0 && errno == EINTR);
+
+    if (ret != 0) {
+        perror(path);
+        return 0;
+    }
+
+    if (fs.f_type != HUGETLBFS_MAGIC)
+        return 0;
+
+    return fs.f_bsize;
+}
+#endif
+
 int qemuProcessStart(virConnectPtr conn,
                      virQEMUDriverPtr driver,
@@ -3712,6 +3739,31 @@ int qemuProcessStart(virConnectPtr conn,
                            "%s", _("Unable to set huge path in security driver"));
             goto cleanup;
         }
+
+        if (vm->def->mem.page_size) {
+#ifdef __linux__
+            unsigned long hpagesize = gethugepagesize(cfg->hugepagePath);
+
+            if (!hpagesize) {
+                virReportError(VIR_ERR_INTERNAL_ERROR,
+                               "%s", _("Unable to stat hugepage path"));
+                goto cleanup;
+            }
+
+            hpagesize /= 1024;
+
+            if (hpagesize < vm->def->mem.page_size) {
+                virReportError(VIR_ERR_INTERNAL_ERROR,
+                               _("Error: hugetlbfs page size=%ld < pagesize=%lld"),
+                               hpagesize, vm->def->mem.page_size);
+                goto cleanup;
+            }
+#else
+            virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+                           "%s", _("pagesize option unsupported"));
+            goto cleanup;
+#endif
+        }
     }
 
     /* Ensure no historical cgroup for this VM is lying around bogus
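
For readers who want to try the check outside libvirt, here is a minimal standalone sketch of the same idea: stat the hugepage mount, confirm it is hugetlbfs, and compare its block size (in KiB) against a requested minimum. The helper names `hugetlbfs_page_size_kib` and `page_size_meets_minimum` are hypothetical and do not appear in the patch. The sketch also assumes that treating "not hugetlbfs" and "stat failed" alike as a size of 0 is acceptable, which mirrors the patch's behaviour but may not suit every caller.

```c
#include <errno.h>

#ifdef __linux__
#include <sys/vfs.h>
#endif

/* Filesystem magic for hugetlbfs, same value as in the patch. */
#define HUGETLBFS_MAGIC 0x958458f6U

/* Hypothetical helper: return the hugetlbfs page size of `path` in KiB,
 * or 0 if the path cannot be stat'ed, is not a hugetlbfs mount, or the
 * platform is not Linux. */
static unsigned long long hugetlbfs_page_size_kib(const char *path)
{
#ifdef __linux__
    struct statfs fs;
    int ret;

    /* Retry if the call is interrupted by a signal. */
    do {
        ret = statfs(path, &fs);
    } while (ret != 0 && errno == EINTR);

    if (ret != 0)
        return 0;

    if ((unsigned int)fs.f_type != HUGETLBFS_MAGIC)
        return 0;

    /* f_bsize is in bytes; the domain definition stores KiB. */
    return (unsigned long long)fs.f_bsize / 1024;
#else
    (void)path;
    return 0;
#endif
}

/* Hypothetical helper: nonzero when a known page size (in KiB) is at
 * least the requested minimum (in KiB). A size of 0 means "unknown"
 * and never satisfies the check. */
static int page_size_meets_minimum(unsigned long long actual_kib,
                                   unsigned long long min_kib)
{
    return actual_kib != 0 && actual_kib >= min_kib;
}
```

A caller would pass the configured hugepage mount path to the first helper and the guest's requested minimum to the second, refusing to start the guest when the second returns 0.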

On Thu, Feb 06, 2014 at 11:48:51AM -0500, Marcelo Tosatti wrote:
> Require a minimal pagesize for hugetlbfs backed guests. Fail
> guest initialization if hugetlbfs mount is configured with
> smaller page size.
>
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
>
> +#ifdef __linux__
> +
> +#include <sys/vfs.h>
> +
> +#define HUGETLBFS_MAGIC       0x958458f6
> +
> +static long gethugepagesize(const char *path)
> +{
> +    struct statfs fs;
> +    int ret;
> +
> +    do {
> +        ret = statfs(path, &fs);
> +    } while (ret != 0 && errno == EINTR);
> +
> +    if (ret != 0) {
> +        perror(path);
> +        return 0;
> +    }
> +
> +    if (fs.f_type != HUGETLBFS_MAGIC)
> +        return 0;
> +
> +    return fs.f_bsize;
> +}
> +#endif
> +
>  int qemuProcessStart(virConnectPtr conn,
>                       virQEMUDriverPtr driver,
> @@ -3712,6 +3739,31 @@ int qemuProcessStart(virConnectPtr conn,
>                             "%s", _("Unable to set huge path in security driver"));
>              goto cleanup;
>          }
> +
> +        if (vm->def->mem.page_size) {
> +#ifdef __linux__
> +            unsigned long hpagesize = gethugepagesize(cfg->hugepagePath);
> +
> +            if (!hpagesize) {
> +                virReportError(VIR_ERR_INTERNAL_ERROR,
> +                               "%s", _("Unable to stat hugepage path"));
> +                goto cleanup;
> +            }
> +
> +            hpagesize /= 1024;
> +
> +            if (hpagesize < vm->def->mem.page_size) {
> +                virReportError(VIR_ERR_INTERNAL_ERROR,
> +                               _("Error: hugetlbfs page size=%ld < pagesize=%lld"),
> +                               hpagesize, vm->def->mem.page_size);
> +                goto cleanup;
> +            }
> +#else
> +            virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> +                           "%s", _("pagesize option unsupported"));
> +            goto cleanup;
> +#endif
> +        }
>      }

IMHO all of this code is something that belongs in QEMU, with libvirt
telling QEMU what min page size it wants via a CLI arg.

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|

On 11/02/2014 15:46, Daniel P. Berrange wrote:
> IMHO all of this code is something that belongs in QEMU, with libvirt
> telling QEMU what min page size it wants via a CLI arg.

I disagree. QEMU just gets a path to something that doesn't even have
to be a hugetlbfs mountpoint. QEMU hardly needs to know the page size;
it computes it in order to truncate the file size to an integer number
of pages, but if that's required it sounds like a workaround for a bug
in hugetlbfs.

In fact, I believe that even making a temporary file shouldn't belong
in QEMU, and instead libvirt should just pass a filename or a file
descriptor, but this would require changes in QEMU so it's left for
another day.

Paolo
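
To make the "integer number of pages" point concrete, here is a small sketch of sizing a backing file to a whole number of huge pages. It is not taken from QEMU. The function name `round_up_to_pages` is hypothetical, and the thread does not say whether the size is rounded up or down, so this sketch assumes rounding up, so that the guest never gets less memory than requested.

```c
#include <stdint.h>

/* Hypothetical helper: round `size` (bytes) up to the next multiple of
 * `page_size` (bytes). Returns 0 if `page_size` is 0 or if the rounded
 * value would not fit in 64 bits. */
static uint64_t round_up_to_pages(uint64_t size, uint64_t page_size)
{
    uint64_t rem;

    if (page_size == 0)
        return 0;

    rem = size % page_size;
    if (rem == 0)
        return size;            /* already a whole number of pages */

    if (size > UINT64_MAX - (page_size - rem))
        return 0;               /* rounding up would overflow */

    return size + (page_size - rem);
}
```

With 2 MiB pages, for example, a 3 MiB request becomes 4 MiB, while a request that is already a multiple of the page size is left unchanged.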
participants (3)
- Daniel P. Berrange
- Marcelo Tosatti
- Paolo Bonzini