[libvirt PATCHv3 00/12] add virtiofs support (virtio-fs epopee)

https://bugzilla.redhat.com/show_bug.cgi?id=1694166 v1: https://www.redhat.com/archives/libvir-list/2019-November/msg00005.html v2: https://www.redhat.com/archives/libvir-list/2020-January/msg00980.html new in v3: * renamed qemu.conf option * removed cache-size since it was not yet merged in upstream QEMU * use XPath for XML parsing * separated virtiofsd options under the <binary> element [0] * the binary path is now autodetected from vhost-user schemas * log virtiofsd output into a file instead of syslog [0] naming is hard Ján Tomko (12): qemuExtDevicesStart: pass logManager schema: wrap fsDriver in a choice group qemu: add QEMU_CAPS_VHOST_USER_FS docs: add virtiofs kbase conf: qemu: add virtiofs fsdriver type conf: add virtiofs-related elements and attributes qemu: add virtiofsd_debug to qemu.conf qemu: validate virtiofs filesystems qemu: forbid migration with vhost-user-fs device qemu: add code for handling virtiofsd qemu: use the vhost-user schemas to find binary qemu: build vhost-user-fs device command line docs/formatdomain.html.in | 35 +- docs/kbase.html.in | 3 + docs/kbase/virtiofs.rst | 152 +++++++++ docs/schemas/domaincommon.rng | 88 ++++- po/POTFILES.in | 1 + src/conf/domain_conf.c | 108 ++++++- src/conf/domain_conf.h | 16 + src/libvirt_private.syms | 1 + src/qemu/Makefile.inc.am | 2 + src/qemu/libvirtd_qemu.aug | 1 + src/qemu/qemu.conf | 7 + src/qemu/qemu_capabilities.c | 2 + src/qemu/qemu_capabilities.h | 1 + src/qemu/qemu_command.c | 47 ++- src/qemu/qemu_conf.c | 2 + src/qemu/qemu_conf.h | 1 + src/qemu/qemu_domain.c | 33 +- src/qemu/qemu_domain.h | 2 +- src/qemu/qemu_domain_address.c | 4 + src/qemu/qemu_extdevice.c | 28 ++ src/qemu/qemu_extdevice.h | 1 + src/qemu/qemu_migration.c | 10 + src/qemu/qemu_process.c | 4 +- src/qemu/qemu_vhost_user.c | 40 +++ src/qemu/qemu_vhost_user.h | 4 + src/qemu/qemu_virtiofs.c | 302 ++++++++++++++++++ src/qemu/qemu_virtiofs.h | 42 +++ src/qemu/test_libvirtd_qemu.aug.in | 1 + .../caps_4.2.0.aarch64.xml | 1 + .../qemucapabilitiesdata/caps_4.2.0.s390x.xml | 1 + .../caps_4.2.0.x86_64.xml | 1 + .../caps_5.0.0.x86_64.xml | 1 + ...vhost-user-fs-fd-memory.x86_64-latest.args | 39 +++ .../vhost-user-fs-fd-memory.xml | 43 +++ ...vhost-user-fs-hugepages.x86_64-latest.args | 47 +++ .../vhost-user-fs-hugepages.xml | 75 +++++ tests/qemuxml2argvtest.c | 14 + .../vhost-user-fs-fd-memory.x86_64-latest.xml | 1 + .../vhost-user-fs-hugepages.x86_64-latest.xml | 1 + tests/qemuxml2xmltest.c | 3 + 40 files changed, 1144 insertions(+), 21 deletions(-) create mode 100644 docs/kbase/virtiofs.rst create mode 100644 src/qemu/qemu_virtiofs.c create mode 100644 src/qemu/qemu_virtiofs.h create mode 100644 tests/qemuxml2argvdata/vhost-user-fs-fd-memory.x86_64-latest.args create mode 100644 tests/qemuxml2argvdata/vhost-user-fs-fd-memory.xml create mode 100644 tests/qemuxml2argvdata/vhost-user-fs-hugepages.x86_64-latest.args create mode 100644 tests/qemuxml2argvdata/vhost-user-fs-hugepages.xml create mode 120000 tests/qemuxml2xmloutdata/vhost-user-fs-fd-memory.x86_64-latest.xml create mode 120000 tests/qemuxml2xmloutdata/vhost-user-fs-hugepages.x86_64-latest.xml -- 2.21.0

--- src/qemu/qemu_extdevice.c | 1 + src/qemu/qemu_extdevice.h | 1 + src/qemu/qemu_process.c | 4 +++- 3 files changed, 5 insertions(+), 1 deletion(-) diff --git a/src/qemu/qemu_extdevice.c b/src/qemu/qemu_extdevice.c index 9c0c0fd573..7f3bb104d9 100644 --- a/src/qemu/qemu_extdevice.c +++ b/src/qemu/qemu_extdevice.c @@ -153,6 +153,7 @@ qemuExtDevicesCleanupHost(virQEMUDriverPtr driver, int qemuExtDevicesStart(virQEMUDriverPtr driver, virDomainObjPtr vm, + virLogManagerPtr logManager G_GNUC_UNUSED, bool incomingMigration) { virDomainDefPtr def = vm->def; diff --git a/src/qemu/qemu_extdevice.h b/src/qemu/qemu_extdevice.h index 5cf777ab4b..df29968e16 100644 --- a/src/qemu/qemu_extdevice.h +++ b/src/qemu/qemu_extdevice.h @@ -46,6 +46,7 @@ void qemuExtDevicesCleanupHost(virQEMUDriverPtr driver, int qemuExtDevicesStart(virQEMUDriverPtr driver, virDomainObjPtr vm, + virLogManagerPtr logManager, bool incomingMigration) ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) G_GNUC_WARN_UNUSED_RESULT; diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c index 57a60c568a..4d1d9f7669 100644 --- a/src/qemu/qemu_process.c +++ b/src/qemu/qemu_process.c @@ -6712,7 +6712,9 @@ qemuProcessLaunch(virConnectPtr conn, if (qemuProcessGenID(vm, flags) < 0) goto cleanup; - if (qemuExtDevicesStart(driver, vm, incoming != NULL) < 0) + if (qemuExtDevicesStart(driver, vm, + qemuDomainLogContextGetManager(logCtxt), + incoming != NULL) < 0) goto cleanup; VIR_DEBUG("Building emulator command line"); -- 2.21.0

On Thu, Jan 30, 2020 at 06:06:17PM +0100, Ján Tomko wrote: Missing commit message - signed-off-by at least
--- src/qemu/qemu_extdevice.c | 1 + src/qemu/qemu_extdevice.h | 1 + src/qemu/qemu_process.c | 4 +++- 3 files changed, 5 insertions(+), 1 deletion(-)
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Allow adding new groups without changing indentation. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> --- docs/schemas/domaincommon.rng | 50 +++++++++++++++++++---------------- 1 file changed, 27 insertions(+), 23 deletions(-) diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index ea237a05e5..51dd75b520 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -2600,29 +2600,33 @@ for this kind of info, and 'type' for the storage format. We need the latter too, so had to invent a new attribute name --> - <optional> - <attribute name="type"> - <choice> - <value>path</value> - <value>handle</value> - <value>loop</value> - <value>nbd</value> - <value>ploop</value> - </choice> - </attribute> - </optional> - <optional> - <attribute name="format"> - <ref name="storageFormat"/> - </attribute> - </optional> - <optional> - <attribute name="wrpolicy"> - <value>immediate</value> - </attribute> - </optional> - <ref name='virtioOptions'/> - <empty/> + <choice> + <group> + <optional> + <attribute name="type"> + <choice> + <value>path</value> + <value>handle</value> + <value>loop</value> + <value>nbd</value> + <value>ploop</value> + </choice> + </attribute> + </optional> + <optional> + <attribute name="format"> + <ref name="storageFormat"/> + </attribute> + </optional> + <optional> + <attribute name="wrpolicy"> + <value>immediate</value> + </attribute> + </optional> + <ref name='virtioOptions'/> + </group> + <empty/> + </choice> </element> </define> -- 2.21.0

On Thu, Jan 30, 2020 at 06:06:18PM +0100, Ján Tomko wrote:
Allow adding new groups without changing indentation.
Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> --- docs/schemas/domaincommon.rng | 50 +++++++++++++++++++---------------- 1 file changed, 27 insertions(+), 23 deletions(-)
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Introduced by QEMU commit 98fc1ada4cf70af0f1df1a2d7183cf786fc7da05 virtio: add vhost-user-fs base device Released in QEMU v4.2.0. Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> --- src/qemu/qemu_capabilities.c | 2 ++ src/qemu/qemu_capabilities.h | 1 + tests/qemucapabilitiesdata/caps_4.2.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_4.2.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_4.2.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_5.0.0.x86_64.xml | 1 + 6 files changed, 7 insertions(+) diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c index 8b7fde2bcc..cdc609a01f 100644 --- a/src/qemu/qemu_capabilities.c +++ b/src/qemu/qemu_capabilities.c @@ -559,6 +559,7 @@ VIR_ENUM_IMPL(virQEMUCaps, "i8042", "rng-builtin", "virtio-net.failover", + "vhost-user-fs", ); @@ -1275,6 +1276,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = { { "max-arm-cpu", QEMU_CAPS_ARM_MAX_CPU }, { "i8042", QEMU_CAPS_DEVICE_I8042 }, { "rng-builtin", QEMU_CAPS_OBJECT_RNG_BUILTIN }, + { "vhost-user-fs-device", QEMU_CAPS_DEVICE_VHOST_USER_FS }, }; static struct virQEMUCapsStringFlags virQEMUCapsDevicePropsVirtioBalloon[] = { diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h index d86f54a481..ad9a56ff89 100644 --- a/src/qemu/qemu_capabilities.h +++ b/src/qemu/qemu_capabilities.h @@ -540,6 +540,7 @@ typedef enum { /* virQEMUCapsFlags grouping marker for syntax-check */ QEMU_CAPS_DEVICE_I8042, /* PS/2 controller */ QEMU_CAPS_OBJECT_RNG_BUILTIN, /* -object rng-builtin */ QEMU_CAPS_VIRTIO_NET_FAILOVER, /* virtio-net-*.failover */ + QEMU_CAPS_DEVICE_VHOST_USER_FS, /* -device vhost-user-fs */ QEMU_CAPS_LAST /* this must always be the last item */ } virQEMUCapsFlags; diff --git a/tests/qemucapabilitiesdata/caps_4.2.0.aarch64.xml b/tests/qemucapabilitiesdata/caps_4.2.0.aarch64.xml index d0faa4c471..640ce29c8c 100644 --- a/tests/qemucapabilitiesdata/caps_4.2.0.aarch64.xml +++ b/tests/qemucapabilitiesdata/caps_4.2.0.aarch64.xml @@ -178,6 +178,7 @@ <flag name='smp-dies'/> <flag name='rng-builtin'/> <flag name='virtio-net.failover'/> + <flag name='vhost-user-fs'/> <version>4001050</version> <kvmVersion>0</kvmVersion> <microcodeVersion>61700242</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_4.2.0.s390x.xml b/tests/qemucapabilitiesdata/caps_4.2.0.s390x.xml index 9275bf36fa..37776e1bbe 100644 --- a/tests/qemucapabilitiesdata/caps_4.2.0.s390x.xml +++ b/tests/qemucapabilitiesdata/caps_4.2.0.s390x.xml @@ -137,6 +137,7 @@ <flag name='drive-nvme'/> <flag name='smp-dies'/> <flag name='rng-builtin'/> + <flag name='vhost-user-fs'/> <version>4001050</version> <kvmVersion>0</kvmVersion> <microcodeVersion>39100242</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_4.2.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_4.2.0.x86_64.xml index faed23f5c1..83e804ea36 100644 --- a/tests/qemucapabilitiesdata/caps_4.2.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_4.2.0.x86_64.xml @@ -222,6 +222,7 @@ <flag name='i8042'/> <flag name='rng-builtin'/> <flag name='virtio-net.failover'/> + <flag name='vhost-user-fs'/> <version>4002000</version> <kvmVersion>0</kvmVersion> <microcodeVersion>43100242</microcodeVersion> diff --git a/tests/qemucapabilitiesdata/caps_5.0.0.x86_64.xml b/tests/qemucapabilitiesdata/caps_5.0.0.x86_64.xml index 1eef7197ef..6902cffd17 100644 --- a/tests/qemucapabilitiesdata/caps_5.0.0.x86_64.xml +++ b/tests/qemucapabilitiesdata/caps_5.0.0.x86_64.xml @@ -223,6 +223,7 @@ <flag name='i8042'/> <flag name='rng-builtin'/> <flag name='virtio-net.failover'/> + <flag name='vhost-user-fs'/> <version>4002050</version> <kvmVersion>0</kvmVersion> <microcodeVersion>43100241</microcodeVersion> -- 2.21.0

On Thu, Jan 30, 2020 at 06:06:19PM +0100, Ján Tomko wrote:
Introduced by QEMU commit 98fc1ada4cf70af0f1df1a2d7183cf786fc7da05 virtio: add vhost-user-fs base device
Released in QEMU v4.2.0.
Signed-off-by: Ján Tomko <jtomko@redhat.com> Reviewed-by: Peter Krempa <pkrempa@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> --- src/qemu/qemu_capabilities.c | 2 ++ src/qemu/qemu_capabilities.h | 1 + tests/qemucapabilitiesdata/caps_4.2.0.aarch64.xml | 1 + tests/qemucapabilitiesdata/caps_4.2.0.s390x.xml | 1 + tests/qemucapabilitiesdata/caps_4.2.0.x86_64.xml | 1 + tests/qemucapabilitiesdata/caps_5.0.0.x86_64.xml | 1 + 6 files changed, 7 insertions(+)
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Add a document describing the usage of virtiofs. --- docs/kbase.html.in | 3 + docs/kbase/virtiofs.rst | 152 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 155 insertions(+) create mode 100644 docs/kbase/virtiofs.rst diff --git a/docs/kbase.html.in b/docs/kbase.html.in index c156414c41..7d6caf3cb1 100644 --- a/docs/kbase.html.in +++ b/docs/kbase.html.in @@ -29,6 +29,9 @@ <dt><a href="kbase/backing_chains.html">Backing chain management</a></dt> <dd>Explanation of how disk backing chain specification impacts libvirt's behaviour and basic troubleshooting steps of disk problems.</dd> + + <dt><a href="kbase/virtiofs.html">Virtio-FS</a></dt> + <dd>Share a filesystem between the guest and the host</dd> </dl> </div> diff --git a/docs/kbase/virtiofs.rst b/docs/kbase/virtiofs.rst new file mode 100644 index 0000000000..e76ff8ca4b --- /dev/null +++ b/docs/kbase/virtiofs.rst @@ -0,0 +1,152 @@ +============================ +Sharing files with Virtio-FS +============================ + +=== 8< delete before merging 8< === +NOTE: if you're looking at this note, this is just a proposal. +See the up-to-date version on: https://libvirt.org/kbase/virtiofs.html +=== 8< --------------------- 8< === + +.. contents:: + +========= +Virtio-FS +========= + +Virtio-FS is a shared file system that lets virtual machines access +a directory tree on the host. Unlike existing approaches, it +is designed to offer local file system semantics and performance. + +See https://virtio-fs.gitlab.io/ + +========== +Host setup +========== + +The host-side virtiofsd daemon, like other vhost-user backed devices, +requres shared memory between the host and the guest. As of QEMU 4.2, this +requires specifying a NUMA topology for the guest and explicitly specifying +a memory backend. Multiple options are available: + +Either of the following: + +\a) Use file-backed memory + +Configure the directory where the files backing the memory will be stored +with the ``memory_backing_dir`` option in ``/etc/libvirt/qemu.conf`` + +:: + + # This directory is used for memoryBacking source if configured as file. + # NOTE: big files will be stored here + memory_backing_dir = "/dev/shm/" + +\b) Use hugepage-backed memory + +Make sure there are enough huge pages allocated for the requested guest memory. +For example, for one guest with 2 GiB of RAM backed by 2 MiB hugepages: + +:: + + # virsh allocpages 2M 1024 + +=========== +Guest setup +=========== + +\1. Specify the NUMA topology + +in the domain XML of the guest. +For the simplest one-node topology for a guest with 2GiB of RAM and 8 vCPUs: + +:: + + <domain> + ... + <cpu ...> + <numa> + <cell id='0' cpus='0-7' memory='2' unit='GiB' memAccess='shared'/> + </numa> + </cpu> + ... + </domain> + +Note that the CPU element might already be specified and only one is allowed. + +\2. Specify the memory backend + +Either of the following: + +\2. a) File-backed memory + +:: + + <domain> + ... + <memoryBacking> + <access mode='shared'/> + </memoryBacking> + ... + </domain> + +This will create a file in the directory specified in ``qemu.conf`` + +\2. b) Hugepage-backed memory + +:: + + <domain> + ... + <memoryBacking> + <hugepages> + <page size='2' unit='M'/> + </hugepages> + <access mode='shared'/> + </memoryBacking> + ... + </domain> + +\3. Add the ``vhost-user-fs`` QEMU device via the ``filesystem`` element + +:: + + <domain> + ... + <devices> + ... + <filesystem type='mount' accessmode='passthrough'> + <driver type='virtiofs'/> + <source dir='/path'/> + <target dir='mount_tag'/> + </filesystem> + ... + </devices> + </domain> + +Note that despite its name, the ``target dir`` is actually a mount tag and does +not have to correspond to the desired mount point in the guest. + +So far, ``passthrough`` is the only supported access mode and it requires +running the ``virtiofsd`` daemon as root. + +\4. Boot the guest and mount the filesystem + +:: + + guest# mount -t virtiofs mount_tag /mnt/mount/path + +Note: this requires virtiofs support in the guest kernel (Linux v5.4 or later) + +=================== +Optional parameters +=================== + +More optional elements can be specified + +:: + + <driver type='virtiofs' queue='1024'/> + <binary path='/usr/libexec/virtiofsd' xattr='on'> + <cache mode='always'/> + <lock posix_lock='on' flock='on'/> + </binary> -- 2.21.0

On Thu, Jan 30, 2020 at 06:06:20PM +0100, Ján Tomko wrote:
Add a document describing the usage of virtiofs. --- docs/kbase.html.in | 3 + docs/kbase/virtiofs.rst | 152 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 155 insertions(+) create mode 100644 docs/kbase/virtiofs.rst
diff --git a/docs/kbase.html.in b/docs/kbase.html.in index c156414c41..7d6caf3cb1 100644 --- a/docs/kbase.html.in +++ b/docs/kbase.html.in @@ -29,6 +29,9 @@ <dt><a href="kbase/backing_chains.html">Backing chain management</a></dt> <dd>Explanation of how disk backing chain specification impacts libvirt's behaviour and basic troubleshooting steps of disk problems.</dd> + + <dt><a href="kbase/virtiofs.html">Virtio-FS</a></dt> + <dd>Share a filesystem between the guest and the host</dd> </dl> </div>
diff --git a/docs/kbase/virtiofs.rst b/docs/kbase/virtiofs.rst new file mode 100644 index 0000000000..e76ff8ca4b --- /dev/null +++ b/docs/kbase/virtiofs.rst @@ -0,0 +1,152 @@ +============================ +Sharing files with Virtio-FS +============================ + +=== 8< delete before merging 8< === +NOTE: if you're looking at this note, this is just a proposal. +See the up-to-date version on: https://libvirt.org/kbase/virtiofs.html +=== 8< --------------------- 8< ===
...this shouldn't be here still
+ +.. contents:: + +========= +Virtio-FS +========= + +Virtio-FS is a shared file system that lets virtual machines access +a directory tree on the host. Unlike existing approaches, it +is designed to offer local file system semantics and performance. + +See https://virtio-fs.gitlab.io/ + +========== +Host setup +========== + +The host-side virtiofsd daemon, like other vhost-user backed devices, +requres shared memory between the host and the guest. As of QEMU 4.2, this
requires
+requires specifying a NUMA topology for the guest and explicitly specifying +a memory backend. Multiple options are available: + +Either of the following: + +\a) Use file-backed memory
Why are you disabling list-ification here & throughout the file ?
+ +Configure the directory where the files backing the memory will be stored +with the ``memory_backing_dir`` option in ``/etc/libvirt/qemu.conf`` + +:: + + # This directory is used for memoryBacking source if configured as file. + # NOTE: big files will be stored here + memory_backing_dir = "/dev/shm/" + +\b) Use hugepage-backed memory + +Make sure there are enough huge pages allocated for the requested guest memory. +For example, for one guest with 2 GiB of RAM backed by 2 MiB hugepages: + +:: + + # virsh allocpages 2M 1024
I'm not understanding from this how big the files will be for virtiofsd ? How much memory should users expect to be needed per-guest per-virtiofsd ? Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Mon, Feb 03, 2020 at 04:17:31PM +0000, Daniel P. Berrangé wrote:
On Thu, Jan 30, 2020 at 06:06:20PM +0100, Ján Tomko wrote:
Add a document describing the usage of virtiofs. --- docs/kbase.html.in | 3 + docs/kbase/virtiofs.rst | 152 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 155 insertions(+) create mode 100644 docs/kbase/virtiofs.rst
diff --git a/docs/kbase.html.in b/docs/kbase.html.in index c156414c41..7d6caf3cb1 100644 --- a/docs/kbase.html.in +++ b/docs/kbase.html.in @@ -29,6 +29,9 @@ <dt><a href="kbase/backing_chains.html">Backing chain management</a></dt> <dd>Explanation of how disk backing chain specification impacts libvirt's behaviour and basic troubleshooting steps of disk problems.</dd> + + <dt><a href="kbase/virtiofs.html">Virtio-FS</a></dt> + <dd>Share a filesystem between the guest and the host</dd> </dl> </div>
diff --git a/docs/kbase/virtiofs.rst b/docs/kbase/virtiofs.rst new file mode 100644 index 0000000000..e76ff8ca4b --- /dev/null +++ b/docs/kbase/virtiofs.rst @@ -0,0 +1,152 @@ +============================ +Sharing files with Virtio-FS +============================ + +=== 8< delete before merging 8< === +NOTE: if you're looking at this note, this is just a proposal. +See the up-to-date version on: https://libvirt.org/kbase/virtiofs.html +=== 8< --------------------- 8< ===
...this shouldn't be here still
+ +.. contents:: + +========= +Virtio-FS +========= + +Virtio-FS is a shared file system that lets virtual machines access +a directory tree on the host. Unlike existing approaches, it +is designed to offer local file system semantics and performance. + +See https://virtio-fs.gitlab.io/ + +========== +Host setup +========== + +The host-side virtiofsd daemon, like other vhost-user backed devices, +requres shared memory between the host and the guest. As of QEMU 4.2, this
requires
+requires specifying a NUMA topology for the guest and explicitly specifying +a memory backend. Multiple options are available: + +Either of the following: + +\a) Use file-backed memory
Why are you disabling list-ification here & throughout the file ?
+ +Configure the directory where the files backing the memory will be stored +with the ``memory_backing_dir`` option in ``/etc/libvirt/qemu.conf`` + +:: + + # This directory is used for memoryBacking source if configured as file. + # NOTE: big files will be stored here + memory_backing_dir = "/dev/shm/" + +\b) Use hugepage-backed memory + +Make sure there are enough huge pages allocated for the requested guest memory. +For example, for one guest with 2 GiB of RAM backed by 2 MiB hugepages: + +:: + + # virsh allocpages 2M 1024
I'm not understanding from this how big the files will be for virtiofsd ? How much memory should users expect to be needed per-guest per-virtiofsd ?
Oh, I realize on re-reading that this part of the docs is just about the general setup of a file backed guest. So we're just talking about guest RAM size here, virtiofsd isn't using this file backing for anything else. IOW, ignore this comment. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Mon, Feb 03, 2020 at 04:17:31PM +0000, Daniel P. Berrangé wrote:
On Thu, Jan 30, 2020 at 06:06:20PM +0100, Ján Tomko wrote:
Add a document describing the usage of virtiofs. --- docs/kbase.html.in | 3 + docs/kbase/virtiofs.rst | 152 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 155 insertions(+) create mode 100644 docs/kbase/virtiofs.rst
diff --git a/docs/kbase.html.in b/docs/kbase.html.in index c156414c41..7d6caf3cb1 100644 --- a/docs/kbase.html.in +++ b/docs/kbase.html.in @@ -29,6 +29,9 @@ <dt><a href="kbase/backing_chains.html">Backing chain management</a></dt> <dd>Explanation of how disk backing chain specification impacts libvirt's behaviour and basic troubleshooting steps of disk problems.</dd> + + <dt><a href="kbase/virtiofs.html">Virtio-FS</a></dt> + <dd>Share a filesystem between the guest and the host</dd> </dl> </div>
diff --git a/docs/kbase/virtiofs.rst b/docs/kbase/virtiofs.rst new file mode 100644 index 0000000000..e76ff8ca4b --- /dev/null +++ b/docs/kbase/virtiofs.rst @@ -0,0 +1,152 @@ +============================ +Sharing files with Virtio-FS +============================ + +=== 8< delete before merging 8< === +NOTE: if you're looking at this note, this is just a proposal. +See the up-to-date version on: https://libvirt.org/kbase/virtiofs.html +=== 8< --------------------- 8< ===
...this shouldn't be here still
The plan was to leave it in every version posted to the mailing list, for anyone who might find it through the archives.
+ +.. contents:: + +========= +Virtio-FS +========= + +Virtio-FS is a shared file system that lets virtual machines access +a directory tree on the host. Unlike existing approaches, it +is designed to offer local file system semantics and performance. + +See https://virtio-fs.gitlab.io/ + +========== +Host setup +========== + +The host-side virtiofsd daemon, like other vhost-user backed devices, +requres shared memory between the host and the guest. As of QEMU 4.2, this
requires
+requires specifying a NUMA topology for the guest and explicitly specifying +a memory backend. Multiple options are available: + +Either of the following: + +\a) Use file-backed memory
Why are you disabling list-ification here & throughout the file ?
I could not get it to do what I want - it kept resetting to 1. Jano

Introduce a new 'virtiofs' driver type for filesystem. <filesystem type='mount' accessmode='passthrough'> <driver type='virtiofs'/> <source dir='/path'/> <target dir='mount_tag'> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> </filesystem> Signed-off-by: Ján Tomko <jtomko@redhat.com> --- docs/formatdomain.html.in | 12 ++- docs/schemas/domaincommon.rng | 6 ++ src/conf/domain_conf.c | 1 + src/conf/domain_conf.h | 1 + src/qemu/qemu_command.c | 4 + src/qemu/qemu_domain.c | 4 + src/qemu/qemu_domain_address.c | 4 + .../vhost-user-fs-fd-memory.xml | 39 ++++++++++ .../vhost-user-fs-hugepages.xml | 74 +++++++++++++++++++ .../vhost-user-fs-fd-memory.x86_64-latest.xml | 1 + .../vhost-user-fs-hugepages.x86_64-latest.xml | 1 + tests/qemuxml2xmltest.c | 3 + 12 files changed, 149 insertions(+), 1 deletion(-) create mode 100644 tests/qemuxml2argvdata/vhost-user-fs-fd-memory.xml create mode 100644 tests/qemuxml2argvdata/vhost-user-fs-hugepages.xml create mode 120000 tests/qemuxml2xmloutdata/vhost-user-fs-fd-memory.x86_64-latest.xml create mode 120000 tests/qemuxml2xmloutdata/vhost-user-fs-hugepages.x86_64-latest.xml diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index 20bf6a9b84..d5565361ab 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3921,6 +3921,11 @@ <target dir='/import/from/host'/> <readonly/> </filesystem> + <filesystem type='mount' accessmode='passthrough'> + <driver type='virtiofs'/> + <source dir='/path'/> + <target dir='mount_tag'/> + </filesystem> ... </devices> ...</pre> @@ -3949,6 +3954,9 @@ while the value <code>immediate</code> means that a host writeback is immediately triggered for all pages touched during a guest file write operation <span class="since">(since 0.9.10)</span>. + <span class="since">Since 6.1.0</span>, <code>type='virtiofs'</code> + is also supported. Using virtiofs requires setting up shared memory, + see the guide: <a href="kbase/virtiofs.html">Virtio-FS</a> </dd> <dt><code>template</code></dt> <dd> @@ -3984,7 +3992,9 @@ The filesystem element has an optional attribute <code>accessmode</code> which specifies the security mode for accessing the source <span class="since">(since 0.8.5)</span>. Currently this only works - with <code>type='mount'</code> for the QEMU/KVM driver. The possible + with <code>type='mount'</code> for the QEMU/KVM driver. + For driver type <code>virtiofs</code>, only <code>passthrough</code> is + supported. For other driver types, the possible values are: <dl> diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index 51dd75b520..e30b82c57e 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -2625,6 +2625,12 @@ </optional> <ref name='virtioOptions'/> </group> + <group> + <attribute name="type"> + <value>virtiofs</value> + </attribute> + <ref name='virtioOptions'/> + </group> <empty/> </choice> </element> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 9b60db7ecd..d1a6d125d7 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -477,6 +477,7 @@ VIR_ENUM_IMPL(virDomainFSDriver, "loop", "nbd", "ploop", + "virtiofs", ); VIR_ENUM_IMPL(virDomainFSAccessMode, diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index e144f3aad3..de2bd3be93 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -773,6 +773,7 @@ typedef enum { VIR_DOMAIN_FS_DRIVER_TYPE_LOOP, VIR_DOMAIN_FS_DRIVER_TYPE_NBD, VIR_DOMAIN_FS_DRIVER_TYPE_PLOOP, + VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS, VIR_DOMAIN_FS_DRIVER_TYPE_LAST } virDomainFSDriverType; diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index c8195cfbb9..0d7cd42912 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -2672,6 +2672,10 @@ qemuBuildFilesystemCommandLine(virCommandPtr cmd, return -1; break; + case VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS: + /* TODO: vhost-user-fs-pci */ + break; + case VIR_DOMAIN_FS_DRIVER_TYPE_LOOP: case VIR_DOMAIN_FS_DRIVER_TYPE_NBD: case VIR_DOMAIN_FS_DRIVER_TYPE_PLOOP: diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 05a8d3de38..6367f5394e 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -8313,6 +8313,10 @@ qemuDomainDeviceDefValidateFS(virDomainFSDefPtr fs, _("Filesystem driver type not supported")); return -1; + case VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS: + /* TODO: vhost-user-fs-pci */ + return 0; + case VIR_DOMAIN_FS_DRIVER_TYPE_LAST: default: virReportEnumRangeError(virDomainFSDriverType, fs->fsdriver); diff --git a/src/qemu/qemu_domain_address.c b/src/qemu/qemu_domain_address.c index b663e05391..26b33aeb05 100644 --- a/src/qemu/qemu_domain_address.c +++ b/src/qemu/qemu_domain_address.c @@ -690,6 +690,10 @@ qemuDomainDeviceCalculatePCIConnectFlags(virDomainDeviceDefPtr dev, } break; + case VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS: + /* vhost-user-fs-pci */ + return virtioFlags; + case VIR_DOMAIN_FS_DRIVER_TYPE_LOOP: case VIR_DOMAIN_FS_DRIVER_TYPE_NBD: case VIR_DOMAIN_FS_DRIVER_TYPE_PLOOP: diff --git a/tests/qemuxml2argvdata/vhost-user-fs-fd-memory.xml b/tests/qemuxml2argvdata/vhost-user-fs-fd-memory.xml new file mode 100644 index 0000000000..a6b6279fb8 --- /dev/null +++ b/tests/qemuxml2argvdata/vhost-user-fs-fd-memory.xml @@ -0,0 +1,39 @@ +<domain type='kvm'> + <name>guest</name> + <uuid>126f2720-6f8e-45ab-a886-ec9277079a67</uuid> + <memory unit='KiB'>14680064</memory> + <currentMemory unit='KiB'>14680064</currentMemory> + <memoryBacking> + <source type='file'/> + <access mode='shared'/> + </memoryBacking> + <vcpu placement='static'>2</vcpu> + <os> + <type arch='x86_64' machine='pc'>hvm</type> + <boot dev='hd'/> + </os> + <cpu mode='custom' match='exact' check='none'> + <model fallback='forbid'>qemu64</model> + <numa> + <cell id='0' cpus='0-1' memory='14680064' unit='KiB' memAccess='shared'/> + </numa> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-x86_64</emulator> + <controller type='usb' index='0' model='none'/> + <controller type='pci' index='0' model='pci-root'/> + <filesystem type='mount' accessmode='passthrough'> + <driver type='virtiofs'/> + <source dir='/path'/> + <target dir='mount_tag'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> + </filesystem> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2argvdata/vhost-user-fs-hugepages.xml b/tests/qemuxml2argvdata/vhost-user-fs-hugepages.xml new file mode 100644 index 0000000000..70df7b890d --- /dev/null +++ b/tests/qemuxml2argvdata/vhost-user-fs-hugepages.xml @@ -0,0 +1,74 @@ +<domain type='qemu'> + <name>guest</name> + <uuid>1ccfd97d-5eb4-478a-bbe6-88d254c16db7</uuid> + <memory unit='KiB'>4194304</memory> + <currentMemory unit='KiB'>4194304</currentMemory> + <memoryBacking> + <hugepages> + <page size='2048' unit='KiB'/> + </hugepages> + <access mode='shared'/> + </memoryBacking> + <vcpu placement='static'>2</vcpu> + <os> + <type arch='x86_64' machine='q35'>hvm</type> + <boot dev='hd'/> + </os> + <features> + <acpi/> + <apic/> + </features> + <cpu mode='custom' match='exact' check='none'> + <model fallback='forbid'>qemu64</model> + <numa> + <cell id='0' cpus='0-1' memory='2097152' unit='KiB' memAccess='shared'/> + </numa> + </cpu> + <clock offset='utc'/> + <on_poweroff>destroy</on_poweroff> + <on_reboot>restart</on_reboot> + <on_crash>destroy</on_crash> + <devices> + <emulator>/usr/bin/qemu-system-x86_64</emulator> + <disk type='file' device='disk'> + <driver name='qemu' type='qcow2'/> + <source file='/var/lib/libvirt/images/guest.qcow2'/> + <target dev='vda' bus='virtio'/> + <address type='pci' domain='0x0000' bus='0x04' slot='0x00' function='0x0'/> + </disk> + <controller type='usb' index='0' model='none'/> + <controller type='sata' index='0'> + <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/> + </controller> + <controller type='pci' index='0' model='pcie-root'/> + <controller type='pci' index='1' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='1' port='0x8'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x0' multifunction='on'/> + </controller> + <controller type='pci' index='2' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='2' port='0x9'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> + </controller> + <controller type='pci' index='3' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='3' port='0xa'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> + </controller> + <controller type='pci' index='4' model='pcie-root-port'> + <model name='pcie-root-port'/> + <target chassis='4' port='0xb'/> + <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x3'/> + </controller> + <filesystem type='mount' accessmode='passthrough'> + <driver type='virtiofs'/> + <source dir='/path'/> + <target dir='mount_tag'/> + <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/> + </filesystem> + <input type='mouse' bus='ps2'/> + <input type='keyboard' bus='ps2'/> + <memballoon model='none'/> + </devices> +</domain> diff --git a/tests/qemuxml2xmloutdata/vhost-user-fs-fd-memory.x86_64-latest.xml b/tests/qemuxml2xmloutdata/vhost-user-fs-fd-memory.x86_64-latest.xml new file mode 120000 index 0000000000..fbc552ef94 --- /dev/null +++ b/tests/qemuxml2xmloutdata/vhost-user-fs-fd-memory.x86_64-latest.xml @@ -0,0 +1 @@ +../qemuxml2argvdata/vhost-user-fs-fd-memory.xml \ No newline at end of file diff --git a/tests/qemuxml2xmloutdata/vhost-user-fs-hugepages.x86_64-latest.xml b/tests/qemuxml2xmloutdata/vhost-user-fs-hugepages.x86_64-latest.xml new file mode 120000 index 0000000000..0c0f05b254 --- /dev/null +++ b/tests/qemuxml2xmloutdata/vhost-user-fs-hugepages.x86_64-latest.xml @@ -0,0 +1 @@ +../qemuxml2argvdata/vhost-user-fs-hugepages.xml \ No newline at end of file diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c index fa238ec339..b5b9c6ba1f 100644 --- a/tests/qemuxml2xmltest.c +++ b/tests/qemuxml2xmltest.c @@ -1428,6 +1428,9 @@ mymain(void) DO_TEST("vhost-vsock-ccw-auto", QEMU_CAPS_DEVICE_VHOST_VSOCK, QEMU_CAPS_CCW); + DO_TEST_CAPS_LATEST("vhost-user-fs-fd-memory"); + DO_TEST_CAPS_LATEST("vhost-user-fs-hugepages"); + DO_TEST("riscv64-virt", QEMU_CAPS_DEVICE_VIRTIO_MMIO); DO_TEST("riscv64-virt-pci", -- 2.21.0

On Thu, Jan 30, 2020 at 06:06:21PM +0100, Ján Tomko wrote:
Introduce a new 'virtiofs' driver type for filesystem.
<filesystem type='mount' accessmode='passthrough'> <driver type='virtiofs'/> <source dir='/path'/> <target dir='mount_tag'> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> </filesystem>
Signed-off-by: Ján Tomko <jtomko@redhat.com> --- docs/formatdomain.html.in | 12 ++- docs/schemas/domaincommon.rng | 6 ++ src/conf/domain_conf.c | 1 + src/conf/domain_conf.h | 1 + src/qemu/qemu_command.c | 4 + src/qemu/qemu_domain.c | 4 + src/qemu/qemu_domain_address.c | 4 + .../vhost-user-fs-fd-memory.xml | 39 ++++++++++ .../vhost-user-fs-hugepages.xml | 74 +++++++++++++++++++ .../vhost-user-fs-fd-memory.x86_64-latest.xml | 1 + .../vhost-user-fs-hugepages.x86_64-latest.xml | 1 + tests/qemuxml2xmltest.c | 3 + 12 files changed, 149 insertions(+), 1 deletion(-) create mode 100644 tests/qemuxml2argvdata/vhost-user-fs-fd-memory.xml create mode 100644 tests/qemuxml2argvdata/vhost-user-fs-hugepages.xml create mode 120000 tests/qemuxml2xmloutdata/vhost-user-fs-fd-memory.x86_64-latest.xml create mode 120000 tests/qemuxml2xmloutdata/vhost-user-fs-hugepages.x86_64-latest.xml
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Add more elements for tuning the virtiofsd daemon and the vhost-user-fs device: <driver type='virtiofs' queue='1024' xattr='on'> <binary>/usr/libexec/virtiofsd</binary> <cache mode='always' size='2097152' unit='KiB'/> <lock posix='off' flock='off'/> </driver> Signed-off-by: Ján Tomko <jtomko@redhat.com> --- docs/formatdomain.html.in | 25 +++- docs/schemas/domaincommon.rng | 48 ++++++++ src/conf/domain_conf.c | 107 +++++++++++++++++- src/conf/domain_conf.h | 15 +++ src/libvirt_private.syms | 1 + .../vhost-user-fs-fd-memory.xml | 6 +- .../vhost-user-fs-hugepages.xml | 1 + 7 files changed, 200 insertions(+), 3 deletions(-) diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index d5565361ab..58e06d21f5 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3922,10 +3922,15 @@ <readonly/> </filesystem> <filesystem type='mount' accessmode='passthrough'> - <driver type='virtiofs'/> + <driver type='virtiofs queue='1024'/> + <binary path='/usr/libexec/virtiofsd' xattr='on'> + <cache mode='always'/> + <lock posix='on' flock='on'/> + </binary> <source dir='/path'/> <target dir='mount_tag'/> </filesystem> + ... </devices> ...</pre> @@ -4049,9 +4054,27 @@ <a href="#elementsVirtio">Virtio-specific options</a> can also be set. (<span class="since">Since 3.5.0</span>) </li> + <li> + For <code>virtiofs</code>, the <code>queue</code> attribute can be used + to specify the queue size (i.e. how many requests can the queue fit). + (<span class="since">Since 6.1.0</span>) + </li> </ul> </dd> + <dt><code>binary</code></dt> + <dd> + The optional <code>binary</code> element can tune the options for virtiofsd. + The attribute <code>path</code> can be used to override the path to the daemon. + Attribute <code>xattr</code> enables the use of filesystem extended attributes. + Caching can be tuned via the <code>cache</code> element, possible <code>mode</code> + values being <code>none</code> and <code>always</code>. + Locking can be controlled via the <code>lock</code> + element - attributes <code>posix</code> and <code>flock</code> both accepting + values <code>yes</code> or <code>no</code>. + (<span class="since">Since 6.1.0</span>) + </dd> + <dt><code>source</code></dt> <dd> The resource on the host that is being accessed in the guest. The diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index e30b82c57e..db475a43ca 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -2480,6 +2480,9 @@ <optional> <ref name="fsDriver"/> </optional> + <optional> + <ref name="fsBinary"/> + </optional> <interleave> <element name="source"> <attribute name="dir"> @@ -2629,12 +2632,57 @@ <attribute name="type"> <value>virtiofs</value> </attribute> + <optional> + <attribute name="queue"> + <ref name="unsignedInt"/> + </attribute> + </optional> <ref name='virtioOptions'/> </group> <empty/> </choice> </element> </define> + <define name="fsBinary"> + <element name="binary"> + <optional> + <attribute name="path"> + <ref name="absFilePath"/> + </attribute> + </optional> + <optional> + <attribute name="xattr"> + <ref name="virOnOff"/> + </attribute> + </optional> + <optional> + <element name="cache"> + <optional> + <attribute name="mode"> + <choice> + <value>none</value> + <value>always</value> + </choice> + </attribute> + </optional> + </element> + </optional> + <optional> + <element name="lock"> + <optional> + <attribute name="posix"> + <ref name="virOnOff"/> + </attribute> + </optional> + <optional> + <attribute name="flock"> + <ref name="virOnOff"/> + </attribute> + </optional> + </element> + </optional> + </element> + </define> <define name="interface-network-attributes"> <attribute name="network"> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index d1a6d125d7..5c26e0d06c 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -501,6 +501,14 @@ VIR_ENUM_IMPL(virDomainFSModel, "virtio-non-transitional", ); +VIR_ENUM_IMPL(virDomainFSCacheMode, + VIR_DOMAIN_FS_CACHE_MODE_LAST, + "default", + "none", + "always", +); + + VIR_ENUM_IMPL(virDomainNet, VIR_DOMAIN_NET_TYPE_LAST, "user", @@ -2322,6 +2330,7 @@ void virDomainFSDefFree(virDomainFSDefPtr def) VIR_FREE(def->dst); virDomainDeviceInfoClear(&def->info); VIR_FREE(def->virtio); + VIR_FREE(def->binary); VIR_FREE(def); } @@ -11254,6 +11263,64 @@ virDomainFSDefParseXML(virDomainXMLOptionPtr xmlopt, } } + if (def->fsdriver == VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS) { + g_autofree char *queue_size = virXPathString("string(./driver/@queue)", ctxt); + g_autofree char *binary = virXPathString("string(./binary/@path)", ctxt); + g_autofree char *xattr = virXPathString("string(./binary/@xattr)", ctxt); + g_autofree char *cache = virXPathString("string(./binary/cache/@mode)", ctxt); + g_autofree char *posix_lock = virXPathString("string(./binary/lock/@posix)", ctxt); + g_autofree char *flock = virXPathString("string(./binary/lock/@flock)", ctxt); + int val; + + + if (queue_size && virStrToLong_ull(queue_size, NULL, 10, &def->queue_size) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("cannot parse queue size '%s' for virtiofs"), + queue_size); + goto error; + } + + if (binary) + def->binary = virFileSanitizePath(binary); + + if (xattr) { + if ((val = virTristateSwitchTypeFromString(xattr)) <= 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("unknown xattr value '%s'"), xattr); + goto error; + } + def->xattr = val; + } + + if (cache) { + if ((val = virDomainFSCacheModeTypeFromString(cache)) <= 0) { + virReportError(VIR_ERR_XML_ERROR, + _("cannot parse cache mode '%s' for virtiofs"), + cache); + goto error; + } + def->cache = val; + } + + if (posix_lock) { + if ((val = virTristateSwitchTypeFromString(posix_lock)) <= 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("unknown posix lock value '%s'"), posix_lock); + goto error; + } + def->posix_lock = val; + } + + if (flock) { + if ((val = virTristateSwitchTypeFromString(flock)) <= 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("unknown flock value '%s'"), flock); + goto error; + } + def->flock = val; + } + } + if (format) { if ((def->format = virStorageFileFormatTypeFromString(format)) <= 0) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -25053,6 +25120,9 @@ virDomainFSDefFormat(virBufferPtr buf, const char *wrpolicy = virDomainFSWrpolicyTypeToString(def->wrpolicy); const char *src = def->src->path; g_auto(virBuffer) driverAttrBuf = VIR_BUFFER_INITIALIZER; + g_auto(virBuffer) driverBuf = VIR_BUFFER_INIT_CHILD(buf); + g_auto(virBuffer) binaryAttrBuf = VIR_BUFFER_INITIALIZER; + g_auto(virBuffer) binaryBuf = VIR_BUFFER_INIT_CHILD(buf); if (!type) { virReportError(VIR_ERR_INTERNAL_ERROR, @@ -25076,6 +25146,8 @@ virDomainFSDefFormat(virBufferPtr buf, virBufferAddLit(buf, ">\n"); virBufferAdjustIndent(buf, 2); + virBufferAdjustIndent(&driverBuf, 2); + virBufferAdjustIndent(&binaryBuf, 2); if (def->fsdriver) { virBufferAsprintf(&driverAttrBuf, " type='%s'", fsdriver); @@ -25087,11 +25159,44 @@ virDomainFSDefFormat(virBufferPtr buf, if (def->wrpolicy) virBufferAsprintf(&driverAttrBuf, " wrpolicy='%s'", wrpolicy); + if (def->queue_size) + virBufferAsprintf(&driverAttrBuf, " queue='%llu'", def->queue_size); + + } + + if (def->fsdriver == VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS) { + g_auto(virBuffer) lockAttrBuf = VIR_BUFFER_INITIALIZER; + virBufferEscapeString(&binaryAttrBuf, " path='%s'", def->binary); + + if (def->xattr != VIR_TRISTATE_SWITCH_ABSENT) { + virBufferAsprintf(&binaryAttrBuf, " xattr='%s'", + virTristateSwitchTypeToString(def->xattr)); + } + + if (def->cache != VIR_DOMAIN_FS_CACHE_MODE_DEFAULT) { + virBufferAsprintf(&binaryBuf, "<cache mode='%s'/>\n", + virDomainFSCacheModeTypeToString(def->cache)); + } + + if (def->posix_lock != VIR_TRISTATE_SWITCH_ABSENT) { + virBufferAsprintf(&lockAttrBuf, " posix='%s'", + virTristateSwitchTypeToString(def->posix_lock)); + } + + if (def->flock != VIR_TRISTATE_SWITCH_ABSENT) { + virBufferAsprintf(&lockAttrBuf, " flock='%s'", + virTristateSwitchTypeToString(def->flock)); + } + + virXMLFormatElement(&binaryBuf, "lock", &lockAttrBuf, NULL); } + virDomainVirtioOptionsFormat(&driverAttrBuf, def->virtio); - virXMLFormatElement(buf, "driver", &driverAttrBuf, NULL); + virXMLFormatElement(buf, "driver", &driverAttrBuf, &driverBuf); + virXMLFormatElement(buf, "binary", &binaryAttrBuf, &binaryBuf); + virXMLFormatElement(buf, "driver", &driverAttrBuf, &driverBuf); switch (def->type) { case VIR_DOMAIN_FS_TYPE_MOUNT: diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index de2bd3be93..21a36c34ef 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -804,6 +804,14 @@ typedef enum { VIR_DOMAIN_FS_MODEL_LAST } virDomainFSModel; +typedef enum { + VIR_DOMAIN_FS_CACHE_MODE_DEFAULT = 0, + VIR_DOMAIN_FS_CACHE_MODE_NONE, + VIR_DOMAIN_FS_CACHE_MODE_ALWAYS, + + VIR_DOMAIN_FS_CACHE_MODE_LAST +} virDomainFSCacheMode; + struct _virDomainFSDef { int type; int fsdriver; /* enum virDomainFSDriverType */ @@ -819,6 +827,12 @@ struct _virDomainFSDef { unsigned long long space_hard_limit; /* in bytes */ unsigned long long space_soft_limit; /* in bytes */ bool symlinksResolved; + char *binary; + unsigned long long queue_size; + virTristateSwitch xattr; + virDomainFSCacheMode cache; + virTristateSwitch posix_lock; + virTristateSwitch flock; virDomainVirtioOptionsPtr virtio; virObjectPtr privateData; }; @@ -3441,6 +3455,7 @@ VIR_ENUM_DECL(virDomainFSDriver); VIR_ENUM_DECL(virDomainFSAccessMode); VIR_ENUM_DECL(virDomainFSWrpolicy); VIR_ENUM_DECL(virDomainFSModel); +VIR_ENUM_DECL(virDomainFSCacheMode); VIR_ENUM_DECL(virDomainNet); VIR_ENUM_DECL(virDomainNetBackend); VIR_ENUM_DECL(virDomainNetVirtioTxMode); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index ebf830791e..30c3b12d79 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -395,6 +395,7 @@ virDomainDiskSourceFormat; virDomainDiskTranslateSourcePool; virDomainFeatureTypeFromString; virDomainFeatureTypeToString; +virDomainFSCacheModeTypeToString; virDomainFSDefFree; virDomainFSDefNew; virDomainFSDriverTypeToString; diff --git a/tests/qemuxml2argvdata/vhost-user-fs-fd-memory.xml b/tests/qemuxml2argvdata/vhost-user-fs-fd-memory.xml index a6b6279fb8..f6bb663e97 100644 --- a/tests/qemuxml2argvdata/vhost-user-fs-fd-memory.xml +++ b/tests/qemuxml2argvdata/vhost-user-fs-fd-memory.xml @@ -27,7 +27,11 @@ <controller type='usb' index='0' model='none'/> <controller type='pci' index='0' model='pci-root'/> <filesystem type='mount' accessmode='passthrough'> - <driver type='virtiofs'/> + <driver type='virtiofs' queue='1024'/> + <binary path='/usr/libexec/virtiofsd' xattr='on'> + <cache mode='always'/> + <lock posix='off' flock='off'/> + </binary> <source dir='/path'/> <target dir='mount_tag'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> diff --git a/tests/qemuxml2argvdata/vhost-user-fs-hugepages.xml b/tests/qemuxml2argvdata/vhost-user-fs-hugepages.xml index 70df7b890d..96b9774704 100644 --- a/tests/qemuxml2argvdata/vhost-user-fs-hugepages.xml +++ b/tests/qemuxml2argvdata/vhost-user-fs-hugepages.xml @@ -63,6 +63,7 @@ </controller> <filesystem type='mount' accessmode='passthrough'> <driver type='virtiofs'/> + <binary path='/usr/libexec/virtiofsd'/> <source dir='/path'/> <target dir='mount_tag'/> <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/> -- 2.21.0

On Thu, Jan 30, 2020 at 06:06:22PM +0100, Ján Tomko wrote:
Add more elements for tuning the virtiofsd daemon and the vhost-user-fs device:
<driver type='virtiofs' queue='1024' xattr='on'> <binary>/usr/libexec/virtiofsd</binary> <cache mode='always' size='2097152' unit='KiB'/> <lock posix='off' flock='off'/> </driver>
Signed-off-by: Ján Tomko <jtomko@redhat.com> --- docs/formatdomain.html.in | 25 +++- docs/schemas/domaincommon.rng | 48 ++++++++ src/conf/domain_conf.c | 107 +++++++++++++++++- src/conf/domain_conf.h | 15 +++ src/libvirt_private.syms | 1 + .../vhost-user-fs-fd-memory.xml | 6 +- .../vhost-user-fs-hugepages.xml | 1 + 7 files changed, 200 insertions(+), 3 deletions(-)
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Thu, Jan 30, 2020 at 06:06:22PM +0100, Ján Tomko wrote:
Add more elements for tuning the virtiofsd daemon and the vhost-user-fs device:
<driver type='virtiofs' queue='1024' xattr='on'> <binary>/usr/libexec/virtiofsd</binary> <cache mode='always' size='2097152' unit='KiB'/> <lock posix='off' flock='off'/> </driver>
Above config doesn't work... Following? <driver type='virtiofs' queue='1024'/> <binary path='/usr/libexec/virtiofsd' xattr='off'> <cache mode='always'/> <lock posix='off' flock='off'/> </binary> The code looks good to me. Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com> Thanks, Masa
Signed-off-by: Ján Tomko <jtomko@redhat.com> --- docs/formatdomain.html.in | 25 +++- docs/schemas/domaincommon.rng | 48 ++++++++ src/conf/domain_conf.c | 107 +++++++++++++++++- src/conf/domain_conf.h | 15 +++ src/libvirt_private.syms | 1 + .../vhost-user-fs-fd-memory.xml | 6 +- .../vhost-user-fs-hugepages.xml | 1 + 7 files changed, 200 insertions(+), 3 deletions(-)
diff --git a/docs/formatdomain.html.in b/docs/formatdomain.html.in index d5565361ab..58e06d21f5 100644 --- a/docs/formatdomain.html.in +++ b/docs/formatdomain.html.in @@ -3922,10 +3922,15 @@ <readonly/> </filesystem> <filesystem type='mount' accessmode='passthrough'> - <driver type='virtiofs'/> + <driver type='virtiofs queue='1024'/> + <binary path='/usr/libexec/virtiofsd' xattr='on'> + <cache mode='always'/> + <lock posix='on' flock='on'/> + </binary> <source dir='/path'/> <target dir='mount_tag'/> </filesystem> + ... </devices> ...</pre> @@ -4049,9 +4054,27 @@ <a href="#elementsVirtio">Virtio-specific options</a> can also be set. (<span class="since">Since 3.5.0</span>) </li> + <li> + For <code>virtiofs</code>, the <code>queue</code> attribute can be used + to specify the queue size (i.e. how many requests can the queue fit). + (<span class="since">Since 6.1.0</span>) + </li> </ul> </dd>
+ <dt><code>binary</code></dt> + <dd> + The optional <code>binary</code> element can tune the options for virtiofsd. + The attribute <code>path</code> can be used to override the path to the daemon. + Attribute <code>xattr</code> enables the use of filesystem extended attributes. + Caching can be tuned via the <code>cache</code> element, possible <code>mode</code> + values being <code>none</code> and <code>always</code>. + Locking can be controlled via the <code>lock</code> + element - attributes <code>posix</code> and <code>flock</code> both accepting + values <code>yes</code> or <code>no</code>. + (<span class="since">Since 6.1.0</span>) + </dd> + <dt><code>source</code></dt> <dd> The resource on the host that is being accessed in the guest. The diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng index e30b82c57e..db475a43ca 100644 --- a/docs/schemas/domaincommon.rng +++ b/docs/schemas/domaincommon.rng @@ -2480,6 +2480,9 @@ <optional> <ref name="fsDriver"/> </optional> + <optional> + <ref name="fsBinary"/> + </optional> <interleave> <element name="source"> <attribute name="dir"> @@ -2629,12 +2632,57 @@ <attribute name="type"> <value>virtiofs</value> </attribute> + <optional> + <attribute name="queue"> + <ref name="unsignedInt"/> + </attribute> + </optional> <ref name='virtioOptions'/> </group> <empty/> </choice> </element> </define> + <define name="fsBinary"> + <element name="binary"> + <optional> + <attribute name="path"> + <ref name="absFilePath"/> + </attribute> + </optional> + <optional> + <attribute name="xattr"> + <ref name="virOnOff"/> + </attribute> + </optional> + <optional> + <element name="cache"> + <optional> + <attribute name="mode"> + <choice> + <value>none</value> + <value>always</value> + </choice> + </attribute> + </optional> + </element> + </optional> + <optional> + <element name="lock"> + <optional> + <attribute name="posix"> + <ref name="virOnOff"/> + </attribute> + </optional> + <optional> + <attribute name="flock"> + <ref name="virOnOff"/> + </attribute> + </optional> + </element> + </optional> + </element> + </define>
<define name="interface-network-attributes"> <attribute name="network"> diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index d1a6d125d7..5c26e0d06c 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -501,6 +501,14 @@ VIR_ENUM_IMPL(virDomainFSModel, "virtio-non-transitional", );
+VIR_ENUM_IMPL(virDomainFSCacheMode, + VIR_DOMAIN_FS_CACHE_MODE_LAST, + "default", + "none", + "always", +); + + VIR_ENUM_IMPL(virDomainNet, VIR_DOMAIN_NET_TYPE_LAST, "user", @@ -2322,6 +2330,7 @@ void virDomainFSDefFree(virDomainFSDefPtr def) VIR_FREE(def->dst); virDomainDeviceInfoClear(&def->info); VIR_FREE(def->virtio); + VIR_FREE(def->binary);
VIR_FREE(def); } @@ -11254,6 +11263,64 @@ virDomainFSDefParseXML(virDomainXMLOptionPtr xmlopt, } }
+ if (def->fsdriver == VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS) { + g_autofree char *queue_size = virXPathString("string(./driver/@queue)", ctxt); + g_autofree char *binary = virXPathString("string(./binary/@path)", ctxt); + g_autofree char *xattr = virXPathString("string(./binary/@xattr)", ctxt); + g_autofree char *cache = virXPathString("string(./binary/cache/@mode)", ctxt); + g_autofree char *posix_lock = virXPathString("string(./binary/lock/@posix)", ctxt); + g_autofree char *flock = virXPathString("string(./binary/lock/@flock)", ctxt); + int val; + + + if (queue_size && virStrToLong_ull(queue_size, NULL, 10, &def->queue_size) < 0) { + virReportError(VIR_ERR_XML_ERROR, + _("cannot parse queue size '%s' for virtiofs"), + queue_size); + goto error; + } + + if (binary) + def->binary = virFileSanitizePath(binary); + + if (xattr) { + if ((val = virTristateSwitchTypeFromString(xattr)) <= 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("unknown xattr value '%s'"), xattr); + goto error; + } + def->xattr = val; + } + + if (cache) { + if ((val = virDomainFSCacheModeTypeFromString(cache)) <= 0) { + virReportError(VIR_ERR_XML_ERROR, + _("cannot parse cache mode '%s' for virtiofs"), + cache); + goto error; + } + def->cache = val; + } + + if (posix_lock) { + if ((val = virTristateSwitchTypeFromString(posix_lock)) <= 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("unknown posix lock value '%s'"), posix_lock); + goto error; + } + def->posix_lock = val; + } + + if (flock) { + if ((val = virTristateSwitchTypeFromString(flock)) <= 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("unknown flock value '%s'"), flock); + goto error; + } + def->flock = val; + } + } + if (format) { if ((def->format = virStorageFileFormatTypeFromString(format)) <= 0) { virReportError(VIR_ERR_CONFIG_UNSUPPORTED, @@ -25053,6 +25120,9 @@ virDomainFSDefFormat(virBufferPtr buf, const char *wrpolicy = virDomainFSWrpolicyTypeToString(def->wrpolicy); const char *src = def->src->path; g_auto(virBuffer) driverAttrBuf = VIR_BUFFER_INITIALIZER; + g_auto(virBuffer) driverBuf = VIR_BUFFER_INIT_CHILD(buf); + g_auto(virBuffer) binaryAttrBuf = VIR_BUFFER_INITIALIZER; + g_auto(virBuffer) binaryBuf = VIR_BUFFER_INIT_CHILD(buf);
if (!type) { virReportError(VIR_ERR_INTERNAL_ERROR, @@ -25076,6 +25146,8 @@ virDomainFSDefFormat(virBufferPtr buf, virBufferAddLit(buf, ">\n");
virBufferAdjustIndent(buf, 2); + virBufferAdjustIndent(&driverBuf, 2); + virBufferAdjustIndent(&binaryBuf, 2); if (def->fsdriver) { virBufferAsprintf(&driverAttrBuf, " type='%s'", fsdriver);
@@ -25087,11 +25159,44 @@ virDomainFSDefFormat(virBufferPtr buf, if (def->wrpolicy) virBufferAsprintf(&driverAttrBuf, " wrpolicy='%s'", wrpolicy);
+ if (def->queue_size) + virBufferAsprintf(&driverAttrBuf, " queue='%llu'", def->queue_size); + + } + + if (def->fsdriver == VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS) { + g_auto(virBuffer) lockAttrBuf = VIR_BUFFER_INITIALIZER; + virBufferEscapeString(&binaryAttrBuf, " path='%s'", def->binary); + + if (def->xattr != VIR_TRISTATE_SWITCH_ABSENT) { + virBufferAsprintf(&binaryAttrBuf, " xattr='%s'", + virTristateSwitchTypeToString(def->xattr)); + } + + if (def->cache != VIR_DOMAIN_FS_CACHE_MODE_DEFAULT) { + virBufferAsprintf(&binaryBuf, "<cache mode='%s'/>\n", + virDomainFSCacheModeTypeToString(def->cache)); + } + + if (def->posix_lock != VIR_TRISTATE_SWITCH_ABSENT) { + virBufferAsprintf(&lockAttrBuf, " posix='%s'", + virTristateSwitchTypeToString(def->posix_lock)); + } + + if (def->flock != VIR_TRISTATE_SWITCH_ABSENT) { + virBufferAsprintf(&lockAttrBuf, " flock='%s'", + virTristateSwitchTypeToString(def->flock)); + } + + virXMLFormatElement(&binaryBuf, "lock", &lockAttrBuf, NULL); }
+ virDomainVirtioOptionsFormat(&driverAttrBuf, def->virtio);
- virXMLFormatElement(buf, "driver", &driverAttrBuf, NULL); + virXMLFormatElement(buf, "driver", &driverAttrBuf, &driverBuf); + virXMLFormatElement(buf, "binary", &binaryAttrBuf, &binaryBuf); + virXMLFormatElement(buf, "driver", &driverAttrBuf, &driverBuf);
switch (def->type) { case VIR_DOMAIN_FS_TYPE_MOUNT: diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h index de2bd3be93..21a36c34ef 100644 --- a/src/conf/domain_conf.h +++ b/src/conf/domain_conf.h @@ -804,6 +804,14 @@ typedef enum { VIR_DOMAIN_FS_MODEL_LAST } virDomainFSModel;
+typedef enum { + VIR_DOMAIN_FS_CACHE_MODE_DEFAULT = 0, + VIR_DOMAIN_FS_CACHE_MODE_NONE, + VIR_DOMAIN_FS_CACHE_MODE_ALWAYS, + + VIR_DOMAIN_FS_CACHE_MODE_LAST +} virDomainFSCacheMode; + struct _virDomainFSDef { int type; int fsdriver; /* enum virDomainFSDriverType */ @@ -819,6 +827,12 @@ struct _virDomainFSDef { unsigned long long space_hard_limit; /* in bytes */ unsigned long long space_soft_limit; /* in bytes */ bool symlinksResolved; + char *binary; + unsigned long long queue_size; + virTristateSwitch xattr; + virDomainFSCacheMode cache; + virTristateSwitch posix_lock; + virTristateSwitch flock; virDomainVirtioOptionsPtr virtio; virObjectPtr privateData; }; @@ -3441,6 +3455,7 @@ VIR_ENUM_DECL(virDomainFSDriver); VIR_ENUM_DECL(virDomainFSAccessMode); VIR_ENUM_DECL(virDomainFSWrpolicy); VIR_ENUM_DECL(virDomainFSModel); +VIR_ENUM_DECL(virDomainFSCacheMode); VIR_ENUM_DECL(virDomainNet); VIR_ENUM_DECL(virDomainNetBackend); VIR_ENUM_DECL(virDomainNetVirtioTxMode); diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index ebf830791e..30c3b12d79 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -395,6 +395,7 @@ virDomainDiskSourceFormat; virDomainDiskTranslateSourcePool; virDomainFeatureTypeFromString; virDomainFeatureTypeToString; +virDomainFSCacheModeTypeToString; virDomainFSDefFree; virDomainFSDefNew; virDomainFSDriverTypeToString; diff --git a/tests/qemuxml2argvdata/vhost-user-fs-fd-memory.xml b/tests/qemuxml2argvdata/vhost-user-fs-fd-memory.xml index a6b6279fb8..f6bb663e97 100644 --- a/tests/qemuxml2argvdata/vhost-user-fs-fd-memory.xml +++ b/tests/qemuxml2argvdata/vhost-user-fs-fd-memory.xml @@ -27,7 +27,11 @@ <controller type='usb' index='0' model='none'/> <controller type='pci' index='0' model='pci-root'/> <filesystem type='mount' accessmode='passthrough'> - <driver type='virtiofs'/> + <driver type='virtiofs' queue='1024'/> + <binary path='/usr/libexec/virtiofsd' xattr='on'> + <cache mode='always'/> + <lock posix='off' flock='off'/> + </binary> <source dir='/path'/> <target dir='mount_tag'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> diff --git a/tests/qemuxml2argvdata/vhost-user-fs-hugepages.xml b/tests/qemuxml2argvdata/vhost-user-fs-hugepages.xml index 70df7b890d..96b9774704 100644 --- a/tests/qemuxml2argvdata/vhost-user-fs-hugepages.xml +++ b/tests/qemuxml2argvdata/vhost-user-fs-hugepages.xml @@ -63,6 +63,7 @@ </controller> <filesystem type='mount' accessmode='passthrough'> <driver type='virtiofs'/> + <binary path='/usr/libexec/virtiofsd'/> <source dir='/path'/> <target dir='mount_tag'/> <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/> -- 2.21.0

Add a 'virtiofsd_debug' option for tuning whether to run virtiofsd in debug mode. Signed-off-by: Ján Tomko <jtomko@redhat.com> --- src/qemu/libvirtd_qemu.aug | 1 + src/qemu/qemu.conf | 7 +++++++ src/qemu/qemu_conf.c | 2 ++ src/qemu/qemu_conf.h | 1 + src/qemu/test_libvirtd_qemu.aug.in | 1 + 5 files changed, 12 insertions(+) diff --git a/src/qemu/libvirtd_qemu.aug b/src/qemu/libvirtd_qemu.aug index 557b6f38f8..3014fa6b86 100644 --- a/src/qemu/libvirtd_qemu.aug +++ b/src/qemu/libvirtd_qemu.aug @@ -116,6 +116,7 @@ module Libvirtd_qemu = let nvram_entry = str_array_entry "nvram" let debug_level_entry = int_entry "gluster_debug_level" + | bool_entry "virtiofsd_debug" let memory_entry = str_entry "memory_backing_dir" diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf index b6805ffc41..e82c1b5bd5 100644 --- a/src/qemu/qemu.conf +++ b/src/qemu/qemu.conf @@ -809,6 +809,13 @@ # #gluster_debug_level = 9 +# virtiofsd debug +# +# Whether to enable the debugging output of the virtiofsd daemon. +# Possible values are 0 or 1. +# +#virtiofsd_debug = 1 + # To enhance security, QEMU driver is capable of creating private namespaces # for each domain started. Well, so far only "mount" namespace is supported. If # enabled it means qemu process is unable to see all the devices on the system, diff --git a/src/qemu/qemu_conf.c b/src/qemu/qemu_conf.c index e5051027fc..61f694d438 100644 --- a/src/qemu/qemu_conf.c +++ b/src/qemu/qemu_conf.c @@ -837,6 +837,8 @@ virQEMUDriverConfigLoadDebugEntry(virQEMUDriverConfigPtr cfg, { if (virConfGetValueUInt(conf, "gluster_debug_level", &cfg->glusterDebugLevel) < 0) return -1; + if (virConfGetValueBool(conf, "virtiofsd_debug", &cfg->virtiofsdDebug) < 0) + return -1; return 0; } diff --git a/src/qemu/qemu_conf.h b/src/qemu/qemu_conf.h index cedf232223..3ce9566b71 100644 --- a/src/qemu/qemu_conf.h +++ b/src/qemu/qemu_conf.h @@ -202,6 +202,7 @@ struct _virQEMUDriverConfig { virFirmwarePtr *firmwares; size_t nfirmwares; unsigned int glusterDebugLevel; + bool virtiofsdDebug; char *memoryBackingDir; diff --git a/src/qemu/test_libvirtd_qemu.aug.in b/src/qemu/test_libvirtd_qemu.aug.in index dd90edf687..fca9a942c9 100644 --- a/src/qemu/test_libvirtd_qemu.aug.in +++ b/src/qemu/test_libvirtd_qemu.aug.in @@ -98,6 +98,7 @@ module Test_libvirtd_qemu = } { "stdio_handler" = "logd" } { "gluster_debug_level" = "9" } +{ "virtiofsd_debug" = "1" } { "namespaces" { "1" = "mount" } } -- 2.21.0

On Thu, Jan 30, 2020 at 06:06:23PM +0100, Ján Tomko wrote:
Add a 'virtiofsd_debug' option for tuning whether to run virtiofsd in debug mode.
Signed-off-by: Ján Tomko <jtomko@redhat.com> --- src/qemu/libvirtd_qemu.aug | 1 + src/qemu/qemu.conf | 7 +++++++ src/qemu/qemu_conf.c | 2 ++ src/qemu/qemu_conf.h | 1 + src/qemu/test_libvirtd_qemu.aug.in | 1 + 5 files changed, 12 insertions(+)
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Reject unsupported configurations. Signed-off-by: Ján Tomko <jtomko@redhat.com> --- src/qemu/qemu_domain.c | 28 ++++++++++++++++++++++++++-- 1 file changed, 26 insertions(+), 2 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 6367f5394e..b5d5812ff8 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -8314,8 +8314,32 @@ qemuDomainDeviceDefValidateFS(virDomainFSDefPtr fs, return -1; case VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS: - /* TODO: vhost-user-fs-pci */ - return 0; + if (fs->accessmode != VIR_DOMAIN_FS_ACCESSMODE_PASSTHROUGH) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("virtiofs only supports passthrough accessmode")); + return -1; + } + if (fs->wrpolicy != VIR_DOMAIN_FS_WRPOLICY_DEFAULT) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("virtiofs does not support wrpolicy")); + return -1; + } + if (fs->model != VIR_DOMAIN_FS_MODEL_DEFAULT) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("virtiofs does not support model")); + return -1; + } + if (fs->format != VIR_STORAGE_FILE_NONE) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("virtiofs does not support format")); + return -1; + } + if (def->mem.access != VIR_DOMAIN_MEMORY_ACCESS_SHARED) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("virtiofs requires shared memory")); + return -1; + } + break; case VIR_DOMAIN_FS_DRIVER_TYPE_LAST: default: -- 2.21.0

On Thu, Jan 30, 2020 at 06:06:24PM +0100, Ján Tomko wrote:
Reject unsupported configurations.
Signed-off-by: Ján Tomko <jtomko@redhat.com> --- src/qemu/qemu_domain.c | 28 ++++++++++++++++++++++++++-- 1 file changed, 26 insertions(+), 2 deletions(-)
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Thu, Jan 30, 2020 at 06:06:24PM +0100, Ján Tomko wrote:
Reject unsupported configurations.
Signed-off-by: Ján Tomko <jtomko@redhat.com> --- src/qemu/qemu_domain.c | 28 ++++++++++++++++++++++++++-- 1 file changed, 26 insertions(+), 2 deletions(-)
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 6367f5394e..b5d5812ff8 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -8314,8 +8314,32 @@ qemuDomainDeviceDefValidateFS(virDomainFSDefPtr fs, return -1;
case VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS: - /* TODO: vhost-user-fs-pci */ - return 0; + if (fs->accessmode != VIR_DOMAIN_FS_ACCESSMODE_PASSTHROUGH) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("virtiofs only supports passthrough accessmode")); + return -1; + } + if (fs->wrpolicy != VIR_DOMAIN_FS_WRPOLICY_DEFAULT) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("virtiofs does not support wrpolicy")); + return -1; + } + if (fs->model != VIR_DOMAIN_FS_MODEL_DEFAULT) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("virtiofs does not support model")); + return -1; + } + if (fs->format != VIR_STORAGE_FILE_NONE) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("virtiofs does not support format")); + return -1; + }
+ if (def->mem.access != VIR_DOMAIN_MEMORY_ACCESS_SHARED) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", + _("virtiofs requires shared memory")); + return -1; + }
This requires the following <memoryBacking> config, right? <memoryBacking> <access mode='shared'/> </memoryBacking> I think the <memoryBacking> isn't mandatory because the memory access policy can be over written by <numa> config like as: <numa> <cell id='0' cpus='0-31' memory='8912896' unit='KiB' memAccess='shared'/> </numa> And, I suppose virtiofsd requires one or more numa nodes. How about adding following function to check the mem config instead of the if statement? diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index 493864b854..b3b5cb2054 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -8203,6 +8203,39 @@ qemuDomainDeviceDefValidateIOMMU(const virDomainIOMMUDef *iommu, return 0; } +static int +memconfIsSufficientForVirtiofs(const virDomainDef *def) +{ + virDomainNumaPtr numa = def->numa; + virDomainMemoryAccess memAccess = def->mem.access; + virDomainMemoryAccess NodememAccess; + size_t nodes; + size_t i; + + nodes = virDomainNumaGetNodeCount(numa); + if (!nodes) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("virtiofs requires one or more numa nodes")); + return 0; + } + + for (i = 0; i < nodes; i++) { + NodememAccess = virDomainNumaGetNodeMemoryAccessMode(numa, i); + if (NodememAccess == VIR_DOMAIN_MEMORY_ACCESS_PRIVATE) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("virtiofs requires shared memory for node[%ld]"), i); + return 0; + } else if ((NodememAccess == VIR_DOMAIN_MEMORY_ACCESS_DEFAULT) && + (memAccess != VIR_DOMAIN_MEMORY_ACCESS_SHARED)) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("virtiofs requires shared memory for node[%ld]"), i); + return 0; + } + } + + return 1; +} + static int qemuDomainDeviceDefValidateFS(virDomainFSDefPtr fs, @@ -8256,9 +8289,7 @@ qemuDomainDeviceDefValidateFS(virDomainFSDefPtr fs, _("virtiofs does not support format")); return -1; } - if (def->mem.access != VIR_DOMAIN_MEMORY_ACCESS_SHARED) { - virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s", - _("virtiofs requires shared memory")); + if (!memconfIsSufficientForVirtiofs(def)) { return -1; } break; --- Thanks, Masa

This is not yet supported. Signed-off-by: Ján Tomko <jtomko@redhat.com> --- src/qemu/qemu_migration.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/src/qemu/qemu_migration.c b/src/qemu/qemu_migration.c index 9320985648..e7390fdf5b 100644 --- a/src/qemu/qemu_migration.c +++ b/src/qemu/qemu_migration.c @@ -1280,6 +1280,16 @@ qemuMigrationSrcIsAllowed(virQEMUDriverPtr driver, _("migration with shmem device is not supported")); return false; } + + for (i = 0; i < vm->def->nfss; i++) { + virDomainFSDefPtr fs = vm->def->fss[i]; + + if (fs->fsdriver == VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("migration with virtiofs device is not supported")); + return false; + } + } } return true; -- 2.21.0

Start virtiofsd for each <filesystem> device using it. Pre-create the socket for communication with QEMU and pass it to virtiofsd. Note that virtiofsd needs to run as root. https://bugzilla.redhat.com/show_bug.cgi?id=1694166 Introduced by QEMU commit a43efa34c7d7b628cbf1ec0fe60043e5c91043ea Signed-off-by: Ján Tomko <jtomko@redhat.com> --- po/POTFILES.in | 1 + src/qemu/Makefile.inc.am | 2 + src/qemu/qemu_domain.c | 5 +- src/qemu/qemu_domain.h | 2 +- src/qemu/qemu_extdevice.c | 20 ++- src/qemu/qemu_virtiofs.c | 290 ++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_virtiofs.h | 38 +++++ tests/qemuxml2argvtest.c | 11 ++ 8 files changed, 366 insertions(+), 3 deletions(-) create mode 100644 src/qemu/qemu_virtiofs.c create mode 100644 src/qemu/qemu_virtiofs.h diff --git a/po/POTFILES.in b/po/POTFILES.in index c18e21615f..813fb24199 100644 --- a/po/POTFILES.in +++ b/po/POTFILES.in @@ -168,6 +168,7 @@ @SRCDIR@/src/qemu/qemu_tpm.c @SRCDIR@/src/qemu/qemu_vhost_user.c @SRCDIR@/src/qemu/qemu_vhost_user_gpu.c +@SRCDIR@/src/qemu/qemu_virtiofs.c @SRCDIR@/src/remote/remote_daemon.c @SRCDIR@/src/remote/remote_daemon_config.c @SRCDIR@/src/remote/remote_daemon_dispatch.c diff --git a/src/qemu/Makefile.inc.am b/src/qemu/Makefile.inc.am index d04a87e659..7a205b4da6 100644 --- a/src/qemu/Makefile.inc.am +++ b/src/qemu/Makefile.inc.am @@ -67,6 +67,8 @@ QEMU_DRIVER_SOURCES = \ qemu/qemu_vhost_user.h \ qemu/qemu_vhost_user_gpu.c \ qemu/qemu_vhost_user_gpu.h \ + qemu/qemu_virtiofs.c \ + qemu/qemu_virtiofs.h \ qemu/qemu_checkpoint.c \ qemu/qemu_checkpoint.h \ qemu/qemu_backup.c \ diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index b5d5812ff8..3064c33ca8 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1440,8 +1440,11 @@ qemuDomainFSPrivateNew(void) static void -qemuDomainFSPrivateDispose(void *obj G_GNUC_UNUSED) +qemuDomainFSPrivateDispose(void *obj) { + qemuDomainFSPrivatePtr priv = obj; + + g_free(priv->vhostuser_fs_sock); } static virClassPtr qemuDomainVideoPrivateClass; diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index c581b3a162..83150e4e6d 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -568,7 +568,7 @@ typedef qemuDomainFSPrivate *qemuDomainFSPrivatePtr; struct _qemuDomainFSPrivate { virObject parent; - int dummy; + char *vhostuser_fs_sock; }; diff --git a/src/qemu/qemu_extdevice.c b/src/qemu/qemu_extdevice.c index 7f3bb104d9..5103d4921c 100644 --- a/src/qemu/qemu_extdevice.c +++ b/src/qemu/qemu_extdevice.c @@ -20,11 +20,13 @@ #include <config.h> +#include "qemu_command.h" #include "qemu_extdevice.h" #include "qemu_vhost_user_gpu.h" #include "qemu_domain.h" #include "qemu_tpm.h" #include "qemu_slirp.h" +#include "qemu_virtiofs.h" #include "viralloc.h" #include "virlog.h" @@ -153,7 +155,7 @@ qemuExtDevicesCleanupHost(virQEMUDriverPtr driver, int qemuExtDevicesStart(virQEMUDriverPtr driver, virDomainObjPtr vm, - virLogManagerPtr logManager G_GNUC_UNUSED, + virLogManagerPtr logManager, bool incomingMigration) { virDomainDefPtr def = vm->def; @@ -183,6 +185,15 @@ qemuExtDevicesStart(virQEMUDriverPtr driver, return -1; } + for (i = 0; i < def->nfss; i++) { + virDomainFSDefPtr fs = def->fss[i]; + + if (fs->fsdriver == VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS) { + if (qemuVirtioFSStart(logManager, driver, vm, fs) < 0) + return -1; + } + } + return 0; } @@ -214,6 +225,13 @@ qemuExtDevicesStop(virQEMUDriverPtr driver, if (slirp) qemuSlirpStop(slirp, vm, driver, net, false); } + + for (i = 0; i < def->nfss; i++) { + virDomainFSDefPtr fs = def->fss[i]; + + if (fs->fsdriver == VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS) + qemuVirtioFSStop(driver, vm, fs); + } } diff --git a/src/qemu/qemu_virtiofs.c b/src/qemu/qemu_virtiofs.c new file mode 100644 index 0000000000..4aa8eaed2c --- /dev/null +++ b/src/qemu/qemu_virtiofs.c @@ -0,0 +1,290 @@ +/* + * qemu_virtiofs.c: virtiofs support + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#include <config.h> + +#include <sys/types.h> +#include <sys/stat.h> +#include <fcntl.h> + +#include "logging/log_manager.h" +#include "virlog.h" +#include "qemu_command.h" +#include "qemu_conf.h" +#include "qemu_extdevice.h" +#include "qemu_security.h" +#include "qemu_virtiofs.h" +#include "virpidfile.h" + +#define VIR_FROM_THIS VIR_FROM_QEMU + + +char * +qemuVirtioFSCreatePidFilename(virQEMUDriverConfigPtr cfg, + const virDomainDef *def, + const char *alias) +{ + g_autofree char *shortName = NULL; + g_autofree char *name = NULL; + + if (!(shortName = virDomainDefGetShortName(def))) + return NULL; + + name = g_strdup_printf("%s-%s-virtiofsd", shortName, alias); + + return virPidFileBuildPath(cfg->stateDir, name); +} + + +char * +qemuVirtioFSCreateSocketFilename(virDomainObjPtr vm, + const char *alias) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + + return virFileBuildPath(priv->libDir, alias, "-virtiofsd.sock"); +} + + +static char * +qemuVirtioFSCreateLogFilename(virQEMUDriverConfigPtr cfg, + const virDomainDef *def, + const char *alias) +{ + g_autofree char *name = NULL; + + name = g_strdup_printf("%s-%s", def->name, alias); + + return virFileBuildPath(cfg->logDir, name, "-virtiofsd.log"); +} + + +static int +qemuVirtioFSOpenChardev(virQEMUDriverPtr driver, + virDomainObjPtr vm, + const char *socket_path) +{ + virDomainChrSourceDefPtr chrdev = virDomainChrSourceDefNew(NULL); + virDomainChrDef chr = { .source = chrdev }; + VIR_AUTOCLOSE fd = -1; + int ret = -1; + + chrdev->type = VIR_DOMAIN_CHR_TYPE_UNIX; + chrdev->data.nix.listen = true; + chrdev->data.nix.path = g_strdup(socket_path); + + if (qemuSecuritySetDaemonSocketLabel(driver->securityManager, vm->def) < 0) + goto cleanup; + fd = qemuOpenChrChardevUNIXSocket(chrdev); + if (fd < 0) { + ignore_value(qemuSecurityClearSocketLabel(driver->securityManager, vm->def)); + goto cleanup; + } + if (qemuSecurityClearSocketLabel(driver->securityManager, vm->def) < 0) + goto cleanup; + + if (qemuSecuritySetChardevLabel(driver, vm, &chr) < 0) + goto cleanup; + + ret = fd; + fd = -1; + + cleanup: + virObjectUnref(chrdev); + return ret; +} + +static virCommandPtr +qemuVirtioFSBuildCommandLine(virQEMUDriverConfigPtr cfg, + virDomainFSDefPtr fs, + int *fd) +{ + g_autoptr(virCommand) cmd = NULL; + g_auto(virBuffer) opts = VIR_BUFFER_INITIALIZER; + + if (!(cmd = virCommandNew(fs->binary))) + return NULL; + + virCommandAddArgFormat(cmd, "--fd=%d", *fd); + virCommandPassFD(cmd, *fd, VIR_COMMAND_PASS_FD_CLOSE_PARENT); + *fd = -1; + + virCommandAddArg(cmd, "-o"); + virBufferAsprintf(&opts, "source=%s,", fs->src->path); + if (fs->cache) + virBufferAsprintf(&opts, "cache=%s,", virDomainFSCacheModeTypeToString(fs->cache)); + + if (fs->xattr == VIR_TRISTATE_SWITCH_ON) + virBufferAddLit(&opts, "xattr,"); + else if (fs->xattr == VIR_TRISTATE_SWITCH_OFF) + virBufferAddLit(&opts, "no_xattr,"); + + if (fs->flock == VIR_TRISTATE_SWITCH_ON) + virBufferAddLit(&opts, "flock,"); + else if (fs->flock == VIR_TRISTATE_SWITCH_OFF) + virBufferAddLit(&opts, "no_flock,"); + + if (fs->posix_lock == VIR_TRISTATE_SWITCH_ON) + virBufferAddLit(&opts, "posix_lock,"); + else if (fs->posix_lock == VIR_TRISTATE_SWITCH_OFF) + virBufferAddLit(&opts, "no_posix_lock,"); + + virBufferTrim(&opts, ",", -1); + + virCommandAddArgBuffer(cmd, &opts); + if (cfg->virtiofsdDebug) + virCommandAddArg(cmd, "-d"); + + return g_steal_pointer(&cmd); +} + +int +qemuVirtioFSStart(virLogManagerPtr logManager, + virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainFSDefPtr fs) +{ + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + g_autoptr(virCommand) cmd = NULL; + g_autofree char *socket_path = NULL; + g_autofree char *pidfile = NULL; + g_autofree char *logpath = NULL; + pid_t pid = (pid_t) -1; + VIR_AUTOCLOSE fd = -1; + VIR_AUTOCLOSE logfd = -1; + int ret = -1; + int rc; + + if (!(pidfile = qemuVirtioFSCreatePidFilename(cfg, vm->def, fs->info.alias))) + goto cleanup; + + if (!(socket_path = qemuVirtioFSCreateSocketFilename(vm, fs->info.alias))) + goto cleanup; + + if ((fd = qemuVirtioFSOpenChardev(driver, vm, socket_path)) < 0) + goto cleanup; + + logpath = qemuVirtioFSCreateLogFilename(cfg, vm->def, fs->info.alias); + + if (cfg->stdioLogD) { + if ((logfd = virLogManagerDomainOpenLogFile(logManager, + "qemu", + vm->def->uuid, + vm->def->name, + logpath, + 0, + NULL, NULL)) < 0) + goto cleanup; + } else { + if ((logfd = open(logpath, O_WRONLY | O_CREAT | O_APPEND, S_IRUSR | S_IWUSR)) < 0) { + virReportSystemError(errno, _("failed to create logfile %s"), + logpath); + goto cleanup; + } + if (virSetCloseExec(logfd) < 0) { + virReportSystemError(errno, _("failed to set close-on-exec flag on %s"), + logpath); + goto error; + } + } + + if (!(cmd = qemuVirtioFSBuildCommandLine(cfg, fs, &fd))) + goto cleanup; + + virCommandSetPidFile(cmd, pidfile); + virCommandSetOutputFD(cmd, &logfd); + virCommandSetErrorFD(cmd, &logfd); + virCommandNonblockingFDs(cmd); + virCommandDaemonize(cmd); + + if (qemuExtDeviceLogCommand(driver, vm, cmd, "virtiofsd") < 0) + goto cleanup; + + rc = virCommandRun(cmd, NULL); + logfd = -1; + + if (rc < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not start 'virtiofsd'")); + goto error; + } + + rc = virPidFileReadPath(pidfile, &pid); + if (rc < 0) { + virReportSystemError(-rc, + _("Unable to read virtiofsd pidfile '%s'"), + pidfile); + goto error; + } + + if (virProcessKill(pid, 0) != 0) { + virReportSystemError(errno, "%s", + _("virtiofsd died unexpectedly")); + goto error; + } + + QEMU_DOMAIN_FS_PRIVATE(fs)->vhostuser_fs_sock = g_steal_pointer(&socket_path); + ret = 0; + + cleanup: + if (socket_path) + unlink(socket_path); + return ret; + + error: + if (pid != -1) + virProcessKillPainfully(pid, true); + if (pidfile) + unlink(pidfile); + goto cleanup; +} + + +void +qemuVirtioFSStop(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainFSDefPtr fs) +{ + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + g_autofree char *pidfile = NULL; + virErrorPtr orig_err; + pid_t pid = -1; + int rc; + + virErrorPreserveLast(&orig_err); + + if (!(pidfile = qemuVirtioFSCreatePidFilename(cfg, vm->def, fs->info.alias))) + goto cleanup; + + rc = virPidFileReadPathIfAlive(pidfile, &pid, NULL); + if (rc >= 0 && pid != (pid_t) -1) + virProcessKillPainfully(pid, true); + + if (unlink(pidfile) < 0 && + errno != ENOENT) { + virReportSystemError(errno, + _("Unable to remove stale pidfile %s"), + pidfile); + } + + if (QEMU_DOMAIN_FS_PRIVATE(fs)->vhostuser_fs_sock) + unlink(QEMU_DOMAIN_FS_PRIVATE(fs)->vhostuser_fs_sock); + + cleanup: + virErrorRestore(&orig_err); +} diff --git a/src/qemu/qemu_virtiofs.h b/src/qemu/qemu_virtiofs.h new file mode 100644 index 0000000000..49db807b19 --- /dev/null +++ b/src/qemu/qemu_virtiofs.h @@ -0,0 +1,38 @@ +/* + * qemu_virtiofs.h: virtiofs support + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#pragma once + + +char * +qemuVirtioFSCreatePidFilename(virQEMUDriverConfigPtr cfg, + const virDomainDef *def, + const char *alias); +char * +qemuVirtioFSCreateSocketFilename(virDomainObjPtr vm, + const char *alias); + +int +qemuVirtioFSStart(virLogManagerPtr logManager, + virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainFSDefPtr fs); +void +qemuVirtioFSStop(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainFSDefPtr fs); diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index a36183bf34..f268d336a6 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -496,6 +496,17 @@ testCompareXMLToArgv(const void *data) } } + for (i = 0; i < vm->def->nfss; i++) { + virDomainFSDefPtr fs = vm->def->fss[i]; + char *s; + + if (fs->fsdriver != VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS) + continue; + + s = g_strdup_printf("/tmp/lib/domain--1-guest/fs%zu.vhost-fs.sock", i); + QEMU_DOMAIN_FS_PRIVATE(fs)->vhostuser_fs_sock = s; + } + if (vm->def->vsock) { virDomainVsockDefPtr vsock = vm->def->vsock; qemuDomainVsockPrivatePtr vsockPriv = -- 2.21.0

On Thu, Jan 30, 2020 at 06:06:26PM +0100, Ján Tomko wrote:
Start virtiofsd for each <filesystem> device using it.
Pre-create the socket for communication with QEMU and pass it to virtiofsd.
Note that virtiofsd needs to run as root.
So we're not able to use virtiofsd with the session libvirtd which runs completely unprivileged ? I can understand the need to run as root if we want to support chown() of files, or DAC_OVERRIDE, but I'm surprised it isn't possible to run virtiofsd unprivileged & simply have a reduced featureset where it can only create files as that one user.
+int +qemuVirtioFSStart(virLogManagerPtr logManager, + virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainFSDefPtr fs) +{ + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + g_autoptr(virCommand) cmd = NULL; + g_autofree char *socket_path = NULL; + g_autofree char *pidfile = NULL; + g_autofree char *logpath = NULL; + pid_t pid = (pid_t) -1; + VIR_AUTOCLOSE fd = -1; + VIR_AUTOCLOSE logfd = -1; + int ret = -1; + int rc;
+ + if (!(pidfile = qemuVirtioFSCreatePidFilename(cfg, vm->def, fs->info.alias))) + goto cleanup; + + if (!(socket_path = qemuVirtioFSCreateSocketFilename(vm, fs->info.alias))) + goto cleanup; + + if ((fd = qemuVirtioFSOpenChardev(driver, vm, socket_path)) < 0) + goto cleanup; + + logpath = qemuVirtioFSCreateLogFilename(cfg, vm->def, fs->info.alias); + + if (cfg->stdioLogD) { + if ((logfd = virLogManagerDomainOpenLogFile(logManager, + "qemu", + vm->def->uuid, + vm->def->name, + logpath, + 0, + NULL, NULL)) < 0) + goto cleanup; + } else { + if ((logfd = open(logpath, O_WRONLY | O_CREAT | O_APPEND, S_IRUSR | S_IWUSR)) < 0) { + virReportSystemError(errno, _("failed to create logfile %s"), + logpath); + goto cleanup; + } + if (virSetCloseExec(logfd) < 0) { + virReportSystemError(errno, _("failed to set close-on-exec flag on %s"), + logpath); + goto error; + } + } + + if (!(cmd = qemuVirtioFSBuildCommandLine(cfg, fs, &fd))) + goto cleanup; + + virCommandSetPidFile(cmd, pidfile); + virCommandSetOutputFD(cmd, &logfd); + virCommandSetErrorFD(cmd, &logfd); + virCommandNonblockingFDs(cmd); + virCommandDaemonize(cmd);
We're not mandating "root" here, it is just inheriting the user that libvirtd runs as. So IIUC ,this will run virtofsd as non-root when used with session libvirtd, unless there's a check somewhere else that prevents this scenario ? I'm also wondering about cgroups placement in this method, and any use of SELinux
+ + if (qemuExtDeviceLogCommand(driver, vm, cmd, "virtiofsd") < 0) + goto cleanup; + + rc = virCommandRun(cmd, NULL); + logfd = -1; + + if (rc < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not start 'virtiofsd'")); + goto error; + } + + rc = virPidFileReadPath(pidfile, &pid); + if (rc < 0) { + virReportSystemError(-rc, + _("Unable to read virtiofsd pidfile '%s'"), + pidfile); + goto error; + } + + if (virProcessKill(pid, 0) != 0) { + virReportSystemError(errno, "%s", + _("virtiofsd died unexpectedly")); + goto error; + } + + QEMU_DOMAIN_FS_PRIVATE(fs)->vhostuser_fs_sock = g_steal_pointer(&socket_path); + ret = 0; + + cleanup: + if (socket_path) + unlink(socket_path); + return ret; + + error: + if (pid != -1) + virProcessKillPainfully(pid, true); + if (pidfile) + unlink(pidfile); + goto cleanup; +}
Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

[adding virtiofs list] On Mon, Feb 03, 2020 at 04:43:51PM +0000, Daniel P. Berrangé wrote:
On Thu, Jan 30, 2020 at 06:06:26PM +0100, Ján Tomko wrote:
Start virtiofsd for each <filesystem> device using it.
Pre-create the socket for communication with QEMU and pass it to virtiofsd.
Note that virtiofsd needs to run as root.
So we're not able to use virtiofsd with the session libvirtd which runs completely unprivileged ?
Not with the version of virtiofsd currently merged in the QEMU tree.
I can understand the need to run as root if we want to support chown() of files, or DAC_OVERRIDE, but I'm surprised it isn't possible to run virtiofsd unprivileged & simply have a reduced featureset where it can only create files as that one user.
Apart from the possibly missing features (I don't know how well virtiofsd internals are ready for those), current version of the daemon sets up namespaces and the seccomp sandbox.
+int +qemuVirtioFSStart(virLogManagerPtr logManager, + virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainFSDefPtr fs) +{ + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + g_autoptr(virCommand) cmd = NULL; + g_autofree char *socket_path = NULL; + g_autofree char *pidfile = NULL; + g_autofree char *logpath = NULL; + pid_t pid = (pid_t) -1; + VIR_AUTOCLOSE fd = -1; + VIR_AUTOCLOSE logfd = -1; + int ret = -1; + int rc;
+ + if (!(pidfile = qemuVirtioFSCreatePidFilename(cfg, vm->def, fs->info.alias))) + goto cleanup; + + if (!(socket_path = qemuVirtioFSCreateSocketFilename(vm, fs->info.alias))) + goto cleanup; + + if ((fd = qemuVirtioFSOpenChardev(driver, vm, socket_path)) < 0) + goto cleanup; + + logpath = qemuVirtioFSCreateLogFilename(cfg, vm->def, fs->info.alias); + + if (cfg->stdioLogD) { + if ((logfd = virLogManagerDomainOpenLogFile(logManager, + "qemu", + vm->def->uuid, + vm->def->name, + logpath, + 0, + NULL, NULL)) < 0) + goto cleanup; + } else { + if ((logfd = open(logpath, O_WRONLY | O_CREAT | O_APPEND, S_IRUSR | S_IWUSR)) < 0) { + virReportSystemError(errno, _("failed to create logfile %s"), + logpath); + goto cleanup; + } + if (virSetCloseExec(logfd) < 0) { + virReportSystemError(errno, _("failed to set close-on-exec flag on %s"), + logpath); + goto error; + } + } + + if (!(cmd = qemuVirtioFSBuildCommandLine(cfg, fs, &fd))) + goto cleanup; + + virCommandSetPidFile(cmd, pidfile); + virCommandSetOutputFD(cmd, &logfd); + virCommandSetErrorFD(cmd, &logfd); + virCommandNonblockingFDs(cmd); + virCommandDaemonize(cmd);
We're not mandating "root" here, it is just inheriting the user that libvirtd runs as. So IIUC ,this will run virtofsd as non-root when used with session libvirtd, unless there's a check somewhere else that prevents this scenario ?
I'll add a check.
I'm also wondering about cgroups placement in this method, and any use of SELinux
Placing it into a cgroup should be easy, AFAIK it does not need to access any devices. As for SELinux, I don't think there's anything to be done other than updating the selinux-policy. Recursively relabeling the whole directory feels intrusive. Jano
+ + if (qemuExtDeviceLogCommand(driver, vm, cmd, "virtiofsd") < 0) + goto cleanup; + + rc = virCommandRun(cmd, NULL); + logfd = -1;

On Mon, Feb 03, 2020 at 07:36:09PM +0100, Ján Tomko wrote:
[adding virtiofs list]
On Mon, Feb 03, 2020 at 04:43:51PM +0000, Daniel P. Berrangé wrote:
On Thu, Jan 30, 2020 at 06:06:26PM +0100, Ján Tomko wrote:
Start virtiofsd for each <filesystem> device using it.
Pre-create the socket for communication with QEMU and pass it to virtiofsd.
Note that virtiofsd needs to run as root.
So we're not able to use virtiofsd with the session libvirtd which runs completely unprivileged ?
Not with the version of virtiofsd currently merged in the QEMU tree.
I can understand the need to run as root if we want to support chown() of files, or DAC_OVERRIDE, but I'm surprised it isn't possible to run virtiofsd unprivileged & simply have a reduced featureset where it can only create files as that one user.
Apart from the possibly missing features (I don't know how well virtiofsd internals are ready for those), current version of the daemon sets up namespaces and the seccomp sandbox.
+int +qemuVirtioFSStart(virLogManagerPtr logManager, + virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainFSDefPtr fs) +{ + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + g_autoptr(virCommand) cmd = NULL; + g_autofree char *socket_path = NULL; + g_autofree char *pidfile = NULL; + g_autofree char *logpath = NULL; + pid_t pid = (pid_t) -1; + VIR_AUTOCLOSE fd = -1; + VIR_AUTOCLOSE logfd = -1; + int ret = -1; + int rc;
+ + if (!(pidfile = qemuVirtioFSCreatePidFilename(cfg, vm->def, fs->info.alias))) + goto cleanup; + + if (!(socket_path = qemuVirtioFSCreateSocketFilename(vm, fs->info.alias))) + goto cleanup; + + if ((fd = qemuVirtioFSOpenChardev(driver, vm, socket_path)) < 0) + goto cleanup; + + logpath = qemuVirtioFSCreateLogFilename(cfg, vm->def, fs->info.alias); + + if (cfg->stdioLogD) { + if ((logfd = virLogManagerDomainOpenLogFile(logManager, + "qemu", + vm->def->uuid, + vm->def->name, + logpath, + 0, + NULL, NULL)) < 0) + goto cleanup; + } else { + if ((logfd = open(logpath, O_WRONLY | O_CREAT | O_APPEND, S_IRUSR | S_IWUSR)) < 0) { + virReportSystemError(errno, _("failed to create logfile %s"), + logpath); + goto cleanup; + } + if (virSetCloseExec(logfd) < 0) { + virReportSystemError(errno, _("failed to set close-on-exec flag on %s"), + logpath); + goto error; + } + } + + if (!(cmd = qemuVirtioFSBuildCommandLine(cfg, fs, &fd))) + goto cleanup; + + virCommandSetPidFile(cmd, pidfile); + virCommandSetOutputFD(cmd, &logfd); + virCommandSetErrorFD(cmd, &logfd); + virCommandNonblockingFDs(cmd); + virCommandDaemonize(cmd);
We're not mandating "root" here, it is just inheriting the user that libvirtd runs as. So IIUC ,this will run virtofsd as non-root when used with session libvirtd, unless there's a check somewhere else that prevents this scenario ?
I'll add a check.
I'm also wondering about cgroups placement in this method, and any use of SELinux
Placing it into a cgroup should be easy, AFAIK it does not need to access any devices.
As for SELinux, I don't think there's anything to be done other than updating the selinux-policy. Recursively relabeling the whole directory feels intrusive.
Even if we don't do relabelling, we'll still need more than just policy work I believe. If we assume a new "virtiofsd_t" SELinux type. Ff QEMU is launched svirt_t:s0:c123,c532, we will need to explicitly spawn virtiofsd with the matching MCS category eg virtiofsd_t:s0:c123,c532. The policy should be written such that the UNIX domain socket used for comms between QEMU and virtiofsd requires this matching MCS category label. This ensures that QEMU Guest-A can't connect to a virtiofsd used by QEMU Guest-B and vica-verca. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Mon, Feb 03, 2020 at 07:36:09PM +0100, Ján Tomko wrote:
[adding virtiofs list]
On Mon, Feb 03, 2020 at 04:43:51PM +0000, Daniel P. Berrangé wrote:
On Thu, Jan 30, 2020 at 06:06:26PM +0100, Ján Tomko wrote:
Start virtiofsd for each <filesystem> device using it.
Pre-create the socket for communication with QEMU and pass it to virtiofsd.
Note that virtiofsd needs to run as root.
So we're not able to use virtiofsd with the session libvirtd which runs completely unprivileged ?
Not with the version of virtiofsd currently merged in the QEMU tree.
I can understand the need to run as root if we want to support chown() of files, or DAC_OVERRIDE, but I'm surprised it isn't possible to run virtiofsd unprivileged & simply have a reduced featureset where it can only create files as that one user.
Apart from the possibly missing features (I don't know how well virtiofsd internals are ready for those), current version of the daemon sets up namespaces and the seccomp sandbox.
Yes, running unprivileged is not a configuration that has been investigated yet. Today virtiofsd needs to be launched as root. It would be interesting to rely on user namespaces to provide unprivileged virtio-fs while still supporting more than 1 uid/gid. I'm not a fan of using xattrs or other out-of-band information to store guest ownership and permission information. That is inconvenient to manage (non-virtio-fs users of the directory won't see the right uid/gid), reduces performance, and likely has race conditions. Stefan

On Thu, Feb 06, 2020 at 01:46:58PM +0000, Stefan Hajnoczi wrote:
On Mon, Feb 03, 2020 at 07:36:09PM +0100, Ján Tomko wrote:
[adding virtiofs list]
On Mon, Feb 03, 2020 at 04:43:51PM +0000, Daniel P. Berrangé wrote:
On Thu, Jan 30, 2020 at 06:06:26PM +0100, Ján Tomko wrote:
Start virtiofsd for each <filesystem> device using it.
Pre-create the socket for communication with QEMU and pass it to virtiofsd.
Note that virtiofsd needs to run as root.
So we're not able to use virtiofsd with the session libvirtd which runs completely unprivileged ?
Not with the version of virtiofsd currently merged in the QEMU tree.
I can understand the need to run as root if we want to support chown() of files, or DAC_OVERRIDE, but I'm surprised it isn't possible to run virtiofsd unprivileged & simply have a reduced featureset where it can only create files as that one user.
Apart from the possibly missing features (I don't know how well virtiofsd internals are ready for those), current version of the daemon sets up namespaces and the seccomp sandbox.
Yes, running unprivileged is not a configuration that has been investigated yet. Today virtiofsd needs to be launched as root.
It would be interesting to rely on user namespaces to provide unprivileged virtio-fs while still supporting more than 1 uid/gid.
I'm not a fan of using xattrs or other out-of-band information to store guest ownership and permission information. That is inconvenient to manage (non-virtio-fs users of the directory won't see the right uid/gid), reduces performance, and likely has race conditions.
I don't think these problem needs to be considered a blockers for the ablity to use as an unprivileged user. There are valid use cases for virtiofs unprivileged, which would be satisfied without needing the ability for the guest to use any UID other than the one the host user has. eg for desktop virt I run a VM as my unprivileged user and I simply want to expose my $HOME (or a subdir) to that guest for convenient file sharing. I'm using the same UID in host & guest for my login. There is no reason for virtiofsd to need to store ownership/permissions out of band. It can directly expose the host file ownership/permissions to the guest, as 9p has been able todo. My guest will simply not be able to chown() files, and will be bound by permissions as normal. I also don't think we need the same security isolation for unprivileged usage. IOW, I would simply like virtiofsd to be capable of simply disabling its use of namespaces, and any other features which are forbidden to non-root users. If guest operations on files result in EPERM that is fine. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Thu, Jan 30, 2020 at 06:06:26PM +0100, Ján Tomko wrote:
Start virtiofsd for each <filesystem> device using it.
Pre-create the socket for communication with QEMU and pass it to virtiofsd.
Note that virtiofsd needs to run as root.
https://bugzilla.redhat.com/show_bug.cgi?id=1694166
Introduced by QEMU commit a43efa34c7d7b628cbf1ec0fe60043e5c91043ea
Signed-off-by: Ján Tomko <jtomko@redhat.com> --- po/POTFILES.in | 1 + src/qemu/Makefile.inc.am | 2 + src/qemu/qemu_domain.c | 5 +- src/qemu/qemu_domain.h | 2 +- src/qemu/qemu_extdevice.c | 20 ++- src/qemu/qemu_virtiofs.c | 290 ++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_virtiofs.h | 38 +++++ tests/qemuxml2argvtest.c | 11 ++ 8 files changed, 366 insertions(+), 3 deletions(-) create mode 100644 src/qemu/qemu_virtiofs.c create mode 100644 src/qemu/qemu_virtiofs.h
diff --git a/po/POTFILES.in b/po/POTFILES.in index c18e21615f..813fb24199 100644 --- a/po/POTFILES.in +++ b/po/POTFILES.in @@ -168,6 +168,7 @@ @SRCDIR@/src/qemu/qemu_tpm.c @SRCDIR@/src/qemu/qemu_vhost_user.c @SRCDIR@/src/qemu/qemu_vhost_user_gpu.c +@SRCDIR@/src/qemu/qemu_virtiofs.c @SRCDIR@/src/remote/remote_daemon.c @SRCDIR@/src/remote/remote_daemon_config.c @SRCDIR@/src/remote/remote_daemon_dispatch.c diff --git a/src/qemu/Makefile.inc.am b/src/qemu/Makefile.inc.am index d04a87e659..7a205b4da6 100644 --- a/src/qemu/Makefile.inc.am +++ b/src/qemu/Makefile.inc.am @@ -67,6 +67,8 @@ QEMU_DRIVER_SOURCES = \ qemu/qemu_vhost_user.h \ qemu/qemu_vhost_user_gpu.c \ qemu/qemu_vhost_user_gpu.h \ + qemu/qemu_virtiofs.c \ + qemu/qemu_virtiofs.h \ qemu/qemu_checkpoint.c \ qemu/qemu_checkpoint.h \ qemu/qemu_backup.c \ diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index b5d5812ff8..3064c33ca8 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1440,8 +1440,11 @@ qemuDomainFSPrivateNew(void)
static void -qemuDomainFSPrivateDispose(void *obj G_GNUC_UNUSED) +qemuDomainFSPrivateDispose(void *obj) { + qemuDomainFSPrivatePtr priv = obj; + + g_free(priv->vhostuser_fs_sock); }
How about adding close the logfile fd here? That is because virtlogd keeps opening the fd even after the guest is shutdowned. # virsh shutdown guest Domain guest is being shutdown # virsh list Id Name State -------------------- # virsh start guest error: Failed to start domain guest error: Cannot open log file: '/var/log/libvirt/qemu/guest-fs0-virtiofsd.log': Device or resource busy # lsof | grep /var/log/libvirt/qemu/guest-fs0-virtiofsd.log virtlogd 5133 root 19w REG 253,0 0 35189059 /var/log/libvirt/qemu/guest-fs0-virtiofsd.log # Sample patch to close logfile fd: diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index b3b5cb2054..7d5b7775fa 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1439,6 +1439,7 @@ qemuDomainFSPrivateDispose(void *obj) qemuDomainFSPrivatePtr priv = obj; g_free(priv->vhostuser_fs_sock); + VIR_FORCE_CLOSE(priv->logfd); } static virClassPtr qemuDomainVideoPrivateClass; diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index 83150e4e6d..e4b3ac471f 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -569,6 +569,7 @@ struct _qemuDomainFSPrivate { virObject parent; char *vhostuser_fs_sock; + int logfd; }; diff --git a/src/qemu/qemu_virtiofs.c b/src/qemu/qemu_virtiofs.c index 4ea8f23fd5..17532bac20 100644 --- a/src/qemu/qemu_virtiofs.c +++ b/src/qemu/qemu_virtiofs.c @@ -217,7 +217,7 @@ qemuVirtioFSStart(virLogManagerPtr logManager, goto cleanup; rc = virCommandRun(cmd, NULL); - logfd = -1; + QEMU_DOMAIN_FS_PRIVATE(fs)->logfd = logfd; if (rc < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", -- Thanks, Masa
static virClassPtr qemuDomainVideoPrivateClass; diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index c581b3a162..83150e4e6d 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -568,7 +568,7 @@ typedef qemuDomainFSPrivate *qemuDomainFSPrivatePtr; struct _qemuDomainFSPrivate { virObject parent;
- int dummy; + char *vhostuser_fs_sock; };
diff --git a/src/qemu/qemu_extdevice.c b/src/qemu/qemu_extdevice.c index 7f3bb104d9..5103d4921c 100644 --- a/src/qemu/qemu_extdevice.c +++ b/src/qemu/qemu_extdevice.c @@ -20,11 +20,13 @@
#include <config.h>
+#include "qemu_command.h" #include "qemu_extdevice.h" #include "qemu_vhost_user_gpu.h" #include "qemu_domain.h" #include "qemu_tpm.h" #include "qemu_slirp.h" +#include "qemu_virtiofs.h"
#include "viralloc.h" #include "virlog.h" @@ -153,7 +155,7 @@ qemuExtDevicesCleanupHost(virQEMUDriverPtr driver, int qemuExtDevicesStart(virQEMUDriverPtr driver, virDomainObjPtr vm, - virLogManagerPtr logManager G_GNUC_UNUSED, + virLogManagerPtr logManager, bool incomingMigration) { virDomainDefPtr def = vm->def; @@ -183,6 +185,15 @@ qemuExtDevicesStart(virQEMUDriverPtr driver, return -1; }
+ for (i = 0; i < def->nfss; i++) { + virDomainFSDefPtr fs = def->fss[i]; + + if (fs->fsdriver == VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS) { + if (qemuVirtioFSStart(logManager, driver, vm, fs) < 0) + return -1; + } + } + return 0; }
@@ -214,6 +225,13 @@ qemuExtDevicesStop(virQEMUDriverPtr driver, if (slirp) qemuSlirpStop(slirp, vm, driver, net, false); } + + for (i = 0; i < def->nfss; i++) { + virDomainFSDefPtr fs = def->fss[i]; + + if (fs->fsdriver == VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS) + qemuVirtioFSStop(driver, vm, fs); + } }
diff --git a/src/qemu/qemu_virtiofs.c b/src/qemu/qemu_virtiofs.c new file mode 100644 index 0000000000..4aa8eaed2c --- /dev/null +++ b/src/qemu/qemu_virtiofs.c @@ -0,0 +1,290 @@ +/* + * qemu_virtiofs.c: virtiofs support + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#include <config.h> + +#include <sys/types.h> +#include <sys/stat.h> +#include <fcntl.h> + +#include "logging/log_manager.h" +#include "virlog.h" +#include "qemu_command.h" +#include "qemu_conf.h" +#include "qemu_extdevice.h" +#include "qemu_security.h" +#include "qemu_virtiofs.h" +#include "virpidfile.h" + +#define VIR_FROM_THIS VIR_FROM_QEMU + + +char * +qemuVirtioFSCreatePidFilename(virQEMUDriverConfigPtr cfg, + const virDomainDef *def, + const char *alias) +{ + g_autofree char *shortName = NULL; + g_autofree char *name = NULL; + + if (!(shortName = virDomainDefGetShortName(def))) + return NULL; + + name = g_strdup_printf("%s-%s-virtiofsd", shortName, alias); + + return virPidFileBuildPath(cfg->stateDir, name); +} + + +char * +qemuVirtioFSCreateSocketFilename(virDomainObjPtr vm, + const char *alias) +{ + qemuDomainObjPrivatePtr priv = vm->privateData; + + return virFileBuildPath(priv->libDir, alias, "-virtiofsd.sock"); +} + + +static char * +qemuVirtioFSCreateLogFilename(virQEMUDriverConfigPtr cfg, + const virDomainDef *def, + const char *alias) +{ + g_autofree char *name = NULL; + + name = g_strdup_printf("%s-%s", def->name, alias); + + return virFileBuildPath(cfg->logDir, name, "-virtiofsd.log"); +} + + +static int +qemuVirtioFSOpenChardev(virQEMUDriverPtr driver, + virDomainObjPtr vm, + const char *socket_path) +{ + virDomainChrSourceDefPtr chrdev = virDomainChrSourceDefNew(NULL); + virDomainChrDef chr = { .source = chrdev }; + VIR_AUTOCLOSE fd = -1; + int ret = -1; + + chrdev->type = VIR_DOMAIN_CHR_TYPE_UNIX; + chrdev->data.nix.listen = true; + chrdev->data.nix.path = g_strdup(socket_path); + + if (qemuSecuritySetDaemonSocketLabel(driver->securityManager, vm->def) < 0) + goto cleanup; + fd = qemuOpenChrChardevUNIXSocket(chrdev); + if (fd < 0) { + ignore_value(qemuSecurityClearSocketLabel(driver->securityManager, vm->def)); + goto cleanup; + } + if (qemuSecurityClearSocketLabel(driver->securityManager, vm->def) < 0) + goto cleanup; + + if (qemuSecuritySetChardevLabel(driver, vm, &chr) < 0) + goto cleanup; + + ret = fd; + fd = -1; + + cleanup: + virObjectUnref(chrdev); + return ret; +} + +static virCommandPtr +qemuVirtioFSBuildCommandLine(virQEMUDriverConfigPtr cfg, + virDomainFSDefPtr fs, + int *fd) +{ + g_autoptr(virCommand) cmd = NULL; + g_auto(virBuffer) opts = VIR_BUFFER_INITIALIZER; + + if (!(cmd = virCommandNew(fs->binary))) + return NULL; + + virCommandAddArgFormat(cmd, "--fd=%d", *fd); + virCommandPassFD(cmd, *fd, VIR_COMMAND_PASS_FD_CLOSE_PARENT); + *fd = -1; + + virCommandAddArg(cmd, "-o"); + virBufferAsprintf(&opts, "source=%s,", fs->src->path); + if (fs->cache) + virBufferAsprintf(&opts, "cache=%s,", virDomainFSCacheModeTypeToString(fs->cache)); + + if (fs->xattr == VIR_TRISTATE_SWITCH_ON) + virBufferAddLit(&opts, "xattr,"); + else if (fs->xattr == VIR_TRISTATE_SWITCH_OFF) + virBufferAddLit(&opts, "no_xattr,"); + + if (fs->flock == VIR_TRISTATE_SWITCH_ON) + virBufferAddLit(&opts, "flock,"); + else if (fs->flock == VIR_TRISTATE_SWITCH_OFF) + virBufferAddLit(&opts, "no_flock,"); + + if (fs->posix_lock == VIR_TRISTATE_SWITCH_ON) + virBufferAddLit(&opts, "posix_lock,"); + else if (fs->posix_lock == VIR_TRISTATE_SWITCH_OFF) + virBufferAddLit(&opts, "no_posix_lock,"); + + virBufferTrim(&opts, ",", -1); + + virCommandAddArgBuffer(cmd, &opts); + if (cfg->virtiofsdDebug) + virCommandAddArg(cmd, "-d"); + + return g_steal_pointer(&cmd); +} + +int +qemuVirtioFSStart(virLogManagerPtr logManager, + virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainFSDefPtr fs) +{ + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + g_autoptr(virCommand) cmd = NULL; + g_autofree char *socket_path = NULL; + g_autofree char *pidfile = NULL; + g_autofree char *logpath = NULL; + pid_t pid = (pid_t) -1; + VIR_AUTOCLOSE fd = -1; + VIR_AUTOCLOSE logfd = -1; + int ret = -1; + int rc; + + if (!(pidfile = qemuVirtioFSCreatePidFilename(cfg, vm->def, fs->info.alias))) + goto cleanup; + + if (!(socket_path = qemuVirtioFSCreateSocketFilename(vm, fs->info.alias))) + goto cleanup; + + if ((fd = qemuVirtioFSOpenChardev(driver, vm, socket_path)) < 0) + goto cleanup; + + logpath = qemuVirtioFSCreateLogFilename(cfg, vm->def, fs->info.alias); + + if (cfg->stdioLogD) { + if ((logfd = virLogManagerDomainOpenLogFile(logManager, + "qemu", + vm->def->uuid, + vm->def->name, + logpath, + 0, + NULL, NULL)) < 0) + goto cleanup; + } else { + if ((logfd = open(logpath, O_WRONLY | O_CREAT | O_APPEND, S_IRUSR | S_IWUSR)) < 0) { + virReportSystemError(errno, _("failed to create logfile %s"), + logpath); + goto cleanup; + } + if (virSetCloseExec(logfd) < 0) { + virReportSystemError(errno, _("failed to set close-on-exec flag on %s"), + logpath); + goto error; + } + } + + if (!(cmd = qemuVirtioFSBuildCommandLine(cfg, fs, &fd))) + goto cleanup; + + virCommandSetPidFile(cmd, pidfile); + virCommandSetOutputFD(cmd, &logfd); + virCommandSetErrorFD(cmd, &logfd); + virCommandNonblockingFDs(cmd); + virCommandDaemonize(cmd); + + if (qemuExtDeviceLogCommand(driver, vm, cmd, "virtiofsd") < 0) + goto cleanup; + + rc = virCommandRun(cmd, NULL); + logfd = -1; + + if (rc < 0) { + virReportError(VIR_ERR_INTERNAL_ERROR, "%s", + _("Could not start 'virtiofsd'")); + goto error; + } + + rc = virPidFileReadPath(pidfile, &pid); + if (rc < 0) { + virReportSystemError(-rc, + _("Unable to read virtiofsd pidfile '%s'"), + pidfile); + goto error; + } + + if (virProcessKill(pid, 0) != 0) { + virReportSystemError(errno, "%s", + _("virtiofsd died unexpectedly")); + goto error; + } + + QEMU_DOMAIN_FS_PRIVATE(fs)->vhostuser_fs_sock = g_steal_pointer(&socket_path); + ret = 0; + + cleanup: + if (socket_path) + unlink(socket_path); + return ret; + + error: + if (pid != -1) + virProcessKillPainfully(pid, true); + if (pidfile) + unlink(pidfile); + goto cleanup; +} + + +void +qemuVirtioFSStop(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainFSDefPtr fs) +{ + g_autoptr(virQEMUDriverConfig) cfg = virQEMUDriverGetConfig(driver); + g_autofree char *pidfile = NULL; + virErrorPtr orig_err; + pid_t pid = -1; + int rc; + + virErrorPreserveLast(&orig_err); + + if (!(pidfile = qemuVirtioFSCreatePidFilename(cfg, vm->def, fs->info.alias))) + goto cleanup; + + rc = virPidFileReadPathIfAlive(pidfile, &pid, NULL); + if (rc >= 0 && pid != (pid_t) -1) + virProcessKillPainfully(pid, true); + + if (unlink(pidfile) < 0 && + errno != ENOENT) { + virReportSystemError(errno, + _("Unable to remove stale pidfile %s"), + pidfile); + } + + if (QEMU_DOMAIN_FS_PRIVATE(fs)->vhostuser_fs_sock) + unlink(QEMU_DOMAIN_FS_PRIVATE(fs)->vhostuser_fs_sock); + + cleanup: + virErrorRestore(&orig_err); +} diff --git a/src/qemu/qemu_virtiofs.h b/src/qemu/qemu_virtiofs.h new file mode 100644 index 0000000000..49db807b19 --- /dev/null +++ b/src/qemu/qemu_virtiofs.h @@ -0,0 +1,38 @@ +/* + * qemu_virtiofs.h: virtiofs support + * + * This library is free software; you can redistribute it and/or + * modify it under the terms of the GNU Lesser General Public + * License as published by the Free Software Foundation; either + * version 2.1 of the License, or (at your option) any later version. + * + * This library is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + * Lesser General Public License for more details. + * + * You should have received a copy of the GNU Lesser General Public + * License along with this library. If not, see + * <http://www.gnu.org/licenses/>. + */ + +#pragma once + + +char * +qemuVirtioFSCreatePidFilename(virQEMUDriverConfigPtr cfg, + const virDomainDef *def, + const char *alias); +char * +qemuVirtioFSCreateSocketFilename(virDomainObjPtr vm, + const char *alias); + +int +qemuVirtioFSStart(virLogManagerPtr logManager, + virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainFSDefPtr fs); +void +qemuVirtioFSStop(virQEMUDriverPtr driver, + virDomainObjPtr vm, + virDomainFSDefPtr fs); diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index a36183bf34..f268d336a6 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -496,6 +496,17 @@ testCompareXMLToArgv(const void *data) } }
+ for (i = 0; i < vm->def->nfss; i++) { + virDomainFSDefPtr fs = vm->def->fss[i]; + char *s; + + if (fs->fsdriver != VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS) + continue; + + s = g_strdup_printf("/tmp/lib/domain--1-guest/fs%zu.vhost-fs.sock", i); + QEMU_DOMAIN_FS_PRIVATE(fs)->vhostuser_fs_sock = s; + } + if (vm->def->vsock) { virDomainVsockDefPtr vsock = vm->def->vsock; qemuDomainVsockPrivatePtr vsockPriv = -- 2.21.0

On Mon, Feb 03, 2020 at 10:16:29PM -0500, Masayoshi Mizuma wrote:
On Thu, Jan 30, 2020 at 06:06:26PM +0100, Ján Tomko wrote:
Start virtiofsd for each <filesystem> device using it.
Pre-create the socket for communication with QEMU and pass it to virtiofsd.
Note that virtiofsd needs to run as root.
https://bugzilla.redhat.com/show_bug.cgi?id=1694166
Introduced by QEMU commit a43efa34c7d7b628cbf1ec0fe60043e5c91043ea
Signed-off-by: Ján Tomko <jtomko@redhat.com> --- po/POTFILES.in | 1 + src/qemu/Makefile.inc.am | 2 + src/qemu/qemu_domain.c | 5 +- src/qemu/qemu_domain.h | 2 +- src/qemu/qemu_extdevice.c | 20 ++- src/qemu/qemu_virtiofs.c | 290 ++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_virtiofs.h | 38 +++++ tests/qemuxml2argvtest.c | 11 ++ 8 files changed, 366 insertions(+), 3 deletions(-) create mode 100644 src/qemu/qemu_virtiofs.c create mode 100644 src/qemu/qemu_virtiofs.h
diff --git a/po/POTFILES.in b/po/POTFILES.in index c18e21615f..813fb24199 100644 --- a/po/POTFILES.in +++ b/po/POTFILES.in @@ -168,6 +168,7 @@ @SRCDIR@/src/qemu/qemu_tpm.c @SRCDIR@/src/qemu/qemu_vhost_user.c @SRCDIR@/src/qemu/qemu_vhost_user_gpu.c +@SRCDIR@/src/qemu/qemu_virtiofs.c @SRCDIR@/src/remote/remote_daemon.c @SRCDIR@/src/remote/remote_daemon_config.c @SRCDIR@/src/remote/remote_daemon_dispatch.c diff --git a/src/qemu/Makefile.inc.am b/src/qemu/Makefile.inc.am index d04a87e659..7a205b4da6 100644 --- a/src/qemu/Makefile.inc.am +++ b/src/qemu/Makefile.inc.am @@ -67,6 +67,8 @@ QEMU_DRIVER_SOURCES = \ qemu/qemu_vhost_user.h \ qemu/qemu_vhost_user_gpu.c \ qemu/qemu_vhost_user_gpu.h \ + qemu/qemu_virtiofs.c \ + qemu/qemu_virtiofs.h \ qemu/qemu_checkpoint.c \ qemu/qemu_checkpoint.h \ qemu/qemu_backup.c \ diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index b5d5812ff8..3064c33ca8 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1440,8 +1440,11 @@ qemuDomainFSPrivateNew(void)
static void -qemuDomainFSPrivateDispose(void *obj G_GNUC_UNUSED) +qemuDomainFSPrivateDispose(void *obj) { + qemuDomainFSPrivatePtr priv = obj; + + g_free(priv->vhostuser_fs_sock); }
How about adding close the logfile fd here? That is because virtlogd keeps opening the fd even after the guest is
Thanks for catching that. Currently libvirtd does not access the log after starting up the virtiofsd so all that is needed is to close it in this function: diff --git a/src/qemu/qemu_virtiofs.c b/src/qemu/qemu_virtiofs.c index e0683d6468..067cef7e22 100644 --- a/src/qemu/qemu_virtiofs.c +++ b/src/qemu/qemu_virtiofs.c @@ -217,7 +217,6 @@ qemuVirtioFSStart(virLogManagerPtr logManager, goto cleanup; rc = virCommandRun(cmd, NULL); - logfd = -1; if (rc < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s", (and the VIR_AUTOCLOSE attribute will take care of closing it when it goes out of scope) Jano

On Tue, Feb 04, 2020 at 05:36:23PM +0100, Ján Tomko wrote:
On Mon, Feb 03, 2020 at 10:16:29PM -0500, Masayoshi Mizuma wrote:
On Thu, Jan 30, 2020 at 06:06:26PM +0100, Ján Tomko wrote:
Start virtiofsd for each <filesystem> device using it.
Pre-create the socket for communication with QEMU and pass it to virtiofsd.
Note that virtiofsd needs to run as root.
https://bugzilla.redhat.com/show_bug.cgi?id=1694166
Introduced by QEMU commit a43efa34c7d7b628cbf1ec0fe60043e5c91043ea
Signed-off-by: Ján Tomko <jtomko@redhat.com> --- po/POTFILES.in | 1 + src/qemu/Makefile.inc.am | 2 + src/qemu/qemu_domain.c | 5 +- src/qemu/qemu_domain.h | 2 +- src/qemu/qemu_extdevice.c | 20 ++- src/qemu/qemu_virtiofs.c | 290 ++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_virtiofs.h | 38 +++++ tests/qemuxml2argvtest.c | 11 ++ 8 files changed, 366 insertions(+), 3 deletions(-) create mode 100644 src/qemu/qemu_virtiofs.c create mode 100644 src/qemu/qemu_virtiofs.h
diff --git a/po/POTFILES.in b/po/POTFILES.in index c18e21615f..813fb24199 100644 --- a/po/POTFILES.in +++ b/po/POTFILES.in @@ -168,6 +168,7 @@ @SRCDIR@/src/qemu/qemu_tpm.c @SRCDIR@/src/qemu/qemu_vhost_user.c @SRCDIR@/src/qemu/qemu_vhost_user_gpu.c +@SRCDIR@/src/qemu/qemu_virtiofs.c @SRCDIR@/src/remote/remote_daemon.c @SRCDIR@/src/remote/remote_daemon_config.c @SRCDIR@/src/remote/remote_daemon_dispatch.c diff --git a/src/qemu/Makefile.inc.am b/src/qemu/Makefile.inc.am index d04a87e659..7a205b4da6 100644 --- a/src/qemu/Makefile.inc.am +++ b/src/qemu/Makefile.inc.am @@ -67,6 +67,8 @@ QEMU_DRIVER_SOURCES = \ qemu/qemu_vhost_user.h \ qemu/qemu_vhost_user_gpu.c \ qemu/qemu_vhost_user_gpu.h \ + qemu/qemu_virtiofs.c \ + qemu/qemu_virtiofs.h \ qemu/qemu_checkpoint.c \ qemu/qemu_checkpoint.h \ qemu/qemu_backup.c \ diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index b5d5812ff8..3064c33ca8 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -1440,8 +1440,11 @@ qemuDomainFSPrivateNew(void)
static void -qemuDomainFSPrivateDispose(void *obj G_GNUC_UNUSED) +qemuDomainFSPrivateDispose(void *obj) { + qemuDomainFSPrivatePtr priv = obj; + + g_free(priv->vhostuser_fs_sock); }
How about adding close the logfile fd here? That is because virtlogd keeps opening the fd even after the guest is
Thanks for catching that.
Currently libvirtd does not access the log after starting up the virtiofsd so all that is needed is to close it in this function:
diff --git a/src/qemu/qemu_virtiofs.c b/src/qemu/qemu_virtiofs.c index e0683d6468..067cef7e22 100644 --- a/src/qemu/qemu_virtiofs.c +++ b/src/qemu/qemu_virtiofs.c @@ -217,7 +217,6 @@ qemuVirtioFSStart(virLogManagerPtr logManager, goto cleanup;
rc = virCommandRun(cmd, NULL); - logfd = -1;
if (rc < 0) { virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
(and the VIR_AUTOCLOSE attribute will take care of closing it when it goes out of scope)
Great, it works! - Masa

Look into /usr/share/qemu/vhost-user to see whether we can find a suitable virtiofsd binary, in case the user did not provide one in the domain XML. Signed-off-by: Ján Tomko <jtomko@redhat.com> --- src/qemu/qemu_extdevice.c | 9 +++++++++ src/qemu/qemu_vhost_user.c | 40 ++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_vhost_user.h | 4 ++++ src/qemu/qemu_virtiofs.c | 12 ++++++++++++ src/qemu/qemu_virtiofs.h | 4 ++++ 5 files changed, 69 insertions(+) diff --git a/src/qemu/qemu_extdevice.c b/src/qemu/qemu_extdevice.c index 5103d4921c..226d51a468 100644 --- a/src/qemu/qemu_extdevice.c +++ b/src/qemu/qemu_extdevice.c @@ -105,6 +105,15 @@ qemuExtDevicesPrepareDomain(virQEMUDriverPtr driver, } } + for (i = 0; i < vm->def->nfss; i++) { + virDomainFSDefPtr fs = vm->def->fss[i]; + + if (fs->fsdriver == VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS) { + if (qemuVirtioFSPrepareDomain(driver, fs) < 0) + return -1; + } + } + return ret; } diff --git a/src/qemu/qemu_vhost_user.c b/src/qemu/qemu_vhost_user.c index 4c25b30664..d437fd1bd5 100644 --- a/src/qemu/qemu_vhost_user.c +++ b/src/qemu/qemu_vhost_user.c @@ -416,3 +416,43 @@ qemuVhostUserFillDomainGPU(virQEMUDriverPtr driver, VIR_FREE(vus); return ret; } + + +int +qemuVhostUserFillDomainFS(virQEMUDriverPtr driver, + virDomainFSDefPtr fs) +{ + qemuVhostUserPtr *vus = NULL; + ssize_t nvus = 0; + ssize_t i; + int ret = -1; + + if ((nvus = qemuVhostUserFetchParsedConfigs(driver->privileged, + &vus, NULL)) < 0) + goto end; + + for (i = 0; i < nvus; i++) { + qemuVhostUserPtr vu = vus[i]; + + if (vu->type != QEMU_VHOST_USER_TYPE_FS) + continue; + + g_free(fs->binary); + fs->binary = g_strdup(vu->binary); + break; + } + + if (i == nvus) { + virReportError(VIR_ERR_OPERATION_FAILED, "%s", + _("Unable to find a satisfying virtiofsd")); + goto end; + } + + ret = 0; + + end: + for (i = 0; i < nvus; i++) + qemuVhostUserFree(vus[i]); + g_free(vus); + return ret; +} diff --git a/src/qemu/qemu_vhost_user.h b/src/qemu/qemu_vhost_user.h index 369ba00caa..e505c8a473 100644 --- a/src/qemu/qemu_vhost_user.h +++ b/src/qemu/qemu_vhost_user.h @@ -45,3 +45,7 @@ qemuVhostUserFetchConfigs(char ***configs, int qemuVhostUserFillDomainGPU(virQEMUDriverPtr driver, virDomainVideoDefPtr video); + +int +qemuVhostUserFillDomainFS(virQEMUDriverPtr driver, + virDomainFSDefPtr fs); diff --git a/src/qemu/qemu_virtiofs.c b/src/qemu/qemu_virtiofs.c index 4aa8eaed2c..4ea8f23fd5 100644 --- a/src/qemu/qemu_virtiofs.c +++ b/src/qemu/qemu_virtiofs.c @@ -28,6 +28,7 @@ #include "qemu_conf.h" #include "qemu_extdevice.h" #include "qemu_security.h" +#include "qemu_vhost_user.h" #include "qemu_virtiofs.h" #include "virpidfile.h" @@ -288,3 +289,14 @@ qemuVirtioFSStop(virQEMUDriverPtr driver, cleanup: virErrorRestore(&orig_err); } + + +int +qemuVirtioFSPrepareDomain(virQEMUDriverPtr driver, + virDomainFSDefPtr fs) +{ + if (fs->binary) + return 0; + + return qemuVhostUserFillDomainFS(driver, fs); +} diff --git a/src/qemu/qemu_virtiofs.h b/src/qemu/qemu_virtiofs.h index 49db807b19..06b05510c7 100644 --- a/src/qemu/qemu_virtiofs.h +++ b/src/qemu/qemu_virtiofs.h @@ -36,3 +36,7 @@ void qemuVirtioFSStop(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainFSDefPtr fs); + +int +qemuVirtioFSPrepareDomain(virQEMUDriverPtr driver, + virDomainFSDefPtr fs); -- 2.21.0

On Thu, Jan 30, 2020 at 06:06:27PM +0100, Ján Tomko wrote:
Look into /usr/share/qemu/vhost-user to see whether we can find a suitable virtiofsd binary, in case the user did not provide one in the domain XML.
Signed-off-by: Ján Tomko <jtomko@redhat.com> --- src/qemu/qemu_extdevice.c | 9 +++++++++ src/qemu/qemu_vhost_user.c | 40 ++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_vhost_user.h | 4 ++++ src/qemu/qemu_virtiofs.c | 12 ++++++++++++ src/qemu/qemu_virtiofs.h | 4 ++++ 5 files changed, 69 insertions(+)
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

Hi Ján, On Thu, Jan 30, 2020 at 06:06:27PM +0100, Ján Tomko wrote:
Look into /usr/share/qemu/vhost-user to see whether we can find a suitable virtiofsd binary, in case the user did not provide one in the domain XML.
The json file (50-qemu-virtiofsd.json) may not be located in /usr/share/qemu/vhost-user in case qemu is built with '--with-confsuffix' option like as: '--with-confsuffix=/qemu-kvm'. I think libvirt searches /usr/share/qemu/vhost-user because QEMU_SYSTEM_LOCATION is hardcoded: #define QEMU_SYSTEM_LOCATION PREFIX "/share/qemu" Why don't we adding such confsuffix option to libvirt and changing QEMU_SYSTEM_LOCATION as following? #define QEMU_SYSTEM_LOCATION PREFIX "/share" CONFSUFFIX Thanks, Masa
Signed-off-by: Ján Tomko <jtomko@redhat.com> --- src/qemu/qemu_extdevice.c | 9 +++++++++ src/qemu/qemu_vhost_user.c | 40 ++++++++++++++++++++++++++++++++++++++ src/qemu/qemu_vhost_user.h | 4 ++++ src/qemu/qemu_virtiofs.c | 12 ++++++++++++ src/qemu/qemu_virtiofs.h | 4 ++++ 5 files changed, 69 insertions(+)
diff --git a/src/qemu/qemu_extdevice.c b/src/qemu/qemu_extdevice.c index 5103d4921c..226d51a468 100644 --- a/src/qemu/qemu_extdevice.c +++ b/src/qemu/qemu_extdevice.c @@ -105,6 +105,15 @@ qemuExtDevicesPrepareDomain(virQEMUDriverPtr driver, } }
+ for (i = 0; i < vm->def->nfss; i++) { + virDomainFSDefPtr fs = vm->def->fss[i]; + + if (fs->fsdriver == VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS) { + if (qemuVirtioFSPrepareDomain(driver, fs) < 0) + return -1; + } + } + return ret; }
diff --git a/src/qemu/qemu_vhost_user.c b/src/qemu/qemu_vhost_user.c index 4c25b30664..d437fd1bd5 100644 --- a/src/qemu/qemu_vhost_user.c +++ b/src/qemu/qemu_vhost_user.c @@ -416,3 +416,43 @@ qemuVhostUserFillDomainGPU(virQEMUDriverPtr driver, VIR_FREE(vus); return ret; } + + +int +qemuVhostUserFillDomainFS(virQEMUDriverPtr driver, + virDomainFSDefPtr fs) +{ + qemuVhostUserPtr *vus = NULL; + ssize_t nvus = 0; + ssize_t i; + int ret = -1; + + if ((nvus = qemuVhostUserFetchParsedConfigs(driver->privileged, + &vus, NULL)) < 0) + goto end; + + for (i = 0; i < nvus; i++) { + qemuVhostUserPtr vu = vus[i]; + + if (vu->type != QEMU_VHOST_USER_TYPE_FS) + continue; + + g_free(fs->binary); + fs->binary = g_strdup(vu->binary); + break; + } + + if (i == nvus) { + virReportError(VIR_ERR_OPERATION_FAILED, "%s", + _("Unable to find a satisfying virtiofsd")); + goto end; + } + + ret = 0; + + end: + for (i = 0; i < nvus; i++) + qemuVhostUserFree(vus[i]); + g_free(vus); + return ret; +} diff --git a/src/qemu/qemu_vhost_user.h b/src/qemu/qemu_vhost_user.h index 369ba00caa..e505c8a473 100644 --- a/src/qemu/qemu_vhost_user.h +++ b/src/qemu/qemu_vhost_user.h @@ -45,3 +45,7 @@ qemuVhostUserFetchConfigs(char ***configs, int qemuVhostUserFillDomainGPU(virQEMUDriverPtr driver, virDomainVideoDefPtr video); + +int +qemuVhostUserFillDomainFS(virQEMUDriverPtr driver, + virDomainFSDefPtr fs); diff --git a/src/qemu/qemu_virtiofs.c b/src/qemu/qemu_virtiofs.c index 4aa8eaed2c..4ea8f23fd5 100644 --- a/src/qemu/qemu_virtiofs.c +++ b/src/qemu/qemu_virtiofs.c @@ -28,6 +28,7 @@ #include "qemu_conf.h" #include "qemu_extdevice.h" #include "qemu_security.h" +#include "qemu_vhost_user.h" #include "qemu_virtiofs.h" #include "virpidfile.h"
@@ -288,3 +289,14 @@ qemuVirtioFSStop(virQEMUDriverPtr driver, cleanup: virErrorRestore(&orig_err); } + + +int +qemuVirtioFSPrepareDomain(virQEMUDriverPtr driver, + virDomainFSDefPtr fs) +{ + if (fs->binary) + return 0; + + return qemuVhostUserFillDomainFS(driver, fs); +} diff --git a/src/qemu/qemu_virtiofs.h b/src/qemu/qemu_virtiofs.h index 49db807b19..06b05510c7 100644 --- a/src/qemu/qemu_virtiofs.h +++ b/src/qemu/qemu_virtiofs.h @@ -36,3 +36,7 @@ void qemuVirtioFSStop(virQEMUDriverPtr driver, virDomainObjPtr vm, virDomainFSDefPtr fs); + +int +qemuVirtioFSPrepareDomain(virQEMUDriverPtr driver, + virDomainFSDefPtr fs); -- 2.21.0

Format the 'vhost-user-fs' device on the QEMU command line. This device provides shared file system access using the FUSE protocol carried over virtio. The actual file server is implemented in an external vhost-user-fs device backend process. https://bugzilla.redhat.com/show_bug.cgi?id=1694166 Signed-off-by: Ján Tomko <jtomko@redhat.com> --- src/qemu/qemu_command.c | 45 +++++++++++++++++- ...vhost-user-fs-fd-memory.x86_64-latest.args | 39 +++++++++++++++ ...vhost-user-fs-hugepages.x86_64-latest.args | 47 +++++++++++++++++++ tests/qemuxml2argvtest.c | 3 ++ 4 files changed, 132 insertions(+), 2 deletions(-) create mode 100644 tests/qemuxml2argvdata/vhost-user-fs-fd-memory.x86_64-latest.args create mode 100644 tests/qemuxml2argvdata/vhost-user-fs-hugepages.x86_64-latest.args diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c index 0d7cd42912..8914b4c5be 100644 --- a/src/qemu/qemu_command.c +++ b/src/qemu/qemu_command.c @@ -2566,6 +2566,45 @@ qemuBuildDisksCommandLine(virCommandPtr cmd, } +static int +qemuBuildVHostUserFsCommandLine(virCommandPtr cmd, + virDomainFSDef *fs, + const virDomainDef *def, + qemuDomainObjPrivatePtr priv) +{ + g_autofree char *chardev_alias = NULL; + g_auto(virBuffer) opt = VIR_BUFFER_INITIALIZER; + + chardev_alias = g_strdup_printf("chr-vu-%s", fs->info.alias); + + virCommandAddArg(cmd, "-chardev"); + virBufferAddLit(&opt, "socket"); + virBufferAsprintf(&opt, ",id=%s", chardev_alias); + virBufferEscapeString(&opt, ",path=%s", QEMU_DOMAIN_FS_PRIVATE(fs)->vhostuser_fs_sock); + virCommandAddArg(cmd, virBufferContentAndReset(&opt)); + + virCommandAddArg(cmd, "-device"); + + if (qemuBuildVirtioDevStr(&opt, "vhost-user-fs", priv->qemuCaps, + VIR_DOMAIN_DEVICE_FS, fs) < 0) + return -1; + + virBufferAsprintf(&opt, ",chardev=%s", chardev_alias); + if (fs->queue_size) + virBufferAsprintf(&opt, ",queue-size=%llu", fs->queue_size); + virBufferAddLit(&opt, ",tag="); + virQEMUBuildBufferEscapeComma(&opt, fs->dst); + if (qemuBuildVirtioOptionsStr(&opt, fs->virtio, priv->qemuCaps) < 0) + return -1; + + if (qemuBuildDeviceAddressStr(&opt, def, &fs->info, priv->qemuCaps) < 0) + return -1; + + virCommandAddArg(cmd, virBufferContentAndReset(&opt)); + return 0; +} + + static char * qemuBuildFSStr(virDomainFSDefPtr fs) { @@ -2658,7 +2697,7 @@ static int qemuBuildFilesystemCommandLine(virCommandPtr cmd, const virDomainDef *def, virQEMUCapsPtr qemuCaps, - qemuDomainObjPrivatePtr priv G_GNUC_UNUSED) + qemuDomainObjPrivatePtr priv) { size_t i; @@ -2673,7 +2712,9 @@ qemuBuildFilesystemCommandLine(virCommandPtr cmd, break; case VIR_DOMAIN_FS_DRIVER_TYPE_VIRTIOFS: - /* TODO: vhost-user-fs-pci */ + /* vhost-user-fs-pci */ + if (qemuBuildVHostUserFsCommandLine(cmd, def->fss[i], def, priv) < 0) + return -1; break; case VIR_DOMAIN_FS_DRIVER_TYPE_LOOP: diff --git a/tests/qemuxml2argvdata/vhost-user-fs-fd-memory.x86_64-latest.args b/tests/qemuxml2argvdata/vhost-user-fs-fd-memory.x86_64-latest.args new file mode 100644 index 0000000000..a7df45a7f0 --- /dev/null +++ b/tests/qemuxml2argvdata/vhost-user-fs-fd-memory.x86_64-latest.args @@ -0,0 +1,39 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/tmp/lib/domain--1-guest \ +USER=test \ +LOGNAME=test \ +XDG_DATA_HOME=/tmp/lib/domain--1-guest/.local/share \ +XDG_CACHE_HOME=/tmp/lib/domain--1-guest/.cache \ +XDG_CONFIG_HOME=/tmp/lib/domain--1-guest/.config \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu-system-x86_64 \ +-name guest=guest,debug-threads=on \ +-S \ +-object secret,id=masterKey0,format=raw,\ +file=/tmp/lib/domain--1-guest/master-key.aes \ +-machine pc,accel=kvm,usb=off,dump-guest-core=off \ +-cpu qemu64 \ +-m 14336 \ +-overcommit mem-lock=off \ +-smp 2,sockets=2,cores=1,threads=1 \ +-object memory-backend-file,id=ram-node0,\ +mem-path=/var/lib/libvirt/qemu/ram/libvirt/qemu/-1-guest/ram-node0,share=yes,\ +size=15032385536 \ +-numa node,nodeid=0,cpus=0-1,memdev=ram-node0 \ +-uuid 126f2720-6f8e-45ab-a886-ec9277079a67 \ +-display none \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,fd=1729,server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-no-acpi \ +-boot strict=on \ +-chardev socket,id=chr-vu-fs0,path=/tmp/lib/domain--1-guest/fs0.vhost-fs.sock \ +-device vhost-user-fs-pci,chardev=chr-vu-fs0,queue-size=1024,tag=mount_tag,\ +bus=pci.0,addr=0x2 \ +-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,\ +resourcecontrol=deny \ +-msg timestamp=on diff --git a/tests/qemuxml2argvdata/vhost-user-fs-hugepages.x86_64-latest.args b/tests/qemuxml2argvdata/vhost-user-fs-hugepages.x86_64-latest.args new file mode 100644 index 0000000000..39190b8d3e --- /dev/null +++ b/tests/qemuxml2argvdata/vhost-user-fs-hugepages.x86_64-latest.args @@ -0,0 +1,47 @@ +LC_ALL=C \ +PATH=/bin \ +HOME=/tmp/lib/domain--1-guest \ +USER=test \ +LOGNAME=test \ +XDG_DATA_HOME=/tmp/lib/domain--1-guest/.local/share \ +XDG_CACHE_HOME=/tmp/lib/domain--1-guest/.cache \ +XDG_CONFIG_HOME=/tmp/lib/domain--1-guest/.config \ +QEMU_AUDIO_DRV=none \ +/usr/bin/qemu-system-x86_64 \ +-name guest=guest,debug-threads=on \ +-S \ +-object secret,id=masterKey0,format=raw,\ +file=/tmp/lib/domain--1-guest/master-key.aes \ +-machine q35,accel=tcg,usb=off,dump-guest-core=off \ +-cpu qemu64 \ +-m 2048 \ +-overcommit mem-lock=off \ +-smp 2,sockets=2,cores=1,threads=1 \ +-object memory-backend-file,id=ram-node0,prealloc=yes,\ +mem-path=/dev/hugepages2M/libvirt/qemu/-1-guest,share=yes,size=2147483648 \ +-numa node,nodeid=0,cpus=0-1,memdev=ram-node0 \ +-uuid 1ccfd97d-5eb4-478a-bbe6-88d254c16db7 \ +-display none \ +-no-user-config \ +-nodefaults \ +-chardev socket,id=charmonitor,fd=1729,server,nowait \ +-mon chardev=charmonitor,id=monitor,mode=control \ +-rtc base=utc \ +-no-shutdown \ +-boot strict=on \ +-device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,\ +addr=0x1 \ +-device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1 \ +-device pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2 \ +-device pcie-root-port,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x1.0x3 \ +-blockdev '{"driver":"file","filename":"/var/lib/libvirt/images/guest.qcow2",\ +"node-name":"libvirt-1-storage","auto-read-only":true,"discard":"unmap"}' \ +-blockdev '{"node-name":"libvirt-1-format","read-only":false,"driver":"qcow2",\ +"file":"libvirt-1-storage"}' \ +-device virtio-blk-pci,scsi=off,bus=pci.4,addr=0x0,drive=libvirt-1-format,\ +id=virtio-disk0,bootindex=1 \ +-chardev socket,id=chr-vu-fs0,path=/tmp/lib/domain--1-guest/fs0.vhost-fs.sock \ +-device vhost-user-fs-pci,chardev=chr-vu-fs0,tag=mount_tag,bus=pci.1,addr=0x0 \ +-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,\ +resourcecontrol=deny \ +-msg timestamp=on diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c index f268d336a6..5e65223a81 100644 --- a/tests/qemuxml2argvtest.c +++ b/tests/qemuxml2argvtest.c @@ -3053,6 +3053,9 @@ mymain(void) DO_TEST_CAPS_VER("launch-security-sev", "2.12.0"); + DO_TEST_CAPS_LATEST("vhost-user-fs-fd-memory"); + DO_TEST_CAPS_LATEST("vhost-user-fs-hugepages"); + DO_TEST("riscv64-virt", QEMU_CAPS_DEVICE_VIRTIO_MMIO); DO_TEST("riscv64-virt-pci", -- 2.21.0

On Thu, Jan 30, 2020 at 06:06:28PM +0100, Ján Tomko wrote:
Format the 'vhost-user-fs' device on the QEMU command line.
This device provides shared file system access using the FUSE protocol carried over virtio. The actual file server is implemented in an external vhost-user-fs device backend process.
https://bugzilla.redhat.com/show_bug.cgi?id=1694166
Signed-off-by: Ján Tomko <jtomko@redhat.com> --- src/qemu/qemu_command.c | 45 +++++++++++++++++- ...vhost-user-fs-fd-memory.x86_64-latest.args | 39 +++++++++++++++ ...vhost-user-fs-hugepages.x86_64-latest.args | 47 +++++++++++++++++++ tests/qemuxml2argvtest.c | 3 ++ 4 files changed, 132 insertions(+), 2 deletions(-) create mode 100644 tests/qemuxml2argvdata/vhost-user-fs-fd-memory.x86_64-latest.args create mode 100644 tests/qemuxml2argvdata/vhost-user-fs-hugepages.x86_64-latest.args
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Thu, Jan 30, 2020 at 06:06:16PM +0100, Ján Tomko wrote:
https://bugzilla.redhat.com/show_bug.cgi?id=1694166 v1: https://www.redhat.com/archives/libvir-list/2019-November/msg00005.html v2: https://www.redhat.com/archives/libvir-list/2020-January/msg00980.html
A couple of general questions I've not found a clear answer to when reviewing the code. - What SELinux integration is done for the virtiofsd binary ? - Is virtiofsd placed into the QEMU cgroups ? Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|
participants (4)
-
Daniel P. Berrangé
-
Ján Tomko
-
Masayoshi Mizuma
-
Stefan Hajnoczi