These patches introduce support for NVMe disks in libvirt. Note that
even without them it is possible to use NVMe disks for your domains in
two ways:
1) <hostdev/> - This is regular PCI assignment with all the drawbacks
(no migration, no snapshots, ...)
2) <disk/> - Since NVMe disks are accessible via /dev/nvme*, they can be
assigned to domains. The problem is that because qemu accesses
/dev/nvme*, the host kernel's storage stack is involved, which adds
significant latency [1].
The solution to this problem is to combine 1) and 2):
- Bypass the host kernel's storage stack by detaching the NVMe disk from
the host (and attaching it to the VFIO driver), and
- Plug the NVMe disk into qemu's block layer so that all the fancy
features can be supported.
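For illustration, the first step (detaching the disk from the host and attaching it to VFIO) can be sketched by hand using the kernel's generic PCI sysfs files (driver_override, unbind, drivers_probe). This is only a rough sketch of the idea, not the code from these patches: the function name rebind_to_vfio is hypothetical, it assumes the vfio-pci module is already loaded, and it has to run as root on a real host. The sysfs root is passed as an argument so the helper can also be pointed at a scratch directory.

```shell
#!/bin/sh
# Hypothetical helper (not part of libvirt): hand a PCI device over to vfio-pci.
#   $1 - PCI sysfs root, /sys/bus/pci on a real host
#   $2 - full PCI address of the NVMe controller, e.g. 0000:01:00.0
rebind_to_vfio() {
    sysfs_pci="$1"
    addr="$2"
    dev_dir="$sysfs_pci/devices/$addr"

    if [ ! -d "$dev_dir" ]; then
        echo "no such PCI device: $addr" >&2
        return 1
    fi

    # Restrict future driver binding for this device to vfio-pci.
    printf '%s\n' vfio-pci > "$dev_dir/driver_override" || return 1

    # Detach the device from its current host driver (nvme), if it has one.
    if [ -e "$dev_dir/driver/unbind" ]; then
        printf '%s\n' "$addr" > "$dev_dir/driver/unbind" || return 1
    fi

    # Ask the kernel to probe the device again; the override selects vfio-pci.
    printf '%s\n' "$addr" > "$sysfs_pci/drivers_probe" || return 1
}

# Example (as root, with vfio-pci loaded):
#   rebind_to_vfio /sys/bus/pci 0000:01:00.0
```

Once the device is bound this way, /dev/nvme* nodes for it disappear from the host, which is why the disk then has to be described to qemu by PCI address and namespace rather than by path.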
On the qemu command line this is done via:
-drive file.driver=nvme,file.device=0000:01:00.0,file.namespace=1,format=raw,\
if=none,id=drive-virtio-disk0 \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x4,drive=drive-virtio-disk0,\
id=virtio-disk0,bootindex=1 \
The patches are also available on my GitHub [2].
1: https://www.linux-kvm.org/images/4/4c/Userspace_NVMe_driver_in_QEMU_-_Fam...
2: https://github.com/zippy2/libvirt/commits/nvme
Michal Prívozník (31):
virHostdevPreparePCIDevices: Separate out function body
virHostdevReAttachPCIDevices: Separate out function body
virpcimock: Move actions checking one level up
Revert "virpcitest: Test virPCIDeviceDetach failure"
virpcimock: Create driver_override file in device dirs
virPCIDeviceAddressEqual: Fix const correctness
virPCIDeviceAddressAsString: Fix const correctness
virpci: Introduce virPCIDeviceAddressCopy
qemuDomainDeviceDefValidateDisk: Reorder some checks
schemas: Introduce disk type NVMe
conf: Format and parse NVMe type disk
util: Introduce virNVMeDevice module
virhostdev: Include virNVMeDevice module
virhostdevtest: Don't proceed to test cases if init failed
virhostdevtest: s/CHECK_LIST_COUNT/CHECK_PCI_LIST_COUNT/
virpcimock: Introduce NVMe driver and devices
virhostdevtest: Test virNVMeDevice assignment
qemu: prepare NVMe devices too
qemu: Take NVMe disks into account when calculating memlock limit
virstoragefile: Introduce virStorageSourceChainHasNVMe
domain_conf: Introduce virDomainDefHasNVMeDisk
qemu_domain: Separate VFIO code
qemu_domain: Introduce NVMe path getting helpers
qemu: Create NVMe disk in domain namespace
qemu: Allow NVMe disk in CGroups
security_selinux: Simplify virSecuritySELinuxSetImageLabelInternal
virSecuritySELinuxRestoreImageLabelInt: Don't skip non-local storage
qemu_capabilities: Introduce QEMU_CAPS_DRIVE_NVME
qemu: Generate command line of NVMe disks
qemu: Don't leak storage perms on failure in
qemuDomainAttachDiskGeneric
qemu_hotplug: Prepare NVMe disks on hotplug
docs/formatdomain.html.in | 45 +-
docs/schemas/domaincommon.rng | 32 ++
src/conf/domain_conf.c | 160 +++++++
src/conf/domain_conf.h | 6 +
src/libvirt_private.syms | 26 ++
src/qemu/qemu_block.c | 24 +
src/qemu/qemu_capabilities.c | 4 +
src/qemu/qemu_capabilities.h | 3 +
src/qemu/qemu_cgroup.c | 59 ++-
src/qemu/qemu_command.c | 4 +
src/qemu/qemu_domain.c | 115 ++++-
src/qemu/qemu_domain.h | 6 +
src/qemu/qemu_driver.c | 4 +
src/qemu/qemu_hostdev.c | 49 ++-
src/qemu/qemu_hostdev.h | 10 +
src/qemu/qemu_hotplug.c | 76 +++-
src/qemu/qemu_migration.c | 1 +
src/qemu/qemu_process.c | 7 +
src/security/security_dac.c | 38 ++
src/security/security_selinux.c | 95 ++--
src/util/Makefile.inc.am | 2 +
src/util/virhostdev.c | 350 +++++++++++++--
src/util/virhostdev.h | 25 ++
src/util/virnvme.c | 412 ++++++++++++++++++
src/util/virnvme.h | 89 ++++
src/util/virpci.c | 12 +-
src/util/virpci.h | 8 +-
src/util/virstoragefile.c | 73 ++++
src/util/virstoragefile.h | 17 +
src/xenconfig/xen_xl.c | 1 +
.../caps_2.12.0.aarch64.xml | 1 +
.../caps_2.12.0.ppc64.xml | 1 +
.../caps_2.12.0.s390x.xml | 1 +
.../caps_2.12.0.x86_64.xml | 1 +
.../qemucapabilitiesdata/caps_3.0.0.ppc64.xml | 1 +
.../caps_3.0.0.riscv32.xml | 1 +
.../caps_3.0.0.riscv64.xml | 1 +
.../qemucapabilitiesdata/caps_3.0.0.s390x.xml | 1 +
.../caps_3.0.0.x86_64.xml | 1 +
.../qemucapabilitiesdata/caps_3.1.0.ppc64.xml | 1 +
.../caps_3.1.0.x86_64.xml | 1 +
.../caps_4.0.0.aarch64.xml | 1 +
.../qemucapabilitiesdata/caps_4.0.0.ppc64.xml | 1 +
.../caps_4.0.0.riscv32.xml | 1 +
.../caps_4.0.0.riscv64.xml | 1 +
.../qemucapabilitiesdata/caps_4.0.0.s390x.xml | 1 +
.../caps_4.0.0.x86_64.xml | 1 +
.../caps_4.1.0.x86_64.xml | 1 +
.../disk-nvme.x86_64-latest.args | 52 +++
tests/qemuxml2argvdata/disk-nvme.xml | 63 +++
tests/qemuxml2argvtest.c | 1 +
tests/qemuxml2xmloutdata/disk-nvme.xml | 1 +
tests/qemuxml2xmltest.c | 1 +
tests/virhostdevtest.c | 185 ++++++--
tests/virpcimock.c | 76 +++-
tests/virpcitest.c | 32 --
tests/virpcitestdata/0000-01-00.0.config | Bin 0 -> 4096 bytes
tests/virpcitestdata/0000-02-00.0.config | Bin 0 -> 4096 bytes
58 files changed, 1978 insertions(+), 204 deletions(-)
create mode 100644 src/util/virnvme.c
create mode 100644 src/util/virnvme.h
create mode 100644 tests/qemuxml2argvdata/disk-nvme.x86_64-latest.args
create mode 100644 tests/qemuxml2argvdata/disk-nvme.xml
create mode 120000 tests/qemuxml2xmloutdata/disk-nvme.xml
create mode 100644 tests/virpcitestdata/0000-01-00.0.config
create mode 100644 tests/virpcitestdata/0000-02-00.0.config
--
2.21.0