[PATCH v1] Add basic driver for the Cloud-Hypervisor
by William Douglas
Cloud-Hypervisor is a hypervisor that uses KVM for virtualization. It
functions similarly to qemu, and the libvirt Cloud-Hypervisor driver
uses a structure very similar to that of libvirt's existing qemu driver.
The biggest difference from the libvirt perspective is that the
"monitor" socket is separated into two sockets: one that commands are
issued to and one that events are notified from. The current
implementation only uses the command socket (a REST API carrying
json-encoded data); future changes will add support for the event
socket (to better handle shutdowns initiated from inside the VM).
This patch adds support for the following initial VM actions using the
Cloud-Hypervisor API (a sketch of driving one of these endpoints is
shown after the list):
* vm.create
* vm.delete
* vm.boot
* vm.shutdown
* vm.reboot
* vm.pause
* vm.resume
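As a rough illustration (not part of this patch) of how a command
reaches the command socket, the program below issues vm.boot with
libcurl, much as the monitor code does. The socket path is hypothetical
and the endpoint assumes cloud-hypervisor's documented /api/v1 prefix:

    /* sketch: PUT an empty body to the vm.boot endpoint over the API socket */
    #include <stdio.h>
    #include <curl/curl.h>

    int main(void)
    {
        CURL *curl;
        CURLcode rc;

        curl_global_init(CURL_GLOBAL_ALL);
        if (!(curl = curl_easy_init()))
            return 1;

        /* commands travel over a unix socket; the host part of the URL
         * is a placeholder that libcurl ignores */
        curl_easy_setopt(curl, CURLOPT_UNIX_SOCKET_PATH,
                         "/run/user/1000/libvirt/ch/run/vm-socket");
        curl_easy_setopt(curl, CURLOPT_URL, "http://localhost/api/v1/vm.boot");
        curl_easy_setopt(curl, CURLOPT_CUSTOMREQUEST, "PUT");

        rc = curl_easy_perform(curl);
        if (rc != CURLE_OK)
            fprintf(stderr, "vm.boot request failed: %s\n",
                    curl_easy_strerror(rc));

        curl_easy_cleanup(curl);
        curl_global_cleanup();
        return rc == CURLE_OK ? 0 : 1;
    }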
To use the Cloud-Hypervisor driver, the v0.14.0 release of
Cloud-Hypervisor must be installed.
Some additional notes:
* The curl handle is persistent but cannot be used to detect ch process
shutdown/crash (a future patch will address this shortcoming)
* On a 64-bit host Cloud-Hypervisor needs to support PVH and so can
emulate 32-bit mode, but this isn't fully tested (a 64-bit kernel with
a 32-bit userspace is fine; a 32-bit kernel is unsupported)
* The pause operation ch implements pauses execution of the guest cpus
(and disk/network i/o threads), which seems to align with
https://wiki.libvirt.org/page/VM_lifecycle#Pausing_a_guest_domain
The original RFC is
https://listman.redhat.com/archives/libvir-list/2020-August/msg01040.html
Changes since the RFC:
* Updated to use g_autofree where applicable
* Fixed meson.build to properly detect and disable the driver if
either Curl or YAJL are missing
* Fixed filenames in comments to match actual filenames
* Switched to use VIR_DOMAIN_VIRT_KVM and removed VIR_DOMAIN_VIRT_CH
* Added default emulator path to the virDomainDef when unspecified
* Updated driver to run as non-root, launching cloud-hypervisor as
non-root
* Updated to build a virtchd daemon
* Enabled non-root session usage
* Switched to use of g_free instead of VIR_FREE
* Stopped checking for container-based virt shutdown methods
* Stopped checking for container-based virt reboot methods
* Fixed URI schemes to only list ch for the ch:///session driver URI
* Updated name of chDriverState to be cloud-hypervisor
* Removed debug code for tracing curl calls
* Consolidated kernel, initrd and cmdline methods
* Added checks on diskdef->src->type prior to working with the
other diskdef->src members
* Added error reporting when failing to build the config from the
user's xml
* Fixed empty Expect Header for create vm
* Switched to use virReportEnumRangeError on unsupported netType
* Removed unused chStrToInt function
* Converted use of virObject to GObject
* Switched to using the virDomainObjPtr's domstatus element to refer
to the guest PID instead of privateData element
* Stopped setting persistent in chDomainCreateXML
* Added cleanup for removing the guest from driver->domains list on
failure
* Simplified min version check
* Added test for guest domain support based on cloud-hypervisor
binary availability
* Updated spec file to disable ch driver by default
* Added error reporting for unsupported device types
* Updated cloud-hypervisor to not need to poll for socket readiness
* Added drvch.rst for a user overview of the driver
* Added explanation of the architecture of cloud-hypervisor and the
approach taken to manage it with libvirt
Signed-off-by: William Douglas <william.douglas@intel.com>
---
docs/drivers.html.in | 1 +
docs/drvch.rst | 55 ++
docs/meson.build | 1 +
include/libvirt/virterror.h | 1 +
libvirt.spec.in | 31 +
meson.build | 43 ++
meson_options.txt | 3 +
po/POTFILES.in | 5 +
src/ch/ch_conf.c | 256 ++++++++
src/ch/ch_conf.h | 85 +++
src/ch/ch_domain.c | 203 ++++++
src/ch/ch_domain.h | 65 ++
src/ch/ch_driver.c | 930 ++++++++++++++++++++++++++++
src/ch/ch_driver.h | 24 +
src/ch/ch_monitor.c | 837 +++++++++++++++++++++++++
src/ch/ch_monitor.h | 60 ++
src/ch/ch_process.c | 126 ++++
src/ch/ch_process.h | 31 +
src/ch/meson.build | 74 +++
src/ch/virtchd.service.in | 47 ++
src/ch/virtchd.sysconf | 3 +
src/meson.build | 1 +
src/remote/remote_daemon.c | 4 +
src/remote/remote_daemon_dispatch.c | 3 +-
src/util/virerror.c | 1 +
tools/virsh.c | 3 +
26 files changed, 2892 insertions(+), 1 deletion(-)
create mode 100644 docs/drvch.rst
create mode 100644 src/ch/ch_conf.c
create mode 100644 src/ch/ch_conf.h
create mode 100644 src/ch/ch_domain.c
create mode 100644 src/ch/ch_domain.h
create mode 100644 src/ch/ch_driver.c
create mode 100644 src/ch/ch_driver.h
create mode 100644 src/ch/ch_monitor.c
create mode 100644 src/ch/ch_monitor.h
create mode 100644 src/ch/ch_process.c
create mode 100644 src/ch/ch_process.h
create mode 100644 src/ch/meson.build
create mode 100644 src/ch/virtchd.service.in
create mode 100644 src/ch/virtchd.sysconf
diff --git a/docs/drivers.html.in b/docs/drivers.html.in
index 34f98f60b6..824604998e 100644
--- a/docs/drivers.html.in
+++ b/docs/drivers.html.in
@@ -37,6 +37,7 @@
<li><strong><a href="drvhyperv.html">Microsoft Hyper-V</a></strong></li>
<li><strong><a href="drvvirtuozzo.html">Virtuozzo</a></strong></li>
<li><strong><a href="drvbhyve.html">Bhyve</a></strong> - The BSD Hypervisor</li>
+ <li><strong><a href="drvch.html">Cloud Hypervisor</a></strong></li>
</ul>
</body>
diff --git a/docs/drvch.rst b/docs/drvch.rst
new file mode 100644
index 0000000000..bb13599e6f
--- /dev/null
+++ b/docs/drvch.rst
@@ -0,0 +1,55 @@
+=======================
+Cloud Hypervisor driver
+=======================
+
+.. contents::
+
+Cloud Hypervisor is an open source Virtual Machine Monitor (VMM) that
+runs on top of KVM. The project focuses on exclusively running modern,
+cloud workloads, on top of a limited set of hardware architectures and
+platforms. Cloud workloads refer to those that are usually run by
+customers inside a cloud provider. For our purposes this means modern
+operating systems with most I/O handled by paravirtualised devices
+(i.e. virtio), no requirement for legacy devices, and 64-bit CPUs.
+
+The libvirt Cloud Hypervisor driver is intended to be run as a session
+driver without privileges. The cloud-hypervisor binary itself should be
+``setcap cap_net_admin+ep`` (in order to create tap interfaces).
+
+Expected connection URI would be
+
+``ch:///session``
+
+
+Example guest domain XML configurations
+=======================================
+
+The Cloud Hypervisor driver in libvirt is at an early stage of active
+development and only supports a limited number of Cloud Hypervisor features.
+
+Firmware is from
+`hypervisor-fw <https://github.com/cloud-hypervisor/rust-hypervisor-firmware/releases>`__
+
+**Note: Only virtio devices are supported**
+
+::
+
+ <domain type='kvm'>
+ <name>cloudhypervisor</name>
+ <uuid>4dea22b3-1d52-d8f3-2516-782e98ab3fa0</uuid>
+ <os>
+ <type>hvm</type>
+ <kernel>hypervisor-fw</kernel>
+ </os>
+ <memory unit='G'>2</memory>
+ <devices>
+ <disk type='file'>
+ <source file='disk.raw'/>
+ <target dev='vda' bus='virtio'/>
+ </disk>
+ <interface type='ethernet'>
+ <model type='virtio'/>
+ </interface>
+ </devices>
+ <vcpu>2</vcpu>
+ </domain>
diff --git a/docs/meson.build b/docs/meson.build
index f550629d8e..bee0d80d53 100644
--- a/docs/meson.build
+++ b/docs/meson.build
@@ -111,6 +111,7 @@ docs_rst_files = [
'daemons',
'developer-tooling',
'drvqemu',
+ 'drvch',
'formatbackup',
'formatcheckpoint',
'formatdomain',
diff --git a/include/libvirt/virterror.h b/include/libvirt/virterror.h
index 524a7bf9e8..57986931fd 100644
--- a/include/libvirt/virterror.h
+++ b/include/libvirt/virterror.h
@@ -136,6 +136,7 @@ typedef enum {
VIR_FROM_TPM = 70, /* Error from TPM */
VIR_FROM_BPF = 71, /* Error from BPF code */
+ VIR_FROM_CH = 72, /* Error from Cloud-Hypervisor driver */
# ifdef VIR_ENUM_SENTINELS
VIR_ERR_DOMAIN_LAST
diff --git a/libvirt.spec.in b/libvirt.spec.in
index da7af2824e..4eeb4dc1b4 100644
--- a/libvirt.spec.in
+++ b/libvirt.spec.in
@@ -244,6 +244,9 @@ Requires: libvirt-daemon-driver-lxc = %{version}-%{release}
%if %{with_qemu}
Requires: libvirt-daemon-driver-qemu = %{version}-%{release}
%endif
+%if %{with_ch}
+Requires: libvirt-daemon-driver-ch = %{version}-%{release}
+%endif
# We had UML driver, but we've removed it.
Obsoletes: libvirt-daemon-driver-uml <= 5.0.0
Obsoletes: libvirt-daemon-uml <= 5.0.0
@@ -761,6 +764,20 @@ QEMU
%endif
+%if %{with_ch}
+%package daemon-driver-ch
+Summary: Cloud-Hypervisor driver plugin for the libvirtd daemon
+Requires: libvirt-daemon = %{version}-%{release}
+Requires: libvirt-libs = %{version}-%{release}
+Requires: /usr/bin/cloud-hypervisor
+
+%description daemon-driver-ch
+The Cloud-Hypervisor driver plugin for the libvirtd daemon,
+providing an implementation of the hypervisor driver APIs
+using Cloud-Hypervisor
+%endif
+
+
%if %{with_lxc}
%package daemon-driver-lxc
Summary: LXC driver plugin for the libvirtd daemon
@@ -1003,6 +1020,12 @@ exit 1
%define arg_qemu -Ddriver_qemu=disabled
%endif
+%if %{with_ch}
+ %define arg_ch -Ddriver_ch=enabled
+%else
+ %define arg_ch -Ddriver_ch=disabled
+%endif
+
%if %{with_openvz}
%define arg_openvz -Ddriver_openvz=enabled
%else
@@ -1146,6 +1169,7 @@ export SOURCE_DATE_EPOCH=$(stat --printf='%Y' %{_specdir}/%{name}.spec)
%meson \
-Drunstatedir=%{_rundir} \
%{?arg_qemu} \
+ %{?arg_ch} \
%{?arg_openvz} \
%{?arg_lxc} \
%{?arg_vbox} \
@@ -1771,6 +1795,13 @@ exit 0
%{_mandir}/man8/virtqemud.8*
%endif
+%if %{with_ch}
+%files daemon-driver-ch
+%dir %attr(0700, root, root) %{_localstatedir}/log/libvirt/ch/
+%ghost %dir %{_rundir}/libvirt/ch/
+%{_libdir}/%{name}/connection-driver/libvirt_driver_ch.so
+%endif
+
%if %{with_lxc}
%files daemon-driver-lxc
%config(noreplace) %{_sysconfdir}/sysconfig/virtlxcd
diff --git a/meson.build b/meson.build
index 951da67896..8c33efb25c 100644
--- a/meson.build
+++ b/meson.build
@@ -1517,6 +1517,48 @@ elif get_option('driver_lxc').enabled()
error('linux and remote_driver are required for LXC')
endif
+if not get_option('driver_ch').disabled() and host_machine.system() == 'linux'
+ use_ch = true
+
+ if not conf.has('WITH_LIBVIRTD')
+ use_ch = false
+ if get_option('driver_ch').enabled()
+ error('libvirtd is required to build Cloud-Hypervisor driver')
+ endif
+ endif
+
+ if not yajl_dep.found()
+ use_ch = false
+ if get_option('driver_ch').enabled()
+ error('YAJL 2 is required to build Cloud-Hypervisor driver')
+ endif
+ endif
+
+ if not curl_dep.found()
+ use_ch = false
+ if get_option('driver_ch').enabled()
+ error('curl is required to build Cloud-Hypervisor driver')
+ endif
+ endif
+
+ if use_ch
+ conf.set('WITH_CH', 1)
+
+ default_ch_user = 'root'
+ default_ch_group = 'root'
+ ch_user = get_option('ch_user')
+ if ch_user == ''
+ ch_user = default_ch_user
+ endif
+ ch_group = get_option('ch_group')
+ if ch_group == ''
+ ch_group = default_ch_group
+ endif
+ conf.set_quoted('CH_USER', ch_user)
+ conf.set_quoted('CH_GROUP', ch_group)
+ endif
+endif
+
# there's no use compiling the network driver without the libvirt
# daemon, nor compiling it for macOS, where it breaks the compile
if not get_option('driver_network').disabled() and conf.has('WITH_LIBVIRTD') and host_machine.system() != 'darwin'
@@ -2170,6 +2212,7 @@ driver_summary = {
'VBox': conf.has('WITH_VBOX'),
'libxl': conf.has('WITH_LIBXL'),
'LXC': conf.has('WITH_LXC'),
+ 'Cloud-Hypervisor': conf.has('WITH_CH'),
'ESX': conf.has('WITH_ESX'),
'Hyper-V': conf.has('WITH_HYPERV'),
'vz': conf.has('WITH_VZ'),
diff --git a/meson_options.txt b/meson_options.txt
index 2606648b64..cd0b4b33be 100644
--- a/meson_options.txt
+++ b/meson_options.txt
@@ -53,6 +53,9 @@ option('driver_interface', type: 'feature', value: 'auto', description: 'host in
option('driver_libvirtd', type: 'feature', value: 'auto', description: 'libvirtd driver')
option('driver_libxl', type: 'feature', value: 'auto', description: 'libxenlight driver')
option('driver_lxc', type: 'feature', value: 'auto', description: 'Linux Container driver')
+option('driver_ch', type: 'feature', value: 'auto', description: 'Cloud-Hypervisor driver')
+option('ch_user', type: 'string', value: '', description: 'username to run Cloud-Hypervisor system instance as')
+option('ch_group', type: 'string', value: '', description: 'groupname to run Cloud-Hypervisor system instance as')
option('driver_network', type: 'feature', value: 'auto', description: 'virtual network driver')
option('driver_openvz', type: 'feature', value: 'auto', description: 'OpenVZ driver')
option('driver_qemu', type: 'feature', value: 'auto', description: 'QEMU/KVM driver')
diff --git a/po/POTFILES.in b/po/POTFILES.in
index 413783ee35..c200d7452a 100644
--- a/po/POTFILES.in
+++ b/po/POTFILES.in
@@ -18,6 +18,11 @@
@SRCDIR@src/bhyve/bhyve_monitor.c
@SRCDIR@src/bhyve/bhyve_parse_command.c
@SRCDIR@src/bhyve/bhyve_process.c
+@SRCDIR@src/ch/ch_conf.c
+@SRCDIR@src/ch/ch_domain.c
+@SRCDIR@src/ch/ch_driver.c
+@SRCDIR@src/ch/ch_monitor.c
+@SRCDIR@src/ch/ch_process.c
@SRCDIR(a)src/conf/backup_conf.c
@SRCDIR(a)src/conf/capabilities.c
@SRCDIR(a)src/conf/checkpoint_conf.c
diff --git a/src/ch/ch_conf.c b/src/ch/ch_conf.c
new file mode 100644
index 0000000000..3fd9258c52
--- /dev/null
+++ b/src/ch/ch_conf.c
@@ -0,0 +1,256 @@
+/*
+ * Copyright Intel Corp. 2020-2021
+ *
+ * ch_conf.c: functions for Cloud-Hypervisor configuration
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library. If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#include <config.h>
+
+#include "configmake.h"
+#include "viralloc.h"
+#include "vircommand.h"
+#include "virlog.h"
+#include "virobject.h"
+#include "virstring.h"
+#include "virutil.h"
+
+#include "ch_conf.h"
+#include "ch_domain.h"
+
+#define VIR_FROM_THIS VIR_FROM_CH
+
+VIR_LOG_INIT("ch.ch_conf");
+
+static virClass *virCHDriverConfigClass;
+static void virCHDriverConfigDispose(void *obj);
+
+static int virCHConfigOnceInit(void)
+{
+ if (!VIR_CLASS_NEW(virCHDriverConfig, virClassForObject()))
+ return -1;
+
+ return 0;
+}
+
+VIR_ONCE_GLOBAL_INIT(virCHConfig);
+
+
+/* Functions */
+virCaps *virCHDriverCapsInit(void)
+{
+ virCaps *caps;
+ virCapsGuest *guest;
+ g_autofree char *ch_cmd = NULL;
+
+ if ((caps = virCapabilitiesNew(virArchFromHost(),
+ false, false)) == NULL)
+ goto cleanup;
+
+ if (!(caps->host.numa = virCapabilitiesHostNUMANewHost()))
+ goto cleanup;
+
+ if (virCapabilitiesInitCaches(caps) < 0)
+ goto cleanup;
+
+ if ((guest = virCapabilitiesAddGuest(caps,
+ VIR_DOMAIN_OSTYPE_HVM,
+ caps->host.arch,
+ NULL,
+ NULL,
+ 0,
+ NULL)) == NULL)
+ goto cleanup;
+
+ if (virCapabilitiesAddGuestDomain(guest,
+ VIR_DOMAIN_VIRT_KVM,
+ NULL,
+ NULL,
+ 0,
+ NULL) == NULL)
+ goto cleanup;
+
+ ch_cmd = g_find_program_in_path(CH_CMD);
+ if (!ch_cmd)
+ goto cleanup;
+
+ return caps;
+
+ cleanup:
+ virObjectUnref(caps);
+ return NULL;
+}
+
+/**
+ * virCHDriverGetCapabilities:
+ *
+ * Get a reference to the virCaps instance for the
+ * driver. If @refresh is true, the capabilities will be
+ * rebuilt first
+ *
+ * The caller must release the reference with virObjectUnref
+ *
+ * Returns: a reference to a virCaps instance or NULL
+ */
+virCaps *virCHDriverGetCapabilities(virCHDriver *driver,
+ bool refresh)
+{
+ virCaps *ret;
+ if (refresh) {
+ virCaps *caps = NULL;
+ if ((caps = virCHDriverCapsInit()) == NULL)
+ return NULL;
+
+ chDriverLock(driver);
+ virObjectUnref(driver->caps);
+ driver->caps = caps;
+ } else {
+ chDriverLock(driver);
+ }
+
+ ret = virObjectRef(driver->caps);
+ chDriverUnlock(driver);
+ return ret;
+}
+
+virDomainXMLOption *
+chDomainXMLConfInit(virCHDriver *driver)
+{
+ virCHDriverDomainDefParserConfig.priv = driver;
+ return virDomainXMLOptionNew(&virCHDriverDomainDefParserConfig,
+ &virCHDriverPrivateDataCallbacks,
+ NULL, NULL, NULL);
+}
+
+virCHDriverConfig *
+virCHDriverConfigNew(bool privileged)
+{
+ virCHDriverConfig *cfg;
+
+ if (virCHConfigInitialize() < 0)
+ return NULL;
+
+ if (!(cfg = virObjectNew(virCHDriverConfigClass)))
+ return NULL;
+
+ cfg->uri = g_strdup(privileged ? "ch:///system" : "ch:///session");
+ if (privileged) {
+ if (virGetUserID(CH_USER, &cfg->user) < 0)
+ return NULL;
+ if (virGetGroupID(CH_GROUP, &cfg->group) < 0)
+ return NULL;
+ } else {
+ cfg->user = (uid_t)-1;
+ cfg->group = (gid_t)-1;
+ }
+
+ if (privileged) {
+ cfg->logDir = g_strdup_printf("%s/log/libvirt/ch", LOCALSTATEDIR);
+ cfg->stateDir = g_strdup_printf("%s/libvirt/ch", RUNSTATEDIR);
+
+ } else {
+ g_autofree char *rundir = NULL;
+ g_autofree char *cachedir = NULL;
+
+ cachedir = virGetUserCacheDirectory();
+
+ cfg->logDir = g_strdup_printf("%s/ch/log", cachedir);
+
+ rundir = virGetUserRuntimeDirectory();
+ cfg->stateDir = g_strdup_printf("%s/ch/run", rundir);
+ }
+
+ return cfg;
+}
+
+virCHDriverConfig *virCHDriverGetConfig(virCHDriver *driver)
+{
+ virCHDriverConfig *cfg;
+ chDriverLock(driver);
+ cfg = virObjectRef(driver->config);
+ chDriverUnlock(driver);
+ return cfg;
+}
+
+static void
+virCHDriverConfigDispose(void *obj)
+{
+ virCHDriverConfig *cfg = obj;
+
+ g_free(cfg->stateDir);
+ g_free(cfg->logDir);
+}
+
+#define MIN_VERSION ((0 * 1000000) + (13 * 1000) + (0))
+
+static int
+chExtractVersionInfo(int *retversion)
+{
+ int ret = -1;
+ unsigned long version;
+ char *help = NULL;
+ char *tmp = NULL;
+ g_autofree char *ch_cmd = g_find_program_in_path(CH_CMD);
+ virCommand *cmd = virCommandNewArgList(ch_cmd, "--version", NULL);
+
+ if (retversion)
+ *retversion = 0;
+
+ virCommandAddEnvString(cmd, "LC_ALL=C");
+ virCommandSetOutputBuffer(cmd, &help);
+
+ if (virCommandRun(cmd, NULL) < 0)
+ goto cleanup;
+
+ tmp = help;
+
+ /* expected format: cloud-hypervisor v<major>.<minor>.<micro> */
+ if ((tmp = STRSKIP(tmp, "cloud-hypervisor v")) == NULL)
+ goto cleanup;
+
+ if (virParseVersionString(tmp, &version, false) < 0)
+ goto cleanup;
+
+ if (version < MIN_VERSION) {
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("Cloud-Hypervisor version is too old (v0.13.0 is the minimum supported version)"));
+ goto cleanup;
+ }
+
+ if (retversion)
+ *retversion = version;
+
+ ret = 0;
+
+ cleanup:
+ virCommandFree(cmd);
+
+ return ret;
+}
+
+int chExtractVersion(virCHDriver *driver)
+{
+ if (driver->version > 0)
+ return 0;
+
+ if (chExtractVersionInfo(&driver->version) < 0) {
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("Could not extract Cloud-Hypervisor version"));
+ return -1;
+ }
+
+ return 0;
+}
diff --git a/src/ch/ch_conf.h b/src/ch/ch_conf.h
new file mode 100644
index 0000000000..d856825377
--- /dev/null
+++ b/src/ch/ch_conf.h
@@ -0,0 +1,85 @@
+/*
+ * Copyright Intel Corp. 2020-2021
+ *
+ * ch_conf.h: header file for Cloud-Hypervisor configuration
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library. If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#pragma once
+
+#include "virdomainobjlist.h"
+#include "virthread.h"
+
+#define CH_DRIVER_NAME "CH"
+#define CH_CMD "cloud-hypervisor"
+
+typedef struct _virCHDriver virCHDriver;
+
+typedef struct _virCHDriverConfig virCHDriverConfig;
+
+struct _virCHDriverConfig {
+ GObject parent;
+
+ char *stateDir;
+ char *logDir;
+ char *uri;
+
+ uid_t user;
+ gid_t group;
+};
+
+struct _virCHDriver
+{
+ virMutex lock;
+
+ /* Require lock to get a reference on the object,
+ * lockless access thereafter */
+ virCaps *caps;
+
+ /* Immutable pointer, Immutable object */
+ virDomainXMLOption *xmlopt;
+
+ /* Immutable pointer, self-locking APIs */
+ virDomainObjList *domains;
+
+ /* Cloud-Hypervisor version */
+ int version;
+
+ /* Require lock to get reference on 'config',
+ * then lockless thereafter */
+ virCHDriverConfig *config;
+
+ /* pid file FD, ensures two copies of the driver can't use the same root */
+ int lockFD;
+};
+
+virCaps *virCHDriverCapsInit(void);
+virCaps *virCHDriverGetCapabilities(virCHDriver *driver,
+ bool refresh);
+virDomainXMLOption *chDomainXMLConfInit(virCHDriver *driver);
+virCHDriverConfig *virCHDriverConfigNew(bool privileged);
+virCHDriverConfig *virCHDriverGetConfig(virCHDriver *driver);
+int chExtractVersion(virCHDriver *driver);
+
+static inline void chDriverLock(virCHDriver *driver)
+{
+ virMutexLock(&driver->lock);
+}
+
+static inline void chDriverUnlock(virCHDriver *driver)
+{
+ virMutexUnlock(&driver->lock);
+}
diff --git a/src/ch/ch_domain.c b/src/ch/ch_domain.c
new file mode 100644
index 0000000000..9d0b091699
--- /dev/null
+++ b/src/ch/ch_domain.c
@@ -0,0 +1,203 @@
+/*
+ * Copyright Intel Corp. 2020-2021
+ *
+ * ch_domain.c: Domain manager functions for Cloud-Hypervisor driver
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library. If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#include <config.h>
+
+#include "ch_domain.h"
+#include "viralloc.h"
+#include "virlog.h"
+#include "virtime.h"
+
+#define VIR_FROM_THIS VIR_FROM_CH
+
+VIR_ENUM_IMPL(virCHDomainJob,
+ CH_JOB_LAST,
+ "none",
+ "query",
+ "destroy",
+ "modify",
+);
+
+VIR_LOG_INIT("ch.ch_domain");
+
+static int
+virCHDomainObjInitJob(virCHDomainObjPrivate *priv)
+{
+ memset(&priv->job, 0, sizeof(priv->job));
+
+ if (virCondInit(&priv->job.cond) < 0)
+ return -1;
+
+ return 0;
+}
+
+static void
+virCHDomainObjResetJob(virCHDomainObjPrivate *priv)
+{
+ struct virCHDomainJobObj *job = &priv->job;
+
+ job->active = CH_JOB_NONE;
+ job->owner = 0;
+}
+
+static void
+virCHDomainObjFreeJob(virCHDomainObjPrivate *priv)
+{
+ ignore_value(virCondDestroy(&priv->job.cond));
+}
+
+/*
+ * obj must be locked before calling, virCHDriver must NOT be locked
+ *
+ * This must be called by anything that will change the VM state
+ * in any way
+ *
+ * Upon successful return, the object will have its ref count increased.
+ * Successful calls must be followed by EndJob eventually.
+ */
+int
+virCHDomainObjBeginJob(virDomainObj *obj, enum virCHDomainJob job)
+{
+ virCHDomainObjPrivate *priv = obj->privateData;
+ unsigned long long now;
+ unsigned long long then;
+
+ if (virTimeMillisNow(&now) < 0)
+ return -1;
+ then = now + CH_JOB_WAIT_TIME;
+
+ while (priv->job.active) {
+ VIR_DEBUG("Wait normal job condition for starting job: %s",
+ virCHDomainJobTypeToString(job));
+ if (virCondWaitUntil(&priv->job.cond, &obj->parent.lock, then) < 0)
+ goto error;
+ }
+
+ virCHDomainObjResetJob(priv);
+
+ VIR_DEBUG("Starting job: %s", virCHDomainJobTypeToString(job));
+ priv->job.active = job;
+ priv->job.owner = virThreadSelfID();
+
+ return 0;
+
+ error:
+ VIR_WARN("Cannot start job (%s) for domain %s;"
+ " current job is (%s) owned by (%d)",
+ virCHDomainJobTypeToString(job),
+ obj->def->name,
+ virCHDomainJobTypeToString(priv->job.active),
+ priv->job.owner);
+
+ if (errno == ETIMEDOUT)
+ virReportError(VIR_ERR_OPERATION_TIMEOUT,
+ "%s", _("cannot acquire state change lock"));
+ else
+ virReportSystemError(errno,
+ "%s", _("cannot acquire job mutex"));
+ return -1;
+}
+
+/*
+ * obj must be locked and have a reference before calling
+ *
+ * To be called after completing the work associated with the
+ * earlier virCHDomainBeginJob() call
+ */
+void
+virCHDomainObjEndJob(virDomainObj *obj)
+{
+ virCHDomainObjPrivate *priv = obj->privateData;
+ enum virCHDomainJob job = priv->job.active;
+
+ VIR_DEBUG("Stopping job: %s",
+ virCHDomainJobTypeToString(job));
+
+ virCHDomainObjResetJob(priv);
+ virCondSignal(&priv->job.cond);
+}
+
+static void *
+virCHDomainObjPrivateAlloc(void *opaque G_GNUC_UNUSED)
+{
+ virCHDomainObjPrivate *priv;
+
+ priv = g_new0(virCHDomainObjPrivate, 1);
+
+ if (virCHDomainObjInitJob(priv) < 0) {
+ g_free(priv);
+ return NULL;
+ }
+
+ return priv;
+}
+
+static void
+virCHDomainObjPrivateFree(void *data)
+{
+ virCHDomainObjPrivate *priv = data;
+
+ virCHDomainObjFreeJob(priv);
+ g_free(priv);
+}
+
+virDomainXMLPrivateDataCallbacks virCHDriverPrivateDataCallbacks = {
+ .alloc = virCHDomainObjPrivateAlloc,
+ .free = virCHDomainObjPrivateFree,
+};
+
+static int
+virCHDomainDefPostParseBasic(virDomainDef *def,
+ void *opaque G_GNUC_UNUSED)
+{
+ /* check for emulator and create a default one if needed */
+ if (!def->emulator) {
+ if (!(def->emulator = g_find_program_in_path(CH_CMD))) {
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+ _("No emulator found for cloud-hypervisor"));
+ return 1;
+ }
+ }
+
+ return 0;
+}
+
+static int
+virCHDomainDefPostParse(virDomainDef *def,
+ unsigned int parseFlags G_GNUC_UNUSED,
+ void *opaque,
+ void *parseOpaque G_GNUC_UNUSED)
+{
+ virCHDriver *driver = opaque;
+ g_autoptr(virCaps) caps = virCHDriverGetCapabilities(driver, false);
+ if (!caps)
+ return -1;
+ if (!virCapabilitiesDomainSupported(caps, def->os.type,
+ def->os.arch,
+ def->virtType))
+ return -1;
+
+ return 0;
+}
+
+virDomainDefParserConfig virCHDriverDomainDefParserConfig = {
+ .domainPostParseBasicCallback = virCHDomainDefPostParseBasic,
+ .domainPostParseCallback = virCHDomainDefPostParse,
+};
diff --git a/src/ch/ch_domain.h b/src/ch/ch_domain.h
new file mode 100644
index 0000000000..b4e0d4c212
--- /dev/null
+++ b/src/ch/ch_domain.h
@@ -0,0 +1,65 @@
+/*
+ * Copyright Intel Corp. 2020-2021
+ *
+ * ch_domain.h: header file for domain manager's Cloud-Hypervisor driver functions
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library. If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#pragma once
+
+#include "ch_conf.h"
+#include "ch_monitor.h"
+
+/* Give up waiting for mutex after 30 seconds */
+#define CH_JOB_WAIT_TIME (1000ull * 30)
+
+/* Only 1 job is allowed at any time
+ * A job includes *all* ch.so api, even those just querying
+ * information, not merely actions */
+
+enum virCHDomainJob {
+ CH_JOB_NONE = 0, /* Always set to 0 for easy if (jobActive) conditions */
+ CH_JOB_QUERY, /* Doesn't change any state */
+ CH_JOB_DESTROY, /* Destroys the domain (cannot be masked out) */
+ CH_JOB_MODIFY, /* May change state */
+ CH_JOB_LAST
+};
+VIR_ENUM_DECL(virCHDomainJob);
+
+
+struct virCHDomainJobObj {
+ virCond cond; /* Use to coordinate jobs */
+ enum virCHDomainJob active; /* Currently running job */
+ int owner; /* Thread which set current job */
+};
+
+
+typedef struct _virCHDomainObjPrivate virCHDomainObjPrivate;
+struct _virCHDomainObjPrivate {
+ struct virCHDomainJobObj job;
+
+ virCHMonitor *monitor;
+};
+
+extern virDomainXMLPrivateDataCallbacks virCHDriverPrivateDataCallbacks;
+extern virDomainDefParserConfig virCHDriverDomainDefParserConfig;
+
+int
+virCHDomainObjBeginJob(virDomainObj *obj, enum virCHDomainJob job)
+ G_GNUC_WARN_UNUSED_RESULT;
+
+void
+virCHDomainObjEndJob(virDomainObj *obj);
diff --git a/src/ch/ch_driver.c b/src/ch/ch_driver.c
new file mode 100644
index 0000000000..11f5a3d01b
--- /dev/null
+++ b/src/ch/ch_driver.c
@@ -0,0 +1,930 @@
+/*
+ * Copyright Intel Corp. 2020-2021
+ *
+ * ch_driver.c: Core Cloud-Hypervisor driver functions
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library. If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#include <config.h>
+
+#include "ch_conf.h"
+#include "ch_domain.h"
+#include "ch_driver.h"
+#include "ch_monitor.h"
+#include "ch_process.h"
+#include "datatypes.h"
+#include "driver.h"
+#include "viraccessapicheck.h"
+#include "viralloc.h"
+#include "virbuffer.h"
+#include "vircommand.h"
+#include "virerror.h"
+#include "virfile.h"
+#include "virlog.h"
+#include "virnetdevtap.h"
+#include "virobject.h"
+#include "virstring.h"
+#include "virtypedparam.h"
+#include "viruri.h"
+#include "virutil.h"
+#include "viruuid.h"
+
+#define VIR_FROM_THIS VIR_FROM_CH
+
+VIR_LOG_INIT("ch.ch_driver");
+
+static int chStateInitialize(bool privileged,
+ const char *root,
+ virStateInhibitCallback callback,
+ void *opaque);
+static int chStateCleanup(void);
+virCHDriver *ch_driver = NULL;
+
+static virDomainObj *
+chDomObjFromDomain(virDomain *domain)
+{
+ virDomainObj *vm;
+ virCHDriver *driver = domain->conn->privateData;
+ char uuidstr[VIR_UUID_STRING_BUFLEN];
+
+ vm = virDomainObjListFindByUUID(driver->domains, domain->uuid);
+ if (!vm) {
+ virUUIDFormat(domain->uuid, uuidstr);
+ virReportError(VIR_ERR_NO_DOMAIN,
+ _("no domain with matching uuid '%s' (%s)"),
+ uuidstr, domain->name);
+ return NULL;
+ }
+
+ return vm;
+}
+
+/* Functions */
+static int
+chConnectURIProbe(char **uri)
+{
+ if (ch_driver == NULL)
+ return 0;
+
+ *uri = g_strdup("ch:///system");
+ return 1;
+}
+
+static virDrvOpenStatus chConnectOpen(virConnectPtr conn,
+ virConnectAuthPtr auth G_GNUC_UNUSED,
+ virConf *conf G_GNUC_UNUSED,
+ unsigned int flags)
+{
+ virCheckFlags(VIR_CONNECT_RO, VIR_DRV_OPEN_ERROR);
+
+ /* URI was good, but driver isn't active */
+ if (ch_driver == NULL) {
+ virReportError(VIR_ERR_INTERNAL_ERROR,
+ "%s", _("Cloud-Hypervisor state driver is not active"));
+ return VIR_DRV_OPEN_ERROR;
+ }
+
+ if (virConnectOpenEnsureACL(conn) < 0)
+ return VIR_DRV_OPEN_ERROR;
+
+ conn->privateData = ch_driver;
+
+ return VIR_DRV_OPEN_SUCCESS;
+}
+
+static int chConnectClose(virConnectPtr conn)
+{
+ conn->privateData = NULL;
+ return 0;
+}
+
+static const char *chConnectGetType(virConnectPtr conn)
+{
+ if (virConnectGetTypeEnsureACL(conn) < 0)
+ return NULL;
+
+ return "CH";
+}
+
+static int chConnectGetVersion(virConnectPtr conn,
+ unsigned long *version)
+{
+ virCHDriver *driver = conn->privateData;
+
+ if (virConnectGetVersionEnsureACL(conn) < 0)
+ return -1;
+
+ chDriverLock(driver);
+ *version = driver->version;
+ chDriverUnlock(driver);
+ return 0;
+}
+
+static char *chConnectGetHostname(virConnectPtr conn)
+{
+ if (virConnectGetHostnameEnsureACL(conn) < 0)
+ return NULL;
+
+ return virGetHostname();
+}
+
+static int chConnectNumOfDomains(virConnectPtr conn)
+{
+ virCHDriver *driver = conn->privateData;
+
+ if (virConnectNumOfDomainsEnsureACL(conn) < 0)
+ return -1;
+
+ return virDomainObjListNumOfDomains(driver->domains, true,
+ virConnectNumOfDomainsCheckACL, conn);
+}
+
+static int chConnectListDomains(virConnectPtr conn, int *ids, int nids)
+{
+ virCHDriver *driver = conn->privateData;
+
+ if (virConnectListDomainsEnsureACL(conn) < 0)
+ return -1;
+
+ return virDomainObjListGetActiveIDs(driver->domains, ids, nids,
+ virConnectListDomainsCheckACL, conn);
+}
+
+static int
+chConnectListAllDomains(virConnectPtr conn,
+ virDomainPtr **domains,
+ unsigned int flags)
+{
+ virCHDriver *driver = conn->privateData;
+
+ virCheckFlags(VIR_CONNECT_LIST_DOMAINS_FILTERS_ALL, -1);
+
+ if (virConnectListAllDomainsEnsureACL(conn) < 0)
+ return -1;
+
+ return virDomainObjListExport(driver->domains, conn, domains,
+ virConnectListAllDomainsCheckACL, flags);
+}
+
+static int chNodeGetInfo(virConnectPtr conn,
+ virNodeInfoPtr nodeinfo)
+{
+ if (virNodeGetInfoEnsureACL(conn) < 0)
+ return -1;
+
+ return virCapabilitiesGetNodeInfo(nodeinfo);
+}
+
+static char *chConnectGetCapabilities(virConnectPtr conn)
+{
+ virCHDriver *driver = conn->privateData;
+ virCaps *caps;
+ char *xml;
+
+ if (virConnectGetCapabilitiesEnsureACL(conn) < 0)
+ return NULL;
+
+ if (!(caps = virCHDriverGetCapabilities(driver, true)))
+ return NULL;
+
+ xml = virCapabilitiesFormatXML(caps);
+
+ virObjectUnref(caps);
+ return xml;
+}
+
+/**
+ * chDomainCreateXML:
+ * @conn: pointer to connection
+ * @xml: XML definition of domain
+ * @flags: bitwise-OR of supported virDomainCreateFlags
+ *
+ * Creates a domain based on xml and starts it
+ *
+ * Returns a new domain object or NULL in case of failure.
+ */
+static virDomainPtr
+chDomainCreateXML(virConnectPtr conn,
+ const char *xml,
+ unsigned int flags)
+{
+ virCHDriver *driver = conn->privateData;
+ virDomainDef *vmdef = NULL;
+ virDomainObj *vm = NULL;
+ virDomainPtr dom = NULL;
+ unsigned int parse_flags = VIR_DOMAIN_DEF_PARSE_INACTIVE;
+
+ virCheckFlags(VIR_DOMAIN_START_VALIDATE, NULL);
+
+ if (flags & VIR_DOMAIN_START_VALIDATE)
+ parse_flags |= VIR_DOMAIN_DEF_PARSE_VALIDATE_SCHEMA;
+
+
+ if ((vmdef = virDomainDefParseString(xml, driver->xmlopt,
+ NULL, parse_flags)) == NULL)
+ goto cleanup;
+
+ if (virDomainCreateXMLEnsureACL(conn, vmdef) < 0)
+ goto cleanup;
+
+ if (!(vm = virDomainObjListAdd(driver->domains,
+ vmdef,
+ driver->xmlopt,
+ VIR_DOMAIN_OBJ_LIST_ADD_LIVE |
+ VIR_DOMAIN_OBJ_LIST_ADD_CHECK_LIVE,
+ NULL)))
+ goto cleanup;
+
+ vmdef = NULL;
+
+ if (virCHDomainObjBeginJob(vm, CH_JOB_MODIFY) < 0)
+ goto cleanup;
+
+ if (virCHProcessStart(driver, vm, VIR_DOMAIN_RUNNING_BOOTED) < 0)
+ goto cleanup;
+
+ dom = virGetDomain(conn, vm->def->name, vm->def->uuid, vm->def->id);
+
+ virCHDomainObjEndJob(vm);
+
+ cleanup:
+ if (!dom && vm) {
+ virDomainObjListRemove(driver->domains, vm);
+ }
+ virDomainDefFree(vmdef);
+ virDomainObjEndAPI(&vm);
+ chDriverUnlock(driver);
+ return dom;
+}
+
+static int
+chDomainCreateWithFlags(virDomainPtr dom, unsigned int flags)
+{
+ virCHDriver *driver = dom->conn->privateData;
+ virDomainObj *vm;
+ int ret = -1;
+
+ virCheckFlags(0, -1);
+
+ if (!(vm = chDomObjFromDomain(dom)))
+ goto cleanup;
+
+ if (virDomainCreateWithFlagsEnsureACL(dom->conn, vm->def) < 0)
+ goto cleanup;
+
+ if (virCHDomainObjBeginJob(vm, CH_JOB_MODIFY) < 0)
+ goto cleanup;
+
+ ret = virCHProcessStart(driver, vm, VIR_DOMAIN_RUNNING_BOOTED);
+
+ virCHDomainObjEndJob(vm);
+
+ cleanup:
+ virDomainObjEndAPI(&vm);
+ return ret;
+}
+
+static int
+chDomainCreate(virDomainPtr dom)
+{
+ return chDomainCreateWithFlags(dom, 0);
+}
+
+static virDomainPtr
+chDomainDefineXMLFlags(virConnectPtr conn, const char *xml, unsigned int flags)
+{
+ virCHDriver *driver = conn->privateData;
+ virDomainDef *vmdef = NULL;
+ virDomainObj *vm = NULL;
+ virDomainPtr dom = NULL;
+ unsigned int parse_flags = VIR_DOMAIN_DEF_PARSE_INACTIVE;
+
+ virCheckFlags(VIR_DOMAIN_DEFINE_VALIDATE, NULL);
+
+ if (flags & VIR_DOMAIN_DEFINE_VALIDATE)
+ parse_flags |= VIR_DOMAIN_DEF_PARSE_VALIDATE_SCHEMA;
+
+ if ((vmdef = virDomainDefParseString(xml, driver->xmlopt,
+ NULL, parse_flags)) == NULL)
+ goto cleanup;
+
+ if (virXMLCheckIllegalChars("name", vmdef->name, "\n") < 0)
+ goto cleanup;
+
+ if (virDomainDefineXMLFlagsEnsureACL(conn, vmdef) < 0)
+ goto cleanup;
+
+ if (!(vm = virDomainObjListAdd(driver->domains, vmdef,
+ driver->xmlopt,
+ 0, NULL)))
+ goto cleanup;
+
+ vmdef = NULL;
+ vm->persistent = 1;
+
+ dom = virGetDomain(conn, vm->def->name, vm->def->uuid, vm->def->id);
+
+ cleanup:
+ virDomainDefFree(vmdef);
+ virDomainObjEndAPI(&vm);
+ return dom;
+}
+
+static virDomainPtr
+chDomainDefineXML(virConnectPtr conn, const char *xml)
+{
+ return chDomainDefineXMLFlags(conn, xml, 0);
+}
+
+static int
+chDomainUndefineFlags(virDomainPtr dom,
+ unsigned int flags)
+{
+ virCHDriver *driver = dom->conn->privateData;
+ virDomainObj *vm;
+ int ret = -1;
+
+ virCheckFlags(0, -1);
+
+ if (!(vm = chDomObjFromDomain(dom)))
+ goto cleanup;
+
+ if (virDomainUndefineFlagsEnsureACL(dom->conn, vm->def) < 0)
+ goto cleanup;
+
+ if (!vm->persistent) {
+ virReportError(VIR_ERR_OPERATION_INVALID,
+ "%s", _("Cannot undefine transient domain"));
+ goto cleanup;
+ }
+
+ if (virDomainObjIsActive(vm)) {
+ vm->persistent = 0;
+ } else {
+ virDomainObjListRemove(driver->domains, vm);
+ }
+
+ ret = 0;
+
+ cleanup:
+ virDomainObjEndAPI(&vm);
+ return ret;
+}
+
+static int
+chDomainUndefine(virDomainPtr dom)
+{
+ return chDomainUndefineFlags(dom, 0);
+}
+
+static int chDomainIsActive(virDomainPtr dom)
+{
+ virCHDriver *driver = dom->conn->privateData;
+ virDomainObj *vm;
+ int ret = -1;
+
+ chDriverLock(driver);
+ if (!(vm = chDomObjFromDomain(dom)))
+ goto cleanup;
+
+ if (virDomainIsActiveEnsureACL(dom->conn, vm->def) < 0)
+ goto cleanup;
+
+ ret = virDomainObjIsActive(vm);
+
+ cleanup:
+ virDomainObjEndAPI(&vm);
+ chDriverUnlock(driver);
+ return ret;
+}
+
+static int
+chDomainShutdownFlags(virDomainPtr dom,
+ unsigned int flags)
+{
+ virCHDomainObjPrivate *priv;
+ virDomainObj *vm;
+ virDomainState state;
+ int ret = -1;
+
+ virCheckFlags(VIR_DOMAIN_SHUTDOWN_ACPI_POWER_BTN, -1);
+
+ if (!(vm = chDomObjFromDomain(dom)))
+ goto cleanup;
+
+ priv = vm->privateData;
+
+ if (virDomainShutdownFlagsEnsureACL(dom->conn, vm->def, flags) < 0)
+ goto cleanup;
+
+ if (virCHDomainObjBeginJob(vm, CH_JOB_MODIFY) < 0)
+ goto cleanup;
+
+ if (virDomainObjCheckActive(vm) < 0)
+ goto endjob;
+
+ state = virDomainObjGetState(vm, NULL);
+ if (state != VIR_DOMAIN_RUNNING && state != VIR_DOMAIN_PAUSED) {
+ virReportError(VIR_ERR_OPERATION_INVALID, "%s",
+ _("only can shutdown running/paused domain"));
+ goto endjob;
+ } else {
+ if (virCHMonitorShutdownVM(priv->monitor) < 0) {
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("failed to shutdown guest VM"));
+ goto endjob;
+ }
+ }
+
+ virDomainObjSetState(vm, VIR_DOMAIN_SHUTDOWN, VIR_DOMAIN_SHUTDOWN_USER);
+
+ ret = 0;
+
+ endjob:
+ virCHDomainObjEndJob(vm);
+
+ cleanup:
+ virDomainObjEndAPI(&vm);
+ return ret;
+}
+
+static int
+chDomainShutdown(virDomainPtr dom)
+{
+ return chDomainShutdownFlags(dom, 0);
+}
+
+
+static int
+chDomainReboot(virDomainPtr dom, unsigned int flags)
+{
+ virCHDomainObjPrivate *priv;
+ virDomainObj *vm;
+ virDomainState state;
+ int ret = -1;
+
+ virCheckFlags(VIR_DOMAIN_REBOOT_ACPI_POWER_BTN, -1);
+
+ if (!(vm = chDomObjFromDomain(dom)))
+ goto cleanup;
+
+ priv = vm->privateData;
+
+ if (virDomainRebootEnsureACL(dom->conn, vm->def, flags) < 0)
+ goto cleanup;
+
+ if (virCHDomainObjBeginJob(vm, CH_JOB_MODIFY) < 0)
+ goto cleanup;
+
+ if (virDomainObjCheckActive(vm) < 0)
+ goto endjob;
+
+ state = virDomainObjGetState(vm, NULL);
+ if (state != VIR_DOMAIN_RUNNING && state != VIR_DOMAIN_PAUSED) {
+ virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
+ _("only can reboot running/paused domain"));
+ goto endjob;
+ } else {
+ if (virCHMonitorRebootVM(priv->monitor) < 0) {
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("failed to reboot domain"));
+ goto endjob;
+ }
+ }
+
+ if (state == VIR_DOMAIN_RUNNING)
+ virDomainObjSetState(vm, VIR_DOMAIN_RUNNING, VIR_DOMAIN_RUNNING_BOOTED);
+ else
+ virDomainObjSetState(vm, VIR_DOMAIN_RUNNING, VIR_DOMAIN_RUNNING_UNPAUSED);
+
+ ret = 0;
+
+ endjob:
+ virCHDomainObjEndJob(vm);
+
+ cleanup:
+ virDomainObjEndAPI(&vm);
+ return ret;
+}
+
+static int
+chDomainSuspend(virDomainPtr dom)
+{
+ virCHDomainObjPrivate *priv;
+ virDomainObj *vm;
+ int ret = -1;
+
+ if (!(vm = chDomObjFromDomain(dom)))
+ goto cleanup;
+
+ priv = vm->privateData;
+
+ if (virDomainSuspendEnsureACL(dom->conn, vm->def) < 0)
+ goto cleanup;
+
+ if (virCHDomainObjBeginJob(vm, CH_JOB_MODIFY) < 0)
+ goto cleanup;
+
+ if (virDomainObjCheckActive(vm) < 0)
+ goto endjob;
+
+ if (virDomainObjGetState(vm, NULL) != VIR_DOMAIN_RUNNING) {
+ virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
+ _("only can suspend running domain"));
+ goto endjob;
+ } else {
+ if (virCHMonitorSuspendVM(priv->monitor) < 0) {
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("failed to suspend domain"));
+ goto endjob;
+ }
+ }
+
+ virDomainObjSetState(vm, VIR_DOMAIN_PAUSED, VIR_DOMAIN_PAUSED_USER);
+
+ ret = 0;
+
+ endjob:
+ virCHDomainObjEndJob(vm);
+
+ cleanup:
+ virDomainObjEndAPI(&vm);
+ return ret;
+}
+
+static int
+chDomainResume(virDomainPtr dom)
+{
+ virCHDomainObjPrivate *priv;
+ virDomainObj *vm;
+ int ret = -1;
+
+ if (!(vm = chDomObjFromDomain(dom)))
+ goto cleanup;
+
+ priv = vm->privateData;
+
+ if (virDomainResumeEnsureACL(dom->conn, vm->def) < 0)
+ goto cleanup;
+
+ if (virCHDomainObjBeginJob(vm, CH_JOB_MODIFY) < 0)
+ goto cleanup;
+
+ if (virDomainObjCheckActive(vm) < 0)
+ goto endjob;
+
+ if (virDomainObjGetState(vm, NULL) != VIR_DOMAIN_PAUSED) {
+ virReportError(VIR_ERR_OPERATION_UNSUPPORTED, "%s",
+ _("only can resume paused domain"));
+ goto endjob;
+ } else {
+ if (virCHMonitorResumeVM(priv->monitor) < 0) {
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("failed to resume domain"));
+ goto endjob;
+ }
+ }
+
+ virDomainObjSetState(vm, VIR_DOMAIN_RUNNING, VIR_DOMAIN_RUNNING_UNPAUSED);
+
+ ret = 0;
+
+ endjob:
+ virCHDomainObjEndJob(vm);
+
+ cleanup:
+ virDomainObjEndAPI(&vm);
+ return ret;
+}
+
+/**
+ * chDomainDestroyFlags:
+ * @dom: pointer to domain to destroy
+ * @flags: extra flags; not used yet.
+ *
+ * Sends SIGKILL to Cloud-Hypervisor process to terminate it
+ *
+ * Returns 0 on success or -1 in case of error
+ */
+static int
+chDomainDestroyFlags(virDomainPtr dom, unsigned int flags)
+{
+ virCHDriver *driver = dom->conn->privateData;
+ virDomainObj *vm;
+ int ret = -1;
+
+ virCheckFlags(0, -1);
+
+ if (!(vm = chDomObjFromDomain(dom)))
+ goto cleanup;
+
+ if (virDomainDestroyFlagsEnsureACL(dom->conn, vm->def) < 0)
+ goto cleanup;
+
+ if (virCHDomainObjBeginJob(vm, CH_JOB_DESTROY) < 0)
+ goto cleanup;
+
+ if (virDomainObjCheckActive(vm) < 0)
+ goto endjob;
+
+ ret = virCHProcessStop(driver, vm, VIR_DOMAIN_SHUTOFF_DESTROYED);
+
+ endjob:
+ virCHDomainObjEndJob(vm);
+ if (!vm->persistent)
+ virDomainObjListRemove(driver->domains, vm);
+
+ cleanup:
+ virDomainObjEndAPI(&vm);
+ return ret;
+}
+
+static int
+chDomainDestroy(virDomainPtr dom)
+{
+ return chDomainDestroyFlags(dom, 0);
+}
+
+static virDomainPtr chDomainLookupByID(virConnectPtr conn,
+ int id)
+{
+ virCHDriver *driver = conn->privateData;
+ virDomainObj *vm;
+ virDomainPtr dom = NULL;
+
+ chDriverLock(driver);
+ vm = virDomainObjListFindByID(driver->domains, id);
+ chDriverUnlock(driver);
+
+ if (!vm) {
+ virReportError(VIR_ERR_NO_DOMAIN,
+ _("no domain with matching id '%d'"), id);
+ goto cleanup;
+ }
+
+ if (virDomainLookupByIDEnsureACL(conn, vm->def) < 0)
+ goto cleanup;
+
+ dom = virGetDomain(conn, vm->def->name, vm->def->uuid, vm->def->id);
+
+ cleanup:
+ virDomainObjEndAPI(&vm);
+ return dom;
+}
+
+static virDomainPtr chDomainLookupByName(virConnectPtr conn,
+ const char *name)
+{
+ virCHDriver *driver = conn->privateData;
+ virDomainObj *vm;
+ virDomainPtr dom = NULL;
+
+ chDriverLock(driver);
+ vm = virDomainObjListFindByName(driver->domains, name);
+ chDriverUnlock(driver);
+
+ if (!vm) {
+ virReportError(VIR_ERR_NO_DOMAIN,
+ _("no domain with matching name '%s'"), name);
+ goto cleanup;
+ }
+
+ if (virDomainLookupByNameEnsureACL(conn, vm->def) < 0)
+ goto cleanup;
+
+ dom = virGetDomain(conn, vm->def->name, vm->def->uuid, vm->def->id);
+
+ cleanup:
+ virDomainObjEndAPI(&vm);
+ return dom;
+}
+
+static virDomainPtr chDomainLookupByUUID(virConnectPtr conn,
+ const unsigned char *uuid)
+{
+ virCHDriver *driver = conn->privateData;
+ virDomainObj *vm;
+ virDomainPtr dom = NULL;
+
+ chDriverLock(driver);
+ vm = virDomainObjListFindByUUID(driver->domains, uuid);
+ chDriverUnlock(driver);
+
+ if (!vm) {
+ char uuidstr[VIR_UUID_STRING_BUFLEN];
+ virUUIDFormat(uuid, uuidstr);
+ virReportError(VIR_ERR_NO_DOMAIN,
+ _("no domain with matching uuid '%s'"), uuidstr);
+ goto cleanup;
+ }
+
+ if (virDomainLookupByUUIDEnsureACL(conn, vm->def) < 0)
+ goto cleanup;
+
+ dom = virGetDomain(conn, vm->def->name, vm->def->uuid, vm->def->id);
+
+ cleanup:
+ virDomainObjEndAPI(&vm);
+ return dom;
+}
+
+static int
+chDomainGetState(virDomainPtr dom,
+ int *state,
+ int *reason,
+ unsigned int flags)
+{
+ virDomainObj *vm;
+ int ret = -1;
+
+ virCheckFlags(0, -1);
+
+ if (!(vm = chDomObjFromDomain(dom)))
+ goto cleanup;
+
+ if (virDomainGetStateEnsureACL(dom->conn, vm->def) < 0)
+ goto cleanup;
+
+ *state = virDomainObjGetState(vm, reason);
+ ret = 0;
+
+ cleanup:
+ virDomainObjEndAPI(&vm);
+ return ret;
+}
+
+static char *chDomainGetXMLDesc(virDomainPtr dom,
+ unsigned int flags)
+{
+ virCHDriver *driver = dom->conn->privateData;
+ virDomainObj *vm;
+ char *ret = NULL;
+
+ virCheckFlags(VIR_DOMAIN_XML_COMMON_FLAGS, NULL);
+
+ if (!(vm = chDomObjFromDomain(dom)))
+ goto cleanup;
+
+ if (virDomainGetXMLDescEnsureACL(dom->conn, vm->def, flags) < 0)
+ goto cleanup;
+
+ ret = virDomainDefFormat(vm->def, driver->xmlopt,
+ virDomainDefFormatConvertXMLFlags(flags));
+
+ cleanup:
+ virDomainObjEndAPI(&vm);
+ return ret;
+}
+
+static int chDomainGetInfo(virDomainPtr dom,
+ virDomainInfoPtr info)
+{
+ virDomainObj *vm;
+ int ret = -1;
+
+ if (!(vm = chDomObjFromDomain(dom)))
+ goto cleanup;
+
+ if (virDomainGetInfoEnsureACL(dom->conn, vm->def) < 0)
+ goto cleanup;
+
+ info->state = virDomainObjGetState(vm, NULL);
+
+ info->cpuTime = 0;
+
+ info->maxMem = virDomainDefGetMemoryTotal(vm->def);
+ info->memory = vm->def->mem.cur_balloon;
+ info->nrVirtCpu = virDomainDefGetVcpus(vm->def);
+
+ ret = 0;
+
+ cleanup:
+ virDomainObjEndAPI(&vm);
+ return ret;
+}
+
+static int chStateCleanup(void)
+{
+ if (ch_driver == NULL)
+ return -1;
+
+ virObjectUnref(ch_driver->domains);
+ virObjectUnref(ch_driver->xmlopt);
+ virObjectUnref(ch_driver->caps);
+ virObjectUnref(ch_driver->config);
+ virMutexDestroy(&ch_driver->lock);
+ g_free(ch_driver);
+
+ return 0;
+}
+
+static int chStateInitialize(bool privileged,
+ const char *root,
+ virStateInhibitCallback callback G_GNUC_UNUSED,
+ void *opaque G_GNUC_UNUSED)
+{
+ if (root != NULL) {
+ virReportError(VIR_ERR_INVALID_ARG, "%s",
+ _("Driver does not support embedded mode"));
+ return -1;
+ }
+
+ ch_driver = g_new0(virCHDriver, 1);
+
+ if (virMutexInit(&ch_driver->lock) < 0) {
+ g_free(ch_driver);
+ return VIR_DRV_STATE_INIT_ERROR;
+ }
+
+ if (!(ch_driver->domains = virDomainObjListNew()))
+ goto cleanup;
+
+ if (!(ch_driver->caps = virCHDriverCapsInit()))
+ goto cleanup;
+
+ if (!(ch_driver->xmlopt = chDomainXMLConfInit(ch_driver)))
+ goto cleanup;
+
+ if (!(ch_driver->config = virCHDriverConfigNew(privileged)))
+ goto cleanup;
+
+ if (chExtractVersion(ch_driver) < 0)
+ goto cleanup;
+
+ return VIR_DRV_STATE_INIT_COMPLETE;
+
+ cleanup:
+ chStateCleanup();
+ return VIR_DRV_STATE_INIT_ERROR;
+}
+
+/* Function Tables */
+static virHypervisorDriver chHypervisorDriver = {
+ .name = "CH",
+ .connectURIProbe = chConnectURIProbe,
+ .connectOpen = chConnectOpen, /* 6.7.0 */
+ .connectClose = chConnectClose, /* 6.7.0 */
+ .connectGetType = chConnectGetType, /* 6.7.0 */
+ .connectGetVersion = chConnectGetVersion, /* 6.7.0 */
+ .connectGetHostname = chConnectGetHostname, /* 6.7.0 */
+ .connectNumOfDomains = chConnectNumOfDomains, /* 6.7.0 */
+ .connectListAllDomains = chConnectListAllDomains, /* 6.7.0 */
+ .connectListDomains = chConnectListDomains, /* 6.7.0 */
+ .connectGetCapabilities = chConnectGetCapabilities, /* 6.7.0 */
+ .domainCreateXML = chDomainCreateXML, /* 6.7.0 */
+ .domainCreate = chDomainCreate, /* 6.7.0 */
+ .domainCreateWithFlags = chDomainCreateWithFlags, /* 6.7.0 */
+ .domainShutdown = chDomainShutdown, /* 6.7.0 */
+ .domainShutdownFlags = chDomainShutdownFlags, /* 6.7.0 */
+ .domainReboot = chDomainReboot, /* 6.7.0 */
+ .domainSuspend = chDomainSuspend, /* 6.7.0 */
+ .domainResume = chDomainResume, /* 6.7.0 */
+ .domainDestroy = chDomainDestroy, /* 6.7.0 */
+ .domainDestroyFlags = chDomainDestroyFlags, /* 6.7.0 */
+ .domainDefineXML = chDomainDefineXML, /* 6.7.0 */
+ .domainDefineXMLFlags = chDomainDefineXMLFlags, /* 6.7.0 */
+ .domainUndefine = chDomainUndefine, /* 6.7.0 */
+ .domainUndefineFlags = chDomainUndefineFlags, /* 6.7.0 */
+ .domainLookupByID = chDomainLookupByID, /* 6.7.0 */
+ .domainLookupByUUID = chDomainLookupByUUID, /* 6.7.0 */
+ .domainLookupByName = chDomainLookupByName, /* 6.7.0 */
+ .domainGetState = chDomainGetState, /* 6.7.0 */
+ .domainGetXMLDesc = chDomainGetXMLDesc, /* 6.7.0 */
+ .domainGetInfo = chDomainGetInfo, /* 6.7.0 */
+ .domainIsActive = chDomainIsActive, /* 6.7.0 */
+ .nodeGetInfo = chNodeGetInfo, /* 6.7.0 */
+};
+
+static virConnectDriver chConnectDriver = {
+ .localOnly = true,
+ .uriSchemes = (const char *[]){"ch", NULL},
+ .hypervisorDriver = &chHypervisorDriver,
+};
+
+static virStateDriver chStateDriver = {
+ .name = "cloud-hypervisor",
+ .stateInitialize = chStateInitialize,
+ .stateCleanup = chStateCleanup,
+};
+
+int chRegister(void)
+{
+ if (virRegisterConnectDriver(&chConnectDriver, false) < 0)
+ return -1;
+ if (virRegisterStateDriver(&chStateDriver) < 0)
+ return -1;
+ return 0;
+}
diff --git a/src/ch/ch_driver.h b/src/ch/ch_driver.h
new file mode 100644
index 0000000000..933be3953b
--- /dev/null
+++ b/src/ch/ch_driver.h
@@ -0,0 +1,24 @@
+/*
+ * Copyright Intel Corp. 2020-2021
+ *
+ * ch_driver.h: header file for Cloud-Hypervisor driver functions
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library. If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#pragma once
+
+/* Function declarations */
+int chRegister(void);
diff --git a/src/ch/ch_monitor.c b/src/ch/ch_monitor.c
new file mode 100644
index 0000000000..2968c2aae7
--- /dev/null
+++ b/src/ch/ch_monitor.c
@@ -0,0 +1,837 @@
+/*
+ * Copyright Intel Corp. 2020-2021
+ *
+ * ch_monitor.c: Manage Cloud-Hypervisor interactions
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library. If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#include <config.h>
+
+#include <stdio.h>
+#include <unistd.h>
+#include <curl/curl.h>
+
+#include "ch_conf.h"
+#include "ch_monitor.h"
+#include "viralloc.h"
+#include "vircommand.h"
+#include "virerror.h"
+#include "virfile.h"
+#include "virjson.h"
+#include "virlog.h"
+#include "virstring.h"
+#include "virtime.h"
+
+#define VIR_FROM_THIS VIR_FROM_CH
+
+VIR_LOG_INIT("ch.ch_monitor");
+
+static virClass *virCHMonitorClass;
+static void virCHMonitorDispose(void *obj);
+
+static int virCHMonitorOnceInit(void)
+{
+ if (!VIR_CLASS_NEW(virCHMonitor, virClassForObjectLockable()))
+ return -1;
+
+ return 0;
+}
+
+VIR_ONCE_GLOBAL_INIT(virCHMonitor);
+
+int virCHMonitorShutdownVMM(virCHMonitor *mon);
+int virCHMonitorPutNoContent(virCHMonitor *mon, const char *endpoint);
+int virCHMonitorGet(virCHMonitor *mon, const char *endpoint);
+
+static int
+virCHMonitorBuildCPUJson(virJSONValue *content, virDomainDef *vmdef)
+{
+ virJSONValue *cpus;
+ unsigned int maxvcpus = 0;
+ unsigned int nvcpus = 0;
+ virDomainVcpuDef *vcpu;
+ size_t i;
+
+ /* count the maximum allowed number of vcpus and the vcpus enabled at boot. */
+ maxvcpus = virDomainDefGetVcpusMax(vmdef);
+ for (i = 0; i < maxvcpus; i++) {
+ vcpu = virDomainDefGetVcpu(vmdef, i);
+ if (vcpu->online)
+ nvcpus++;
+ }
+
+ if (maxvcpus != 0 || nvcpus != 0) {
+ cpus = virJSONValueNewObject();
+ if (virJSONValueObjectAppendNumberInt(cpus, "boot_vcpus", nvcpus) < 0)
+ goto cleanup;
+ if (virJSONValueObjectAppendNumberInt(cpus, "max_vcpus", vmdef->maxvcpus) < 0)
+ goto cleanup;
+ if (virJSONValueObjectAppend(content, "cpus", &cpus) < 0)
+ goto cleanup;
+ }
+
+ return 0;
+
+ cleanup:
+ virJSONValueFree(cpus);
+ return -1;
+}
+
+static int
+virCHMonitorBuildKernelRelatedJson(virJSONValue *content, virDomainDef *vmdef)
+{
+ virJSONValue *kernel = NULL;
+ virJSONValue *cmdline = virJSONValueNewObject();
+ virJSONValue *initramfs = NULL;
+
+ if (vmdef->os.kernel == NULL) {
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("Kernel image path in this domain is not defined"));
+ goto cleanup;
+ } else {
+ kernel = virJSONValueNewObject();
+ if (virJSONValueObjectAppendString(kernel, "path", vmdef->os.kernel) < 0)
+ goto cleanup;
+ if (virJSONValueObjectAppend(content, "kernel", &kernel) < 0)
+ goto cleanup;
+ }
+
+ if (vmdef->os.cmdline) {
+ if (virJSONValueObjectAppendString(cmdline, "args", vmdef->os.cmdline) < 0)
+ goto cleanup;
+ if (virJSONValueObjectAppend(content, "cmdline", &cmdline) < 0)
+ goto cleanup;
+ }
+
+ if (vmdef->os.initrd != NULL) {
+ initramfs = virJSONValueNewObject();
+ if (virJSONValueObjectAppendString(initramfs, "path", vmdef->os.initrd) < 0)
+ goto cleanup;
+ if (virJSONValueObjectAppend(content, "initramfs", &initramfs) < 0)
+ goto cleanup;
+ }
+
+ return 0;
+
+ cleanup:
+ virJSONValueFree(kernel);
+ virJSONValueFree(cmdline);
+ virJSONValueFree(initramfs);
+
+ return -1;
+}
+
+static int
+virCHMonitorBuildMemoryJson(virJSONValue *content, virDomainDef *vmdef)
+{
+ virJSONValue *memory;
+ unsigned long long total_memory = virDomainDefGetMemoryInitial(vmdef) * 1024;
+
+ if (total_memory != 0) {
+ memory = virJSONValueNewObject();
+ if (virJSONValueObjectAppendNumberUlong(memory, "size", total_memory) < 0)
+ goto cleanup;
+ if (virJSONValueObjectAppend(content, "memory", &memory) < 0)
+ goto cleanup;
+ }
+
+ return 0;
+
+ cleanup:
+ virJSONValueFree(memory);
+ return -1;
+}
+
+static int
+virCHMonitorBuildDiskJson(virJSONValue *disks, virDomainDiskDef *diskdef)
+{
+ virJSONValue *disk = virJSONValueNewObject();
+
+ if (!diskdef->src)
+ goto cleanup;
+
+ switch (diskdef->src->type) {
+ case VIR_STORAGE_TYPE_FILE:
+ if (!diskdef->src->path) {
+ virReportError(VIR_ERR_INVALID_ARG, "%s",
+ _("Missing disk file path in domain"));
+ goto cleanup;
+ }
+ if (diskdef->bus != VIR_DOMAIN_DISK_BUS_VIRTIO) {
+ virReportError(VIR_ERR_INVALID_ARG,
+ _("Only virtio bus types are supported for '%s'"), diskdef->src->path);
+ goto cleanup;
+ }
+ if (virJSONValueObjectAppendString(disk, "path", diskdef->src->path) < 0)
+ goto cleanup;
+ if (diskdef->src->readonly) {
+ if (virJSONValueObjectAppendBoolean(disk, "readonly", true) < 0)
+ goto cleanup;
+ }
+ if (virJSONValueArrayAppend(disks, &disk) < 0)
+ goto cleanup;
+
+ break;
+ case VIR_STORAGE_TYPE_NONE:
+ case VIR_STORAGE_TYPE_BLOCK:
+ case VIR_STORAGE_TYPE_DIR:
+ case VIR_STORAGE_TYPE_NETWORK:
+ case VIR_STORAGE_TYPE_VOLUME:
+ case VIR_STORAGE_TYPE_NVME:
+ case VIR_STORAGE_TYPE_VHOST_USER:
+ default:
+ virReportEnumRangeError(virStorageType, diskdef->src->type);
+ goto cleanup;
+ }
+
+ return 0;
+
+ cleanup:
+ virJSONValueFree(disk);
+ return -1;
+}
+
+static int
+virCHMonitorBuildDisksJson(virJSONValue *content, virDomainDef *vmdef)
+{
+ virJSONValue *disks;
+ size_t i;
+
+ if (vmdef->ndisks > 0) {
+ disks = virJSONValueNewArray();
+
+ for (i = 0; i < vmdef->ndisks; i++) {
+ if (virCHMonitorBuildDiskJson(disks, vmdef->disks[i]) < 0)
+ goto cleanup;
+ }
+ if (virJSONValueObjectAppend(content, "disks", &disks) < 0)
+ goto cleanup;
+ }
+
+ return 0;
+
+ cleanup:
+ virJSONValueFree(disks);
+ return -1;
+}
+
+static int
+virCHMonitorBuildNetJson(virJSONValue *nets, virDomainNetDef *netdef)
+{
+ virDomainNetType netType = virDomainNetGetActualType(netdef);
+ char macaddr[VIR_MAC_STRING_BUFLEN];
+ virJSONValue *net;
+
+    /* check the net type first */
+ net = virJSONValueNewObject();
+
+ switch (netType) {
+ case VIR_DOMAIN_NET_TYPE_ETHERNET:
+ if (netdef->guestIP.nips == 1) {
+ const virNetDevIPAddr *ip = netdef->guestIP.ips[0];
+ g_autofree char *addr = NULL;
+ virSocketAddr netmask;
+ g_autofree char *netmaskStr = NULL;
+ if (!(addr = virSocketAddrFormat(&ip->address)))
+ goto cleanup;
+ if (virJSONValueObjectAppendString(net, "ip", addr) < 0)
+ goto cleanup;
+
+ if (virSocketAddrPrefixToNetmask(ip->prefix, &netmask, AF_INET) < 0) {
+ virReportError(VIR_ERR_INTERNAL_ERROR,
+ _("Failed to translate net prefix %d to netmask"),
+ ip->prefix);
+ goto cleanup;
+ }
+ if (!(netmaskStr = virSocketAddrFormat(&netmask)))
+ goto cleanup;
+ if (virJSONValueObjectAppendString(net, "mask", netmaskStr) < 0)
+ goto cleanup;
+        } else if (netdef->guestIP.nips > 1) {
+            virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                           _("ethernet type only supports a single guest IP address"));
+            goto cleanup;
+        }
+ break;
+ case VIR_DOMAIN_NET_TYPE_VHOSTUSER:
+ if ((virDomainChrType)netdef->data.vhostuser->type != VIR_DOMAIN_CHR_TYPE_UNIX) {
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+ _("vhost_user type support UNIX socket in this CH"));
+ goto cleanup;
+ } else {
+ if (virJSONValueObjectAppendString(net, "vhost_socket", netdef->data.vhostuser->data.nix.path) < 0)
+ goto cleanup;
+ if (virJSONValueObjectAppendBoolean(net, "vhost_user", true) < 0)
+ goto cleanup;
+ }
+ break;
+ case VIR_DOMAIN_NET_TYPE_BRIDGE:
+ case VIR_DOMAIN_NET_TYPE_NETWORK:
+ case VIR_DOMAIN_NET_TYPE_DIRECT:
+ case VIR_DOMAIN_NET_TYPE_USER:
+ case VIR_DOMAIN_NET_TYPE_SERVER:
+ case VIR_DOMAIN_NET_TYPE_CLIENT:
+ case VIR_DOMAIN_NET_TYPE_MCAST:
+ case VIR_DOMAIN_NET_TYPE_INTERNAL:
+ case VIR_DOMAIN_NET_TYPE_HOSTDEV:
+ case VIR_DOMAIN_NET_TYPE_UDP:
+ case VIR_DOMAIN_NET_TYPE_VDPA:
+ case VIR_DOMAIN_NET_TYPE_LAST:
+ default:
+ virReportEnumRangeError(virDomainNetType, netType);
+ goto cleanup;
+ }
+
+ if (netdef->ifname != NULL) {
+ if (virJSONValueObjectAppendString(net, "tap", netdef->ifname) < 0)
+ goto cleanup;
+ }
+ if (virJSONValueObjectAppendString(net, "mac", virMacAddrFormat(&netdef->mac, macaddr)) < 0)
+ goto cleanup;
+
+
+ if (netdef->virtio != NULL) {
+ if (netdef->virtio->iommu == VIR_TRISTATE_SWITCH_ON) {
+ if (virJSONValueObjectAppendBoolean(net, "iommu", true) < 0)
+ goto cleanup;
+ }
+ }
+ if (netdef->driver.virtio.queues) {
+ if (virJSONValueObjectAppendNumberInt(net, "num_queues", netdef->driver.virtio.queues) < 0)
+ goto cleanup;
+ }
+
+ if (netdef->driver.virtio.rx_queue_size || netdef->driver.virtio.tx_queue_size) {
+ if (netdef->driver.virtio.rx_queue_size != netdef->driver.virtio.tx_queue_size) {
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+ _("virtio rx_queue_size option %d is not same with tx_queue_size %d"),
+ netdef->driver.virtio.rx_queue_size,
+ netdef->driver.virtio.tx_queue_size);
+ goto cleanup;
+ }
+ if (virJSONValueObjectAppendNumberInt(net, "queue_size", netdef->driver.virtio.rx_queue_size) < 0)
+ goto cleanup;
+ }
+
+ if (virJSONValueArrayAppend(nets, &net) < 0)
+ goto cleanup;
+
+ return 0;
+
+ cleanup:
+ virJSONValueFree(net);
+ return -1;
+}
+
+static int
+virCHMonitorBuildNetsJson(virJSONValue *content, virDomainDef *vmdef)
+{
+ virJSONValue *nets;
+ size_t i;
+
+ if (vmdef->nnets > 0) {
+ nets = virJSONValueNewArray();
+
+ for (i = 0; i < vmdef->nnets; i++) {
+ if (virCHMonitorBuildNetJson(nets, vmdef->nets[i]) < 0)
+ goto cleanup;
+ }
+ if (virJSONValueObjectAppend(content, "net", &nets) < 0)
+ goto cleanup;
+ }
+
+ return 0;
+
+ cleanup:
+ virJSONValueFree(nets);
+ return -1;
+}
+
+static int
+virCHMonitorDetectUnsupportedDevices(virDomainDef *vmdef)
+{
+ int ret = 0;
+
+    if (vmdef->ngraphics > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support graphics"));
+        ret = 1;
+    }
+    if (vmdef->ncontrollers > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support controllers"));
+        ret = 1;
+    }
+    if (vmdef->nfss > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support filesystems"));
+        ret = 1;
+    }
+    if (vmdef->ninputs > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support input devices"));
+        ret = 1;
+    }
+    if (vmdef->nsounds > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support sound devices"));
+        ret = 1;
+    }
+    if (vmdef->naudios > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support audio devices"));
+        ret = 1;
+    }
+    if (vmdef->nvideos > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support video devices"));
+        ret = 1;
+    }
+    if (vmdef->nhostdevs > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support host devices"));
+        ret = 1;
+    }
+    if (vmdef->nredirdevs > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support redirected devices"));
+        ret = 1;
+    }
+    if (vmdef->nsmartcards > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support smartcards"));
+        ret = 1;
+    }
+    if (vmdef->nserials > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support serial ports"));
+        ret = 1;
+    }
+    if (vmdef->nparallels > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support parallel ports"));
+        ret = 1;
+    }
+    if (vmdef->nchannels > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support channels"));
+        ret = 1;
+    }
+    if (vmdef->nconsoles > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support consoles"));
+        ret = 1;
+    }
+    if (vmdef->nleases > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support leases"));
+        ret = 1;
+    }
+    if (vmdef->nhubs > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support hubs"));
+        ret = 1;
+    }
+    if (vmdef->nseclabels > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support security labels"));
+        ret = 1;
+    }
+    if (vmdef->nrngs > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support RNG devices"));
+        ret = 1;
+    }
+    if (vmdef->nshmems > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support shared memory devices"));
+        ret = 1;
+    }
+    if (vmdef->nmems > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support memory devices"));
+        ret = 1;
+    }
+    if (vmdef->npanics > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support panic devices"));
+        ret = 1;
+    }
+    if (vmdef->nsysinfo > 0) {
+        virReportError(VIR_ERR_CONFIG_UNSUPPORTED, "%s",
+                       _("Cloud-Hypervisor doesn't support sysinfo"));
+        ret = 1;
+    }
+
+ return ret;
+}
+
+static int
+virCHMonitorBuildVMJson(virDomainDef *vmdef, char **jsonstr)
+{
+ virJSONValue *content = virJSONValueNewObject();
+ int ret = -1;
+
+ if (vmdef == NULL) {
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("VM is not defined"));
+ goto cleanup;
+ }
+
+ if (virCHMonitorDetectUnsupportedDevices(vmdef))
+ goto cleanup;
+
+ if (virCHMonitorBuildCPUJson(content, vmdef) < 0)
+ goto cleanup;
+
+ if (virCHMonitorBuildMemoryJson(content, vmdef) < 0)
+ goto cleanup;
+
+ if (virCHMonitorBuildKernelRelatedJson(content, vmdef) < 0)
+ goto cleanup;
+
+ if (virCHMonitorBuildDisksJson(content, vmdef) < 0)
+ goto cleanup;
+
+ if (virCHMonitorBuildNetsJson(content, vmdef) < 0)
+ goto cleanup;
+
+ if (!(*jsonstr = virJSONValueToString(content, false)))
+ goto cleanup;
+
+ ret = 0;
+
+ cleanup:
+ virJSONValueFree(content);
+ return ret;
+}
+
+static int
+chMonitorCreateSocket(const char *socket_path)
+{
+ struct sockaddr_un addr;
+ socklen_t addrlen = sizeof(addr);
+ int fd;
+
+ if ((fd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) {
+ virReportSystemError(errno, "%s",
+ _("Unable to create UNIX socket"));
+ goto error;
+ }
+
+ memset(&addr, 0, sizeof(addr));
+ addr.sun_family = AF_UNIX;
+ if (virStrcpyStatic(addr.sun_path, socket_path) < 0) {
+ virReportError(VIR_ERR_INTERNAL_ERROR,
+ _("UNIX socket path '%s' too long"),
+ socket_path);
+ goto error;
+ }
+
+ if (unlink(socket_path) < 0 && errno != ENOENT) {
+ virReportSystemError(errno,
+ _("Unable to unlink %s"),
+ socket_path);
+ goto error;
+ }
+
+ if (bind(fd, (struct sockaddr *)&addr, addrlen) < 0) {
+ virReportSystemError(errno,
+ _("Unable to bind to UNIX socket path '%s'"),
+ socket_path);
+ goto error;
+ }
+
+ if (listen(fd, 1) < 0) {
+ virReportSystemError(errno,
+ _("Unable to listen to UNIX socket path '%s'"),
+ socket_path);
+ goto error;
+ }
+
+ /* We run cloud-hypervisor with umask 0002. Compensate for the umask
+ * libvirtd might be running under to get the same permission
+ * cloud-hypervisor would have. */
+ if (virFileUpdatePerm(socket_path, 0002, 0664) < 0)
+ goto error;
+
+ return fd;
+
+ error:
+ VIR_FORCE_CLOSE(fd);
+ return -1;
+}
+
+virCHMonitor *
+virCHMonitorNew(virDomainObj *vm, const char *socketdir)
+{
+ virCHMonitor *ret = NULL;
+ virCHMonitor *mon = NULL;
+ virCommand *cmd = NULL;
+ int socket_fd = 0;
+
+ if (virCHMonitorInitialize() < 0)
+ return NULL;
+
+    if (!vm->def) {
+        virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+                       _("VM is not defined"));
+        return NULL;
+    }
+
+    if (!(mon = virObjectLockableNew(virCHMonitorClass)))
+        return NULL;
+
+ /* prepare to launch Cloud-Hypervisor socket */
+ mon->socketpath = g_strdup_printf("%s/%s-socket", socketdir, vm->def->name);
+ if (g_mkdir_with_parents(socketdir, 0777) < 0) {
+ virReportSystemError(errno,
+ _("Cannot create socket directory '%s'"),
+ socketdir);
+ goto cleanup;
+ }
+
+ cmd = virCommandNew(vm->def->emulator);
+    virCommandSetUmask(cmd, 0002);
+ socket_fd = chMonitorCreateSocket(mon->socketpath);
+ if (socket_fd < 0) {
+ virReportSystemError(errno,
+ _("Cannot create socket '%s'"),
+ mon->socketpath);
+ goto cleanup;
+ }
+
+ virCommandAddArg(cmd, "--api-socket");
+ virCommandAddArgFormat(cmd, "fd=%d", socket_fd);
+ virCommandPassFD(cmd, socket_fd, VIR_COMMAND_PASS_FD_CLOSE_PARENT);
+
+ /* launch Cloud-Hypervisor socket */
+ if (virCommandRunAsync(cmd, &mon->pid) < 0)
+ goto cleanup;
+
+ /* get a curl handle */
+ mon->handle = curl_easy_init();
+
+ /* now has its own reference */
+ virObjectRef(mon);
+ mon->vm = virObjectRef(vm);
+
+ ret = mon;
+
+ cleanup:
+    if (!ret)
+        virObjectUnref(mon);
+    virCommandFree(cmd);
+    return ret;
+}
+
+static void virCHMonitorDispose(void *opaque)
+{
+ virCHMonitor *mon = opaque;
+
+ VIR_DEBUG("mon=%p", mon);
+ virObjectUnref(mon->vm);
+}
+
+void virCHMonitorClose(virCHMonitor *mon)
+{
+ if (!mon)
+ return;
+
+ if (mon->pid > 0) {
+ /* try cleaning up the Cloud-Hypervisor process */
+ virProcessAbort(mon->pid);
+ mon->pid = 0;
+ }
+
+ if (mon->handle)
+ curl_easy_cleanup(mon->handle);
+
+ if (mon->socketpath) {
+ if (virFileRemove(mon->socketpath, -1, -1) < 0) {
+ VIR_WARN("Unable to remove CH socket file '%s'",
+ mon->socketpath);
+ }
+ g_free(mon->socketpath);
+ }
+
+    if (mon->vm)
+        virObjectUnref(mon->vm);
+    virObjectUnref(mon);
+}
+
+static int
+virCHMonitorCurlPerform(CURL *handle)
+{
+ CURLcode errorCode;
+ long responseCode = 0;
+
+ errorCode = curl_easy_perform(handle);
+
+ if (errorCode != CURLE_OK) {
+ virReportError(VIR_ERR_INTERNAL_ERROR,
+ _("curl_easy_perform() returned an error: %s (%d)"),
+ curl_easy_strerror(errorCode), errorCode);
+ return -1;
+ }
+
+ errorCode = curl_easy_getinfo(handle, CURLINFO_RESPONSE_CODE,
+ &responseCode);
+
+ if (errorCode != CURLE_OK) {
+ virReportError(VIR_ERR_INTERNAL_ERROR,
+ _("curl_easy_getinfo(CURLINFO_RESPONSE_CODE) returned an "
+ "error: %s (%d)"), curl_easy_strerror(errorCode),
+ errorCode);
+ return -1;
+ }
+
+ if (responseCode < 0) {
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("curl_easy_getinfo(CURLINFO_RESPONSE_CODE) returned a "
+ "negative response code"));
+ return -1;
+ }
+
+ return responseCode;
+}
+
+int
+virCHMonitorPutNoContent(virCHMonitor *mon, const char *endpoint)
+{
+ g_autofree char *url = NULL;
+ int responseCode = 0;
+ int ret = -1;
+
+ url = g_strdup_printf("%s/%s", URL_ROOT, endpoint);
+
+ virObjectLock(mon);
+
+ /* reset all options of a libcurl session handle at first */
+ curl_easy_reset(mon->handle);
+
+ curl_easy_setopt(mon->handle, CURLOPT_UNIX_SOCKET_PATH, mon->socketpath);
+ curl_easy_setopt(mon->handle, CURLOPT_URL, url);
+    curl_easy_setopt(mon->handle, CURLOPT_PUT, 1L);
+ curl_easy_setopt(mon->handle, CURLOPT_HTTPHEADER, NULL);
+
+ responseCode = virCHMonitorCurlPerform(mon->handle);
+
+ virObjectUnlock(mon);
+
+ if (responseCode == 200 || responseCode == 204)
+ ret = 0;
+
+ return ret;
+}
+
+int
+virCHMonitorGet(virCHMonitor *mon, const char *endpoint)
+{
+ g_autofree char *url = NULL;
+ int responseCode = 0;
+ int ret = -1;
+
+ url = g_strdup_printf("%s/%s", URL_ROOT, endpoint);
+
+ virObjectLock(mon);
+
+ /* reset all options of a libcurl session handle at first */
+ curl_easy_reset(mon->handle);
+
+ curl_easy_setopt(mon->handle, CURLOPT_UNIX_SOCKET_PATH, mon->socketpath);
+ curl_easy_setopt(mon->handle, CURLOPT_URL, url);
+
+ responseCode = virCHMonitorCurlPerform(mon->handle);
+
+ virObjectUnlock(mon);
+
+ if (responseCode == 200 || responseCode == 204)
+ ret = 0;
+
+ return ret;
+}
+
+int
+virCHMonitorShutdownVMM(virCHMonitor *mon)
+{
+ return virCHMonitorPutNoContent(mon, URL_VMM_SHUTDOWN);
+}
+
+int
+virCHMonitorCreateVM(virCHMonitor *mon)
+{
+ g_autofree char *url = NULL;
+ int responseCode = 0;
+ int ret = -1;
+ g_autofree char *payload = NULL;
+ struct curl_slist *headers = NULL;
+
+ url = g_strdup_printf("%s/%s", URL_ROOT, URL_VM_CREATE);
+ headers = curl_slist_append(headers, "Accept: application/json");
+ headers = curl_slist_append(headers, "Content-Type: application/json");
+
+    if (virCHMonitorBuildVMJson(mon->vm->def, &payload) != 0) {
+        curl_slist_free_all(headers);
+        return -1;
+    }
+
+ virObjectLock(mon);
+
+ /* reset all options of a libcurl session handle at first */
+ curl_easy_reset(mon->handle);
+
+ curl_easy_setopt(mon->handle, CURLOPT_UNIX_SOCKET_PATH, mon->socketpath);
+ curl_easy_setopt(mon->handle, CURLOPT_URL, url);
+ curl_easy_setopt(mon->handle, CURLOPT_CUSTOMREQUEST, "PUT");
+ curl_easy_setopt(mon->handle, CURLOPT_HTTPHEADER, headers);
+ curl_easy_setopt(mon->handle, CURLOPT_POSTFIELDS, payload);
+
+ responseCode = virCHMonitorCurlPerform(mon->handle);
+
+ virObjectUnlock(mon);
+
+ if (responseCode == 200 || responseCode == 204)
+ ret = 0;
+
+ curl_slist_free_all(headers);
+ return ret;
+}
+
+int
+virCHMonitorBootVM(virCHMonitor *mon)
+{
+ return virCHMonitorPutNoContent(mon, URL_VM_BOOT);
+}
+
+int
+virCHMonitorShutdownVM(virCHMonitor *mon)
+{
+ return virCHMonitorPutNoContent(mon, URL_VM_SHUTDOWN);
+}
+
+int
+virCHMonitorRebootVM(virCHMonitor *mon)
+{
+ return virCHMonitorPutNoContent(mon, URL_VM_REBOOT);
+}
+
+int
+virCHMonitorSuspendVM(virCHMonitor *mon)
+{
+    return virCHMonitorPutNoContent(mon, URL_VM_SUSPEND);
+}
+
+int
+virCHMonitorResumeVM(virCHMonitor *mon)
+{
+ return virCHMonitorPutNoContent(mon, URL_VM_RESUME);
+}
diff --git a/src/ch/ch_monitor.h b/src/ch/ch_monitor.h
new file mode 100644
index 0000000000..e717e11cbc
--- /dev/null
+++ b/src/ch/ch_monitor.h
@@ -0,0 +1,60 @@
+/*
+ * Copyright Intel Corp. 2020-2021
+ *
+ * ch_monitor.h: header file for managing Cloud-Hypervisor interactions
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library. If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#pragma once
+
+#include <curl/curl.h>
+
+#include "virobject.h"
+#include "domain_conf.h"
+
+#define URL_ROOT "http://localhost/api/v1"
+#define URL_VMM_SHUTDOWN "vmm.shutdown"
+#define URL_VM_CREATE "vm.create"
+#define URL_VM_DELETE "vm.delete"
+#define URL_VM_BOOT "vm.boot"
+#define URL_VM_SHUTDOWN "vm.shutdown"
+#define URL_VM_REBOOT "vm.reboot"
+#define URL_VM_SUSPEND "vm.pause"
+#define URL_VM_RESUME "vm.resume"
+
+typedef struct _virCHMonitor virCHMonitor;
+
+struct _virCHMonitor {
+ virObjectLockable parent;
+
+ CURL *handle;
+
+ char *socketpath;
+
+ pid_t pid;
+
+ virDomainObj *vm;
+};
+
+virCHMonitor *virCHMonitorNew(virDomainObj *vm, const char *socketdir);
+void virCHMonitorClose(virCHMonitor *mon);
+
+int virCHMonitorCreateVM(virCHMonitor *mon);
+int virCHMonitorBootVM(virCHMonitor *mon);
+int virCHMonitorShutdownVM(virCHMonitor *mon);
+int virCHMonitorRebootVM(virCHMonitor *mon);
+int virCHMonitorSuspendVM(virCHMonitor *mon);
+int virCHMonitorResumeVM(virCHMonitor *mon);
diff --git a/src/ch/ch_process.c b/src/ch/ch_process.c
new file mode 100644
index 0000000000..93b1f7f97e
--- /dev/null
+++ b/src/ch/ch_process.c
@@ -0,0 +1,126 @@
+/*
+ * Copyright Intel Corp. 2020-2021
+ *
+ * ch_process.c: Process controller for Cloud-Hypervisor driver
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library. If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#include <config.h>
+
+#include <unistd.h>
+#include <fcntl.h>
+
+#include "ch_domain.h"
+#include "ch_monitor.h"
+#include "ch_process.h"
+#include "viralloc.h"
+#include "virerror.h"
+#include "virlog.h"
+
+#define VIR_FROM_THIS VIR_FROM_CH
+
+VIR_LOG_INIT("ch.ch_process");
+
+#define START_SOCKET_POSTFIX ": starting up socket\n"
+#define START_VM_POSTFIX ": starting up vm\n"
+
+
+
+static virCHMonitor *
+virCHProcessConnectMonitor(virCHDriver *driver,
+ virDomainObj *vm)
+{
+ virCHMonitor *monitor = NULL;
+ virCHDriverConfig *cfg = virCHDriverGetConfig(driver);
+
+ monitor = virCHMonitorNew(vm, cfg->stateDir);
+
+ virObjectUnref(cfg);
+ return monitor;
+}
+
+/**
+ * virCHProcessStart:
+ * @driver: pointer to driver structure
+ * @vm: pointer to virtual machine structure
+ * @reason: reason for switching vm to running state
+ *
+ * Starts Cloud-Hypervisor listening on a local socket
+ *
+ * Returns 0 on success or -1 in case of error
+ */
+int virCHProcessStart(virCHDriver *driver,
+ virDomainObj *vm,
+ virDomainRunningReason reason)
+{
+ int ret = -1;
+ virCHDomainObjPrivate *priv = vm->privateData;
+
+ if (!priv->monitor) {
+ /* And we can get the first monitor connection now too */
+ if (!(priv->monitor = virCHProcessConnectMonitor(driver, vm))) {
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("failed to create connection to CH socket"));
+ goto cleanup;
+ }
+
+ if (virCHMonitorCreateVM(priv->monitor) < 0) {
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("failed to create guest VM"));
+ goto cleanup;
+ }
+ }
+
+ if (virCHMonitorBootVM(priv->monitor) < 0) {
+ virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
+ _("failed to boot guest VM"));
+ goto cleanup;
+ }
+
+ vm->pid = priv->monitor->pid;
+ vm->def->id = vm->pid;
+ virDomainObjSetState(vm, VIR_DOMAIN_RUNNING, reason);
+
+ return 0;
+
+ cleanup:
+ if (ret)
+ virCHProcessStop(driver, vm, VIR_DOMAIN_SHUTOFF_FAILED);
+
+ return ret;
+}
+
+int virCHProcessStop(virCHDriver *driver G_GNUC_UNUSED,
+ virDomainObj *vm,
+ virDomainShutoffReason reason)
+{
+ virCHDomainObjPrivate *priv = vm->privateData;
+
+ VIR_DEBUG("Stopping VM name=%s pid=%d reason=%d",
+ vm->def->name, (int)vm->pid, (int)reason);
+
+ if (priv->monitor) {
+ virCHMonitorClose(priv->monitor);
+ priv->monitor = NULL;
+ }
+
+ vm->pid = -1;
+ vm->def->id = -1;
+
+ virDomainObjSetState(vm, VIR_DOMAIN_SHUTOFF, reason);
+
+ return 0;
+}
diff --git a/src/ch/ch_process.h b/src/ch/ch_process.h
new file mode 100644
index 0000000000..abc4915979
--- /dev/null
+++ b/src/ch/ch_process.h
@@ -0,0 +1,31 @@
+/*
+ * Copyright Intel Corp. 2020-2021
+ *
+ * ch_process.h: header file for Cloud-Hypervisor's process controller
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library. If not, see
+ * <http://www.gnu.org/licenses/>.
+ */
+
+#pragma once
+
+#include "ch_conf.h"
+#include "internal.h"
+
+int virCHProcessStart(virCHDriver *driver,
+ virDomainObj *vm,
+ virDomainRunningReason reason);
+int virCHProcessStop(virCHDriver *driver,
+ virDomainObj *vm,
+ virDomainShutoffReason reason);
diff --git a/src/ch/meson.build b/src/ch/meson.build
new file mode 100644
index 0000000000..e34974d56c
--- /dev/null
+++ b/src/ch/meson.build
@@ -0,0 +1,74 @@
+ch_driver_sources = [
+ 'ch_conf.c',
+ 'ch_conf.h',
+ 'ch_domain.c',
+ 'ch_domain.h',
+ 'ch_driver.c',
+ 'ch_driver.h',
+ 'ch_monitor.c',
+ 'ch_monitor.h',
+ 'ch_process.c',
+ 'ch_process.h',
+]
+
+driver_source_files += files(ch_driver_sources)
+
+stateful_driver_source_files += files(ch_driver_sources)
+
+if conf.has('WITH_CH')
+ ch_driver_impl = static_library(
+ 'virt_driver_ch_impl',
+ [
+ ch_driver_sources,
+ ],
+ dependencies: [
+ access_dep,
+ curl_dep,
+ log_dep,
+ src_dep,
+ ],
+ include_directories: [
+ conf_inc_dir,
+ ],
+ )
+
+ virt_modules += {
+ 'name': 'virt_driver_ch',
+ 'link_whole': [
+ ch_driver_impl,
+ ],
+ 'link_args': [
+ libvirt_no_undefined,
+ ],
+ }
+
+ virt_daemons += {
+ 'name': 'virtchd',
+ 'c_args': [
+ '-DDAEMON_NAME="virtchd"',
+ '-DMODULE_NAME="ch"',
+ ],
+ }
+
+ virt_daemon_confs += {
+ 'name': 'virtchd',
+ }
+
+ virt_daemon_units += {
+ 'service': 'virtchd',
+ 'service_in': files('virtchd.service.in'),
+ 'name': 'Libvirt ch',
+ 'sockprefix': 'virtchd',
+ 'sockets': [ 'main', 'ro', 'admin' ],
+ }
+
+ sysconf_files += {
+ 'name': 'virtchd',
+ 'file': files('virtchd.sysconf'),
+ }
+
+ virt_install_dirs += [
+ localstatedir / 'lib' / 'libvirt' / 'ch',
+ runstatedir / 'libvirt' / 'ch',
+ ]
+endif
diff --git a/src/ch/virtchd.service.in b/src/ch/virtchd.service.in
new file mode 100644
index 0000000000..cc1e85d1df
--- /dev/null
+++ b/src/ch/virtchd.service.in
@@ -0,0 +1,47 @@
+[Unit]
+Description=Virtualization Cloud-Hypervisor daemon
+Conflicts=libvirtd.service
+Requires=virtchd.socket
+Requires=virtchd-ro.socket
+Requires=virtchd-admin.socket
+Wants=systemd-machined.service
+Before=libvirt-guests.service
+After=network.target
+After=dbus.service
+After=apparmor.service
+After=local-fs.target
+After=remote-fs.target
+After=systemd-logind.service
+After=systemd-machined.service
+Documentation=man:libvirtd(8)
+Documentation=https://libvirt.org
+
+[Service]
+Type=notify
+EnvironmentFile=-@sysconfdir@/sysconfig/virtchd
+ExecStart=@sbindir@/virtchd $VIRTCHD_ARGS
+ExecReload=/bin/kill -HUP $MAINPID
+KillMode=process
+Restart=on-failure
+# At least 2 FD per guest (eg ch monitor + ch socket).
+# eg if we want to support 4096 guests, we'll typically need 8192 FDs
+# If changing this, also consider virtlogd.service & virtlockd.service
+# limits which are also related to number of guests
+LimitNOFILE=8192
+# The cgroups pids controller can limit the number of tasks started by
+# the daemon, which can limit the number of domains for some hypervisors.
+# A conservative default of 8 tasks per guest results in a TasksMax of
+# 32k to support 4096 guests.
+TasksMax=32768
+# With cgroups v2 there is no devices controller anymore, we have to use
+# eBPF to control access to devices. In order to do that we create a eBPF
+# hash MAP which locks memory. The default map size for 64 devices together
+# with program takes 12k per guest. After rounding up we will get 64M to
+# support 4096 guests.
+LimitMEMLOCK=64M
+
+[Install]
+WantedBy=multi-user.target
+Also=virtchd.socket
+Also=virtchd-ro.socket
+Also=virtchd-admin.socket
diff --git a/src/ch/virtchd.sysconf b/src/ch/virtchd.sysconf
new file mode 100644
index 0000000000..5ee44be5cf
--- /dev/null
+++ b/src/ch/virtchd.sysconf
@@ -0,0 +1,3 @@
+# Customizations for the virtchd.service systemd unit
+
+VIRTCHD_ARGS="--timeout 120"
diff --git a/src/meson.build b/src/meson.build
index c7ff9e978c..2bd88e6699 100644
--- a/src/meson.build
+++ b/src/meson.build
@@ -271,6 +271,7 @@ subdir('esx')
subdir('hyperv')
subdir('libxl')
subdir('lxc')
+subdir('ch')
subdir('openvz')
subdir('qemu')
subdir('test')
diff --git a/src/remote/remote_daemon.c b/src/remote/remote_daemon.c
index ec2661d0a8..7076fe3294 100644
--- a/src/remote/remote_daemon.c
+++ b/src/remote/remote_daemon.c
@@ -169,6 +169,10 @@ static int daemonInitialize(void)
if (virDriverLoadModule("qemu", "qemuRegister", false) < 0)
return -1;
# endif
+# ifdef WITH_CH
+ if (virDriverLoadModule("ch", "chRegister", false) < 0)
+ return -1;
+# endif
# ifdef WITH_LXC
if (virDriverLoadModule("lxc", "lxcRegister", false) < 0)
return -1;
diff --git a/src/remote/remote_daemon_dispatch.c b/src/remote/remote_daemon_dispatch.c
index 1b4f5256c3..e500b41564 100644
--- a/src/remote/remote_daemon_dispatch.c
+++ b/src/remote/remote_daemon_dispatch.c
@@ -2139,7 +2139,8 @@ remoteDispatchConnectOpen(virNetServer *server G_GNUC_UNUSED,
STREQ(type, "VBOX") ||
STREQ(type, "bhyve") ||
STREQ(type, "vz") ||
- STREQ(type, "Parallels")) {
+ STREQ(type, "Parallels") ||
+ STREQ(type, "CH")) {
VIR_DEBUG("Hypervisor driver found, setting URIs for secondary drivers");
if (getuid() == 0) {
priv->interfaceURI = "interface:///system";
diff --git a/src/util/virerror.c b/src/util/virerror.c
index 1746487f7d..d9e2c65dc8 100644
--- a/src/util/virerror.c
+++ b/src/util/virerror.c
@@ -145,6 +145,7 @@ VIR_ENUM_IMPL(virErrorDomain,
"TPM", /* 70 */
"BPF",
+ "Cloud-Hypervisor Driver",
);
diff --git a/tools/virsh.c b/tools/virsh.c
index 7d7109cfdf..70355a606b 100644
--- a/tools/virsh.c
+++ b/tools/virsh.c
@@ -506,6 +506,9 @@ virshShowVersion(vshControl *ctl G_GNUC_UNUSED)
#ifdef WITH_OPENVZ
vshPrint(ctl, " OpenVZ");
#endif
+#ifdef WITH_CH
+ vshPrint(ctl, " Cloud-Hypervisor");
+#endif
#ifdef WITH_VZ
vshPrint(ctl, " Virtuozzo");
#endif
--
2.31.1
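For anyone who wants to poke at the command socket by hand, the REST calls
issued by virCHMonitorCreateVM() and virCHMonitorBootVM() above can be
reproduced with plain curl over the API socket. This is only a sketch: the
socket path below is a made-up example; the real path is
"<state dir>/<domain name>-socket" as constructed in virCHMonitorNew().

$ # create the VM from a JSON config, then boot it
$ curl --unix-socket /var/run/libvirt/ch/testvm-socket \
    -X PUT -H 'Content-Type: application/json' \
    --data @vm-config.json http://localhost/api/v1/vm.create
$ curl --unix-socket /var/run/libvirt/ch/testvm-socket \
    -X PUT http://localhost/api/v1/vm.boot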
[PATCH 0/2] Minor fixes to the RPM spec
by Neal Gompa
As part of the work to add openSUSE support to the spec file,
I discovered a couple of small fixes worth submitting separately.
Neal Gompa (2):
rpm: Simplify expression of supported platforms
rpm: Drop unnecessary libiscsi runtime dependency
libvirt.spec.in | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--
2.29.2
[libvirt PATCH v3 0/8] cleanup meson checks for runtime binaries
by Pavel Hrdina
A recent attempt to add a lot of meson options for specifying different
runtime paths motivated me to clean this up in meson.
Changes in v3:
- some patches were already pushed
- removed patch that moved virFindFileInPath from testutilsqemu.c
Changes in v2:
- split and rework patch 16/17 to address review comments
- added a new patch to cleanup libvirt.spec.in file
Pavel Hrdina (8):
virfile: introduce virFindFileInPathFull()
qemu_conf: use virFindFileInPathFull for runtime binaries
meson: drop check for runtime binary dependencies
meson: move iscsiadm check into storage_iscsi condition
meson: stop setting runtime binaries defines during compilation
meson: use runtime binaries to only resolve features with "auto" value
meson: optional_programs should be used only for building libvirt
libvirt.spec: drop no longer required build dependencies
libvirt.spec.in | 31 ----
meson.build | 214 +++++++++----------------
src/bhyve/bhyve_command.c | 4 +
src/libvirt_private.syms | 2 +-
src/locking/lock_driver_lockd.c | 12 +-
src/network/bridge_driver.c | 2 +
src/node_device/node_device_driver.c | 2 +
src/qemu/qemu_conf.c | 23 ++-
src/storage/storage_backend_logical.c | 13 ++
src/storage/storage_backend_sheepdog.c | 2 +
src/storage/storage_backend_zfs.c | 3 +
src/storage/storage_util.c | 2 +
src/storage/storage_util.h | 6 +
src/util/virdnsmasq.c | 1 +
src/util/virfile.c | 16 +-
src/util/virfile.h | 5 +-
src/util/virfirewall.h | 4 +
src/util/viriscsi.h | 2 +
src/util/virkmod.h | 3 +
src/util/virnetdevbandwidth.h | 2 +
src/util/virnetdevip.c | 2 +
src/util/virnetdevmidonet.c | 2 +
src/util/virnetdevopenvswitch.c | 2 +
src/util/virnuma.c | 1 +
src/util/virsysinfo.c | 1 +
src/util/virutil.c | 2 +
tests/testutilsqemu.c | 3 +-
tests/virfirewallmock.c | 3 +-
28 files changed, 173 insertions(+), 192 deletions(-)
--
2.30.2
[libvirt PATCH 00/10] Refactor more XML parsing boilerplate code, part VII
by Tim Wiederhake
For background, see
https://listman.redhat.com/archives/libvir-list/2021-April/msg00668.html
Tim Wiederhake (10):
virDomainAudioIOCommon: Change type of format to virDomainAudioFormat
virDomainAudioCommonParse: Use virXMLProp*
virDomainMemballoonDef: Change type of model to
virDomainMemballoonModel
virDomainMemballoonDefParseXML: Use virXMLProp*
virDomainShmemDef: Change type of model to virDomainShmemModel
virDomainShmemDef: Change type of role to virDomainShmemRole
virDomainShmemDefParseXML: Use virXMLProp*
conf: domain: Register autoptr cleanup function for virDomainDeviceDef
virDomainShmemDef: Use g_auto*
virDomainRedirFilterUSBDevDefParseXML: Use virXMLProp*
src/conf/domain_conf.c | 268 ++++++++++-----------------------
src/conf/domain_conf.h | 9 +-
src/libxl/libxl_conf.c | 8 +-
src/qemu/qemu_command.c | 4 +-
src/qemu/qemu_domain_address.c | 2 +-
src/qemu/qemu_hotplug.c | 4 +-
src/qemu/qemu_monitor.c | 2 +-
src/qemu/qemu_validate.c | 5 +
src/security/virt-aa-helper.c | 4 +
9 files changed, 100 insertions(+), 206 deletions(-)
--
2.26.3
[PATCH RFC v5 00/12] Add riscv kvm accel support
by Yifei Jiang
This series adds both riscv32 and riscv64 kvm support, and implements
migration based on riscv. It is based on temporarily unaccepted kvm:
https://github.com/kvm-riscv/linux (lastest version v17).
This series depends on above pending changes which haven't yet been
accepted, so this QEMU patch series is treated as RFC patches until
that dependency has been dealt with.
Several steps to use this:
1. Build emulation
$ ./configure --target-list=riscv64-softmmu
$ make -j$(nproc)
2. Build kernel
https://github.com/kvm-riscv/linux
3. Build QEMU VM
Cross built in riscv toolchain.
$ PKG_CONFIG_LIBDIR=<toolchain pkgconfig path>
$ export PKG_CONFIG_SYSROOT_DIR=<toolchain sysroot path>
$ ./configure --target-list=riscv64-softmmu --enable-kvm \
--cross-prefix=riscv64-linux-gnu- --disable-libiscsi --disable-glusterfs \
--disable-libusb --disable-usb-redir --audio-drv-list= --disable-opengl \
--disable-libxml2
$ make -j$(nproc)
4. Start emulation
$ ./qemu-system-riscv64 -M virt -m 4096M -cpu rv64,x-h=true -nographic \
-name guest=riscv-hyp,debug-threads=on \
-smp 4 \
-bios ./fw_jump.bin \
-kernel ./Image \
-drive file=./hyp.img,format=raw,id=hd0 \
-device virtio-blk-device,drive=hd0 \
-append "root=/dev/vda rw console=ttyS0 earlycon=sbi"
5. Start kvm-acceled QEMU VM in emulation
$ ./qemu-system-riscv64 -M virt,accel=kvm -m 1024M -cpu host -nographic \
-name guest=riscv-guset \
-smp 2 \
-bios none \
-kernel ./Image \
-drive file=./guest.img,format=raw,id=hd0 \
-device virtio-blk-device,drive=hd0 \
-append "root=/dev/vda rw console=ttyS0 earlycon=sbi"
Changes since RFC v4
- Rebase on QEMU v6.0.0-rc2 and kvm-riscv linux v17.
- Remove time scaling support as software solution is incomplete.
Because it will cause unacceptable performance degradation. and
We will post a better solution.
- Revise according to Alistair's review comments.
- Remove compile time XLEN checks in kvm_riscv_reg_id
- Surround TYPE_RISCV_CPU_HOST definition by CONFIG_KVM and share
it between RV32 and RV64.
- Add kvm-stub.c for reduce unnecessary compilation checks.
- Add riscv_setup_direct_kernel() to direct boot kernel for KVM.
Changes since RFC v3
- Rebase on QEMU v5.2.0-rc2 and kvm-riscv linux v15.
- Add time scaling support(New patches 13, 14 and 15).
- Fix the bug that guest vm can't reboot.
Changes since RFC v2
- Fix checkpatch error at target/riscv/sbi_ecall_interface.h.
- Add riscv migration support.
Changes since RFC v1
- Add separate SBI ecall interface header.
- Add riscv32 kvm accel support.
Yifei Jiang (12):
linux-header: Update linux/kvm.h
target/riscv: Add target/riscv/kvm.c to place the public kvm interface
target/riscv: Implement function kvm_arch_init_vcpu
target/riscv: Implement kvm_arch_get_registers
target/riscv: Implement kvm_arch_put_registers
target/riscv: Support start kernel directly by KVM
hw/riscv: PLIC update external interrupt by KVM when kvm enabled
target/riscv: Handle KVM_EXIT_RISCV_SBI exit
target/riscv: Add host cpu type
target/riscv: Add kvm_riscv_get/put_regs_timer
target/riscv: Implement virtual time adjusting with vm state changing
target/riscv: Support virtual time context synchronization
hw/intc/sifive_plic.c | 29 +-
hw/riscv/boot.c | 11 +
hw/riscv/virt.c | 7 +
include/hw/riscv/boot.h | 1 +
linux-headers/linux/kvm.h | 97 +++++
meson.build | 2 +
target/riscv/cpu.c | 17 +
target/riscv/cpu.h | 10 +
target/riscv/kvm-stub.c | 30 ++
target/riscv/kvm.c | 605 +++++++++++++++++++++++++++++
target/riscv/kvm_riscv.h | 25 ++
target/riscv/machine.c | 14 +
target/riscv/meson.build | 1 +
target/riscv/sbi_ecall_interface.h | 72 ++++
14 files changed, 912 insertions(+), 9 deletions(-)
create mode 100644 target/riscv/kvm-stub.c
create mode 100644 target/riscv/kvm.c
create mode 100644 target/riscv/kvm_riscv.h
create mode 100644 target/riscv/sbi_ecall_interface.h
--
2.19.1
[PATCH] virtio-blk: drop deprecated scsi=on|off property
by Stefan Hajnoczi
The scsi=on|off property was deprecated in QEMU 5.0 and can be removed
completely at this point.
Drop the scsi=on|off option. It was only available on Legacy virtio-blk
devices. Linux v5.6 already dropped support for it.
Remove the hw_compat_2_4[] property assignment since scsi=on|off no
longer exists. Old guests with Legacy virtio-blk devices no longer see
the SCSI host feature bit.
Live migrating old guests from an old QEMU with the SCSI feature bit
enabled will fail with "Features 0x... unsupported. Allowed features:
0x...". We've followed the QEMU deprecation policy so users have been
warned...
I have tested that libvirt still works when the property is absent. It
no longer adds scsi=on|off to the command-line.
Cc: Markus Armbruster <armbru(a)redhat.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Peter Krempa <pkrempa(a)redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert(a)redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha(a)redhat.com>
---
docs/specs/tpm.rst | 2 +-
docs/system/deprecated.rst | 13 ---
docs/pci_expander_bridge.txt | 2 +-
hw/block/virtio-blk.c | 192 +----------------------------------
hw/core/machine.c | 2 -
5 files changed, 3 insertions(+), 208 deletions(-)
diff --git a/docs/specs/tpm.rst b/docs/specs/tpm.rst
index 3be190343a..0ec017ab95 100644
--- a/docs/specs/tpm.rst
+++ b/docs/specs/tpm.rst
@@ -328,7 +328,7 @@ In case a pSeries machine is emulated, use the following command line:
-tpmdev emulator,id=tpm0,chardev=chrtpm \
-device tpm-spapr,tpmdev=tpm0 \
-device spapr-vscsi,id=scsi0,reg=0x00002000 \
- -device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0 \
+ -device virtio-blk-pci,bus=pci.0,addr=0x3,drive=drive-virtio-disk0,id=virtio-disk0 \
-drive file=test.img,format=raw,if=none,id=drive-virtio-disk0
In case an Arm virt machine is emulated, use the following command line:
diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index 80cae86252..1abb64b669 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -248,19 +248,6 @@ machines have been renamed ``raspi2b`` and ``raspi3b``.
Device options
--------------
-Emulated device options
-'''''''''''''''''''''''
-
-``-device virtio-blk,scsi=on|off`` (since 5.0.0)
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-The virtio-blk SCSI passthrough feature is a legacy VIRTIO feature. VIRTIO 1.0
-and later do not support it because the virtio-scsi device was introduced for
-full SCSI support. Use virtio-scsi instead when SCSI passthrough is required.
-
-Note this also applies to ``-device virtio-blk-pci,scsi=on|off``, which is an
-alias.
-
Block device options
''''''''''''''''''''
diff --git a/docs/pci_expander_bridge.txt b/docs/pci_expander_bridge.txt
index 36750273bb..540191f5e0 100644
--- a/docs/pci_expander_bridge.txt
+++ b/docs/pci_expander_bridge.txt
@@ -25,7 +25,7 @@ A detailed command line would be:
-object memory-backend-ram,size=1024M,policy=bind,host-nodes=1,id=ram-node1 -numa node,nodeid=1,cpus=1,memdev=ram-node1
-device pxb,id=bridge1,bus=pci.0,numa_node=1,bus_nr=4 -netdev user,id=nd -device e1000,bus=bridge1,addr=0x4,netdev=nd
-device pxb,id=bridge2,bus=pci.0,numa_node=0,bus_nr=8 -device e1000,bus=bridge2,addr=0x3
--device pxb,id=bridge3,bus=pci.0,bus_nr=40 -drive if=none,id=drive0,file=[img] -device virtio-blk-pci,drive=drive0,scsi=off,bus=bridge3,addr=1
+-device pxb,id=bridge3,bus=pci.0,bus_nr=40 -drive if=none,id=drive0,file=[img] -device virtio-blk-pci,drive=drive0,bus=bridge3,addr=1
Here you have:
- 2 NUMA nodes for the guest, 0 and 1. (both mapped to the same NUMA node in host, but you can and should put it in different host NUMA nodes)
diff --git a/hw/block/virtio-blk.c b/hw/block/virtio-blk.c
index d28979efb8..5023e046fc 100644
--- a/hw/block/virtio-blk.c
+++ b/hw/block/virtio-blk.c
@@ -25,10 +25,6 @@
#include "sysemu/runstate.h"
#include "hw/virtio/virtio-blk.h"
#include "dataplane/virtio-blk.h"
-#include "scsi/constants.h"
-#ifdef __linux__
-# include <scsi/sg.h>
-#endif
#include "hw/virtio/virtio-bus.h"
#include "migration/qemu-file-types.h"
#include "hw/virtio/virtio-access.h"
@@ -200,59 +196,6 @@ out:
aio_context_release(blk_get_aio_context(s->conf.conf.blk));
}
-#ifdef __linux__
-
-typedef struct {
- VirtIOBlockReq *req;
- struct sg_io_hdr hdr;
-} VirtIOBlockIoctlReq;
-
-static void virtio_blk_ioctl_complete(void *opaque, int status)
-{
- VirtIOBlockIoctlReq *ioctl_req = opaque;
- VirtIOBlockReq *req = ioctl_req->req;
- VirtIOBlock *s = req->dev;
- VirtIODevice *vdev = VIRTIO_DEVICE(s);
- struct virtio_scsi_inhdr *scsi;
- struct sg_io_hdr *hdr;
-
- scsi = (void *)req->elem.in_sg[req->elem.in_num - 2].iov_base;
-
- if (status) {
- status = VIRTIO_BLK_S_UNSUPP;
- virtio_stl_p(vdev, &scsi->errors, 255);
- goto out;
- }
-
- hdr = &ioctl_req->hdr;
- /*
- * From SCSI-Generic-HOWTO: "Some lower level drivers (e.g. ide-scsi)
- * clear the masked_status field [hence status gets cleared too, see
- * block/scsi_ioctl.c] even when a CHECK_CONDITION or COMMAND_TERMINATED
- * status has occurred. However they do set DRIVER_SENSE in driver_status
- * field. Also a (sb_len_wr > 0) indicates there is a sense buffer.
- */
- if (hdr->status == 0 && hdr->sb_len_wr > 0) {
- hdr->status = CHECK_CONDITION;
- }
-
- virtio_stl_p(vdev, &scsi->errors,
- hdr->status | (hdr->msg_status << 8) |
- (hdr->host_status << 16) | (hdr->driver_status << 24));
- virtio_stl_p(vdev, &scsi->residual, hdr->resid);
- virtio_stl_p(vdev, &scsi->sense_len, hdr->sb_len_wr);
- virtio_stl_p(vdev, &scsi->data_len, hdr->dxfer_len);
-
-out:
- aio_context_acquire(blk_get_aio_context(s->conf.conf.blk));
- virtio_blk_req_complete(req, status);
- virtio_blk_free_request(req);
- aio_context_release(blk_get_aio_context(s->conf.conf.blk));
- g_free(ioctl_req);
-}
-
-#endif
-
static VirtIOBlockReq *virtio_blk_get_request(VirtIOBlock *s, VirtQueue *vq)
{
VirtIOBlockReq *req = virtqueue_pop(vq, sizeof(VirtIOBlockReq));
@@ -263,126 +206,6 @@ static VirtIOBlockReq *virtio_blk_get_request(VirtIOBlock *s, VirtQueue *vq)
return req;
}
-static int virtio_blk_handle_scsi_req(VirtIOBlockReq *req)
-{
- int status = VIRTIO_BLK_S_OK;
- struct virtio_scsi_inhdr *scsi = NULL;
- VirtIOBlock *blk = req->dev;
- VirtIODevice *vdev = VIRTIO_DEVICE(blk);
- VirtQueueElement *elem = &req->elem;
-
-#ifdef __linux__
- int i;
- VirtIOBlockIoctlReq *ioctl_req;
- BlockAIOCB *acb;
-#endif
-
- /*
- * We require at least one output segment each for the virtio_blk_outhdr
- * and the SCSI command block.
- *
- * We also at least require the virtio_blk_inhdr, the virtio_scsi_inhdr
- * and the sense buffer pointer in the input segments.
- */
- if (elem->out_num < 2 || elem->in_num < 3) {
- status = VIRTIO_BLK_S_IOERR;
- goto fail;
- }
-
- /*
- * The scsi inhdr is placed in the second-to-last input segment, just
- * before the regular inhdr.
- */
- scsi = (void *)elem->in_sg[elem->in_num - 2].iov_base;
-
- if (!virtio_has_feature(blk->host_features, VIRTIO_BLK_F_SCSI)) {
- status = VIRTIO_BLK_S_UNSUPP;
- goto fail;
- }
-
- /*
- * No support for bidirection commands yet.
- */
- if (elem->out_num > 2 && elem->in_num > 3) {
- status = VIRTIO_BLK_S_UNSUPP;
- goto fail;
- }
-
-#ifdef __linux__
- ioctl_req = g_new0(VirtIOBlockIoctlReq, 1);
- ioctl_req->req = req;
- ioctl_req->hdr.interface_id = 'S';
- ioctl_req->hdr.cmd_len = elem->out_sg[1].iov_len;
- ioctl_req->hdr.cmdp = elem->out_sg[1].iov_base;
- ioctl_req->hdr.dxfer_len = 0;
-
- if (elem->out_num > 2) {
- /*
- * If there are more than the minimally required 2 output segments
- * there is write payload starting from the third iovec.
- */
- ioctl_req->hdr.dxfer_direction = SG_DXFER_TO_DEV;
- ioctl_req->hdr.iovec_count = elem->out_num - 2;
-
- for (i = 0; i < ioctl_req->hdr.iovec_count; i++) {
- ioctl_req->hdr.dxfer_len += elem->out_sg[i + 2].iov_len;
- }
-
- ioctl_req->hdr.dxferp = elem->out_sg + 2;
-
- } else if (elem->in_num > 3) {
- /*
- * If we have more than 3 input segments the guest wants to actually
- * read data.
- */
- ioctl_req->hdr.dxfer_direction = SG_DXFER_FROM_DEV;
- ioctl_req->hdr.iovec_count = elem->in_num - 3;
- for (i = 0; i < ioctl_req->hdr.iovec_count; i++) {
- ioctl_req->hdr.dxfer_len += elem->in_sg[i].iov_len;
- }
-
- ioctl_req->hdr.dxferp = elem->in_sg;
- } else {
- /*
- * Some SCSI commands don't actually transfer any data.
- */
- ioctl_req->hdr.dxfer_direction = SG_DXFER_NONE;
- }
-
- ioctl_req->hdr.sbp = elem->in_sg[elem->in_num - 3].iov_base;
- ioctl_req->hdr.mx_sb_len = elem->in_sg[elem->in_num - 3].iov_len;
-
- acb = blk_aio_ioctl(blk->blk, SG_IO, &ioctl_req->hdr,
- virtio_blk_ioctl_complete, ioctl_req);
- if (!acb) {
- g_free(ioctl_req);
- status = VIRTIO_BLK_S_UNSUPP;
- goto fail;
- }
- return -EINPROGRESS;
-#else
- abort();
-#endif
-
-fail:
- /* Just put anything nonzero so that the ioctl fails in the guest. */
- if (scsi) {
- virtio_stl_p(vdev, &scsi->errors, 255);
- }
- return status;
-}
-
-static void virtio_blk_handle_scsi(VirtIOBlockReq *req)
-{
- int status;
-
- status = virtio_blk_handle_scsi_req(req);
- if (status != -EINPROGRESS) {
- virtio_blk_req_complete(req, status);
- virtio_blk_free_request(req);
- }
-}
-
static inline void submit_requests(BlockBackend *blk, MultiReqBuffer *mrb,
int start, int num_reqs, int niov)
{
@@ -699,9 +522,6 @@ static int virtio_blk_handle_request(VirtIOBlockReq *req, MultiReqBuffer *mrb)
case VIRTIO_BLK_T_FLUSH:
virtio_blk_handle_flush(req, mrb);
break;
- case VIRTIO_BLK_T_SCSI_CMD:
- virtio_blk_handle_scsi(req);
- break;
case VIRTIO_BLK_T_GET_ID:
{
/*
@@ -1010,14 +830,8 @@ static uint64_t virtio_blk_get_features(VirtIODevice *vdev, uint64_t features,
virtio_add_feature(&features, VIRTIO_BLK_F_GEOMETRY);
virtio_add_feature(&features, VIRTIO_BLK_F_TOPOLOGY);
virtio_add_feature(&features, VIRTIO_BLK_F_BLK_SIZE);
- if (virtio_has_feature(features, VIRTIO_F_VERSION_1)) {
- if (virtio_has_feature(s->host_features, VIRTIO_BLK_F_SCSI)) {
- error_setg(errp, "Please set scsi=off for virtio-blk devices in order to use virtio 1.0");
- return 0;
- }
- } else {
+ if (!virtio_has_feature(features, VIRTIO_F_VERSION_1)) {
virtio_clear_feature(&features, VIRTIO_F_ANY_LAYOUT);
- virtio_add_feature(&features, VIRTIO_BLK_F_SCSI);
}
if (blk_enable_write_cache(s->blk) ||
@@ -1289,10 +1103,6 @@ static Property virtio_blk_properties[] = {
DEFINE_PROP_STRING("serial", VirtIOBlock, conf.serial),
DEFINE_PROP_BIT64("config-wce", VirtIOBlock, host_features,
VIRTIO_BLK_F_CONFIG_WCE, true),
-#ifdef __linux__
- DEFINE_PROP_BIT64("scsi", VirtIOBlock, host_features,
- VIRTIO_BLK_F_SCSI, false),
-#endif
DEFINE_PROP_BIT("request-merging", VirtIOBlock, conf.request_merging, 0,
true),
DEFINE_PROP_UINT16("num-queues", VirtIOBlock, conf.num_queues,
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 40def78183..286f18ec6d 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -194,8 +194,6 @@ GlobalProperty hw_compat_2_5[] = {
const size_t hw_compat_2_5_len = G_N_ELEMENTS(hw_compat_2_5);
GlobalProperty hw_compat_2_4[] = {
- /* Optional because the 'scsi' property is Linux-only */
- { "virtio-blk-device", "scsi", "true", .optional = true },
{ "e1000", "extra_mac_registers", "off" },
{ "virtio-pci", "x-disable-pcie", "on" },
{ "virtio-pci", "migrate-extra", "off" },
--
2.30.2
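As a quick sanity check that a given QEMU build no longer exposes the
property, the device's properties can be listed from the command line; a
rough sketch (the binary name is just an example):

$ # with this patch applied, no "scsi" property should be listed
$ qemu-system-x86_64 -device virtio-blk-pci,help | grep scsi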
[PATCH] Add page_per_vq flag to the 'driver' element of virtio devices
by Gavi Teitz
https://bugzilla.redhat.com/show_bug.cgi?id=1925363
Add support for setting the page-per-vq flag, which is important for
vDPA performance with vhost-user.
Signed-off-by: Gavi Teitz <gavi(a)nvidia.com>
---
docs/formatdomain.rst | 11 +++++-
docs/schemas/domaincommon.rng | 5 +++
src/conf/domain_conf.c | 15 ++++++++
src/conf/domain_conf.h | 1 +
src/qemu/qemu_capabilities.c | 2 +
src/qemu/qemu_capabilities.h | 1 +
src/qemu/qemu_command.c | 5 +++
src/qemu/qemu_hotplug.c | 1 +
tests/qemuxml2argvdata/net-virtio-page-per-vq.args | 31 ++++++++++++++++
tests/qemuxml2argvdata/net-virtio-page-per-vq.xml | 29 +++++++++++++++
tests/qemuxml2argvtest.c | 2 +
.../qemuxml2xmloutdata/net-virtio-page-per-vq.xml | 43 ++++++++++++++++++++++
tests/qemuxml2xmltest.c | 2 +
13 files changed, 147 insertions(+), 1 deletion(-)
create mode 100644 tests/qemuxml2argvdata/net-virtio-page-per-vq.args
create mode 100644 tests/qemuxml2argvdata/net-virtio-page-per-vq.xml
create mode 100644 tests/qemuxml2xmloutdata/net-virtio-page-per-vq.xml
diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst
index 282176c..f5a9c70 100644
--- a/docs/formatdomain.rst
+++ b/docs/formatdomain.rst
@@ -5120,7 +5120,7 @@ Setting NIC driver-specific options
<source network='default'/>
<target dev='vnet1'/>
<model type='virtio'/>
- <driver name='vhost' txmode='iothread' ioeventfd='on' event_idx='off' queues='5' rx_queue_size='256' tx_queue_size='256'>
+ <driver name='vhost' txmode='iothread' ioeventfd='on' event_idx='off' queues='5' rx_queue_size='256' tx_queue_size='256' page_per_vq='on'>
<host csum='off' gso='off' tso4='off' tso6='off' ecn='off' ufo='off' mrg_rxbuf='off'/>
<guest csum='off' tso4='off' tso6='off' ecn='off' ufo='off'/>
</driver>
@@ -5215,6 +5215,15 @@ following attributes are available for the ``"virtio"`` NIC driver:
only for ``vhostuser`` type. :since:`Since 3.7.0 (QEMU and KVM only)`
**In general you should leave this option alone, unless you are very certain
you know what you are doing.**
+``page_per_vq``
+ This optional attribute controls the layout of the notification capabilities
+ exposed to the guest. When enabled, each virtio queue will have a dedicated
+  page on the device BAR exposed to the guest. It is recommended when vDPA is
+  enabled on the hypervisor, as it allows the notification area to be mapped to
+  the physical device, which is only supported at page granularity. The default
+  is determined by QEMU; currently it is off. :since:`Since 2.8.0 (QEMU only)`
+ **In general you should leave this option alone, unless you are very certain
+ you know what you are doing.**
virtio options
For virtio interfaces, `Virtio-specific options <#elementsVirtio>`__ can also
be set. ( :since:`Since 3.5.0` )
diff --git a/docs/schemas/domaincommon.rng b/docs/schemas/domaincommon.rng
index a2e5c50..e61ad67 100644
--- a/docs/schemas/domaincommon.rng
+++ b/docs/schemas/domaincommon.rng
@@ -3463,6 +3463,11 @@
</attribute>
</optional>
<optional>
+ <attribute name="page_per_vq">
+ <ref name="virOnOff"/>
+ </attribute>
+ </optional>
+ <optional>
<attribute name="txmode">
<choice>
<value>iothread</value>
diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
index 9d98f48..0350fde 100644
--- a/src/conf/domain_conf.c
+++ b/src/conf/domain_conf.c
@@ -10446,6 +10446,7 @@ virDomainNetDefParseXML(virDomainXMLOption *xmlopt,
g_autofree char *queues = NULL;
g_autofree char *rx_queue_size = NULL;
g_autofree char *tx_queue_size = NULL;
+ g_autofree char *page_per_vq = NULL;
g_autofree char *filter = NULL;
g_autofree char *internal = NULL;
g_autofree char *mode = NULL;
@@ -10615,6 +10616,7 @@ virDomainNetDefParseXML(virDomainXMLOption *xmlopt,
queues = virXMLPropString(cur, "queues");
rx_queue_size = virXMLPropString(cur, "rx_queue_size");
tx_queue_size = virXMLPropString(cur, "tx_queue_size");
+ page_per_vq = virXMLPropString(cur, "page_per_vq");
if (virDomainVirtioOptionsParseXML(cur, &def->virtio) < 0)
goto error;
@@ -11041,6 +11043,15 @@ virDomainNetDefParseXML(virDomainXMLOption *xmlopt,
}
def->driver.virtio.tx_queue_size = q;
}
+ if (page_per_vq) {
+ if ((val = virTristateSwitchTypeFromString(page_per_vq)) <= 0) {
+ virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
+ _("unknown page_per_vq mode '%s'"),
+ page_per_vq);
+ goto error;
+ }
+ def->driver.virtio.page_per_vq = val;
+ }
if ((tmpNode = virXPathNode("./driver/host", ctxt))) {
if (virXMLPropTristateSwitch(tmpNode, "csum", VIR_XML_PROP_NONE,
@@ -25453,6 +25464,10 @@ virDomainVirtioNetDriverFormat(virBuffer *buf,
if (def->driver.virtio.tx_queue_size)
virBufferAsprintf(buf, " tx_queue_size='%u'",
def->driver.virtio.tx_queue_size);
+ if (def->driver.virtio.page_per_vq) {
+ virBufferAsprintf(buf, " page_per_vq='%s'",
+ virTristateSwitchTypeToString(def->driver.virtio.page_per_vq));
+ }
virDomainVirtioOptionsFormat(buf, def->virtio);
}
diff --git a/src/conf/domain_conf.h b/src/conf/domain_conf.h
index 85c318d..7aab565 100644
--- a/src/conf/domain_conf.h
+++ b/src/conf/domain_conf.h
@@ -1027,6 +1027,7 @@ struct _virDomainNetDef {
virDomainNetVirtioTxModeType txmode;
virTristateSwitch ioeventfd;
virTristateSwitch event_idx;
+ virTristateSwitch page_per_vq;
unsigned int queues; /* Multiqueue virtio-net */
unsigned int rx_queue_size;
unsigned int tx_queue_size;
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index 7971a9c..e5646cb 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -629,6 +629,7 @@ VIR_ENUM_IMPL(virQEMUCaps,
/* 400 */
"compat-deprecated",
"acpi-index",
+ "virtio-net.page_per_vq",
);
@@ -1405,6 +1406,7 @@ static struct virQEMUCapsDevicePropsFlags virQEMUCapsDevicePropsVirtioNet[] = {
{ "event_idx", QEMU_CAPS_VIRTIO_NET_EVENT_IDX, NULL },
{ "rx_queue_size", QEMU_CAPS_VIRTIO_NET_RX_QUEUE_SIZE, NULL },
{ "tx_queue_size", QEMU_CAPS_VIRTIO_NET_TX_QUEUE_SIZE, NULL },
+ { "page_per_vq", QEMU_CAPS_VIRTIO_NET_PAGE_PER_VQ, NULL },
{ "host_mtu", QEMU_CAPS_VIRTIO_NET_HOST_MTU, NULL },
{ "disable-legacy", QEMU_CAPS_VIRTIO_PCI_DISABLE_LEGACY, NULL },
{ "iommu_platform", QEMU_CAPS_VIRTIO_PCI_IOMMU_PLATFORM, NULL },
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index f54aad5..75feed4 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -609,6 +609,7 @@ typedef enum { /* virQEMUCapsFlags grouping marker for syntax-check */
/* 400 */
QEMU_CAPS_COMPAT_DEPRECATED, /* -compat deprecated-(input|output) is supported */
QEMU_CAPS_ACPI_INDEX, /* PCI device 'acpi-index' property */
+ QEMU_CAPS_VIRTIO_NET_PAGE_PER_VQ, /* virtio-net-*.page_per_vq */
QEMU_CAPS_LAST /* this must always be the last item */
} virQEMUCapsFlags;
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index d7f1c71..7b5834f 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -3630,6 +3630,11 @@ qemuBuildNicDevStr(virDomainDef *def,
if (net->driver.virtio.tx_queue_size)
virBufferAsprintf(&buf, ",tx_queue_size=%u", net->driver.virtio.tx_queue_size);
+ if (net->driver.virtio.page_per_vq) {
+ virBufferAsprintf(&buf, ",page-per-vq=%s",
+ virTristateSwitchTypeToString(net->driver.virtio.page_per_vq));
+ }
+
if (net->mtu)
virBufferAsprintf(&buf, ",host_mtu=%u", net->mtu);
diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
index 4344edc..f79c3d8 100644
--- a/src/qemu/qemu_hotplug.c
+++ b/src/qemu/qemu_hotplug.c
@@ -3578,6 +3578,7 @@ qemuDomainChangeNet(virQEMUDriver *driver,
olddev->driver.virtio.queues != newdev->driver.virtio.queues ||
olddev->driver.virtio.rx_queue_size != newdev->driver.virtio.rx_queue_size ||
olddev->driver.virtio.tx_queue_size != newdev->driver.virtio.tx_queue_size ||
+ olddev->driver.virtio.page_per_vq != newdev->driver.virtio.page_per_vq ||
olddev->driver.virtio.host.csum != newdev->driver.virtio.host.csum ||
olddev->driver.virtio.host.gso != newdev->driver.virtio.host.gso ||
olddev->driver.virtio.host.tso4 != newdev->driver.virtio.host.tso4 ||
diff --git a/tests/qemuxml2argvdata/net-virtio-page-per-vq.args b/tests/qemuxml2argvdata/net-virtio-page-per-vq.args
new file mode 100644
index 0000000..d7db71f
--- /dev/null
+++ b/tests/qemuxml2argvdata/net-virtio-page-per-vq.args
@@ -0,0 +1,31 @@
+LC_ALL=C \
+PATH=/bin \
+HOME=/tmp/lib/domain--1-QEMUGuest1 \
+USER=test \
+LOGNAME=test \
+XDG_DATA_HOME=/tmp/lib/domain--1-QEMUGuest1/.local/share \
+XDG_CACHE_HOME=/tmp/lib/domain--1-QEMUGuest1/.cache \
+XDG_CONFIG_HOME=/tmp/lib/domain--1-QEMUGuest1/.config \
+QEMU_AUDIO_DRV=none \
+/usr/bin/qemu-system-i386 \
+-name QEMUGuest1 \
+-S \
+-machine pc,accel=tcg,usb=off,dump-guest-core=off \
+-m 214 \
+-realtime mlock=off \
+-smp 1,sockets=1,cores=1,threads=1 \
+-uuid c7a5fdbd-edaf-9455-926a-d65c16db1809 \
+-display none \
+-no-user-config \
+-nodefaults \
+-chardev socket,id=charmonitor,path=/tmp/lib/domain--1-QEMUGuest1/monitor.sock,server=on,wait=off \
+-mon chardev=charmonitor,id=monitor,mode=control \
+-rtc base=utc \
+-no-shutdown \
+-no-acpi \
+-usb \
+-drive file=/dev/HostVG/QEMUGuest1,format=raw,if=none,id=drive-ide0-0-0 \
+-device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 \
+-netdev user,id=hostnet0 \
+-device virtio-net-pci,page-per-vq=on,netdev=hostnet0,id=net0,mac=00:11:22:33:44:55,bus=pci.0,addr=0x3 \
+-device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x4
diff --git a/tests/qemuxml2argvdata/net-virtio-page-per-vq.xml b/tests/qemuxml2argvdata/net-virtio-page-per-vq.xml
new file mode 100644
index 0000000..e22ecd6
--- /dev/null
+++ b/tests/qemuxml2argvdata/net-virtio-page-per-vq.xml
@@ -0,0 +1,29 @@
+<domain type='qemu'>
+ <name>QEMUGuest1</name>
+ <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+ <memory unit='KiB'>219100</memory>
+ <currentMemory unit='KiB'>219100</currentMemory>
+ <vcpu placement='static'>1</vcpu>
+ <os>
+ <type arch='i686' machine='pc'>hvm</type>
+ <boot dev='hd'/>
+ </os>
+ <clock offset='utc'/>
+ <on_poweroff>destroy</on_poweroff>
+ <on_reboot>restart</on_reboot>
+ <on_crash>destroy</on_crash>
+ <devices>
+ <emulator>/usr/bin/qemu-system-i386</emulator>
+ <disk type='block' device='disk'>
+ <source dev='/dev/HostVG/QEMUGuest1'/>
+ <target dev='hda' bus='ide'/>
+ </disk>
+ <controller type='usb' index='0'/>
+ <interface type='user'>
+ <mac address='00:11:22:33:44:55'/>
+ <model type='virtio'/>
+ <driver page_per_vq='on'/>
+ </interface>
+ <memballoon model='virtio'/>
+ </devices>
+</domain>
diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c
index f0efe98..f6b8f34 100644
--- a/tests/qemuxml2argvtest.c
+++ b/tests/qemuxml2argvtest.c
@@ -1654,6 +1654,8 @@ mymain(void)
QEMU_CAPS_VIRTIO_NET_TX_QUEUE_SIZE);
DO_TEST_PARSE_ERROR("net-virtio-rxqueuesize-invalid-size",
QEMU_CAPS_VIRTIO_NET_RX_QUEUE_SIZE);
+ DO_TEST("net-virtio-page-per-vq",
+ QEMU_CAPS_VIRTIO_NET_PAGE_PER_VQ);
DO_TEST("net-virtio-teaming",
QEMU_CAPS_VIRTIO_NET_FAILOVER,
QEMU_CAPS_DEVICE_VFIO_PCI);
diff --git a/tests/qemuxml2xmloutdata/net-virtio-page-per-vq.xml b/tests/qemuxml2xmloutdata/net-virtio-page-per-vq.xml
new file mode 100644
index 0000000..a35ed8a
--- /dev/null
+++ b/tests/qemuxml2xmloutdata/net-virtio-page-per-vq.xml
@@ -0,0 +1,43 @@
+<domain type='qemu'>
+ <name>QEMUGuest1</name>
+ <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+ <memory unit='KiB'>219100</memory>
+ <currentMemory unit='KiB'>219100</currentMemory>
+ <vcpu placement='static'>1</vcpu>
+ <os>
+ <type arch='i686' machine='pc'>hvm</type>
+ <boot dev='hd'/>
+ </os>
+ <clock offset='utc'/>
+ <on_poweroff>destroy</on_poweroff>
+ <on_reboot>restart</on_reboot>
+ <on_crash>destroy</on_crash>
+ <devices>
+ <emulator>/usr/bin/qemu-system-i386</emulator>
+ <disk type='block' device='disk'>
+ <driver name='qemu' type='raw'/>
+ <source dev='/dev/HostVG/QEMUGuest1'/>
+ <target dev='hda' bus='ide'/>
+ <address type='drive' controller='0' bus='0' target='0' unit='0'/>
+ </disk>
+ <controller type='usb' index='0'>
+ <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
+ </controller>
+ <controller type='pci' index='0' model='pci-root'/>
+ <controller type='ide' index='0'>
+ <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/>
+ </controller>
+ <interface type='user'>
+ <mac address='00:11:22:33:44:55'/>
+ <model type='virtio'/>
+ <driver page_per_vq='on'/>
+ <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
+ </interface>
+ <input type='mouse' bus='ps2'/>
+ <input type='keyboard' bus='ps2'/>
+ <audio id='1' type='none'/>
+ <memballoon model='virtio'>
+ <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
+ </memballoon>
+ </devices>
+</domain>
diff --git a/tests/qemuxml2xmltest.c b/tests/qemuxml2xmltest.c
index c37de0c..2bd6eea 100644
--- a/tests/qemuxml2xmltest.c
+++ b/tests/qemuxml2xmltest.c
@@ -450,6 +450,8 @@ mymain(void)
DO_TEST("net-virtio-rxtxqueuesize",
QEMU_CAPS_VIRTIO_NET_RX_QUEUE_SIZE,
QEMU_CAPS_VIRTIO_NET_TX_QUEUE_SIZE);
+ DO_TEST("net-virtio-page-per-vq",
+ QEMU_CAPS_VIRTIO_NET_PAGE_PER_VQ);
DO_TEST("net-virtio-teaming",
QEMU_CAPS_VIRTIO_NET_FAILOVER,
QEMU_CAPS_DEVICE_VFIO_PCI);
--
1.8.3.1
[PATCH v3 RESEND for v7.4.0 00/14] Introduce virtio-mem <memory/> model
by Michal Privoznik
This is a rebased version of the v3 I sent about a month ago:
https://listman.redhat.com/archives/libvir-list/2021-March/msg00823.html
v2 here:
https://listman.redhat.com/archives/libvir-list/2021-February/msg00961.html
My intent is to merge these only after the upcoming release so that we
have the biggest test window possible.
diff to v2:
- Dropped code that forbade use of virtio-mem and memballoon at the same
time;
- This meant that I had to adjust memory accounting and
qemuDomainSetMemoryFlags(); see patches 11/15 and 12/15, which are new.
- Fixed small nits raised by Peter in his review of v2
Michal Prívozník (14):
virhostmem: Introduce virHostMemGetTHPSize()
qemu_process: Deduplicate code in qemuProcessNeedHugepagesPath()
qemu_process: Drop needless check in
qemuProcessNeedMemoryBackingPath()
qemu_capabilities: Introduce QEMU_CAPS_DEVICE_VIRTIO_MEM_PCI
conf: Introduce virtio-mem <memory/> model
qemu: Build command line for virtio-mem
qemu: Wire up <memory/> live update
qemu: Wire up MEMORY_DEVICE_SIZE_CHANGE event
qemu: Refresh the actual size of virtio-mem on monitor reconnect
qemu: Account for both memballoon and virtio-mem
qemuDomainSetMemoryFlags: Take virtio-mem into consideration
virsh: Introduce update-memory-device command
news: document recent virtio memory addition
kbase: Document virtio-mem
NEWS.rst | 16 ++
docs/formatdomain.rst | 45 +++-
docs/kbase/index.rst | 4 +
docs/kbase/memorydevices.rst | 150 +++++++++++
docs/kbase/meson.build | 1 +
docs/manpages/virsh.rst | 30 +++
docs/schemas/domaincommon.rng | 16 ++
examples/c/misc/event-test.c | 17 ++
include/libvirt/libvirt-domain.h | 23 ++
src/conf/domain_conf.c | 115 ++++++++-
src/conf/domain_conf.h | 15 ++
src/conf/domain_event.c | 84 +++++++
src/conf/domain_event.h | 10 +
src/conf/domain_validate.c | 39 +++
src/libvirt_private.syms | 5 +
src/qemu/qemu_alias.c | 10 +-
src/qemu/qemu_capabilities.c | 2 +
src/qemu/qemu_capabilities.h | 1 +
src/qemu/qemu_command.c | 13 +-
src/qemu/qemu_domain.c | 50 +++-
src/qemu/qemu_domain.h | 1 +
src/qemu/qemu_domain_address.c | 38 ++-
src/qemu/qemu_driver.c | 233 +++++++++++++++++-
src/qemu/qemu_hotplug.c | 18 ++
src/qemu/qemu_hotplug.h | 5 +
src/qemu/qemu_monitor.c | 37 +++
src/qemu/qemu_monitor.h | 28 +++
src/qemu/qemu_monitor_json.c | 97 ++++++--
src/qemu/qemu_monitor_json.h | 5 +
src/qemu/qemu_process.c | 118 ++++++++-
src/qemu/qemu_validate.c | 8 +
src/remote/remote_daemon_dispatch.c | 30 +++
src/remote/remote_driver.c | 32 +++
src/remote/remote_protocol.x | 14 +-
src/remote_protocol-structs | 7 +
src/security/security_apparmor.c | 1 +
src/security/security_dac.c | 2 +
src/security/security_selinux.c | 2 +
src/util/virhostmem.c | 63 +++++
src/util/virhostmem.h | 3 +
tests/domaincapsmock.c | 9 +
.../caps_5.1.0.x86_64.xml | 1 +
.../caps_5.2.0.x86_64.xml | 1 +
.../caps_6.0.0.x86_64.xml | 1 +
...mory-hotplug-virtio-mem.x86_64-latest.args | 41 +++
.../memory-hotplug-virtio-mem.xml | 67 +++++
tests/qemuxml2argvtest.c | 1 +
...emory-hotplug-virtio-mem.x86_64-latest.xml | 1 +
tests/qemuxml2xmltest.c | 1 +
tools/virsh-domain.c | 169 +++++++++++++
50 files changed, 1613 insertions(+), 67 deletions(-)
create mode 100644 docs/kbase/memorydevices.rst
create mode 100644 tests/qemuxml2argvdata/memory-hotplug-virtio-mem.x86_64-latest.args
create mode 100644 tests/qemuxml2argvdata/memory-hotplug-virtio-mem.xml
create mode 120000 tests/qemuxml2xmloutdata/memory-hotplug-virtio-mem.x86_64-latest.xml
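For orientation, here is a rough sketch of the guest XML shape added by
"conf: Introduce virtio-mem <memory/> model"; the element layout, sizes and
comments below are illustrative assumptions rather than text copied from the
patches, so see the docs/formatdomain.rst changes in this series for the
authoritative format:

  <!-- hypothetical virtio-mem device: 2 GiB maximum, 1 GiB currently
       requested, exposed to guest NUMA node 0 in 2 MiB blocks -->
  <memory model='virtio-mem'>
    <target>
      <node>0</node>
      <block unit='KiB'>2048</block>
      <size unit='KiB'>2097152</size>
      <requested unit='KiB'>1048576</requested>
    </target>
  </memory>

The requested size of a running guest would then be adjusted with the new
virsh command introduced by this series, e.g. "virsh update-memory-device
<domain> --requested-size 1048576"; the --requested-size option name is
assumed here from the docs/manpages/virsh.rst addition, and the value is a
scaled integer defaulting to KiB.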
--
2.26.3
[PATCH] Deprecate pmem=on with non-DAX capable backend file
by Igor Mammedov
It is not safe to pretend that an emulated NVDIMM supports
persistence while the backend actually failed to enable it
and fell back to a non-persistent mapping.
Instead of falling back, QEMU should be stricter and
error out with a clear message that this is not supported.
So if the user asks for persistence (pmem=on), they should
store the backing file on NVDIMM.
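For context, a minimal sketch of the kind of setup this deprecation targets;
the paths, sizes and IDs below are illustrative placeholders, not taken from
the patch:

  # pmem=on promises persistence, so mem-path must live on DAX-capable storage
  qemu-system-x86_64 \
    -machine pc,nvdimm=on \
    -m 2G,slots=2,maxmem=8G \
    -object memory-backend-file,id=mem1,share=on,mem-path=/mnt/dax/nvdimm.img,size=4G,pmem=on \
    -device nvdimm,id=nvdimm1,memdev=mem1

With this change, QEMU additionally reports that such a configuration is
deprecated when the backing file does not support direct mapping; the user is
expected to either drop pmem=on or move the backing file onto DAX-capable
NVDIMM storage.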
Signed-off-by: Igor Mammedov <imammedo(a)redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd(a)redhat.com>
---
v2:
rephrase deprecation comment and warning message
(Philippe Mathieu-Daudé <philmd(a)redhat.com>)
---
docs/system/deprecated.rst | 17 +++++++++++++++++
util/mmap-alloc.c | 3 +++
2 files changed, 20 insertions(+)
diff --git a/docs/system/deprecated.rst b/docs/system/deprecated.rst
index bacd76d7a5..e79fb02b3a 100644
--- a/docs/system/deprecated.rst
+++ b/docs/system/deprecated.rst
@@ -327,6 +327,23 @@ The Raspberry Pi machines come in various models (A, A+, B, B+). To be able
to distinguish which model QEMU is implementing, the ``raspi2`` and ``raspi3``
machines have been renamed ``raspi2b`` and ``raspi3b``.
+Backend options
+---------------
+
+Using non-persistent backing file with pmem=on (since 6.0)
+''''''''''''''''''''''''''''''''''''''''''''''''''''''''''
+
+This option is used when ``memory-backend-file`` is consumed by an emulated
+NVDIMM device. However, enabling the ``memory-backend-file.pmem`` option when
+the backing file is (a) not DAX capable or (b) not on a filesystem that
+supports direct mapping of persistent memory is not safe and may lead to data
+loss or corruption in case of a host crash.
+Options are:
+ - modify the VM configuration to set ``pmem=off`` to continue using fake NVDIMM
+ (without persistence guarantees) with the backing file on non-DAX storage
+ - move the backing file to NVDIMM storage and keep ``pmem=on``
+ (to have NVDIMM with persistence guarantees).
+
Device options
--------------
diff --git a/util/mmap-alloc.c b/util/mmap-alloc.c
index 27dcccd8ec..0388cc3be2 100644
--- a/util/mmap-alloc.c
+++ b/util/mmap-alloc.c
@@ -20,6 +20,7 @@
#include "qemu/osdep.h"
#include "qemu/mmap-alloc.h"
#include "qemu/host-utils.h"
+#include "qemu/error-report.h"
#define HUGETLBFS_MAGIC 0x958458f6
@@ -166,6 +167,8 @@ void *qemu_ram_mmap(int fd,
"crash.\n", file_name);
g_free(proc_link);
g_free(file_name);
+ warn_report("Using non DAX backing file with 'pmem=on' option"
+ " is deprecated");
}
/*
* if map failed with MAP_SHARED_VALIDATE | MAP_SYNC,
--
2.27.0