[libvirt] [PATCH] audit: add audit information about panic devices
by Chen Hanxiao
From: Chen Hanxiao <chenhanxiao(a)gmail.com>
This patch adds audit information for panic notifier devices.
Signed-off-by: Chen Hanxiao <chenhanxiao(a)gmail.com>
---
docs/auditlog.html.in | 15 +++++++++++++++
src/conf/domain_audit.c | 38 ++++++++++++++++++++++++++++++++++++++
src/conf/domain_audit.h | 4 ++++
src/libvirt_private.syms | 1 +
4 files changed, 58 insertions(+)
diff --git a/docs/auditlog.html.in b/docs/auditlog.html.in
index 0c778aa..45464af 100644
--- a/docs/auditlog.html.in
+++ b/docs/auditlog.html.in
@@ -371,5 +371,20 @@
<dd>Path of the backing character device for given emulated device</dd>
</dl>
+
+ <h4><a name="typeresourcepanic">Panic notifier</a></h4>
+ <p>
+ The <code>msg</code> field will include the following sub-fields
+ </p>
+
+ <dl>
+ <dt><code>resrc</code></dt>
+ <dd>The type of resource assigned. Set to <code>panic</code></dd>
+ <dt><code>reason</code></dt>
+ <dd>The reason which caused the resource to be assigned to happen</dd>
+ <dt><code>model</code></dt>
+ <dd>The model of the panic notifier device</dd>
+ </dl>
+
</body>
</html>
diff --git a/src/conf/domain_audit.c b/src/conf/domain_audit.c
index fd20ace..e48a63d 100644
--- a/src/conf/domain_audit.c
+++ b/src/conf/domain_audit.c
@@ -893,6 +893,9 @@ virDomainAuditStart(virDomainObjPtr vm, const char *reason, bool success)
for (i = 0; i < vm->def->nshmems; i++)
virDomainAuditShmem(vm, vm->def->shmems[i], "start", true);
+ for (i = 0; i < vm->def->npanics; i++)
+ virDomainAuditPanic(vm, vm->def->panics[i], "start", true);
+
virDomainAuditMemory(vm, 0, virDomainDefGetMemoryTotal(vm->def),
"start", true);
virDomainAuditVcpu(vm, 0, virDomainDefGetVcpus(vm->def), "start", true);
@@ -1006,3 +1009,38 @@ virDomainAuditShmem(virDomainObjPtr vm,
VIR_FREE(shmem);
return;
}
+
+void
+virDomainAuditPanic(virDomainObjPtr vm,
+ virDomainPanicDefPtr def,
+ const char *reason,
+ bool success)
+{
+ char uuidstr[VIR_UUID_STRING_BUFLEN];
+ char *vmname = virAuditEncode("vm", vm->def->name);
+ const char *panic_model = virDomainPanicModelTypeToString(def->model);
+ char *model = virAuditEncode("model", VIR_AUDIT_STR(panic_model));
+ const char *virt = virDomainVirtTypeToString(vm->def->virtType);
+
+ virUUIDFormat(vm->def->uuid, uuidstr);
+
+ if (!vmname || !model) {
+ VIR_WARN("OOM while encoding audit message");
+ goto cleanup;
+ }
+
+ if (!virt) {
+ VIR_WARN("Unexpected virt type %d while encoding audit message",
+ vm->def->virtType);
+ virt = "?";
+ }
+
+ VIR_AUDIT(VIR_AUDIT_RECORD_RESOURCE, success,
+ "virt=%s resrc=panic reason=%s %s uuid=%s %s",
+ virt, reason, vmname, uuidstr, model);
+
+ cleanup:
+ VIR_FREE(vmname);
+ VIR_FREE(model);
+ return;
+}
diff --git a/src/conf/domain_audit.h b/src/conf/domain_audit.h
index 8cb585d..10ecc2a 100644
--- a/src/conf/domain_audit.h
+++ b/src/conf/domain_audit.h
@@ -133,6 +133,10 @@ void virDomainAuditShmem(virDomainObjPtr vm,
virDomainShmemDefPtr def,
const char *reason, bool success)
ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3);
+void virDomainAuditPanic(virDomainObjPtr vm,
+ virDomainPanicDefPtr def,
+ const char *reason, bool success)
+ ATTRIBUTE_NONNULL(1) ATTRIBUTE_NONNULL(2) ATTRIBUTE_NONNULL(3);
#endif /* __VIR_DOMAIN_AUDIT_H__ */
diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
index 923afd1..94ec7cb 100644
--- a/src/libvirt_private.syms
+++ b/src/libvirt_private.syms
@@ -146,6 +146,7 @@ virDomainAuditIOThread;
virDomainAuditMemory;
virDomainAuditNet;
virDomainAuditNetDevice;
+virDomainAuditPanic;
virDomainAuditRedirdev;
virDomainAuditRNG;
virDomainAuditSecurityLabel;
--
1.8.3.1
8 years, 1 month
[libvirt] [PATCH 00/11] cleanups and improvements for video device code
by Pavel Hrdina
Pavel Hrdina (11):
tests: fix some QXL capability combinations that don't make sense
qemu_capabilities: join capabilities for qxl and qxl-vga devices
qemu_capabilities: mark QEMU_CAPS_VGA_QXL capability as deprecated
qemu_domain: move video validation out of qemu_command
qemu_process: move video validation out of qemu_command
qemu_capabilities: rename QEMU_CAPS_VIRTIO_GPU_VIRGL
qemu_command: separate code for video device via -vga attribute
qemu_command: cleanup qemuBuildVideoCommandLine
qemu_capabilities: check for existence of virtio-vga
qemu_command: properly detect which model to use for video device
qemu_command: add support to use virtio as secondary video device
docs/formatdomain.html.in | 3 +-
src/qemu/qemu_capabilities.c | 23 +-
src/qemu/qemu_capabilities.h | 15 +-
src/qemu/qemu_command.c | 335 +++++++++------------
src/qemu/qemu_domain.c | 70 +++++
src/qemu/qemu_domain.h | 3 +
src/qemu/qemu_domain_address.c | 6 -
src/qemu/qemu_process.c | 54 +++-
.../qemu_2.6.0-gicv2-virt.aarch64.xml | 1 -
.../qemu_2.6.0-gicv3-virt.aarch64.xml | 1 -
tests/domaincapsschemadata/qemu_2.6.0.aarch64.xml | 1 -
tests/domaincapsschemadata/qemu_2.6.0.ppc64le.xml | 1 -
.../qemucapabilitiesdata/caps_1.2.2.x86_64.replies | 70 +----
tests/qemucapabilitiesdata/caps_1.2.2.x86_64.xml | 4 -
.../qemucapabilitiesdata/caps_1.3.1.x86_64.replies | 76 +----
tests/qemucapabilitiesdata/caps_1.3.1.x86_64.xml | 4 -
.../qemucapabilitiesdata/caps_1.4.2.x86_64.replies | 74 +----
tests/qemucapabilitiesdata/caps_1.4.2.x86_64.xml | 4 -
.../qemucapabilitiesdata/caps_1.5.3.x86_64.replies | 74 +----
tests/qemucapabilitiesdata/caps_1.5.3.x86_64.xml | 4 -
.../qemucapabilitiesdata/caps_1.6.0.x86_64.replies | 74 +----
tests/qemucapabilitiesdata/caps_1.6.0.x86_64.xml | 4 -
.../qemucapabilitiesdata/caps_1.7.0.x86_64.replies | 74 +----
tests/qemucapabilitiesdata/caps_1.7.0.x86_64.xml | 4 -
.../qemucapabilitiesdata/caps_2.1.1.x86_64.replies | 74 +----
tests/qemucapabilitiesdata/caps_2.1.1.x86_64.xml | 4 -
.../qemucapabilitiesdata/caps_2.4.0.x86_64.replies | 107 ++-----
tests/qemucapabilitiesdata/caps_2.4.0.x86_64.xml | 6 +-
.../qemucapabilitiesdata/caps_2.5.0.x86_64.replies | 117 +++----
tests/qemucapabilitiesdata/caps_2.5.0.x86_64.xml | 6 +-
.../caps_2.6.0-gicv2.aarch64.replies | 43 ++-
.../caps_2.6.0-gicv2.aarch64.xml | 1 -
.../caps_2.6.0-gicv3.aarch64.replies | 43 ++-
.../caps_2.6.0-gicv3.aarch64.xml | 1 -
.../caps_2.6.0.ppc64le.replies | 43 ++-
tests/qemucapabilitiesdata/caps_2.6.0.ppc64le.xml | 2 +-
.../qemucapabilitiesdata/caps_2.6.0.x86_64.replies | 117 +++----
tests/qemucapabilitiesdata/caps_2.6.0.x86_64.xml | 6 +-
.../qemucapabilitiesdata/caps_2.7.0.x86_64.replies | 122 +++-----
tests/qemucapabilitiesdata/caps_2.7.0.x86_64.xml | 6 +-
tests/qemuhelptest.c | 4 -
.../qemuxml2argv-pcie-root-port.args | 5 +-
.../qemuxml2argv-pcie-switch-downstream-port.args | 5 +-
.../qemuxml2argv-pcie-switch-upstream-port.args | 5 +-
.../qemuxml2argv-pcihole64-q35.args | 5 +-
.../qemuxml2argv-q35-usb2-multi.args | 5 +-
.../qemuxml2argv-q35-usb2-reorder.args | 5 +-
tests/qemuxml2argvdata/qemuxml2argv-q35-usb2.args | 5 +-
tests/qemuxml2argvdata/qemuxml2argv-q35.args | 5 +-
.../qemuxml2argv-video-virtio-gpu-device.args | 2 +-
.../qemuxml2argv-video-virtio-gpu-sec.args | 25 ++
.../qemuxml2argv-video-virtio-gpu-sec.xml | 36 +++
.../qemuxml2argv-video-virtio-gpu-spice-gl.args | 2 +-
.../qemuxml2argv-video-virtio-gpu-virgl.args | 2 +-
.../qemuxml2argv-video-virtio-vga.args | 24 ++
...evice.xml => qemuxml2argv-video-virtio-vga.xml} | 11 +-
tests/qemuxml2argvtest.c | 152 +++++-----
tests/qemuxml2xmltest.c | 18 +-
58 files changed, 778 insertions(+), 1215 deletions(-)
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-video-virtio-gpu-sec.args
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-video-virtio-gpu-sec.xml
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-video-virtio-vga.args
rename tests/qemuxml2argvdata/{qemuxml2argv-video-qxl-sec-nodevice.xml => qemuxml2argv-video-virtio-vga.xml} (77%)
--
2.10.0
8 years, 1 month
[libvirt] [PATCH v2 00/20] Split parsing and defining logic of daemon's logging
by Erik Skultety
v2 of the original series
https://www.redhat.com/archives/libvir-list/2016-May/msg00229.html
since v1:
- as Cole pointed out in 20/38 of the original series, the patches were not
designed in an elegant way and were hard to review, so the whole series has
been reworked:
-> first, the existing methods that combine parsing and defining logic
and which should be dropped are renamed to more accurate names
-> all the methods necessary to achieve the "split" are introduced
gradually, building on each other
-> finally, all the callers switch to the new logic introduced in the early
patches in a transparent way
-> all the original poorly named methods are completely dropped
- also, the original series introduced a new set of API locks to cope with
two concurrent setters: while setter1 was preparing its local set of outputs
to replace the existing global one, setter2 might replace the global set with
its own copy. That invalidated all fds in setter1's set, because the original
series *copied* (rather than duplicated) fds, so a copied fd was invalidated
when setter2 issued a reset.
This series, however, duplicates the fds of the file-based outputs (the ones
that should remain open). So even if setter2 replaces the original set
with its copy and calls reset, effectively closing all fds, it does not matter
for setter1, since closing an fd only decrements the number of references to
the underlying open file.
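The fd argument above can be illustrated with a small, self-contained sketch.
It is not code from this series: demo_dup_survives_close() is a hypothetical
helper, and a pipe stands in for a file-based log output. It only demonstrates
the POSIX behaviour the paragraph relies on, namely that a dup()ed fd stays
usable after the original fd is closed.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper, not part of virlog: returns 1 if a duplicated fd is
 * still usable after the original fd has been closed, 0 if it is not, and
 * -1 on setup failure. A pipe stands in for a file-based log output. */
int demo_dup_survives_close(void)
{
    int pipefd[2];
    int copy;
    char buf[8] = { 0 };
    ssize_t n;

    if (pipe(pipefd) < 0)
        return -1;

    /* "setter1" takes its own reference to the write end. */
    copy = dup(pipefd[1]);
    if (copy < 0) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }
    assert(copy != pipefd[1]);

    /* "setter2" resets the global set, closing the original fd. */
    close(pipefd[1]);

    /* The duplicate still refers to the same open file description. */
    n = write(copy, "ok", 2);
    if (n == 2)
        n = read(pipefd[0], buf, sizeof(buf) - 1);

    close(copy);
    close(pipefd[0]);
    return n == 2 && strcmp(buf, "ok") == 0;
}
```

With a plain integer copy of the fd instead of dup(), the write after the
close would fail with EBADF, which is the failure mode the original series
needed locks to avoid.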
Erik Skultety (20):
virlog: Rename virLogParse* to virLogParseAndDefine*
virlog: Introduce virLogOutputNew
virlog: Introduce virLogFilterNew
virlog: Introduce virLogFindOutput
virlog: Introduce virLogDefineOutputs
virlog: Introduce virLogDefineFilters
virlog: Introduce virLogNewOutputTo* as a replacement for
virLogAddOutputTo*
virlog: Take a special care of syslog when setting new set of log
outputs
virlog: Introduce virLogParseOutput
virlog: Introduce virLogParseFilter
virlog: Introduce virLogParseOutputs
virlog: Introduce virLogParseFilters
virlog: Introduce virLogSetOutputs
virlog: Introduce virLogSetFilters
daemon: Split output parsing and output defining
daemon: Split filter parsing and filter defining
virlog: Remove functions that aren't used anywhere anymore
virlog: Make some of the methods static
virlog: Store the journald fd within the output object
virlog: Split parsing and setting priority
daemon/libvirtd.c | 8 +-
src/libvirt_private.syms | 10 +-
src/locking/lock_daemon.c | 8 +-
src/logging/log_daemon.c | 8 +-
src/util/virlog.c | 1079 ++++++++++++++++++++++++++-------------------
src/util/virlog.h | 61 +--
tests/eventtest.c | 3 +-
tests/testutils.c | 11 +-
tests/virlogtest.c | 10 +-
9 files changed, 702 insertions(+), 496 deletions(-)
--
2.5.5
8 years, 1 month
[libvirt] [PATCH] vz: fix vnc host address
by Mikhail Feoktistov
In vzSDK terms, an empty string means 127.0.0.1,
but in this case we should set 0.0.0.0.
---
.gnulib | 2 +-
src/vz/vz_sdk.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/.gnulib b/.gnulib
index e89b4a7..a2a3943 160000
--- a/.gnulib
+++ b/.gnulib
@@ -1 +1 @@
-Subproject commit e89b4a7aefce9cb02963920712ba7cdd13641644
+Subproject commit a2a39436b65f329630df4a93ec4e30aeae403c54
diff --git a/src/vz/vz_sdk.c b/src/vz/vz_sdk.c
index f2a5c96..7011cbf 100644
--- a/src/vz/vz_sdk.c
+++ b/src/vz/vz_sdk.c
@@ -2967,7 +2967,7 @@ static int prlsdkApplyGraphicsParams(PRL_HANDLE sdkdom,
glisten = virDomainGraphicsGetListen(gr, 0);
pret = PrlVmCfg_SetVNCHostName(sdkdom, glisten && glisten->address ?
- glisten->address : "");
+ glisten->address : "0.0.0.0");
prlsdkCheckRetGoto(pret, cleanup);
ret = 0;
--
1.8.3.1
8 years, 1 month
[libvirt] [PATCH] Add 2 systemtap script in EXTRA_DIST
by Luyao Huang
This way, these 2 useful scripts are also included in VPATH builds.
Signed-off-by: Luyao Huang <lhuang(a)redhat.com>
---
examples/Makefile.am | 2 ++
1 file changed, 2 insertions(+)
diff --git a/examples/Makefile.am b/examples/Makefile.am
index bd8460d..ca6e039 100644
--- a/examples/Makefile.am
+++ b/examples/Makefile.am
@@ -28,6 +28,8 @@ EXTRA_DIST = \
lxcconvert/virt-lxc-convert \
polkit/libvirt-acl.rules \
systemtap/events.stp \
+ systemtap/lock-debug.stp \
+ systemtap/qemu-monitor.stp \
systemtap/rpc-monitor.stp \
$(FILTERS) \
$(wildcard $(srcdir)/xml/storage/*.xml) \
--
1.8.3.1
8 years, 1 month
[libvirt] [PATCH] Forbid new-line char in name of networks
by Sławek Kapłoński
A new-line character in the name of a network is now forbidden because it
messes up virsh output and can be confusing for users.
Closes-Bug: https://bugzilla.redhat.com/show_bug.cgi?id=818064
---
src/conf/network_conf.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/src/conf/network_conf.c b/src/conf/network_conf.c
index aa39776..f9c537f 100644
--- a/src/conf/network_conf.c
+++ b/src/conf/network_conf.c
@@ -2123,6 +2123,13 @@ virNetworkDefParseXML(xmlXPathContextPtr ctxt)
goto error;
}
+ if (strchr(def->name, '\n')) {
+ virReportError(
+ VIR_ERR_XML_ERROR,
+ _("name %s cannot contain new-line character"), def->name);
+ goto error;
+ }
+
/* Extract network uuid */
tmp = virXPathString("string(./uuid[1])", ctxt);
if (!tmp) {
--
2.10.0
8 years, 1 month
[libvirt] [PATCH v3] virsh domdisplay: introduce '--all' for showing all possible graphical displays
by Chen Hanxiao
From: Chen Hanxiao <chenhanxiao(a)gmail.com>
A VM can have more than one graphical display.
For example, we could add both a vnc and a spice display to a VM.
This patch introduces '--all' for showing all
possible graphical displays of an active VM.
Signed-off-by: Chen Hanxiao <chenhanxiao(a)gmail.com>
---
v2: VIR_FREE before use in loops
add descriptions in virsh.pod
v3: add missing command args list in virsh.pod
suppress --type if --all specified
tools/virsh-domain.c | 17 +++++++++++++++--
tools/virsh.pod | 6 ++++--
2 files changed, 19 insertions(+), 4 deletions(-)
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
index 3829b17..bd18798 100644
--- a/tools/virsh-domain.c
+++ b/tools/virsh-domain.c
@@ -10648,6 +10648,10 @@ static const vshCmdOptDef opts_domdisplay[] = {
.help = N_("select particular graphical display "
"(e.g. \"vnc\", \"spice\", \"rdp\")")
},
+ {.name = "all",
+ .type = VSH_OT_BOOL,
+ .help = N_("show all possible graphical displays")
+ },
{.name = NULL}
};
@@ -10671,6 +10675,7 @@ cmdDomDisplay(vshControl *ctl, const vshCmd *cmd)
int tmp;
int flags = 0;
bool params = false;
+ bool all = vshCommandOptBool(cmd, "all");
const char *xpath_fmt = "string(/domain/devices/graphics[@type='%s']/%s)";
virSocketAddr addr;
@@ -10697,10 +10702,11 @@ cmdDomDisplay(vshControl *ctl, const vshCmd *cmd)
/* Attempt to grab our display info */
for (iter = 0; scheme[iter] != NULL; iter++) {
/* Particular scheme requested */
- if (type && STRNEQ(type, scheme[iter]))
+ if (!all && type && STRNEQ(type, scheme[iter]))
continue;
/* Create our XPATH lookup for the current display's port */
+ VIR_FREE(xpath);
if (virAsprintf(&xpath, xpath_fmt, scheme[iter], "@port") < 0)
goto cleanup;
@@ -10733,6 +10739,7 @@ cmdDomDisplay(vshControl *ctl, const vshCmd *cmd)
/* Attempt to get the listening addr if set for the current
* graphics scheme */
+ VIR_FREE(listen_addr);
listen_addr = virXPathString(xpath, ctxt);
VIR_FREE(xpath);
@@ -10788,6 +10795,7 @@ cmdDomDisplay(vshControl *ctl, const vshCmd *cmd)
goto cleanup;
/* Attempt to get the password */
+ VIR_FREE(passwd);
passwd = virXPathString(xpath, ctxt);
VIR_FREE(xpath);
@@ -10840,12 +10848,17 @@ cmdDomDisplay(vshControl *ctl, const vshCmd *cmd)
}
/* Print out our full URI */
+ VIR_FREE(output);
output = virBufferContentAndReset(&buf);
vshPrint(ctl, "%s", output);
/* We got what we came for so return successfully */
ret = true;
- break;
+ if (!all) {
+ break;
+ } else {
+ vshPrint(ctl, "\n");
+ }
}
if (!ret) {
diff --git a/tools/virsh.pod b/tools/virsh.pod
index 49abda9..2a49553 100644
--- a/tools/virsh.pod
+++ b/tools/virsh.pod
@@ -1223,13 +1223,15 @@ I<size> is a scaled integer (see B<NOTES> above) which defaults to KiB
"B" to get bytes (note that for historical reasons, this differs from
B<vol-resize> which defaults to bytes without a suffix).
-=item B<domdisplay> I<domain> [I<--include-password>] [[I<--type>] B<type>]
+=item B<domdisplay> I<domain> [I<--include-password>]
+[[I<--type>] B<type>] [I<--all>]
Output a URI which can be used to connect to the graphical display of the
domain via VNC, SPICE or RDP. The particular graphical display type can
be selected using the B<type> parameter (e.g. "vnc", "spice", "rdp"). If
I<--include-password> is specified, the SPICE channel password will be
-included in the URI.
+included in the URI. If I<--all> is specified, then all possible graphical
+displays are shown, since a VM can have more than one graphical display.
=item B<domfsinfo> I<domain>
--
1.8.3.1
8 years, 1 month
[libvirt] QEMU migration with LOCAL STORAGE
by Lulina (A)
Hello,
Libvirt considers it safe to migrate a guest when its disks are read-only, use cache=none, or have a coherent cluster filesystem backend. But I have some questions about that:
When migrating guests with LOCAL STORAGE, qemu uses DRIVE MIRROR to migrate the disk images, so all storage data is migrated to the destination by qemu. So I don't think it is necessary to require cache=none for the disks; even if the cache mode is not none, it should still be safe to migrate. Is there any situation in which the destination VM can't see all completed writes?
Regards,
Lina
8 years, 1 month
[libvirt] [RFC v2] libvirt vGPU QEMU integration
by Kirti Wankhede
Hi libvirt experts,
Thanks for the valuable input on v1 of the RFC.
In brief, the VFIO-based mediated device framework provides a way to
virtualize devices without SR-IOV, such as NVIDIA vGPU, Intel KVMGT
and IBM's channel I/O. For all mediated device functionality, this
framework reuses the VFIO APIs that are currently used for
pass-through devices. It introduces a set of new sysfs files
for device creation and life cycle management.
Here is the summary of the discussion on v1:
1. Discover mediated device:
As part of the physical device initialization process, the vendor driver will
register its physical devices with the mediated framework. These devices will
be used to create virtual devices (mediated devices, aka mdev).
The vendor driver should specify mdev_supported_types in directory format.
This format is class based; for example, the display class directory format
should be as below. We need to define such a set for each class of devices
that will be supported by the mediated device framework.
--- mdev_destroy
--- mdev_supported_types
    |-- 11
    |   |-- create
    |   |-- name
    |   |-- fb_length
    |   |-- resolution
    |   |-- heads
    |   |-- max_instances
    |   |-- params
    |   |-- requires_group
    |-- 12
    |   |-- create
    |   |-- name
    |   |-- fb_length
    |   |-- resolution
    |   |-- heads
    |   |-- max_instances
    |   |-- params
    |   |-- requires_group
    |-- 13
        |-- create
        |-- name
        |-- fb_length
        |-- resolution
        |-- heads
        |-- max_instances
        |-- params
        |-- requires_group
In the above example, directory '11' represents a type id of an mdev device.
'name', 'fb_length', 'resolution', 'heads', 'max_instances' and
'requires_group' would be Read-Only files that the vendor provides to
describe that type.
'create':
Write-only file. Mandatory.
Accepts string to create mediated device.
'name':
Read-Only file. Mandatory.
Returns string, the name of that type id.
'fb_length':
Read-only file. Mandatory.
Returns <number>{K,M,G}, size of framebuffer.
'resolution':
Read-Only file. Mandatory.
Returns 'hres x vres' format. Maximum supported resolution.
'heads':
Read-Only file. Mandatory.
Returns integer. Number of maximum heads supported.
'max_instances':
Read-Only file. Mandatory.
Returns integer. The maximum number of mdev devices that could be created
at the moment this file is read. This count is updated by the
vendor driver. Before creating an mdev device of this type, check that
max_instances is > 0.
'params':
Write-Only file. Optional.
String input. Libvirt would pass the string given in the XML file to
this file and then create the mdev device. Set an empty string to clear params.
For example, set the parameter 'frame_rate_limiter=0' to disable the frame rate
limiter for performance benchmarking, then create a device of type 11. The
device created would have that parameter set by the vendor driver.
'requires_group':
Read-Only file. Optional.
This should be provided by the vendor driver if it needs to
group mdev devices in one domain, so that it can use 'first
open' to commit the resources of all mdev devices associated with that domain
and 'last close' to free them.
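As an illustration of the '<number>{K,M,G}' format described for 'fb_length',
here is a minimal parsing sketch. parse_fb_length() is a hypothetical helper,
not an existing libvirt function, and it assumes that K, M and G are powers of
1024, which the description above does not state.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical parser for the '<number>{K,M,G}' format of 'fb_length'.
 * Assumption: K/M/G are powers of 1024 (not specified by the RFC).
 * Returns 0 and stores the size in bytes, or -1 on malformed input. */
int parse_fb_length(const char *str, unsigned long long *bytes)
{
    char *end = NULL;
    unsigned long long value;
    unsigned int shift;

    assert(str != NULL && bytes != NULL);

    if (*str < '0' || *str > '9')
        return -1;              /* reject sign, whitespace, empty string */

    errno = 0;
    value = strtoull(str, &end, 10);
    if (errno != 0)
        return -1;              /* number does not fit */

    switch (*end) {
    case 'K': shift = 10; break;
    case 'M': shift = 20; break;
    case 'G': shift = 30; break;
    default:  return -1;        /* the suffix is treated as mandatory */
    }

    /* Tolerate one trailing newline, which sysfs files commonly carry. */
    end++;
    if (*end == '\n')
        end++;
    if (*end != '\0')
        return -1;

    if (value > (~0ULL >> shift))
        return -1;              /* shifting would overflow */

    *bytes = value << shift;
    return 0;
}
```

For the '512M' value used in the XML example, this yields 512 * 1024 * 1024
bytes under the stated assumption.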
The parent device would look like:
<device>
<name>pci_0000_86_00_0</name>
<capability type='pci'>
<domain>0</domain>
<bus>134</bus>
<slot>0</slot>
<function>0</function>
<capability type='mdev'>
<!-- one type element per sysfs directory -->
<type id='11'>
<!-- one element per sysfs file roughly -->
<name>GRID M60-0B</name>
<attribute name='fb_length'>512M</attribute>
<attribute name='resolution'>2560x1600</attribute>
<attribute name='heads'>2</attribute>
<attribute name='max_instances'>16</attribute>
<attribute name='requires_group'>1</attribute>
</type>
</capability>
<product id='...'>GRID M60</product>
<vendor id='0x10de'>NVIDIA</vendor>
</capability>
</device>
2. Create/destroy mediated device
With above example, vGPU device XML would look like:
<device>
<name>my-vgpu</name>
<parent>pci_0000_86_00_0</parent>
<capability type='mdev'>
<type id='11'/>
<group>1</group>
<params>'frame_rate_limiter=0'</params>
</capability>
</device>
'type id' is mandatory.
'group' is optional. It should be a number that is unique in the system among
all the groups created for mdev devices. Its usage is:
- not needed if a single vGPU device is being assigned to a domain.
- only needs to be set if multiple vGPUs need to be assigned to a
domain and the vendor driver has a 'requires_group' file in the type id directory.
- if the type id directory includes 'requires_group' and the user tries to
assign multiple vGPUs to a domain without a <group> field in the XML,
a single vGPU will be created.
'params' is an optional field. The user should set this field if extra
parameters need to be set for a particular vGPU device. Libvirt doesn't
need to parse these params; they are meant for the vendor driver.
Libvirt needs to follow this sequence to create a device:
* Read /sys/../0000\:86\:00.0/11/max_instances. Proceed only if it is
greater than 0; otherwise fail.
* Set extra params if the 'params' field exists in the device XML and the
'params' file exists in the type id directory:
echo "frame_rate_limiter=0" > /sys/../0000\:86\:00.0/11/params
* Autogenerate UUID
* Create device:
echo "$UUID:<group>" > /sys/../0000\:86\:00.0/11/create
where <group> is optional. The group should be a number that is unique among
all the groups created for mdev devices.
* Clear params, if set earlier:
echo "" > /sys/../0000\:86\:00.0/11/params
* To destroy device:
echo $UUID > /sys/../0000\:86\:00.0/mdev_destroy
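The checks and string formatting in the sequence above can be sketched as
follows. Both helpers are hypothetical names used only for illustration, and
the sysfs writes themselves are left out. The sketch assumes that when no
group is given, the ':<group>' suffix is omitted entirely; the RFC only says
that <group> is optional.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper for the first step of the sequence: creation may
 * proceed only if the value read from 'max_instances' is greater than 0. */
int mdev_can_create(long max_instances)
{
    return max_instances > 0;
}

/* Hypothetical helper: formats the string written to the 'create' file,
 * either "$UUID" or "$UUID:<group>". A negative group means "no group"
 * (an assumption of this sketch, see above).
 * Returns 0 on success, -1 if the buffer is too small. */
int format_mdev_create(char *buf, size_t buflen, const char *uuid, int group)
{
    int n;

    assert(buf != NULL && uuid != NULL);

    if (group < 0)
        n = snprintf(buf, buflen, "%s", uuid);
    else
        n = snprintf(buf, buflen, "%s:%d", uuid, group);

    /* snprintf reports the length it wanted; detect truncation. */
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

Keeping the formatting separate from the file I/O makes it possible to check
the request string without touching sysfs.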
3. Start/stop mediated device
No change or requirement for libvirt, as this will be handled by the open()
and close() callbacks of the vendor driver. In the case of multiple devices
with 'requires_group' set, this will be handled in the 'first open()' and
'last close()' on a device in that group.
4. Launch QEMU/VM
Pass the mdev sysfs path to QEMU as a vfio-pci device.
For above vGPU device example:
-device vfio-pci,sysfsdev=/sys/bus/mdev/devices/$UUID
5. QEMU/VM Shutdown sequence
No change or requirement for libvirt.
6. VM Reset
No change or requirement for libvirt as this will be handled via VFIO
reset API and QEMU process will keep running as before.
7. Hot-plug
The same syntax is used to create a virtual device for hot-plug.
Thanks,
Kirti
8 years, 1 month
[libvirt] RFC: Exposing "ready" bool (of `query-block-jobs`) or QMP BLOCK_JOB_READY event
by Kashyap Chamarthy
Background
----------
For QEMU block device jobs, the "ready" boolean field (part of QMP
`query-block-jobs`) was introduced in commit ef6dbf1 (available in QEMU
v2.2.0 or above):
http://git.qemu.org/?p=qemu.git;a=commitdiff;h=ef6dbf1e4 --
blockjob: Add "ready" field
"When a block job signals readiness, this is currently reported only
through QMP. If qemu wants to use block jobs for internal tasks,
there needs to be another way to correctly detect when a block job
may be completed.
For this reason, introduce a bool "ready" which is set when the
block job may be completed."
And, libvirt was fixed to use the above field in this commit (available
in libvirt v1.2.18 or above):
http://libvirt.org/git/?p=libvirt.git;a=commitdiff;h=eae5924 -- qemu:
Update state of block job to READY only if it actually is ready
RFC
---
Currently libvirt block APIs (& consequently higher-level applications
like Nova which use these APIs) rely on polling for job completion via
virDomainGetBlockJobInfo(), which uses QMP `query-block-jobs`, and
waits for QEMU to report "offset" == "len", which translates to libvirt
"cur" == "end". Based on this, libvirt can take an action (whether to
gracefully abort, or pivot to the copy in case of a COPY job).
QEMU reports the "ready": true field (followed by a
BLOCK_JOB_READY QMP event), so it would be helpful if libvirt exposed this
via an API that upper layers could use instead of polling.
Problem scenario
----------------
When virDomainBlockRebase() is invoked to start a copy job, then
aborting the said copy operation with virDomainBlockJobAbort() + flag
VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT can result in a potential race
condition (due to the way the virDomainGetBlockJobInfo() reports the job
status) where the pivot operation fails.
Race condition window
~~~~~~~~~~~~~~~~~~~~~
libvirt finds cur==end AND sends a pivot request, all in the window
before QEMU has reported "ready": true [a field in the response to the
QMP `query-block-jobs` command, indicating that the job has
actually completed]. The pivot request then fails because it requires
"ready": true.
So Eric Blake suggests:
QEMU 2.0 or 1.x probably had a synchronous setup where you could
never observe cur==end on a non-ready job. But I don't remember
enough history to point to when QEMU switched jobs to be a bit more
asynchronous. Maybe there was no qemu regression - maybe it was
BECAUSE of other block-job additions in 2.2 that offset==len was no
longer reliable. I don't know that for sure.
But what it DOES sound like is that IF qemu reports "ready": false,
offset==len is not reliable, and libvirt should be taught to fudge
that.
And hopefully, QEMU too old to report "ready:" at all is reliable
with regards to offset==len, because that's all we have to go by.
For now, I filed this upstream libvirt bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1382165 --
virDomainGetBlockJobInfo: Adjust job reporting based on QEMU stats &
the "ready" field of `query-block-jobs`
However, exposing the "ready" boolean from QMP `query-block-jobs` might
be worth considering.
--
/kashyap
8 years, 1 month