[libvirt] When vm's status file being left over, some persistent but inactive vms will be lost by libvirtd after libvirtd rebooting.
by Wangyufei (A)
Hello,
I found a problem that:
vm's status file may be left over in the path /var/run/libvirt/qemu under some situation, such as host reboot. When vm's status file is left over, some
persistent but inactive vms will be lost by libvirtd after it is rebooted. And you can do as follows to reproduce the problem:
1、Create a vm and start it by the commands: virsh define vm-xml and virsh start vm-name.
2、Stop the libvirtd by the command: service libvirtd stop.
3、Kill the qemu process related to the vm, and make the vm's status file left over.
4、Start libvirtd.
After starting the libvirtd service, we find that the vm has been lost by libvirtd with command"virsh list --all".
What we expect is that the vm is shown with shutoff status, should we?
The reason for the problem is that:
During libvirtd startup, it first loads status files of vms under the path /var/run/libvirt/qemu, creates virDomainObj for each vm and adds it to
driver->domains list.
Then it creates a thread to connect related qemu process for each virDomainObj in the domains list.Because the qemu process has been killed, so connecting to
qemu will be failed. When connecting to qemu failed, connection-thread will do the follows:
1、Check if vm->persistent is 1.
2、If vm->persistent is not 1, then qemuDomainRemoveInactive() is called to remove the virDomainObj.
3、Then the following calling sequence will occur:qemuDomainRemoveInactive() -->virDomainObjListRemove()-->virHashRemoveEntry(). Around virHashRemoveEntry(),
domlist and dom will be locked and unlocked sequencely.
The problem of the above steps is that vm->persistent maybe has been set to 1 by libvirtd main-thread when connection-thread calling virHashRemoveEntry() to
remove the dom. That is a persistent virDomainObj is removed during libvirtd startup.
Two ways can resolve the above problem:
1、expending the range of locking virDomainObj and virDomainObjList, lock the object of virDomainObj and virDomainObjList in connection-thread before checking vm->persistent.
2、checking vm->persistent again before calling virHashRemoveEntry().
Do you think it is a problem described above and which way listed above is more suitable to resolve the problem, or is there any other better idea? Any suggestions?
Best Regards,
-WangYufei
11 years
[libvirt] [PATCHv2 00/10] Strip libnuma functions from nodeinfo.c
by Peter Krempa
As a pre-requisite for doing unit tests for topology detection code
the code in nodeinfo.c needs to be cleaned up.
Peter Krempa (10):
nodeinfo: Rename linuxNodeInfoCPUPopulate and export it properly
numa: Introduce virNumaIsAvailable and use it instead of
numa_available
nodeinfo: Avoid forward declarations of static functions
numa: Introduce virNumaGetMaxNode and use it instead of numa_max_node
numa: Introduce virNumaGetNodeMemory and use it instead of
numa_node_size64
nodeinfo: Get rid of nodeGetCellMemory
numa: Replace NUMA_MAX_N_CPUS macro with virNumaGetMaxCPUs()
caps: Fix function docs for virCapabilitiesAddHostNUMACell
numa: Add wrapper of numa_node_to_cpus and use it
nodeinfo: Remove libnuma include
src/conf/capabilities.c | 1 +
src/libvirt_linux.syms | 2 +-
src/libvirt_private.syms | 3 +
src/nodeinfo.c | 222 +++++++++++++----------------------------------
src/nodeinfo.h | 5 ++
src/util/virnuma.c | 221 +++++++++++++++++++++++++++++++++++++++++++++-
src/util/virnuma.h | 11 +++
tests/nodeinfotest.c | 6 +-
8 files changed, 304 insertions(+), 167 deletions(-)
--
1.8.3.2
11 years
[libvirt] [PATCH v2 0/5] Test our PCI device handling functions
by Michal Privoznik
The 4/5 is actually fix for a bug,
Michal Privoznik (5):
tests: Introduce virpcitest
virpcitest: Test virPCIDeviceDetach
virpcitest: Introduce testVirPCIDeviceReattach
virpci: Don't error on not binded devices
virpcitest: Introduce check for unbinded devices
.gitignore | 1 +
cfg.mk | 4 +-
src/util/virpci.c | 12 +-
tests/Makefile.am | 21 +-
tests/virpcimock.c | 913 +++++++++++++++++++++++++++++++++++++++++++++++++++++
tests/virpcitest.c | 204 ++++++++++++
6 files changed, 1148 insertions(+), 7 deletions(-)
create mode 100644 tests/virpcimock.c
create mode 100644 tests/virpcitest.c
--
1.8.1.5
11 years
[libvirt] PATCH: pci-subsystem: ixgbe: SR-IOV: kernel way to order driver's virtfn -entries is odd causing libvirt failures.
by Niilona
Hi.
As Bjorn Helgaas recommend, this might be the item to discuss in the wider area.
-----------------------------------
There is a behavior effecting virtfn -entries in sysfs, when amount of
them increases over 10.
Run VM's through LIBVIRT -> QEMU/KVM, this causes :
- MAC address setting by LIBVIRT disordered ie. setting targeted to wrong VF.
- VLAN setting by LIBVIRT overall failed
Basics of this are in "/libvirt-x.x.x/src/util/virpci.c" ; in function below,
which don't order virtfn entries correctly.
/*
* Returns virtual functions of a physical function
*/
int
virPCIGetVirtualFunctions(const char *sysfs_path,
virPCIDeviceAddressPtr **virtual_functions,
unsigned int *num_virtual_functions)
{
But I let you to decide which is best way to fix this, as if every
application reads "virtfn" entries from PF's directory, they all need
to sort entries in alphabet. order to avoid this
influence.
So personally I did get over this by adding pre-zeroes to names to
have them in sorted order in PF's directory.
----------------------------
Tested :
- kernel 3.8.0-21
- libvirt 1.0.2 & libvirt 1.1.3
- ixgbe 3.11.33-k, ixgbe 3.17-3 & ixgbe 3.18.7
After applying this MAC/VLAN seen to work and VM's get contacted using
MAC's & VLAN's, which been checked using tcpdump.
P.S. Tested with 3.8.0-21, but patch constructed against
"https://github.com/torvalds/linux.git", so there is possibilty this
works in latest kernel, but haven't checked it.
At least ordering scheme seen in "iov.c" in same form as before.
-----------------------------
>From fac886b86099dca1937730026da24c34857c4077 Mon Sep 17 00:00:00 2001
From: Niilo Minkkinen <niilo.minkkinen(a)tieto.com>
Date: Wed, 23 Oct 2013 14:50:07 +0300
Subject: [PATCH 1/1] Fixes ordering of virtfn -entries on drivers with more
than 10 entries ( virtfn0...virtfn9 ).
This effects on drivers like Intel 82599 "ixgbe", which have max 31
VF's (Virtual Function) on
one PF (Physical Function).
Currently, ordering is "diffused", which in case causes MAC/VLAN
-settings on libvirt based
to PCI -addressing failed.
Signed-off-by: Niilo Minkkinen <niilo.minkkinen(a)tieto.com>
Signed-off-by: Tommy Varre <tommy.varre(a)tieto.com>
---
drivers/pci/iov.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c
index 21a7182..fbd1209 100644
--- a/drivers/pci/iov.c
+++ b/drivers/pci/iov.c
@@ -106,7 +106,7 @@ static int virtfn_add(struct pci_dev *dev, int id,
int reset)
mutex_unlock(&iov->dev->sriov->lock);
rc = pci_bus_add_device(virtfn);
- sprintf(buf, "virtfn%u", id);
+ sprintf(buf, "virtfn%02u", id);
rc = sysfs_create_link(&dev->dev.kobj, &virtfn->dev.kobj, buf);
if (rc)
goto failed1;
@@ -149,7 +149,7 @@ static void virtfn_remove(struct pci_dev *dev, int
id, int reset)
__pci_reset_function(virtfn);
}
- sprintf(buf, "virtfn%u", id);
+ sprintf(buf, "virtfn%02u", id);
sysfs_remove_link(&dev->dev.kobj, buf);
/*
* pci_stop_dev() could have been called for this virtfn already,
--
1.8.1.2
-----------------------------
What you think ?
Niilo Minkkinen
niilo.minkkinen(a)tieto.com
11 years
[libvirt] [PATCH v3] qemu: don't use deprecated -no-kvm-pit-reinjection
by Ján Tomko
Since qemu-kvm 1.1 [1] (since 1.3. in upstream QEMU [2])
'-no-kvm-pit-reinjection' has been deprecated.
Use -global kvm-pit.lost_tick_policy=discard instead.
https://bugzilla.redhat.com/show_bug.cgi?id=978719
[1] http://git.kernel.org/cgit/virt/kvm/qemu-kvm.git/commit/?id=4e4fa39
[2] http://git.qemu.org/?p=qemu.git;a=commitdiff;h=c21fb4f
---
v3: use -global instead of trying to add another -device
re: https://www.redhat.com/archives/libvir-list/2013-July/msg00833.html
Unsetting the QEMU_CAPS_NO_KVM_PIT capability for QEMU >1.2 seems to work
okay with libvirtd upgrades.
v2: https://www.redhat.com/archives/libvir-list/2013-July/msg00821.html
v1: https://www.redhat.com/archives/libvir-list/2013-July/msg00045.html
src/qemu/qemu_capabilities.c | 5 ++--
src/qemu/qemu_capabilities.h | 1 +
src/qemu/qemu_command.c | 8 ++++--
.../qemuxml2argv-kvm-pit-delay.args | 5 ++++
.../qemuxml2argv-kvm-pit-delay.xml | 29 ++++++++++++++++++++++
.../qemuxml2argv-kvm-pit-device.args | 5 ++++
.../qemuxml2argv-kvm-pit-device.xml | 29 ++++++++++++++++++++++
tests/qemuxml2argvtest.c | 4 +++
8 files changed, 82 insertions(+), 4 deletions(-)
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-kvm-pit-delay.args
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-kvm-pit-delay.xml
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-kvm-pit-device.args
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-kvm-pit-device.xml
diff --git a/src/qemu/qemu_capabilities.c b/src/qemu/qemu_capabilities.c
index dc8f0be..06b71b5 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -242,6 +242,7 @@ VIR_ENUM_IMPL(virQEMUCaps, QEMU_CAPS_LAST,
"usb-storage.removable",
"virtio-mmio",
"ich9-intel-hda",
+ "kvm-pit",
);
struct _virQEMUCaps {
@@ -1393,6 +1394,7 @@ struct virQEMUCapsStringFlags virQEMUCapsObjectTypes[] = {
{ "usb-storage", QEMU_CAPS_DEVICE_USB_STORAGE },
{ "virtio-mmio", QEMU_CAPS_DEVICE_VIRTIO_MMIO },
{ "ich9-intel-hda", QEMU_CAPS_DEVICE_ICH9_INTEL_HDA },
+ { "kvm-pit", QEMU_CAPS_DEVICE_KVM_PIT },
};
static struct virQEMUCapsStringFlags virQEMUCapsObjectPropsVirtioBlk[] = {
@@ -2477,13 +2479,12 @@ virQEMUCapsInitArchQMPBasic(virQEMUCapsPtr qemuCaps,
/*
* Currently only x86_64 and i686 support PCI-multibus,
- * -no-acpi and -no-kvm-pit-reinjection.
+ * -no-acpi
*/
if (qemuCaps->arch == VIR_ARCH_X86_64 ||
qemuCaps->arch == VIR_ARCH_I686) {
virQEMUCapsSet(qemuCaps, QEMU_CAPS_PCI_MULTIBUS);
virQEMUCapsSet(qemuCaps, QEMU_CAPS_NO_ACPI);
- virQEMUCapsSet(qemuCaps, QEMU_CAPS_NO_KVM_PIT);
}
ret = 0;
diff --git a/src/qemu/qemu_capabilities.h b/src/qemu/qemu_capabilities.h
index f8dc728..770744e 100644
--- a/src/qemu/qemu_capabilities.h
+++ b/src/qemu/qemu_capabilities.h
@@ -197,6 +197,7 @@ enum virQEMUCapsFlags {
QEMU_CAPS_USB_STORAGE_REMOVABLE = 156, /* usb-storage.removable */
QEMU_CAPS_DEVICE_VIRTIO_MMIO = 157, /* -device virtio-mmio */
QEMU_CAPS_DEVICE_ICH9_INTEL_HDA = 158, /* -device ich9-intel-hda */
+ QEMU_CAPS_DEVICE_KVM_PIT = 159, /* -device kvm-pit */
QEMU_CAPS_LAST, /* this must always be the last item */
};
diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index ba102f4..c988646 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -7966,11 +7966,15 @@ qemuBuildCommandLine(virConnectPtr conn,
case VIR_DOMAIN_TIMER_TICKPOLICY_DELAY:
/* delay is the default if we don't have kernel
(-no-kvm-pit), otherwise, the default is catchup. */
- if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_NO_KVM_PIT))
+ if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_KVM_PIT))
+ virCommandAddArgList(cmd, "-global",
+ "kvm-pit.lost_tick_policy=discard", NULL);
+ else if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_NO_KVM_PIT))
virCommandAddArg(cmd, "-no-kvm-pit-reinjection");
break;
case VIR_DOMAIN_TIMER_TICKPOLICY_CATCHUP:
- if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_NO_KVM_PIT)) {
+ if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_NO_KVM_PIT) ||
+ virQEMUCapsGet(qemuCaps, QEMU_CAPS_DEVICE_KVM_PIT)) {
/* do nothing - this is default for kvm-pit */
} else if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_TDF)) {
/* -tdf switches to 'catchup' with userspace pit. */
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-kvm-pit-delay.args b/tests/qemuxml2argvdata/qemuxml2argv-kvm-pit-delay.args
new file mode 100644
index 0000000..ca5823f
--- /dev/null
+++ b/tests/qemuxml2argvdata/qemuxml2argv-kvm-pit-delay.args
@@ -0,0 +1,5 @@
+LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test QEMU_AUDIO_DRV=none \
+/usr/bin/qemu -S -M pc -m 214 -smp 2 -nographic \
+-monitor unix:/tmp/test-monitor,server,nowait \
+-no-kvm-pit-reinjection -no-acpi -boot c -usb -hda /dev/HostVG/QEMUGuest1 \
+-net none -serial none -parallel none
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-kvm-pit-delay.xml b/tests/qemuxml2argvdata/qemuxml2argv-kvm-pit-delay.xml
new file mode 100644
index 0000000..7835a1b
--- /dev/null
+++ b/tests/qemuxml2argvdata/qemuxml2argv-kvm-pit-delay.xml
@@ -0,0 +1,29 @@
+<domain type='qemu'>
+ <name>QEMUGuest1</name>
+ <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+ <memory unit='KiB'>219136</memory>
+ <currentMemory unit='KiB'>219136</currentMemory>
+ <vcpu placement='static'>2</vcpu>
+ <os>
+ <type arch='i686' machine='pc'>hvm</type>
+ <boot dev='hd'/>
+ </os>
+ <clock offset='utc'>
+ <timer name='pit' tickpolicy='delay'/>
+ </clock>
+ <on_poweroff>destroy</on_poweroff>
+ <on_reboot>restart</on_reboot>
+ <on_crash>destroy</on_crash>
+ <devices>
+ <emulator>/usr/bin/qemu</emulator>
+ <disk type='block' device='disk'>
+ <source dev='/dev/HostVG/QEMUGuest1'/>
+ <target dev='hda' bus='ide'/>
+ <address type='drive' controller='0' bus='0' target='0' unit='0'/>
+ </disk>
+ <controller type='usb' index='0'/>
+ <controller type='ide' index='0'/>
+ <controller type='pci' index='0' model='pci-root'/>
+ <memballoon model='virtio'/>
+ </devices>
+</domain>
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-kvm-pit-device.args b/tests/qemuxml2argvdata/qemuxml2argv-kvm-pit-device.args
new file mode 100644
index 0000000..f03840f
--- /dev/null
+++ b/tests/qemuxml2argvdata/qemuxml2argv-kvm-pit-device.args
@@ -0,0 +1,5 @@
+LC_ALL=C PATH=/bin HOME=/home/test USER=test LOGNAME=test QEMU_AUDIO_DRV=none \
+/usr/bin/qemu -S -M pc -m 214 -smp 2 -nographic \
+-monitor unix:/tmp/test-monitor,server,nowait \
+-global kvm-pit.lost_tick_policy=discard -no-acpi -boot c -usb \
+-hda /dev/HostVG/QEMUGuest1 -net none -serial none -parallel none
diff --git a/tests/qemuxml2argvdata/qemuxml2argv-kvm-pit-device.xml b/tests/qemuxml2argvdata/qemuxml2argv-kvm-pit-device.xml
new file mode 100644
index 0000000..7835a1b
--- /dev/null
+++ b/tests/qemuxml2argvdata/qemuxml2argv-kvm-pit-device.xml
@@ -0,0 +1,29 @@
+<domain type='qemu'>
+ <name>QEMUGuest1</name>
+ <uuid>c7a5fdbd-edaf-9455-926a-d65c16db1809</uuid>
+ <memory unit='KiB'>219136</memory>
+ <currentMemory unit='KiB'>219136</currentMemory>
+ <vcpu placement='static'>2</vcpu>
+ <os>
+ <type arch='i686' machine='pc'>hvm</type>
+ <boot dev='hd'/>
+ </os>
+ <clock offset='utc'>
+ <timer name='pit' tickpolicy='delay'/>
+ </clock>
+ <on_poweroff>destroy</on_poweroff>
+ <on_reboot>restart</on_reboot>
+ <on_crash>destroy</on_crash>
+ <devices>
+ <emulator>/usr/bin/qemu</emulator>
+ <disk type='block' device='disk'>
+ <source dev='/dev/HostVG/QEMUGuest1'/>
+ <target dev='hda' bus='ide'/>
+ <address type='drive' controller='0' bus='0' target='0' unit='0'/>
+ </disk>
+ <controller type='usb' index='0'/>
+ <controller type='ide' index='0'/>
+ <controller type='pci' index='0' model='pci-root'/>
+ <memballoon model='virtio'/>
+ </devices>
+</domain>
diff --git a/tests/qemuxml2argvtest.c b/tests/qemuxml2argvtest.c
index 0b808a4..58eb3e9 100644
--- a/tests/qemuxml2argvtest.c
+++ b/tests/qemuxml2argvtest.c
@@ -1080,6 +1080,10 @@ mymain(void)
QEMU_CAPS_DRIVE, QEMU_CAPS_DEVICE_VIRTIO_MMIO,
QEMU_CAPS_DEVICE_VIRTIO_RNG, QEMU_CAPS_OBJECT_RNG_RANDOM);
+ DO_TEST("kvm-pit-device", QEMU_CAPS_DEVICE_KVM_PIT);
+ DO_TEST("kvm-pit-delay", QEMU_CAPS_NO_KVM_PIT);
+ DO_TEST("kvm-pit-device", QEMU_CAPS_NO_KVM_PIT, QEMU_CAPS_DEVICE_KVM_PIT);
+
virObjectUnref(driver.config);
virObjectUnref(driver.caps);
virObjectUnref(driver.xmlopt);
--
1.8.1.5
11 years
[libvirt] [PATCHv2 0/8] Introduce paravirtual spinlock support
by Peter Krempa
Version 2 fixes stuff pointed out in Dan's review
Jiri Denemark (2):
cpu: Export few x86-specific APIs
qemu: Add monitor APIs to fetch CPUID data from QEMU
Peter Krempa (6):
cpu_x86: Refactor storage of CPUID data to add support for KVM
features
cpu: x86: Parse the CPU feature map only once
cpu: x86: Add internal CPUID features support and KVM feature bits
conf: Refactor storing and usage of feature flags
qemu: Add support for paravirtual spinlocks in the guest
qemu: process: Validate specific CPUID flags of a guest
docs/formatdomain.html.in | 8 +
docs/schemas/domaincommon.rng | 10 +-
src/conf/domain_conf.c | 195 +++++++++---
src/conf/domain_conf.h | 3 +-
src/cpu/cpu_x86.c | 350 +++++++++++----------
src/cpu/cpu_x86.h | 9 +
src/cpu/cpu_x86_data.h | 20 +-
src/libvirt_private.syms | 6 +
src/libxl/libxl_conf.c | 9 +-
src/lxc/lxc_container.c | 6 +-
src/qemu/qemu_command.c | 28 +-
src/qemu/qemu_monitor.c | 35 +++
src/qemu/qemu_monitor.h | 5 +
src/qemu/qemu_monitor_json.c | 145 +++++++++
src/qemu/qemu_monitor_json.h | 2 +
src/qemu/qemu_process.c | 53 ++++
src/vbox/vbox_tmpl.c | 45 ++-
src/xenapi/xenapi_driver.c | 10 +-
src/xenapi/xenapi_utils.c | 22 +-
src/xenxs/xen_sxpr.c | 20 +-
src/xenxs/xen_xm.c | 30 +-
tests/Makefile.am | 1 +
.../qemumonitorjson-getcpu-full.data | 5 +
.../qemumonitorjson-getcpu-full.json | 46 +++
.../qemumonitorjson-getcpu-host.data | 6 +
.../qemumonitorjson-getcpu-host.json | 45 +++
tests/qemumonitorjsontest.c | 76 +++++
.../qemuxml2argv-pv-spinlock-disabled.args | 5 +
.../qemuxml2argv-pv-spinlock-disabled.xml | 26 ++
.../qemuxml2argv-pv-spinlock-enabled.args | 5 +
.../qemuxml2argv-pv-spinlock-enabled.xml | 26 ++
tests/qemuxml2argvtest.c | 2 +
tests/qemuxml2xmltest.c | 2 +
33 files changed, 952 insertions(+), 304 deletions(-)
create mode 100644 tests/qemumonitorjsondata/qemumonitorjson-getcpu-full.data
create mode 100644 tests/qemumonitorjsondata/qemumonitorjson-getcpu-full.json
create mode 100644 tests/qemumonitorjsondata/qemumonitorjson-getcpu-host.data
create mode 100644 tests/qemumonitorjsondata/qemumonitorjson-getcpu-host.json
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pv-spinlock-disabled.args
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pv-spinlock-disabled.xml
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pv-spinlock-enabled.args
create mode 100644 tests/qemuxml2argvdata/qemuxml2argv-pv-spinlock-enabled.xml
--
1.8.3.2
11 years
[libvirt] [RFC] net-dhcp-leases: Query: Reg: Leases API Script
by Nehal J Wani
Q1. The --dhcp-script option will require libvirt to provide a script
or executable file to be run. Now as the man page says, this file is
executed "Whenever a new DHCP ease is created, or an old one
destroyed". Life was easy with the script, as it used the command sed
to remove the destroyed lease. eblake had suggested me to use a C
program instead. So I wanted to finalize whether to go with C or
continue with shell script.
Q2. The above executable file will be writing the custom formatted
lease parameters to a file "dnsmasq-ip-mac.status" (suggestion open
for name). This newly created/updated file will be parsed by the API.
We need to decide the format for the file. Do we continue with space
separated parameters as before?
Q3. What should be the location of the above two files? Is there any
example that I can follow in libvirt for deploying custom-script
files/C programs which are not to be linked?
Script is attached.
--
Nehal J Wani
11 years
[libvirt] pvpanic plans?
by Paolo Bonzini
The thread from yesterday has died off (perhaps also because of
my inappropriate answer to Michael, for which I apologize to him
and everyone). I took some time to discuss the libvirt requirements
further with Daniel Berrange and Eric Blake on IRC. If anyone is
interested, I can give logs. This is a suggestion for how to
proceed in both QEMU and libvirt.
== Builtin pvpanic ==
QEMU will remove pvpanic from pc-1.5 in 1.6.1 and 1.5.4. This does not
break migration.
== Support in libvirt for current functionality ==
libvirt will add a <panic-notifier/> element, and possibly a capability
for it accessible via "virsh capabilities". There are two possibilities:
1) On QEMU 1.5.4/1.6.1 and newer (and on QEMU 1.6.0 with a machine type
other than pc-1.5), <on_crash> will only work if the element is there.
On QEMU 1.5.0->1.5.3, and on QEMU 1.6.0 with the pc-1.5 machine type,
<on_crash> will be obeyed always, and may override e.g. reboot-on-panic
if a guest driver exist.
2) On all versions, <on_crash> will only work if the element is there.
In turn, there are two ways to implement (2):
2a) libvirt will always add -global pvpanic.iobase=0 to neutralize
the builtin pvpanic device if present. <panic-notifier/>
will create the device with -device pvpanic,iobase=0x505
Advantage: no changes to QEMU
Disadvantage 1: writes to port 0 with QEMU 1.{5.0,5.1,5.2,5.3,6.0}
and pc-1.5 machine type will write to a pvpanic device instead of
the DMA controller. Probably harmless, and limited to some QEMU
versions.
Disadvantage 2: libvirt has knowledge of the pvpanic port number
2b) QEMU will provide a way for libvirt to detect that no machine type
has the builtin pvpanic. If some machine type may have the builtin
pvpanic, and <panic-notifier/> is absent, libvirt will add
"-global pvpanic.iobase=0" to neutralize it. Otherwise, libvirt
will create the device normally.
A possible way for libvirt to detect "good" machine types is a
dummy property. This is a bit ugly in that the property would not
affect the behavior of the device. The property would remain in
the long term.
Another possibility is for QEMU to rename the device, e.g. to
isa-pvpanic. This is also somewhat gross, but not visible in the
long term when the "pvpanic" name will be lost in history.
Advantage 1: libvirt has no knowledge of the pvpanic port number
Disadvantage 1: same as above
Disadvantage 2: need a somewhat gross change in QEMU
This method also provides an (also somewhat gross on the QEMU side)
way to detect other changes in the pvpanic semantics. One example
mentioned below, is making the panicked state temporary.
== Possible improvements to pvpanic ==
The current implementation of pvpanic supports three modes: reset system
on panic, destroy domain on panic, preserve domain with no possibility
to resume it. (Optionally a domain can be dumped too).
Long term, the choice to include pvpanic should not be on the guest
admin's shoulders, but rather in libosinfo. Thus, it would be nice to
have a fourth mode where the panic is logged but the guest otherwise
keeps running. This mode would let libosinfo add pvpanic by default
without affecting the guest's behavior on panic.
With this change, <on_crash>ignore</on_crash> will behave as follows
for the three possibilities above:
(1) With QEMU 1.5.0 to 1.6.1, <on_crash> will _not_ obey the setting,
never (even if no <panic-notifier/> is specified).
libvirt will have to pick a fallback action.
advantage of destroy as fallback: it is the default (but
note that restart is the default for virt-install)
advantage of preserve as fallback: lets the admin examine
the panic
advantage of restart as fallback: maximum availability of
the VM, it is the default for virt-install
(2a) With QEMU 1.5.0 to 1.6.1, <on_crash> will _not_ obey the setting
if <panic-notifier/> is specified. libvirt has _no way_ to learn
about this, so the capability would always be present with these
QEMU versions and libosinfo would always add <panic-notifier/> with
these versions. Given the libosinfo scenario being considered here,
this is not very different from (1).
(2b) With QEMU 1.5.0 to 1.6.1, the <panic-notifier/> element will not
be available and not exposed in libvirt capabilities. Thus with
this version libosinfo would omit <panic-notifier/> from the XML.
Guest policy will always be followed correctly.
The problem in both (1) and (2a) can be summarized as follows. First,
libvirt will have to implement and document a fallback action for buggy
QEMU. Second, even though the problems would be limited to some version
of QEMU, they would be relatively hard to debug for a casual user, could
start happening randomly by updating any one of QEMU, libvirt, libosinfo
or the guest kernel, and there is no fallback action for libvirt that is
always correct.
Thus, considering future libosinfo support for pvpanic, (2b) is preferrable
in my opinion.
Now, making pvpanic reversible requires a change in QEMU (patch already
posted). Andreas proposed using a pvpanic property to determine whether
the panicked state is temporary or definitive. libvirt could piggyback
on such a property to detect the "goodness" of machine types (as mentioned
regarding solution 2b above). However:
First, this would require a more intrusive patch, less appealing for
1.5 and 1.6 stable branches. Second, there is no reason why libvirt would
want to make the panicked state definitive. To achieve the same effect,
libvirt can just not issue the "continue" monitor command when the guest
is panicked. Thus the new property would be useless except to communicate
pvpanic behavior---and renaming the device still seems preferrable to me.
Thanks for reading up to here!
Paolo
11 years
[libvirt] [PATCHv2 1/2] virsh: move command cpu-baseline from domain group to host group.
by liyang
From: Li Yang <liyang.fnst(a)cn.fujitsu.com>
cpu-baseline command can compute baseline CPU for a set of given CPUs,
I think it should not belong to domain group, should belong to host
group, so I moved the related source from virsh-domain.c to virsh-host.c
Signed-off-by: Li Yang <liyang.fnst(a)cn.fujitsu.com>
---
tools/virsh-domain.c | 116 -------------------------------------------------
tools/virsh-host.c | 117 ++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 117 insertions(+), 116 deletions(-)
diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
index 42c9920..8fc6c59 100644
--- a/tools/virsh-domain.c
+++ b/tools/virsh-domain.c
@@ -6107,116 +6107,6 @@ cleanup:
}
/*
- * "cpu-baseline" command
- */
-static const vshCmdInfo info_cpu_baseline[] = {
- {.name = "help",
- .data = N_("compute baseline CPU")
- },
- {.name = "desc",
- .data = N_("Compute baseline CPU for a set of given CPUs.")
- },
- {.name = NULL}
-};
-
-static const vshCmdOptDef opts_cpu_baseline[] = {
- {.name = "file",
- .type = VSH_OT_DATA,
- .flags = VSH_OFLAG_REQ,
- .help = N_("file containing XML CPU descriptions")
- },
- {.name = "features",
- .type = VSH_OT_BOOL,
- .help = N_("Show features that are part of the CPU model type")
- },
- {.name = NULL}
-};
-
-static bool
-cmdCPUBaseline(vshControl *ctl, const vshCmd *cmd)
-{
- const char *from = NULL;
- bool ret = false;
- char *buffer;
- char *result = NULL;
- char **list = NULL;
- unsigned int flags = 0;
- int count = 0;
-
- xmlDocPtr xml = NULL;
- xmlNodePtr *node_list = NULL;
- xmlXPathContextPtr ctxt = NULL;
- virBuffer buf = VIR_BUFFER_INITIALIZER;
- size_t i;
-
- if (vshCommandOptBool(cmd, "features"))
- flags |= VIR_CONNECT_BASELINE_CPU_EXPAND_FEATURES;
-
- if (vshCommandOptStringReq(ctl, cmd, "file", &from) < 0)
- return false;
-
- if (virFileReadAll(from, VSH_MAX_XML_FILE, &buffer) < 0)
- return false;
-
- /* add a separate container around the xml */
- virBufferStrcat(&buf, "<container>", buffer, "</container>", NULL);
- if (virBufferError(&buf))
- goto no_memory;
-
- VIR_FREE(buffer);
- buffer = virBufferContentAndReset(&buf);
-
-
- if (!(xml = virXMLParseStringCtxt(buffer, from, &ctxt)))
- goto cleanup;
-
- if ((count = virXPathNodeSet("//cpu[not(ancestor::cpus)]",
- ctxt, &node_list)) == -1)
- goto cleanup;
-
- if (count == 0) {
- vshError(ctl, _("No host CPU specified in '%s'"), from);
- goto cleanup;
- }
-
- list = vshCalloc(ctl, count, sizeof(const char *));
-
- for (i = 0; i < count; i++) {
- if (!(list[i] = virXMLNodeToString(xml, node_list[i]))) {
- vshSaveLibvirtError();
- goto cleanup;
- }
- }
-
- result = virConnectBaselineCPU(ctl->conn,
- (const char **)list, count, flags);
-
- if (result) {
- vshPrint(ctl, "%s", result);
- ret = true;
- }
-
-cleanup:
- xmlXPathFreeContext(ctxt);
- xmlFreeDoc(xml);
- VIR_FREE(result);
- if (list != NULL && count > 0) {
- for (i = 0; i < count; i++)
- VIR_FREE(list[i]);
- }
- VIR_FREE(list);
- VIR_FREE(buffer);
- VIR_FREE(node_list);
-
- return ret;
-
-no_memory:
- vshError(ctl, "%s", _("Out of memory"));
- ret = false;
- goto cleanup;
-}
-
-/*
* "cpu-stats" command
*/
static const vshCmdInfo info_cpu_stats[] = {
@@ -10517,12 +10407,6 @@ const vshCmdDef domManagementCmds[] = {
.flags = 0
},
#endif
- {.name = "cpu-baseline",
- .handler = cmdCPUBaseline,
- .opts = opts_cpu_baseline,
- .info = info_cpu_baseline,
- .flags = 0
- },
{.name = "cpu-compare",
.handler = cmdCPUCompare,
.opts = opts_cpu_compare,
diff --git a/tools/virsh-host.c b/tools/virsh-host.c
index 1d1bb97..b69de7c 100644
--- a/tools/virsh-host.c
+++ b/tools/virsh-host.c
@@ -35,6 +35,7 @@
#include "virbuffer.h"
#include "viralloc.h"
#include "virsh-domain.h"
+#include "virfile.h"
#include "virxml.h"
#include "virtypedparam.h"
#include "virstring.h"
@@ -193,6 +194,116 @@ cleanup:
}
/*
+ * "cpu-baseline" command
+ */
+static const vshCmdInfo info_cpu_baseline[] = {
+ {.name = "help",
+ .data = N_("compute baseline CPU")
+ },
+ {.name = "desc",
+ .data = N_("Compute baseline CPU for a set of given CPUs.")
+ },
+ {.name = NULL}
+};
+
+static const vshCmdOptDef opts_cpu_baseline[] = {
+ {.name = "file",
+ .type = VSH_OT_DATA,
+ .flags = VSH_OFLAG_REQ,
+ .help = N_("file containing XML CPU descriptions")
+ },
+ {.name = "features",
+ .type = VSH_OT_BOOL,
+ .help = N_("Show features that are part of the CPU model type")
+ },
+ {.name = NULL}
+};
+
+static bool
+cmdCPUBaseline(vshControl *ctl, const vshCmd *cmd)
+{
+ const char *from = NULL;
+ bool ret = false;
+ char *buffer;
+ char *result = NULL;
+ char **list = NULL;
+ unsigned int flags = 0;
+ int count = 0;
+
+ xmlDocPtr xml = NULL;
+ xmlNodePtr *node_list = NULL;
+ xmlXPathContextPtr ctxt = NULL;
+ virBuffer buf = VIR_BUFFER_INITIALIZER;
+ size_t i;
+
+ if (vshCommandOptBool(cmd, "features"))
+ flags |= VIR_CONNECT_BASELINE_CPU_EXPAND_FEATURES;
+
+ if (vshCommandOptStringReq(ctl, cmd, "file", &from) < 0)
+ return false;
+
+ if (virFileReadAll(from, VSH_MAX_XML_FILE, &buffer) < 0)
+ return false;
+
+ /* add a separate container around the xml */
+ virBufferStrcat(&buf, "<container>", buffer, "</container>", NULL);
+ if (virBufferError(&buf))
+ goto no_memory;
+
+ VIR_FREE(buffer);
+ buffer = virBufferContentAndReset(&buf);
+
+
+ if (!(xml = virXMLParseStringCtxt(buffer, from, &ctxt)))
+ goto cleanup;
+
+ if ((count = virXPathNodeSet("//cpu[not(ancestor::cpus)]",
+ ctxt, &node_list)) == -1)
+ goto cleanup;
+
+ if (count == 0) {
+ vshError(ctl, _("No host CPU specified in '%s'"), from);
+ goto cleanup;
+ }
+
+ list = vshCalloc(ctl, count, sizeof(const char *));
+
+ for (i = 0; i < count; i++) {
+ if (!(list[i] = virXMLNodeToString(xml, node_list[i]))) {
+ vshSaveLibvirtError();
+ goto cleanup;
+ }
+ }
+
+ result = virConnectBaselineCPU(ctl->conn,
+ (const char **)list, count, flags);
+
+ if (result) {
+ vshPrint(ctl, "%s", result);
+ ret = true;
+ }
+
+cleanup:
+ xmlXPathFreeContext(ctxt);
+ xmlFreeDoc(xml);
+ VIR_FREE(result);
+ if (list != NULL && count > 0) {
+ for (i = 0; i < count; i++)
+ VIR_FREE(list[i]);
+ }
+ VIR_FREE(list);
+ VIR_FREE(buffer);
+ VIR_FREE(node_list);
+
+ return ret;
+
+no_memory:
+ vshError(ctl, "%s", _("Out of memory"));
+ ret = false;
+ goto cleanup;
+}
+
+/*
* "maxvcpus" command
*/
static const vshCmdInfo info_maxvcpus[] = {
@@ -955,6 +1066,12 @@ const vshCmdDef hostAndHypervisorCmds[] = {
.info = info_hostname,
.flags = 0
},
+ {.name = "cpu-baseline",
+ .handler = cmdCPUBaseline,
+ .opts = opts_cpu_baseline,
+ .info = info_cpu_baseline,
+ .flags = 0
+ },
{.name = "maxvcpus",
.handler = cmdMaxvcpus,
.opts = opts_maxvcpus,
--
1.7.1
11 years