[libvirt] [RFC PATCH v2 0/3] Start fixing the pvpanic mess

The pvpanic mess is even bigger than anticipated. Let's fix the monitor's behavior (patch 1), get rid of all traces that the broken pvpanic existed (patch 2), and give it a new name so that libvirt can detect a design that works (patch 3). All downstreams are urged to apply patches 1+2 as soon as they are merged in QEMU. Still, there are still other problems to solve. In QEMU, exposing "-device isa-pvpanic" in the ACPI tables. Quite frankly I don't have the time to fix this. We have ~3 months though. Patch 3 should not be applied until it is fixed. Also, libvirt needs to know under which circumstances to add "-device isa-pvpanic", besides obviously the availability of the device. IMHO it is just too complicated to retrofit all complications in <on_crash>. In fact, I suspect <on_crash> would match more closely QEMU's "internal error" state, and it would be quite useful to add that to the QEMU driver. Thus, libvirt could add support for an <on_panic> element with the following values: - default, which depends on the hypervisor and is "crash" for Xen, "ignore" for everything else - "crash", which is the only supported value on Xen and means "use the value of <on_crash>" - ignore, which is the behavior of old libvirt that don't know pvpanic - restart, destroy, preserve, rename-restart as for <on_crash>. In particular, preserve can still use the crashed_guest_panicked status, thus keeping the API compatible. - pause, to pause the VM (with a new substatus) <on_panic> would also have a coredump='yes/no' attribute, where yes is only valid if the element is *not* one of default/crash/ignore. Only "default" and "ignore" would be accepted for QEMU that does not support the isa-pvpanic device. Ideally, the coredump attribute would be added to <on_crash> too for simplicity (Xen would refuse coredump + rename-restart and coredump + preserve). Paolo Paolo Bonzini (3): vl: allow "cont" from panicked state pc: get rid of builtin pvpanic pvpanic: rename to isa-pvpanic gdbstub.c | 3 --- hw/i386/pc_piix.c | 8 -------- hw/i386/pc_q35.c | 6 ------ hw/misc/pvpanic.c | 16 +++------------- include/hw/i386/pc.h | 3 --- vl.c | 6 ++---- 6 files changed, 5 insertions(+), 37 deletions(-) -- 1.8.3.1

After reporting the GUEST_PANICKED monitor event, QEMU stops the VM. The reason for this is that events are edge-triggered, and can be lost if management dies at the wrong time. Stopping a panicked VM lets management know of a panic even if it has crashed; management can learn about the panic when it restarts and queries running QEMU processes. The downside is of course that the VM will be paused while management is not running, but that is acceptable if it only happens with explicit "-device pvpanic". Upon learning of a panic, management (if configured to do so) can pick a variety of behaviors: leave the VM paused, reset it, destroy it. In addition to all of these behaviors, it is possible to dump the VM core from the host. However, right now, the panicked state is irreversible, and can only be exited by resetting the machine. This means that any policy decision is entirely in the hands of the host. In particular there is no way to use the "reboot on panic" option together with pvpanic. This patch makes the panicked state reversible (and removes various workarounds that were there because of the state being irreversible). With this change, management has a wider set of possible policies: it can just log the crash and leave policy to the guest, it can leave the VM paused. In particular, the "log the crash and continue" is implemented simply by sending a "cont" as soon as management learns about the panic. Management could also implement the "irreversible paused state" itself. And again, all such actions can be coupled with dumping the VM core. Unfortunately we cannot change the behavior of 1.6.0. Thus, even if it uses "-device pvpanic", management should check for "cont" failures. If "cont" fails, management can then log that the VM remained paused and urge the administrator to update QEMU. Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- gdbstub.c | 3 --- vl.c | 6 ++---- 2 files changed, 2 insertions(+), 7 deletions(-) diff --git a/gdbstub.c b/gdbstub.c index 35ca7c2..747e67d 100644 --- a/gdbstub.c +++ b/gdbstub.c @@ -372,9 +372,6 @@ static inline void gdb_continue(GDBState *s) #ifdef CONFIG_USER_ONLY s->running_state = 1; #else - if (runstate_check(RUN_STATE_GUEST_PANICKED)) { - runstate_set(RUN_STATE_DEBUG); - } if (!runstate_needs_reset()) { vm_start(); } diff --git a/vl.c b/vl.c index 25b8f2f..818d99e 100644 --- a/vl.c +++ b/vl.c @@ -637,9 +637,8 @@ static const RunStateTransition runstate_transitions_def[] = { { RUN_STATE_WATCHDOG, RUN_STATE_RUNNING }, { RUN_STATE_WATCHDOG, RUN_STATE_FINISH_MIGRATE }, - { RUN_STATE_GUEST_PANICKED, RUN_STATE_PAUSED }, + { RUN_STATE_GUEST_PANICKED, RUN_STATE_RUNNING }, { RUN_STATE_GUEST_PANICKED, RUN_STATE_FINISH_MIGRATE }, - { RUN_STATE_GUEST_PANICKED, RUN_STATE_DEBUG }, { RUN_STATE_MAX, RUN_STATE_MAX }, }; @@ -685,8 +684,7 @@ int runstate_is_running(void) bool runstate_needs_reset(void) { return runstate_check(RUN_STATE_INTERNAL_ERROR) || - runstate_check(RUN_STATE_SHUTDOWN) || - runstate_check(RUN_STATE_GUEST_PANICKED); + runstate_check(RUN_STATE_SHUTDOWN); } StatusInfo *qmp_query_status(Error **errp) -- 1.8.3.1

It is a source of pain, and the previous patch anyway changed the behavior of "-M pc-1.5" compared to the real 1.5. This also makes it clear that "-device pvpanic" is not enough: it will not expose pvpanic in fw_cfg properly. No idea how to fix that. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- hw/i386/pc_piix.c | 8 -------- hw/i386/pc_q35.c | 6 ------ hw/misc/pvpanic.c | 14 ++------------ include/hw/i386/pc.h | 3 --- 4 files changed, 2 insertions(+), 29 deletions(-) diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index b58c255..b80f9a3 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -56,7 +56,6 @@ static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 }; static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 }; static const int ide_irq[MAX_IDE_BUS] = { 14, 15 }; -static bool has_pvpanic = true; static bool has_pci_info = true; /* PC hardware initialisation */ @@ -240,10 +239,6 @@ static void pc_init1(MemoryRegion *system_memory, if (pci_enabled) { pc_pci_device_init(pci_bus); } - - if (has_pvpanic) { - pvpanic_init(isa_bus); - } } static void pc_init_pci(QEMUMachineInitArgs *args) @@ -269,7 +264,6 @@ static void pc_init_pci_1_5(QEMUMachineInitArgs *args) static void pc_init_pci_1_4(QEMUMachineInitArgs *args) { - has_pvpanic = false; x86_cpu_compat_set_features("n270", FEAT_1_ECX, 0, CPUID_EXT_MOVBE); pc_init_pci_1_5(args); } @@ -302,7 +296,6 @@ static void pc_init_pci_no_kvmclock(QEMUMachineInitArgs *args) const char *kernel_cmdline = args->kernel_cmdline; const char *initrd_filename = args->initrd_filename; const char *boot_device = args->boot_device; - has_pvpanic = false; has_pci_info = false; disable_kvm_pv_eoi(); enable_compat_apic_id_mode(); @@ -321,7 +314,6 @@ static void pc_init_isa(QEMUMachineInitArgs *args) const char *kernel_cmdline = args->kernel_cmdline; const char *initrd_filename = args->initrd_filename; const char *boot_device = args->boot_device; - has_pvpanic = false; has_pci_info = false; if (cpu_model == NULL) cpu_model = "486"; diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index 0b1d2e3..fb403b8 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -46,7 +46,6 @@ /* ICH9 AHCI has 6 ports */ #define MAX_SATA_PORTS 6 -static bool has_pvpanic = true; static bool has_pci_info = true; /* PC hardware initialisation */ @@ -210,10 +209,6 @@ static void pc_q35_init(QEMUMachineInitArgs *args) if (pci_enabled) { pc_pci_device_init(host_bus); } - - if (has_pvpanic) { - pvpanic_init(isa_bus); - } } static void pc_q35_init_1_5(QEMUMachineInitArgs *args) @@ -224,7 +219,6 @@ static void pc_q35_init_1_5(QEMUMachineInitArgs *args) static void pc_q35_init_1_4(QEMUMachineInitArgs *args) { - has_pvpanic = false; x86_cpu_compat_set_features("n270", FEAT_1_ECX, 0, CPUID_EXT_MOVBE); pc_q35_init_1_5(args); } diff --git a/hw/misc/pvpanic.c b/hw/misc/pvpanic.c index 7bb49a5..1928cc9 100644 --- a/hw/misc/pvpanic.c +++ b/hw/misc/pvpanic.c @@ -101,7 +101,8 @@ static void pvpanic_isa_realizefn(DeviceState *dev, Error **errp) isa_register_ioport(d, &s->io, s->ioport); } -static void pvpanic_fw_cfg(ISADevice *dev, FWCfgState *fw_cfg) +static void __attribute__((unused)) pvpanic_fw_cfg(ISADevice *dev, + FWCfgState *fw_cfg) { PVPanicState *s = ISA_PVPANIC_DEVICE(dev); uint16_t *pvpanic_port = g_malloc(sizeof(*pvpanic_port)); @@ -111,17 +112,6 @@ static void pvpanic_fw_cfg(ISADevice *dev, FWCfgState *fw_cfg) sizeof(*pvpanic_port)); } -void pvpanic_init(ISABus *bus) -{ - ISADevice *dev; - FWCfgState *fw_cfg = fw_cfg_find(); - if (!fw_cfg) { - return; - } - dev = isa_create_simple (bus, TYPE_ISA_PVPANIC_DEVICE); - pvpanic_fw_cfg(dev, fw_cfg); -} - static Property pvpanic_isa_properties[] = { DEFINE_PROP_UINT16("ioport", PVPanicState, ioport, 0x505), DEFINE_PROP_END_OF_LIST(), diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h index 7fb97b0..064cc81 100644 --- a/include/hw/i386/pc.h +++ b/include/hw/i386/pc.h @@ -194,9 +194,6 @@ static inline bool isa_ne2000_init(ISABus *bus, int base, int irq, NICInfo *nd) /* pc_sysfw.c */ void pc_system_firmware_init(MemoryRegion *rom_memory); -/* pvpanic.c */ -void pvpanic_init(ISABus *bus); - /* e820 types */ #define E820_RAM 1 #define E820_RESERVED 2 -- 1.8.3.1

On Wed, Aug 21, 2013 at 06:43:15PM +0200, Paolo Bonzini wrote:
It is a source of pain, and the previous patch anyway changed the behavior of "-M pc-1.5" compared to the real 1.5.
This also makes it clear that "-device pvpanic" is not enough: it will not expose pvpanic in fw_cfg properly.
No idea how to fix that.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
That should be the first step. Make the device work properly. We'll discuss compatibility when that is clear.
--- hw/i386/pc_piix.c | 8 -------- hw/i386/pc_q35.c | 6 ------ hw/misc/pvpanic.c | 14 ++------------ include/hw/i386/pc.h | 3 --- 4 files changed, 2 insertions(+), 29 deletions(-)
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index b58c255..b80f9a3 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -56,7 +56,6 @@ static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 }; static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 }; static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
-static bool has_pvpanic = true; static bool has_pci_info = true;
/* PC hardware initialisation */ @@ -240,10 +239,6 @@ static void pc_init1(MemoryRegion *system_memory, if (pci_enabled) { pc_pci_device_init(pci_bus); } - - if (has_pvpanic) { - pvpanic_init(isa_bus); - } }
static void pc_init_pci(QEMUMachineInitArgs *args) @@ -269,7 +264,6 @@ static void pc_init_pci_1_5(QEMUMachineInitArgs *args)
static void pc_init_pci_1_4(QEMUMachineInitArgs *args) { - has_pvpanic = false; x86_cpu_compat_set_features("n270", FEAT_1_ECX, 0, CPUID_EXT_MOVBE); pc_init_pci_1_5(args); } @@ -302,7 +296,6 @@ static void pc_init_pci_no_kvmclock(QEMUMachineInitArgs *args) const char *kernel_cmdline = args->kernel_cmdline; const char *initrd_filename = args->initrd_filename; const char *boot_device = args->boot_device; - has_pvpanic = false; has_pci_info = false; disable_kvm_pv_eoi(); enable_compat_apic_id_mode(); @@ -321,7 +314,6 @@ static void pc_init_isa(QEMUMachineInitArgs *args) const char *kernel_cmdline = args->kernel_cmdline; const char *initrd_filename = args->initrd_filename; const char *boot_device = args->boot_device; - has_pvpanic = false; has_pci_info = false; if (cpu_model == NULL) cpu_model = "486"; diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index 0b1d2e3..fb403b8 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -46,7 +46,6 @@ /* ICH9 AHCI has 6 ports */ #define MAX_SATA_PORTS 6
-static bool has_pvpanic = true; static bool has_pci_info = true;
/* PC hardware initialisation */ @@ -210,10 +209,6 @@ static void pc_q35_init(QEMUMachineInitArgs *args) if (pci_enabled) { pc_pci_device_init(host_bus); } - - if (has_pvpanic) { - pvpanic_init(isa_bus); - } }
static void pc_q35_init_1_5(QEMUMachineInitArgs *args) @@ -224,7 +219,6 @@ static void pc_q35_init_1_5(QEMUMachineInitArgs *args)
static void pc_q35_init_1_4(QEMUMachineInitArgs *args) { - has_pvpanic = false; x86_cpu_compat_set_features("n270", FEAT_1_ECX, 0, CPUID_EXT_MOVBE); pc_q35_init_1_5(args); } diff --git a/hw/misc/pvpanic.c b/hw/misc/pvpanic.c index 7bb49a5..1928cc9 100644 --- a/hw/misc/pvpanic.c +++ b/hw/misc/pvpanic.c @@ -101,7 +101,8 @@ static void pvpanic_isa_realizefn(DeviceState *dev, Error **errp) isa_register_ioport(d, &s->io, s->ioport); }
-static void pvpanic_fw_cfg(ISADevice *dev, FWCfgState *fw_cfg) +static void __attribute__((unused)) pvpanic_fw_cfg(ISADevice *dev, + FWCfgState *fw_cfg) { PVPanicState *s = ISA_PVPANIC_DEVICE(dev); uint16_t *pvpanic_port = g_malloc(sizeof(*pvpanic_port)); @@ -111,17 +112,6 @@ static void pvpanic_fw_cfg(ISADevice *dev, FWCfgState *fw_cfg) sizeof(*pvpanic_port)); }
-void pvpanic_init(ISABus *bus) -{ - ISADevice *dev; - FWCfgState *fw_cfg = fw_cfg_find(); - if (!fw_cfg) { - return; - } - dev = isa_create_simple (bus, TYPE_ISA_PVPANIC_DEVICE); - pvpanic_fw_cfg(dev, fw_cfg); -} - static Property pvpanic_isa_properties[] = { DEFINE_PROP_UINT16("ioport", PVPanicState, ioport, 0x505), DEFINE_PROP_END_OF_LIST(), diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h index 7fb97b0..064cc81 100644 --- a/include/hw/i386/pc.h +++ b/include/hw/i386/pc.h @@ -194,9 +194,6 @@ static inline bool isa_ne2000_init(ISABus *bus, int base, int irq, NICInfo *nd) /* pc_sysfw.c */ void pc_system_firmware_init(MemoryRegion *rom_memory);
-/* pvpanic.c */ -void pvpanic_init(ISABus *bus); - /* e820 types */ #define E820_RAM 1 #define E820_RESERVED 2 -- 1.8.3.1

Il 21/08/2013 19:03, Michael S. Tsirkin ha scritto:
It is a source of pain, and the previous patch anyway changed the behavior of "-M pc-1.5" compared to the real 1.5.
This also makes it clear that "-device pvpanic" is not enough: it will not expose pvpanic in fw_cfg properly.
No idea how to fix that.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> That should be the first step. Make the device work properly. We'll discuss compatibility when that is clear.
Actually these patches are not based on origin/master (my mistake)... I haven't tested, rebasing should be enough if "-device pvpanic" was tested properly for 1.6. Paolo

Am 21.08.2013 19:02, schrieb Paolo Bonzini:
Il 21/08/2013 19:03, Michael S. Tsirkin ha scritto:
It is a source of pain, and the previous patch anyway changed the behavior of "-M pc-1.5" compared to the real 1.5.
This also makes it clear that "-device pvpanic" is not enough: it will not expose pvpanic in fw_cfg properly.
No idea how to fix that.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> That should be the first step. Make the device work properly. We'll discuss compatibility when that is clear.
Actually these patches are not based on origin/master (my mistake)... I haven't tested, rebasing should be enough if "-device pvpanic" was tested properly for 1.6.
I haven't tested myself but I remember seeing patches moving fw_cfg integration into the initfn/realizefn. Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

On Wed, Aug 21, 2013 at 08:03:58PM +0300, Michael S. Tsirkin wrote:
On Wed, Aug 21, 2013 at 06:43:15PM +0200, Paolo Bonzini wrote:
It is a source of pain, and the previous patch anyway changed the behavior of "-M pc-1.5" compared to the real 1.5.
This also makes it clear that "-device pvpanic" is not enough: it will not expose pvpanic in fw_cfg properly.
No idea how to fix that.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
That should be the first step. Make the device work properly. We'll discuss compatibility when that is clear.
So not nacking yet but I think it's too early to apply.
--- hw/i386/pc_piix.c | 8 -------- hw/i386/pc_q35.c | 6 ------ hw/misc/pvpanic.c | 14 ++------------ include/hw/i386/pc.h | 3 --- 4 files changed, 2 insertions(+), 29 deletions(-)
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index b58c255..b80f9a3 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -56,7 +56,6 @@ static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 }; static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 }; static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
-static bool has_pvpanic = true; static bool has_pci_info = true;
/* PC hardware initialisation */ @@ -240,10 +239,6 @@ static void pc_init1(MemoryRegion *system_memory, if (pci_enabled) { pc_pci_device_init(pci_bus); } - - if (has_pvpanic) { - pvpanic_init(isa_bus); - } }
static void pc_init_pci(QEMUMachineInitArgs *args) @@ -269,7 +264,6 @@ static void pc_init_pci_1_5(QEMUMachineInitArgs *args)
static void pc_init_pci_1_4(QEMUMachineInitArgs *args) { - has_pvpanic = false; x86_cpu_compat_set_features("n270", FEAT_1_ECX, 0, CPUID_EXT_MOVBE); pc_init_pci_1_5(args); } @@ -302,7 +296,6 @@ static void pc_init_pci_no_kvmclock(QEMUMachineInitArgs *args) const char *kernel_cmdline = args->kernel_cmdline; const char *initrd_filename = args->initrd_filename; const char *boot_device = args->boot_device; - has_pvpanic = false; has_pci_info = false; disable_kvm_pv_eoi(); enable_compat_apic_id_mode(); @@ -321,7 +314,6 @@ static void pc_init_isa(QEMUMachineInitArgs *args) const char *kernel_cmdline = args->kernel_cmdline; const char *initrd_filename = args->initrd_filename; const char *boot_device = args->boot_device; - has_pvpanic = false; has_pci_info = false; if (cpu_model == NULL) cpu_model = "486"; diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index 0b1d2e3..fb403b8 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -46,7 +46,6 @@ /* ICH9 AHCI has 6 ports */ #define MAX_SATA_PORTS 6
-static bool has_pvpanic = true; static bool has_pci_info = true;
/* PC hardware initialisation */ @@ -210,10 +209,6 @@ static void pc_q35_init(QEMUMachineInitArgs *args) if (pci_enabled) { pc_pci_device_init(host_bus); } - - if (has_pvpanic) { - pvpanic_init(isa_bus); - } }
static void pc_q35_init_1_5(QEMUMachineInitArgs *args) @@ -224,7 +219,6 @@ static void pc_q35_init_1_5(QEMUMachineInitArgs *args)
static void pc_q35_init_1_4(QEMUMachineInitArgs *args) { - has_pvpanic = false; x86_cpu_compat_set_features("n270", FEAT_1_ECX, 0, CPUID_EXT_MOVBE); pc_q35_init_1_5(args); } diff --git a/hw/misc/pvpanic.c b/hw/misc/pvpanic.c index 7bb49a5..1928cc9 100644 --- a/hw/misc/pvpanic.c +++ b/hw/misc/pvpanic.c @@ -101,7 +101,8 @@ static void pvpanic_isa_realizefn(DeviceState *dev, Error **errp) isa_register_ioport(d, &s->io, s->ioport); }
-static void pvpanic_fw_cfg(ISADevice *dev, FWCfgState *fw_cfg) +static void __attribute__((unused)) pvpanic_fw_cfg(ISADevice *dev, + FWCfgState *fw_cfg) { PVPanicState *s = ISA_PVPANIC_DEVICE(dev); uint16_t *pvpanic_port = g_malloc(sizeof(*pvpanic_port)); @@ -111,17 +112,6 @@ static void pvpanic_fw_cfg(ISADevice *dev, FWCfgState *fw_cfg) sizeof(*pvpanic_port)); }
-void pvpanic_init(ISABus *bus) -{ - ISADevice *dev; - FWCfgState *fw_cfg = fw_cfg_find(); - if (!fw_cfg) { - return; - } - dev = isa_create_simple (bus, TYPE_ISA_PVPANIC_DEVICE); - pvpanic_fw_cfg(dev, fw_cfg); -} - static Property pvpanic_isa_properties[] = { DEFINE_PROP_UINT16("ioport", PVPanicState, ioport, 0x505), DEFINE_PROP_END_OF_LIST(), diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h index 7fb97b0..064cc81 100644 --- a/include/hw/i386/pc.h +++ b/include/hw/i386/pc.h @@ -194,9 +194,6 @@ static inline bool isa_ne2000_init(ISABus *bus, int base, int irq, NICInfo *nd) /* pc_sysfw.c */ void pc_system_firmware_init(MemoryRegion *rom_memory);
-/* pvpanic.c */ -void pvpanic_init(ISABus *bus); - /* e820 types */ #define E820_RAM 1 #define E820_RESERVED 2 -- 1.8.3.1

On Wed, Aug 21, 2013 at 06:43:15PM +0200, Paolo Bonzini wrote:
It is a source of pain, and the previous patch anyway changed the behavior of "-M pc-1.5" compared to the real 1.5.
This also makes it clear that "-device pvpanic" is not enough: it will not expose pvpanic in fw_cfg properly.
No idea how to fix that.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Re: fixing that. I should stop arguing on the mailing list and post latest version of the ACPI generation patchset :). Once there we won't need a FW CFG entry anymore.
--- hw/i386/pc_piix.c | 8 -------- hw/i386/pc_q35.c | 6 ------ hw/misc/pvpanic.c | 14 ++------------ include/hw/i386/pc.h | 3 --- 4 files changed, 2 insertions(+), 29 deletions(-)
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index b58c255..b80f9a3 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -56,7 +56,6 @@ static const int ide_iobase[MAX_IDE_BUS] = { 0x1f0, 0x170 }; static const int ide_iobase2[MAX_IDE_BUS] = { 0x3f6, 0x376 }; static const int ide_irq[MAX_IDE_BUS] = { 14, 15 };
-static bool has_pvpanic = true; static bool has_pci_info = true;
/* PC hardware initialisation */ @@ -240,10 +239,6 @@ static void pc_init1(MemoryRegion *system_memory, if (pci_enabled) { pc_pci_device_init(pci_bus); } - - if (has_pvpanic) { - pvpanic_init(isa_bus); - } }
static void pc_init_pci(QEMUMachineInitArgs *args) @@ -269,7 +264,6 @@ static void pc_init_pci_1_5(QEMUMachineInitArgs *args)
static void pc_init_pci_1_4(QEMUMachineInitArgs *args) { - has_pvpanic = false; x86_cpu_compat_set_features("n270", FEAT_1_ECX, 0, CPUID_EXT_MOVBE); pc_init_pci_1_5(args); } @@ -302,7 +296,6 @@ static void pc_init_pci_no_kvmclock(QEMUMachineInitArgs *args) const char *kernel_cmdline = args->kernel_cmdline; const char *initrd_filename = args->initrd_filename; const char *boot_device = args->boot_device; - has_pvpanic = false; has_pci_info = false; disable_kvm_pv_eoi(); enable_compat_apic_id_mode(); @@ -321,7 +314,6 @@ static void pc_init_isa(QEMUMachineInitArgs *args) const char *kernel_cmdline = args->kernel_cmdline; const char *initrd_filename = args->initrd_filename; const char *boot_device = args->boot_device; - has_pvpanic = false; has_pci_info = false; if (cpu_model == NULL) cpu_model = "486"; diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c index 0b1d2e3..fb403b8 100644 --- a/hw/i386/pc_q35.c +++ b/hw/i386/pc_q35.c @@ -46,7 +46,6 @@ /* ICH9 AHCI has 6 ports */ #define MAX_SATA_PORTS 6
-static bool has_pvpanic = true; static bool has_pci_info = true;
/* PC hardware initialisation */ @@ -210,10 +209,6 @@ static void pc_q35_init(QEMUMachineInitArgs *args) if (pci_enabled) { pc_pci_device_init(host_bus); } - - if (has_pvpanic) { - pvpanic_init(isa_bus); - } }
static void pc_q35_init_1_5(QEMUMachineInitArgs *args) @@ -224,7 +219,6 @@ static void pc_q35_init_1_5(QEMUMachineInitArgs *args)
static void pc_q35_init_1_4(QEMUMachineInitArgs *args) { - has_pvpanic = false; x86_cpu_compat_set_features("n270", FEAT_1_ECX, 0, CPUID_EXT_MOVBE); pc_q35_init_1_5(args); } diff --git a/hw/misc/pvpanic.c b/hw/misc/pvpanic.c index 7bb49a5..1928cc9 100644 --- a/hw/misc/pvpanic.c +++ b/hw/misc/pvpanic.c @@ -101,7 +101,8 @@ static void pvpanic_isa_realizefn(DeviceState *dev, Error **errp) isa_register_ioport(d, &s->io, s->ioport); }
-static void pvpanic_fw_cfg(ISADevice *dev, FWCfgState *fw_cfg) +static void __attribute__((unused)) pvpanic_fw_cfg(ISADevice *dev, + FWCfgState *fw_cfg) { PVPanicState *s = ISA_PVPANIC_DEVICE(dev); uint16_t *pvpanic_port = g_malloc(sizeof(*pvpanic_port)); @@ -111,17 +112,6 @@ static void pvpanic_fw_cfg(ISADevice *dev, FWCfgState *fw_cfg) sizeof(*pvpanic_port)); }
-void pvpanic_init(ISABus *bus) -{ - ISADevice *dev; - FWCfgState *fw_cfg = fw_cfg_find(); - if (!fw_cfg) { - return; - } - dev = isa_create_simple (bus, TYPE_ISA_PVPANIC_DEVICE); - pvpanic_fw_cfg(dev, fw_cfg); -} - static Property pvpanic_isa_properties[] = { DEFINE_PROP_UINT16("ioport", PVPanicState, ioport, 0x505), DEFINE_PROP_END_OF_LIST(), diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h index 7fb97b0..064cc81 100644 --- a/include/hw/i386/pc.h +++ b/include/hw/i386/pc.h @@ -194,9 +194,6 @@ static inline bool isa_ne2000_init(ISABus *bus, int base, int irq, NICInfo *nd) /* pc_sysfw.c */ void pc_system_firmware_init(MemoryRegion *rom_memory);
-/* pvpanic.c */ -void pvpanic_init(ISABus *bus); - /* e820 types */ #define E820_RAM 1 #define E820_RESERVED 2 -- 1.8.3.1

The pvpanic situation is already messed up enough. Let us give our libvirt friends an easy indication that we have untied our side. Not-yet-signed-off-by: Paolo Bonzini <pbonzini@redhat.com> ... because we first have to determine how to expose the device's existence in the ACPI tables or in fw_cfg. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> --- hw/misc/pvpanic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/hw/misc/pvpanic.c b/hw/misc/pvpanic.c index 1928cc9..b53011d 100644 --- a/hw/misc/pvpanic.c +++ b/hw/misc/pvpanic.c @@ -27,7 +27,7 @@ /* The pv event value */ #define PVPANIC_PANICKED (1 << PVPANIC_F_PANICKED) -#define TYPE_ISA_PVPANIC_DEVICE "pvpanic" +#define TYPE_ISA_PVPANIC_DEVICE "isa-pvpanic" #define ISA_PVPANIC_DEVICE(obj) \ OBJECT_CHECK(PVPanicState, (obj), TYPE_ISA_PVPANIC_DEVICE) -- 1.8.3.1

On Wed, Aug 21, 2013 at 06:43:16PM +0200, Paolo Bonzini wrote:
The pvpanic situation is already messed up enough. Let us give our libvirt friends an easy indication that we have untied our side.
Not-yet-signed-off-by: Paolo Bonzini <pbonzini@redhat.com> ... because we first have to determine how to expose the device's existence in the ACPI tables or in fw_cfg.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
So it's isa-pvpanic meaning "I-am-sure-this-is-the-last-bug-pvpanic" If we feel there's need to give libvirt a way to do introspection into QEMU bugs, let's architect one. Randomly renaming devices in the vain hope it's the last major bug is not it. NACK
--- hw/misc/pvpanic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/hw/misc/pvpanic.c b/hw/misc/pvpanic.c index 1928cc9..b53011d 100644 --- a/hw/misc/pvpanic.c +++ b/hw/misc/pvpanic.c @@ -27,7 +27,7 @@ /* The pv event value */ #define PVPANIC_PANICKED (1 << PVPANIC_F_PANICKED)
-#define TYPE_ISA_PVPANIC_DEVICE "pvpanic" +#define TYPE_ISA_PVPANIC_DEVICE "isa-pvpanic" #define ISA_PVPANIC_DEVICE(obj) \ OBJECT_CHECK(PVPanicState, (obj), TYPE_ISA_PVPANIC_DEVICE)
-- 1.8.3.1

Il 21/08/2013 19:01, Michael S. Tsirkin ha scritto:
The pvpanic situation is already messed up enough. Let us give our libvirt friends an easy indication that we have untied our side.
Not-yet-signed-off-by: Paolo Bonzini <pbonzini@redhat.com> ... because we first have to determine how to expose the device's existence in the ACPI tables or in fw_cfg.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> So it's isa-pvpanic meaning "I-am-sure-this-is-the-last-bug-pvpanic"
More like "we tested pvpanic for more than 2 weeks and did not find anything that's utterly broken in the design". And more practically "you are sure there are no traces of builtin pvpanic; also, panicked state is reversible". Paolo
If we feel there's need to give libvirt a way to do introspection into QEMU bugs, let's architect one. Randomly renaming devices in the vain hope it's the last major bug is not it.

On Wed, Aug 21, 2013 at 07:01:39PM +0200, Paolo Bonzini wrote:
Il 21/08/2013 19:01, Michael S. Tsirkin ha scritto:
The pvpanic situation is already messed up enough. Let us give our libvirt friends an easy indication that we have untied our side.
Not-yet-signed-off-by: Paolo Bonzini <pbonzini@redhat.com> ... because we first have to determine how to expose the device's existence in the ACPI tables or in fw_cfg.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> So it's isa-pvpanic meaning "I-am-sure-this-is-the-last-bug-pvpanic"
More like "we tested pvpanic for more than 2 weeks and did not find anything that's utterly broken in the design".
And more practically "you are sure there are no traces of builtin pvpanic; also, panicked state is reversible".
Paolo
isa-pvpanic does not look like a sane way to say that. NACK
If we feel there's need to give libvirt a way to do introspection into QEMU bugs, let's architect one. Randomly renaming devices in the vain hope it's the last major bug is not it.

Il 21/08/2013 19:07, Michael S. Tsirkin ha scritto:
On Wed, Aug 21, 2013 at 07:01:39PM +0200, Paolo Bonzini wrote:
Il 21/08/2013 19:01, Michael S. Tsirkin ha scritto:
The pvpanic situation is already messed up enough. Let us give our libvirt friends an easy indication that we have untied our side.
Not-yet-signed-off-by: Paolo Bonzini <pbonzini@redhat.com> ... because we first have to determine how to expose the device's existence in the ACPI tables or in fw_cfg.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> So it's isa-pvpanic meaning "I-am-sure-this-is-the-last-bug-pvpanic"
More like "we tested pvpanic for more than 2 weeks and did not find anything that's utterly broken in the design".
And more practically "you are sure there are no traces of builtin pvpanic; also, panicked state is reversible".
isa-pvpanic does not look like a sane way to say that.
NACK
You know that a single developer's NACK counts nothing (it can be you, it can be me), don't you? Paolo

On Wed, Aug 21, 2013 at 07:06:21PM +0200, Paolo Bonzini wrote:
Il 21/08/2013 19:07, Michael S. Tsirkin ha scritto:
On Wed, Aug 21, 2013 at 07:01:39PM +0200, Paolo Bonzini wrote:
Il 21/08/2013 19:01, Michael S. Tsirkin ha scritto:
The pvpanic situation is already messed up enough. Let us give our libvirt friends an easy indication that we have untied our side.
Not-yet-signed-off-by: Paolo Bonzini <pbonzini@redhat.com> ... because we first have to determine how to expose the device's existence in the ACPI tables or in fw_cfg.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> So it's isa-pvpanic meaning "I-am-sure-this-is-the-last-bug-pvpanic"
More like "we tested pvpanic for more than 2 weeks and did not find anything that's utterly broken in the design".
And more practically "you are sure there are no traces of builtin pvpanic; also, panicked state is reversible".
isa-pvpanic does not look like a sane way to say that.
NACK
You know that a single developer's NACK counts nothing (it can be you, it can be me), don't you?
Paolo
No I don't. -- MST

On 08/21/13 19:06, Paolo Bonzini wrote:
Il 21/08/2013 19:07, Michael S. Tsirkin ha scritto:
NACK
You know that a single developer's NACK counts nothing (it can be you, it can be me), don't you?
going meta... What's this? All I know (... I think I know) about patch acceptance is that Anthony prefers to have at least one R-b. As far as I've seen this is not a hard requirement (for example, maintainers sometimes send unreviewed patches in a pull request, and on occasion they are merged). No words have been spent on NAKs yet (... since my subscription, that is). Is this stuff formalized somewhere? Sorry for wasting time... Thanks, Laszlo

Il 22/08/2013 14:43, Laszlo Ersek ha scritto:
On 08/21/13 19:06, Paolo Bonzini wrote:
Il 21/08/2013 19:07, Michael S. Tsirkin ha scritto:
NACK
You know that a single developer's NACK counts nothing (it can be you, it can be me), don't you?
going meta...
What's this?
All I know (... I think I know) about patch acceptance is that Anthony prefers to have at least one R-b. As far as I've seen this is not a hard requirement (for example, maintainers sometimes send unreviewed patches in a pull request, and on occasion they are merged).
No words have been spent on NAKs yet (... since my subscription, that is). Is this stuff formalized somewhere?
Sorry for wasting time...
No, it's not. But for example I NACKed removal of pvpanic from 1.6, it was overridden, and I didn't complain too much. Paolo

On Thu, Aug 22, 2013 at 02:41:51PM +0200, Paolo Bonzini wrote:
Il 22/08/2013 14:43, Laszlo Ersek ha scritto:
On 08/21/13 19:06, Paolo Bonzini wrote:
Il 21/08/2013 19:07, Michael S. Tsirkin ha scritto:
NACK
You know that a single developer's NACK counts nothing (it can be you, it can be me), don't you?
going meta...
What's this?
All I know (... I think I know) about patch acceptance is that Anthony prefers to have at least one R-b. As far as I've seen this is not a hard requirement (for example, maintainers sometimes send unreviewed patches in a pull request, and on occasion they are merged).
No words have been spent on NAKs yet (... since my subscription, that is). Is this stuff formalized somewhere?
Sorry for wasting time...
No, it's not. But for example I NACKed removal of pvpanic from 1.6, it was overridden, and I didn't complain too much.
Paolo
I don't think it was overridden. In fact you NACKed an explicit -device pvpanic. You suggested disabling in 1.6 but keeping it a builtin, but this was never implemented, afterwards issues with Linux guests surfaced, we discussed this again on the KVM call, and there seemed to be a concensus that it's an OK patch, with some issues. A week later Marcel sent v2, it worked and looked like the least problematic path to take. -- MST

Laszlo Ersek <lersek@redhat.com> writes:
On 08/21/13 19:06, Paolo Bonzini wrote:
Il 21/08/2013 19:07, Michael S. Tsirkin ha scritto:
NACK
You know that a single developer's NACK counts nothing (it can be you, it can be me), don't you?
going meta...
What's this?
All I know (... I think I know) about patch acceptance is that Anthony prefers to have at least one R-b. As far as I've seen this is not a hard requirement (for example, maintainers sometimes send unreviewed patches in a pull request, and on occasion they are merged).
I look very poorly on anyone nacking anything. I value constructive feedback. Nacking does not add any value to the conversation. I admire the fact that we've been able to maintain a very high level of conversation over the years on qemu-devel and throwing around nacks just lowers the overall tone. If you can't think of anything better to say than NACK, don't even bother sending the email in the first place. Regards, Anthony Liguori
No words have been spent on NAKs yet (... since my subscription, that is). Is this stuff formalized somewhere?
Sorry for wasting time...
Thanks, Laszlo

On Thu, Aug 22, 2013 at 11:50:43AM -0500, Anthony Liguori wrote:
Laszlo Ersek <lersek@redhat.com> writes:
On 08/21/13 19:06, Paolo Bonzini wrote:
Il 21/08/2013 19:07, Michael S. Tsirkin ha scritto:
NACK
You know that a single developer's NACK counts nothing (it can be you, it can be me), don't you?
going meta...
What's this?
All I know (... I think I know) about patch acceptance is that Anthony prefers to have at least one R-b. As far as I've seen this is not a hard requirement (for example, maintainers sometimes send unreviewed patches in a pull request, and on occasion they are merged).
I look very poorly on anyone nacking anything. I value constructive feedback. Nacking does not add any value to the conversation. I admire the fact that we've been able to maintain a very high level of conversation over the years on qemu-devel and throwing around nacks just lowers the overall tone.
In that case, what's a good way to clarify that one is opposed to the idea, not the implementation? We have Acked-by: versus Reviewed-by: on the positive side, and I was looking for something like this on the negative side.
If you can't think of anything better to say than NACK, don't even bother sending the email in the first place.
I did add motivation too, it was snipped in the response.
Regards,
Anthony Liguori
No words have been spent on NAKs yet (... since my subscription, that is). Is this stuff formalized somewhere?
Sorry for wasting time...
Thanks, Laszlo

Am 21.08.2013 19:01, schrieb Michael S. Tsirkin:
On Wed, Aug 21, 2013 at 06:43:16PM +0200, Paolo Bonzini wrote:
The pvpanic situation is already messed up enough. Let us give our libvirt friends an easy indication that we have untied our side.
Not-yet-signed-off-by: Paolo Bonzini <pbonzini@redhat.com> ... because we first have to determine how to expose the device's existence in the ACPI tables or in fw_cfg.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
So it's isa-pvpanic meaning "I-am-sure-this-is-the-last-bug-pvpanic"
If we feel there's need to give libvirt a way to do introspection into QEMU bugs, let's architect one. Randomly renaming devices in the vain hope it's the last major bug is not it.
NACK
Seconded. While we shouldn't rule out renaming devices, doing so as a criteria for libvirt sounds utterly wrong. Paolo, you are right that a single "NACK" cannot be a criteria, but a single convincing justification by a random reviewer should be sufficient to reconsider. :) Adding some device property or obtaining the info via some existing query-* QMP command might be better alternatives. Andreas -- SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer; HRB 16746 AG Nürnberg

Il 21/08/2013 19:35, Andreas Färber ha scritto:
Am 21.08.2013 19:01, schrieb Michael S. Tsirkin:
On Wed, Aug 21, 2013 at 06:43:16PM +0200, Paolo Bonzini wrote:
The pvpanic situation is already messed up enough. Let us give our libvirt friends an easy indication that we have untied our side.
Not-yet-signed-off-by: Paolo Bonzini <pbonzini@redhat.com> ... because we first have to determine how to expose the device's existence in the ACPI tables or in fw_cfg.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
So it's isa-pvpanic meaning "I-am-sure-this-is-the-last-bug-pvpanic"
If we feel there's need to give libvirt a way to do introspection into QEMU bugs, let's architect one. Randomly renaming devices in the vain hope it's the last major bug is not it.
NACK
Seconded. While we shouldn't rule out renaming devices, doing so as a criteria for libvirt sounds utterly wrong.
Paolo, you are right that a single "NACK" cannot be a criteria, but a single convincing justification by a random reviewer should be sufficient to reconsider. :)
Adding some device property or obtaining the info via some existing query-* QMP command might be better alternatives.
Device properties are a good way to communicate changes to the guest, but here the guest ABI is unchanged. Anyhow, this patch is not strictly necessary. It's just a safety net to avoid that libvirt uses a buggy device on buggy versions of QEMU. Paolo

On Wed, Aug 21, 2013 at 06:43:13PM +0200, Paolo Bonzini wrote:
The pvpanic mess is even bigger than anticipated. Let's fix the monitor's behavior (patch 1), get rid of all traces that the broken pvpanic existed (patch 2), and give it a new name so that libvirt can detect a design that works (patch 3).
All downstreams are urged to apply patches 1+2 as soon as they are merged in QEMU.
Still, there are still other problems to solve.
In QEMU, exposing "-device isa-pvpanic" in the ACPI tables. Quite frankly I don't have the time to fix this. We have ~3 months though. Patch 3 should not be applied until it is fixed.
Also, libvirt needs to know under which circumstances to add "-device isa-pvpanic", besides obviously the availability of the device.
IMHO it is just too complicated to retrofit all complications in <on_crash>. In fact, I suspect <on_crash> would match more closely QEMU's "internal error" state, and it would be quite useful to add that to the QEMU driver.
Thus, libvirt could add support for an <on_panic> element with the following values:
No, <on_crash> is the right thing to be using for this from libvirt's pov & I don't think we should invent something new. The <on_crash> element has always been intended to represent handling of guest panics, not qemu internal errors. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

Il 21/08/2013 18:48, Daniel P. Berrange ha scritto:
No, <on_crash> is the right thing to be using for this from libvirt's pov & I don't think we should invent something new. The <on_crash> element has always been intended to represent handling of guest panics, not qemu internal errors.
Actually for Xen HVM guests, it mostly traps things such as failed vmentries. The Xen PV-on-HVM drivers do not register a panic notifier that moves the guest to the "crashed" state. <on_crash> cannot be salvaged, in my opinion, because all domain XMLs in the wild will have a setting that causes libvirt to add "-device isa-pvpanic". Thus changing libvirt versions will change guest hardware, which is _very_ bad. In addition, Windows XP and 2003 will show the annoying device wizard upon a libvirt upgrade, and fixing this is what surfaced all the mess. Paolo

On Wed, Aug 21, 2013 at 06:51:11PM +0200, Paolo Bonzini wrote:
Il 21/08/2013 18:48, Daniel P. Berrange ha scritto:
No, <on_crash> is the right thing to be using for this from libvirt's pov & I don't think we should invent something new. The <on_crash> element has always been intended to represent handling of guest panics, not qemu internal errors.
Actually for Xen HVM guests, it mostly traps things such as failed vmentries. The Xen PV-on-HVM drivers do not register a panic notifier that moves the guest to the "crashed" state.
<on_crash> cannot be salvaged, in my opinion, because all domain XMLs in the wild will have a setting that causes libvirt to add "-device isa-pvpanic". Thus changing libvirt versions will change guest hardware, which is _very_ bad.
In addition, Windows XP and 2003 will show the annoying device wizard upon a libvirt upgrade, and fixing this is what surfaced all the mess.
The existance of a <on_crash> element should not be having any effect on what hardware we create. That is merely a lifecycle policy setting that should be completely independant of the guest device model. eg it is valid to have <on_crash> present in the XML at all times, even if there's no pvpanic device present. That simply means the actions will never be triggered. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

Il 21/08/2013 18:55, Daniel P. Berrange ha scritto:
On Wed, Aug 21, 2013 at 06:51:11PM +0200, Paolo Bonzini wrote:
Il 21/08/2013 18:48, Daniel P. Berrange ha scritto:
No, <on_crash> is the right thing to be using for this from libvirt's pov & I don't think we should invent something new. The <on_crash> element has always been intended to represent handling of guest panics, not qemu internal errors.
Actually for Xen HVM guests, it mostly traps things such as failed vmentries. The Xen PV-on-HVM drivers do not register a panic notifier that moves the guest to the "crashed" state.
<on_crash> cannot be salvaged, in my opinion, because all domain XMLs in the wild will have a setting that causes libvirt to add "-device isa-pvpanic". Thus changing libvirt versions will change guest hardware, which is _very_ bad.
In addition, Windows XP and 2003 will show the annoying device wizard upon a libvirt upgrade, and fixing this is what surfaced all the mess.
The existance of a <on_crash> element should not be having any effect on what hardware we create. That is merely a lifecycle policy setting that should be completely independant of the guest device model.
eg it is valid to have <on_crash> present in the XML at all times, even if there's no pvpanic device present. That simply means the actions will never be triggered.
So are you suggesting to add a <pvpanic/> element to <devices>? That may be fine, but it doesn't seem very user-friendly. Paolo

On 08/21/2013 10:56 AM, Paolo Bonzini wrote:
eg it is valid to have <on_crash> present in the XML at all times, even if there's no pvpanic device present. That simply means the actions will never be triggered.
So are you suggesting to add a <pvpanic/> element to <devices>? That may be fine, but it doesn't seem very user-friendly.
No less friendly than having a <memballoon> device, and certainly automatable via libosinfo integration into virt-install. -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Il 21/08/2013 19:10, Eric Blake ha scritto:
On 08/21/2013 10:56 AM, Paolo Bonzini wrote:
eg it is valid to have <on_crash> present in the XML at all times, even if there's no pvpanic device present. That simply means the actions will never be triggered.
So are you suggesting to add a <pvpanic/> element to <devices>? That may be fine, but it doesn't seem very user-friendly.
No less friendly than having a <memballoon> device, and certainly automatable via libosinfo integration into virt-install.
Fair enough. Still I'm not sure that a new <on_panic> element should be ruled out, see my other message. Paolo -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQIcBAEBAgAGBQJSFPTPAAoJEBvWZb6bTYby4bkP/RKM2mycCtwhdZIZLJU2fWCf F7pcmD+uRqmBc5VtVh1K423ozo42lUcdb+erIzkIwH2sPVsiGNoue1IPRgHs1Azy Fq06uESPTVAroudFsFAYj8XtG6rH4UqfKFeym4RK97fMH87FLjRLX8bzm1R/mdw8 WWJ1TIspJ0tJceSewRVO2F12OjKC5AIIA36BTUH85dR2P64mJX/aZvhkQzTUn4ii Tp/+661RwyfRfgK2zY/gl6sh0qRp4i3Q3sdElKyPwDkFh6Z86XE6zTJTxJRMRfvl tqVg5elZeNHkgoBtAr4azGpn1j+mxuJ0dNmLJLC1/tyGCZojzL9C4Cc/gflJLGSP gW9VqBG7L8FRpwcCVdeoCMHFoZh2N39grAl7vkrqHf0zYR0oZidJ98MvCNwxZA4m YCf8gFMLx0vkEwa8BB4YexMzn5DCbQmSOuoJSd7XX3lYlb6mJudkQh+d+OXpYV8i bVDM9tZqUlejki9h9ZuS/sYIl0T4Aaa0QLYPMOvyl+zbLxfw0kmLI7VpFaUyxaWh Oz/fGL5A+lI2BT2SLCdphmeTj5dQ4R9a2nqMYAmiOZmFAYJvMRzDmuooJ65KPPvx xSsCbJhZoTp9JQz1nslWRcex7MCRth9BrL7wVu8hgb6RAw+lBkbpTeyQ7em5EjCb h7xMmCjeBr7wouhnAz3p =jYpU -----END PGP SIGNATURE-----

On Wed, Aug 21, 2013 at 06:56:52PM +0200, Paolo Bonzini wrote:
Il 21/08/2013 18:55, Daniel P. Berrange ha scritto:
On Wed, Aug 21, 2013 at 06:51:11PM +0200, Paolo Bonzini wrote:
Il 21/08/2013 18:48, Daniel P. Berrange ha scritto:
No, <on_crash> is the right thing to be using for this from libvirt's pov & I don't think we should invent something new. The <on_crash> element has always been intended to represent handling of guest panics, not qemu internal errors.
Actually for Xen HVM guests, it mostly traps things such as failed vmentries. The Xen PV-on-HVM drivers do not register a panic notifier that moves the guest to the "crashed" state.
<on_crash> cannot be salvaged, in my opinion, because all domain XMLs in the wild will have a setting that causes libvirt to add "-device isa-pvpanic". Thus changing libvirt versions will change guest hardware, which is _very_ bad.
In addition, Windows XP and 2003 will show the annoying device wizard upon a libvirt upgrade, and fixing this is what surfaced all the mess.
The existance of a <on_crash> element should not be having any effect on what hardware we create. That is merely a lifecycle policy setting that should be completely independant of the guest device model.
eg it is valid to have <on_crash> present in the XML at all times, even if there's no pvpanic device present. That simply means the actions will never be triggered.
So are you suggesting to add a <pvpanic/> element to <devices>? That may be fine, but it doesn't seem very user-friendly.
Yes, if we're going to have pvpanic be user controllable, it must be via an explicit device element. None of the <on_XXXX> elements should have any impact on guest ABI model. They're purely lifecycle policy settings. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 08/21/2013 10:51 AM, Paolo Bonzini wrote:
Il 21/08/2013 18:48, Daniel P. Berrange ha scritto:
No, <on_crash> is the right thing to be using for this from libvirt's pov & I don't think we should invent something new. The <on_crash> element has always been intended to represent handling of guest panics, not qemu internal errors.
Actually for Xen HVM guests, it mostly traps things such as failed vmentries. The Xen PV-on-HVM drivers do not register a panic notifier that moves the guest to the "crashed" state.
<on_crash> cannot be salvaged, in my opinion, because all domain XMLs in the wild will have a setting that causes libvirt to add "-device isa-pvpanic". Thus changing libvirt versions will change guest hardware, which is _very_ bad.
Let's expand on that statement: Libvirt's default for <on_crash> is 'destroy'. But virt-install (and thus virt-manager) have been setting explicit 'restart' for AGES now. Arguably, this is YET ANOTHER reason why virt-manager should be using libosinfo to make sane choices about new guest XML, based on known capabilities of the guest it will be installing. But that only affects newly created guests after we fix the virt stack. In the meantime, you have a point that we have a back-compat mess - we promise ABI stability (guests shall not see hardware changes when upgrading versions of libvirtd but leaving the XML unchanged - the only way to change hardware seen by an existing guest is to explicitly modify XML).
In addition, Windows XP and 2003 will show the annoying device wizard upon a libvirt upgrade, and fixing this is what surfaced all the mess.
Yes, so we need the back-compat code to leave pvpanic out of pre-existing guests, if we can find a way to sensibly do that. So, this boils down to a question of what SHOULD the valid states for <on_crash> be? Generically, we want <on_crash>destroy</on_crash> to not invalidate a guest, but also to not instantiate a pvpanic device; since that covers the libvirt defaults. We also want <on_crash>restart</on_crash> to not invalidate a guest, but also to not instantiate a pvpanic device, since so many existing guests have that setting thanks to virt-install. Maybe that means we add attributes/sub-elements to <on_crash> that express whether pvpanic device is permitted; and the absence of that attribute means the status quo (the <on_crash> tag is effectively ignored because without pvpanic device, there is no way for libvirt to learn if a guest panicked). Or does it mean we expose a new sub-element of <devices>, similar to how we have a <memballoon> subelement that controls whether the memballoon device is show to the guest, and just document that for qemu, <on_crash> is a no-op without the <pvpanic> subelement? -- Eric Blake eblake redhat com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Il 21/08/2013 19:02, Eric Blake ha scritto:
So, this boils down to a question of what SHOULD the valid states for <on_crash> be? Generically, we want <on_crash>destroy</on_crash> to not invalidate a guest, but also to not instantiate a pvpanic device; since that covers the libvirt defaults. We also want <on_crash>restart</on_crash> to not invalidate a guest, but also to not instantiate a pvpanic device, since so many existing guests have that setting thanks to virt-install.
Maybe that means we add attributes/sub-elements to <on_crash> that express whether pvpanic device is permitted; and the absence of that attribute means the status quo (the <on_crash> tag is effectively ignored because without pvpanic device, there is no way for libvirt to learn if a guest panicked). Or does it mean we expose a new sub-element of <devices>, similar to how we have a <memballoon> subelement that controls whether the memballoon device is show to the guest, and just document that for qemu, <on_crash> is a no-op without the <pvpanic> subelement?
Perhaps <panic_notifier bus='isa'/> is better? Note that for s390 <on_crash> works without <pvpanic>. Anyway yes, that's a possibility. More precisely, you could still use <on_crash> for internal errors even without having anything in <devices> (Xen conflates panics and internal errors). But then, "pause" and "ignore" are useful on-panic policies, yet they do not make sense for internal errors. Hence my suggestion to introduce a new element <on_panic>. We can still make <on_panic> a no-op without the appropriate element under <devices>. Paolo -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.19 (GNU/Linux) Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQIcBAEBAgAGBQJSFPSWAAoJEBvWZb6bTYbywvMP/3MKlED6Y69QM7Xt0aAKEtZp RmcKUt1m4MAuV91eBmN5/fv+ui0yKzrnZWz0mgF+V3eWCJdnEiewkGocetpx+Mw7 V4wuGVkYUWXV/O8A6x3thENOoxaHYO4OP8dUgkWQLGHXRNTljAd4iyVxBiIbITod fZvhVEbk8n9Mk3U61JxeMRB5PDjXRHcjgLgpR7htujpVBMTBFAsqxLzflsxsd/p7 UOZ0D3vk6m00DHdIzcJ5pc0dyqaylEaljs3Lf1MNbC/fN8I1sgrMWYMYnukU+moC GRKS6OB1ySeq2MkMwe73RimtE8M8MNmtVUKle94bmymPdGD3V+qKmBq6o7Hzd+b7 l8Rkht9gWP7Z4T22ZOYVqHpRXaDivkHuRCL2Va3BpyYv48Atyk/G77uB1uSaGTj+ ooRNoEqO61e49JOM3NiEYRI6Gl2YJ5O560j7dK59mQ6OIrLMtuN6Wo1ZiubdKCOa HB3e6qF0drAEQIBSdqQiU83F58ta634Rqy5R5kc1ad9cuiLMtUDPNbHlKsJtf+Th Dyu301fxt/1IIxPheoBQNVLRoXtAfoqpxM1nasjRphQqnULpnl7q7MipAclEUadQ Q7KE0YFPw38BYxl2FWhZOuTNI2kN921PGNiqYFFVpOWCf/aW/uqxU86LnJVSdvZs T9sRlLpVL4LTGHbFDooI =nCgT -----END PGP SIGNATURE-----

On Wed, Aug 21, 2013 at 11:02:56AM -0600, Eric Blake wrote:
On 08/21/2013 10:51 AM, Paolo Bonzini wrote:
Il 21/08/2013 18:48, Daniel P. Berrange ha scritto:
No, <on_crash> is the right thing to be using for this from libvirt's pov & I don't think we should invent something new. The <on_crash> element has always been intended to represent handling of guest panics, not qemu internal errors.
Actually for Xen HVM guests, it mostly traps things such as failed vmentries. The Xen PV-on-HVM drivers do not register a panic notifier that moves the guest to the "crashed" state.
<on_crash> cannot be salvaged, in my opinion, because all domain XMLs in the wild will have a setting that causes libvirt to add "-device isa-pvpanic". Thus changing libvirt versions will change guest hardware, which is _very_ bad.
Let's expand on that statement:
Libvirt's default for <on_crash> is 'destroy'. But virt-install (and thus virt-manager) have been setting explicit 'restart' for AGES now.
Arguably, this is YET ANOTHER reason why virt-manager should be using libosinfo to make sane choices about new guest XML, based on known capabilities of the guest it will be installing. But that only affects newly created guests after we fix the virt stack.
In the meantime, you have a point that we have a back-compat mess - we promise ABI stability (guests shall not see hardware changes when upgrading versions of libvirtd but leaving the XML unchanged - the only way to change hardware seen by an existing guest is to explicitly modify XML).
In addition, Windows XP and 2003 will show the annoying device wizard upon a libvirt upgrade, and fixing this is what surfaced all the mess.
Yes, so we need the back-compat code to leave pvpanic out of pre-existing guests, if we can find a way to sensibly do that.
So, this boils down to a question of what SHOULD the valid states for <on_crash> be? Generically, we want <on_crash>destroy</on_crash> to not invalidate a guest, but also to not instantiate a pvpanic device; since that covers the libvirt defaults. We also want <on_crash>restart</on_crash> to not invalidate a guest, but also to not instantiate a pvpanic device, since so many existing guests have that setting thanks to virt-install.
Maybe that means we add attributes/sub-elements to <on_crash> that express whether pvpanic device is permitted; and the absence of that attribute means the status quo (the <on_crash> tag is effectively ignored because without pvpanic device, there is no way for libvirt to learn if a guest panicked). Or does it mean we expose a new sub-element of <devices>, similar to how we have a <memballoon> subelement that controls whether the memballoon device is show to the guest, and just document that for qemu, <on_crash> is a no-op without the <pvpanic> subelement?
This is a QEMU bug that you happened to be Cc'd on. So you started worry about supporting a buggy QEMU. This is generally futile. There are uncounted bugs that we silently fixed. They are often much more major than this silly reversibility bug. Some bios versions have racy hotplug support so hotplug event can be missed. Should libvirt warn the user that bios is broken and suggest restarting guest to see the device? Some QEMU versions had a racy implementation of virtio that would corrupt guest memory. Should libvirt warn the user that virtio is broken and suggest switching to e1000 or upgrading QEMU? Some QEMU versions have buggy qcow2 that would corrupt disk. Should libvirt warn the user that qcow2 is broken and suggest switching to raw? Some kernels have buggy vhost drivers which would crash host. Should libvirt detect these and tell user to upgrade kernel or switch to userspace virtio? Some kernels have NIC drivers that brick hardware. Should libvirt detect these and tell user to upgrade kernel or switch to a different NIC? There are libc bugs, glib bugs .... Let's fix the bug in QEMU and move on. Working around them in libvirt is unnecessary. -- MST
participants (7)
-
Andreas Färber
-
Anthony Liguori
-
Daniel P. Berrange
-
Eric Blake
-
Laszlo Ersek
-
Michael S. Tsirkin
-
Paolo Bonzini