[libvirt] [PATCH 1/6] Allow multiple monitor devices (v2)

Right now only one monitor device can be enabled at a time. In order to support asynchronous notification of events, I would like to introduce a 'wait' command that waits for an event to occur. This implies that we need an additional monitor session to allow commands to still be executed while waiting for an asynchronous notification.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

diff --git a/vl.c b/vl.c
index 4bd173f..f78cabb 100644
--- a/vl.c
+++ b/vl.c
@@ -184,6 +184,9 @@ int main(int argc, char **argv)
 /* Max number of bluetooth switches on the commandline. */
 #define MAX_BT_CMDLINE 10
 
+/* Maximum number of monitor devices */
+#define MAX_MONITOR_DEVICES 10
+
 /* XXX: use a two level table to limit memory usage */
 #define MAX_IOPORTS 65536
 
@@ -4252,8 +4255,9 @@ int main(int argc, char **argv, char **envp)
     int hda_index;
     int optind;
     const char *r, *optarg;
-    CharDriverState *monitor_hd = NULL;
-    const char *monitor_device;
+    CharDriverState *monitor_hds[MAX_MONITOR_DEVICES];
+    const char *monitor_devices[MAX_MONITOR_DEVICES];
+    int monitor_device_index;
     const char *serial_devices[MAX_SERIAL_PORTS];
     int serial_device_index;
     const char *parallel_devices[MAX_PARALLEL_PORTS];
@@ -4324,7 +4328,6 @@ int main(int argc, char **argv, char **envp)
     kernel_cmdline = "";
     cyls = heads = secs = 0;
     translation = BIOS_ATA_TRANSLATION_AUTO;
-    monitor_device = "vc:80Cx24C";
 
     serial_devices[0] = "vc:80Cx24C";
     for(i = 1; i < MAX_SERIAL_PORTS; i++)
@@ -4340,6 +4343,11 @@ int main(int argc, char **argv, char **envp)
         virtio_consoles[i] = NULL;
     virtio_console_index = 0;
 
+    monitor_devices[0] = "vc:80Cx24C";
+    for (i = 1; i < MAX_MONITOR_DEVICES; i++)
+        monitor_devices[i] = NULL;
+    monitor_device_index = 0;
+
     usb_devices_index = 0;
 
     nb_net_clients = 0;
@@ -4723,7 +4731,12 @@ int main(int argc, char **argv, char **envp)
                 break;
             }
             case QEMU_OPTION_monitor:
-                monitor_device = optarg;
+                if (monitor_device_index >= MAX_MONITOR_DEVICES) {
+                    fprintf(stderr, "qemu: too many monitor devices\n");
+                    exit(1);
+                }
+                monitor_devices[monitor_device_index] = optarg;
+                monitor_device_index++;
                 break;
             case QEMU_OPTION_serial:
                 if (serial_device_index >= MAX_SERIAL_PORTS) {
@@ -4974,8 +4987,8 @@ int main(int argc, char **argv, char **envp)
             serial_devices[0] = "stdio";
         if (parallel_device_index == 0)
             parallel_devices[0] = "null";
-        if (strncmp(monitor_device, "vc", 2) == 0)
-            monitor_device = "stdio";
+        if (strncmp(monitor_devices[0], "vc", 2) == 0)
+            monitor_devices[0] = "stdio";
     }
 
 #ifndef _WIN32
@@ -5184,14 +5197,14 @@ int main(int argc, char **argv, char **envp)
 #endif
 
     /* Maintain compatibility with multiple stdio monitors */
-    if (!strcmp(monitor_device,"stdio")) {
+    if (!strcmp(monitor_devices[0],"stdio")) {
         for (i = 0; i < MAX_SERIAL_PORTS; i++) {
             const char *devname = serial_devices[i];
             if (devname && !strcmp(devname,"mon:stdio")) {
-                monitor_device = NULL;
+                monitor_devices[0] = NULL;
                 break;
             } else if (devname && !strcmp(devname,"stdio")) {
-                monitor_device = NULL;
+                monitor_devices[0] = NULL;
                 serial_devices[i] = "mon:stdio";
                 break;
             }
@@ -5208,11 +5221,20 @@ int main(int argc, char **argv, char **envp)
         }
     }
 
-    if (monitor_device) {
-        monitor_hd = qemu_chr_open("monitor", monitor_device, NULL);
-        if (!monitor_hd) {
-            fprintf(stderr, "qemu: could not open monitor device '%s'\n", monitor_device);
-            exit(1);
+    for (i = 0; i < MAX_MONITOR_DEVICES; i++) {
+        const char *devname = monitor_devices[i];
+        if (devname && strcmp(devname, "none")) {
+            char label[32];
+            if (i == 0)
+                snprintf(label, sizeof(label), "monitor");
+            else
+                snprintf(label, sizeof(label), "monitor%d", i);
+            monitor_hds[i] = qemu_chr_open(label, devname, NULL);
+            if (!monitor_hds[i]) {
+                fprintf(stderr, "qemu: could not open monitor device '%s'\n",
+                        devname);
+                exit(1);
+            }
         }
     }
 
@@ -5335,8 +5357,13 @@ int main(int argc, char **argv, char **envp)
     text_consoles_set_display(display_state);
     qemu_chr_initial_reset();
 
-    if (monitor_device && monitor_hd)
-        monitor_init(monitor_hd, MONITOR_USE_READLINE | MONITOR_IS_DEFAULT);
+    for (i = 0; i < MAX_MONITOR_DEVICES; i++) {
+        if (monitor_devices[i] && monitor_hds[i]) {
+            monitor_init(monitor_hds[i],
+                         MONITOR_USE_READLINE |
+                         ((i == 0) ? MONITOR_IS_DEFAULT : 0));
+        }
+    }
 
     for(i = 0; i < MAX_SERIAL_PORTS; i++) {
         const char *devname = serial_devices[i];

The wait command will pause the monitor the command was issued in until a new event becomes available. Events are queued if there isn't a waiter present. The wait command completes after a single event is available.

Today, we queue events indefinitely but in the future, I suspect we'll drop events that are older than a certain amount of time to avoid infinitely allocating memory for long running VMs.

To make use of the new notification mechanism, this patch introduces a qemu_notify_event() API. This API takes three parameters: a class, which is meant to classify the type of event being generated; a name, which is meant to distinguish which event in the class has been generated; and a details parameter, which is meant to allow events to send arbitrary data along with a given event.

v1->v2
 - Added a -d flag to poll for events instead of waiting
 - Made old events expire after 10 minutes

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

diff --git a/Makefile b/Makefile
index 633774e..b87870b 100644
--- a/Makefile
+++ b/Makefile
@@ -84,7 +84,7 @@ OBJS+=usb-serial.o usb-net.o
 OBJS+=sd.o ssi-sd.o
 OBJS+=bt.o bt-host.o bt-vhci.o bt-l2cap.o bt-sdp.o bt-hci.o bt-hid.o usb-bt.o
 OBJS+=buffered_file.o migration.o migration-tcp.o net.o qemu-sockets.o
-OBJS+=qemu-char.o aio.o net-checksum.o savevm.o cache-utils.o
+OBJS+=qemu-char.o aio.o net-checksum.o savevm.o cache-utils.o wait.o
 
 ifdef CONFIG_BRLAPI
 OBJS+= baum.o
diff --git a/monitor.c b/monitor.c
index ca1c11c..94c14b2 100644
--- a/monitor.c
+++ b/monitor.c
@@ -42,6 +42,7 @@
 #include "migration.h"
 #include "kvm.h"
 #include "acl.h"
+#include "wait.h"
 
 //#define DEBUG
 //#define DEBUG_COMPLETION
@@ -1745,6 +1746,8 @@ static const mon_cmd_t mon_cmds[] = {
       "acl allow vnc.username fred\n"
       "acl deny vnc.username bob\n"
       "acl reset vnc.username\n" },
+    { "wait", "-d", do_wait,
+      "[-d]", "wait for an asynchronous event (use -d to poll event)" },
     { NULL, NULL, },
 };
diff --git a/wait.c b/wait.c
new file mode 100644
index 0000000..8f6cbad
--- /dev/null
+++ b/wait.c
@@ -0,0 +1,145 @@
+/*
+ * Asynchronous monitor notification support
+ *
+ * Copyright IBM, Corp. 2009
+ *
+ * Authors:
+ *  Anthony Liguori <aliguori@us.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "monitor.h"
+#include "sys-queue.h"
+#include "osdep.h"
+#include "wait.h"
+
+typedef struct WaitEvent
+{
+    qemu_timeval timestamp;
+    char *class;
+    char *name;
+    char *details;
+    TAILQ_ENTRY(WaitEvent) node;
+} WaitEvent;
+
+typedef struct PendingWaiter
+{
+    Monitor *mon;
+    int polling;
+    TAILQ_ENTRY(PendingWaiter) node;
+} PendingWaiter;
+
+/* How long do we hold on to events (in seconds)
+ * default to 10 minutes.
+ */
+#define EVENT_EXPIRATION_AGE (10 * 60)
+
+static TAILQ_HEAD(, WaitEvent) pending_events =
+    TAILQ_HEAD_INITIALIZER(pending_events);
+static TAILQ_HEAD(, PendingWaiter) pending_waiters =
+    TAILQ_HEAD_INITIALIZER(pending_waiters);
+
+static void free_event(WaitEvent *e)
+{
+    qemu_free(e->details);
+    qemu_free(e->name);
+    qemu_free(e->class);
+    qemu_free(e);
+}
+
+static void dispatch_event(PendingWaiter *w, WaitEvent *e)
+{
+    monitor_printf(w->mon, "%ld.%06ld: %s: %s\n",
+                   e->timestamp.tv_sec, e->timestamp.tv_usec,
+                   e->class, e->name);
+    if (e->details && strlen(e->details)) {
+        monitor_printf(w->mon, "%s\n", e->details);
+    }
+    if (!w->polling)
+        monitor_resume(w->mon);
+}
+
+static void remove_stale_events(void)
+{
+    qemu_timeval now;
+
+    qemu_gettimeofday(&now);
+
+    while (!TAILQ_EMPTY(&pending_events)) {
+        WaitEvent *e;
+
+        e = TAILQ_FIRST(&pending_events);
+        if ((now.tv_sec - e->timestamp.tv_sec) > EVENT_EXPIRATION_AGE) {
+            TAILQ_REMOVE(&pending_events, e, node);
+            free_event(e);
+        } else {
+            break;
+        }
+    }
+}
+
+static int try_to_process_events(void)
+{
+    int processed_events = 0;
+
+    while (!TAILQ_EMPTY(&pending_events) && !TAILQ_EMPTY(&pending_waiters)) {
+        WaitEvent *e;
+        PendingWaiter *w;
+
+        e = TAILQ_FIRST(&pending_events);
+        TAILQ_REMOVE(&pending_events, e, node);
+
+        w = TAILQ_FIRST(&pending_waiters);
+        TAILQ_REMOVE(&pending_waiters, w, node);
+
+        dispatch_event(w, e);
+
+        free_event(e);
+        qemu_free(w);
+
+        processed_events = 1;
+    }
+
+    remove_stale_events();
+
+    return processed_events;
+}
+
+void qemu_notify_event(const char *class, const char *name, const char *details)
+{
+    WaitEvent *e;
+
+    e = qemu_mallocz(sizeof(*e));
+
+    qemu_gettimeofday(&e->timestamp);
+    e->class = qemu_strdup(class);
+    e->name = qemu_strdup(name);
+    if (details)
+        e->details = qemu_strdup(details);
+
+    TAILQ_INSERT_TAIL(&pending_events, e, node);
+
+    try_to_process_events();
+}
+
+void do_wait(Monitor *mon, int polling)
+{
+    PendingWaiter *w;
+
+    w = qemu_mallocz(sizeof(*w));
+    w->mon = mon;
+    w->polling = polling;
+
+    TAILQ_INSERT_TAIL(&pending_waiters, w, node);
+
+    if (!w->polling)
+        monitor_suspend(w->mon);
+
+    if (!try_to_process_events() && w->polling) {
+        TAILQ_REMOVE(&pending_waiters, w, node);
+        qemu_free(w);
+    }
+}
diff --git a/wait.h b/wait.h
new file mode 100644
index 0000000..3fb455f
--- /dev/null
+++ b/wait.h
@@ -0,0 +1,23 @@
+/*
+ * Asynchronous monitor notification support
+ *
+ * Copyright IBM, Corp. 2009
+ *
+ * Authors:
+ *  Anthony Liguori <aliguori@us.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef QEMU_WAIT_H
+#define QEMU_WAIT_H
+
+#include "monitor.h"
+
+void qemu_notify_event(const char *class, const char *name,
+                       const char *details);
+void do_wait(Monitor *mon, int polling);
+
+#endif
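For what it's worth, here is a minimal sketch of how a subsystem would emit an event through this API. The "block-state" class, the "enospc" name, and the details format below are invented for illustration; nothing in this patch registers them:

#include "qemu-common.h"
#include "wait.h"

/* Hypothetical emitter (not part of this patch): report that a block
 * device ran out of space.  The details string is free-form data that
 * a waiter sees on the line after the event header. */
static void report_enospc(const char *device_name)
{
    char details[64];

    snprintf(details, sizeof(details), "device=%s", device_name);
    qemu_notify_event("block-state", "enospc", details);
}

A monitor that had previously issued 'wait' would then print the timestamped header "<secs>.<usecs>: block-state: enospc" followed by "device=..." on the next line.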

This patch allows the wait command to take a class mask that is used to filter which events are waited for. Because of limitations in the monitor callbacks, this required splitting wait into two commands: wait and poll. This allows us to introduce the -x option to treat the mask as an exclusion mask.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

diff --git a/monitor.c b/monitor.c
index 94c14b2..1a7d026 100644
--- a/monitor.c
+++ b/monitor.c
@@ -1746,8 +1746,12 @@ static const mon_cmd_t mon_cmds[] = {
       "acl allow vnc.username fred\n"
       "acl deny vnc.username bob\n"
       "acl reset vnc.username\n" },
-    { "wait", "-d", do_wait,
-      "[-d]", "wait for an asynchronous event (use -d to poll event)" },
+    { "wait", "-xs?", do_wait,
+      "[-x] [mask]",
+      "wait for an asynchronous event (-x to make mask exclusive, mask specifies the event classes to wait for)" },
+    { "poll", "-xs?", do_poll,
+      "[-x] [mask]",
+      "poll for an asynchronous event (-x to make mask exclusive, mask specifies the event classes to poll for)" },
     { NULL, NULL, },
 };
diff --git a/wait.c b/wait.c
index 8f6cbad..ebc156a 100644
--- a/wait.c
+++ b/wait.c
@@ -30,6 +30,8 @@ typedef struct PendingWaiter
 {
     Monitor *mon;
     int polling;
+    int exclusive;
+    char *mask;
     TAILQ_ENTRY(PendingWaiter) node;
 } PendingWaiter;
 
@@ -50,6 +52,12 @@ static void free_event(WaitEvent *e)
     qemu_free(e);
 }
 
+static void free_waiter(PendingWaiter *w)
+{
+    qemu_free(w->mask);
+    qemu_free(w);
+}
+
 static void dispatch_event(PendingWaiter *w, WaitEvent *e)
 {
     monitor_printf(w->mon, "%ld.%06ld: %s: %s\n",
@@ -81,6 +89,41 @@ static void remove_stale_events(void)
     }
 }
 
+static int find_string(const char *haystack, const char *needle)
+{
+    const char *p;
+    int clen = strlen(needle);
+
+    p = haystack;
+    while (p && strlen(p) >= clen) {
+        if (memcmp(p, needle, clen) == 0 &&
+            (p[clen] == 0 || p[clen] == ' ')) {
+            return 1;
+        }
+
+        p = strchr(p, ' ');
+        if (p)
+            p++;
+    }
+
+    return 0;
+}
+
+static int event_in_mask(PendingWaiter *w, WaitEvent *e)
+{
+    int ret;
+
+    if (e->class == NULL)
+        ret = 1;
+    else
+        ret = find_string(w->mask, e->class);
+
+    if (w->exclusive)
+        ret = !ret;
+
+    return ret;
+}
+
 static int try_to_process_events(void)
 {
     int processed_events = 0;
@@ -92,15 +135,21 @@ static int try_to_process_events(void)
         e = TAILQ_FIRST(&pending_events);
         TAILQ_REMOVE(&pending_events, e, node);
 
-        w = TAILQ_FIRST(&pending_waiters);
-        TAILQ_REMOVE(&pending_waiters, w, node);
+        TAILQ_FOREACH(w, &pending_waiters, node) {
+            if (event_in_mask(w, e))
+                break;
+        }
 
-        dispatch_event(w, e);
+        if (w) {
+            TAILQ_REMOVE(&pending_waiters, w, node);
 
-        free_event(e);
-        qemu_free(w);
+            dispatch_event(w, e);
 
-        processed_events = 1;
+            free_event(e);
+            free_waiter(w);
+
+            processed_events = 1;
+        }
     }
 
     remove_stale_events();
@@ -125,13 +174,17 @@ void qemu_notify_event(const char *class, const char *name, const char *details)
     try_to_process_events();
 }
 
-void do_wait(Monitor *mon, int polling)
+static void wait_with_mask(Monitor *mon, int polling, int exclusive,
+                           const char *mask)
 {
     PendingWaiter *w;
 
     w = qemu_mallocz(sizeof(*w));
     w->mon = mon;
     w->polling = polling;
+    w->exclusive = exclusive;
+    if (mask && strlen(mask))
+        w->mask = qemu_strdup(mask);
 
     TAILQ_INSERT_TAIL(&pending_waiters, w, node);
 
@@ -140,6 +193,16 @@ void do_wait(Monitor *mon, int polling)
 
     if (!try_to_process_events() && w->polling) {
         TAILQ_REMOVE(&pending_waiters, w, node);
-        qemu_free(w);
+        free_waiter(w);
     }
 }
+
+void do_wait(Monitor *mon, int exclusive, const char *mask)
+{
+    wait_with_mask(mon, 0, exclusive, mask);
+}
+
+void do_poll(Monitor *mon, int exclusive, const char *mask)
+{
+    wait_with_mask(mon, 1, exclusive, mask);
+}
diff --git a/wait.h b/wait.h
index 3fb455f..0942823 100644
--- a/wait.h
+++ b/wait.h
@@ -18,6 +18,7 @@
 
 void qemu_notify_event(const char *class, const char *name,
                        const char *details);
-void do_wait(Monitor *mon, int polling);
+void do_wait(Monitor *mon, int exclusive, const char *mask);
+void do_poll(Monitor *mon, int exclusive, const char *mask);
 
 #endif
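To make the mask semantics concrete, here is a small standalone harness (not part of the patch) that mirrors find_string() from wait.c: the mask is a space-separated list of class names, and a class only matches as a whole word, so 'vnc' does not match a mask entry 'vnc-event':

#include <stdio.h>
#include <string.h>

/* Same matching rule as find_string() in wait.c: return 1 iff needle
 * appears in the space-separated haystack as a complete word. */
static int find_string(const char *haystack, const char *needle)
{
    const char *p = haystack;
    int clen = strlen(needle);

    while (p && strlen(p) >= clen) {
        if (memcmp(p, needle, clen) == 0 &&
            (p[clen] == 0 || p[clen] == ' ')) {
            return 1;
        }
        p = strchr(p, ' ');
        if (p)
            p++;
    }
    return 0;
}

int main(void)
{
    printf("%d\n", find_string("vm-state vnc-event", "vm-state")); /* 1 */
    printf("%d\n", find_string("vm-state vnc-event", "vnc"));      /* 0 */
    return 0;
}

With -x, event_in_mask() inverts the result, so the same mask excludes those classes instead.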

This patch introduces a standard way to document all new events. It also enforces that documentation exists for each event. This documentation includes a description of the details parameter that the event generates.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

diff --git a/Makefile.target b/Makefile.target
index b32d1af..adfe81b 100644
--- a/Makefile.target
+++ b/Makefile.target
@@ -495,6 +495,7 @@ endif #CONFIG_BSD_USER
 ifndef CONFIG_USER_ONLY
 
 OBJS=vl.o osdep.o monitor.o pci.o loader.o isa_mmio.o machine.o dma-helpers.o
+OBJS+=wait-events.o
 # virtio has to be here due to weird dependency between PCI and virtio-net.
 # need to fix this properly
 OBJS+=virtio.o virtio-blk.o virtio-balloon.o virtio-net.o virtio-console.o
diff --git a/monitor.c b/monitor.c
index 1a7d026..8aed435 100644
--- a/monitor.c
+++ b/monitor.c
@@ -43,6 +43,7 @@
 #include "kvm.h"
 #include "acl.h"
 #include "wait.h"
+#include "wait-events.h"
 
 //#define DEBUG
 //#define DEBUG_COMPLETION
@@ -1752,6 +1753,8 @@ static const mon_cmd_t mon_cmds[] = {
     { "poll", "-xs?", do_poll,
       "[-x] [mask]",
      "poll for an asynchronous event (-x to make mask exclusive, mask specifies the event classes to poll for)" },
+    { "show_event", "s?s?", do_show_event, "[class [name]]",
+      "show information about supported events" },
     { NULL, NULL, },
 };
diff --git a/wait-events.c b/wait-events.c
new file mode 100644
index 0000000..3617ca6
--- /dev/null
+++ b/wait-events.c
@@ -0,0 +1,143 @@
+/*
+ * Asynchronous monitor notification support
+ *
+ * Copyright IBM, Corp. 2009
+ *
+ * Authors:
+ *  Anthony Liguori <aliguori@us.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#include "qemu-common.h"
+#include "wait-events.h"
+
+typedef struct EventDesc
+{
+    const char *name;
+    const char *short_desc;
+    const char *long_desc;
+    const char *detail_format;
+} EventDesc;
+
+typedef struct EventClassDesc
+{
+    const char *class;
+    const char *short_desc;
+    const char *long_desc;
+    const EventDesc *events;
+} EventClassDesc;
+
+static const EventDesc vm_state_events[] = {
+    { 0 },
+};
+
+static const EventClassDesc vm_event_classes[] = {
+    { 0 },
+};
+
+static const EventDesc *find_event_desc(const EventClassDesc *desc, const char *name)
+{
+    int i;
+
+    for (i = 0; desc->events[i].name; i++) {
+        if (strcmp(desc->events[i].name, name) == 0)
+            return &desc->events[i];
+    }
+
+    return NULL;
+}
+
+static const EventClassDesc *find_event_class(const char *class)
+{
+    int i;
+
+    for (i = 0; vm_event_classes[i].class; i++) {
+        if (strcmp(vm_event_classes[i].class, class) == 0)
+            return &vm_event_classes[i];
+    }
+
+    return NULL;
+}
+
+int qemu_is_notify_event_valid(const char *class, const char *name)
+{
+    const EventClassDesc *c;
+    const EventDesc *e;
+
+    c = find_event_class(class);
+    if (c == NULL)
+        return 0;
+
+    e = find_event_desc(c, name);
+    if (e == NULL)
+        return 0;
+
+    return 1;
+}
+
+static void do_show_event_name(Monitor *mon, const EventDesc *e)
+{
+    monitor_printf(mon, "%s\n", e->long_desc);
+    if (e->detail_format && strlen(e->detail_format)) {
+        monitor_printf(mon, "\n");
+        monitor_printf(mon, "Details Format:\n");
+        monitor_printf(mon, "%s\n", e->detail_format);
+    }
+}
+
+static void do_show_event_class(Monitor *mon, const EventClassDesc *c)
+{
+    int i;
+
+    monitor_printf(mon, "%s\n", c->long_desc);
+    monitor_printf(mon, "\n");
+    monitor_printf(mon, "Events:\n");
+    for (i = 0; c->events[i].name; i++) {
+        monitor_printf(mon, " %s - %s\n",
+                       c->events[i].name,
+                       c->events[i].short_desc);
+    }
+}
+
+static void do_show_event_list(Monitor *mon)
+{
+    int i;
+
+    for (i = 0; vm_event_classes[i].class; i++) {
+        monitor_printf(mon, "%s - %s\n",
+                       vm_event_classes[i].class,
+                       vm_event_classes[i].short_desc);
+    }
+}
+
+void do_show_event(Monitor *mon, const char *class, const char *name)
+{
+    if (class && strlen(class)) {
+        const EventClassDesc *c;
+
+        c = find_event_class(class);
+        if (c == NULL) {
+            monitor_printf(mon, "Unknown class: %s\n", class);
+            return;
+        }
+
+        if (name && strlen(name)) {
+            const EventDesc *e;
+
+            e = find_event_desc(c, name);
+            if (e == NULL) {
+                monitor_printf(mon, "Unknown event: %s\n", name);
+                return;
+            }
+
+            do_show_event_name(mon, e);
+        } else {
+            do_show_event_class(mon, c);
+        }
+    } else {
+        do_show_event_list(mon);
+    }
+}
diff --git a/wait-events.h b/wait-events.h
new file mode 100644
index 0000000..4770ec5
--- /dev/null
+++ b/wait-events.h
@@ -0,0 +1,23 @@
+/*
+ * Asynchronous monitor notification support
+ *
+ * Copyright IBM, Corp. 2009
+ *
+ * Authors:
+ *  Anthony Liguori <aliguori@us.ibm.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.  See
+ * the COPYING file in the top-level directory.
+ *
+ */
+
+#ifndef QEMU_WAIT_EVENTS_H
+#define QEMU_WAIT_EVENTS_H
+
+#include "monitor.h"
+
+int qemu_is_notify_event_valid(const char *class, const char *name);
+
+void do_show_event(Monitor *mon, const char *class, const char *name);
+
+#endif
diff --git a/wait.c b/wait.c
index ebc156a..a976a72 100644
--- a/wait.c
+++ b/wait.c
@@ -16,6 +16,7 @@
 #include "sys-queue.h"
 #include "osdep.h"
 #include "wait.h"
+#include "wait-events.h"
 
 typedef struct WaitEvent
 {
@@ -161,6 +162,11 @@ void qemu_notify_event(const char *class, const char *name, const char *details)
 {
     WaitEvent *e;
 
+    if (!qemu_is_notify_event_valid(class, name)) {
+        fprintf(stderr, "BUG: invalid class/name %s: %s\n", class, name);
+        abort();
+    }
+
     e = qemu_mallocz(sizeof(*e));
 
     qemu_gettimeofday(&e->timestamp);
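As a purely hypothetical illustration of the shape these tables expect (the "block-state"/"enospc" names below are invented and not part of this series; real classes are added in the later patches), a subsystem documenting a new event would add something like:

/* Invented example entries, showing the documentation format that the
 * tables in wait-events.c expect for each event. */
static const EventDesc block_state_events[] = {
    { "enospc", "a disk ran out of space",
      "This event occurs when a write to a block device's backing file\n"
      "fails with ENOSPC.",
      "device=name", },
    { 0 },
};

Since qemu_notify_event() now aborts on an undocumented class/name pair, a forgotten table entry is caught the first time the event fires.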

This patch adds notification whenever a vm starts or stops.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

diff --git a/vl.c b/vl.c
index f78cabb..ca0bcf9 100644
--- a/vl.c
+++ b/vl.c
@@ -164,6 +164,8 @@ int main(int argc, char **argv)
 #include "libslirp.h"
 #endif
 
+#include "wait.h"
+
 //#define DEBUG_UNUSED_IOPORT
 //#define DEBUG_IOPORT
 //#define DEBUG_NET
@@ -3545,6 +3547,11 @@ static void vm_state_notify(int running, int reason)
 {
     VMChangeStateEntry *e;
 
+    if (running)
+        qemu_notify_event("vm-state", "start", "");
+    else
+        qemu_notify_event("vm-state", "stop", "");
+
     for (e = vm_change_state_head.lh_first; e; e = e->entries.le_next) {
         e->cb(e->opaque, running, reason);
     }
diff --git a/wait-events.c b/wait-events.c
index 3617ca6..ecd26d2 100644
--- a/wait-events.c
+++ b/wait-events.c
@@ -31,10 +31,24 @@ typedef struct EventClassDesc
 } EventClassDesc;
 
 static const EventDesc vm_state_events[] = {
+    { "start", "the VM was started",
+      "This event occurs whenever a VM's state changes from stopped to\n"
+      "started. A vm initially is in a stopped state so a start event will\n"
+      "always occur when the vm is first launched unless the -S option is\n"
+      "provided.",
+      NULL, },
+    { "stop", "the VM was stopped",
+      "This event occurs whenever a VM's state changes from started to stopped.\n"
+      "A vm initially is in the stopped state but no notification is generated\n"
+      "by this.",
+      NULL, },
     { 0 },
 };
 
 static const EventClassDesc vm_event_classes[] = {
+    { "vm-state", "the VM start/stop state has changed",
+      "These events occur whenever a VM's state changes.",
+      vm_state_events, },
     { 0 },
 };
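For illustration, assuming the wait command from the earlier patches, a session waiting on this class might look like the following (timestamp invented):

(qemu) wait vm-state
1239192026.123456: vm-state: start
(qemu)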

This patch adds notification whenever a client connects or disconnects via VNC. I'd like to add a lot more events, especially ones that are auth related, but this is at least a start.

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

diff --git a/vnc.c b/vnc.c
index 6d93215..abe6bdf 100644
--- a/vnc.c
+++ b/vnc.c
@@ -29,6 +29,7 @@
 #include "qemu_socket.h"
 #include "qemu-timer.h"
 #include "acl.h"
+#include "wait.h"
 
 #define VNC_REFRESH_INTERVAL (1000 / 30)
 
@@ -872,6 +873,15 @@ static void audio_del(VncState *vs)
     }
 }
 
+static void vnc_send_connect_event(VncState *vs)
+{
+    qemu_notify_event("vnc-event", "connect", vs->peer_address);
+}
+
+static void vnc_send_disconnect_event(VncState *vs)
+{
+    qemu_notify_event("vnc-event", "disconnect", vs->peer_address);
+}
 
 int vnc_client_io_error(VncState *vs, int ret, int last_errno)
 {
@@ -889,6 +899,8 @@ int vnc_client_io_error(VncState *vs, int ret, int last_errno)
         }
 
+        vnc_send_disconnect_event(vs);
+        qemu_free(vs->peer_address);
         VNC_DEBUG("Closing down client sock %d %d\n", ret, ret < 0 ? last_errno : 0);
         qemu_set_fd_handler2(vs->csock, NULL, NULL, NULL, NULL);
         closesocket(vs->csock);
@@ -2008,6 +2020,9 @@ static void vnc_connect(VncDisplay *vd, int csock)
 
     vs->next = vd->clients;
     vd->clients = vs;
+
+    vs->peer_address = vnc_socket_remote_addr("client-address=%s:%s", vs->csock);
+    vnc_send_connect_event(vs);
 }
 
 static void vnc_listen_read(void *opaque)
@@ -2055,7 +2070,6 @@ void vnc_display_init(DisplayState *ds)
     register_displaychangelistener(ds, dcl);
 }
 
-
 void vnc_display_close(DisplayState *ds)
 {
     VncDisplay *vs = ds ? (VncDisplay *)ds->opaque : vnc_display;
diff --git a/vnc.h b/vnc.h
index 3ae95f3..745d655 100644
--- a/vnc.h
+++ b/vnc.h
@@ -161,6 +161,8 @@ struct VncState
     Buffer zlib_tmp;
     z_stream zlib_stream[4];
 
+    char *peer_address;
+
     VncState *next;
 };
diff --git a/wait-events.c b/wait-events.c
index ecd26d2..8a2af85 100644
--- a/wait-events.c
+++ b/wait-events.c
@@ -45,10 +45,24 @@ static const EventDesc vm_state_events[] = {
     { 0 },
 };
 
+static const EventDesc vnc_events[] = {
+    { "connect", "a client has connected",
+      "This event occurs whenever a client connects to the builtin VNC server.",
+      "client-address=host:service", },
+    { "disconnect", "a client has disconnected",
+      "This event occurs whenever a client disconnects from the builtin VNC server.",
+      "client-address=host:service", },
+    { 0 },
+};
+
 static const EventClassDesc vm_event_classes[] = {
     { "vm-state", "the VM start/stop state has changed",
       "These events occur whenever a VM's state changes.",
       vm_state_events, },
+    { "vnc-event", "events from the builtin VNC server",
+      "These events are generated by the builtin VNC server. They include\n"
+      "connection information, authentication information, etc.",
+      vnc_events, },
     { 0 },
 };
diff --git a/wait.c b/wait.c
index a976a72..33fb702 100644
--- a/wait.c
+++ b/wait.c
@@ -114,7 +114,7 @@ static int event_in_mask(PendingWaiter *w, WaitEvent *e)
 {
     int ret;
 
-    if (e->class == NULL)
+    if (w->mask == NULL)
         ret = 1;
     else
         ret = find_string(w->mask, e->class);
@@ -128,20 +128,18 @@ static int event_in_mask(PendingWaiter *w, WaitEvent *e)
 static int try_to_process_events(void)
 {
     int processed_events = 0;
+    WaitEvent *e, *ne;
 
-    while (!TAILQ_EMPTY(&pending_events) && !TAILQ_EMPTY(&pending_waiters)) {
-        WaitEvent *e;
+    TAILQ_FOREACH_SAFE(e, &pending_events, node, ne) {
         PendingWaiter *w;
 
-        e = TAILQ_FIRST(&pending_events);
-        TAILQ_REMOVE(&pending_events, e, node);
-
         TAILQ_FOREACH(w, &pending_waiters, node) {
             if (event_in_mask(w, e))
                 break;
         }
 
         if (w) {
+            TAILQ_REMOVE(&pending_events, e, node);
             TAILQ_REMOVE(&pending_waiters, w, node);
 
             dispatch_event(w, e);

Anthony Liguori wrote:
The wait command will pause the monitor the command was issued in until a new event becomes available. Events are queued if there isn't a waiter present. The wait command completes after a single event is available.
Today, we queue events indefinitely but in the future, I suspect we'll drop events that are older than a certain amount of time to avoid infinitely allocating memory for long running VMs.
To make use of the new notification mechanism, this patch introduces a qemu_notify_event() API. This API takes three parameters: a class, which is meant to classify the type of event being generated; a name, which is meant to distinguish which event in the class has been generated; and a details parameter, which is meant to allow events to send arbitrary data along with a given event.
v1->v2
 - Added a -d flag to poll for events instead of waiting
 - Made old events expire after 10 minutes
Shame on me. I meant to rename wait/poll to wait_event/poll_event. It'll be in v3.

Regards,

Anthony Liguori

On 4/8/09, Anthony Liguori <aliguori@us.ibm.com> wrote:
The wait command will pause the monitor the command was issued in until a new event becomes available. Events are queued if there isn't a waiter present. The wait command completes after a single event is available.
+    if (!w->polling)
+        monitor_resume(w->mon);
CODING_STYLE, chapter 4: "Every indented statement is braced; even if the block contains just one statement." ;-)

Blue Swirl wrote:
On 4/8/09, Anthony Liguori <aliguori@us.ibm.com> wrote:
The wait command will pause the monitor the command was issued in until a new event becomes available. Events are queued if there isn't a waiter present. The wait command completes after a single event is available.
+    if (!w->polling)
+        monitor_resume(w->mon);
CODING_STYLE, chapter 4: "Every indented statement is braced; even if the block contains just one statement."
;-)
That's going to take me some time to get used to :-)  I've been trying to stick with it...

--
Regards,

Anthony Liguori

Anthony Liguori wrote:
The wait command will pause the monitor the command was issued in until a new event becomes available. Events are queued if there isn't a waiter present. The wait command completes after a single event is available.
How do you stop a wait if there are no pending events?
Today, we queue events indefinitely but in the future, I suspect we'll drop events that are older than a certain amount of time to avoid infinitely allocating memory for long running VMs.
This queueing plugs the race where an event happens immediately after a wait completes. But it could be avoided completely by having asynchronous notifications on a single monitor.

--
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.

Avi Kivity wrote:
Anthony Liguori wrote:
The wait command will pause the monitor the command was issued in until a new event becomes available. Events are queued if there isn't a waiter present. The wait command completes after a single event is available.
How do you stop a wait if there are no pending events?
You mean, cancel a wait? You cannot. I thought about whether it was a problem or not. I'm not sure. You could introduce a wait-cancel command, but then you need a way to identify which wait you want to cancel. I can't think of a simple way to do that today.
Today, we queue events indefinitely but in the future, I suspect we'll drop events that are older than a certain amount of time to avoid infinitely allocating memory for long running VMs.
This queueing plugs the race where an event happens immediately after a wait completes. But it could be avoided completely by having asynchronous notifications on a single monitor.
There are multiple things I think are being confused: asynchronous completion of monitor commands, events, monitor multiplexing, and non-human mode.

There can only be one command active at any given time on a Monitor context. We can have many Monitor contexts. There is currently only one Monitor context connected to a character device at a given time.

I think what you want to see is something like this:

<input> tag=4: info cpus
<input> tag=5: info kvm
<output> tag=5,rc=0: kvm enabled
<output> tag=4,rc=0: eip = 0x0000000444
<output> rc=0,class=vm-state,name=start: vm started

To me, this is a combination of events, which is introduced by this patch, a non-human monitor mode, and finally multiplexing multiple monitors into a single session.

Right now, you can have the same functional thing by having three monitors. In the first monitor you'd see:

(qemu) info cpus
eip = 0x000000444
(qemu)

In the second you'd see:

(qemu) info kvm
kvm enabled
(qemu)

In the third you'd see:

(qemu) wait
23423423.23423: vm-state: start: vm started
(qemu)

Even though the two info commands today are synchronous, there's nothing requiring them to be (see migrate as an example). So I think we're in agreement but you just want to jump ahead 6 months ;-)

--
Regards,

Anthony Liguori

Anthony Liguori wrote:
Avi Kivity wrote:
Anthony Liguori wrote:
The wait command will pause the monitor the command was issued in until a new event becomes available. Events are queued if there isn't a waiter present. The wait command completes after a single event is available.
How do you stop a wait if there are no pending events?
You mean, cancel a wait? You cannot. I thought about whether it was a problem or not. I'm not sure.
A management agent might want to detach from guests, upgrade, restart, and reattach.
You could introduce a wait-cancel command, but then you need a way to identify which wait you want to cancel. I can't think of a simple way to do that today.
This, as well as the queueing that's necessary with this model, indicates (to me) that it is too complicated.
Today, we queue events indefinitely but in the future, I suspect we'll drop events that are older than a certain amount of time to avoid infinitely allocating memory for long running VMs.
This queueing plugs the race where an event happens immediately after a wait completes. But it could be avoided completely by having asynchronous notifications on a single monitor.
There are multiple things I think are being confused: asynchronous completion of monitor commands, events, monitor multiplexing, and non-human mode.
There can only be one command active at any given time on a Monitor context. We can have many Monitor contexts. There is currently only one Monitor context connected to a character device at a given time.
Don't think of 'migrate -d' as a command to perform migration; instead, it's a command to start migration.

I also object to exposing internal qemu implementation details like monitor contexts to the user (by forcing them to have multiple connections). If we can't have more than one monitor command, we need to fix that.
I think what you want to see is something like this:
<input> tag=4: info cpus
<input> tag=5: info kvm
<output> tag=5,rc=0: kvm enabled
<output> tag=4,rc=0: eip = 0x0000000444
<output> rc=0,class=vm-state,name=start: vm started
To me, this is a combination of events, which is introduced by this patch, a non-human monitor mode, and finally multiplexing multiple monitors into a single session.
Right now, you can have the same functional thing by having three monitors. In the first monitor you'd see:
(qemu) info cpus
eip = 0x000000444
(qemu)
In the second you'd see:
(qemu) info kvm
kvm enabled
(qemu)
In the third you'd see:
(qemu) wait
23423423.23423: vm-state: start: vm started
(qemu)
Even though the two info commands today are synchronous, there's nothing requiring them to be (see migrate as an example). So I think we're in agreement but you just want to jump ahead 6 months ;-)
Commands which are inherently synchronous should remain so. Commands which are inherently async should be coded like that. It was a mistake IMO to have 'migrate' be a synchronous command; it should have always behaved as if -d is given.

Having tagged replies is a good idea, but IMO, introducing multiple monitors will create a lot of subtle problems, several of which we've already identified.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

Avi Kivity wrote:
Anthony Liguori wrote:
Avi Kivity wrote:
Anthony Liguori wrote:
The wait command will pause the monitor the command was issued in until a new event becomes available. Events are queued if there isn't a waiter present. The wait command completes after a single event is available.
How do you stop a wait if there are no pending events?
You mean, cancel a wait? You cannot. I thought about whether it was a problem or not. I'm not sure.
A management agent might want to detach from guests, upgrade, restart, and reattach.
You could introduce a wait-cancel command, but then you need a way to identify which wait you want to cancel. I can't think of a simple way to do that today.
This, as well as the queueing that's necessary with this model, indicates (to me) that it is too complicated.
It should be feasible to handle some key sequence like ^C while the monitor is suspended (i.e. blocked on things like event-wait or migrate). I thought about such a feature while reworking the involved parts but postponed it as I saw no urgent need for it.
Today, we queue events indefinitely but in the future, I suspect we'll drop events that are older than a certain amount of time to avoid infinitely allocating memory for long running VMs.
This queueing plugs the race where an event happens immediately after a wait completes. But it could be avoided completely by having asynchronous notifications on a single monitor.
There are multiple things I think are being confused: asynchronous completion of monitor commands, events, monitor multiplexing, and non-human mode.
There can only be one command active at any given time on a Monitor context. We can have many Monitor contexts. There is currently only one Monitor context connected to a character device at a given time.
Don't think of 'migrate -d' as a command to perform migration; instead, it's a command to start migration.
I also object to exposing internal qemu implementation details like monitor contexts to the user (by forcing them to have multiple connections). If we can't have more than one monitor command, we need to fix that.
I think what you want to see is something like this:
<input> tag=4: info cpus
<input> tag=5: info kvm
<output> tag=5,rc=0: kvm enabled
<output> tag=4,rc=0: eip = 0x0000000444
<output> rc=0,class=vm-state,name=start: vm started
To me, this is a combination of events, which is introduced by this patch, a non-human monitor mode, and finally multiplexing multiple monitors into a single session.
Right now, you can have the same functional thing by having three monitors. In the first monitor you'd see:
(qemu) info cpus
eip = 0x000000444
(qemu)
In the second you'd see:
(qemu) info kvm
kvm enabled
(qemu)
In the third you'd see:
(qemu) wait
23423423.23423: vm-state: start: vm started
(qemu)
Even though the two info commands today are synchronous, there's nothing requiring them to be (see migrate as an example). So I think we're in agreement but you just want to jump ahead 6 months ;-)
Commands which are inherently synchronous should remain so. Commands which are inherently async should be coded like that. It was a mistake IMO to have 'migrate' be a synchronous command; it should have always behaved as if -d is given.
Having tagged replies is a good idea, but IMO, introducing multiple monitors will create a lot of subtle problems, several of which we've already identified.
We do have multiple monitors already; try '-monitor X -serial mon:Y', or attach via gdb and issue 'monitor whatever'. And we should continue to design the monitor interface for this (which means here: event notification should be configured per monitor session).

Jan

--
Siemens AG, Corporate Technology, CT SE 2
Corporate Competence Center Embedded Linux

Anthony Liguori wrote:
Right now only one monitor device can be enabled at a time. In order to support asynchronous notification of events, I would like to introduce a 'wait' command that waits for an event to occur. This implies that we need an additional monitor session to allow commands to still be executed while waiting for an asynchronous notification.
I think this is race prone. For example:

monitor 1: wait
monitor 2: hotunplug dev1
monitor 2: hotplug dev1
monitor 1: event

There is no way to tell whether the event (which relates to dev1) happened before the hotunplug or after it. In general there is no way to correlate events to commands.

What's wrong with having async notifications? Sure, we'll need to make sure notifications don't mix with command responses, and that all commands are acked (the (qemu) prompt serves that purpose now), but it seems to me to be a lot easier for the management end.

Multiple monitors would still come in useful for debugging in a managed environment.

--
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.

Avi Kivity wrote:
Anthony Liguori wrote:
Right now only one monitor device can be enabled at a time. In order to support asynchronous notification of events, I would like to introduce a 'wait' command that waits for an event to occur. This implies that we need an additional monitor session to allow commands to still be executed while waiting for an asynchronous notification.
I think this is race prone. For example:
monitor 1: wait
monitor 2: hotunplug dev1
monitor 2: hotplug dev1
monitor 1: event
There is no way to tell whether the event (which relates to dev1) happened before the hotunplug or after it. In general there is no way to correlate events to commands.
Events don't correlate to commands. Can you give an example of why you'd want to correlate an event to a command?
What's wrong with having async notifications?
Isn't this async notifications?
Sure, we'll need to make sure notifications don't mix with command responses, and that all commands are acked (the (qemu) prompt serves that purpose now), but it seems to me to be a lot easier for the management end.
FWIW, you can have asynchronous completion of monitor commands. See migration as an example. The (qemu) prompt is the ack that the command is finished. You can only have one async command per monitor session, which is why you need multiple monitors.

If we had a non-human monitor mode, I think it would be fine to have many monitors multiplexed over a single channel. The internal monitor abstraction doesn't change. I think multiplexing multiple monitor sessions requires a non-human mode because you need to tag all output, etc., which is what we need for non-human mode anyway.

--
Regards,

Anthony Liguori

Anthony Liguori wrote:
Avi Kivity wrote:
Anthony Liguori wrote:
Right now only one monitor device can be enabled at a time. In order to support asynchronous notification of events, I would like to introduce a 'wait' command that waits for an event to occur. This implies that we need an additional monitor session to allow commands to still be executed while waiting for an asynchronous notification.
I think this is race prone. For example:
monitor 1: wait
monitor 2: hotunplug dev1
monitor 2: hotplug dev1
monitor 1: event
There is no way to tell whether the event (which relates to dev1) happened before the hotunplug or after it. In general there is no way to correlate events to commands.
Events don't correlate to commands. Can you give an example of why you'd want to correlate an event to a command?
hotplug disk
-ENOSPC on disk

It's true that events don't correlate directly to commands, but they do correlate to the state of the system and that is affected by commands.
What's wrong with having async notifications?
Isn't this async notifications?
I meant, on the good old single monitor.
Sure, we'll need to make sure notifications don't mix with command responses, and that all commands are acked (the (qemu) prompt serves that purpose now), but it seems to me to be a lot easier for the management end.
FWIW, you can have asynchronous completion of monitor commands. See migration as an example. The (qemu) prompt is the ack that the command is finished. You can only have one async command per monitor session, which is why you need multiple monitors.
Migration has the -d switch to be truly async (not to wait). We need an async notifier to tell us migration has finished or failed.

If you go the multiple monitor route, you need one monitor per long-running command. That way lies madness. Moreover, having long running commands block implies there is no way to cancel them.
If we had a non-human monitor mode, I think it would be fine to have many monitors multiplexed over a single channel. The internal monitor abstraction doesn't change.
I don't understand why we need to multiplex many monitors over one channel. Let's keep one monitor, but with the ability to send notifications to the other end.
I think multiplexing multiple monitor sessions requires a non-human mode because you need to tag all output, etc. which is what we need for non-human mode anyway.
I think this is orthogonal.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

Avi Kivity wrote:
hotplug disk
-ENOSPC on disk
It's true that events don't correlate directly to commands, but they do correlate to the state of the system and that is affected by commands.
Events are time stamped. In non-human mode, I think command responses should be time stamped too so that a management tool knows when the event was generated and can correlate the system state to the particular event.
Sure, we'll need to make sure notifications don't mix with command responses, and that all commands are acked (the (qemu) prompt serves that purpose now), but it seems to me do be a lot easier for the management end.
FWIW, you can have asynchronous completion of monitor commands. See migration as an example. The (qemu) prompt is the ack that the command is finished. You can only have one async command per monitor session which is why you need multiple monitors.
Migration has the -d switch to be truly async (not to wait). We need an async notifier to tell us migration has finished or failed.
Multiplexing multiple monitor sessions AFAICT is going to require a non-human mode. But it's totally orthogonal to this patch set. You could implement it today if you wanted.
If we had a non-human monitor mode, I think it would be fine to have many monitors multiplexed over a single channel. The internal monitor abstraction doesn't change.
I don't understand why we need to multiplex many monitors over one channel. Let's keep one monitor, but with the ability to send notifications to the other end.
I don't understand how you think it would be implemented. Each command is going to have a unique 'Monitor *' associated with it. How that ends up being displayed to the user/management tool is irrelevant.

Right now, you can only have one 'Monitor *' active on a single "channel" at a time. This is mostly a matter of output format. I don't see how we can keep the monitor output consistent and compatible as it stands today and still support multiple 'Monitor *'s active in a single channel.

--
Regards,

Anthony Liguori

Anthony Liguori wrote:
Avi Kivity wrote:
hotplug disk
-ENOSPC on disk
It's true that events don't correlate directly to commands, but they do correlate to the state of the system and that is affected by commands.
Events are time stamped. In non-human mode, I think command responses should be time stamped too so that a management tool knows when the event was generated and can correlate the system state to the particular event.
Timestamping doesn't help since the command could have been delayed in the monitor socket. Further, we're now so deep down the complexity spiral that it has now become the most complicated piece of code in the entire system. Surely the best way to synchronize events is to keep them on one timeline?
Sure, we'll need to make sure notifications don't mix with command responses, and that all commands are acked (the (qemu) prompt serves that purpose now), but it seems to me do be a lot easier for the management end.
FWIW, you can have asynchronous completion of monitor commands. See migration as an example. The (qemu) prompt is the ack that the command is finished. You can only have one async command per monitor session which is why you need multiple monitors.
Migration has the -d switch to be truly async (not to wait). We need an async notifier to tell us migration has finished or failed.
Multiplexing multiple monitor sessions AFAICT is going to require a non-human mode. But it's totally orthogonal to this patch set. You could implement it today if you wanted.
I don't want multiplexed monitor sessions, at all. I want async notifications added to a single monitor session. That too could be implemented today (as simple as a term_printf("notification: ...\n")).
If we had a non-human monitor mode, I think it would be fine to have many monitors multiplexed over a single channel. The internal monitor abstraction doesn't change.
I don't understand why we need to multiplex many monitors over one channel. Let's keep one monitor, but with the ability to send notifications to the other end.
I don't understand how you think it would be implemented. Each command is going to have a unique 'Monitor *' associated with it. How that ends up being displayed to the user/management tool is irrelevant. Right now, you can only have one 'Monitor *' active on a single "channel" at a time. This is mostly a matter of output format. I don't see how we can keep the monitor output consistent and compatible as it stands today and still support multiple 'Monitor *'s active in a single channel.
Again, I have no interest in multiple Monitor *s on the same channel. One is quite sufficient.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

Avi Kivity wrote:
Anthony Liguori wrote:
Avi Kivity wrote:
hotplug disk -ENOSPC on disk
It's true that events don't correlate directly to commands, but they do correlate to the state of the system and that is affected by commands.
events are time stamped. In non-human mode, I think command responses should be time stamped too so that a management tool knows when the event was generated and can correlate the system state to the particular event.
Timestamping doesn't help since the command could have been delayed in the monitor socket.
Further, we're now so deep down the complexity spiral that it has now become the most complicated piece of code in the entire system.
You certainly cannot believe that, can you?
Surely the best way to synchronize events is to keep them on one timeline?
Can you show exactly how you would expect things to work in the monitor output? I'm having a hard time understanding what you're suggesting.

--
Regards,

Anthony Liguori

Anthony Liguori wrote:
Timestamping doesn't help since the command could have been delayed in the monitor socket.
Further, we're now so deep down the complexity spiral that it has now become the most complicated piece of code in the entire system.
You certainly cannot believe that, can you?
It was one of my little exaggerations. But it does look disproportionately complicated compared to the problem we're trying to solve.
Surely the best way to synchronize events is to keep them on one timeline?
Can you show exactly how you would expect things to work in the monitor output? I'm having a hard time understanding what you're suggesting.
(qemu) notify enospace on
(qemu) notify vnc-connect on
(qemu) notification: vnc connection ...
(qemu) notify migration-completion on
(qemu) migrate -d ...
notification: enospc on ide0-0
(qemu) migrate_cancel
notification: migration cancelled
(qemu) migrate -d ...
(qemu) notification: migration completed

--
I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.

Avi Kivity wrote:
(qemu) notify enospace on
(qemu) notify vnc-connect on
(qemu) notification: vnc connection ...
(qemu) notify migration-completion on
(qemu) migrate -d ...
notification: enospc on ide0-0
(qemu) migrate_cancel
notification: migration cancelled
(qemu) migrate -d ...
(qemu) notification: migration completed
What are the rules for printing out 'notification'? Do you want for the end of the buffer to be '\n' or '\n(qemu) '? If so, if I type:

(qemu) f

But don't hit enter, would that suppress notification?

--
Regards,

Anthony Liguori

Anthony Liguori wrote:
Avi Kivity wrote:
(qemu) notify enospace on
(qemu) notify vnc-connect on
(qemu) notification: vnc connection ...
(qemu) notify migration-completion on
(qemu) migrate -d ...
notification: enospc on ide0-0
(qemu) migrate_cancel
notification: migration cancelled
(qemu) migrate -d ...
(qemu) notification: migration completed
What are the rules for printing out 'notification'? Do you want for the end of the buffer to be '\n' or '\n(qemu) '? If so, if I type:
(qemu) f
But don't hit enter, would that suppress notification?
I'd make everything line-oriented. Anything from the user up to \n is buffered and ignored until the \n arrives. Once the \n arrives, the command is acted upon atomically, either completing fully or launching an async notification.

So the rules are: whenever the monitor is idle, a notification can be printed out.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

Avi Kivity wrote:
I'd make everything line-oriented. Anything from the user up to \n is buffered and ignored until the \n arrives.
Once the \n arrives, the command is acted upon atomically, either completing fully or launching an async notification.
So the rules are: whenever the monitor is idle, a notification can be printed out.
So by idle, you mean, the end of the output buffer ends in either '\n' or '\n(qemu) '. The input buffer must also be empty.
(qemu) notify enospace on
(qemu) notify vnc-connect on
(qemu) notification: vnc connection ...
(qemu) notify migration-completion on
(qemu) migrate -d ...
notification: enospc on ide0-0
(qemu) migrate_cancel
notification: migration cancelled
(qemu) migrate -d ...
(qemu) notification: migration completed
This hurts my eyes. It's not human readable. If we're going to do this, we might as well have a non-human mode, which would oddly enough be more human readable. If you do this, then your session looks an awful lot like my session from a previous note.

I think the thing that is missing is that the 'wait' command does not have to be part of the non-human mode. In non-human mode, you are always doing an implicit wait.

--
Regards,

Anthony Liguori

Anthony Liguori wrote:
Avi Kivity wrote:
I'd make everything line-oriented. Anything from the user up to \n is buffered and ignored until the \n arrives.
Once the \n arrives, the command is acted upon atomically, either completing fully or launching an async notification.
So the rules are: whenever the monitor is idle, a notification can be printed out.
So by idle, you mean, the end of the output buffer ends in either '\n' or '\n(qemu) '. The input buffer must also be empty.
You don't have to look at any buffers. If the monitor is processing a command, it is busy. An asynchronous command ('migrate -d') is not processed in the monitor after it is launched, so it doesn't keep the monitor busy. A monitor enters idle after printing the prompt, and leaves idle when it starts processing a command.

If you meant from the user side, a notification always follows the prompt.
(qemu) notify enospace on
(qemu) notify vnc-connect on
(qemu) notification: vnc connection ...
(qemu) notify migration-completion on
(qemu) migrate -d ...
notification: enospc on ide0-0
(qemu) migrate_cancel
notification: migration cancelled
(qemu) migrate -d ...
(qemu) notification: migration completed
This hurts my eyes. It's not human readable.
I'm sorry, I don't see why. It's just like a shell session.

Compare with:

Monitor 1:
(qemu) notify enospace on
(qemu) notify vnc-connect on
(qemu) notify migration-completion on
(qemu) migrate -d ...
(qemu) migrate_cancel
(qemu) migrate -d ...

Monitor 2:
(qemu) wait
vnc connection ...
(qemu) wait
enospc on ide0-0
(qemu) wait
migration cancelled
(qemu) wait
notification: migration completed

There is no way to tell by looking what has happened (well, in this case you can, but in the general case you cannot). You have to look at two separate interactive sessions (ctrl-alt-2 ctrl-alt-3 ctrl-alt-3). You have to keep reissuing the wait command. Oh, and it's racy, so if you're interested in what really happens you have to issue info commands on session 1.

That's unusable.
If we're going to do this, we might as well have a non-human mode which would oddly enough be more human readable. If you do this, then your session looks an awful lot like my session from a previous note.
I think we should.
I think the thing that is missing is that the 'wait' command does not have to be part of the non-human mode. In non-human mode, you are always doing an implicit wait.
I think 'wait' is unusable for humans. If I want qemu to tell me something happened, it's enough to enable notifications. There's no need to tell it to wait every time something happens. That's poll(2), there's no poll(1).

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

Avi Kivity wrote:
I'm sorry, I don't see why. It's just like a shell session.
Compare with:
Monitor 1:
(qemu) notify enospace on
(qemu) notify vnc-connect on
(qemu) notify migration-completion on
(qemu) migrate -d ...
(qemu) migrate_cancel
(qemu) migrate -d ...
Monitor 2:
(qemu) wait
vnc connection ...
(qemu) wait
enospc on ide0-0
(qemu) wait
migration cancelled
(qemu) wait
notification: migration completed
There is no way to tell by looking what has happened (well, in this case you can, but in the general case you cannot). You have to look at two separate interactive sessions (ctrl-alt-2 ctrl-alt-3 ctrl-alt-3). You have to keep reissuing the wait command. Oh, and it's racy, so if you're interested in what really happens you have to issue info commands on session 1.
How is it less racy?

--
Regards,

Anthony Liguori

Anthony Liguori wrote:
Avi Kivity wrote:
I'm sorry, I don't see why. It's just like a shell session.
Compare with:
Monitor 1:
(qemu) notify enospace on
(qemu) notify vnc-connect on
(qemu) notify migration-completion on
(qemu) migrate -d ...
(qemu) migrate_cancel
(qemu) migrate -d ...
Monitor 2:
(qemu) wait
vnc connection ...
(qemu) wait
enospc on ide0-0
(qemu) wait
migration cancelled
(qemu) wait
notification: migration completed
There is no way to tell by looking what has happened (well, in this case you can, but in the general case you cannot). You have to look at two separate interactive sessions (ctrl-alt-2 ctrl-alt-3 ctrl-alt-3). You have to keep reissuing the wait command. Oh, and it's racy, so if you're interested in what really happens you have to issue info commands on session 1.
How is it less racy?
Suppose you have a command which changes the meaning of a notification. If a notification arrives before the command completion, then it happened before the command was executed. If it arrives after command completion, then it happened after the command was executed.

Oh. If the command generates no output (like most), you can't tell when it completes. I suppose we could have qemu print OK after completing a command.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.

Avi Kivity wrote:
Suppose you have a command which changes the meaning of a notification. If a notification arrives before the command completion, then it happened before the command was executed.
If you want to make that reliable, you cannot have multiple monitors. Since you can mask notifications, there can be an arbitrarily long time between the notification and the event happening. Socket buffering presents the same problem. Imagine:

Monitor 1:
time 0: (qemu) hotadd_cpu 2
time 1: (qemu) hello world <no new line>
time 5: <new line>
time 6: notification: cpu 2 added
time 6: (qemu)

Monitor 2:
time 3: (qemu) hotremove_cpu 2
time 4: (qemu)
time 5: notification: cpu 2 removed
time 6: (qemu)

So to eliminate this, you have to ban multiple monitors. Fine, let's say we did that; it's *still* racy, because at time 3 the guest may hot remove cpu 2 on its own, since the guest's VCPUs get to run in parallel to the monitor. And even if you somehow eliminate the issue around masking notifications, you still have socket buffering that introduces the same problem.

The best you can do is stick a time stamp on a notification and make sure the management tool understands that the notification is reflective of the state when the event happened, not of the current state.

FWIW, this problem is not at all unique to QEMU and is generally true of most protocols that support an out-of-band notification mechanism.

--
Regards,

Anthony Liguori

Anthony Liguori wrote:
Avi Kivity wrote:
Suppose you have a command which changes the meaning of a notification. If a notification arrives before the command completion, then it happened before the command was executed.
If you want to make that reliable, you cannot have multiple monitors.
Right.
Since you can mask notifications, there can be an arbitrarily long time between the notification and the event happening. Socket buffering presents the same problem. Imagine:
Monitor 1:
time 0: (qemu) hotadd_cpu 2
time 1: (qemu) hello world <no new line>
time 5: <new line>
time 6: notification: cpu 2 added
time 6: (qemu)

Monitor 2:
time 3: (qemu) hotremove_cpu 2
time 4: (qemu)
time 5: notification: cpu 2 removed
time 6: (qemu)
So to eliminate this, you have to ban multiple monitors.
Well, not ban multiple monitors, but require that for non-racy operation commands and notifications be on the same session. We can still debug on our dev-only monitor.
Fine, let's say we did that; it's *still* racy, because at time 3 the guest may hot remove cpu 2 on its own, since the guest's VCPUs get to run in parallel to the monitor.
A guest can't hotremove a vcpu. It may offline a vcpu, but that's not the same. Obviously, if both the guest and the management application can initiate the same action, then there will be races. But I don't think that's how things should be -- the guest should request a vcpu to be removed (or added), management thinks and files forms in triplicate, then hotadds or hotremoves the vcpu (most likely after it is no longer needed). With the proper bureaucracy, there is no race.
And even if you somehow eliminate the issue around masking notifications, you still have socket buffering that introduces the same problem.
If you have one monitor, the problem is much simpler, since events travelling in the same direction (command acknowledge and a notification) cannot be reordered. With a command+wait, the problem is inherent.
The best you can do is stick a time stamp on a notification and make sure the management tool understands that the notification is reflective of the state when the event happened, not of the current state.
Timestamps are really bad. They don't work at all if the management application is not on the same host. They work badly if it is on the same host, since commands and events will be timestamped at different processes.
FWIW, this problem is not at all unique to QEMU and is generally true of most protocols that support an out-of-band notification mechanism.
command+wait makes it worse. Let's stick with established practice. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.

Avi Kivity wrote:
Fine, let's say we did that; it's *still* racy, because at time 3 the guest may hot remove cpu 2 on its own, since the guest's VCPUs get to run in parallel to the monitor.
A guest can't hotremove a vcpu. It may offline a vcpu, but that's not the same.
Obviously, if both the guest and the management application can initiate the same action, then there will be races. But I don't think that's how things should be -- the guest should request a vcpu to be removed (or added), management thinks and files forms in triplicate, then hotadds or hotremoves the vcpu (most likely after it is no longer needed).
With the proper bureaucracy, there is no race.
You still have the same basic problem:

time 0: (qemu) notify-enable vnc-events
time 1: (qemu) foo <no newline>
time 4: <newline>
time 5: notification: client connected

time 0: vnc client connects
time 2: vnc client disconnects

At time 5, when the management app gets the notification, it cannot make any assumptions about the state of the system. You still need timestamps.
And even if you somehow eliminate the issue around masking notifications, you still have socket buffering that introduces the same problem.
If you have one monitor, the problem is much simpler, since events travelling in the same direction (command acknowledge and a notification) cannot be reordered. With a command+wait, the problem is inherent.
Command acknowledge is not an event. Events are out-of-band. Command completions are in-band. Right now, they are synchronous and I expect that in the short term future, we'll have a non-human monitor mode that allows commands to be asynchronous. However, it's a mistake to muddle the distinction between an in-band completion and an out-of-band event. You cannot relate the out-of-band events to commands.
The best you can do is stick a time stamp on a notification and make sure the management tool understands that the notification is reflective of the state when the event happened, not of the current state.
Timestamps are really bad. They don't work at all if the management application is not on the same host. They work badly if it is on the same host, since commands and events will be timestamped at different processes.
Timestamps are relative, not absolute. They should not be used to associate anything with the outside world. In fact, I have no problem making the timestamps relative to QEMU startup, just to ensure that no one tries to do something silly like associate notification timestamps with system time.
FWIW, this problem is not at all unique to QEMU and is generally true of most protocols that support an out-of-band notification mechanism.
command+wait makes it worse. Let's stick with established practice.
What's the established practice? Do you know of any protocol that is line based that does notifications like this? IMAP IDLE is pretty close to "wait-forever". -- Regards, Anthony Liguori

Anthony Liguori wrote:
Avi Kivity wrote:
Fine, let's say we did that; it's *still* racy, because at time 3 the guest may hot remove cpu 2 on its own, since the guest's VCPUs get to run in parallel to the monitor.
A guest can't hotremove a vcpu. It may offline a vcpu, but that's not the same.
Obviously, if both the guest and the management application can initiate the same action, then there will be races. But I don't think that's how things should be -- the guest should request a vcpu to be removed (or added), management thinks and files forms in triplicate, then hotadds or hotremoves the vcpu (most likely after it is no longer needed).
With the proper beaurocracy, there is no race.
You still have the same basic problem:
time 0: (qemu) notify-enable vnc-events
time 1: (qemu) foo <no newline>
time 4: <newline>
time 5: notification: client connected

time 0: vnc client connects
time 2: vnc client disconnects
At time 5, when the management app gets the notification, it cannot make any assumptions about the state of the system. You still need timestamps.
You don't even need the foo <no newline> to trigger this; qemu->user traffic can be arbitrarily delayed (I don't think we should hold notifications on partial input anyway). But there's no race here. The notification at time 5 means that the connect happened sometime before time 5, and that it may not be true now. The user cannot assume anything. A race can only happen against something the user initiated. Suppose we're implementing some kind of single sign on:

(qemu) notify vnc on

... time passes, we want to allow members of group x to log in

(qemu) vnc_set_acl group:x
OK
(qemu) notification: vnc connect aliguori
(qemu)

with a single monitor, we can be sure that the connect happened after the vnc_set_acl. If the notification arrives on a different session, we have no way of knowing that.
And even if you somehow eliminate the issue around masking notifications, you still have socket buffering that introduces the same problem.
If you have one monitor, the problem is much simpler, since events travelling in the same direction (command acknowledge and a notification) cannot be reordered. With a command+wait, the problem is inherent.
Command acknowledge is not an event. Events are out-of-band. Command completions are in-band. Right now, they are synchronous and
That's all correct, but I don't see how that changes anything.
I expect that in the short term future, we'll have a non-human monitor mode that allows commands to be asynchronous.
Then let's defer this until then? 'wait' is not useful for humans, they won't be retyping 'wait' every time something happens.
However, it's a mistake to muddle the distinction between an in-band completion and an out-of-band event. You cannot relate the out-of-band events to commands.
I can, if one happens before the other, and I have a single stream of command completions and event notifications.
The best you can do is stick a time stamp on a notification and make sure the management tool understands that the notification is reflective of the state when the event happened, not of the current state.
Timestamps are really bad. They don't work at all if the management application is not on the same host. They work badly if it is on the same host, since commands and events will be timestamped at different processes.
Timestamps are relative, not absolute. They should not be used to associate anything with the outside world. In fact, I have no problem making the timestamps relative to QEMU startup, just to ensure that no one tries to do something silly like associate notification timestamps with system time.
Dunno, seems totally artificial to me to have to introduce timestamps to compensate for different delays in multiple sockets that we introduced five patches earlier. Please, let's keep this simple.
FWIW, this problem is not at all unique to QEMU and is generally true of most protocols that support an out-of-band notification mechanism.
command+wait makes it worse. Let's stick with established practice.
What's the established practice? Do you know of any protocol that is line based that does notifications like this?
I guess most MUDs?
IMAP IDLE is pretty close to "wait-forever".
IMAP IDLE can be terminated by the client, and so does not require multiple sessions (though IMAP supports them). -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.

Avi Kivity wrote:
(qemu) notify vnc on
... time passes, we want to allow members of group x to log in
(qemu) vnc_set_acl group:x
OK
(qemu) notification: vnc connect aliguori
(qemu)
with a single monitor, we can be sure that the connect happened after the vnc_set_acl. If the notification arrives on a different session, we have no way of knowing that.
Only because there isn't a time stamp associated with the completion of the other monitor command. And you can globally replace timestamp with some sort of incrementing id that's associated with each notification and command completion. You'll need this to support multiple monitors even with your model. IMHO, multiple monitors is a critical feature to support in the long term.
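For illustration, here is a minimal C sketch of that idea (the "OK id=" and "notification id=" wire format is invented for the example, not an existing monitor syntax): one global counter stamps every command completion and every notification, so a client can totally order the two streams even across sessions.

    #include <stdio.h>

    /* One global sequence number shared by completions and
     * notifications; comparing ids orders the two streams. */
    static unsigned long long monitor_seq;

    static void print_completion(FILE *mon, const char *result)
    {
        fprintf(mon, "OK id=%llu %s\n", ++monitor_seq, result);
    }

    static void print_notification(FILE *mon, const char *event)
    {
        fprintf(mon, "notification id=%llu %s\n", ++monitor_seq, event);
    }

A client that sees "notification id=41 vnc-connect" on one session and "OK id=42" for its vnc_set_acl on another knows the connect was emitted before the ACL change completed, with no clocks involved.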
I expect that in the short term future, we'll have a non-human monitor mode that allows commands to be asynchronous.
Then let's defer this until then? 'wait' is not useful for humans, they won't be retyping 'wait' every time something happens.
But wait is useful for management apps today. A wait-forever, which is already in the next series, is also useful for humans. It may not be a perfect interface, but it's a step in the right direction. We have time before the next release and I expect that we'll have a non-human mode before then.
What's the established practice? Do you know of any protocol that is line based that does notifications like this?
I guess most MUDs?
I've never used a MUD before, I think that qualifies as before my time :-)
IMAP IDLE is pretty close to "wait-forever".
IMAP IDLE can be terminated by the client, and so does not require multiple sessions (though IMAP supports them).
Most modern clients use multiple sessions. If you look at IMAP, it doesn't multiplex commands so multiple sessions are necessary to maintain usefulness while performing a long running task. Anyway, I think terminating a wait is a perfectly reasonable requirement. Regards, Anthony Liguori

Anthony Liguori wrote:
Avi Kivity wrote:
(qemu) notify vnc on
... time passes, we want to allow members of group x to log in
(qemu) vnc_set_acl group:x
OK
(qemu) notification: vnc connect aliguori
(qemu)
with a single monitor, we can be sure that the connect happened after the vnc_set_acl. If the notification arrives on a different session, we have no way of knowing that.
Only because there isn't a time stamp associated with the completion of the other monitor command. And you can globally replace timestamp with some sort of incrementing id that's associated with each notification and command completion.
Sure, you can fix the problem, but why introduce it in the first place? I understand the urge for a simple command/response, but introducing multiple sessions breaks the "simple" and introduces new problems.
You'll need this to support multiple monitors even with your model.
Can you explain why? As far as I can tell, if you have async notifications, you can do everything from one monitor.
IMHO, multiple monitors is a critical feature to support in the long term.
Multiple monitors are nice to have (for developers), but I don't see them as critical.
I expect that in the short term future, we'll have a non-human monitor mode that allows commands to be asynchronous.
Then let's defer this until then? 'wait' is not useful for humans, they won't be retyping 'wait' every time something happens.
But wait is useful for management apps today. A wait-forever, which is already in the next series, is also useful for humans. It may not be a perfect interface, but it's a step in the right direction. We have time before the next release and I expect that we'll have a non-human mode before then.
I disagree, I think requiring multiple sessions for controlling a single application is clumsy. I can't think of one protocol which uses it. I don't think IMAP requires multiple sessions (and I don't think commands from one session can affect the other, except through the mail store).
What's the established practice? Do you know of any protocol that is line based that does notifications like this?
I guess most MUDs?
I've never used a MUD before, I think that qualifies as before my time :-)
Well I haven't either. Maybe time to start.
IMAP IDLE is pretty close to "wait-forever".
IMAP IDLE can be terminated by the client, and so does not require multiple sessions (though IMAP supports them).
Most modern clients use multiple sessions. If you look at IMAP, it doesn't multiplex commands so multiple sessions are necessary to maintain usefulness while performing a long running task.
But commands in one session don't affect others.
Anyway, I think terminating a wait is a perfectly reasonable requirement.
It breaks your command/response, though. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.

Avi Kivity wrote:
Anthony Liguori wrote:
IMHO, multiple monitors is a critical feature to support in the long term.
Multiple monitors are nice to have (for developers), but I don't see them as critical.
If you live in a world where there is a single management application that provides the only interface to interact with a QEMU instance, then yes, they aren't critical. The problem with this is that most management applications are lossy by their nature. They expose only a subset of the functionality supported by QEMU.

Currently, the monitor is the "management interface" for QEMU. If we only ever support one instance of that management interface, then it means that if multiple management applications are to interact with a given QEMU instance, they must all use a single API that then allows for multiplexing. I see no reason that QEMU shouldn't do the multiplexing itself, though.

To put it another way, a user that uses libvirt today cannot see QEMU instances that are run manually. That is not true when a user uses libvirt with Xen today, because Xend provides a management interface that is capable of supporting multiple clients. I think it's important to get the same level of functionality for QEMU. N.B. yes, Xend is a horrendous example, especially when your argument has been simplicity vs. complexity.

At the end of the day, I want to be able to run a QEMU instance from the command line, and have virt-manager be able to see it remotely and connect to it. That means multiple monitors, and it means that all commands that change VM state must generate some sort of notification such that libvirt can keep track of the changing state of a VM. Regards, Anthony Liguori

Anthony Liguori wrote:
Avi Kivity wrote:
Anthony Liguori wrote:
IMHO, multiple monitors is a critical feature to support in the long term.
Multiple monitors are nice to have (for developers), but I don't see them as critical.
If you live in a world where there is a single management application that provides the only interface to interact with a QEMU instance, then yes, they aren't critical.
I do (or at least I hope I do). Exposing the monitor to users is a layering violation.
The problem with this is that most management applications are lossy by their nature. They expose only a subset of functionality supported by QEMU.
What if they don't expose a feature because they don't want to make the feature available to the user? What happens when the user changes something that the management application thinks it controls? Do we add notifiers on everything?

The qemu monitor is a different privilege level from being a virtual machine owner. Sure, we could theoretically plug all the holes, for example the user filling up the disk with screendumps. But do we want to reduce security this way? You're taking away control from the management application, due to what are really the management application's misfeatures. You should instead tell the vendor of your management application to add the missing feature.

Oh, and don't expect users of a management application to connect to the qemu monitor to administer their virtual machines. They expect the management application to do that for them. The qemu monitor is an excellent way to control a single VM, but not for controlling many.
Currently, the monitor is the "management interface" for QEMU. If we only ever support one instance of that management interface, then it means that if multiple management applications are to interact with a given QEMU instance, they must all use a single API that then allows for multiplexing. I see no reason that QEMU shouldn't do the multiplexing itself, though.
Again, I don't oppose multiplexing (though I do oppose the wait command, which requires it), and I oppose this "management apps suck, let's telnet to qemu directly" use you propose.
To put it another way, a user that uses libvirt today cannot see QEMU instances that are run manually. That is not true when a user uses libvirt with Xen today because Xend provides a management interface that is capable of supporting multiple clients. I think it's important to get the same level of functionality for QEMU.
N.B. yes, Xend is a horrendous example especially when your argument has been simplicity vs. complexity.
I'm sure libvirt really enjoys it when users use xm commands to change the VM state. What happens when you migrate it, for example? Or add a few dozen vcpus?
At the end of the day, I want to be able to run a QEMU instance from the command line, and have virt-manager be able to see it remotely and connect to it. That means multiple monitors and it means that all commands that change VM state must generate some sort of notification such that libvirt can keep track of the changing state of a VM.
I don't think most management application authors would expose the qemu monitor to users. It sounds like a huge risk, and for what benefit? If there's something interesting you can do with the monitor, add it to the management interface so people can actually use it. They don't buy this stuff so they can telnet into the monitor. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.

Avi Kivity wrote:
Anthony Liguori wrote:
At the end of the day, I want to be able to run a QEMU instance from the command line, and have virt-manager be able to see it remotely and connect to it. That means multiple monitors and it means that all commands that change VM state must generate some sort of notification such that libvirt can keep track of the changing state of a VM.
I don't think most management application authors would expose the qemu monitor to users. It sounds like a huge risk, and for what benefit? If there's something interesting you can do with the monitor, add it to the management interface so people can actually use it. They don't buy this stuff so they can telnet into the monitor.
I want the same as Anthony. I want to do unusual things that libvirt doesn't support and shouldn't have to support itself, such as sending keystrokes to a running VM (from a script), attaching a debugger, and hotplugging network devices that are configured differently to how libvirt would like to do it. I also want these VMs to show in the nice GUI along with other non-debugging VMs, show their resources, start and stop them easily, catch them when they attempt to reboot, and let me do these things remotely.

My solution at the moment is to put a monitor multiplexer outside QEMU (it's a small Perl script). It accepts multiple monitor connections and forwards to QEMU's single monitor, parsing the "^\(qemu\) " prompt. This is obviously silly, but it's what we have to do to get this functionality.

I don't see how adding those low-level monitory things to libvirt is an improvement - debugging and scripted keystrokes are not the sort of functionality libvirt is for - or is it?

The other alternative is not to use libvirt for these VMs, but that means losing functionality that's useful to me (visibility in the GUI), and more importantly it means I have to configure nearly identical VMs in a completely different way (totally different configuration syntax between libvirt and QEMU direct) depending on what I'm going to do with them.

Hence multiplexing monitors, either outside or inside QEMU. Inside is better because its behaviour is better defined. -- Jamie
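For illustration, a rough C sketch of the framing trick such an external multiplexer is forced into (hypothetical code, not the actual Perl script): a command's response is taken to be everything up to the next "(qemu) " prompt, which must be matched byte by byte because the prompt carries no trailing newline.

    #include <stdio.h>

    /* Copy monitor output to the client that issued the current
     * command, stopping when the "(qemu) " prompt reappears. */
    static void forward_response(FILE *mon, FILE *client)
    {
        static const char prompt[] = "(qemu) ";
        size_t matched = 0;
        int c;

        while ((c = fgetc(mon)) != EOF) {
            if ((char)c == prompt[matched]) {
                if (++matched == sizeof(prompt) - 1)
                    return;             /* full prompt seen: response done */
            } else {
                fwrite(prompt, 1, matched, client); /* flush partial match */
                if ((char)c == prompt[0]) {
                    matched = 1;        /* this byte may start a new match */
                } else {
                    matched = 0;
                    fputc(c, client);
                }
            }
        }
    }

Any command output that happens to contain "(qemu) " breaks this framing, which is part of why multiplexing inside QEMU, where message boundaries are known, is better defined.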

On Sun, Apr 12, 2009 at 07:42:17PM +0100, Jamie Lokier wrote:
Avi Kivity wrote:
Anthony Liguori wrote:
At the end of the day, I want to be able to run a QEMU instance from the command line, and have virt-manager be able to see it remotely and connect to it. That means multiple monitors and it means that all commands that change VM state must generate some sort of notification such that libvirt can keep track of the changing state of a VM.
I don't think most management application authors would expose the qemu monitor to users. It sounds like a huge risk, and for what benefit? If there's something interesting you can do with the monitor, add it to the management interface so people can actually use it. They don't buy this stuff so they can telnet into the monitor.
I want the same as Anthony. I want to do unusual things that libvirt doesn't support and shouldn't have to support itself, such as sending keystrokes to a running VM (from a script), attaching a debugger, and hotplugging network devices that are configured differently to how libvirt would like to do it. I also want these VMs to show in the nice GUI along with other non-debugging VMs, show their resources, start and stop them easily, catch them when they attempt to reboot, and let me do these things remotely.
My solution at the moment is to put a monitor multiplexer outside QEMU (it's a small Perl script). It accepts multiple monitor connections and forwards to QEMU's single monitor, parsing the "^\(qemu\) " prompt. This is obviously silly, but it's what we have to do to get this functionality.
Yes indeed it's a little crazy :-) As Anthony mentioned, if libvirt were able to be notified of changes a user makes in the monitor, there's no reason we could not allow end users to access the monitor of a VM libvirt is managing. We just need to make sure libvirt doesn't miss changes like attaching or detaching block devices, etc., because that'll cause crash/data loss later when libvirt migrates or does save/restore, etc., because it'll launch QEMU with the wrong args.
I don't see how adding those low-level monitory things to libvirt is an improvement - debugging and scripted keystrokes are not the sort of functionality libvirt is for - or is it?
I think it could probably be argued that sending fake keystrokes could be within scope. Random ad-hoc debugging is probably out of scope. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

Daniel P. Berrange wrote:
Yes indeed it's a little crazy :-) As Anthony mentioned, if libvirt were able to be notified of changes a user makes in the monitor, there's no reason we could not allow end users to access the monitor of a VM libvirt is managing. We just need to make sure libvirt doesn't miss changes like attaching or detaching block devices, etc., because that'll cause crash/data loss later when libvirt migrates or does save/restore, etc., because it'll launch QEMU with the wrong args.
You still have an inherent race here:

user: plug in disk
libvirt: start migration, still without disk
qemu: libvirt, a disk has been plugged in.
I don't see how adding those low-level monitory things to libvirt is an improvement - debugging and scripted keystrokes are not the sort of functionality libvirt is for - or is it?
I think it could probably be argued that sending fake keystrokes could be within scope. Random ad-hoc debugging probably out of scope.
That means that to debug a problem in the field you have to locate a guest's host, and follow it around as it migrates (or disable migration). -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.

On Tue, Apr 14, 2009 at 12:15:23PM +0300, Avi Kivity wrote:
Daniel P. Berrange wrote:
Yes indeed it's a little crazy :-) As Anthony mentioned, if libvirt were able to be notified of changes a user makes in the monitor, there's no reason we could not allow end users to access the monitor of a VM libvirt is managing. We just need to make sure libvirt doesn't miss changes like attaching or detaching block devices, etc., because that'll cause crash/data loss later when libvirt migrates or does save/restore, etc., because it'll launch QEMU with the wrong args.
You still have an inherent race here.
user: plug in disk
libvirt: start migration, still without disk
qemu: libvirt, a disk has been plugged in.
That is true, but we'd still be considering direct monitor access to be an 'expert' mode of use. If they wish to shoot themselves in the foot by triggering a migration at the same time they are hotplugging, I'm fine if their whole leg gets blown away. Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|

Daniel P. Berrange wrote:
On Tue, Apr 14, 2009 at 12:15:23PM +0300, Avi Kivity wrote:
Daniel P. Berrange wrote:
Yes indeed it's a little crazy :-) As Anthony mentioned, if libvirt were able to be notified of changes a user makes in the monitor, there's no reason we could not allow end users to access the monitor of a VM libvirt is managing. We just need to make sure libvirt doesn't miss changes like attaching or detaching block devices, etc., because that'll cause crash/data loss later when libvirt migrates or does save/restore, etc., because it'll launch QEMU with the wrong args.
You still have an inherent race here.
user: plug in disk
libvirt: start migration, still without disk
qemu: libvirt, a disk has been plugged in.
That is true, but we'd still be considering direct monitor access to be an 'expert' mode of use. If they wish to shoot themselves in the foot by triggering a migration at the same time they are hotplugging, I'm fine if their whole leg gets blown away.
...while there is also nothing that speaks against blocking any device hot-plugging while migration is ongoing, independent of whether there is some management app involved or the user himself plays with multiple monitors. Jan -- Siemens AG, Corporate Technology, CT SE 2 Corporate Competence Center Embedded Linux

Jan Kiszka wrote:
That is true, but we'd still be considering direct monitor access to be an 'expert' mode of use. If they wish to shoot themselves in the foot by triggering a migration at the same time they are hotplugging, I'm fine if their whole leg gets blown away.
...while there is also nothing that speaks against blocking any device hot-plugging while migration is ongoing, independent of whether there is some management app involved or the user himself plays with multiple monitors.
If the management is doing the hotplugging, it should just do it on both sides. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.

Daniel P. Berrange wrote:
On Tue, Apr 14, 2009 at 12:15:23PM +0300, Avi Kivity wrote:
Daniel P. Berrange wrote:
Yes indeed it's a little crazy :-) As Anthony mentioned, if libvirt were able to be notified of changes a user makes in the monitor, there's no reason we could not allow end users to access the monitor of a VM libvirt is managing. We just need to make sure libvirt doesn't miss changes like attaching or detaching block devices, etc., because that'll cause crash/data loss later when libvirt migrates or does save/restore, etc., because it'll launch QEMU with the wrong args.
You still have an inherent race here.
user: plug in disk
libvirt: start migration, still without disk
qemu: libvirt, a disk has been plugged in.
That is true, but we'd still be considering direct monitor access to be an 'expert' mode of use. If they wish to shoot themselves in the foot by triggering a migration at the same time they are hotplugging, I'm fine if their whole leg gets blown away.
What if the system triggers migration automatically (as you'd expect)? And that's just one example; I'm sure there are more. libvirt issues commands expecting some state in qemu. It can't learn of that state from listening on another monitor, because there are delays between the state changing and the notification. If you want things to work reliably, you have to follow the chain of command. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.

Avi Kivity wrote:
Daniel P. Berrange wrote:
Yes indeed it's a little crazy :-) As Anthony mentioned, if libvirt were able to be notified of changes a user makes in the monitor, there's no reason we could not allow end users to access the monitor of a VM libvirt is managing. We just need to make sure libvirt doesn't miss changes like attaching or detaching block devices, etc., because that'll cause crash/data loss later when libvirt migrates or does save/restore, etc., because it'll launch QEMU with the wrong args.
You still have an inherent race here.
user: plug in disk
libvirt: start migration, still without disk
qemu: libvirt, a disk has been plugged in.
Then fix it. The race is not necessary.

user: plug in a disk
libvirt: lock VM against user changes incompatible with migration
qemu: libvirt, lock granted
libvirt: query for current disk state
libvirt: start migration, knows about the disk

The "libvirt, a disk has been plugged in" will be delivered but it's not important. libvirt queries the state of things after it acquires the lock and before it starts migration.
That means that to debug a problem in the field you have to locate a guest's host, and follow it around as it migrates (or disable migration).
That's right you do. Is there any way to debug a guest without disabling migration? I don't think there is at present, so of course you have to disable migration when you debug. Another reason for that "lock against migration" mentioned above. -- Jamie

Jamie Lokier wrote:
Avi Kivity wrote:
Daniel P. Berrange wrote:
Yes indeed its a little crazy :-) As anthony mentioned if libvirt were able to be notified of changes a user makes in the monitor, there's no reason we could not allow end users to access the monitor of a VM libvirt is managing. We just need to make sure libvirt doesn't miss changes like attaching or detaching block devices, etc, because that'll cause crash/data loss later when libvirt migrates or does save/restore, etc because it'll launch QEMU with wrong args
You still have an inherent race here.
user: plug in disk
libvirt: start migration, still without disk
qemu: libvirt, a disk has been plugged in.
Then fix it. The race is not necessary.
user: plug in a disk
libvirt: lock VM against user changes incompatible with migration
qemu: libvirt, lock granted
libvirt: query for current disk state
libvirt: start migration, knows about the disk
The "libvirt, a disk has been plugged in" will be delivered but it's not important. libvirt queries the state of things after it acquires the lock and before it starts migration.
Migration is supposed to be transparent. You're reducing quality of service if you're disabling features while migrating.
That means that to debug a problem in the field you have to locate a guest's host, and follow it around as it migrates (or disable migration).
That's right you do. Is there any way to debug a guest without disabling migration? I don't think there is at present, so of course you have to disable migration when you debug. Another reason for that "lock against migration" mentioned above.
Nothing prevents you from debugging a guest during migration. You'll have to reconnect to the monitor, but that's it. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.

Avi Kivity wrote:
I disagree, I think requiring multiple sessions for controlling a single application is clumsy. I can't think of one protocol which uses it. I don't think IMAP requires multiple sessions (and I don't think commands from one session can affect the other, except through the mail store).
I agree, multiple sessions is silly. IMAP uses multiple sessions because the session is stateful and only one mailbox can be selected, or to do multiple search operations. IMAP servers hate this, btw; the number of users per server for IMAP is something like 10x lower than for POP (or at least was years ago when I actually worked on POP and IMAP).
What's the established practice? Do you know of any protocol that is line based that does notifications like this?
I worked on an appliance-type server product for a while that had something similar to a monitor control port for issuing all general commands to the machine. The protocol used tagged, line-oriented commands, and untagged (*) responses for asynchronous notifications. Bulk data transfers were a bit more complex than they needed to be, as the protocol had a base64-type encoding which required the size to be specified in the transfer command (thus not permitting streaming), but other than that it worked great. ANSI color is also an option to encode and highlight async notices.
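As a sketch of the client side of such a protocol (the tag and '*' conventions follow the description above; all names are hypothetical):

    #include <stdio.h>
    #include <string.h>

    /* Classify one line of server output: untagged '*' lines are
     * asynchronous notifications; a line starting with the tag of the
     * command in flight is that command's completion. */
    static void handle_line(const char *line, const char *pending_tag)
    {
        if (line[0] == '*') {
            printf("async notification: %s\n", line + 1);
        } else if (pending_tag &&
                   strncmp(line, pending_tag, strlen(pending_tag)) == 0) {
            printf("completion of %s: %s\n", pending_tag, line);
        } else {
            printf("unexpected line: %s\n", line);
        }
    }

    /* Example exchange:
     *   client: a001 hotadd_cpu 2
     *   server: * vnc client connected      (async, untagged)
     *   server: a001 OK                     (completion of a001)
     */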
I guess most MUDs?
I've never used a MUD before, I think that qualifies as before my time :-)
Well I haven't either. Maybe time to start.
What? Why have you guys not been using MUDs? I did some development work (obviously not for pay) on a Circle-based MUD implementation for a while. Here's an example of MUD protocol, showing the prompt... it works over just telnet and is line-based with async notifications... any time a notification comes, it reprints a blank prompt; the client line-buffers the input, and a better client than telnet would show your entire line. It is highly desirable if you can control and monitor everything from a single telnet session. To share this experience, telnet hrmud.com 4000 ...

| 24H 100M 82V > kill fido
You tickle the beastly fido as you pierce him.
| 24H 100M 82V >
The beastly fido tries to bite you but bites his tongue instead!
You tickle the beastly fido as you pierce him.
The beastly fido is incapacitated and will slowly die, if not aided.
| 24H 100M 82V >
You tickle the beastly fido as you pierce him.
The beastly fido is mortally wounded, and will die soon, if not aided.
| 24H 100M 82V >
You pierce the beastly fido's heart, you heartbreaker you...
The beastly fido is dead!
You receive 8 experience points.
Your blood freezes as you hear the beastly fido's death cry.
| 24H 100M 82V >
The green gelatinous blob has arrived.
The janitor picks up some trash.
| 24H 100M 82V > sacrifice corpse
You sacrifice the corpse of the beastly fido.

Zachary Amsden wrote:
| 24H 100M 82V >
You pierce the beastly fido's heart, you heartbreaker you...
The beastly fido is dead!
You receive 8 experience points.
Your blood freezes as you hear the beastly fido's death cry.
| 24H 100M 82V >
The green gelatinous blob has arrived.
The janitor picks up some trash.
You know, as silly as this little exercise is, it actually has some interesting points. MUDs present a prompt showing the current status of the player; hit points, mana, and V (movement points) are the default display here, but typically you can customize the prompt to include many variables and syntax to make it easy for you to parse... and simply sending a NULL cmd (CR/LF) gets you a status update whenever you want.

Now, that's sort of the same thing a virtualization management tool might want to monitor in the event of out-of-band asynchronous events - number of online CPUs, RAM, and VM power status. Add command tagging so you can associate responses to specific commands, and you've got a very powerful monitor in a single I/O channel.

Personally, I think relying on asynchronous events to provide reliable status is a bad idea. Management connections can and will get disconnected, buffers will get filled, and event notifications will be dropped. Having a status line that is pollable by the client allows easy access to all this information whenever it is needed, such as in an error condition. Trying to stop an already stopped VM might confuse a management program when it fails, and leaving the online status indeterminate isn't a good thing to present to the user, but if a status line is available, it is very convenient, easy to use, and never loses information.

And it's really useful when your cleric's job is to hit enter to get the leader's status as quickly as possible so they can recall the berserker who is buffing the mob before HP run out and they end up dead. Losing XP from dying sucks, it takes days to get that back. Zach

Hi,
Personally, I think relying on asynchronous events to provide reliable status is a bad idea. Management connections can and will get disconnected, buffers will get filled and event notifications will be dropped.
I somehow like the idea from someone (Jamie?) somewhere in this thread to model the notifications like unix signals. Just flag that something changed. Don't carry the actual info; just ping the management app, which in turn must use "info <whatever>" to get the details. That solves the queuing issue. It also closes the command vs. notification races. Management apps don't need new parsers; the existing "info <foo>" parser(s) will do just fine. cheers, Gerd
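A minimal C sketch of that scheme (all names hypothetical): an event only sets a per-class flag, the monitor emits at most one "notification: <class>" line per class, and the management app requeries with "info <whatever>" for the details.

    #include <stdio.h>

    enum { EV_VNC, EV_WATCHDOG, EV_MIGRATION, EV_MAX };
    static const char *ev_name[EV_MAX] = { "vnc", "watchdog", "migration" };
    static unsigned pending;            /* bitmask of flagged event classes */

    static void event_raise(int ev)     /* called from qemu internals */
    {
        pending |= 1u << ev;            /* idempotent: no queue to fill up */
    }

    static void monitor_flush_events(FILE *mon) /* called when monitor is idle */
    {
        int i;

        for (i = 0; i < EV_MAX; i++) {
            if (pending & (1u << i))
                fprintf(mon, "notification: %s\n", ev_name[i]);
        }
        pending = 0;
    }

Raising the same event a thousand times before the app gets around to reading still produces a single line, and the subsequent "info" reply is always current, which is what closes both the queuing issue and the command vs. notification races.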

Anthony Liguori wrote:
What's the established practice? Do you know of any protocol that is line based that does notifications like this?
Actually, there is one line-oriented protocol that does asynchronous notifications: http://faqs.org/rfcs/rfc1459.html -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.

Avi Kivity wrote:
Oh. If the command generates no output (like most), you can't tell when it completes. I suppose we could have qemu print OK after completing a command.
FWIW, OpenVPN's monitor interface resolves this by prefixing all notification lines with '>'.
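Concretely, a reserved prefix lets one reader loop demultiplex a single monitor stream into responses and events, roughly like this sketch (hypothetical helpers):

    #include <stdio.h>

    /* '>' lines are asynchronous events; everything else belongs to
     * the response of the command currently in flight. */
    static void read_monitor(FILE *mon)
    {
        char line[1024];

        while (fgets(line, sizeof(line), mon)) {
            if (line[0] == '>')
                printf("event: %s", line + 1);  /* hand to event handler */
            else
                printf("response: %s", line);   /* hand to command issuer */
        }
    }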

Avi Kivity wrote:
I think the thing that is missing is that the 'wait' command does not have to be part of the non-human mode. In non-human mode, you are always doing an implicit wait.
I think 'wait' is unusable for humans. If I want qemu to tell me something happened, it's enough to enable notifications. There's no need to tell it to wait every time something happens. That's poll(2), there's no poll(1).
For some purposes I'd prefer 'wait' on a separate monitor. For the same reason that we sometimes redirect the output of background shell commands to a file :-) Solution: 'wait -background' for those async notifications. Or simply 'notify -background event-type', no need for a wait command if you give that option. Or conversely 'notify -channel 1 event-type' to direct the events to channel 1, and 'wait -channel 1' to show events on that channel. -- Jamie

On 04/09/09 16:03, Avi Kivity wrote:
I don't want multiplexed monitor sessions, at all.
I'm very happy to finally see them. Finally one can run vms with libvirt and *still* access the monitor for debugging and development purposes.
I want async notifications added to a single monitor session. That too could be implemented today (as simple as a term_printf("notification: ...\n")).
No. It is simple on the qemu side only. Parsing notifications popping up at random places in the stream is more complex. cheers, Gerd

Gerd Hoffmann wrote:
On 04/09/09 16:03, Avi Kivity wrote:
I don't want multiplexed monitor sessions, at all.
I'm very happy to finally see them. Finally one can run vms with libvirt and *still* access the monitor for debugging and development purposes.
Right, I like them for that purpose as well. But not for ordinary control.
I want async notifications added to a single monitor session. That too could be implemented today (as simple as a term_printf("notification: ...\n")).
No. It is simple on the qemu side only. Parsing notifications popping up at random places in the stream is more complex.
I don't understand why, if there's a common prefix. Everywhere you expect (qemu), also expect notification:. -- I have a truly marvellous patch that fixes the bug which this signature is too narrow to contain.

Avi Kivity wrote:
Gerd Hoffmann wrote:
On 04/09/09 16:03, Avi Kivity wrote:
I don't want multiplexed monitor sessions, at all.
I'm very happy to finally see them. Finally one can run vms with libvirt and *still* access the monitor for debugging and development purposes.
Right, I like them for that purpose as well. But not for ordinary control.
How do you want to differentiate? What further complications would this bring us?
I want async notifications added to a single monitor session. That too could be implemented today (as simple as a term_printf("notification: ...\n")).
No. It is simple on the qemu side only. Parsing notifications popping up at random places in the stream is more complex.
I don't understand why, if there's a common prefix.
Everywhere you expect (qemu), also expect notification:.
Please no more async notifications to the monitors. They are just ugly to parse, at least for us humans. I don't want to see any notification in the middle of my half-typed command, for example. Jan -- Siemens AG, Corporate Technology, CT SE 2 Corporate Competence Center Embedded Linux

Jan Kiszka wrote:
Avi Kivity wrote:
Gerd Hoffmann wrote:
On 04/09/09 16:03, Avi Kivity wrote:
I don't want multiplexed monitor sessions, at all.
I'm very happy to finally see them. Finally one can run vms with libvirt and *still* access the monitor for debugging and development purposes.
Right, I like them for that purpose as well. But not for ordinary control.
How do you want to differentiate? What further complications would this bring us?
I'm not sure I understand your questions. Multiple monitor sessions are like multiple shell sessions. I don't think a control program should use more than one session, but it should allow a developer to connect to issue 'info registers' and 'x/20i' commands. Of course if a developer issues 'quit' or a hotunplug command, things will break.
Please no more async notifications to the monitors. They are just ugly to parse, at least for us humans. I don't want to see any notification in the middle of my half-typed command, for example.
If we can identify an interactive session, we might redraw the partial command after the prompt. btw, why would a human enable notifications? Note notifications enabled on the management session will only be displayed there. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.

Avi Kivity wrote:
Jan Kiszka wrote:
Avi Kivity wrote:
Gerd Hoffmann wrote:
On 04/09/09 16:03, Avi Kivity wrote:
I don't want multiplexed monitor sessions, at all.
I'm very happy to finally see them. Finally one can run vms with libvirt and *still* access the monitor for debugging and development purposes.
Right, I like them for that purpose as well. But not for ordinary control.
How do you want to differentiate? What further complications would this bring us?
I'm not sure I understand your questions. Multiple monitor sessions are like multiple shell sessions. I don't think a control program should use more than one session, but it should allow a developer to connect to issue 'info registers' and 'x/20i' commands. Of course if a developer issues 'quit' or a hotunplug command, things will break.
We agree if we want decoupled states of the monitor sessions (one session should definitely not be used to configure the output of another one). But I see no issues with collecting the events in one session that happen to be caused by activity in some other session. But maybe I'm missing your point.
Please no more async notifications to the monitors. They are just ugly to parse, at least for us humans. I don't want to see any notification in the middle of my half-typed command, for example.
If we can identify an interactive session, we might redraw the partial command after the prompt.
Uhh, please not this kind of terminal reprinting. The terminal user must keep full control over when things can be printed.
btw, why would a human enable notifications? Note notifications enabled on the management session will only be displayed there.
It's true that the common use case for events will be monitoring applications. Still, enabling them for testing or simple scripting should not switch on an ugly output mode or complicate the parsing. Jan -- Siemens AG, Corporate Technology, CT SE 2 Corporate Competence Center Embedded Linux

Jan Kiszka wrote:
I'm not sure I understand your questions. Multiple monitor sessions are like multiple shell sessions. I don't think a control program should use more than one session, but it should allow a developer to connect to issue 'info registers' and 'x/20i' commands. Of course if a developer issues 'quit' or a hotunplug command, things will break.
We agree if we want decoupled states of the monitor sessions (one session should definitely not be used to configure the output of another one). But I see no issues with collecting the events in one session that happen to be caused by activity in some other session. But maybe I'm missing your point.
The management application will still think the device is plugged in, and things will break if it isn't. Of course if you asked for notification X on session Y, then event X should be delivered to session Y no matter how it originated (but not to session Z).
Please no more async notifications to the monitors. They are just ugly to parse, at least for us humans. I don't want to see any notification in the middle of my half-typed command, for example.
If we can identify an interactive session, we might redraw the partial command after the prompt.
Uhh, please not this kind of terminal reprinting. The terminal user must keep full control over when things can be printed.
Very well. I guess a human user can open another session for notifications, if they are so inclined.
btw, why would a human enable notifications? Note notifications enabled on the management session will only be displayed there.
It's true that the common use case for events will be monitoring applications. Still, enabling them for testing or simple scripting should not switch on an ugly output mode or complicate the parsing.
Fair enough. -- Do not meddle in the internals of kernels, for they are subtle and quick to panic.

Avi Kivity wrote:
I'm not sure I understand your questions. Multiple monitor sessions are like multiple shell sessions. I don't think a control program should use more than one session, but it should allow a developer to connect to issue 'info registers' and 'x/20i' commands. Of course if a developer issues 'quit' or a hotunplug command, things will break.
In most cases I think commands like 'stop' or a hotunplug command shouldn't break libvirt or other control programs. As long as there's a way for libvirt to query the current hotplug state and be notified of changes, why shouldn't it behave much the same as if you asked libvirt to do the same action itself? -- Jamie

On Wed, 2009-04-08 at 13:34 -0500, Anthony Liguori wrote:
Right now only one monitor device can be enabled at a time. In order to support asynchronous notification of events, I would like to introduce a 'wait' command that waits for an event to occur. This implies that we need an additional monitor session to allow commands to still be executed while waiting for an asynchronous notification.
Was there any consensus reached in this thread? I'm once again looking for ways to communicate qemu watchdog events to libvirt. With these patches, libvirt could open a second monitor connection to qemu, and in the second one execute "wait_event". When the watchdog triggers, the wait command would print the event, libvirt would get the fd "data available" notification, and create a domain event. Right? -- Hollis Blanchard IBM Linux Technology Center

Hollis Blanchard wrote:
On Wed, 2009-04-08 at 13:34 -0500, Anthony Liguori wrote:
Right now only one monitor device can be enabled at a time. In order to support asynchronous notification of events, I would like to introduce a 'wait' command that waits for an event to occur. This implies that we need an additional monitor session to allow commands to still be executed while waiting for an asynchronous notification.
Was there any consensus reached in this thread? I'm once again looking for ways to communicate qemu watchdog events to libvirt.
We can do multiple monitors as a debugging tool, but to support events, a proper machine monitor mode is a prerequisite.

The real requirement is that events are obtainable via a single communication channel instead of requiring two separate communication channels. The internal implementation will look a lot like these patches.

The reasoning for requiring a single channel is that coordinating between the two channels is expected to be prohibitively difficult. To have a single channel, we need a machine mode. It cannot be done in a human readable fashion.

I think this summarizes the consensus we reached. I don't agree fully with the above, but I'm okay with it. Would you agree, Avi? -- Regards, Anthony Liguori

Anthony Liguori wrote:
Hollis Blanchard wrote:
On Wed, 2009-04-08 at 13:34 -0500, Anthony Liguori wrote:
Right now only one monitor device can be enabled at a time. In order to support asynchronous notification of events, I would like to introduce a 'wait' command that waits for an event to occur. This implies that we need an additional monitor session to allow commands to still be executed while waiting for an asynchronous notification.
Was there any consensus reached in this thread? I'm once again looking for ways to communicate qemu watchdog events to libvirt.
We can do multiple monitors as a debugging tool, but to support events, a proper machine monitor mode is a prerequisite.
The real requirement is that events are obtainable via a single communication channel instead of requiring two separate communication channels. The internal implementation will look a lot like these patches.
The reasoning for requiring a single channel is that coordinating between the two channels is expected to be prohibitively difficult. To have a single channel, we need a machine mode. It cannot be done in a human readable fashion.
I think this summarizes the consensus we reached. I don't agree fully with the above but I'm okay with it.
If you don't agree with it, it isn't a consensus.
Would you agree Avi?
It represents my views fairly accurately. I'm not convinced that you can't do event notifications without machine mode, but on the other hand I do think introducing machine mode and layering notifications on top of that is the best way to proceed, so I can't complain. -- error compiling committee.c: too many arguments to function
participants (11)
- Anthony Liguori
- Anthony Liguori
- Avi Kivity
- Blue Swirl
- Charles Duffy
- Daniel P. Berrange
- Gerd Hoffmann
- Hollis Blanchard
- Jamie Lokier
- Jan Kiszka
- Zachary Amsden