On 5/4/20 10:07 AM, Peter Krempa wrote:
On Fri, May 01, 2020 at 16:09:04 +0900, MIKI Nobuhiro wrote:
> The waiting time to acquire the lock times out, which leads to a segment fault.
Could you please elaborate here? Adding this band-aid is pointless if it
can timeout later. We do want to fix any locking issue but without
information we can't really.
> In essence we should make improvements around locks, but as a workaround we
> will change the timeout to allow the user to increase it.
> This value was defined as 30 seconds, so use it as the default value.
> The logs are as follows:
>
> ```
> Timed out during operation: cannot acquire state change lock \
> (held by monitor=remoteDispatchDomainCreateWithFlags)
> libvirtd.service: main process exited, code=killed,status=11/SEGV
> ```
Unfortunately I don't consider this a proper justification for the
change below. Either re-state why you want this, e.g. saying that
shortening time may give users greater feedback, but mentioning that it
works around a crash is not acceptable as a justification for something
which doesn't fix the crash.
Agreed. Allowing users to configure the timeout makes sense - we already
do that for other timeouts, but if it is masking a real bug we need to
fix it first. Do you have any steps to reproduce the bug? Are you able
to get the stack trace from the coredump?
> Signed-off-by: MIKI Nobuhiro <nmiki(a)yahoo-corp.jp>
> ---
> docs/news.xml | 9 +++++++++
> src/qemu/libvirtd_qemu.aug | 1 +
> src/qemu/qemu.conf | 3 +++
> src/qemu/qemu_conf.c | 3 +++
> src/qemu/qemu_conf.h | 2 ++
> src/qemu/qemu_domain.c | 7 ++-----
> src/qemu/test_libvirtd_qemu.aug.in | 1 +
> 7 files changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/docs/news.xml b/docs/news.xml
> index 80819ae..3755c49 100644
> --- a/docs/news.xml
> +++ b/docs/news.xml
> @@ -119,6 +119,15 @@
> Bounds Checking) and IBS (Indirect Branch Speculation).
> </description>
> </change>
> + <change>
> + <summary>
> + conf: Add job wait time setting to qemu.conf
> + </summary>
> + <description>
> + How many seconds the new job waits for the existing job to finish.
> + This only affects if you are using qemu as driver.
> + </description>
> + </change>
> </section>
> <section title="Improvements">
> </section>
Changes to news.xml always must be in a separate commit.
Just a short explanation - this is to ease possible backports. For
instance, if there is a bug fix in version X, but a distro wants to
backport it to version X-1 then the news.xml looks completely different
there and the cherry-pick won't apply cleanly.
> diff --git a/src/qemu/libvirtd_qemu.aug b/src/qemu/libvirtd_qemu.aug
> index 404498b..76c896e 100644
> --- a/src/qemu/libvirtd_qemu.aug
> +++ b/src/qemu/libvirtd_qemu.aug
> @@ -97,6 +97,7 @@ module Libvirtd_qemu =
> | bool_entry "dump_guest_core"
> | str_entry "stdio_handler"
> | int_entry "max_threads_per_process"
> + | int_entry "qemu_job_wait_time"
>
> let device_entry = bool_entry "mac_filter"
> | bool_entry "relaxed_acs_check"
> diff --git a/src/qemu/qemu.conf b/src/qemu/qemu.conf
> index abdbf07..a05aaa5 100644
> --- a/src/qemu/qemu.conf
> +++ b/src/qemu/qemu.conf
> @@ -612,6 +612,9 @@
> #
> #max_threads_per_process = 0
>
> +# How many seconds the new job waits for the existing job to finish.
> +#qemu_job_wait_time = 30
> +
> # If max_core is set to a non-zero integer, then QEMU will be
> # permitted to create core dumps when it crashes, provided its
> # RAM size is smaller than the limit set.
Rest of the patch looks good code-wise.
Yes.
Michal