On 4/13/23 18:24, Peter Maydell wrote:
On Mon, 3 Apr 2023 at 19:33, Richard Henderson
<richard.henderson(a)linaro.org> wrote:
>
> On 4/3/23 07:46, Peter Maydell wrote:
>> uint32_t curr_cflags(CPUState *cpu)
>> {
>> uint32_t cflags = cpu->tcg_cflags;
>> + TCGState *tcgstate = TCG_STATE(current_accel());
>
> As mentioned against the cover, this is a very hot path.
>
> We should try for something less expensive. Perhaps as simple as
>
> return cpu->tcg_cflags | tcg_cflags_global;
>
> where cpu->tcg_cflags is updated with cpu->singlestep_enabled.
I feel like that introduces atomicity issues. If I'm reading
the code right, curr_cflags() is called without any kind
of lock held. At the moment we get away with this because
'singlestep' is an int and is always going to be atomically
updated. If we make tcg_cflags_global a value which might have
multiple bits set or not set I'm not entirely sure what the
right way is to handle the reads and writes of it.
qatomic_read() here, will dtrt for no tearing on the read.
(Not that we should have expected one anyway, for uint32_t.)
I think we can assume we have the iothread lock at any
point where we want to change either 'singlestep' or
the 'nochain' option, at least.
Indeed, it can only be changed by the monitor, under user control, so even without a lock
there's no real race there.
Using qatomic_set(&global, new_value) is sufficient to match the qatomic_read() for no
tearing. Concurrent threads will see the old value or the new value, but not garbage,
which is just fine.
We probably need to kick all cpus, so that they come out of long-running TB chains to see
the new value and re-translate.
r~