On 2015/5/15 19:09, Michal Privoznik wrote:
On 15.05.2015 08:26, zhang bo wrote:
> When we change system clock to years ago, a certain CPU may use up 100% cputime.
> The reason is that in function virEventPollCalculateTimeout(), we assign the
> unsigned long long result to an INT variable,
> *timeout = then - now; // timeout is INT, and then/now are long long
> if (*timeout < 0)
> *timeout = 0;
> there's a chance that variable @then minus variable @now may be a very large
number
> that overflows INT value expression, then *timeout will be negative and be assigned
to 0.
> Next the 'poll' in function virEventPollRunOnce() will get into an
'endless' while loop there.
> thus, the cpu that virEventPollRunOnce() thread runs on will go up to 100%.
>
> Although as we discussed before in
https://www.redhat.com/archives/libvir-list/2015-May/msg00400.html
> it should be prohibited to set-time while other applications are running, but it does
> seems to have no harm to make the codes more robust.
>
> Signed-off-by: Wang Yufei <james.wangyufei(a)huawei.com>
> Signed-off-by: Zhang Bo <oscar.zhangbo(a)huawei.com>
> ---
> src/util/vireventpoll.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/src/util/vireventpoll.c b/src/util/vireventpoll.c
> index ffda206..5f5a149 100644
> --- a/src/util/vireventpoll.c
> +++ b/src/util/vireventpoll.c
> @@ -357,9 +357,10 @@ static int virEventPollCalculateTimeout(int *timeout)
> return -1;
>
> EVENT_DEBUG("Schedule timeout then=%llu now=%llu", then, now);
> - *timeout = then - now;
> - if (*timeout < 0)
> + if (then < now)
> *timeout = 0;
> + else
> + *timeout = (then - now) & 0x7FFFFFFF;
You're trying to make this an unsigned value. What's wrong with plain
typecast?
No, I'm trying to give an positive value that an int variable can express,
> } else {
> *timeout = -1;
> }
>
I must say this is ugly. If the system clock is changed, all the
timeouts should fire, shouldn't they? Otherwise important events can be
missed.
In fact, the events will not be handled.
In virEventPollDispatchTimeouts
if (eventLoop.timeouts[i].expiresAt <= (now+20)) {
...
(cb)(timer, opaque);
'expiresAt' is much bigger than 'now' because the system clock has been
changed to long long ago.
Assign an unsigned long long value to an int value is out of control, we don't know
whether it be negative
or positive because of overflow of int. I know this patch can not fix system clock changed
problem, but at
least, the 'timeout' will be under control, and the cpu up to 100% situation can
be fixed, that's my point.
Michal
.