On a Monday in 2021, BiaoxiangYe wrote:
From: BiaoXiang Ye <yebiaoxiang(a)huawei.com>
Setting the system time backward would lead to a
multiplication overflow in function virKeepAliveStart.
FunctionvirKeepAliveTimerInternal got the same bug too.
Backtrace below:
#0 0x0000ffffae898470 in raise () from /usr/lib64/libc.so.6
#1 0x0000ffffae89981c in abort () from /usr/lib64/libc.so.6
#2 0x0000ffffaf9a36a8 in __mulvsi3 () from /usr/lib64/libvirt.so.0
#3 0x0000ffffaf8fd9e8 in virKeepAliveStart (ka=0xaaaaf954ce10,
interval=interval@entry=0,
count=count@entry=0) at ../../src/rpc/virkeepalive.c:283
#4 0x0000ffffaf908560 in virNetServerClientStartKeepAlive (client=0xaaaaf954cbe0)
at ../../src/rpc/virnetserverclient.c:1628
#5 0x0000aaaac57eb6dc in remoteDispatchConnectSupportsFeature (server=0xaaaaf95309d0,
msg=0xaaaaf9549d90, ret=0xffff8c007fc0, args=0xffff8c002e70, rerr=0xffff9ea054a0,
client=0xaaaaf954cbe0) at ../../src/remote/remote_daemon_dispatch.c:5063
#6 remoteDispatchConnectSupportsFeatureHelper (server=0xaaaaf95309d0,
client=0xaaaaf954cbe0,
msg=0xaaaaf9549d90, rerr=0xffff9ea054a0, args=0xffff8c002e70, ret=0xffff8c007fc0)
at ./remote/remote_daemon_dispatch_stubs.h:3503
#7 0x0000ffffaf9053a4 in virNetServerProgramDispatchCall(msg=0xaaaaf9549d90,
client=0xaaaaf954cbe0,
server=0x0, prog=0xaaaaf953a170) at ../../src/rpc/virnetserverprogram.c:451
#8 virNetServerProgramDispatch (prog=0xaaaaf953a170, server=0x0,
server@entry=0xaaaaf95309d0,
client=0xaaaaf954cbe0, msg=0xaaaaf9549d90) at
../../src/rpc/virnetserverprogram.c:306
#9 0x0000ffffaf90a6bc in virNetServerProcessMsg (msg=<optimized out>,
prog=<optimized out>,
client=<optimized out>, srv=0xaaaaf95309d0) at
../../src/rpc/virnetserver.c:137
#10 virNetServerHandleJob (jobOpaque=0xaaaaf950df80, opaque=0xaaaaf95309d0)
at ../../src/rpc/virnetserver.c:154
#11 0x0000ffffaf812e14 in virThreadPoolWorker (opaque=<optimized out>)
at ../../src/util/virthreadpool.c:163
#12 0x0000ffffaf81237c in virThreadHelper (data=<optimized out>) at
../../src/util/virthread.c:246
#13 0x0000ffffaea327ac in ?? () from /usr/lib64/libpthread.so.0
#14 0x0000ffffae93747c in ?? () from /usr/lib64/libc.so.6
(gdb) frame 3
#3 0x0000ffffaf8fd9e8 in virKeepAliveStart (ka=0xaaaaf954ce10,
interval=interval@entry=0,
count=count@entry=0) at ../../src/rpc/virkeepalive.c:283
283 timeout = ka->interval - delay;
(gdb) list
278 now = time(NULL);
279 delay = now - ka->lastPacketReceived; <='delay' got a negative
value
280 if (delay > ka->interval)
281 timeout = 0;
282 else
283 timeout = ka->interval - delay;
284 ka->intervalStart = now - (ka->interval - timeout);
285 ka->timer = virEventAddTimeout(timeout * 1000, virKeepAliveTimer, <=
multiplication overflow
286 ka, virObjectFreeCallback);
287 if (ka->timer < 0)
(gdb) p now
$2 = 18288001
(gdb) p ka->lastPacketReceived
$3 = 1609430405
Signed-off-by: BiaoXiang Ye <yebiaoxiang(a)huawei.com>
---
src/rpc/virkeepalive.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)
diff --git a/src/rpc/virkeepalive.c b/src/rpc/virkeepalive.c
index 860b91b6b1..e662484ea6 100644
--- a/src/rpc/virkeepalive.c
+++ b/src/rpc/virkeepalive.c
@@ -119,12 +119,25 @@ virKeepAliveTimerInternal(virKeepAlivePtr ka,
if (ka->interval <= 0 || ka->intervalStart == 0)
return false;
+ if (now < ka->intervalStart) {
+ VIR_WARN("system time jump back detected. now=%ld,"
+ " intervalStart=%ld",
+ now, ka->intervalStart);
+ ka->intervalStart = now;
+ }
+
Instead of detecting a jump backwards, we could use a monotonic timer
that does not go backwards.
g_get_monotonic_time () from GLib should do that
https://developer.gnome.org/glib/stable/glib-Date-and-Time-Functions.html...
(Although they say the timer might not tick while the machine is
suspended - not sure whether we need that here or not)
Jano
if (now - ka->intervalStart < ka->interval) {
timeval = ka->interval - (now - ka->intervalStart);
virEventUpdateTimeout(ka->timer, timeval * 1000);
return false;
}
+ if (now < ka->lastPacketReceived) {
+ VIR_WARN("system time jump back detected. now=%ld,"
+ " lastPacketReceived=%ld",
+ now, ka->lastPacketReceived);
+ ka->lastPacketReceived = now;
+ }
timeval = now - ka->lastPacketReceived;
PROBE(RPC_KEEPALIVE_TIMEOUT,
"ka=%p client=%p countToDeath=%d idle=%d",