Hi all,
[Issue observed]
After 'nova reboot <server>', the console output contains only the
latest boot of the server. The console output of the previous boot
is lost because console.log is truncated [1]. Rebooting from within
the VM instance (# sudo reboot) or with 'virsh reboot <instance>'
behaves differently: console.log keeps growing, with the new output
appended. This loss of history makes some debugging scenarios
difficult, because the output of the earlier boot is no longer
available.
Please point me to any existing solution or blueprint for this
issue, if one is already planned. Otherwise, please comment on my
analysis and proposed solutions below.
[Analysis]
On the compute node, Nova's libvirt driver tries to restart the
server instance gracefully by attempting a soft_reboot first. If
soft_reboot fails, it attempts a hard_reboot. As part of
soft_reboot, it brings the instance down by calling shutdown() and
then calls createWithFlags() to bring it back up. As a result, the
qemu-kvm process for the instance is terminated and a new process is
launched. QEMU opens the chardev file with O_TRUNC, so the previous
content of console.log is lost.
On the other hand, during 'virsh reboot <instance>', the same
qemu-kvm process keeps running, and libvirt actually does a
qemuDomainSetFakeReboot(). Thus the same file keeps capturing the
new console output, appended after the old.
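The difference between the two paths can be reproduced outside of
QEMU. The sketch below only mimics the open(2) flags involved; it is
not QEMU's code, and the helper names (write_boot, demo) are made up
for illustration. Each call to write_boot() stands in for a freshly
launched process opening the console file.

```python
import os
import tempfile


def write_boot(path, text, flags):
    """Open `path` as a newly started process would, write one boot's output."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | flags, 0o644)
    try:
        os.write(fd, text.encode())
    finally:
        os.close(fd)


def read(path):
    with open(path) as f:
        return f.read()


def demo():
    """Write two 'boots' to one file opened with O_TRUNC and one with O_APPEND."""
    workdir = tempfile.mkdtemp()
    truncated = os.path.join(workdir, "console-trunc.log")
    appended = os.path.join(workdir, "console-append.log")
    for path, flags in ((truncated, os.O_TRUNC), (appended, os.O_APPEND)):
        write_boot(path, "boot 1\n", flags)
        write_boot(path, "boot 2\n", flags)
    return read(truncated), read(appended)


if __name__ == "__main__":
    trunc_content, append_content = demo()
    print("O_TRUNC :", repr(trunc_content))
    print("O_APPEND:", repr(append_content))
```

With O_TRUNC only the second boot survives, which matches the
"file truncated" message in [1]; with O_APPEND both boots are kept.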
[Proposals for solution]
1. Nova, driven by configuration, backs up the console file before
creating the domain in the reboot scenario, e.g. by saving
console.log as console.log.0. How many such backups to keep, the
maximum size of the file, and whether to use 'logrotate' can all be
exposed as Nova configuration parameters.
Pros - Since Nova's libvirt driver deliberately does not use
libvirt's reboot() functionality, the problem is best addressed in
that same layer.
Cons - Making Nova's libvirt layer aware of the console files is not
clean from a modularity standpoint.
2. virDomainCreateWithFlags() gets a new flag value indicating that
logs should be appended instead of truncated when the FILE option is
used. This setting is passed to QEMU when the process is spawned.
- The changes would not be in OpenStack code, but in libvirt and
QEMU.
Pros - The feature is also useful for 'virsh shutdown <instance>'
followed by 'virsh start <instance>'.
Cons - We may have to implement something similar for all libvirt
drivers.
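For proposal 1, the backup step could look roughly like the sketch
below. This is only an illustration under assumptions: the function
name rotate_console_log and the max_backups parameter are
hypothetical and do not exist in Nova today, and file ownership and
size limits are ignored.

```python
import os


def rotate_console_log(path, max_backups=3):
    """Rename console.log to console.log.0, shifting older backups up by one.

    Hypothetical helper: keeps at most `max_backups` files named
    <path>.0 (newest) .. <path>.<max_backups - 1> (oldest).
    """
    if max_backups < 1 or not os.path.exists(path):
        return

    # Drop the oldest backup so the shift below never exceeds the limit.
    oldest = "%s.%d" % (path, max_backups - 1)
    if os.path.exists(oldest):
        os.remove(oldest)

    # Shift the remaining backups, highest index first, so nothing is
    # overwritten: .1 -> .2, then .0 -> .1, and so on.
    for i in range(max_backups - 2, -1, -1):
        src = "%s.%d" % (path, i)
        if os.path.exists(src):
            os.rename(src, "%s.%d" % (path, i + 1))

    # Finally move the live log aside; the new process then starts a fresh file.
    os.rename(path, "%s.0" % path)
```

The driver would call something like this just before
createWithFlags(), so that the O_TRUNC open in QEMU only ever hits a
file that has already been moved aside.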
Regards,
Suro
Surojit Pathak
Refs -
[1] Snippet
# tail -f /opt/stack/data/nova/instances/cea9a3d9-f833-4ded-90b8-c85b7da3f758/console.log
...
[ OK ] Started udev Coldplug all Devices.
[ OK ] Started Create static device nodes in /dev.
Starting udev Kernel Device Manager...
[ 36.938075] EXT4-fs (vda1): re-mounted. Opts: (null)
[ OK ] Started Remount Root and Kernel File Systems.
Starting Load/Save Random Seed...
[ OK ] Reached target Local File Systems (Pre).
Starting Configure read-only root support...
[ OK ] Started Load/Save Random Seed.
tail: /opt/stack/data/nova/instances/cea9a3d9-f833-4ded-90b8-c85b7da3f758/console.log: file truncated
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Initializing cgroup subsys cpuacct
[ 0.000000] Linux version 3.11.10-301.fc20.x86_64 (mockbuild@bkernel01.phx2.fedoraproject.org) (gcc version 4.8.2 20131017 (Red Hat 4.8.2-1) (GCC) ) #1 SMP Thu Dec 5 14:01:17 UTC 2013
[ 0.000000] Command line: ro root=UUID=314b4a27-3885-49e8-9415-af098db4fd2a no_timer_check console=tty1 console=ttyS0,115200n8 initrd=/boot/initramfs-3.11.10-301.fc20.x86_64.img BOOT_IMAGE=/boot/vmlinuz-3.11.10-301.fc20.x86_64
....