On Thu, Oct 06, 2016 at 10:23:05 +0300, Nikolay Shirokovskiy wrote:
On 05.10.2016 18:13, Peter Krempa wrote:
> On Mon, Sep 12, 2016 at 17:34:39 +0300, Nikolay Shirokovskiy wrote:
>> Hi, all.
>>
>> If migration fails because the destination qemu exits unexpectedly, the user
>> receives the qemu log in the error message. Unfortunately, the log is truncated
>> and the most interesting part is missing (an example of such a log is below [1]).
>>
>> Actually, in most cases the first patch will be enough to fix the issue.
>> Originally I thought the problem was that qemu logging and reading the log are
>> not in sync (which is true), so I tried to fix that as well in the next patches.
>>
>> * diff from v1:
>>
>> 1. split changes to libvirtd and virtlogd to different patches
>> 2. split virtlogd patch further
>> 3. simplify handling eofs and hangups in draining function
>>
>> [1] log example:
>>
>> CPU Reset (CPU 0)
>> EAX=00000000 EBX=00000000 ECX=00000000 EDX=00000000
>> ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
>> EIP=00000000 EFL=00000000 [-------] CPL=0 II=0 A20=0 SMM=0 HLT=0
>> ES =0000 00000000 00000000 00000000
>> CS =0000 00000000 00000000 00000000
>> SS =0000 00000000 00000000 00000000
>> DS =0000 00000000 00000000 00000000
>> FS =0000 00000000 00000000 00000000
>> GS =0000 00000000 00000000 00000000
>> LDT=0000 00000000 00000000 00000000
>> TR =0000 00000000 00000000 00000000
>> GDT= 00000000 00000000
>> IDT= 00000000 00000000
>> CR0=00000000 CR2=00000000 CR3=00000000 CR4=00000000
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
>> DR6=0000000000000000 DR7=0000000000000000
>> CCS=00000000 CCD=00000000 CCO=DYNAMIC
>> EFER=0000000000000000
>> FCW=0000 FSW=0000 [ST=0] FTW=ff MXCSR=00000000
>> FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
>> FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
>> FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
>> FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
>> XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
>> XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
>> XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
>> XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
>> CPU Reset (CPU 1)
>> EAX=00000000 EBX=00000000 ECX=00000000 EDX=000206a1
>> ESI=00000000 EDI=00000000 EBP=00000000 ESP=00000000
>> EIP=0000fff0 EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
>> ES =0000 00000000 0000ffff 00009300
>> CS =f000 ffff0000 0000ffff 00009b00
>> SS =0000 00000000 0000ffff 00009300
>> DS =0000 00000000 0000ffff 00009300
>> FS =0000 00000000 0000ffff 00009300
>> GS =0000 00000000 0000ffff 00009300
>> LDT=0000 00000000 0000ffff 00008200
>> TR =0000 00000000 0000ffff 00008b00
>> GDT= 00000000 0000ffff
>> IDT= 00000000 0000ffff
>> CR0=60000010 CR2=00000000 CR3=00000000 CR4=00000000
>> DR0=0000000000000000 DR1=0000000000000000 DR2=0000000000000000 DR3=0000000000000000
>> DR6=00000000ffff0ff0 DR7=0000000000000400
>> CCS=00000000 CCD=00000000 CCO=DYNAMIC
>> EFER=0000000000000000
>> FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
>> FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
>> FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
>> FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
>> FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
>> XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
>> XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
>> XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
>> XMM06=00000000000000000000000000000000 XMM07=000
>> qemu: terminating on signal 15 from pid 168133
>
> I don't think that reporting all of the above is a good idea. We should
> perhaps report at most the last two lines.
>
We already report about half of this; this patch just removes the random
truncation. As for reporting at most two lines, AFAIU one cannot say which part
of this log will be useful for crash investigation.

This is not about the log but about the error message. The error message
containing ALL of the above stuff is useless for any user. For crash
investigation you can always get the full log from the actual log file.

When I implemented this I did not see such an error message. Otherwise I'd
have truncated it to the end, since all of the above in an error message
is clearly ridiculous.

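
For illustration only, truncating to the end could look roughly like the
sketch below. It is Python rather than libvirt's C, and tail_lines and its
max_lines parameter are hypothetical names, not existing libvirt functions.
The sample log reuses the last lines of the example quoted above.

```python
def tail_lines(log: str, max_lines: int = 2) -> str:
    """Return at most the last `max_lines` non-empty lines of `log`."""
    if max_lines <= 0:
        # Guard: lines[-0:] would return every line, not none.
        return ""
    # Drop blank lines so they do not use up the line budget.
    lines = [line for line in log.splitlines() if line.strip()]
    return "\n".join(lines[-max_lines:])


# Hypothetical input: the tail of the qemu log from the example.
log = (
    "XMM04=00000000000000000000000000000000\n"
    "XMM06=00000000000000000000000000000000 XMM07=000\n"
    "qemu: terminating on signal 15 from pid 168133\n"
)

# Only the end of the log goes into the error message; the full log
# stays available in the actual log file.
print(tail_lines(log, max_lines=1))
```

Keeping the tail rather than the head means the message ends with the line
that says why qemu exited, not with a register dump.
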
Peter