On 08/03/2011 06:44 AM, Nicolas Sebrecht wrote:
The 02/08/11, Nicolas Sebrecht wrote:
> I'm stuck!
>
> As I said before, I have one working (in production) system and other,
> failing Gentoo systems (including the testing machine).
>
> I've checked the working system against the testing machine and looked for
> differences. I removed the differences one by one (luckily the systems are
> very close to each other) but couldn't get the testing machine to
> work.
>
> I've checked the Linux kernel configuration (first) and the whole system for
> installed packages and compilation options. For each difference found
> I've done:
> - compilation and reinstallation of ALL the packages;
> - reboot;
> - tests.
>
> Now, I have two almost identical systems behaving differently. Some
> minor differences remain in the installed packages (missing on testing):
> - lshw
> - pv
> - colorgcc
> - autofs
> - iperf
>
> Hardware isn't the same, though. Main differences are:
> - Intel(R) Xeon(R) CPU E5420 @ 2.50GHz (cpu family : 6)
>   hardware RAID
> - Intel(R) Core(TM) i3-2100 CPU @ 3.10GHz (cpu family : 6)
>   software RAID
>
> Ouch!
So, I've tried yet another thing. I made a tarball of the whole working
system and installed it on the testing bare metal (Core i3-2100).
The 'start' command after 'managedsave' still fails. Then, I tried
changing the kvm-related kernel compilation option and compiled kvm-intel
as a module: it fails again.
Did the hardware requirements change for libvirt/qemu-kvm? I can't
understand why the _exact same_ system works on one piece of hardware and
not on another (where it previously worked perfectly well).
One likely possibility is some sort of race condition where CPU speed
(and other hardware-related) differences cause one thread/process to
win the race on one of the machines, and another process/thread to win
on the other. (The test I suggested earlier was inspired by just such a
race that I previously encountered, but apparently the qemu you're
running already has the fix for that one :-( )
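
To illustrate the kind of race I mean, here is a minimal Python sketch. It is
not libvirt or qemu code; the function name race() and the sleep-based delays
are purely hypothetical stand-ins for hardware-dependent timing. The same
program gives a different "winner" depending only on how long each worker
takes to reach the shared state:

```python
import threading
import time


def race(delay_a, delay_b):
    """Return the name of the worker that reaches the shared state first.

    The delays are hypothetical stand-ins for hardware-dependent timing
    (CPU speed, disk latency, and so on).
    """
    order = []
    lock = threading.Lock()

    def worker(name, delay):
        time.sleep(delay)      # simulated hardware-dependent work
        with lock:             # the lock protects the list, not the ordering
            order.append(name)

    threads = [
        threading.Thread(target=worker, args=("A", delay_a)),
        threading.Thread(target=worker, args=("B", delay_b)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order[0]


if __name__ == "__main__":
    # Same code, different timing: the winner flips.
    print(race(0.01, 0.3))
    print(race(0.3, 0.01))
```

If code elsewhere silently assumes that "A" always gets there first, it will
work on one machine and fail on another, even with byte-identical software.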