atop is pretty great.

Guest machines are mostly 1 processor, 2GB RAM running some old 16-bit apps.  Guests were only using 15% of their RAM as shown by Task Manager in Windows, and very little swap.

The host had 48GB of RAM free and 48k of swap in use, so it certainly isn't swapping.  atop only showed the disk going into the red when starting all domains at the same time.
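
For anyone who wants to repeat the swap check without atop, here is a minimal sketch.  It assumes a Linux host that exposes the usual SwapTotal and SwapFree fields in /proc/meminfo; the function name is made up for illustration.

```python
def swap_used_kib(meminfo_text):
    """Return swap in use, in KiB, from /proc/meminfo-style text."""
    fields = {}
    for line in meminfo_text.splitlines():
        # Lines look like "SwapFree:  999952 kB"; keep the leading number.
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts:
            fields[name.strip()] = int(parts[0])
    # Swap in use is what is configured minus what is still free.
    return fields["SwapTotal"] - fields["SwapFree"]
```

On the host, something like swap_used_kib(open("/proc/meminfo").read()) should give a number comparable to the 48k that atop reported.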

I did have one guest configured to use the wrong CPU...  It slipped past me when moving it from the old host to the new one.  I will post a reply if that seems to have corrected all the problems.

Thanks for the help!


On Mon, Aug 25, 2014 at 10:59 PM, Martin Kletzander <mkletzan@redhat.com> wrote:
On Mon, Aug 25, 2014 at 02:56:57PM -0600, Michael Warnecke wrote:
Hello all, I'm new to the list, and hoping someone can point me in the
right direction.

I've got an Ubuntu 14 dom0 with what I think are excellent specs: 12
cores, hyperthreaded, VT, 64GB RAM, gobs of disk, etc., all to run 8
virtual machines (at the moment).

My problem is that each of the guests (Windows, Ubuntu, FreeBSD) starts out
working fine.  After several minutes, the guest becomes unresponsive for a
moment, then comes back.  As time goes on, the stalls become more frequent
and longer, until after about a day the guest is completely useless and
needs a shutdown and start.

The dom0 is NEVER unresponsive.

I've tried to track this down through every means I have available, and
have come up empty handed.  Here is some of what I've done so far, please
let me know what other information would help to debug this:

1. I've tried setting the guest CPU model both to the default and to a copy
of the host's.  Neither made any noticeable difference.
2. I've switched all disks and network to be VirtIO.  Big improvement over
IDE, but the unresponsive problem persists.
3. Linux guests have the following in dmesg:
hrtimer: interrupt took 10109276 ns
[sched_delayed] sched: RT throttling activated
4. I found nothing suspicious in the dom0's dmesg.
5. Unresponsiveness does not correlate with disk usage.
6. Host uses 4 disks in software RAID-5 (yes I know I'm bad, but there are
reasons) using BTRFS.
7. Guest disks are all raw.
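
To put the hrtimer line from item 3 in perspective, the reported duration can be pulled out of dmesg and converted to milliseconds.  This is only a sketch: the regular expression is based on the single message quoted above, and the function name is made up for illustration.

```python
import re

# Matches kernel messages of the form "hrtimer: interrupt took <N> ns".
HRTIMER_RE = re.compile(r"hrtimer: interrupt took (\d+) ns")


def hrtimer_stalls_ms(dmesg_text):
    """Return the duration of each reported hrtimer interrupt, in ms."""
    return [int(m.group(1)) / 1e6 for m in HRTIMER_RE.finditer(dmesg_text)]
```

For the message above, 10109276 ns works out to roughly 10.1 ms spent in a single timer interrupt.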

Please let me know if there is some other useful information, or if you
have an idea where I should look next.


I would suggest looking for a bottleneck on the host first, and then on
the guests as well.  I like using atop for this, for example.  virt-top
can show you how much each machine eats up.  Check the memory and
processor usage.  What are the settings (CPU, MEM, disks) for the
machines?
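
As a rough way to collect those settings in one go, here is a sketch using the libvirt Python bindings.  It assumes the libvirt-python package is installed and that the hypervisor is reachable at qemu:///system, which is a guess and not something stated in this thread; the function names are made up for illustration.

```python
def summarize_guests(guests, host_cpus, host_mem_kib):
    """Total up guest allocations and compare them with the host.

    guests is a list of (name, vcpus, max_mem_kib) tuples.
    """
    total_vcpus = sum(g[1] for g in guests)
    total_mem_kib = sum(g[2] for g in guests)
    return {
        "vcpus": total_vcpus,
        "vcpu_ratio": total_vcpus / host_cpus,
        "mem_kib": total_mem_kib,
        "mem_ratio": total_mem_kib / host_mem_kib,
    }


def collect_from_libvirt(uri="qemu:///system"):
    """Read guest and host sizes via libvirt (the URI is an assumption)."""
    import libvirt  # imported lazily so summarize_guests works without it

    conn = libvirt.openReadOnly(uri)
    try:
        guests = []
        for dom in conn.listAllDomains():
            # info() returns [state, maxMem KiB, memory KiB, nrVirtCpu, cpuTime]
            _state, max_mem_kib, _mem_kib, vcpus, _cpu_time = dom.info()
            guests.append((dom.name(), vcpus, max_mem_kib))
        # getInfo() returns [model, memory MiB, cpus, mhz, ...]
        host = conn.getInfo()
        return guests, host[2], host[1] * 1024
    finally:
        conn.close()
```

Passing the result of collect_from_libvirt() to summarize_guests() gives the vCPU and memory totals next to the host's capacity, so ratios above 1.0 would point at overcommitment.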

It might still be just a minor issue, for example everything starting
to swap, in which case you're done for, of course.

Martin

Thanks!

_______________________________________________
libvirt-users mailing list
libvirt-users@redhat.com
https://www.redhat.com/mailman/listinfo/libvirt-users