Re: [libvirt-users] domains paused without any obvious reason

14 May 2019

      On Mon, May 13, 2019 at 06:19:05PM +0200, Lentes, Bernd wrote:
...
----- On May 13, 2019, at 3:34 PM, Bernd Lentes bernd.lentes@helmholtz-muenchen.de wrote:
...
Hi,
I have a two-node HA cluster with several domains as resources.
Currently it's running in test mode.
Some domains (all on the same host) stopped running, virsh list shows them as
"paused".
All stopped at the same time (11th of May, 7:00 am), and my monitoring system
began to yell.
I don't have any clue why this happened.
virsh domblkerror reports "no space" for all five domains. In the days before,
the domains were running fine, and I know that all disks inside the domains
should have enough space.
Also the host is not running out of space.
The logs don't say anything useful. Unfortunately I didn't have a log for
the libvirtd daemon; I have just configured that now.
The domains are stopped each day by cron at 10:30 pm for a short moment, a
snapshot is taken, and the domains are started again. The backing file is then
copied to a CIFS server, and once that is finished the snapshot is
blockcommitted into the backing file.
That has been working fine for several days. This cronjob creates a log and
it looks fine.
The domains reside in naked Logical Volumes, the respective Volume Group has
enough space.
I resumed one of the guests and it continued without any problem.
The log doesn't indicate any problem, and df -h shows enough space on
all partitions.
'virsh domstate --reason $GUEST'

will tell you what event caused the guest to pause in the first place.
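
With several guests paused at once, it may be quicker to loop over them.
A minimal sketch, assuming a POSIX shell on the host; the function name
report_paused is only illustrative, while the virsh subcommands and the
--state-paused / --name options are standard virsh:

```shell
#!/bin/sh
# Print the pause reason and any block I/O error for every paused domain.
# report_paused is a hypothetical helper name, not part of libvirt.
report_paused() {
    # --state-paused limits the listing to paused domains,
    # --name prints bare domain names, one per line.
    virsh list --state-paused --name < /dev/null | while read -r dom; do
        [ -n "$dom" ] || continue          # skip the trailing blank line
        printf '== %s ==\n' "$dom"
        virsh domstate --reason "$dom" < /dev/null
        virsh domblkerror "$dom" < /dev/null
    done
}
```

After sourcing this, running `report_paused` as a user with access to
libvirtd should print one short block per paused guest.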

If you can resume successfully, this indicates the event was a transient
problem. Given the domblkerror message 'no space', it looks like you
temporarily ran out of disk space, and the problem then resolved itself.
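
One way to look for where the space ran out might be to compare the
capacity, allocation and physical size that libvirt reports for each disk
of a guest. This is only a sketch under assumptions: the helper name
disk_usage is made up, and it assumes the default table output of
`virsh domblklist` (two header lines, then one row per disk):

```shell
#!/bin/sh
# Show capacity, allocation and physical size for each disk of one guest.
# disk_usage is a hypothetical helper name, not part of libvirt.
disk_usage() {
    guest=$1
    # Skip the two header lines of domblklist and keep the target column.
    virsh domblklist "$guest" < /dev/null |
    awk 'NR > 2 && NF >= 2 { print $1 }' |
    while read -r target; do
        printf -- '-- %s %s --\n' "$guest" "$target"
        virsh domblkinfo "$guest" "$target" < /dev/null
    done
}
```

For example, `disk_usage $GUEST` while a snapshot overlay is active would
show whether the overlay's allocation is close to the space available to it.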

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Daniel P. Berrangé