On Mon, May 13, 2019 at 06:19:05PM +0200, Lentes, Bernd wrote:
----- On May 13, 2019, at 3:34 PM, Bernd Lentes bernd.lentes(a)helmholtz-muenchen.de wrote:
> Hi,
>
> I have a two-node HA cluster with several domains as resources.
> Currently it's running in test mode.
> Some domains (all on the same host) stopped running, virsh list shows them as
> "paused".
> All stopped at the same time (11th of May, 7:00 am), and my monitoring
> system began to yell.
> I don't have any clue why this happened.
> virsh domblkerror reports "no space" for all five domains. In the days
> before, the domains were running fine, and I know that all disks inside
> the domains should have enough space.
> Also the host is not running out of space.
> The logs don't say anything useful. Unfortunately I didn't have a log
> for the libvirtd daemon; I have only just configured that now.
> The domains are stopped each day by cron at 10:30 pm for a short moment, a
> snapshot is taken, the domains are started again, the backing file is
> copied to a CIFS server, and once that has finished the snapshot is
> blockcommitted into the backing file.
> That has been working fine for several days already. The cron job writes a
> log, and it looks fine.
> The domains reside on bare Logical Volumes, and the respective Volume Group
> has enough space.
>
>
I resumed one of the guests and it continued without any problem.
The log doesn't indicate any problem, and df -h shows enough space on
all partitions.
'virsh domstate --reason $GUEST'
will tell you what event caused the guest to pause in the first place.
If you can resume successfully, this indicates the event was a transient
problem. Given the domblkerror message 'no space', it looks like
you temporarily ran out of disk space, a problem which then
resolved itself.
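
As a rough sketch of how the checks above could be run across every
paused guest at once: the function name report_paused_guests is
hypothetical and not part of libvirt. It relies only on the virsh
subcommands 'list --state-paused --name', 'domstate --reason' and
'domblkerror', and assumes it is run on the host against the default
connection.

```shell
#!/bin/sh
# Hypothetical helper: print the pause reason and any disk errors
# for every guest that virsh currently reports as paused.
report_paused_guests() {
    # One paused domain name per line; virsh adds a trailing blank line.
    virsh list --state-paused --name | while IFS= read -r guest; do
        [ -n "$guest" ] || continue
        printf '%s: ' "$guest"
        # State plus the reason that triggered it.
        virsh domstate --reason "$guest" </dev/null
        # Per-disk error, such as "no space".
        virsh domblkerror "$guest" </dev/null
    done
}
```

Calling report_paused_guests soon after the guests pause, before
resuming them, should make it easier to tell a transient condition
from a persistent one.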
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|