On Thu, Apr 30, 2020 at 7:23 PM Daniel P. Berrangé <berrange(a)redhat.com> wrote:
On Thu, Apr 30, 2020 at 06:28:08PM +0200, Christian Ehrhardt wrote:
> On Thu, Apr 30, 2020 at 5:10 PM Daniel P. Berrangé <berrange(a)redhat.com> wrote:
> >
> > On Thu, Apr 30, 2020 at 04:58:25PM +0200, Christian Ehrhardt wrote:
> > > On Thu, Apr 30, 2020 at 2:51 PM Daniel P. Berrangé <berrange(a)redhat.com> wrote:
> > > Well it seems I have a reproducible symptom and a fix, but not the
> > > explanation why the latter fixes the former.
> > > I'll need to dive into some debugging & analysis myself to explain it better.
> > > I'll be back here once I have had time to do that in-depth check.
> > >
> > > Until then whoever is affected (should be everyone) can give it a
> > > thought as well.
> > > Repro is as easy as
> > > One console:
> > > $ journalctl -f -u libvirt-guests
> > > Other console:
> > > $ systemctl stop libvirt-guests
> > > $ systemctl start libvirt-guests
> > >
> > > I see it with 245.4-4ubuntu3 (20.04); I'm not seeing it on
> > > 237-3ubuntu10.39 (18.04).
> > > Maybe it is a systemd bug after all?
> > > I'd be interested to hear if that is/isn't clobbering output for
> > > anyone else and what your systemd versions are?
> >
> > FWIW, it works correctly on Fedora 31 with systemd 243.
>
> Eoan with 242-7ubuntu3.7 is good as well.
> I might need to try to get some interim versions from somewhere.
I've reproduced on Fedora 33 rawhide with systemd 245 - the first
place where it lists running guests is screwed up slightly:
Apr 30 17:16:13 libvirt-fedora-rawhide systemd[1]: Stopping Suspend/Resume Running libvirt Guests...
Apr 30 17:16:13 libvirt-fedora-rawhide libvirt-guests.sh[69903]: Running guests on default URI:
Apr 30 17:16:13 libvirt-fedora-rawhide libvirt-guests.sh[69892]: Runningcore2
Apr 30 17:16:13 libvirt-fedora-rawhide libvirt-guests.sh[69934]: Suspending guests on default URI...
Apr 30 17:16:13 libvirt-fedora-rawhide libvirt-guests.sh[69892]: SSuspending core1: ...
Apr 30 17:16:14 libvirt-fedora-rawhide libvirt-guests.sh[69892]: Suspending core1: done
Apr 30 17:16:14 libvirt-fedora-rawhide libvirt-guests.sh[69892]: Suspending core2: ...
Apr 30 17:16:15 libvirt-fedora-rawhide libvirt-guests.sh[69892]: Suspending core2: done
Apr 30 17:16:15 libvirt-fedora-rawhide systemd[1]: libvirt-guests.service: Succeeded.
Apr 30 17:16:15 libvirt-fedora-rawhide systemd[1]: Stopped Suspend/Resume Running libvirt Guests.
On resume it is even worse:
Apr 30 17:19:40 libvirt-fedora-rawhide systemd[1]: Starting Suspend/Resume Running libvirt Guests...
Apr 30 17:19:40 libvirt-fedora-rawhide libvirt-guests.sh[70041]: Resuming guests on default URI...
Apr 30 17:19:40 libvirt-fedora-rawhide libvirt-guests.sh[70030]: R
Apr 30 17:19:40 libvirt-fedora-rawhide libvirt-guests.sh[70048]: R
Apr 30 17:19:41 libvirt-fedora-rawhide libvirt-guests.sh[70048]: esuming guest core1:
Apr 30 17:19:41 libvirt-fedora-rawhide libvirt-guests.sh[70079]: esum
Apr 30 17:19:41 libvirt-fedora-rawhide libvirt-guests.sh[70030]: e
Apr 30 17:19:41 libvirt-fedora-rawhide libvirt-guests.sh[70086]: e
Apr 30 17:19:42 libvirt-fedora-rawhide libvirt-guests.sh[70086]: esuming guest core2:
Apr 30 17:19:42 libvirt-fedora-rawhide libvirt-guests.sh[70119]: esum
Apr 30 17:19:42 libvirt-fedora-rawhide libvirt-guests.sh[70030]: e
Apr 30 17:19:42 libvirt-fedora-rawhide systemd[1]: Finished Suspend/Resume Running libvirt Guests.
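
For anyone who wants to check their own journal without eyeballing it, here
is a rough sketch. It assumes, based only on the excerpts above, that every
complete libvirt-guests.sh message starts with "Running guests on ",
"Suspending " or "Resuming "; anything else is printed as a suspected
fragment. The script emits other messages too, so expect false positives.
The function name find_fragments is made up for illustration.

```shell
#!/bin/sh
# Sketch only: print libvirt-guests.sh journal messages that look like
# fragments of a split line (e.g. "R", "esum", "Runningcore2").
# ASSUMPTION: complete messages start with one of the three prefixes below,
# which are taken from the log excerpts in this thread, not from the script.
find_fragments() {
    # stdin: journal lines in the default "short" output format
    grep -F 'libvirt-guests.sh[' |
        # strip timestamp, host and "libvirt-guests.sh[PID]: "
        sed 's/^.*libvirt-guests\.sh\[[0-9]*]: //' |
        # keep only messages that do not start with a known prefix
        grep -Ev '^(Running guests on |Suspending |Resuming )'
}
```

Possible usage: journalctl -b -u libvirt-guests | find_fragments
No output would suggest the messages arrived intact.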
It is possible it isn't systemd related; other packages could be
coincidentally affecting it.
Since Launchpad still keeps all former builds available for download, I
could easily test a few versions.
I up/downgraded just the following packages on an otherwise unmodified system:
- libnss-systemd
- libpam-systemd
- libsystemd0
- systemd
- systemd-container
- systemd-sysv
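
To illustrate moving the whole set in lockstep, here is a minimal sketch
that only builds and prints an apt-get command line. It assumes the wanted
version is available from a configured apt source, which is not how I did
it (I used builds downloaded from Launchpad). The function name pin_args is
made up for illustration.

```shell
#!/bin/sh
# Sketch only: pin every package of the set to the same version.
# ASSUMPTION: the version is installable from a configured apt source.
PKGS="libnss-systemd libpam-systemd libsystemd0 systemd systemd-container systemd-sysv"

pin_args() {
    # $1: version string, e.g. 243-3ubuntu1
    version=$1
    args=""
    for pkg in $PKGS; do
        args="$args $pkg=$version"
    done
    # drop the leading space added by the first loop iteration
    printf '%s\n' "${args# }"
}

# Print the command instead of running it, so it can be reviewed first.
echo "apt-get install --allow-downgrades $(pin_args 243-3ubuntu1)"
```

Keeping all six packages on one version avoids mixing builds while
switching between a good and a bad release.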
These are the versions I checked:
245.4-4ubuntu3  bad
245.2-1ubuntu1  bad
244.3-1ubuntu1  bad
244.1-0ubuntu3  bad  (bad on retry)
243-3ubuntu1    good (good on retry)
242-7ubuntu3    good
Retry means that I installed good -> bad -> good and the behavior was
the same each time.
So installing a good version never made it stay good afterwards. It was
consistently good with <244 and bad with >=244.
Since I only installed the mentioned systemd packages and no others
I'd say it is systemd.
I can't narrow this down to a single package, since they depend on each other.
I guess it might be time to file a systemd bug for this; if not to get
it fixed, then at least to understand what is going on so that we can
make a better decision.
--
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd