On a Tuesday in 2020, Douglas, William wrote:
On Tue, Sep 22, 2020 at 1:11 AM Daniel P. Berrangé
<berrange(a)redhat.com> wrote:
>
> On Mon, Sep 21, 2020 at 12:05:48PM -0700, Douglas, William wrote:
> > On Tue, Sep 15, 2020 at 5:53 AM Daniel P. Berrangé <berrange(a)redhat.com>
wrote:
> > >
> > > On Thu, Aug 27, 2020 at 11:24:32AM -0700, William Douglas wrote:
> >
> > <snip>
> >
> > > > +virCHMonitorPtr
> > > > +virCHMonitorNew(virDomainObjPtr vm, const char *socketdir)
> > > > +{
> > > > + virCHMonitorPtr ret = NULL;
> > > > + virCHMonitorPtr mon = NULL;
> > > > + virCommandPtr cmd = NULL;
> > > > + int pings = 0;
> > > > +
> > > > + if (virCHMonitorInitialize() < 0)
> > > > + return NULL;
> > > > +
> > > > + if (!(mon = virObjectLockableNew(virCHMonitorClass)))
> > > > + return NULL;
> > > > +
> > > > + mon->socketpath = g_strdup_printf("%s/%s-socket",
socketdir, vm->def->name);
> > > > +
> > > > + /* prepare to launch Cloud-Hypervisor socket */
> > > > + if (!(cmd = chMonitorBuildSocketCmd(vm, mon->socketpath)))
> > > > + goto cleanup;
> > > > +
> > > > + if (virFileMakePath(socketdir) < 0) {
> > > > + virReportSystemError(errno,
> > > > + _("Cannot create socket directory
'%s'"),
> > > > + socketdir);
> > > > + goto cleanup;
> > > > + }
> > > > +
> > > > + /* launch Cloud-Hypervisor socket */
> > > > + if (virCommandRunAsync(cmd, &mon->pid) < 0)
> > > > + goto cleanup;
> > > > +
> > > > + /* get a curl handle */
> > > > + mon->handle = curl_easy_init();
> > > > +
> > > > + /* try to ping VMM socket 5 times to make sure it is ready */
> > > > + while (pings < 5) {
> > > > + if (virCHMonitorPingVMM(mon) == 0)
> > > > + break;
> > > > + if (pings == 5)
> > > > + goto cleanup;
> > > > +
> > > > + g_usleep(100 * 1000);
> > > > + }
> > >
> > > This is highly undesirable. Is there no way to launch the CH process
> > > such that the socket is guaranteed to be accepting requests by the
> > > time it has forked into the background ? Or is there a way to pass
> > > in a pre-opened FD for the monitor socket ?
> > >
> > > This kind of wait with timeout will cause startup failures due to
> > > timeout under load.
> >
> > Currently the cloud-hypervisor process doesn't fork into the
> > background so that was initially what I was thinking to add to
> > cloud-hypervsior. Would tracking the pid of the cloud-hypervsior
> > running in the background be required at that point (assuming the
> > socket path to control the daemon would be known and working given
> > virCommandRun returns successfully)?
>
> It doesn't especially matter whether cloud-hypervsior forks into
> the background itself, or whether libvirt forks it into the
> background when launching it.
>
> The important thing is simply to ensure that whichever startup
> approach is used can be implemented in a race-free manner without
> needing busy-waits or timeouts.
>
> If cloud-hypervisor forks into the background, then it would
> need to write a pidfile so libvirt can determine the pid that
> ended up running. The act of libvirt waiting for the pidfile
> to be written is potentially racy though, so that might not be
> be best.
>
> Based on what we learnt from QEMU, I think being able to pass
> in a pre-created UNIX listener socket is the best. That lets
> libvirt immedately issue a connect() start after forking the
> cloud-hypervisor process. If cloud-hypervisor succesfully
> starts up, then the client socket becomes live and can be used.
> if it fails to startup, then the client socket libvirt has
> will get EOF and thus we'll see the startup failure. This
> avoids any looping/sleeping/timeout.
I think I'm misreading/missing something from the qemu setup but
looking at qemu_monitor.c's qemuMonitorOpenUnix function I see a retry
connect loop based on a timeout when trying to connect to a socket
That loop is guarded by: if (retry)
The important code path:
qemuProcessLaunch
`_ qemuProcessWaitForMonitor
`_ qemuConnectMonitor
qemuProcessWaitForMonitor sets retry=false if we're dealing with
a recent enough QEMU that supports FD passing for its chardevs
The logic that creates the socket on libvirt's side lives (sadly)
in qemuBuildChrChardevStr.
Jano
(with a comment that leads me to believe it is possible to have the
monitor socket file not yet exist). I was expecting to see something
like forking qemu and passing the fd for the socket that was opened
that qemu should use (and that libvirt connects to) but I think I only
see that for the network socket fds that qemu is supposed to use. If
you could point me in the right direction here I'd appreciate it.