-----Original Message-----
From: Thanos Makatos <thanos.makatos(a)nutanix.com>
Sent: Monday, March 4, 2024 9:45 PM
To: Thanos Makatos <thanos.makatos(a)nutanix.com>; devel(a)lists.libvirt.org
Subject: RE: join running core dump job
> -----Original Message-----
> From: Thanos Makatos <thanos.makatos(a)nutanix.com>
> Sent: Monday, March 4, 2024 5:24 PM
> To: devel(a)lists.libvirt.org
> Subject: join running core dump job
>
> Is there a way to programmatically wait for a previously initiated
> virDomainCoreDumpWithFormat() where the process that started it died? I'm
> looking at the API and don't seem to find anything relevant. I suppose I
> could poll via virDomainGetJobStats(), but, ideally, I'd like a function
> that would join the dump job and return when the dump job finishes.
I see there's qemuDumpWaitForCompletion(), which looks promising.
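
For reference, the polling fallback from my original message could look roughly
like the sketch below. It is only an illustration: job_state, job_query_fn,
wait_for_job() and fake_query() are hypothetical names, not libvirt API. A real
query callback would wrap virDomainGetJobStats() and map the job type it
reports onto these states.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical job states; a real caller would derive them from the
 * job type reported by virDomainGetJobStats(). */
typedef enum {
    JOB_NONE,     /* no job is active on the domain */
    JOB_RUNNING,  /* a job is still in progress */
    JOB_ERROR     /* the query itself failed */
} job_state;

/* Hypothetical callback that reports the current job state. */
typedef job_state (*job_query_fn)(void *opaque);

/* Poll until no job is active.
 * Returns 0 once no job is reported, -1 if the query fails,
 * and 1 if max_polls queries were made and a job was still running. */
int
wait_for_job(job_query_fn query, void *opaque,
             unsigned int interval_ms, unsigned int max_polls)
{
    struct timespec delay = {
        .tv_sec = interval_ms / 1000,
        .tv_nsec = (long)(interval_ms % 1000) * 1000000L,
    };

    assert(query != NULL);

    for (unsigned int i = 0; i < max_polls; i++) {
        switch (query(opaque)) {
        case JOB_NONE:
            return 0;
        case JOB_ERROR:
            return -1;
        case JOB_RUNNING:
            break;
        }
        nanosleep(&delay, NULL);
    }
    return 1;
}

/* Stand-in query for experiments: reports JOB_RUNNING until the
 * counter behind `opaque` reaches zero; a negative counter
 * simulates a failing query. */
job_state
fake_query(void *opaque)
{
    int *remaining = opaque;

    if (*remaining < 0)
        return JOB_ERROR;
    if (*remaining == 0)
        return JOB_NONE;
    (*remaining)--;
    return JOB_RUNNING;
}
```

One drawback of such a loop is that a caller arriving after the job has ended
only sees that no job is active, so from the loop alone it cannot tell whether
the dump succeeded. That is part of why I'd prefer a call that joins the job.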
I've made some progress: I added a virHypervisorDriver.domainCoreDumpWait callback
and the relevant scaffolding to make 'virsh dump --wait' work. In the QEMU driver,
calling qemuDumpWaitForCompletion() is all that's needed.
However, it doesn't seem trivial to implement this in the test_driver.
First, IIUC testDomainCoreDumpWithFormat() takes an exclusive lock on the domain
(I haven't tested anything yet), so a concurrent domainCoreDumpWait() would block on
that lock rather than on the dump job itself. Is making
testDomainCoreDumpWithFormat() use asynchronous jobs internally unavoidable?
Second, I want to test the behaviour of my application where (1) it calls
domainCoreDump(), (2) it crashes before domainCoreDump() finishes, (3) it starts again
and looks for a pending dump job, and (4) it joins that job using domainCoreDumpWait().
I can't see an easy way of faking a dump job in the test_driver when it starts. How
about adding persistent tasks, which I can pre-populate before starting my application,
or faking jobs via an environment variable, so that when the test_driver starts it can
continue them internally? E.g. we could specify how long the job should run for, and
domainCoreDumpWait() would sleep for that long.
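
To make the environment variable idea a bit more concrete, here is a rough
sketch of the parsing side. Everything in it is an assumption for discussion:
the variable name TEST_DRIVER_FAKE_DUMP_SECONDS and the helpers
parse_fake_dump_seconds(), fake_dump_seconds_from_env() and
fake_dump_remaining() are hypothetical, and nothing in libvirt reads such a
variable today.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical variable name, used only for this sketch. */
#define FAKE_DUMP_ENV "TEST_DRIVER_FAKE_DUMP_SECONDS"

/* Parse a fake job duration in whole seconds.
 * Returns 0 and stores the value in *seconds on success;
 * returns -1 if the value is unset, empty, or not a plain
 * decimal number. */
int
parse_fake_dump_seconds(const char *value, unsigned long *seconds)
{
    char *end = NULL;
    unsigned long parsed;

    assert(seconds != NULL);

    /* Require a leading digit so that signs and whitespace,
     * which strtoul() would otherwise accept, are rejected. */
    if (value == NULL || !isdigit((unsigned char)*value))
        return -1;

    errno = 0;
    parsed = strtoul(value, &end, 10);
    if (errno != 0 || *end != '\0')
        return -1;

    *seconds = parsed;
    return 0;
}

/* Read the fake job duration from the environment. */
int
fake_dump_seconds_from_env(unsigned long *seconds)
{
    return parse_fake_dump_seconds(getenv(FAKE_DUMP_ENV), seconds);
}

/* Seconds a wait call would still have to sleep, given the job
 * duration and the time elapsed since the fake job started. */
unsigned long
fake_dump_remaining(unsigned long duration, unsigned long elapsed)
{
    return elapsed >= duration ? 0 : duration - elapsed;
}
```

With something like this, the test_driver could register a fake job at startup
when the variable is set, and domainCoreDumpWait() would sleep for
fake_dump_remaining() seconds before returning.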
I'm open to suggestions.