On Wed, Oct 08, 2014 at 23:45:17 +0900, Cristian Klein wrote:
On 07 Oct 2014, at 22:08 , Jiri Denemark <jdenemar(a)redhat.com> wrote:
> On Tue, Sep 30, 2014 at 16:39:26 +0200, Cristian Klein wrote:
>> Some jobs feature several phases. For example, post-copy migration is
>> composed of a first phase, from migration start to switching to
>> post-copy, and a second phase, to migration completion. This
>> job type allows to flag that the job has completed the first phase,
>> but is not yet fully completed.
>>
>> Signed-off-by: Cristian Klein <cristian.klein(a)cs.umu.se>
>> ---
>> include/libvirt/libvirt.h.in | 1 +
>> tools/virsh-domain.c | 3 ++-
>> 2 files changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in
>> index 84cd5a4..81044f0 100644
>> --- a/include/libvirt/libvirt.h.in
>> +++ b/include/libvirt/libvirt.h.in
>> @@ -4307,6 +4307,7 @@ typedef enum {
>> VIR_DOMAIN_JOB_COMPLETED = 3, /* Job has finished, but isn't cleaned up
*/
>> VIR_DOMAIN_JOB_FAILED = 4, /* Job hit error, but isn't cleaned up */
>> VIR_DOMAIN_JOB_CANCELLED = 5, /* Job was aborted, but isn't cleaned up
*/
>> + VIR_DOMAIN_JOB_PHASE1_COMPLETED = 6, /* Job completed first phase, e.g.,
post-copy activation */
>
> This is not a job type. If we need to advertise this to libvirt clients,
> we may introduce a new job statistics typed parameter but we should not
> misuse job type. And I'm not entirely convinced we need to advertise
> this. To me it seems the only interested thing is whether a domain is
> still running on the source host or it was already resumed on the
> destination host. And it's easy to get this kind of information via
> existing ways, e.g., listening to domain life cycle events or by
> checking domain's status.
My rationale for introducing this flag was to prevent a future
breakage of an internal API. I felt like none of the previous flags
where accurately representing the status of this job after post-copy
has started.
On the other hand, you changed something that is externally visible.
Apps would be pretty confused by job type changing from UNBOUNDED to
PHASE1_COMPLETED during migration.
Sure, I could use `job->status.status` in
`qemuMigrationWaitForCompletion`, but I was concerned that a future
`libvirt` addition might not take this subtly into account and might
break post-copy.
Well, we have to be careful :-) Also I'm not sure how the new
VIR_DOMAIN_JOB_PHASE1_COMPLETED would help...
If you think I should go ahead, suppress this job type and use
`job->status.status` instead, what should I use instead,
`VIR_DOMAIN_JOB_UNBOUNDED` or `VIR_DOMAIN_JOB_BOUNDED`? To be honest,
I’m not sure to have completely understood the difference between
them, except that `VIR_DOMAIN_JOB_BOUNDED` seems to be used to signal
that a job is almost complete.
virDomainJobType is a bit overloaded since it is mostly used to describe
what kind of job is running and also about its state.
- VIR_DOMAIN_JOB_NONE - no job is currently running
- VIR_DOMAIN_JOB_BOUNDED - there is a job running and we know how much
data we will need to process during the job. That is, apps may easily
indicate its progress because we report the total amount of data and
how much we already processed. Currently we don't have such jobs in
libvirt.
- VIR_DOMAIN_JOB_UNBOUNDED - there is a job running and we have no idea
how much data we will need to process. That is, while we are
processing data, something we already processed gets changed and we
need to process it again. This is a live migration, once all memory is
processed, we need to process the memory that changed in the
meantime...
- VIR_DOMAIN_JOB_COMPLETED - there was a job running and it has just
finished (very hard to be seen externally until recently when we added
support for explicitly querying statistics about a completed job)
- VIR_DOMAIN_JOB_FAILED - there was a job running and it failed
- VIR_DOMAIN_JOB_CANCELLED - there was a job running and it was
cancelled either by libvirt itself or by a user/app
Apps know VIR_DOMAIN_{UN,}BOUNDED job mean there is a job running. We
can't change that without breaking them.
Jirka