On Thu, Nov 21, 2019 at 18:54:12 +0000, Daniel Berrange wrote:
On Fri, Oct 18, 2019 at 06:11:15PM +0200, Peter Krempa wrote:
> From: Eric Blake <eblake(a)redhat.com>
>
> Introduce a few new public APIs related to incremental backups. This
> builds on the previous notion of a checkpoint (without an existing
> checkpoint, the new API is a full backup, differing from
> virDomainBlockCopy in the point of time chosen and in operation on
> multiple disks at once); and also allows creation of a new checkpoint
> at the same time as starting the backup (after all, an incremental
> backup is only useful if it covers the state since the previous
> backup).
>
> A backup job also affects filtering a listing of domains, as well as
> adding event reporting for signaling when a push model backup
> completes (where the hypervisor creates the backup); note that the
> pull model does not have an event (starting the backup lets a third
> party access the data, and only the third party knows when it is
> finished).
>
> Since multiple backup jobs can be run in parallel in the future (well,
> qemu doesn't support it yet, but we don't want to preclude the idea),
> virDomainBackupBegin() returns a positive job id, and the id is also
> visible in the backup XML. But until a future libvirt release adds a
> bunch of APIs related to parallel job management where job ids will
> actually matter, the documentation is also clear that job id 0 means
> the 'currently running backup job' (provided one exists), for use in
> virDomainBackupGetXMLDesc() and virDomainBackupEnd().
>
> The full list of new APIs:
> virDomainBackupBegin;
> virDomainBackupEnd;
> virDomainBackupGetXMLDesc;
>
> Signed-off-by: Eric Blake <eblake(a)redhat.com>
> Reviewed-by: Daniel P. Berrangé <berrange(a)redhat.com>
> ---
> include/libvirt/libvirt-domain.h | 26 ++++-
> src/driver-hypervisor.h | 20 ++++
> src/libvirt-domain-checkpoint.c | 7 +-
> src/libvirt-domain.c | 191 +++++++++++++++++++++++++++++++
> src/libvirt_public.syms | 8 ++
> tools/virsh-domain.c | 4 +-
> 6 files changed, 252 insertions(+), 4 deletions(-)
>
> diff --git a/include/libvirt/libvirt-domain.h b/include/libvirt/libvirt-domain.h
> index 22277b0a84..2d9f69f7d4 100644
> --- a/include/libvirt/libvirt-domain.h
> +++ b/include/libvirt/libvirt-domain.h
> @@ -3267,6 +3267,7 @@ typedef enum {
>
> +/**
> + * VIR_DOMAIN_JOB_ID:
> + *
> + * virDomainGetJobStats field: the id of the job (so far, only for jobs
> + * started by virDomainBackupBegin()), as VIR_TYPED_PARAM_INT.
> + */
> +# define VIR_DOMAIN_JOB_ID "id"
> +
> /**
> * VIR_DOMAIN_JOB_TIME_ELAPSED:
> *
> @@ -4106,7 +4115,8 @@ typedef void (*virConnectDomainEventMigrationIterationCallback)(virConnectPtr co
> * @nparams: size of the params array
> * @opaque: application specific data
> *
> - * This callback occurs when a job (such as migration) running on the domain
> + * This callback occurs when a job (such as migration or push-model
> + * virDomainBackupBegin()) running on the domain
> * is completed. The params array will contain statistics of the just completed
> * job as virDomainGetJobStats would return. The callback must not free @params
> * (the array will be freed once the callback finishes).
> @@ -4916,4 +4926,18 @@ int virDomainGetGuestInfo(virDomainPtr domain,
> int *nparams,
> unsigned int flags);
>
> +
> +int virDomainBackupBegin(virDomainPtr domain,
> + const char *backupXML,
> + const char *checkpointXML,
> + unsigned int flags);
> +
> +char *virDomainBackupGetXMLDesc(virDomainPtr domain,
> + int id,
> + unsigned int flags);
> +
> +int virDomainBackupEnd(virDomainPtr domain,
> + int id,
> + unsigned int flags);
So this is still using a plain integer job ID, which is a concern
wrt future extensibility.
My current plan is to go ahead with an API based on this old one, but
without support for any parallel jobs. Basically the same thing with the
'id' argument removed.
This actually fits in with the original documentation which was already
ACKed for the virDomainBackupBegin API which said that the backup job
uses the domain async job infrastructure. This means that the
virDomainAbortJob and virDomainGetJobStats can be used to monitor the
blockjob. It also gives us the possibility to query the state of a
finished job by passing VIR_DOMAIN_JOB_STATS_COMPLETED to
virDomainGetJobStats. We also have an async event for reporting the job
state. I'll also consider removing virDomainBackupEnd for this
implementation as virDomainAbortJob should be enough.
I spoke with oVirt devs who are very keen on getting this API finished
and their requirements currently don't require any parallel jobs. In
fact they were abusing Eric's design which only ever returned the same
job ID so that they would not have to persist it.
Given that we'll need to deal with the domain job anyways for the better
job infra outlined below I don't think adding these APIs will be too
much of a burden in the interim so that we can appease oVirt's desire
for this feature and we'll have more time to design the new job
interface properly.
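
To make the single-job semantics above concrete, here is a minimal toy model, not libvirt code. Every name in it (sketchDomain, sketchBackupBegin, sketchAbortJob) is hypothetical and exists only for illustration. It assumes one async job slot per domain, so starting a backup fails while another job is running, and aborting needs no job id because there is at most one job to abort.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-domain async job slot; not libvirt code. */
typedef enum {
    SKETCH_JOB_NONE = 0,
    SKETCH_JOB_BACKUP,
} sketchJobType;

typedef struct {
    sketchJobType active;     /* job currently running, if any */
    sketchJobType completed;  /* type of the most recently finished job */
    bool aborted;             /* whether that finished job was aborted */
} sketchDomain;

/* Start a backup; returns -1 if any async job is already running. */
int sketchBackupBegin(sketchDomain *dom)
{
    if (dom->active != SKETCH_JOB_NONE)
        return -1;
    dom->active = SKETCH_JOB_BACKUP;
    return 0;
}

/* Abort whichever job is running; no id is needed with a single slot. */
int sketchAbortJob(sketchDomain *dom)
{
    if (dom->active == SKETCH_JOB_NONE)
        return -1;
    dom->completed = dom->active;
    dom->aborted = true;
    dom->active = SKETCH_JOB_NONE;
    return 0;
}
```

The 'completed' field loosely mirrors the idea of querying a finished job's state afterwards; how the real implementation stores that is not covered here.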
Earlier in the year I queried whether we should turn the "job" into a
fully fledged object, using either a string or a UUID to identify it
uniquely.
https://www.redhat.com/archives/libvir-list/2019-March/msg01695.html
IOW having something like this:
Going forward I want this not only for the backup job but for basically
any long-running operation. We have needed this for a long time, but
neither of the two long-running job implementations we have is flexible
enough: there is only one 'domain job' (migration/save/etc.) and
blockjobs are bound to disks.
typedef struct _virDomainJob virDomainJob;
typedef virDomainJob *virDomainJobPtr;
I actually started some work on this but didn't get far yet. I used the
same name in my case, but I'm somewhat afraid that virDomainAbortJob,
which would not be related to these objects, will be mistaken for
something that actually works on them.
Unfortunately I don't have any better idea.
void virDomainJobFree(virDomainJobPtr job);
virDomainJobLookupByUUID(virDomainPtr job,
unsigned char *uuid);
int virDomainJobGetType(virDomainJobPtr job);
int virDomainJobGetUUID(virDomainJobPtr job,
unsigned char *uuid);
int virDomainJobGetUUIDString(virDomainJobPtr job,
char *uuidstr);
virDomainJobPtr virDomainBackupBegin(virDomainPtr domain,
const char *backupXML,
const char *checkpointXML,
unsigned int flags);
I was thinking about a super-universal API so that we don't have to redo
all APIs for blockjobs. Something along the lines of:
virDomainJobPtr virDomainJobBegin(virDomainPtr domain,
const char *jobxml,
unsigned int flags);
It would give us the flexibility to add new jobs and arguments for them
via XML which is more flexible (e.g. we could easily add a
virDomainBlockPull with the 'top' argument which is currently missing
but qemu started to support it some time ago). On the other hand that
seems a bit too prone to abuse.
Given the above, I'm unsure about the tradeoffs between flexibility and
too much flexibility in this case. But that's probably for a future
discussion.
char *virDomainBackupGetXMLDesc(virDomainPtr domain,
virDomainJobPtr job,
unsigned int flags);
int virDomainBackupEnd(virDomainPtr domain,
virDomainJobPtr job,
unsigned int flags);
Privately we'd define
struct _virDomainJob {
unsigned char uuid[VIR_UUID_BUFLEN];
int type;
};
So we don't do an RPC call for virDomainJobGet{Type,UUID,UUIDString},
and we'd just serialize the uuid over the wire for the virDomainBackup
APIs I presume.
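
As a rough illustration of why the getters would need no RPC round trip, the sketch below keeps the UUID and type in a client-side handle and answers all three getters from local memory. It is only a sketch under assumptions: the sketch* names are hypothetical stand-ins for the proposed virDomainJob* functions, the buffer sizes mirror libvirt's VIR_UUID_BUFLEN (16) and VIR_UUID_STRING_BUFLEN (37), and error handling is omitted.

```c
#include <stdio.h>
#include <string.h>

#define SKETCH_UUID_BUFLEN 16         /* raw UUID bytes, as VIR_UUID_BUFLEN */
#define SKETCH_UUID_STRING_BUFLEN 37  /* 36 characters plus trailing NUL */

/* Hypothetical client-side handle mirroring the proposed private struct. */
typedef struct {
    unsigned char uuid[SKETCH_UUID_BUFLEN];
    int type;
} sketchDomainJob;

/* All three getters read only the local handle, so no RPC is involved. */
int sketchDomainJobGetType(const sketchDomainJob *job)
{
    return job->type;
}

int sketchDomainJobGetUUID(const sketchDomainJob *job, unsigned char *uuid)
{
    memcpy(uuid, job->uuid, SKETCH_UUID_BUFLEN);
    return 0;
}

/* Format the raw bytes in the usual 8-4-4-4-12 hexadecimal layout. */
int sketchDomainJobGetUUIDString(const sketchDomainJob *job, char *uuidstr)
{
    const unsigned char *u = job->uuid;

    snprintf(uuidstr, SKETCH_UUID_STRING_BUFLEN,
             "%02x%02x%02x%02x-%02x%02x-%02x%02x-%02x%02x-"
             "%02x%02x%02x%02x%02x%02x",
             u[0], u[1], u[2], u[3], u[4], u[5], u[6], u[7],
             u[8], u[9], u[10], u[11], u[12], u[13], u[14], u[15]);
    return 0;
}
```

Only the raw uuid bytes would then need to cross the wire when a backup API is called with such a handle.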
Currently we have a single domain block job that can be active at any time
and it would be desirable to fold that into the new API in some way.
Yes exactly.
We can either create new v2 APIs for existing methods making them return
a virDomainJobPtr, or we could reserve a well-known UUID exclusively
to refer to the single default domain block job, which the user can grab
via virDomainJobLookupByUUID().
Yes I was thinking along the same lines. Given that we'll need to deal
with this anyways I don't feel as bad adding the backup job as a domain
job for now with all the quirks and also all the infrastructure a domain
job already provides and then dealing with the domain jobs properly
later.
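
For reference, the "well-known UUID" idea quoted above could look roughly like the following toy lookup. This is a sketch under assumptions, not a proposal: all sketch* names are hypothetical, and the all-zero reserved value is picked purely for illustration since the thread does not say which UUID would be reserved. A lookup with the reserved UUID returns the single legacy domain job when one is running; any other UUID is searched among jobs created through a newer API.

```c
#include <stddef.h>
#include <string.h>

#define SKETCH_UUID_BUFLEN 16

typedef struct {
    unsigned char uuid[SKETCH_UUID_BUFLEN];
    int type;               /* 0 means no job in this slot */
} sketchJob;

/* Assumed reserved value; the thread does not specify an actual UUID. */
static const unsigned char sketchDefaultJobUUID[SKETCH_UUID_BUFLEN] = { 0 };

typedef struct {
    sketchJob defaultJob;   /* the single legacy per-domain job */
    sketchJob jobs[4];      /* jobs started through a newer API */
    size_t njobs;
} sketchDomainJobs;

/* Return the matching job, or NULL if no such job exists. */
const sketchJob *sketchJobLookupByUUID(const sketchDomainJobs *dom,
                                       const unsigned char *uuid)
{
    size_t i;

    /* The reserved UUID always refers to the legacy job slot. */
    if (memcmp(uuid, sketchDefaultJobUUID, SKETCH_UUID_BUFLEN) == 0)
        return dom->defaultJob.type != 0 ? &dom->defaultJob : NULL;

    for (i = 0; i < dom->njobs; i++) {
        if (memcmp(uuid, dom->jobs[i].uuid, SKETCH_UUID_BUFLEN) == 0)
            return &dom->jobs[i];
    }
    return NULL;
}
```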