
On 05/17/2018 05:43 PM, Eric Blake wrote:
Here's my updated counterproposal for a backup API.
/**
 * virDomainBackupBegin:
 *
 * There are two fundamental backup approaches. The first, called a
 * push model, instructs the hypervisor to copy the state of the guest
 * disk to the designated storage destination (which may be on the
 * local file system or a network device); in this mode, the
 * hypervisor writes the content of the guest disk to the destination,
 * then emits VIR_DOMAIN_EVENT_ID_BLOCK_JOB_2 when the backup is
 * either complete or failed (the backup image is invalid if the job
 * is ended prior to the event being emitted).
A better choice is VIR_DOMAIN_EVENT_ID_JOB_COMPLETED (BLOCK_JOB can only report status for one disk, while this event is intended to cover multiple disks handled in a single transaction). I'm a bit depressed at our technical debt in this area: virDomainGetJobStats() and virDomainAbortJob() don't take a job id and only operate on the most recently started job. But I did mention elsewhere in my plans:
I think that it should be possible to run multiple backup operations in parallel in the long run. But in the interest of getting a proof of concept implementation out quickly, it's easier to state that for the initial implementation, libvirt supports at most one backup operation at a time (to do another backup, you have to wait for the current one to complete, or else abort and abandon the current one). As there is only one backup job running at a time, the existing virDomainGetJobInfo()/virDomainGetJobStats() will be able to report statistics about the job (insofar as such statistics are available). But in preparation for the future, when libvirt does add parallel job support, starting a backup job will return a job id; and presumably we'd add a new virDomainGetJobStatsByID() for grabbing statistics of an arbitrary (rather than the most-recently-started) job.
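To make the one-job-at-a-time rule concrete, here is a rough sketch in plain C. It is only a model of the bookkeeping, not libvirt code: JobSlot, slot_init(), backup_begin(), backup_end() and abort_current() are names I made up for illustration, and the ids starting at 1 is an assumption.

```c
#include <stdbool.h>

/* Hypothetical model, not libvirt code: one backup job slot per domain. */
typedef struct {
    bool active;     /* is a backup job currently running? */
    int current_id;  /* id of the running job, meaningful only if active */
    int next_id;     /* id handed out to the next job that starts */
} JobSlot;

/* Start with an empty slot; ids begin at 1 (an assumption of this sketch). */
void slot_init(JobSlot *s)
{
    s->active = false;
    s->current_id = 0;
    s->next_id = 1;
}

/* Begin a backup: returns a positive job id, or -1 if a job is already running. */
int backup_begin(JobSlot *s)
{
    if (s->active)
        return -1;
    s->active = true;
    s->current_id = s->next_id++;
    return s->current_id;
}

/* End a job by id: returns 0 on success, -1 if that id is not the running job. */
int backup_end(JobSlot *s, int id)
{
    if (!s->active || s->current_id != id)
        return -1;
    s->active = false;
    return 0;
}

/* Abort without an id: acts on the only job that can exist in this model. */
int abort_current(JobSlot *s)
{
    if (!s->active)
        return -1;
    s->active = false;
    return 0;
}
```

Handing out an id from day one costs nothing while only one job can exist, and it means a by-id lookup can be added later without changing what callers already hold.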
Since live migration also acts as a job visible through virDomainGetJobStats(), I'm going to treat an active backup job and live migration as mutually exclusive. This is particularly true when we have a pull model backup ongoing: if qemu on the source is acting as an NBD server, you can't migrate away from that qemu and tell the NBD client to reconnect to the NBD server on the migration destination. So, to perform a migration, you have to cancel any pending backup operations. Conversely, if a migration job is underway, it will not be possible to start a new backup job until migration completes. However, we DO need to modify migration to ensure that any persistent bitmaps are migrated.
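The mutual exclusion between backup and migration described above could be modeled along these lines. Again, this is a sketch under my own assumptions rather than anything in libvirt: JobKind, DomainJobState, job_start() and job_stop() are hypothetical names, and the model deliberately ignores the separate question of migrating persistent bitmaps.

```c
/* Hypothetical model, not libvirt code: which kind of job owns the domain. */
typedef enum {
    JOB_NONE,
    JOB_BACKUP,
    JOB_MIGRATION
} JobKind;

typedef struct {
    JobKind running;  /* JOB_NONE when the domain has no active job */
} DomainJobState;

/* Try to start a job of the given kind.
 * Returns 0 on success, -1 if any job (of either kind) is already active
 * or if the caller asked to start JOB_NONE. */
int job_start(DomainJobState *d, JobKind kind)
{
    if (kind == JOB_NONE || d->running != JOB_NONE)
        return -1;
    d->running = kind;
    return 0;
}

/* Finish or cancel the active job, whatever its kind.
 * Returns 0 on success, -1 if nothing was running. */
int job_stop(DomainJobState *d)
{
    if (d->running == JOB_NONE)
        return -1;
    d->running = JOB_NONE;
    return 0;
}
```

With this state, a migration request made while a backup is active fails until the backup is stopped, and a backup request made during migration fails until migration completes.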
Yes, this means that virDomainBackupEnd() (which takes a job id) and virDomainAbortJob() (which does not, although that does not matter until we support parallel backup jobs or a mix of backup and migration at once) can initially both do the work of aborting a backup job.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org