[libvirt] [PATCH v5 0/5] Expose FSFreeze/FSThaw within the guest as API

Hello,

This is patchset v5 to add an FSFreeze/FSThaw API for custom disk snapshotting.

Changes from v4:
 * add disks and ndisks parameters to specify the disks to be frozen/thawed
 * make fsfreeze requests nestable
 * change the API version to 1.2.4

(v4: https://www.redhat.com/archives/libvir-list/2014-March/msg01674.html )

=== Description ===

Currently FSFreeze and FSThaw are supported by the qemu guest agent and are used internally by the snapshot-create command with the --quiesce option. However, when users want to utilize the native snapshot feature of storage devices (such as LVM over iSCSI, enterprise storage appliances, etc.), they need to issue the fsfreeze command separately from libvirt-driven snapshots. (OpenStack cinder provides snapshots on these kinds of storage, but it cannot quiesce the guest filesystems automatically for now.) Although the virDomainQemuAgentCommand() API could be used for this purpose, it is only for debugging and is not supported officially.

This patchset adds virDomainFSFreeze()/virDomainFSThaw() APIs and virsh domfsfreeze/domfsthaw commands to enable users to freeze and thaw a domain's filesystems cleanly.

<updated>
The APIs take disks and ndisks parameters, which specify a list of disk names to be frozen/thawed. If the list is not provided, every mounted filesystem is frozen/thawed.

Fsfreeze requests are nestable: when fsfreeze requests are issued multiple times for a disk, it is not thawed until fsthaw has been requested as many times as the freeze requests.

Currently, the qemu driver doesn't support the disks parameter, because the guest agent has no means to specify the disks to be frozen/thawed. Hence, so far it just counts the fsfreeze depth per domain, not per disk.
</updated>

The APIs also take a flags option, currently unused, for future extension.

---

Tomoki Sekiyama (5):
  Introduce virDomainFSFreeze() and virDomainFSThaw() public API
  remote: Implement virDomainFSFreeze and virDomainFSThaw
  qemu: Track domain quiesced status and make fsfreeze/thaw nestable
  qemu: Implement virDomainFSFreeze and virDomainFSThaw
  virsh: Expose new virDomainFSFreeze and virDomainFSThaw API

 include/libvirt/libvirt.h.in |  10 +++
 src/access/viraccessperm.c   |   2 -
 src/access/viraccessperm.h   |   6 ++
 src/driver.h                 |  14 ++++
 src/libvirt.c                |  92 ++++++++++++++++++++++++
 src/libvirt_public.syms      |   6 ++
 src/qemu/qemu_domain.c       |   6 ++
 src/qemu/qemu_domain.h       |   2 +
 src/qemu/qemu_driver.c       | 159 ++++++++++++++++++++++++++++++++++++++----
 src/remote/remote_driver.c   |   2 +
 src/remote/remote_protocol.x |  30 +++++++-
 src/remote_protocol-structs  |  18 +++++
 src/rpc/gendispatch.pl       |   2 +
 tools/virsh-domain.c         | 128 ++++++++++++++++++++++++++++++++++
 tools/virsh.pod              |  23 ++++++

 15 files changed, 483 insertions(+), 17 deletions(-)
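To make the intended workflow concrete, here is a minimal client sketch (not part of the series) of how a management application might use the proposed API around a storage-side snapshot; the connection URI, the domain name and the snapshot step are placeholders:

  #include <libvirt/libvirt.h>

  int main(void)
  {
      virConnectPtr conn = virConnectOpen("qemu:///system");   /* placeholder URI */
      virDomainPtr dom = conn ? virDomainLookupByName(conn, "guest1") : NULL;
      int ret = 1;

      if (!dom)
          goto cleanup;

      /* NULL/0 means "all mounted filesystems", as described above. */
      if (virDomainFSFreeze(dom, NULL, 0, 0) < 0)
          goto cleanup;

      /* ... take the storage-side snapshot here (LVM, iSCSI appliance, ...) ... */

      if (virDomainFSThaw(dom, NULL, 0, 0) < 0)
          goto cleanup;

      ret = 0;

   cleanup:
      if (dom)
          virDomainFree(dom);
      if (conn)
          virConnectClose(conn);
      return ret;
  }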

These will freeze and thaw filesystems within guest. The APIs take @disks and @ndisks parameters to specify disks to be frozen or thawed. The parameters can be NULL and 0, then the all mounted filesystes are frozen or thawed. If some disks are frozen multiple times, they are not thawed until requested to be thawed as many times as freeze request. @flags parameter, which are currently not used, is for future extensions. Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com> --- include/libvirt/libvirt.h.in | 10 +++++ src/driver.h | 14 ++++++ src/libvirt.c | 92 ++++++++++++++++++++++++++++++++++++++++++ src/libvirt_public.syms | 6 +++ 4 files changed, 122 insertions(+) diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 930b7e8..d408f19 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -5277,6 +5277,16 @@ int virDomainFSTrim(virDomainPtr dom, unsigned long long minimum, unsigned int flags); +int virDomainFSFreeze(virDomainPtr dom, + const char **disks, + unsigned int ndisks, + unsigned int flags); + +int virDomainFSThaw(virDomainPtr dom, + const char **disks, + unsigned int ndisks, + unsigned int flags); + /** * virSchedParameterType: * diff --git a/src/driver.h b/src/driver.h index e66fc7a..5c81d96 100644 --- a/src/driver.h +++ b/src/driver.h @@ -1149,6 +1149,18 @@ typedef int unsigned int flags, int cancelled); +typedef int +(*virDrvDomainFSFreeze)(virDomainPtr dom, + const char **disks, + unsigned int ndisks, + unsigned int flags); + +typedef int +(*virDrvDomainFSThaw)(virDomainPtr dom, + const char **disks, + unsigned int ndisks, + unsigned int flags); + typedef struct _virDriver virDriver; typedef virDriver *virDriverPtr; @@ -1363,6 +1375,8 @@ struct _virDriver { virDrvDomainMigrateFinish3Params domainMigrateFinish3Params; virDrvDomainMigrateConfirm3Params domainMigrateConfirm3Params; virDrvConnectGetCPUModelNames connectGetCPUModelNames; + virDrvDomainFSFreeze domainFSFreeze; + virDrvDomainFSThaw domainFSThaw; }; diff --git a/src/libvirt.c b/src/libvirt.c index 4454829..43614b5 100644 --- a/src/libvirt.c +++ b/src/libvirt.c @@ -20658,3 +20658,95 @@ virDomainFSTrim(virDomainPtr dom, virDispatchError(dom->conn); return -1; } + +/** + * virDomainFSFreeze: + * @dom: a domain object + * @disks: list of disk names to be frozen + * @ndisks: the number of disks specified in @disks + * @flags: extra flags, not used yet, so callers should always pass 0 + * + * Freeze filesystems on the specified disks within the guest (hence guest + * agent may be required depending on hypervisor used). If @ndisks is 0, + * every mounted filesystem on the guest is frozen. + * Freeze can be nested. When it is called on the same disk multiple times, + * the disk will not be thawed until virDomainFSThaw is called the same times. + * + * Returns 0 on success, -1 otherwise. 
+ */ +int +virDomainFSFreeze(virDomainPtr dom, + const char **disks, + unsigned int ndisks, + unsigned int flags) +{ + VIR_DOMAIN_DEBUG(dom, "disks=%p, ndisks=%d, flags=%x", + disks, ndisks, flags); + + virResetLastError(); + + virCheckDomainReturn(dom, -1); + virCheckReadOnlyGoto(dom->conn->flags, error); + if (ndisks) + virCheckNonNullArgGoto(disks, error); + else + virCheckNullArgGoto(disks, error); + + if (dom->conn->driver->domainFSFreeze) { + int ret = dom->conn->driver->domainFSFreeze(dom, disks, ndisks, flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + + error: + virDispatchError(dom->conn); + return -1; +} + +/** + * virDomainFSThaw: + * @dom: a domain object + * @disks: list of disk names to be thawed + * @ndisks: the number of disks specified in @disks + * @flags: extra flags, not used yet, so callers should always pass 0 + * + * Thaw filesystems on the specified disks within the guest (hence guest + * agent may be required depending on hypervisor used). If @ndisks is 0, + * every mounted filesystem on the guest is thawed. + * + * Returns 0 on success, -1 otherwise. + */ +int +virDomainFSThaw(virDomainPtr dom, + const char **disks, + unsigned int ndisks, + unsigned int flags) +{ + VIR_DOMAIN_DEBUG(dom, "disks=%p, ndisks=%d, flags=%x", + disks, ndisks, flags); + + virResetLastError(); + + virCheckDomainReturn(dom, -1); + virCheckReadOnlyGoto(dom->conn->flags, error); + if (ndisks) + virCheckNonNullArgGoto(disks, error); + else + virCheckNullArgGoto(disks, error); + + if (dom->conn->driver->domainFSThaw) { + int ret = dom->conn->driver->domainFSThaw(dom, disks, ndisks, flags); + if (ret < 0) + goto error; + return ret; + } + + virReportUnsupportedError(); + + error: + virDispatchError(dom->conn); + return -1; +} diff --git a/src/libvirt_public.syms b/src/libvirt_public.syms index 9ab0c92..79d3edc 100644 --- a/src/libvirt_public.syms +++ b/src/libvirt_public.syms @@ -650,5 +650,11 @@ LIBVIRT_1.2.3 { virDomainCoreDumpWithFormat; } LIBVIRT_1.2.1; +LIBVIRT_1.2.4 { + global: + virDomainFSFreeze; + virDomainFSThaw; +} LIBVIRT_1.2.3; + # .... define new API here using predicted next version number ....

On Thu, Apr 03, 2014 at 11:39:29AM -0400, Tomoki Sekiyama wrote:
These will freeze and thaw filesystems within the guest. The APIs take @disks and @ndisks parameters to specify the disks to be frozen or thawed. The parameters can be NULL and 0, in which case all mounted filesystems are frozen or thawed. If some disks are frozen multiple times, they are not thawed until they have been requested to be thawed as many times as they were frozen. The @flags parameter, currently unused, is for future extensions.
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com> --- include/libvirt/libvirt.h.in | 10 +++++ src/driver.h | 14 ++++++ src/libvirt.c | 92 ++++++++++++++++++++++++++++++++++++++++++ src/libvirt_public.syms | 6 +++ 4 files changed, 122 insertions(+)
diff --git a/include/libvirt/libvirt.h.in b/include/libvirt/libvirt.h.in index 930b7e8..d408f19 100644 --- a/include/libvirt/libvirt.h.in +++ b/include/libvirt/libvirt.h.in @@ -5277,6 +5277,16 @@ int virDomainFSTrim(virDomainPtr dom, unsigned long long minimum, unsigned int flags);
+int virDomainFSFreeze(virDomainPtr dom, + const char **disks, + unsigned int ndisks, + unsigned int flags); + +int virDomainFSThaw(virDomainPtr dom, + const char **disks, + unsigned int ndisks, + unsigned int flags);
Are all guest OSes required to support unfreeze on a per-disk basis? I vaguely recall someone mentioning that some OSes can only do all-or-none.

Regards,
Daniel
-- 
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 04/04/2014 02:51 AM, Daniel P. Berrange wrote:
On Thu, Apr 03, 2014 at 11:39:29AM -0400, Tomoki Sekiyama wrote:
These will freeze and thaw filesystems within the guest. The APIs take @disks and @ndisks parameters to specify the disks to be frozen or thawed. The parameters can be NULL and 0, in which case all mounted filesystems are frozen or thawed. If some disks are frozen multiple times, they are not thawed until they have been requested to be thawed as many times as they were frozen. The @flags parameter, currently unused, is for future extensions.
+int virDomainFSThaw(virDomainPtr dom, + const char **disks, + unsigned int ndisks, + unsigned int flags);
Are all guest OSes required to support unfreeze on a per-disk basis? I vaguely recall someone mentioning that some OSes can only do all-or-none.
If that's the case, the command can issue an error if disks is non-NULL but the guest agent requires an all-or-none operation.

-- 
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library http://libvirt.org

On 4/4/14 8:15 , "Eric Blake" <eblake@redhat.com> wrote:
On 04/04/2014 02:51 AM, Daniel P. Berrange wrote:
On Thu, Apr 03, 2014 at 11:39:29AM -0400, Tomoki Sekiyama wrote:
These will freeze and thaw filesystems within the guest. The APIs take @disks and @ndisks parameters to specify the disks to be frozen or thawed. The parameters can be NULL and 0, in which case all mounted filesystems are frozen or thawed. If some disks are frozen multiple times, they are not thawed until they have been requested to be thawed as many times as they were frozen. The @flags parameter, currently unused, is for future extensions.
+int virDomainFSThaw(virDomainPtr dom, + const char **disks, + unsigned int ndisks, + unsigned int flags);
Are all guest OSes required to support unfreeze on a per-disk basis? I vaguely recall someone mentioning that some OSes can only do all-or-none.
If that's the case, the command can issue an error if disks is non-NULL but the guest agent requires an all-or-none operation.
Right. On such an OS, @disks should be NULL, or the same list as was passed to FSFreeze; otherwise it will cause an error.

Thanks,
-- 
Tomoki Sekiyama

On Thu, Apr 03, 2014 at 11:39:29AM -0400, Tomoki Sekiyama wrote:
+ +/** + * virDomainFSFreeze: + * @dom: a domain object + * @disks: list of disk names to be frozen + * @ndisks: the number of disks specified in @disks
I realize this current patch series doesn't use the @disks parameter at all, but what exactly are we expecting to be passed here? Is this a list of filesystem paths from the guest OS POV, or is it a list of disk targets from libvirt's POV? I'm guessing the former, since this is expected to be passed to the guest agent.
+ * @flags: extra flags, not used yet, so callers should always pass 0 + * + * Freeze filesystems on the specified disks within the guest (hence guest + * agent may be required depending on hypervisor used). If @ndisks is 0, + * every mounted filesystem on the guest is frozen. + * Freeze can be nested. When it is called on the same disk multiple times, + * the disk will not be thawed until virDomainFSThaw is called the same times.
I'm not entirely convinced we should allow nesting. I think it would be better to report an error if the user tries to freeze a disk that is already frozen, or tries to thaw a disk that is not frozen. Nesting makes it harder for applications to know just what state the guest is in.

E.g. consider that they need to run a command in the guest agent that requires the guest FS to be un-frozen. If we allow nesting, then after the app runs virDomainFSThaw, the app can't tell whether or not all the paths are thawed.

IMHO we should forbid nesting.

Regards,
Daniel

Hi Daniel, thanks for your comment.

On 4/22/14 11:39, "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Thu, Apr 03, 2014 at 11:39:29AM -0400, Tomoki Sekiyama wrote:
+ +/** + * virDomainFSFreeze: + * @dom: a domain object + * @disks: list of disk names to be frozen + * @ndisks: the number of disks specified in @disks
I realize this current patch series doesn't use the @disks parameter at all, but what exactly are we expecting to be passed here ? Is this a list of filesystem paths from the guest OS pov, or is it a list of disks targets from libvirt's POV ? I'm guessing the former since this is expected to be passed to the guest agent.
My intention for 'disks' is the latter: a list of disk targets from libvirt's POV. Currently it is just a placeholder in the API. It would be passed to the agent after being converted into a list of device addresses (e.g. a pair of drive address and controller PCI address), so that the agent can look up the filesystems on the specified disks.
+ * @flags: extra flags, not used yet, so callers should always pass 0 + * + * Freeze filesystems on the specified disks within the guest (hence guest + * agent may be required depending on hypervisor used). If @ndisks is 0, + * every mounted filesystem on the guest is frozen. + * Freeze can be nested. When it is called on the same disk multiple times, + * the disk will not be thawed until virDomainFSThaw is called the same times.
I'm not entirely convinced we should allow nesting. I think it would be better to report an error if the user tries to freeze a disk that is already frozen, or tries to thaw a disks that is not frozen. Nesting makes it harder for applications to know just what state the guest is in.
eg, consider that they need to run a command in the guest agent that requires the guest FS to be un-thawed. If we allow nesting, then after the app runs virDomainFSThaw, the app can't tell whether or not all thes path are un-thawed or not.
IMHO we should forbid nesting.
Nesting was originally suggested by Eric for scalability reasons. But if it is not actually wanted by apps, I think we can remove nesting from the API design. (I personally don't know of apps that run parallel FSFreeze operations for multiple disks.)
From the implementation viewpoint, nesting may be difficult to support in the qemu guest agent, because some guest operating systems, such as Windows, cannot start a new FSFreeze while some of the filesystems are frozen.
Regards, Tomoki Sekiyama

On Tue, Apr 22, 2014 at 06:22:18PM +0000, Tomoki Sekiyama wrote:
Hi Daniel, thanks for your comment.
On 4/22/14 11:39 , "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Thu, Apr 03, 2014 at 11:39:29AM -0400, Tomoki Sekiyama wrote:
+ +/** + * virDomainFSFreeze: + * @dom: a domain object + * @disks: list of disk names to be frozen + * @ndisks: the number of disks specified in @disks
I realize this current patch series doesn't use the @disks parameter at all, but what exactly are we expecting to be passed here ? Is this a list of filesystem paths from the guest OS pov, or is it a list of disks targets from libvirt's POV ? I'm guessing the former since this is expected to be passed to the guest agent.
My intention for 'disks' is latter, a list of disks targets from libvirt's POV. Currently it is just a placeholder of API. It would be passed to the agent after it is converted into a list of device addresses (e.g. a pair of drive address and controller's PCI address) so that the agent can look up filesystems on the specified disks.
Hmm, I wonder how practical such a conversion will be.

E.g. libvirt has a block device "sda", but the guest OS may have added a partition table (sda1, sda2, sda3, etc.), put some of those partitions into LVM (volgroup00), and then created logical volumes (vol1, vol2, etc.). The guest agent can freeze individual filesystems on each logical volume, so if the API just takes libvirt block device names, we can't express any way to freeze the filesystems the guest has.

So I think we have no choice but to actually have the API take a list of guest "volumes" (e.g. mount points in Linux, or drive letters in Windows). Ideally the guest agent would also provide a way to list all currently known guest "volumes", so we could expose that in the API too later.
+ * @flags: extra flags, not used yet, so callers should always pass 0 + * + * Freeze filesystems on the specified disks within the guest (hence guest + * agent may be required depending on hypervisor used). If @ndisks is 0, + * every mounted filesystem on the guest is frozen. + * Freeze can be nested. When it is called on the same disk multiple times, + * the disk will not be thawed until virDomainFSThaw is called the same times.
I'm not entirely convinced we should allow nesting. I think it would be better to report an error if the user tries to freeze a disk that is already frozen, or tries to thaw a disks that is not frozen. Nesting makes it harder for applications to know just what state the guest is in.
eg, consider that they need to run a command in the guest agent that requires the guest FS to be un-thawed. If we allow nesting, then after the app runs virDomainFSThaw, the app can't tell whether or not all thes path are un-thawed or not.
IMHO we should forbid nesting.
Nesting was originally advised by Eric for the reason of scalability. But if it is not actually wanted by apps, I think we can remove nesting from API design. (I personally don't know apps that run parallel FSFreeze operations for multiple disks.)
From the implementation view point, nesting may be difficult to supported in qemu guest agent, because some guest operating systems such as Windows cannot start new FSFreeze while some of filesystems are frozen.
Yep, I hadn't even considered that impl complexity. I agree we should just drop the notion of nesting and say that we'll return VIR_ERR_OPERATION_INVALID if someone attempts to freeze an already frozen FS.
Regards, Tomoki Sekiyama
Regards, Daniel

Hi Daniel,

On 4/23/14 5:55, "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Tue, Apr 22, 2014 at 06:22:18PM +0000, Tomoki Sekiyama wrote:
Hi Daniel, thanks for your comment.
On 4/22/14 11:39 , "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Thu, Apr 03, 2014 at 11:39:29AM -0400, Tomoki Sekiyama wrote:
+ +/** + * virDomainFSFreeze: + * @dom: a domain object + * @disks: list of disk names to be frozen + * @ndisks: the number of disks specified in @disks
I realize this current patch series doesn't use the @disks parameter at all, but what exactly are we expecting to be passed here ? Is this a list of filesystem paths from the guest OS pov, or is it a list of disks targets from libvirt's POV ? I'm guessing the former since this is expected to be passed to the guest agent.
My intention for 'disks' is latter, a list of disks targets from libvirt's POV. Currently it is just a placeholder of API. It would be passed to the agent after it is converted into a list of device addresses (e.g. a pair of drive address and controller's PCI address) so that the agent can look up filesystems on the specified disks.
Hmm, I wonder how practical such a conversion will be.
eg libvirt has a block device "sda", but the guest OS may have added a partition table (sda1, sda2, sda3, etc) and then put some of those partitions into LVM (volgroup00) and then created logical volume (vol1, vol2, etc). The guest agent can freeze individual filesystems on each logical volume, so if the API is just taking libvirt block device names, we can't express any way to freeze the filesystems the guest has.
Specifying libvirt disk alias names comes from applications' requirements. For example, the OpenStack cinder driver only knows libvirt device names. It would also be nice if virDomainSnapshotCreateXML could specify the 'disks' to be frozen when only a subset of the disks is specified in the snapshot XML.

I'm now prototyping qemu-guest-agent code to resolve filesystems from disk addresses, and it is working with virtio/SATA/IDE/SCSI drives on x86 Linux guests. It can also handle LVM logical volumes that lie on multiple partitions across multiple disks:

https://github.com/tsekiyama/qemu/commit/6d26115e769a7fe6aba7be52d2180453aca5fee5

This gathers disk device information from sysfs in the guest. On Windows, I hope the Virtual Disk Service can provide this kind of information too.
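As an aside, this is not the prototype linked above, but a minimal sketch (assuming a Linux guest and a placeholder PCI address) of the kind of sysfs lookup an agent can do to find the block devices behind a given controller:

  #include <stdio.h>
  #include <string.h>
  #include <limits.h>
  #include <stdlib.h>
  #include <dirent.h>

  /* Print every /sys/block entry whose backing device path goes through
   * the given PCI address (e.g. "0000:00:07.0"). */
  static void list_disks_on_pci_addr(const char *pci_addr)
  {
      DIR *dir = opendir("/sys/block");
      struct dirent *ent;

      if (!dir)
          return;

      while ((ent = readdir(dir))) {
          char link[PATH_MAX], *real;

          if (ent->d_name[0] == '.')
              continue;
          snprintf(link, sizeof(link), "/sys/block/%s/device", ent->d_name);
          if (!(real = realpath(link, NULL)))
              continue;                  /* no backing device (loop, ram, ...) */
          if (strstr(real, pci_addr))
              printf("%s is behind PCI device %s\n", ent->d_name, pci_addr);
          free(real);
      }
      closedir(dir);
  }

  int main(void)
  {
      list_disks_on_pci_addr("0000:00:07.0");   /* placeholder address */
      return 0;
  }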
So I think we have no choice but to actually have the API take a list of guest "volumes" (eg mount points in Linux, or drive letters in Windows).
Ideally the guest agent would also provide a way to list all currently known guest "volumes" so we could expose that in the API too later.
Possibly. If the volume information from the API contains the underlying hardware addresses, libvirt clients might be able to map the volumes to libvirt disks from the domain XML. In that case, though, specifying a subset of volumes in virDomainSnapshotCreateXML would be difficult; maybe libvirt should provide the mapping function.

Which way do you prefer?
+ * @flags: extra flags, not used yet, so callers should always pass 0 + * + * Freeze filesystems on the specified disks within the guest (hence guest + * agent may be required depending on hypervisor used). If @ndisks is 0, + * every mounted filesystem on the guest is frozen. + * Freeze can be nested. When it is called on the same disk multiple times, + * the disk will not be thawed until virDomainFSThaw is called the same times.
I'm not entirely convinced we should allow nesting. I think it would be better to report an error if the user tries to freeze a disk that is already frozen, or tries to thaw a disks that is not frozen. Nesting makes it harder for applications to know just what state the guest is in.
eg, consider that they need to run a command in the guest agent that requires the guest FS to be un-thawed. If we allow nesting, then after the app runs virDomainFSThaw, the app can't tell whether or not all thes path are un-thawed or not.
IMHO we should forbid nesting.
Nesting was originally advised by Eric for the reason of scalability. But if it is not actually wanted by apps, I think we can remove nesting from API design. (I personally don't know apps that run parallel FSFreeze operations for multiple disks.)
From the implementation view point, nesting may be difficult to supported in qemu guest agent, because some guest operating systems such as Windows cannot start new FSFreeze while some of filesystems are frozen.
Yep, I hadn't even considered that impl complexity. I agree we should just drop the notion of nesting and say that we'll return VIR_ERR_OPERATION_INVALID if someone attempts to freeze an already frozen FS.
OK, I will drop nesting.
Regards, Daniel
Regards, Tomoki Sekiyama

On Thu, Apr 24, 2014 at 12:16:00AM +0000, Tomoki Sekiyama wrote:
Hi Daniel,
On 4/23/14 5:55 , "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Tue, Apr 22, 2014 at 06:22:18PM +0000, Tomoki Sekiyama wrote:
Hi Daniel, thanks for your comment.
On 4/22/14 11:39 , "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Thu, Apr 03, 2014 at 11:39:29AM -0400, Tomoki Sekiyama wrote:
+ +/** + * virDomainFSFreeze: + * @dom: a domain object + * @disks: list of disk names to be frozen + * @ndisks: the number of disks specified in @disks
I realize this current patch series doesn't use the @disks parameter at all, but what exactly are we expecting to be passed here ? Is this a list of filesystem paths from the guest OS pov, or is it a list of disks targets from libvirt's POV ? I'm guessing the former since this is expected to be passed to the guest agent.
My intention for 'disks' is latter, a list of disks targets from libvirt's POV. Currently it is just a placeholder of API. It would be passed to the agent after it is converted into a list of device addresses (e.g. a pair of drive address and controller's PCI address) so that the agent can look up filesystems on the specified disks.
Hmm, I wonder how practical such a conversion will be.
eg libvirt has a block device "sda", but the guest OS may have added a partition table (sda1, sda2, sda3, etc) and then put some of those partitions into LVM (volgroup00) and then created logical volume (vol1, vol2, etc). The guest agent can freeze individual filesystems on each logical volume, so if the API is just taking libvirt block device names, we can't express any way to freeze the filesystems the guest has.
Specifying libvirt disk alias name is coming from applications' requirement. For example, OpenStack cinder driver only knows provide libvirt device names. It is also nice if virDomainSnapshotCreateXML can specify 'disks' to be frozen when only a subset of the disks is specified in snapshot XML.
I'm now prototyping qemu-guest-agent code to resolve filesystems from disk addresses, and it is working with virtio/SATA/IDE/SCSI drives on x86 Linux guests. It can also handle LVM logical volumes that lies on multiple partitions on multiple disks.
https://github.com/tsekiyama/qemu/commit/6d26115e769a7fe6aba7be52d2180453aca5fee5
This gathers disk device information from sysfs in the guest. On windows, I hope Virtual Disk Service can provide this kind of informations too.
All of this assumes that you're only interested in freezing filesystems that are backed by virtual devices exposed by the hypervisor. A guest OS could be directly accessing iSCSI/RBD/Gluster volumes. IMHO it is within the scope of this API to be able to freeze/thaw such volumes, but if you make the API take libvirt disk names, this is impossible, since these volumes are invisible to libvirt.

What you propose is also fairly coarse, because you are forcing all filesystems on a specified block device to be suspended at the same time. That is indeed probably the common-case need, but it is desirable not to lock ourselves into that as the only option.

Finally, if the guest is doing filesystem passthrough (e.g. virtio-9p), then we ought to allow for freeze/thaw of such filesystems, which again don't have any block device associated with them in the libvirt XML.
So I think we have no choice but to actually have the API take a list of guest "volumes" (eg mount points in Linux, or drive letters in Windows).
Ideally the guest agent would also provide a way to list all currently known guest "volumes" so we could expose that in the API too later.
Possibly. If the volumes information from the API contains the dependent hardware addresses, libvirt clients might be able to map the volumes and libvirt disks from domain XML. In this case, specifying subset volumes in virDomainSnapshotCreateXML would be difficult. Maybe libvirt should provide the mapping function.
Which way do you prefer?
IMHO, the more I look at this, the more I think we need to provide a way for apps to enumerate mount points in the guest, so that we can cover freeze/thaw of individual filesystems, not merely block devices.

When listing filesystems, we'd need a way to express what storage a filesystem is hosted on. So we'd likely be talking about an API which returned an array of (mount point, storage XML), where the storage XML is some concise XML description of the storage beneath the FS.

Regards,
Daniel

On 4/24/14 4:58 , "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Thu, Apr 24, 2014 at 12:16:00AM +0000, Tomoki Sekiyama wrote:
Hi Daniel,
On Tue, Apr 22, 2014 at 06:22:18PM +0000, Tomoki Sekiyama wrote:
Hi Daniel, thanks for your comment.
On 4/22/14 11:39 , "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Thu, Apr 03, 2014 at 11:39:29AM -0400, Tomoki Sekiyama wrote:
+ +/** + * virDomainFSFreeze: + * @dom: a domain object + * @disks: list of disk names to be frozen + * @ndisks: the number of disks specified in @disks
I realize this current patch series doesn't use the @disks parameter at all, but what exactly are we expecting to be passed here ? Is this a list of filesystem paths from the guest OS pov, or is it a list of disks targets from libvirt's POV ? I'm guessing the former since this is expected to be passed to the guest agent.
My intention for 'disks' is the latter: a list of disk targets from libvirt's POV. Currently it is just a placeholder in the API. It would be passed to the agent after being converted into a list of device addresses (e.g. a pair of drive address and controller PCI address), so that the agent can look up the filesystems on the specified disks.
On 4/23/14 5:55, "Daniel P. Berrange" <berrange@redhat.com> wrote:
Hmm, I wonder how practical such a conversion will be.
eg libvirt has a block device "sda", but the guest OS may have added a partition table (sda1, sda2, sda3, etc) and then put some of those partitions into LVM (volgroup00) and then created logical volume (vol1, vol2, etc). The guest agent can freeze individual filesystems on each logical volume, so if the API is just taking libvirt block device names, we can't express any way to freeze the filesystems the guest has.
Specifying libvirt disk alias name is coming from applications' requirement. For example, OpenStack cinder driver only knows provide libvirt device names. It is also nice if virDomainSnapshotCreateXML can specify 'disks' to be frozen when only a subset of the disks is specified in snapshot XML.
I'm now prototyping qemu-guest-agent code to resolve filesystems from disk addresses, and it is working with virtio/SATA/IDE/SCSI drives on x86 Linux guests. It can also handle LVM logical volumes that lies on multiple partitions on multiple disks.
https://github.com/tsekiyama/qemu/commit/6d26115e769a7fe6aba7be52d2180453aca5fee5
This gathers disk device information from sysfs in the guest. On windows, I hope Virtual Disk Service can provide this kind of informations too.
All of this assumes that you're only interested in freezing filesystems that are backed by virtual devices exposes from the hypervisor. A guest OS could be directly accessing iSCSI/RBD/Gluster volumes. IMHO, it is within scope of this API to be able to freeze/thaw such volumes, but if you make the API take libvirt disk names, this is impossible, since these volumes are invisible to libvirt.
What you propose is also fairly coarse because you are forcing all filesystems on a specified block device to be suspended at the same time. That is indeed probably the common case need, but it is desirable not to lock ourselves into that as the only option.
Finally, if the guest is doing filesystem passthrough (eg virtio-9p) then we ought to allow for freeze/thaw of such filesystems, which again don't have any block device associated with libvirt XML.
So I think we have no choice but to actually have the API take a list of guest "volumes" (eg mount points in Linux, or drive letters in Windows).
Ideally the guest agent would also provide a way to list all currently known guest "volumes" so we could expose that in the API too later.
Possibly. If the volumes information from the API contains the dependent hardware addresses, libvirt clients might be able to map the volumes and libvirt disks from domain XML. In this case, specifying subset volumes in virDomainSnapshotCreateXML would be difficult. Maybe libvirt should provide the mapping function.
Which way do you prefer?
IMHO the more I look at this, the more I think we need to provide a way for apps to enumerate mount points in the guest, so that we can cover freeze/thaw of individual filesystems, not merely block device.
OK then, the APIs will look like:

int virDomainFSFreeze(virDomainPtr dom,
                      const char **mountPoints,
                      unsigned int nMountPoints,
                      unsigned int flags);

int virDomainFSThaw(virDomainPtr dom,
                    unsigned int flags);

As we drop nesting, FSThaw will not take a mountPoints argument.
When listing filesystems, we'd need a way to express what storage a filesystem is hosted on. So we'd likely be talking about an API which returned an array of (mount point, storage XML), where storage XML is some concise XML description of the storage beneath the FS.
This sounds reasonable to me. Thanks for your suggestions.
Regards, Daniel
Regards, Tomoki Sekiyama

On Thu, Apr 24, 2014 at 09:29:04PM +0000, Tomoki Sekiyama wrote:
On 4/24/14 4:58 , "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Thu, Apr 24, 2014 at 12:16:00AM +0000, Tomoki Sekiyama wrote:
Hi Daniel,
On Tue, Apr 22, 2014 at 06:22:18PM +0000, Tomoki Sekiyama wrote:
Hi Daniel, thanks for your comment.
On 4/22/14 11:39 , "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Thu, Apr 03, 2014 at 11:39:29AM -0400, Tomoki Sekiyama wrote: > + > +/** > + * virDomainFSFreeze: > + * @dom: a domain object > + * @disks: list of disk names to be frozen > + * @ndisks: the number of disks specified in @disks
I realize this current patch series doesn't use the @disks parameter at all, but what exactly are we expecting to be passed here ? Is this a list of filesystem paths from the guest OS pov, or is it a list of disks targets from libvirt's POV ? I'm guessing the former since this is expected to be passed to the guest agent.
My intention for 'disks' is the latter: a list of disk targets from libvirt's POV. Currently it is just a placeholder in the API. It would be passed to the agent after being converted into a list of device addresses (e.g. a pair of drive address and controller PCI address), so that the agent can look up the filesystems on the specified disks.
On 4/23/14 5:55, "Daniel P. Berrange" <berrange@redhat.com> wrote:
Hmm, I wonder how practical such a conversion will be.
eg libvirt has a block device "sda", but the guest OS may have added a partition table (sda1, sda2, sda3, etc) and then put some of those partitions into LVM (volgroup00) and then created logical volume (vol1, vol2, etc). The guest agent can freeze individual filesystems on each logical volume, so if the API is just taking libvirt block device names, we can't express any way to freeze the filesystems the guest has.
Specifying libvirt disk alias name is coming from applications' requirement. For example, OpenStack cinder driver only knows provide libvirt device names. It is also nice if virDomainSnapshotCreateXML can specify 'disks' to be frozen when only a subset of the disks is specified in snapshot XML.
I'm now prototyping qemu-guest-agent code to resolve filesystems from disk addresses, and it is working with virtio/SATA/IDE/SCSI drives on x86 Linux guests. It can also handle LVM logical volumes that lies on multiple partitions on multiple disks.
https://github.com/tsekiyama/qemu/commit/6d26115e769a7fe6aba7be52d2180453aca5fee5
This gathers disk device information from sysfs in the guest. On windows, I hope Virtual Disk Service can provide this kind of informations too.
All of this assumes that you're only interested in freezing filesystems that are backed by virtual devices exposes from the hypervisor. A guest OS could be directly accessing iSCSI/RBD/Gluster volumes. IMHO, it is within scope of this API to be able to freeze/thaw such volumes, but if you make the API take libvirt disk names, this is impossible, since these volumes are invisible to libvirt.
What you propose is also fairly coarse because you are forcing all filesystems on a specified block device to be suspended at the same time. That is indeed probably the common case need, but it is desirable not to lock ourselves into that as the only option.
Finally, if the guest is doing filesystem passthrough (eg virtio-9p) then we ought to allow for freeze/thaw of such filesystems, which again don't have any block device associated with libvirt XML.
So I think we have no choice but to actually have the API take a list of guest "volumes" (eg mount points in Linux, or drive letters in Windows).
Ideally the guest agent would also provide a way to list all currently known guest "volumes" so we could expose that in the API too later.
Possibly. If the volumes information from the API contains the dependent hardware addresses, libvirt clients might be able to map the volumes and libvirt disks from domain XML. In this case, specifying subset volumes in virDomainSnapshotCreateXML would be difficult. Maybe libvirt should provide the mapping function.
Which way do you prefer?
IMHO the more I look at this, the more I think we need to provide a way for apps to enumerate mount points in the guest, so that we can cover freeze/thaw of individual filesystems, not merely block device.
OK, then, APIs will look like
int virDomainFSFreeze(virDomainPtr dom, const char *mountPoints, unsigned int nMountPoints, unsigned int flags);
int virDomainFSThaw(virDomainPtr dom, unsigned int flags);
As we drop nesting, FSThaw will not take mountPoints argument.
Actually, even without nesting, I think it is still potentially valuable to allow us to thaw individual filesystems. I'd include mountPoints in the args, even if we don't intend to implement it in QEMU for the foreseeable future, just to future-proof it.

Regards,
Daniel

On 4/25/14 5:44 , "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Thu, Apr 24, 2014 at 09:29:04PM +0000, Tomoki Sekiyama wrote:
On 4/24/14 4:58 , "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Thu, Apr 24, 2014 at 12:16:00AM +0000, Tomoki Sekiyama wrote:
Hi Daniel,
On 4/23/14 5:55 , "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Tue, Apr 22, 2014 at 06:22:18PM +0000, Tomoki Sekiyama wrote:
Hi Daniel, thanks for your comment.
On 4/22/14 11:39, "Daniel P. Berrange" <berrange@redhat.com> wrote:
On Thu, Apr 03, 2014 at 11:39:29AM -0400, Tomoki Sekiyama wrote:
+ +/** + * virDomainFSFreeze: + * @dom: a domain object + * @disks: list of disk names to be frozen + * @ndisks: the number of disks specified in @disks
I realize this current patch series doesn't use the @disks parameter at all, but what exactly are we expecting to be passed here? Is this a list of filesystem paths from the guest OS POV, or is it a list of disk targets from libvirt's POV? I'm guessing the former, since this is expected to be passed to the guest agent.
My intention for 'disks' is the latter: a list of disk targets from libvirt's POV. Currently it is just a placeholder in the API. It would be passed to the agent after being converted into a list of device addresses (e.g. a pair of drive address and controller PCI address), so that the agent can look up the filesystems on the specified disks.
Hmm, I wonder how practical such a conversion will be.
eg libvirt has a block device "sda", but the guest OS may have added a partition table (sda1, sda2, sda3, etc) and then put some of those partitions into LVM (volgroup00) and then created logical volume (vol1, vol2, etc). The guest agent can freeze individual filesystems on each logical volume, so if the API is just taking libvirt block device names, we can't express any way to freeze the filesystems the guest has.
Specifying libvirt disk alias name is coming from applications' requirement. For example, OpenStack cinder driver only knows provide libvirt device names. It is also nice if virDomainSnapshotCreateXML can specify 'disks' to be frozen when only a subset of the disks is specified in snapshot XML.
I'm now prototyping qemu-guest-agent code to resolve filesystems from disk addresses, and it is working with virtio/SATA/IDE/SCSI drives on x86 Linux guests. It can also handle LVM logical volumes that lies on multiple partitions on multiple disks.
https://github.com/tsekiyama/qemu/commit/6d26115e769a7fe6aba7be52d2180453aca5fee5
This gathers disk device information from sysfs in the guest. On windows, I hope Virtual Disk Service can provide this kind of informations too.
All of this assumes that you're only interested in freezing filesystems that are backed by virtual devices exposes from the hypervisor. A guest OS could be directly accessing iSCSI/RBD/Gluster volumes. IMHO, it is within scope of this API to be able to freeze/thaw such volumes, but if you make the API take libvirt disk names, this is impossible, since these volumes are invisible to libvirt.
What you propose is also fairly coarse because you are forcing all filesystems on a specified block device to be suspended at the same time. That is indeed probably the common case need, but it is desirable not to lock ourselves into that as the only option.
Finally, if the guest is doing filesystem passthrough (eg virtio-9p) then we ought to allow for freeze/thaw of such filesystems, which again don't have any block device associated with libvirt XML.
So I think we have no choice but to actually have the API take a list of guest "volumes" (eg mount points in Linux, or drive letters in Windows).
Ideally the guest agent would also provide a way to list all currently known guest "volumes" so we could expose that in the API too later.
Possibly. If the volumes information from the API contains the dependent hardware addresses, libvirt clients might be able to map the volumes and libvirt disks from domain XML. In this case, specifying subset volumes in virDomainSnapshotCreateXML would be difficult. Maybe libvirt should provide the mapping function.
Which way do you prefer?
IMHO the more I look at this, the more I think we need to provide a way for apps to enumerate mount points in the guest, so that we can cover freeze/thaw of individual filesystems, not merely block device.
OK, then, APIs will look like
int virDomainFSFreeze(virDomainPtr dom, const char *mountPoints, unsigned int nMountPoints, unsigned int flags);
int virDomainFSThaw(virDomainPtr dom, unsigned int flags);
As we drop nesting, FSThaw will not take mountPoints argument.
Actually, even without nesting, I think it is still potentially valuable to allow us to thaw individual filesystems. I'd include mountPoints in the args, even if we don't intend it implement it in QEMU for the forseeable future, just to future proof it.
I see. I'll make them:

int virDomainFSFreeze(virDomainPtr dom,
                      const char **mountPoints,
                      unsigned int nMountPoints,
                      unsigned int flags);

int virDomainFSThaw(virDomainPtr dom,
                    const char **mountPoints,
                    unsigned int nMountPoints,
                    unsigned int flags);

However, the QEMU driver might ignore mountPoints in virDomainFSThaw.
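For illustration, a small helper sketch (my own, assuming the revised signatures above; the mount points shown are examples) of how a caller could then quiesce just a subset of guest filesystems:

  #include <libvirt/libvirt.h>

  /* Freeze only /var and /data inside the guest, take the storage-side
   * snapshot, then thaw the same mount points again. */
  static int
  snapshot_subset(virDomainPtr dom)
  {
      const char *mounts[] = { "/var", "/data" };   /* example mount points */
      unsigned int nmounts = 2;

      if (virDomainFSFreeze(dom, mounts, nmounts, 0) < 0)
          return -1;

      /* ... trigger the external/storage snapshot here ... */

      return virDomainFSThaw(dom, mounts, nmounts, 0);
  }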
Regards, Daniel
Regards, Tomoki Sekiyama

New rules are added in fixup_name in gendispatch.pl to keep the name FSFreeze and FSThaw. This adds a new ACL permission 'fs_freeze', which is also applied to VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE flag. Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com> --- src/access/viraccessperm.c | 2 +- src/access/viraccessperm.h | 6 ++++++ src/remote/remote_driver.c | 2 ++ src/remote/remote_protocol.x | 30 ++++++++++++++++++++++++++++-- src/remote_protocol-structs | 18 ++++++++++++++++++ src/rpc/gendispatch.pl | 2 ++ 6 files changed, 57 insertions(+), 3 deletions(-) diff --git a/src/access/viraccessperm.c b/src/access/viraccessperm.c index d517c66..462f46c 100644 --- a/src/access/viraccessperm.c +++ b/src/access/viraccessperm.c @@ -39,7 +39,7 @@ VIR_ENUM_IMPL(virAccessPermDomain, "start", "stop", "reset", "save", "delete", "migrate", "snapshot", "suspend", "hibernate", "core_dump", "pm_control", - "init_control", "inject_nmi", "send_input", "send_signal", "fs_trim", + "init_control", "inject_nmi", "send_input", "send_signal", "fs_trim", "fs_freeze", "block_read", "block_write", "mem_read", "open_graphics", "open_device", "screenshot", "open_namespace"); diff --git a/src/access/viraccessperm.h b/src/access/viraccessperm.h index 6d14f05..ac48d70 100644 --- a/src/access/viraccessperm.h +++ b/src/access/viraccessperm.h @@ -242,6 +242,12 @@ typedef enum { */ VIR_ACCESS_PERM_DOMAIN_FS_TRIM, /* Issue TRIM to guest filesystems */ + /** + * @desc: Freeze and thaw domain filesystems + * @message: Freezing and thawing domain filesystems require authorization + */ + VIR_ACCESS_PERM_DOMAIN_FS_FREEZE, /* Freeze and thaw guest filesystems */ + /* Peeking at guest */ /** diff --git a/src/remote/remote_driver.c b/src/remote/remote_driver.c index ed7dde6..8b39a90 100644 --- a/src/remote/remote_driver.c +++ b/src/remote/remote_driver.c @@ -7800,6 +7800,8 @@ static virDriver remote_driver = { .domainMigrateFinish3Params = remoteDomainMigrateFinish3Params, /* 1.1.0 */ .domainMigrateConfirm3Params = remoteDomainMigrateConfirm3Params, /* 1.1.0 */ .connectGetCPUModelNames = remoteConnectGetCPUModelNames, /* 1.1.3 */ + .domainFSFreeze = remoteDomainFSFreeze, /* 1.2.4 */ + .domainFSThaw = remoteDomainFSThaw, /* 1.2.4 */ }; static virNetworkDriver network_driver = { diff --git a/src/remote/remote_protocol.x b/src/remote/remote_protocol.x index 6c445cc..13fa69b 100644 --- a/src/remote/remote_protocol.x +++ b/src/remote/remote_protocol.x @@ -235,6 +235,9 @@ const REMOTE_DOMAIN_JOB_STATS_MAX = 64; /* Upper limit on number of CPU models */ const REMOTE_CONNECT_CPU_MODELS_MAX = 8192; +/* Upper limit on number of disks to frozen */ +const REMOTE_DOMAIN_FSFREEZE_DISKS_MAX = 256; + /* UUID. VIR_UUID_BUFLEN definition comes from libvirt.h */ typedef opaque remote_uuid[VIR_UUID_BUFLEN]; @@ -2959,6 +2962,17 @@ struct remote_network_event_lifecycle_msg { int detail; }; +struct remote_domain_fsfreeze_args { + remote_nonnull_domain dom; + remote_nonnull_string disks<REMOTE_DOMAIN_FSFREEZE_DISKS_MAX>; /* (const char **) */ + unsigned int flags; +}; + +struct remote_domain_fsthaw_args { + remote_nonnull_domain dom; + remote_nonnull_string disks<REMOTE_DOMAIN_FSFREEZE_DISKS_MAX>; /* (const char **) */ + unsigned int flags; +}; /*----- Protocol. 
-----*/ @@ -4289,7 +4303,7 @@ enum remote_procedure { /** * @generate: both * @acl: domain:snapshot - * @acl: domain:write:VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE + * @acl: domain:fs_freeze:VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE */ REMOTE_PROC_DOMAIN_SNAPSHOT_CREATE_XML = 185, @@ -5275,5 +5289,17 @@ enum remote_procedure { * @generate: both * @acl: domain:core_dump */ - REMOTE_PROC_DOMAIN_CORE_DUMP_WITH_FORMAT = 334 + REMOTE_PROC_DOMAIN_CORE_DUMP_WITH_FORMAT = 334, + + /** + * @generate: both + * @acl: domain:fs_freeze + */ + REMOTE_PROC_DOMAIN_FSFREEZE = 335, + + /** + * @generate: both + * @acl: domain:fs_freeze + */ + REMOTE_PROC_DOMAIN_FSTHAW = 336 }; diff --git a/src/remote_protocol-structs b/src/remote_protocol-structs index 456d0da..439ff87 100644 --- a/src/remote_protocol-structs +++ b/src/remote_protocol-structs @@ -2427,6 +2427,22 @@ struct remote_network_event_lifecycle_msg { int event; int detail; }; +struct remote_domain_fsfreeze_args { + remote_nonnull_domain dom; + struct { + u_int disks_len; + remote_nonnull_string * disks_val; + } disks; + u_int flags; +}; +struct remote_domain_fsthaw_args { + remote_nonnull_domain dom; + struct { + u_int disks_len; + remote_nonnull_string * disks_val; + } disks; + u_int flags; +}; enum remote_procedure { REMOTE_PROC_CONNECT_OPEN = 1, REMOTE_PROC_CONNECT_CLOSE = 2, @@ -2762,4 +2778,6 @@ enum remote_procedure { REMOTE_PROC_DOMAIN_EVENT_CALLBACK_PMSUSPEND_DISK = 332, REMOTE_PROC_DOMAIN_EVENT_CALLBACK_DEVICE_REMOVED = 333, REMOTE_PROC_DOMAIN_CORE_DUMP_WITH_FORMAT = 334, + REMOTE_PROC_DOMAIN_FSFREEZE = 335, + REMOTE_PROC_DOMAIN_FSTHAW = 336, }; diff --git a/src/rpc/gendispatch.pl b/src/rpc/gendispatch.pl index b76bbac..9cd620e 100755 --- a/src/rpc/gendispatch.pl +++ b/src/rpc/gendispatch.pl @@ -64,6 +64,8 @@ sub fixup_name { $name =~ s/Nmi$/NMI/; $name =~ s/Pm/PM/; $name =~ s/Fstrim$/FSTrim/; + $name =~ s/Fsfreeze$/FSFreeze/; + $name =~ s/Fsthaw$/FSThaw/; $name =~ s/Scsi/SCSI/; $name =~ s/Wwn$/WWN/;

Adds 'quiesced' counter into qemuDomainObjPrivate that tracks how many times fsfreeze is requested in the domain. When qemuDomainSnapshotFSFreeze is called, the counter is incremented. The fsfreeze request is sent to the guest agent when the counter changes from 0 to 1. qemuDomainSnapshotFSThaw decrements the counter and requests fsthaw when it becomes 0. It also modify error code from qemuDomainSnapshotFSFreeze and qemuDomainSnapshotFSThaw, so that a caller can know whether the command is actually sent to the guest agent. If the error is caused before sending a freeze command, a counterpart thaw command shouldn't be sent either. Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com> --- src/qemu/qemu_domain.c | 6 ++++ src/qemu/qemu_domain.h | 2 + src/qemu/qemu_driver.c | 65 ++++++++++++++++++++++++++++++++++++++---------- 3 files changed, 59 insertions(+), 14 deletions(-) diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c index cdd4601..6ed3d67 100644 --- a/src/qemu/qemu_domain.c +++ b/src/qemu/qemu_domain.c @@ -357,6 +357,9 @@ qemuDomainObjPrivateXMLFormat(virBufferPtr buf, void *data) virBufferAddLit(buf, "</devices>\n"); } + if (priv->quiesced) + virBufferAsprintf(buf, "<quiesced depth='%d'/>\n", priv->quiesced); + return 0; } @@ -518,6 +521,9 @@ qemuDomainObjPrivateXMLParse(xmlXPathContextPtr ctxt, void *data) } VIR_FREE(nodes); + if (virXPathInt("string(./quiesced/@depth)", ctxt, &priv->quiesced) < 0) + priv->quiesced = 0; + return 0; error: diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h index b2830c4..620a258 100644 --- a/src/qemu/qemu_domain.h +++ b/src/qemu/qemu_domain.h @@ -176,6 +176,8 @@ struct _qemuDomainObjPrivate { char **qemuDevices; /* NULL-terminated list of devices aliases known to QEMU */ bool hookRun; /* true if there was a hook run over this domain */ + + int quiesced; /* count how many times filesystems are quiesced */ }; typedef enum { diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 9a2de12..2bb5fa9 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -12100,32 +12100,68 @@ qemuDomainPrepareDiskChainElement(virQEMUDriverPtr driver, } +/* Return -1 if request is not sent to agent due to misconfig, -2 if request + * is sent but failed, and number of frozen filesystems on success. */ static int -qemuDomainSnapshotFSFreeze(virDomainObjPtr vm) +qemuDomainSnapshotFSFreeze(virQEMUDriverPtr driver, virDomainObjPtr vm) { qemuDomainObjPrivatePtr priv = vm->privateData; + virQEMUDriverConfigPtr cfg; int freezed; if (!qemuDomainAgentAvailable(priv, true)) return -1; + priv->quiesced++; + cfg = virQEMUDriverGetConfig(driver); + if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm) < 0) { + virObjectUnref(cfg); + priv->quiesced--; + return -1; + } + virObjectUnref(cfg); + + if (priv->quiesced > 1) + return 0; + qemuDomainObjEnterAgent(vm); freezed = qemuAgentFSFreeze(priv->agent); qemuDomainObjExitAgent(vm); - - return freezed; + return freezed < 0 ? -2 : freezed; } +/* Return -1 if request is not sent to agent due to misconfig, -2 if request + * is sent but failed, and number of thawed filesystems on success. 
*/ static int -qemuDomainSnapshotFSThaw(virDomainObjPtr vm, bool report) +qemuDomainSnapshotFSThaw(virQEMUDriverPtr driver, + virDomainObjPtr vm, bool report) { qemuDomainObjPrivatePtr priv = vm->privateData; + virQEMUDriverConfigPtr cfg; int thawed; virErrorPtr err = NULL; if (!qemuDomainAgentAvailable(priv, report)) return -1; + if (!priv->quiesced && report) { + virReportError(VIR_ERR_OPERATION_INVALID, "%s", + _("domain is not quiesced")); + return -1; + } + + priv->quiesced--; + cfg = virQEMUDriverGetConfig(driver); + if (virDomainSaveStatus(driver->xmlopt, cfg->stateDir, vm) < 0) { + virObjectUnref(cfg); + priv->quiesced++; + return -1; + } + virObjectUnref(cfg); + + if (priv->quiesced) + return 0; + qemuDomainObjEnterAgent(vm); if (!report) err = virSaveLastError(); @@ -12135,7 +12171,7 @@ qemuDomainSnapshotFSThaw(virDomainObjPtr vm, bool report) qemuDomainObjExitAgent(vm); virFreeError(err); - return thawed; + return thawed < 0 ? -2 : thawed; } /* The domain is expected to be locked and inactive. */ @@ -13116,17 +13152,18 @@ qemuDomainSnapshotCreateActiveExternal(virConnectPtr conn, goto cleanup; /* If quiesce was requested, then issue a freeze command, and a - * counterpart thaw command, no matter what. The command will - * fail if the guest is paused or the guest agent is not - * running. */ + * counterpart thaw command when it is actually sent to agent. + * The command will fail if the guest is paused or the guest agent + * is not running, or is already quiesced. */ if (flags & VIR_DOMAIN_SNAPSHOT_CREATE_QUIESCE) { - if (qemuDomainSnapshotFSFreeze(vm) < 0) { - /* helper reported the error */ - thaw = -1; + int freeze = qemuDomainSnapshotFSFreeze(driver, vm); + if (freeze < 0) { + /* the helper reported the error */ + if (freeze == -2) + thaw = -1; /* the command is sent but agent failed */ goto endjob; - } else { - thaw = 1; } + thaw = 1; } /* We need to track what state the guest is in, since taking the @@ -13267,7 +13304,7 @@ qemuDomainSnapshotCreateActiveExternal(virConnectPtr conn, goto cleanup; } if (vm && thaw != 0 && - qemuDomainSnapshotFSThaw(vm, thaw > 0) < 0) { + qemuDomainSnapshotFSThaw(driver, vm, thaw > 0) < 0) { /* helper reported the error, if it was needed */ if (thaw > 0) ret = -1;
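Condensed to its core, the counting scheme this patch introduces behaves like the sketch below (locking, status saving and the real agent plumbing are omitted; the agent_* functions are stand-ins for qemuAgentFSFreeze()/qemuAgentFSThaw()):

  /* Stand-ins for the real guest agent calls. */
  static int agent_fsfreeze(void) { return 0; }
  static int agent_fsthaw(void)   { return 0; }

  /* Simplified model of the per-domain 'quiesced' counter. */
  struct dom_state {
      int quiesced;   /* number of outstanding freeze requests */
  };

  /* Only the 0 -> 1 transition actually talks to the agent. */
  static int fs_freeze(struct dom_state *d)
  {
      d->quiesced++;
      if (d->quiesced > 1)
          return 0;                /* already frozen, just record the nesting */
      return agent_fsfreeze();
  }

  /* Only the 1 -> 0 transition actually talks to the agent. */
  static int fs_thaw(struct dom_state *d)
  {
      if (d->quiesced == 0)
          return -1;               /* domain is not quiesced */
      d->quiesced--;
      if (d->quiesced > 0)
          return 0;                /* still nested, keep filesystems frozen */
      return agent_fsthaw();
  }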

Use qemuDomainSnapshotFSFreeze() and qemuDomainSnapshotFSFThaw() which are already implemented for snapshot quiescing. So far, disks and ndisks parameters are unsupported, because the guest agent doesn't have means to specify disks to be frozen or thawed. Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com> --- src/qemu/qemu_driver.c | 94 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 94 insertions(+) diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c index 2bb5fa9..81ff473 100644 --- a/src/qemu/qemu_driver.c +++ b/src/qemu/qemu_driver.c @@ -16700,6 +16700,98 @@ qemuConnectGetCPUModelNames(virConnectPtr conn, } +static int +qemuDomainFSFreeze(virDomainPtr dom, + const char **disks, + unsigned int ndisks, + unsigned int flags) +{ + virQEMUDriverPtr driver = dom->conn->privateData; + virDomainObjPtr vm; + int ret = -1; + + virCheckFlags(0, -1); + + if (disks || ndisks) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("Specifying disks is not supported for now")); + return -1; + } + + if (!(vm = qemuDomObjFromDomain(dom))) + goto cleanup; + + if (virDomainFSFreezeEnsureACL(dom->conn, vm->def) < 0) + goto cleanup; + + if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) + goto cleanup; + + if (!virDomainObjIsActive(vm)) { + virReportError(VIR_ERR_OPERATION_INVALID, + "%s", _("domain is not running")); + goto endjob; + } + + ret = qemuDomainSnapshotFSFreeze(driver, vm); + + endjob: + if (!qemuDomainObjEndJob(driver, vm)) + vm = NULL; + + cleanup: + if (vm) + virObjectUnlock(vm); + return ret; +} + + +static int +qemuDomainFSThaw(virDomainPtr dom, + const char **disks, + unsigned int ndisks, + unsigned int flags) +{ + virQEMUDriverPtr driver = dom->conn->privateData; + virDomainObjPtr vm; + int ret = -1; + + virCheckFlags(0, -1); + + if (disks || ndisks) { + virReportError(VIR_ERR_ARGUMENT_UNSUPPORTED, "%s", + _("Specifying disks is not supported for now")); + return -1; + } + + if (!(vm = qemuDomObjFromDomain(dom))) + goto cleanup; + + if (virDomainFSThawEnsureACL(dom->conn, vm->def) < 0) + goto cleanup; + + if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) + goto cleanup; + + if (!virDomainObjIsActive(vm)) { + virReportError(VIR_ERR_OPERATION_INVALID, + "%s", _("domain is not running")); + goto endjob; + } + + ret = qemuDomainSnapshotFSThaw(driver, vm, true); + + endjob: + if (!qemuDomainObjEndJob(driver, vm)) + vm = NULL; + + cleanup: + if (vm) + virObjectUnlock(vm); + return ret; +} + + static virDriver qemuDriver = { .no = VIR_DRV_QEMU, .name = QEMU_DRIVER_NAME, @@ -16890,6 +16982,8 @@ static virDriver qemuDriver = { .domainMigrateFinish3Params = qemuDomainMigrateFinish3Params, /* 1.1.0 */ .domainMigrateConfirm3Params = qemuDomainMigrateConfirm3Params, /* 1.1.0 */ .connectGetCPUModelNames = qemuConnectGetCPUModelNames, /* 1.1.3 */ + .domainFSFreeze = qemuDomainFSFreeze, /* 1.2.4 */ + .domainFSThaw = qemuDomainFSThaw, /* 1.2.4 */ };

These are exposed as the domfsfreeze and domfsthaw commands.

Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
---
 tools/virsh-domain.c | 128 ++++++++++++++++++++++++++++++++++++++++++++++++++
 tools/virsh.pod      |  23 +++++++++
 2 files changed, 151 insertions(+)

diff --git a/tools/virsh-domain.c b/tools/virsh-domain.c
index 73414f8..331ba36 100644
--- a/tools/virsh-domain.c
+++ b/tools/virsh-domain.c
@@ -11391,6 +11391,122 @@ cmdDomFSTrim(vshControl *ctl, const vshCmd *cmd)
     return ret;
 }

+static const vshCmdInfo info_domfsfreeze[] = {
+    {.name = "help",
+     .data = N_("Freeze domain's mounted filesystems.")
+    },
+    {.name = "desc",
+     .data = N_("Freeze domain's mounted filesystems.")
+    },
+    {.name = NULL}
+};
+
+static const vshCmdOptDef opts_domfsfreeze[] = {
+    {.name = "domain",
+     .type = VSH_OT_DATA,
+     .flags = VSH_OFLAG_REQ,
+     .help = N_("domain name, id or uuid")
+    },
+    {.name = "disks",
+     .type = VSH_OT_DATA,
+     .help = N_("comma separated list of disks to be frozen")
+    },
+    {.name = NULL}
+};
+static bool
+cmdDomFSFreeze(vshControl *ctl, const vshCmd *cmd)
+{
+    virDomainPtr dom = NULL;
+    bool ret = false;
+    unsigned int flags = 0;
+    const char *disk_string = NULL;
+    char **disk_list = NULL;    /* tokenized disk_string */
+    int ndisks = 0;
+    size_t i;
+
+    ignore_value(vshCommandOptString(cmd, "disks", &disk_string));
+
+    if (!(dom = vshCommandOptDomain(ctl, cmd, NULL)))
+        return ret;
+
+    if (disk_string) {
+        if ((ndisks = vshStringToArray(disk_string, &disk_list)) < 0)
+            goto cleanup;
+    }
+
+    if (virDomainFSFreeze(dom, (const char **)disk_list, ndisks, flags) < 0) {
+        vshError(ctl, _("Unable to freeze filesystems"));
+        goto cleanup;
+    }
+
+    ret = true;
+
+ cleanup:
+    for (i = 0; i < ndisks; i++)
+        VIR_FREE(disk_list[i]);
+    VIR_FREE(disk_list);
+    virDomainFree(dom);
+    return ret;
+}
+
+static const vshCmdInfo info_domfsthaw[] = {
+    {.name = "help",
+     .data = N_("Thaw domain's mounted filesystems.")
+    },
+    {.name = "desc",
+     .data = N_("Thaw domain's mounted filesystems.")
+    },
+    {.name = NULL}
+};
+
+static const vshCmdOptDef opts_domfsthaw[] = {
+    {.name = "domain",
+     .type = VSH_OT_DATA,
+     .flags = VSH_OFLAG_REQ,
+     .help = N_("domain name, id or uuid")
+    },
+    {.name = "disks",
+     .type = VSH_OT_DATA,
+     .help = N_("comma separated list of disks to be thawed")
+    },
+    {.name = NULL}
+};
+static bool
+cmdDomFSThaw(vshControl *ctl, const vshCmd *cmd)
+{
+    virDomainPtr dom = NULL;
+    bool ret = false;
+    unsigned int flags = 0;
+    const char *disk_string = NULL;
+    char **disk_list = NULL;    /* tokenized disk_string */
+    int ndisks = 0;
+    size_t i;
+
+    ignore_value(vshCommandOptString(cmd, "disks", &disk_string));
+
+    if (!(dom = vshCommandOptDomain(ctl, cmd, NULL)))
+        return ret;
+
+    if (disk_string) {
+        if ((ndisks = vshStringToArray(disk_string, &disk_list)) < 0)
+            goto cleanup;
+    }
+
+    if (virDomainFSThaw(dom, (const char **)disk_list, ndisks, flags) < 0) {
+        vshError(ctl, _("Unable to thaw filesystems"));
+        goto cleanup;
+    }
+
+    ret = true;
+
+ cleanup:
+    for (i = 0; i < ndisks; i++)
+        VIR_FREE(disk_list[i]);
+    VIR_FREE(disk_list);
+    virDomainFree(dom);
+    return ret;
+}
+
 const vshCmdDef domManagementCmds[] = {
     {.name = "attach-device",
      .handler = cmdAttachDevice,
@@ -11538,6 +11654,18 @@ const vshCmdDef domManagementCmds[] = {
      .info = info_domdisplay,
      .flags = 0
     },
+    {.name = "domfsfreeze",
+     .handler = cmdDomFSFreeze,
+     .opts = opts_domfsfreeze,
+     .info = info_domfsfreeze,
+     .flags = 0
+    },
+    {.name = "domfsthaw",
+     .handler = cmdDomFSThaw,
+     .opts = opts_domfsthaw,
+     .info = info_domfsthaw,
+     .flags = 0
+    },
     {.name = "domfstrim",
      .handler = cmdDomFSTrim,
      .opts = opts_domfstrim,
diff --git a/tools/virsh.pod b/tools/virsh.pod
index 98d891a..652aabc 100644
--- a/tools/virsh.pod
+++ b/tools/virsh.pod
@@ -941,6 +941,29 @@ Output a URI which can be used to connect to the graphical display of the
 domain via VNC, SPICE or RDP.  If I<--include-password> is specified, the
 SPICE channel password will be included in the URI.

+=item B<domfsfreeze> I<domain> [I<--disks> B<disks>]
+
+Freeze all mounted filesystems within a running domain to prepare for
+consistent snapshots.
+
+The I<--disks> flag takes a parameter B<disks>, which is a comma separated
+list of names of disks to be frozen. If this is not specified, every mounted
+filesystem on the guest is frozen.
+
+Note that the B<snapshot-create> command has a I<--quiesce> option to freeze
+and thaw the filesystems automatically to keep snapshots consistent.
+The B<domfsfreeze> command is only needed when a user wants to utilize the
+native snapshot features of storage devices not yet supported by libvirt.
+
+=item B<domfsthaw> I<domain> [I<--disks> B<disks>]
+
+Thaw all mounted filesystems within a running domain, which were frozen
+by the B<domfsfreeze> command.
+
+The I<--disks> flag takes a parameter B<disks>, which is a comma separated
+list of names of disks to be thawed. If this is not specified, every mounted
+filesystem on the guest is thawed.
+
 =item B<domfstrim> I<domain> [I<--minimum> B<bytes>]
 [I<--mountpoint mountPoint>]
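From virsh, the expected workflow pairs a freeze with a storage-side snapshot taken outside libvirt and a matching thaw; the domain name "guest1" below is a placeholder:

  # virsh domfsfreeze guest1
  ... take the native storage snapshot here ...
  # virsh domfsthaw guest1

A disk list could be passed as, for example, "virsh domfsfreeze guest1 --disks vda,vdb", although with the qemu driver in this series such a list is still rejected as unsupported.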

Any comments for this patchset?