[RFC] override CreateSnapshot extrinsic

Since the CreateSnapshot() extrinsic has 2 vendor defined values added for SnapshotType parameter, I think the method should be overridden to define these new values. The attached patch does so by creating a new class Virt_VirtualSystemSnapshotService where CreateSnapshot is overridden. The various virtualizer flavors then inherit from Virt_VirtualSystemSnapshotService instead of CIM_VirtualSystemSnapshotService directly.

Some tools will parse the class info and format gui based on qualifiers it finds. In the case of CreateSnapshot, such tools will not display the vendor additions unless we override the method and describe the additions.

While on the subject, I would like to understand the usefulness of SnapshotType 32768. This type will save the vm's memory state and subsequently restore the vm. IMO applying this memory snapshot later would be quite dangerous. The vm has since been running and the disk state will be quite different from when the memory snapshot was taken. Does this make sense or am I not thinking clearly :-)?

Finally, invoking CreateSnapshot with SnapshotType 32769 will save the vm and leave it powered off. Querying EnabledState shows the vm 'Enabled but Offline' (suspended). According to System Virtualization Profile, one should be able to move a vm in this state to Enabled by invoking RSC(Enabled) but doing so results in "snapshot exists, apply snapshot" error. So the behavior diverges from the spec IMO. It seems the current behavior of SnapshotService should just be implemented via RSC.

CreateSnapshot -> RSC(Enabled but Offline), ApplySnapshot -> RSC(Enabled)

Regards,
Jim
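To illustrate why the override matters for qualifier-driven tools, here is a minimal Python sketch. It models the value labels attached to the SnapshotType parameter and a tool that resolves them by walking up the inheritance chain. The two service class names come from the message above; the dictionary layout, the `lookup_value_labels` helper, the `Flavor_VirtualSystemSnapshotService` stand-in for one virtualizer flavor, and every label string are assumptions made for illustration, not the contents of the real MOF or of the attached patch.

```python
# Hypothetical, simplified model of class metadata; not real MOF or a CIM client API.
CLASSES = {
    "CIM_VirtualSystemSnapshotService": {
        "parent": None,
        "methods": {
            "CreateSnapshot": {
                "SnapshotType": {
                    # Placeholder labels: the base class only describes a vendor range.
                    "2": "standard type (placeholder)",
                    "3": "standard type (placeholder)",
                    "32768..65535": "Vendor Specific",
                }
            }
        },
    },
    "Virt_VirtualSystemSnapshotService": {
        "parent": "CIM_VirtualSystemSnapshotService",
        "methods": {
            # The override repeats the method and describes the vendor values.
            "CreateSnapshot": {
                "SnapshotType": {
                    "2": "standard type (placeholder)",
                    "3": "standard type (placeholder)",
                    "32768": "save memory, then restore the vm (illustrative label)",
                    "32769": "save memory, leave the vm off (illustrative label)",
                }
            }
        },
    },
    "Flavor_VirtualSystemSnapshotService": {
        "parent": "Virt_VirtualSystemSnapshotService",
        "methods": {},
    },
}


def lookup_value_labels(class_name, method, param, classes=None):
    """Return the labels from the nearest class in the chain that defines the method."""
    classes = CLASSES if classes is None else classes
    current = class_name
    while current is not None:
        entry = classes[current]
        if method in entry["methods"]:
            return entry["methods"][method][param]
        current = entry["parent"]
    raise KeyError(f"{method} not found on {class_name} or its parents")


# Variant hierarchy in which the flavor class inherits from the CIM class directly.
DIRECT = dict(CLASSES)
DIRECT["Flavor_VirtualSystemSnapshotService"] = {
    "parent": "CIM_VirtualSystemSnapshotService",
    "methods": {},
}
```

With the intermediate class in place, `lookup_value_labels("Flavor_VirtualSystemSnapshotService", "CreateSnapshot", "SnapshotType")` returns entries for 32768 and 32769; with the `DIRECT` table the same call only sees the undifferentiated vendor range.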

JF> Some tools will parse the class info and format gui based on
JF> qualifiers it finds. In the case of CreateSnapshot, such tools will
JF> not display the vendor additions unless we override the method and
JF> describe the additions.

Agreed.

JF> While on the subject, I would like to understand the usefulness of
JF> SnapshotType 32768. This type will save the vm's memory state and
JF> subsequently restore the vm. IMO applying this memory snapshot
JF> later would be quite dangerous. The vm has since been running and
JF> the disk state will be quite different from when the memory snapshot
JF> was taken. Does this make sense or am I not thinking clearly :-)?

That's true, it's not useful (or safe) to do a restore from it again, once the guest has been restored once. However, if you're looking to get the memory snapshot for forensic purposes, you would not care to be able to restore from it again.

Perhaps we should use a different filename in the case of a save-and-restore snapshot so that we don't confuse our own logic into thinking that the domain has a valid save image.

JF> Finally, invoking CreateSnapshot with SnapshotType 32769 will save
JF> the vm and leave it powered off. Querying EnabledState shows the vm
JF> 'Enabled but Offline' (suspended). According to System
JF> Virtualization Profile, one should be able to move a vm in this
JF> state to Enabled by invoking RSC(Enabled) but doing so results in
JF> "snapshot exists, apply snapshot" error. So the behavior diverges
JF> from the spec IMO. It seems the current behavior of SnapshotService
JF> should just be implemented via RSC. CreateSnapshot -> RSC(Enabled
JF> but Offline), ApplySnapshot -> RSC(Enabled)

I thought we had discussed this before on IRC, but perhaps it got lost in some of the other noise.

It seems a little broken to me to have the services cross each other with this bit of functionality. While it may seem trivial right now, since we only ever have one snapshot to restore from, I wonder what behavior it should have later if we support multiple ones? Should it restore from the most recent? The oldest? The snapshot service handles this by exposing the snapshots as instances that the caller can reference when asking to restore.

I would expect this to be the desired and sane behavior if it was all contained in ComputerSystem, but I don't think it makes sense to intermingle the behavior of the snapshot service in this case.

Perhaps it would have been better to support this by doing a save on RSC(suspend) and a restore on RSC(enabled, from suspend) in the first place, but we figured that the snapshot service would be more useful in the long run.

If we specify in the capabilities object that we don't support the transition from suspended to enabled, then we're not really breaking the spec here, right?

--
Dan Smith
IBM Linux Technology Center
Open Hypervisor Team
email: danms@us.ibm.com
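The filename suggestion in the reply above can be sketched roughly as follows. This is only an illustration of the idea: the `.save` and `.memdump` suffixes, the directory layout, and all function names are hypothetical and are not taken from the provider code. The point is simply that a save-and-restore image (SnapshotType 32768) is stored under a name that the "does this domain have a valid save image?" check never looks at.

```python
import os
import tempfile

SNAPSHOT_SAVE_AND_RESTORE = 32768  # save memory, then resume the guest
SNAPSHOT_SAVE_AND_STOP = 32769     # save memory, leave the guest powered off


def save_image_path(directory, domain, snapshot_type):
    """Pick a file name by snapshot type (suffixes are hypothetical)."""
    if snapshot_type == SNAPSHOT_SAVE_AND_STOP:
        suffix = ".save"      # the only name the restore logic treats as valid
    else:
        suffix = ".memdump"   # forensic copy; never used for a restore
    return os.path.join(directory, domain + suffix)


def has_valid_save_image(directory, domain):
    """True only if an image that is safe to restore from exists."""
    return os.path.exists(save_image_path(directory, domain, SNAPSHOT_SAVE_AND_STOP))


def demo():
    """Create both kinds of image in a scratch directory and check detection."""
    with tempfile.TemporaryDirectory() as directory:
        open(save_image_path(directory, "guest1", SNAPSHOT_SAVE_AND_RESTORE), "w").close()
        after_dump = has_valid_save_image(directory, "guest1")
        open(save_image_path(directory, "guest1", SNAPSHOT_SAVE_AND_STOP), "w").close()
        after_save = has_valid_save_image(directory, "guest1")
    return after_dump, after_save
```

Here `demo()` reports that the forensic dump alone does not count as a restorable image, while the 32769-style image does.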

Dan Smith wrote:
> JF> While on the subject, I would like to understand the usefulness of
> JF> SnapshotType 32768. This type will save the vm's memory state and
> JF> subsequently restore the vm. IMO applying this memory snapshot
> JF> later would be quite dangerous. The vm has since been running and
> JF> the disk state will be quite different from when the memory snapshot
> JF> was taken. Does this make sense or am I not thinking clearly :-)?
>
> That's true, it's not useful (or safe) to do a restore from it again, once the guest has been restored once. However, if you're looking to get the memory snapshot for forensic purposes, you would not care to be able to restore from it again.

Right, that's a valid use case.

> Perhaps we should use a different filename in the case of a save-and-restore snapshot so that we don't confuse our own logic into thinking that the domain has a valid save image.

Agreed. Sounds like a good idea.

> JF> Finally, invoking CreateSnapshot with SnapshotType 32769 will save
> JF> the vm and leave it powered off. Querying EnabledState shows the vm
> JF> 'Enabled but Offline' (suspended). According to System
> JF> Virtualization Profile, one should be able to move a vm in this
> JF> state to Enabled by invoking RSC(Enabled) but doing so results in
> JF> "snapshot exists, apply snapshot" error. So the behavior diverges
> JF> from the spec IMO. It seems the current behavior of SnapshotService
> JF> should just be implemented via RSC. CreateSnapshot -> RSC(Enabled
> JF> but Offline), ApplySnapshot -> RSC(Enabled)
>
> I thought we had discussed this before on IRC, but perhaps it got lost in some of the other noise.

We did. I just felt the need to revisit it after playing with the code :-).

> It seems a little broken to me to have the services cross each other with this bit of functionality. While it may seem trivial right now, since we only ever have one snapshot to restore from, I wonder what behavior it should have later if we support multiple ones? Should it restore from the most recent? The oldest? The snapshot service handles this by exposing the snapshots as instances that the caller can reference when asking to restore.

Yep, understood. I think my biggest problem with the SnapshotService is that I have a hard time thinking about snapshots that don't include a disk component - even more so in the context of multiple snapshots. This to me is the usefulness of SnapshotService. I can happily take snapshots (either just disk or disk + memory) and then sometime later select one to apply and end up with a functional system. With multiple, memory-only snapshots I think the most recent is all that could be safely applied.

> I would expect this to be the desired and sane behavior if it was all contained in ComputerSystem, but I don't think it makes sense to intermingle the behavior of the snapshot service in this case.

Agreed.

> Perhaps it would have been better to support this by doing a save on RSC(suspend) and a restore on RSC(enabled, from suspend) in the first place, but we figured that the snapshot service would be more useful in the long run.

From a client's perspective I just think it is cumbersome to use the SnapShotService to implement the notion of suspend. IMO, either the providers should support suspend or not. Currently they don't (as indicated in capabilities) but after invoking CreateSnapshot the vm is in suspended state - so they kind of do. Folks here writing client code are a little confused by this :-).

> If we specify in the capabilities object that we don't support the transition from suspended to enabled, then we're not really breaking the spec here, right?
Correct. And it seems the Virtual System Profile allows for states to occur even if the client cannot explicitly move the vm to that state. From note on page 23:

NOTE State transitions may be observed even if client state management as described in section 7.6 is not supported. For example, a state transition might be initiated by means inherent to the virtualization platform, or it might be triggered during activation of the virtualization platform itself.

So I think it is safe to say that from client's perspective the providers don't support the suspended state. That said, they can achieve the same effect by means of {Create,Apply}Snapshot with the vendor defined values.

Thanks,
Jim
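The behavior discussed in this message can be summarized with a small toy model, written here in Python purely as an illustration. The class, its method names, and the rule that applying a snapshot consumes it are all assumptions; only the SnapshotType values, the resulting 'Enabled but Offline' state, and the error text come from the thread. The numeric EnabledState codes (2 for Enabled, 6 for Enabled but Offline) follow the standard CIM value map.

```python
ENABLED = 2              # EnabledState: Enabled
ENABLED_BUT_OFFLINE = 6  # EnabledState: Enabled but Offline (suspended)


class ProviderError(Exception):
    """Stand-in for an error returned by a provider (hypothetical)."""


class GuestModel:
    """Toy model of one vm as seen by a client; not the provider implementation."""

    def __init__(self):
        self.enabled_state = ENABLED
        self.has_snapshot = False

    def create_snapshot(self, snapshot_type):
        if snapshot_type == 32769:
            # Memory is saved and the vm is left powered off.
            self.has_snapshot = True
            self.enabled_state = ENABLED_BUT_OFFLINE
        elif snapshot_type == 32768:
            # Memory is saved and the vm is restored, so the state is unchanged.
            pass
        else:
            raise ValueError(f"unsupported SnapshotType {snapshot_type}")

    def request_state_change(self, requested_state):
        # RSC(Enabled) from the suspended state is refused while a snapshot exists.
        if (requested_state == ENABLED
                and self.enabled_state == ENABLED_BUT_OFFLINE
                and self.has_snapshot):
            raise ProviderError("snapshot exists, apply snapshot")
        self.enabled_state = requested_state

    def apply_snapshot(self):
        if not self.has_snapshot:
            raise ProviderError("no snapshot to apply")
        self.has_snapshot = False  # assumption: applying consumes the snapshot
        self.enabled_state = ENABLED
```

In this model a client that calls `create_snapshot(32769)` observes the suspended state, gets an error from `request_state_change(ENABLED)`, and returns to Enabled only through `apply_snapshot()`, which is the {Create,Apply}Snapshot path described above.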

JF> Yep, understood. I think my biggest problem with the
JF> SnapshotService is that I have a hard time thinking about snapshots
JF> that don't include a disk component - even more so in the context of
JF> multiple snapshots. This to me is the usefulness of
JF> SnapshotService. I can happily take snapshots (either just disk or
JF> disk + memory) and then sometime later select one to apply and end
JF> up with a functional system.

Right, and that's where the snapshot service will start to become much more useful, when we can support that sort of behavior on one or more platforms.

JF> From a client's perspective I just think it is cumbersome to use the
JF> SnapShotService to implement the notion of suspend. IMO, either the
JF> providers should support suspend or not. Currently they don't (as
JF> indicated in capabilities) but after invoking CreateSnapshot the vm
JF> is in suspended state - so they kind of do. Folks here writing
JF> client code are a little confused by this :-).

But in the future, when we do have coordinated snapshots, would you ever really want to do a suspend instead of a snapshot? The danger of someone corrupting their disk is definitely higher, and I don't think there are any performance benefits for most storage types.

So my point is: when we get to that point, even if we support suspend as a state transition, it would still be a "you should really be using the snapshot service instead" kinda thing, I think.

JF> So I think it is safe to say that from client's perspective the
JF> providers don't support the suspended state. That said, they can
JF> achieve the same effect by means of {Create,Apply}Snapshot with the
JF> vendor defined values.

Okay, I'm happy with that for now. If it becomes an issue, perhaps we can target a behavioral change to coincide with a significant version bump.

Thanks Jim!

--
Dan Smith
IBM Linux Technology Center
Open Hypervisor Team
email: danms@us.ibm.com
Participants (2): Dan Smith, Jim Fehlig