[libvirt] Disk snapshot mode proposal: patch for storing the snapshot mode from .vmx to .xml

Hello,
A few days ago I proposed implementing the snapshot mode functionality for esx/vpx:
Mail subject: "[libvirt] VMWare "independent disk" processing needed"
Sent: "04.07.2011"
Sender: "computernews@rambler.ru"
What I meant is to allow libvirt to handle the following settings (see the full .vmx attached):
vmx: scsi0:1.mode = "independent-persistent"
vmx: scsi0:2.mode = "independent-nonpersistent"
As no one responded, I decided to start implementing it myself. So far I have managed to implement the .vmx -> .xml part. Following the "General tips for contributing patches" (http://libvirt.org/hacking.html#patches), I am sending the patch to the community as early as it makes sense, as I still hope to contribute this feature. It builds and has been functionally tested. I also tried my best to stick to the project's coding conventions, so I hope the patch will not take long to review for someone experienced enough.
I saw that someone has already started on this (there are a few comments about snapshot modes in src/vmx/vmx.c), so I tried to understand the original idea and follow it. I hope I got it right.
I am willing to keep working in this direction. If someone would be so kind as to look at my efforts and provide feedback, it would be very welcome and, I hope, useful for the rest of the project.
Attached are:
1. the patch itself
2. the ".vmx" file I am testing with
3. the ".xml" file produced by my changes
Looking forward to any feedback, criticism, or advice. Thanks in advance.
Best regards
Oleh Paliy

2011/7/7 computernews@rambler.ru <computernews@rambler.ru>:
Hello,
A few days ago I proposed implementing the snapshot mode functionality for esx/vpx: Mail subject: "[libvirt] VMWare "independent disk" processing needed" Sent: "04.07.2011" Sender: "computernews@rambler.ru"
What I meant is to allow libvirt to handle the following settings (see the full .vmx attached): vmx: scsi0:1.mode = "independent-persistent" vmx: scsi0:2.mode = "independent-nonpersistent"
As no one responded, I decided to start implementing it myself.
Well, I saw your mail, but failed to respond in a reasonable timeframe.
So far I have managed to implement the .vmx -> .xml part. Following the "General tips for contributing patches" (http://libvirt.org/hacking.html#patches), I am sending the patch to the community as early as it makes sense, as I still hope to contribute this feature. It builds and has been functionally tested. I also tried my best to stick to the project's coding conventions, so I hope the patch will not take long to review for someone experienced enough.
I saw that someone has already started on this (there are a few comments about snapshot modes in src/vmx/vmx.c), so I tried to understand the original idea and follow it. I hope I got it right.
I am willing to keep working in this direction. If someone would be so kind as to look at my efforts and provide feedback, it would be very welcome and, I hope, useful for the rest of the project.
Attached are: 1. the patch itself 2. the ".vmx" file I am testing with 3. the ".xml" file produced by my changes
Looking forward to any feedback, criticism, or advice. Thanks in advance.
Best regards Oleh Paliy
First of all, your patch contains changes to files in the po directory. Those have to be removed from the patch.
Anyway, you decided to add a snapshot_mode attribute to the disk element and exposed the VMX values there. I'm not sure that this is a good idea, as scsi0:0.mode affects two aspects.
scsi0:0.mode can basically have three different modes:
- persistent, the default: a disk with this mode takes part in snapshots, and changes to the disk's content persist across domain power cycles and snapshot restores.
- independent-persistent: a disk with this mode does not take part in snapshots, but changes to the disk's content persist across domain power cycles and snapshot restores.
- independent-nonpersistent: a disk with this mode does not take part in snapshots, and changes to the disk's content do not persist across domain power cycles and snapshot restores. This is realized by writing all changes into an additional .vmdk instead of the original .vmdk. This additional .vmdk is automatically deleted on domain power cycles and snapshot restores.
There are two additional but outdated modes, undoable and nonpersistent, that aren't supported anymore.
So the two aspects that scsi0:0.mode affects are snapshots and the persistence of changes. I think it makes more sense to use two attributes on the disk element to expose this:
<disk ... snapshot='yes|no' persistent='yes|no'>
- snapshot=yes persistent=yes maps to scsi0:0.mode=persistent
- snapshot=yes persistent=no is unsupported for ESX
- snapshot=no persistent=yes maps to scsi0:0.mode=independent-persistent
- snapshot=no persistent=no maps to scsi0:0.mode=independent-nonpersistent
Instead of attributes, this could also be represented by subelements of the disk element, like <shareable/> or <readonly/> for filesystem elements, but that's a detail.
The more important question is: is this a general concept, or is it too ESX specific?
Eric is currently discussing/designing an extension to libvirt's snapshot/checkpoint capabilities. At the moment libvirt supports checkpointing a complete domain including RAM image and all storage volumes, but it doesn't support snapshotting single storage volumes or a subset of the storage volumes of a domain.
https://www.redhat.com/archives/libvir-list/2011-June/msg00761.html
Eric suggests extending the <domainsnapshot> and virStorageVol* APIs so that only a subset of the domain's storage volumes can be included in a checkpoint. This approach lets you specify, for each checkpoint, which storage volumes to include. ESX allows something similar with the independent modes, but you cannot define this on a per-snapshot basis; you have to decide it beforehand. Eric's approach is more flexible but doesn't work for ESX. I wonder if we could add the snapshot attribute/subelement to the disk element. This would set the independent mode for ESX and define a preset for other hypervisors like QEMU that will support Eric's more flexible approach.
So when you don't explicitly define which disks to include in a checkpoint in the <domainsnapshot> XML, the snapshot settings from the <domain> XML apply. If there is no preset for snapshot, it defaults to yes. For QEMU you could override the snapshot setting from the <domain> XML in the <domainsnapshot> XML; for ESX you either don't specify this in the <domainsnapshot> XML or have to match the settings from the <domain> XML, due to the way ESX works.
The persistence setting might be more ESX specific, but I think libvirt could realize this for QEMU too, when the domain is using qcow2 images with a base image. In that case libvirt could clear the qcow2 image when the domain is restarted to realize persistent=no. I might be incorrect here, as I'm not a QEMU expert.
In general, I think that this works out quite well. Eric, what do you think about adding a per-disk snapshot preset to your snapshot extension proposal, so we can reuse it for ESX?
On second thought, we might want to use negative words so that we don't add subelements for the defaults. For example,
<disk type='file' device='disk'>
  <source file='[datastore] test1/test1.vmdk'/>
  <target dev='sda' bus='scsi'/>
  <independent/>
  <nonpersistent/>
  <address type='drive' controller='0' bus='0' unit='0'/>
</disk>
would disable snapshots and make the disk non-persistent.
--
Matthias Bolte
http://photron.blogspot.com

On 07/09/2011 09:52 AM, Matthias Bolte wrote:
Anyway, you decided to add a snapshot_mode attribute to the disk element and exposed the VMX values there. I'm not sure that this is a good idea, as scsi0:0.mode affects two aspects.
scsi0:0.mode can basically have three different modes:
- persistent, the default: a disk with this mode takes part in snapshots, and changes to the disk's content persist across domain power cycles and snapshot restores.
- independent-persistent: a disk with this mode does not take part in snapshots, but changes to the disk's content persist across domain power cycles and snapshot restores.
- independent-nonpersistent: a disk with this mode does not take part in snapshots, and changes to the disk's content do not persist across domain power cycles and snapshot restores. This is realized by writing all changes into an additional .vmdk instead of the original .vmdk. This additional .vmdk is automatically deleted on domain power cycles and snapshot restores.
There are two additional but outdated modes, undoable and nonpersistent, that aren't supported anymore.
So the two aspects that scsi0:0.mode affects are snapshots and the persistence of changes. I think it makes more sense to use two attributes on the disk element to expose this:
<disk ... snapshot='yes|no' persistent='yes|no'>
- snapshot=yes persistent=yes maps to scsi0:0.mode=persistent
- snapshot=yes persistent=no is unsupported for ESX
- snapshot=no persistent=yes maps to scsi0:0.mode=independent-persistent
- snapshot=no persistent=no maps to scsi0:0.mode=independent-nonpersistent
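The mapping listed above can be sketched as a small lookup. This is only an illustration of the proposal, not code from the patch or from libvirt: the function name `vmx_mode` and the use of booleans for the two attributes are assumptions made for the example.

```python
def vmx_mode(snapshot: bool, persistent: bool) -> str:
    """Map the proposed snapshot/persistent disk attributes to a VMX mode.

    Hypothetical helper: it mirrors the four cases listed in the mail.
    """
    modes = {
        (True, True): "persistent",
        (False, True): "independent-persistent",
        (False, False): "independent-nonpersistent",
    }
    try:
        return modes[(snapshot, persistent)]
    except KeyError:
        # snapshot=yes persistent=no has no ESX equivalent in the list above.
        raise ValueError(
            "snapshot=yes persistent=no is unsupported for ESX"
        ) from None
```

Rejecting the unsupported combination explicitly, instead of silently picking a neighbouring mode, keeps the conversion to .vmx predictable.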
Hmm, this indeed seems like it might be reasonable to represent both aspects in XML. See also https://www.redhat.com/archives/libvir-list/2011-May/msg00315.html. At stake is whether a disk has a snapshot taken by default, and whether a disk is treated as temporary for the life of the domain (qemu has a -snapshot command line option that treats all disks as temporary, but with better per-volume snapshot abilities, libvirt could certainly offer the same fine-tuning of per-disk as esx appears to offer). So yes, I'll need to fold something like this into my v2 proposal for snapshot handling.
The more important question is: is this a general concept, or is it too ESX specific?
It's sounding generic enough that it will be worth getting it right in the XML.
Eric is currently discussing/designing an extension to libvirt's snapshot/checkpoint capabilities. At the moment libvirt supports checkpointing a complete domain including RAM image and all storage volumes, but it doesn't support snapshotting of single storage volumes or a subset of the storage volumes of a domain.
https://www.redhat.com/archives/libvir-list/2011-June/msg00761.html
Eric suggests extending the <domainsnapshot> and virStorageVol* APIs so that only a subset of the domain's storage volumes can be included in a checkpoint. This approach lets you specify, for each checkpoint, which storage volumes to include. ESX allows something similar with the independent modes, but you cannot define this on a per-snapshot basis; you have to decide it beforehand. Eric's approach is more flexible but doesn't work for ESX. I wonder if we could add the snapshot attribute/subelement to the disk element. This would set the independent mode for ESX and define a preset for other hypervisors like QEMU that will support Eric's more flexible approach.
So when you don't explicitly define which disks to include in a checkpoint in the <domainsnapshot> XML, the snapshot settings from the <domain> XML apply. If there is no preset for snapshot, it defaults to yes. For QEMU you could override the snapshot setting from the <domain> XML in the <domainsnapshot> XML; for ESX you either don't specify this in the <domainsnapshot> XML or have to match the settings from the <domain> XML, due to the way ESX works.
Yes, having a per-disk default in the <domain> XML (applicable to both qemu and ESX), as well as a per-disk override in the <domainsnapshot> (here, qemu can take advantage, but ESX would have to fail if the override is not identical to the domain defaults), does make sense.
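The preset-plus-override rule discussed here could look roughly like the sketch below. Everything in it is hypothetical: the function name, the dictionaries standing in for the <domain> and <domainsnapshot> XML, and the hypervisor strings are assumptions for illustration, since the actual design was still under discussion.

```python
def effective_snapshot(disks, domain_presets, overrides, hypervisor):
    """Decide per disk whether it takes part in a checkpoint.

    disks:          iterable of disk names, e.g. ["sda", "sdb"]
    domain_presets: {disk: bool} taken from the <domain> XML (may omit disks)
    overrides:      {disk: bool} taken from the <domainsnapshot> XML
    hypervisor:     "qemu" or "esx" (illustrative labels only)
    """
    result = {}
    for disk in disks:
        # No preset in the <domain> XML means the disk defaults to "yes".
        preset = domain_presets.get(disk, True)
        chosen = overrides.get(disk, preset)
        if hypervisor == "esx" and chosen != preset:
            # ESX fixes the choice in the disk mode, so an override that
            # differs from the domain preset cannot be honoured.
            raise ValueError(
                f"disk {disk}: override must match the <domain> preset on ESX"
            )
        result[disk] = chosen
    return result
```

With this shape, QEMU callers can override freely, while an ESX caller either passes no overrides or repeats the domain presets.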
The persistence setting might be more ESX specific, but I think libvirt could realize this for QEMU too, when the domain is using qcow2 images with a base image. In that case libvirt could clear the qcow2 image when the domain is restarted to realize persistent=no. I might be incorrect here as I'm not the QEMU expert here.
You're exactly right - qemu can implement per-disk persistent=no by doing a qcow2 wrapper around just the disks that should be reverted when the VM stops running. There might be some interactions with migration to worry about, though.
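As a rough illustration of the qcow2 wrapper idea: a throwaway overlay can be created on top of the real image with `qemu-img create`, and deleted when the VM stops so that the changes are discarded. The helper name, the file names, and the create-then-delete workflow are assumptions for the example; this is not how libvirt implements it, and migration is not considered.

```python
def overlay_create_argv(base_image, overlay_path, base_format="raw"):
    """Build a qemu-img command line that creates a qcow2 overlay.

    Writes made by the guest go to overlay_path; base_image stays untouched.
    Deleting the overlay after the VM stops reverts the disk (persistent=no).
    """
    return [
        "qemu-img", "create",
        "-f", "qcow2",        # format of the new overlay file
        "-b", base_image,     # backing file that holds the original content
        "-F", base_format,    # format of the backing file
        overlay_path,
    ]
```

A caller could pass the resulting list to `subprocess.run(argv, check=True)` before starting the domain and remove `overlay_path` after it shuts down.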
On second thought, we might want to use negative words so that we don't add subelements for the defaults, for example
<disk type='file' device='disk'>
  <source file='[datastore] test1/test1.vmdk'/>
  <target dev='sda' bus='scsi'/>
  <independent/>
  <nonpersistent/>
  <address type='drive' controller='0' bus='0' unit='0'/>
</disk>
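To make the sub-element variant concrete, here is a minimal sketch that reads the two proposed flags from a disk element with Python's standard `xml.etree.ElementTree`. The `<independent/>` and `<nonpersistent/>` elements are only the proposal from this thread, not part of the libvirt schema, and `disk_flags` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

DISK_XML = """
<disk type='file' device='disk'>
  <source file='[datastore] test1/test1.vmdk'/>
  <target dev='sda' bus='scsi'/>
  <independent/>
  <nonpersistent/>
  <address type='drive' controller='0' bus='0' unit='0'/>
</disk>
"""


def disk_flags(disk_xml):
    """Return (snapshot, persistent) for a <disk> element.

    Both flags default to True; the presence of a negative sub-element
    (<independent/> or <nonpersistent/>) switches the matching flag off.
    """
    disk = ET.fromstring(disk_xml)
    snapshot = disk.find("independent") is None
    persistent = disk.find("nonpersistent") is None
    return snapshot, persistent
```

Because the defaults need no sub-elements, existing domain XML without them keeps its current meaning, which is the point of using negative words.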
Now we're doing a bit of bike-shedding - I think there's definitely consensus that this has to be in the XML somewhere, but whether as an attribute or as a sub-element still remains to be decided.
--
Eric Blake eblake@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org

Hello,
So what I see from the discussion is that I have to reimplement my patch one way or another (attribute / sub-element). I am eager to do that as soon as the choice is made (unless you decide otherwise and want someone else to implement it).
Now that you have explained the bigger picture in more detail (hypervisors other than ESX), let me summarize the upcoming changes as a specification:
Value in .vmx file: scsi0:0.mode = "persistent"
  As attributes in <domain> XML: <disk ... snapshot='yes' persistent='yes'>
  As sub-items in <domain> XML: (none)
  Description: Default. Snapshots are in use, changes survive power cycles
Value in .vmx file: scsi0:0.mode = "independent-persistent"
  As attributes in <domain> XML: <disk ... snapshot='no' persistent='yes'>
  As sub-items in <domain> XML: <independent/>
  Description: No snapshot logic is applicable, changes remain
Value in .vmx file: scsi0:0.mode = "independent-nonpersistent"
  As attributes in <domain> XML: <disk ... snapshot='no' persistent='no'>
  As sub-items in <domain> XML: <independent/> <nonpersistent/>
  Description: No snapshots, changes lost after domain power cycles
Value in .vmx file: scsi0:0.mode = "undoable"
  As attributes in <domain> XML: <disk ... snapshot='no' persistent='yes'>
  As sub-items in <domain> XML: <independent/>
  Description: for backwards compatibility
Value in .vmx file: scsi0:0.mode = "nonpersistent"
  As attributes in <domain> XML: <disk ... snapshot='no' persistent='no'>
  As sub-items in <domain> XML: <independent/> <nonpersistent/>
  Description: for backwards compatibility
Also, the reverse conversion has to be implemented (<domain> XML -> .vmx file).
Obviously, the corresponding changes to the <domain> XML format description have to be made here: http://libvirt.org/formatdomain.html#elementsDisks. I can do that as well.
Best regards
Oleh Paliy
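The specification in the message above reads naturally as a lookup from the VMX mode to the proposed attributes. The following is a minimal sketch of the .vmx -> XML direction under that reading. The names `MODE_TO_ATTRS` and `parse_mode_line` are hypothetical, only `scsiX:Y.mode` keys are handled, and this is not the code from either patch.

```python
import re

# VMX mode -> (snapshot, persistent), including the two legacy modes that
# the summary maps onto existing combinations for backwards compatibility.
MODE_TO_ATTRS = {
    "persistent": ("yes", "yes"),
    "independent-persistent": ("no", "yes"),
    "independent-nonpersistent": ("no", "no"),
    "undoable": ("no", "yes"),
    "nonpersistent": ("no", "no"),
}

# Matches lines such as: scsi0:1.mode = "independent-persistent"
_MODE_LINE = re.compile(r'^\s*(scsi\d+:\d+)\.mode\s*=\s*"([^"]*)"\s*$')


def parse_mode_line(line):
    """Return (disk, snapshot, persistent) for a mode line, else None."""
    match = _MODE_LINE.match(line)
    if match is None:
        return None  # not a scsiX:Y.mode entry
    disk, mode = match.groups()
    if mode not in MODE_TO_ATTRS:
        raise ValueError(f"unknown disk mode: {mode!r}")
    snapshot, persistent = MODE_TO_ATTRS[mode]
    return disk, snapshot, persistent
```

For example, `parse_mode_line('scsi0:2.mode = "independent-nonpersistent"')` yields the disk name together with snapshot='no' and persistent='no'. The reverse direction cannot reproduce the legacy modes, since each shares its attribute pair with a current mode.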

2011/7/12 computernews@rambler.ru <computernews@rambler.ru>
Hello,
So what I see from the discussion is that I have to reimplement my patch one way or another (attribute / sub-element). So I am eager to do that as soon as the choice is made (unless you decide otherwise and want someone else to implement that).
Oh, I already went ahead and implemented a first version of this (3 days ago) while thinking about how this could integrate with Eric's checkpointing, but didn't post it yet. It lacks documentation, and both support for the legacy modes and the test coverage are incomplete. Here's the preliminary patch.
-- Matthias Bolte http://photron.blogspot.com

Hello,
Thanks for your time. I really needed this feature badly. Looking forward to testing the final version.
Best regards
Oleh Paliy
2011/7/12 computernews@rambler.ru <mailto:computernews@rambler.ru> <computernews@rambler.ru <mailto:computernews@rambler.ru>>
Hello,
So what I see from the discussion is that I have to reimplement my patch one way or another (attribute / sub-element). So I am eager to do that as soon as the choice is made (unless you decide otherwise and want someone else to implement that).
Oh, I already went ahead and implemented a first version of this (3 days ago) while thinking about how this could integrate with Eric's checkpointing, but didn't post it yet. It's a bit incomplete and lacks documentation, support for the legacy modes is incomplete and test coverage is incomplete.
Here's the preliminary patch.
Now that you have explained the bigger picture with more details (other hypervisors but the ESX) let me summarize an upcoming changes into a specification: Value in .vmx file representation in <domain> XML" as attributes representation in <domain> XML" as sub-items Description scsi0:0.mode = "persistent" <disk ... snapshot='yes' persistent='yes'>
Default. Snapshots are in use, changes survive power cycles scsi0:0.mode = "independent-persistent" <disk ... snapshot='no' persistent='yes'> <independent/> No snapshot logic is applicable, changes remain scsi0:0.mode = "independent-nonpersistent" <disk ... snapshot='no' persistent='no'> <independent/> <nonpersistent/> No snapshots, changes lost after domain power cycles scsi0:0.mode = "undoable" <disk ... snapshot='no' persistent='yes'> <independent/> for backwards compatibility scsi0:0.mode = "nonpersistent" <disk ... snapshot='no' persistent='no'> <independent/> <nonpersistent/> for backwards compatibility
Also the reverse conversion has to be implemented (<domain> XML -> vmx file).
Obviously the appropriate <domain> XML format description changes has to be made here: http://libvirt.org/formatdomain.html#elementsDisks. I can do that as well.
Best regards Oleh Paliy
On 07/09/2011 09:52 AM, Matthias Bolte wrote:
Anyway, you decided to add an snapshot_mode attribute to the disk element and exposed the VMX values there. I'm not sure that this is a good idea as scsi0:0.mode affects two aspects.
scsi0:0.mode can basically have three different modes
- persistent, the default, a disk with this mode will take part in snapshots and changes to the disk's content persist domain power cycles and snapshot restoring.
- independent-persistent, a disk with this mode will not take part in snapshots, but changes to the disk's content persist domain power cycles and snapshot restoring.
- independent-nonpersistent, a disk with this mode will be not take part in snapshots and changes to the disk's content don't persist domain power cycles and snapshot restoring. This is realized by writing all changes into an additional .vmdk instead of the original .vmdk. This additional .vmdk is automatically deleted on domain power cycles and snapshot restoring.
There are two additional but outdated modes undoable and nonpersistent that aren't supported anymore.
So the two aspects scsi0:0.mode affects is snapshot and the persistence of changes. I think it makes more sense to use two attributes for the disk element to expose this.
<disk ... snapshot='yes|no' persistent='yes|no'>
- snapshot=yes persistent=yes maps to scsi0:0.mode=persistent
- snapshot=yes persistent=no is unsupported for ESX
- snapshot=no persistent=yes maps to scsi0:0.mode=independent-persistent
- snapshot=no persistent=no maps to scsi0:0.mode=independent-nonpersistent
Hmm, this indeed seems like it might be reasonable to represent both aspects in XML. See also https://www.redhat.com/archives/libvir-list/2011-May/msg00315.html.
At stake is whether a disk has a snapshot taken by default, and whether a disk is treated as temporary for the life of the domain (qemu has a -snapshot command line option that treats all disks as temporary, but with better per-volume snapshot abilities, libvirt could certainly offer the same fine-tuning of per-disk as esx appears to offer).
So yes, I'll need to fold something like this into my v2 proposal for snapshot handling.
The more important question is: Is this an general concept or is it too ESX specific?
It's sounding generic enough that it will be worth getting it right in the XML.
Eric is currently discussing/designing an extension to libvirt's snapshot/checkpoint capabilities. At the moment libvirt supports checkpointing a complete domain including RAM image and all storage volumes, but it doesn't support snapshotting of single storage volumes or a subset of the storage volumes of a domain.
https://www.redhat.com/archives/libvir-list/2011-June/msg00761.html
Eric suggests extending the <domainsnapshot> and virStorageVol* APIs so that a checkpoint can include only a subset of the domain's storage volumes. This approach lets you specify, for each checkpoint, which storage volumes to include. ESX allows something similar with the independent modes, but you cannot define this on a per-snapshot basis; you have to decide it beforehand. Eric's approach is more flexible but doesn't work for ESX. I wonder if we could add the snapshot attribute/subelement to the disk element. This would allow setting the independent mode for ESX and defining a preset for other hypervisors like QEMU that will support Eric's more flexible approach.
So when you don't explicitly define which disks to include in a checkpoint in the <domainsnapshot> XML, the snapshot settings from the <domain> XML apply. If there is no preset for snapshot, it defaults to yes. For QEMU you could override the snapshot setting from the <domain> XML in the <domainsnapshot> XML; for ESX you either don't specify this in the <domainsnapshot> XML or have to match the settings from the <domain> XML, due to the way ESX works.
Yes, having a per-disk default in the <domain> XML (applicable to both qemu and ESX), as well as a per-disk override in the <domainsnapshot> XML (here, qemu can take advantage, but ESX would have to fail if the override is not identical to the domain defaults), does make sense.
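The default-plus-override rule discussed here can be sketched in C roughly as follows. This is only an illustration of the rule as described in this thread; the enum, the function name and the hypervisor_allows_override flag are hypothetical and do not exist in libvirt.

```c
#include <stdbool.h>

/* Hypothetical tri-state: what the XML says about a disk's snapshot setting. */
enum snapshot_choice { SNAPSHOT_UNSET, SNAPSHOT_NO, SNAPSHOT_YES };

/*
 * domain_preset:     per-disk value from the <domain> XML
 * snapshot_override: per-disk value from the <domainsnapshot> XML
 * hypervisor_allows_override: true for a QEMU-like driver, false for ESX
 *
 * Stores the effective setting in *include and returns 0, or returns -1
 * when an override conflicts with the preset on a hypervisor that cannot
 * honour it.
 */
int
resolve_disk_snapshot(enum snapshot_choice domain_preset,
                      enum snapshot_choice snapshot_override,
                      bool hypervisor_allows_override,
                      bool *include)
{
    /* No preset in <domain> means the disk takes part in snapshots. */
    bool preset = (domain_preset != SNAPSHOT_NO);
    bool wanted;

    if (snapshot_override == SNAPSHOT_UNSET) {
        *include = preset;
        return 0;
    }

    wanted = (snapshot_override == SNAPSHOT_YES);
    if (!hypervisor_allows_override && wanted != preset)
        return -1; /* ESX case: the override must match the domain preset */

    *include = wanted;
    return 0;
}
```

With this sketch, an ESX-like caller that passes a preset of "no" and an override of "yes" gets an error, while a QEMU-like caller gets the override applied.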
The persistence setting might be more ESX specific, but I think libvirt could realize this for QEMU too, when the domain is using qcow2 images with a base image. In that case libvirt could clear the qcow2 image when the domain is restarted to realize persistent=no. I might be incorrect here as I'm not the QEMU expert here.
You're exactly right - qemu can implement per-disk persistent=no by doing a qcow2 wrapper around just the disks that should be reverted when the VM stops running. There might be some interactions with migration to worry about, though.
On second thought, we might want to use negative words so that we don't add subelements for the defaults, for example:
<disk type='file' device='disk'>
  <source file='[datastore] test1/test1.vmdk'/>
  <target dev='sda' bus='scsi'/>
  <independent/>
  <nonpersistent/>
  <address type='drive' controller='0' bus='0' unit='0'/>
</disk>
Now we're doing a bit of bike-shedding - I think there's definitely consensus that this has to be in the XML somewhere, but whether as an attribute or as a sub-element still remains to be decided.
-- Matthias Bolte http://photron.blogspot.com

On 07/12/2011 09:17 AM, Matthias Bolte wrote:
Here's the preliminary patch.
Anyway, you decided to add a snapshot_mode attribute to the disk element and exposed the VMX values there. I'm not sure that this is a good idea, as scsi0:0.mode affects two aspects.
scsi0:0.mode can basically have three different modes:
- persistent, the default: a disk with this mode takes part in snapshots, and changes to the disk's content persist across domain power cycles and snapshot restores.
- independent-persistent: a disk with this mode does not take part in snapshots, but changes to the disk's content persist across domain power cycles and snapshot restores.
- independent-nonpersistent: a disk with this mode does not take part in snapshots, and changes to the disk's content do not persist across domain power cycles and snapshot restores. This is realized by writing all changes into an additional .vmdk instead of the original .vmdk. This additional .vmdk is automatically deleted on domain power cycles and snapshot restores.
There are two additional but outdated modes, undoable and nonpersistent, that aren't supported anymore.
So the two aspects that scsi0:0.mode affects are snapshotting and the persistence of changes. I think it makes more sense to use two attributes on the disk element to expose this.
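Read from the .vmx side, each supported mode fixes two yes/no properties of a disk. A minimal sketch of that decoding in C might look like the following. The struct and function names are hypothetical and not taken from src/vmx/vmx.c, and the exact, case-sensitive string comparison is an assumption.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical container for the two properties a VMX disk mode implies. */
struct disk_mode_flags {
    bool snapshot;    /* disk takes part in snapshots */
    bool persistent;  /* changes survive power cycles and snapshot restores */
};

/*
 * Decode a scsiX:Y.mode value. A NULL mode stands for "key not present",
 * in which case the default mode ("persistent") applies.
 * Returns 0 on success, -1 for an unknown or no longer supported mode.
 */
int
parse_vmx_disk_mode(const char *mode, struct disk_mode_flags *out)
{
    if (mode == NULL || strcmp(mode, "persistent") == 0) {
        out->snapshot = true;
        out->persistent = true;
    } else if (strcmp(mode, "independent-persistent") == 0) {
        out->snapshot = false;
        out->persistent = true;
    } else if (strcmp(mode, "independent-nonpersistent") == 0) {
        out->snapshot = false;
        out->persistent = false;
    } else {
        /* Covers the outdated "undoable" and "nonpersistent" modes too. */
        return -1;
    }
    return 0;
}
```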
<disk ... snapshot='yes|no' persistent='yes|no'>
- snapshot=yes persistent=yes maps to scsi0:0.mode=persistent
- snapshot=yes persistent=no is unsupported for ESX
- snapshot=no persistent=yes maps to scsi0:0.mode=independent-persistent
- snapshot=no persistent=no maps to scsi0:0.mode=independent-nonpersistent
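For the XML to .vmx direction, the four combinations listed above could be turned into a small lookup. This is just a sketch of the proposed mapping; format_vmx_disk_mode is a hypothetical name, and returning NULL for the unsupported combination is an assumed error convention.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Pick the scsiX:Y.mode value for the proposed snapshot/persistent
 * attribute pair. Returns NULL for snapshot=yes persistent=no, the one
 * combination that ESX cannot express.
 */
const char *
format_vmx_disk_mode(bool snapshot, bool persistent)
{
    if (snapshot)
        return persistent ? "persistent" : NULL;

    return persistent ? "independent-persistent"
                      : "independent-nonpersistent";
}
```

A caller would report an "unsupported configuration" error when NULL comes back instead of writing a mode line to the .vmx file.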
snapshot=yes persistent=no probably does not make sense for any hypervisor (if you are going to throw away the disk changes at next boot, then creating a snapshot would make the data persistent after all). More specifically, both the ESX and QEMU implementations of a non-persistent disk (create a wrapper .vmdk or qcow2 file with the original file as backing, and throw away the wrapper when the domain quits running) would imply turning the temporary file into the backing file of yet another image, but as soon as the temporary file is deleted, that other image is rendered broken.
The more important question is: Is this a general concept or is it too ESX specific?
It's sounding generic enough that it will be worth getting it right in the XML.
On second thought, we might want to use negative words so that we don't add subelements for the defaults, for example:
<disk type='file' device='disk'>
  <source file='[datastore] test1/test1.vmdk'/>
  <target dev='sda' bus='scsi'/>
  <independent/>
  <nonpersistent/>
  <address type='drive' controller='0' bus='0' unit='0'/>
</disk>
Now we're doing a bit of bike-shedding - I think there's definitely consensus that this has to be in the XML somewhere, but whether as an attribute or as a sub-element still remains to be decided.
If we argue that only three modes make sense (1. ESX "persistent" - participates in snapshots and changes survive boots; 2. ESX "independent-persistent" - changes survive boots but does not participate in snapshots; 3. ESX "independent-nonpersistent" - changes are thrown away at boot), then maybe this is better rendered as a single attribute:
<disk persistence='snapshot|independent|none'>
Or can someone make the case that independence and persistence are orthogonal, and that there ever would be a use case for a disk that participates in snapshots but gets reverted at next boot?
--
Eric Blake eblake@redhat.com +1-801-349-2682
Libvirt virtualization library http://libvirt.org
participants (3)
- computernews@rambler.ru
- Eric Blake
- Matthias Bolte