Hi Eric and list,

I had another production VM start pausing itself.  This one had been running for more than 4 years on a 60G LVM volume.  It has had the occasional snapshot during that time, though all have been "removed" using the virt-manager GUI, so I used qemu-img as you suggested.

# qemu-img convert /dev/trk-kvm-02-vg/rt44 -O qcow2 /mnt/scratch/rt44.qcow2

I dd'd the qcow2 image back onto the LV after testing that it boots OK directly from the image, and it is in production again.
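
For reference, the write-back step could be guarded roughly as follows.  This is only a sketch: the helper names fits_in_lv and write_back are made up for illustration, and it assumes GNU stat, dd and blockdev on the host, with the guest shut down.

```shell
#!/bin/sh
# Sketch only. fits_in_lv and write_back are hypothetical helper names.

# Succeeds when an image of $1 bytes fits on a device of $2 bytes.
fits_in_lv() {
    [ "$1" -le "$2" ]
}

# write_back IMAGE LV: copy a qcow2 file onto an LV, refusing to start
# if the file is larger than the LV or fails a consistency check.
# Run as root on the host with the guest shut down.
write_back() {
    img=$1
    lv=$2
    img_bytes=$(stat -c %s "$img") || return 1
    lv_bytes=$(blockdev --getsize64 "$lv") || return 1
    if ! fits_in_lv "$img_bytes" "$lv_bytes"; then
        echo "image ($img_bytes bytes) does not fit on $lv ($lv_bytes bytes)" >&2
        return 1
    fi
    qemu-img check "$img" || return 1
    dd if="$img" of="$lv" bs=4M conv=fsync
}
```

With the paths above this would be called as: write_back /mnt/scratch/rt44.qcow2 /dev/trk-kvm-02-vg/rt44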

The VM itself reports ample space available:

$ df -h
Filesystem                       Size  Used Avail Use% Mounted on
udev                             3.9G     0  3.9G   0% /dev
tmpfs                            789M  8.8M  780M   2% /run
/dev/mapper/RT--vg-root           51G   21G   28G  42% /
tmpfs                            3.9G     0  3.9G   0% /dev/shm
tmpfs                            5.0M     0  5.0M   0% /run/lock
tmpfs                            3.9G     0  3.9G   0% /sys/fs/cgroup
/dev/vda1                        472M  155M  293M  35% /boot
192.168.0.16:/volume1/fileLevel  8.1T  2.5T  5.6T  31% /mnt/nfs/fileLevel
tmpfs                            789M     0  789M   0% /run/user/1000

I would prefer not to get caught out again by this machine pausing.  How can I determine how much space is being used up by 'deleted' internal snapshots?  Do you have any suggested reading on this?
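
One possible host-side check is sketched below.  It assumes the LV holds the qcow2 image directly and that qemu-img is run on the host with the guest shut down; pct_used and report_usage are hypothetical names.  It compares the "Image end offset" line printed by qemu-img check against the size of the LV, which shows how close the image is to filling the volume, but it does not say how much of that is attributable to old snapshots.

```shell
#!/bin/sh
# Sketch only. pct_used and report_usage are hypothetical helper names.

# pct_used END_OFFSET LV_SIZE: integer percentage of the LV occupied.
pct_used() {
    echo $(( $1 * 100 / $2 ))
}

# report_usage LV: run as root on the host, guest shut down.
report_usage() {
    lv=$1
    # List any internal snapshots still recorded in the image.
    qemu-img snapshot -l "$lv"
    # "Image end offset" is the highest byte offset the qcow2 data occupies.
    end=$(qemu-img check "$lv" 2>/dev/null | sed -n 's/^Image end offset: //p')
    size=$(blockdev --getsize64 "$lv") || return 1
    if [ -z "$end" ]; then
        echo "could not read image end offset from qemu-img check" >&2
        return 1
    fi
    echo "qcow2 data reaches $(pct_used "$end" "$size")% of $lv"
}
```

For example: report_usage /dev/trk-kvm-02-vg/rt44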

If I extend the LVM volume but not the guest file system, will snapshots be "at the end" of the LV and "outside" the guest file system?

If I were to expand the guest's ext4 file system, I would want to do it unmounted and from a live CD, but I'm having a heck of a time getting my live distro to use the virtio disk drivers.  Any advice there?

sincerely

Paul O'Rorke
Tracker Software Products (Canada) Limited
www.tracker-software.com
Tel: +1 (250) 324 1621
Fax: +1 (250) 324 1623







On 2018-05-01 02:31 PM, Eric Blake wrote:
On 05/01/2018 04:17 PM, Paul O'Rorke wrote:
I have been using internal snapshots on production qcow2 images for a couple of years, admittedly as infrequently as possible, with one exception.  That exception has had multiple snapshots taken and removed using virt-manager's GUI.

I was unaware of this:
There are some technical downsides to
internal snapshots IIUC, such as inability to free the space used by the
internal snapshot when it is deleted,

This is not an insurmountable difficulty, just one that no one has spent time coding up.


This might explain why this VM recently kept going into a paused state and I had to extend the volume to get it to stay up.  This VM is used for testing our software in SharePoint and we make heavy use of snapshots.  Is there nothing I can do to recover that space?

If you have no internal snapshots, you can do a 'qemu-img convert' to copy just the portion of the image that is actively in use; the copy will use less disk space than the original because it got rid of the now-unused space.  'virt-sparsify' from libguestfs takes this one step further, by also removing unused space within the guest filesystem itself.
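
A rough sketch of that first case, under some assumptions: no_snapshots and compact_image are hypothetical helper names, the guest is shut down, and 'qemu-img snapshot -l' prints nothing when the image holds no internal snapshots.  Since convert copies only the active state, the sketch refuses to run while snapshots are still listed.

```shell
#!/bin/sh
# Sketch only. no_snapshots and compact_image are hypothetical helper names.

# no_snapshots LISTING: succeeds when the snapshot listing text is empty.
no_snapshots() {
    [ -z "$1" ]
}

# compact_image SRC DST: copy only the in-use portion of SRC into DST.
# Run on the host with the guest shut down.
compact_image() {
    src=$1
    dst=$2
    listing=$(qemu-img snapshot -l "$src") || return 1
    if ! no_snapshots "$listing"; then
        echo "internal snapshots still present; convert would not carry them over:" >&2
        echo "$listing" >&2
        return 1
    fi
    # Convert, then print the new image's virtual size and disk size.
    qemu-img convert -O qcow2 "$src" "$dst" &&
        qemu-img info "$dst"
    # Optionally, virt-sparsify DST DST.sparse would also drop space that
    # is unused inside the guest filesystem.
}
```

For example: compact_image /dev/trk-kvm-02-vg/rt44 /mnt/scratch/rt44.qcow2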

In fact, even if you do have internal snapshots, there is probably a sequence of 'qemu-img convert' invocations that can ultimately convert all of your internal snapshots into an external chain of snapshots; but I don't have a ready formula off-hand to point to (experiment on an image you don't care about, before doing it on your production image).

What would be the best practice then for a VM that needs to be able to create and remove snapshots on a regular basis?

In a managed environment, external snapshots probably have the most support for creating and later merging portions of a chain of snapshots, although we could still improve libvirt to make this feel like more of a first class citizen.
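
As an illustration only, an external snapshot cycle through virsh might look like the sketch below.  The helper names, the disk target (for example vda) and the overlay path are assumptions supplied by the caller, and active blockcommit needs reasonably recent libvirt and qemu.

```shell
#!/bin/sh
# Sketch only. diskspec, take_snapshot and merge_snapshot are hypothetical
# helper names; domain, target and overlay path come from the caller.

# diskspec TARGET FILE: build the --diskspec argument for one external overlay.
diskspec() {
    printf '%s,snapshot=external,file=%s\n' "$1" "$2"
}

# take_snapshot DOMAIN NAME TARGET OVERLAY: create a disk-only external
# snapshot; new guest writes go to OVERLAY from this point on.
take_snapshot() {
    virsh snapshot-create-as "$1" "$2" --disk-only --atomic \
        --diskspec "$(diskspec "$3" "$4")"
}

# merge_snapshot DOMAIN NAME TARGET: commit the active overlay back into its
# backing image, pivot the guest onto it, then drop the snapshot metadata.
merge_snapshot() {
    virsh blockcommit "$1" "$3" --active --pivot --verbose &&
        virsh snapshot-delete "$1" "$2" --metadata
}
```

After merge_snapshot completes, the overlay file is no longer in use by the guest and can be removed by hand.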