[libvirt] flush guest page cache and suspend?

Hi, I'm new to this list. I've searched the archives (and Google) but I haven't found an answer.

Recently I moved to CentOS 5.4, with libvirt-0.6.3 and kvm-83. They are rather new compared to the 5.3 ones, so I'm still investigating the new capabilities.

My guests' disk images are single LV devices, /dev/vg_virt/<guestname>, usually with two partitions: the root fs and swap space.

I'd like to run backups on the node only, taking snapshots of the LVs. I'm already doing that, actually. Everything works fine, using bacula (with one minor glitch, but that's for another list).

Now I wonder if it makes sense to suspend the guest before creating the snapshot:

virsh suspend <guestname>
lvcreate --snapshot ...
virsh resume <guestname>
... perform backup ...
lvremove -f <the snapshot>

But thinking about it a second time, it seems it doesn't add anything to the equation: the filesystem is still potentially unclean, with data yet to be written still in the guest page cache. The filesystem is ext3, so it's not that bad (it recovers a consistent state from the journal; with standard options even data-consistent, not just metadata-consistent). Yet it's like backing up a (potentially badly) crashed filesystem. While I trust the recovery step of ext3, my feeling is that it can't be 100% reliable. I have (rarely) seen ext3 crash and ask for fsck at boot, journal or not (for truth's sake, I can't rule out a hard disk problem in those cases).

So, first question: does the suspend command cause a flush in the guest OS (details: both guest and node are CentOS 5.4, the hypervisor is qemu-kvm)? I guess not (otherwise I wouldn't be here). If not, what are the options to force the sync?

Ideally, there would be a single atomic 'sync & suspend'. In practice, I can think of some workarounds: ssh <guest> -c sync, or the very old-fashioned way of enabling the 'sync' user and logging in from the serial console, or issuing a sysrq-s, again on the console.
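For the record, the suspend/snapshot/resume cycle above can be sketched as a script. This is only an illustration: the guest name, volume group, and 1G snapshot size are placeholders for my setup, and the script defaults to printing the commands instead of executing them.

```shell
#!/bin/sh
# Sketch of the suspend -> snapshot -> resume -> backup cycle from the post.
# GUEST, VG and the 1G snapshot size are illustrative placeholders.
GUEST=wtitv
VG=vg_virt
SNAP="${GUEST}-snap"
DRYRUN=${DRYRUN:-1}     # default to dry-run; set DRYRUN= to really execute

run() {
    if [ -n "$DRYRUN" ]; then
        echo "WOULD RUN: $*"
    else
        "$@"
    fi
}

run virsh suspend "$GUEST"
run lvcreate --snapshot --size 1G --name "$SNAP" "/dev/$VG/$GUEST"
run virsh resume "$GUEST"
# ... point the backup (bacula, dd, ...) at /dev/$VG/$SNAP here ...
run lvremove -f "/dev/$VG/$SNAP"
```

The point of resuming before the backup is that the guest is only paused for the instant it takes lvcreate to establish the snapshot, not for the whole backup run.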
I'm interested in the latter, but I wasn't able to trigger the sysrq from either 'virsh console <guestname>' or 'screen `virsh ttyconsole <guestname>`' (tried minicom, too). The serial console is on a pty, and I'm not even sure you can generate a break on a pty (and yes, I remembered to sysctl -w kernel.sysrq=1 on the guest).

I know Xen has 'xm sysrq', but this is kvm. Is there anything similar? I think it can be done if you invoke qemu-kvm from the command line with -nographic (it multiplexes the console and the monitor on stdin/stdout, and you can send a 'break' with C-a b, I think; I've never tried).

So, question 2: is there a way to send a sysrq-s to the guest?

My fallback plan is to map the serial line over telnet instead of a pty, and then figure out a way to send the break (which I think is part of the telnet protocol) from the command line. I'd rather not mess with telnet and local port assignment to the guests, if not necessary. My real fallback plan is to revert to shutdown/snapshot/create, which is overkill for a daily backup.

Question 3: does it _really_ make sense to try to sync the guest OS page cache before taking the snapshot? Or is it just me being paranoid?
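As an aside on question 2: later libvirt releases (0.9.3 and newer) grew a 'virsh send-key' command that makes sysrq injection straightforward; it does not exist in 0.6.3, where the qemu monitor's 'sendkey' command is the likely route instead. A hedged sketch that only builds the command line, since it assumes a newer libvirt than the one in this thread:

```shell
#!/bin/sh
# Hedged sketch: injecting sysrq-s into a KVM guest.
# 'virsh send-key' needs libvirt >= 0.9.3, so it is not available on 0.6.3;
# this function only builds the command line, for illustration.
sysrq_cmd() {
    # $1 = guest name, $2 = sysrq letter (S = emergency sync)
    echo "virsh send-key $1 KEY_LEFTALT KEY_SYSRQ KEY_$2"
}

sysrq_cmd wtitv S
# On the qemu monitor itself (e.g. C-a c on a -nographic console), the
# equivalent would be:   sendkey alt-sysrq-s
```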
For reference, here's the qemu command line of one of the guests:

/usr/libexec/qemu-kvm -S -M pc -m 512 -smp 1 -name wtitv \
    -uuid 445d0fe5-c1b6-4baf-a186-da1fe021158c -monitor pty \
    -pidfile /var/run/libvirt/qemu//wtitv.pid -boot c \
    -drive file=/dev/vg_virt/wtitv,if=ide,index=0,boot=on \
    -net nic,macaddr=00:16:3e:57:45:4f,vlan=0,model=e1000 \
    -net tap,fd=14,script=,vlan=0,ifname=vnet0 \
    -serial pty -parallel none -usb -usbdevice tablet -vnc 127.0.0.1:0

and here's the libvirt XML config file for the same guest:

<domain type='kvm'>
  <name>wtitv</name>
  <uuid>445d0fe5-c1b6-4baf-a186-da1fe021158c</uuid>
  <memory>524288</memory>
  <currentMemory>524288</currentMemory>
  <vcpu>1</vcpu>
  <os>
    <type arch='x86_64' machine='pc'>hvm</type>
    <boot dev='hd'/>
  </os>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <features>
    <acpi/>
  </features>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='block' device='disk'>
      <source dev='/dev/vg_virt/wtitv'/>
      <target dev='hda' bus='ide'/>
    </disk>
    <interface type='bridge'>
      <mac address='00:16:3e:57:45:4f'/>
      <source bridge='br1'/>
      <model type='e1000'/>
    </interface>
    <console type='pty'>
      <target port='0'/>
    </console>
    <input type='tablet' bus='usb'/>
    <graphics type='vnc' autoport='yes'/>
  </devices>
</domain>

TIA,
.TM.

Hi. I cannot answer for the sync when the guest is suspended, but I think the best way is to save your guest (virsh save guest /path/to/backup), take your snapshot, restore the guest (virsh restore /path/to/backup), then run the backup (of the snapshot and the saved state file).

I've written a script which does exactly that:

http://repo.firewall-services.com/misc/virt/virt-backup.pl

It takes care of:
- saving the guest (or just suspending it)
- taking the snapshot (if possible)
- restoring the guest (or just resuming it)
- dumping the snapshot, optionally compressing it on the fly
- cleaning everything up when you're done

Regards,
Daniel

On Wednesday, 4 November 2009 at 18:16 +0100, Marco Colombo wrote:
[quoted original message trimmed]
--
Libvir-list mailing list
Libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list

--
Daniel Berteaud
FIREWALL-SERVICES SARL.
Société de Services en Logiciels Libres
Technopôle Montesquieu
33650 MARTILLAC
Tel : 05 56 64 15 32
Fax : 05 56 64 15 32
Mail: daniel@firewall-services.com
Web : http://www.firewall-services.com
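The save/snapshot/restore sequence Daniel describes can be sketched as follows. The names, paths, and 1G snapshot size are illustrative placeholders, and the script only prints the commands (his virt-backup.pl adds error handling and optional compression on top of this):

```shell
#!/bin/sh
# Sketch of the save -> snapshot -> restore -> backup cycle.
# GUEST, VG, STATE and the 1G snapshot size are illustrative placeholders.
GUEST=wtitv
VG=vg_virt
STATE="/var/backup/${GUEST}.state"

plan() { echo "$*"; }   # print the commands instead of executing them

plan virsh save "$GUEST" "$STATE"
plan lvcreate --snapshot --size 1G --name "${GUEST}-snap" "/dev/$VG/$GUEST"
plan virsh restore "$STATE"
# ... back up both /dev/$VG/${GUEST}-snap and $STATE here, then:
plan lvremove -f "/dev/$VG/${GUEST}-snap"
```

Unlike suspend/resume, save/restore captures the guest's RAM as well, so the snapshot plus the state file together form a crash-free, consistent image of the running guest.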

On 11/04/2009 06:29 PM, Daniel Berteaud wrote:
[quoted reply trimmed]
Wow, what a nice script, thank you. If I understand it right, you also save the VM state, and any dirty pages are in there. The combo (dd of snapshots) + (VM state) is a consistent backup of the guest.

The only drawback I can see is that the backup can only be restored by restoring the whole guest. The filesystem dump(s) alone are potentially not in a sane state. In order to access the data safely you need to restore a copy of the guest VM in an isolated environment, log in, and shut it down cleanly. Then you can access the disk images.

Accessing the files is a kind of requirement if you run bacula. Of course it can be instructed to back up a single big file, but that completely defeats the idea of fine-grained restore (even of a single file). Bacula maintains a catalog of all files on all clients, even multiple versions at different times; that's part of its strength.

Anyway, it's still possible to do something like this: save/snapshot/restore/dump on the node, then copy everything to the backup server (or a dedicated host close to it), restore the copy of the guest again in an isolated environment, do a clean shutdown, then back up. That depends on the configuration of the guest, of course... in my case, the network is attached to a bridge that is configured at boot time by the node, so all I need is an isolated bridge device (with no real interfaces attached) on the restore host... although, the disks being LVs, each with a different name and device path, a bit of adjustment to the XML configuration is required. Nothing that a perl (or python) script can't handle.

This works for full backups; for differential/incremental ones it is kind of overkill, as you copy the LVs entirely every time. Still, it may be sensible to instruct bacula to make incrementals: they are supposed to 1) reduce bandwidth and 2) reduce storage requirements. 1) is defeated, but 2) still holds.

But I think there must be a better way. :) All I need is to sync the page cache on the guest just before the snapshot.
sysrq-s would do. A possible alternative is the suspend-to-mem and suspend-to-disk stuff. If they do a sync, it could be feasible to trigger a suspend-to-mem on the guest, suspend the VM, snapshot, restore the VM, and restart the guest (I have no idea how to do that, though).

.TM.