----- Original Message -----
From: "Kashyap Chamarthy" <kchamart(a)redhat.com>
Looking at the `qemu-img` source[1], 'cache=writeback' seems to be the
default. That's also corroborated by this post[2] on Rich's blog (Rich is
the libguestfs/virt-tools lead developer).
[1]
http://git.qemu.org/?p=qemu.git;a=blob;f=qemu-img.c;h=d4518e724f848a6ff8f...
[2]
http://rwmj.wordpress.com/2013/09/02/new-in-libguestfs-allow-cache-mode-t...
> Which is correct?
"cache=writeback"
> How is the cache mode set by default (if cache= is not
> specified)?
It's compiled into the binary.
> My second question is can cache=none be used safely on a local ext4
> filesystem with no BBU? Since ext4 uses barriers, would writing to
> these qcow2 image files be safe? The kernel documentation about
> barriers states that "Write barriers enforce proper on-disk ordering
> of journal commits, making volatile disk write caches safe to use, at
> some performance penalty". Does this apply to qcow2 VM images?
FWIW, in my test environments (where, I should admit, there's not a
whole lot of I/O activity), I use:
$ qemu-img create -f qcow2 -o preallocation=metadata test1.qcow2 8G
Followed by an `fallocate`:
$ fallocate -l 8589934592 test1.qcow2
Then, I used to invoke QEMU with "cache=none" (setting it in libvirt's guest
XML), but lately started using the default "cache=writeback" after
I learnt about the bug from Rich's blog above.
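As a rough illustration of the two points above (a sketch, not something taken from the original commands): the byte count passed to `fallocate` is simply 8 GiB, and the cache mode can also be spelled out on a QEMU command line rather than through libvirt. The `qemu-system-x86_64` binary name and the `-drive` option string below are assumptions for illustration; the snippet only prints the command and does not start a guest.

```shell
# 8 GiB in bytes; this should match the length given to fallocate above.
size_bytes=$((8 * 1024 * 1024 * 1024))
echo "$size_bytes"

# Assumed example: spell out the cache mode instead of relying on the default.
cache_mode=none
drive_opt="file=test1.qcow2,format=qcow2,cache=${cache_mode}"

# Only print the command line; nothing is launched here.
echo "qemu-system-x86_64 -drive ${drive_opt}"
```

Changing `cache_mode` to `writeback` gives the equivalent line for the default discussed above.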
--
/kashyap
Kashyap,
Thanks for the clarification. Rich's article seems to indicate that cache=writeback
is safe:
"writeback is the new, safe default. Flush commands are obeyed so as long as
you’re using a journalled filesystem or issue guestfs_sync calls your data
will be safe."
However, I have several VMs running on a server with qemu-kvm 1.4.0 and libguestfs
1.14.8 (older because this is Ubuntu 12.04) using the default cache mode,
cache=writeback. Recently this server's UPS experienced a fault, so the host and
all of the VMs lost power.
After booting back up, I discovered that the filesystems on 3 of the guests were
corrupted, requiring an fsck with a lot of fixes. After fsck finished, data that had
been written within the last 24-48 hours on the disks appears to have been corrupted.
This makes me think that the data was never synced back to the disk, and would
indicate that I can't trust the guest's journalled filesystem. This data was written
several hours before the crash, so I would think that should be enough time for an
fsync to be called.
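To make the distinction concrete, here is a minimal sketch of a write that is explicitly flushed from inside a guest, as opposed to one left sitting in the page cache. It assumes GNU `dd` (for `conv=fsync`); the temporary file is purely illustrative, and whether the flush actually reaches the physical disk still depends on the host honouring it.

```shell
# Hypothetical scratch file; any path on the guest filesystem would do.
tmpfile=$(mktemp)

# Write one 4 KiB block and fsync the file before dd exits (GNU dd).
dd if=/dev/zero of="$tmpfile" bs=4096 count=1 conv=fsync 2>/dev/null

# Confirm the size of what was written.
written=$(wc -c < "$tmpfile")
echo "$written"

rm -f "$tmpfile"
```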
How can I guarantee the safety of written data on guests whose images are stored on
the following types of filesystems:
* local ext4 filesystem on a md RAID (no BBU)
* NFS mountpoint with the "sync" option
Thanks,
Andrew