
On Wed, Oct 08, 2008 at 11:06:27AM -0500, Anthony Liguori wrote:
Daniel P. Berrange wrote:
On Wed, Oct 08, 2008 at 01:15:46PM +0200, Chris Lalancette wrote:
Daniel P. Berrange wrote:
QEMU defaults to allowing the host OS to cache all disk I/O. This has a couple of problems:
- It is a waste of memory because the guest already caches I/O ops.
- It is unsafe on host OS crash: all unflushed guest I/O will be lost, and there are no ordering guarantees, so metadata updates could be flushed to disk while the journal updates were not. Say goodbye to your filesystem.
- It makes benchmarking more or less impossible / worthless, because what the benchmark thinks are disk writes just sit around in memory, so guest disk performance appears to exceed host disk performance.
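To make the ordering hazard in the second point concrete, here is a toy model in Python. It is only a sketch, not QEMU or kernel code: the class names WriteBackDisk and WriteThroughDisk, the helper functions, and the block contents are all invented for illustration, and a real host page cache is far more involved. It assumes a journaling filesystem that needs its journal record on disk before the metadata it describes.

```python
class WriteBackDisk:
    """Toy disk behind a host write-back cache (hypothetical model, not QEMU code)."""

    def __init__(self):
        self.durable = {}  # blocks that actually reached stable storage
        self.dirty = {}    # blocks acknowledged to the guest but still in host RAM

    def write(self, block, data):
        # The guest sees the write complete as soon as it is cached.
        self.dirty[block] = data

    def writeback(self, block):
        # The host may write dirty blocks back in any order; the caller picks one here.
        if block in self.dirty:
            self.durable[block] = self.dirty.pop(block)

    def crash(self):
        # A host crash discards everything that was only in memory.
        self.dirty.clear()
        return dict(self.durable)


class WriteThroughDisk:
    """Toy disk where a write is durable before it is acknowledged."""

    def __init__(self):
        self.durable = {}

    def write(self, block, data):
        self.durable[block] = data

    def crash(self):
        return dict(self.durable)


def journaled_update(disk):
    # The guest filesystem writes the journal record first, then the metadata.
    disk.write("journal", "intent: grow inode 7")
    disk.write("metadata", "inode 7 grown")


def is_consistent(state):
    # Metadata on disk without its journal record cannot be replayed or undone.
    return "metadata" not in state or "journal" in state


cached = WriteBackDisk()
journaled_update(cached)
cached.writeback("metadata")  # the host happens to flush the metadata block first
after_cached_crash = cached.crash()

direct = WriteThroughDisk()
journaled_update(direct)
after_direct_crash = direct.crash()

print(is_consistent(after_cached_crash))  # False
print(is_consistent(after_direct_crash))  # True
```

In the cached case the guest was told both writes had completed, yet after the crash only the metadata block survives. In the write-through case a crash can only lose writes the guest had not yet been told were complete.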
This patch disables caching on all QEMU guests. NB, Xen has long done this for both PV & HVM guests. QEMU only gained this ability when -drive was introduced, and sadly kept the default at the unsafe cache=on setting.

I'm for this in general, but I'm a little worried about the "performance regression" aspect of this. People are going to upgrade to 0.4.7 (or whatever), and suddenly find that their KVM guests perform much more slowly. This is better in the end for their data, but we might hear a lot of complaints about it.
Yes & no. They will find their guests perform more consistently. With the current system their guests will perform very erratically depending on memory & I/O pressure on the host. If the host I/O cache is empty & has no I/O load, current guests will be "fast",
They will perform marginally better than with cache=off. This is because the Linux host knows more about the underlying hardware than the guest and is able to do smarter read-ahead. When using cache=off, the host cannot perform any sort of read-ahead.
but if the host I/O cache is full and they do something which requires more host memory (e.g. start up another guest), then all existing guests get their I/O performance trashed as the I/O cache has to be flushed out, and future I/O is unable to be cached.
This is not accurate. Dirty pages in the host page cache are not reclaimable until they're written to disk. If you're in a seriously low memory situation, then the thing allocating memory is going to sleep until the data is written to disk. If an existing guest is trying to do I/O, then what things will degenerate to is basically cache=off, since the guest must wait for other pending I/O to complete.
Xen went through this same change and there were not any serious complaints, particularly once it was explained that the previous system had zero data integrity guarantees. The current system merely provides an illusion of performance. Any attempt to show that performance has decreased is impossible, because any attempt to run benchmarks with the existing caching just results in meaningless garbage.
I can't see this bug, but a quick grep of ioemu in xen-unstable for O_DIRECT reveals that they are not in fact using O_DIRECT.
Sorry, it was mistakenly private - fixed now. Xen does use O_DIRECT for the paravirt driver case - blktap is using the combo of AIO+O_DIRECT. QEMU code is only used for the IDE emulation case, which isn't interesting from a performance POV.

Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|