On Wed, Oct 08, 2008 at 11:06:27AM -0500, Anthony Liguori wrote:
> Daniel P. Berrange wrote:
> >On Wed, Oct 08, 2008 at 01:15:46PM +0200, Chris Lalancette wrote:
> >>Daniel P. Berrange wrote:
> >>>QEMU defaults to allowing the host OS to cache all disk I/O. This has a
> >>>couple of problems:
> >>>
> >>> - It is a waste of memory, because the guest already caches I/O ops.
> >>> - It is unsafe on host OS crash - all unflushed guest I/O will be
> >>>   lost, and there are no ordering guarantees, so metadata updates could
> >>>   be flushed to disk while the journal updates were not. Say goodbye
> >>>   to your filesystem.
> >>> - It makes benchmarking more or less impossible / worthless, because
> >>>   what the benchmark thinks are disk writes just sit around in memory,
> >>>   so guest disk performance appears to exceed host disk performance.
> >>>
> >>>This patch disables caching on all QEMU guests. NB, Xen has long done
> >>>this for both PV & HVM guests - QEMU only gained this ability when
> >>>-drive was introduced, and sadly kept the default at the unsafe
> >>>cache=on setting.
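For anyone not familiar with the option: cache= is a suboption of -drive, so
the change amounts to generating e.g. "-drive file=disk.img,cache=off" rather
than relying on the default. On the host side, cache=off essentially means
the image file is opened with O_DIRECT, so guest I/O bypasses the host page
cache. A rough sketch of the difference - the image name is made up and this
is not QEMU's actual code:

  /* Sketch: the host-side effect of cache=off is that the disk image is
   * opened with O_DIRECT, bypassing the host page cache. O_DIRECT requires
   * the buffer, offset and length to be suitably aligned (4096 here, to
   * be safe). */
  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <stdlib.h>
  #include <unistd.h>

  int main(void)
  {
      /* cache=on (current default): plain buffered open */
      /* int fd = open("guest-disk.img", O_RDWR); */

      /* cache=off: bypass the host page cache */
      int fd = open("guest-disk.img", O_RDWR | O_DIRECT);
      if (fd < 0)
          return 1;

      void *buf;
      if (posix_memalign(&buf, 4096, 4096))       /* aligned buffer */
          return 1;

      /* With O_DIRECT this read goes to the device, not the page cache */
      if (pread(fd, buf, 4096, 0) < 0)
          return 1;

      free(buf);
      close(fd);
      return 0;
  }

The alignment requirement is the main practical cost of O_DIRECT; the upside
is that what the guest thinks has hit the disk actually has.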
> >>I'm for this in general, but I'm a little worried about the
> >>"performance regression" aspect of this. People are going to upgrade
> >>to 0.4.7 (or whatever), and suddenly find that their KVM guests perform
> >>much more slowly. This is better in the end for their data, but we
> >>might hear large complaints about it.
> >
> >Yes & no. They will find their guests perform more consistently. With the
> >current system their guests will perform very erratically depending on
> >memory & I/O pressure on the host. If the host I/O cache is empty & has
> >no I/O load, current guests will be "fast",
> They will perform marginally better than if cache=off. This is because the
> Linux host knows more about the underlying hardware than the guest and
> is able to do smarter read-ahead. When using cache=off, the host cannot
> perform any sort of read-ahead.
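To make the read-ahead point concrete: with buffered I/O the host kernel
prefetches on its own, and an application can even hint at the access pattern
via posix_fadvise(); with O_DIRECT there is no host page cache to prefetch
into. A rough illustration, again with a made-up file name:

  /* Illustration of host-side read-ahead with buffered I/O: the kernel
   * prefetches on its own for sequential reads, and posix_fadvise() can
   * hint at the access pattern. None of this applies to an O_DIRECT fd,
   * since there is no page cache between the guest and the device. */
  #define _POSIX_C_SOURCE 200112L
  #include <fcntl.h>
  #include <unistd.h>

  int main(void)
  {
      int fd = open("guest-disk.img", O_RDONLY);    /* buffered, cache=on */
      if (fd < 0)
          return 1;

      /* Tell the host we will read sequentially - it may read ahead more
       * aggressively into its page cache. */
      posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);

      char buf[65536];
      while (read(fd, buf, sizeof(buf)) > 0)
          ;                                         /* later reads often hit cache */

      close(fd);
      return 0;
  }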
> >but if host I/O cache is full
> >and they do something which requires more host memory (eg start up another
> >guest), then all existing guests get their I/O performance trashed as the
> >I/O cache has to be flushed out, and future I/O is unable to be cached.
> This is not accurate. Dirty pages in the host page cache are not
> reclaimable until they're written to disk. If you're in a seriously low
> memory situation, then the thing allocating memory is going to sleep
> until the data is written to disk. If an existing guest is trying to do
> I/O, then what things will degenerate to is basically cache=off, since
> the guest must wait for other pending I/O to complete.
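The distinction being drawn here is the usual buffered-write one: write()
completes as soon as the data is a dirty page in the host page cache, and
only fdatasync() (or writeback) actually puts it on disk - a host crash
before that point loses it. A minimal sketch, with a made-up file name:

  /* With buffered (cache=on) I/O, write() completes once the data is in
   * the host page cache; it sits there as a dirty page that is only
   * reclaimable after writeback. fdatasync() is what blocks until the
   * data has actually reached the disk. */
  #define _POSIX_C_SOURCE 200809L
  #include <fcntl.h>
  #include <string.h>
  #include <unistd.h>

  int main(void)
  {
      int fd = open("guest-disk.img", O_WRONLY);
      if (fd < 0)
          return 1;

      char buf[4096];
      memset(buf, 0xab, sizeof(buf));

      /* Returns quickly: data is now a dirty page in the host page cache */
      if (pwrite(fd, buf, sizeof(buf), 0) < 0)
          return 1;

      /* Blocks until the data is on disk; only after this is it safe
       * against a host crash, and only then is the page reclaimable. */
      if (fdatasync(fd) < 0)
          return 1;

      close(fd);
      return 0;
  }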
> >Xen went through this same change and there were not any serious
> >complaints, particularly when it was explained that the previous system
> >had zero data integrity guarantees. The current system merely provides an
> >illusion of performance - any attempt to show that performance has
> >decreased is impossible because any attempt to run benchmarks with
> >existing caching just results in meaningless garbage.
> >
> >https://bugzilla.redhat.com/show_bug.cgi?id=444047
> I can't see this bug, but a quick grep of ioemu in xen-unstable for
> O_DIRECT reveals that they are not in fact using O_DIRECT.
Sorry, it was mistakenly private - fixed now.
Xen does use O_DIRECT for the paravirt driver case - blktap uses the combo
of AIO+O_DIRECT. The QEMU code is only used for the IDE emulation case, which
isn't interesting from a performance POV.
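For anyone curious what that combination looks like, here is a minimal sketch
of AIO+O_DIRECT using Linux native AIO via libaio (link with -laio). It is
only an illustration with a made-up image name, not blktap's actual code:

  /* Minimal sketch of the AIO + O_DIRECT combination: the image is opened
   * with O_DIRECT (no host page cache) and a read is submitted through the
   * Linux native AIO interface provided by libaio. */
  #define _GNU_SOURCE
  #include <fcntl.h>
  #include <libaio.h>
  #include <stdlib.h>
  #include <unistd.h>

  int main(void)
  {
      int fd = open("guest-disk.img", O_RDWR | O_DIRECT);
      if (fd < 0)
          return 1;

      void *buf;
      if (posix_memalign(&buf, 4096, 4096))       /* O_DIRECT needs alignment */
          return 1;

      io_context_t ctx = 0;
      if (io_setup(32, &ctx) < 0)                 /* AIO context, 32 in-flight */
          return 1;

      struct iocb cb;
      struct iocb *cbs[1] = { &cb };
      io_prep_pread(&cb, fd, buf, 4096, 0);       /* async 4k read at offset 0 */

      if (io_submit(ctx, 1, cbs) != 1)            /* queue it without blocking */
          return 1;

      struct io_event ev;
      io_getevents(ctx, 1, 1, &ev, NULL);         /* wait for completion */

      io_destroy(ctx);
      free(buf);
      close(fd);
      return 0;
  }

The point of the pairing is that O_DIRECT avoids double caching while AIO
keeps many requests in flight, so you don't pay for the lost page cache with
serialized, synchronous I/O.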
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|