Hi there,
We have a production OpenStack cloud, currently on QEMU 1.0 & 1.5 with
Libvirt 1.1.1. We're using local storage with storage migration when
we need to move machines around.
Since we built this thing we've noticed that with network storage
attached (we've seen this with both iSCSI and Ceph RBD targets, and
are mainly interested in the latter) the migration copies all of the
network disk contents as well, which for any non-toy disk size pretty
much renders it useless, since the migration is then bounded not just
by the guest memory size and activity but also by the block storage
size. E.g., an idle guest with 8GB RAM and a 300GB Ceph RBD attached
takes ~3 hours to migrate (over a 20GbE network), and we observe
balanced TX/RX on both the source and destination: the source reads
blocks from the Ceph cluster, streams them to the destination, and
the destination in turn writes them back to the same Ceph cluster.
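For reference, the RBD volume is attached as a network disk in the
domain XML, roughly like this (pool/image and monitor names are
placeholders):

  <disk type='network' device='disk'>
    <driver name='qemu' type='raw' cache='none'/>
    <source protocol='rbd' name='volumes/volume-00000001'>
      <host name='ceph-mon1' port='6789'/>
      <host name='ceph-mon2' port='6789'/>
    </source>
    <target dev='vdb' bus='virtio'/>
  </disk>

i.e. there is no local file or block device backing it on either
hypervisor; the source and destination hosts both talk to the same
Ceph cluster directly.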
Looking at Libvirt's qemu_migration.c, I see that
qemuMigrationDriveMirror skips shared, read-only and source-less
disks. But I'm not sure why network disks in general, and RBD in
particular, aren't considered shared for this purpose. From my naive
reading, the checks in qemuMigrationIsSafe (which explicitly pass
NETWORK+RBD disks) seem to confirm this, at least for RBD. Is this a
bug?
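If I'm reading it right, the only obvious way to make
qemuMigrationDriveMirror skip such a disk today would be to mark it
explicitly as shared, e.g.:

  <disk type='network' device='disk'>
    <driver name='qemu' type='raw' cache='none'/>
    <source protocol='rbd' name='volumes/volume-00000001'>
      <host name='ceph-mon1' port='6789'/>
    </source>
    <target dev='vdb' bus='virtio'/>
    <shareable/>
  </disk>

but <shareable/> implies concurrent writers, which isn't the case for
these volumes, so that feels like a workaround rather than a fix; it
seems like the NETWORK+RBD case ought to be handled in the skip logic
itself.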
--
Cheers,
~Blairo