[libvirt-users] Live migration with non-shared storage leads to corrupted file system

Hi,

We have the following environment for live migration with non-shared storage between two nodes:

Host OS: RHEL 6.3
Kernel: 2.6.32-279.el6.x86_64
Qemu-kvm: 1.2.0
libvirt: 0.10.1

and use "virsh" to do the job as

virsh -c 'qemu:///system' migrate --live --persistent --copy-storage-all <guest-name> qemu+ssh://<target-node>/system

The command itself returns no error, and the migrated domain starts fine on the destination node. But when I log into the migrated domain, some commands fail immediately. And if I shut down the domain, it won't boot up any more, complaining about a corrupted file system. Furthermore, I can confirm that the domain worked flawlessly before the migration, after thorough testing.

The log file in /var/log/libvirt/qemu looks fine, without any warnings or errors. The only error messages I can observe are in /var/log/libvirt/libvirtd.log:

2012-11-25 10:00:55.001+0000: 15398: warning : qemuDomainObjBeginJobInternal:838 : Cannot start job (query, none) for domain testVM; current job is (async nested, migration out) owned by (15397, 15397)
2012-11-25 10:00:55.001+0000: 15398: error : qemuDomainObjBeginJobInternal:842 : Timed out during operation: cannot acquire state change lock
2012-11-25 10:00:57.009+0000: 15393: error : virNetSocketReadWire:1184 : End of file while reading data: Input/output error

I also noticed that the raw image file used by the migrated domain has different sizes (as reported by "du") before and after the migration.

Is there anybody with similar experience with live migration on non-shared storage? It apparently leads to failed migrations in libvirt but no critical errors are ever reported.

Brett
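To tell whether the size difference reported by "du" is just a change in sparse allocation (raw images are often sparse, and a block copy may allocate them fully) rather than actual data corruption, the virtual size, the allocated size, and a checksum can be compared on the source and destination hosts. The image path below is only a placeholder, not taken from the thread:

# virtual size vs. bytes actually allocated on disk
qemu-img info /var/lib/libvirt/images/testVM.img
du -h --apparent-size /var/lib/libvirt/images/testVM.img
du -h /var/lib/libvirt/images/testVM.img

# content comparison, only meaningful while the guest is shut off on both hosts
md5sum /var/lib/libvirt/images/testVM.img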

On Sun, Nov 25, 2012 at 06:57:19PM +0800, Xinglong Wu wrote:
Is there anybody with similar experience with live migration on non-shared storage? It apparently leads to failed migrations in libvirt but no critical errors are ever reported.
Make sure you have your driver cache set to "none". I've seen similar things happen with other cache settings, but with "none" it works just fine (though of course that might impose an I/O performance penalty in some cases). I think the default is "unsafe" in 0.9.7 and later? It would be nice if the cache settings were documented more clearly.
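For reference, a minimal disk definition with the cache mode set explicitly would look roughly like this (the image path and target device are placeholders, not taken from the thread):

<disk type='file' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <source file='/var/lib/libvirt/images/guest.img'/>
  <target dev='vda' bus='virtio'/>
</disk>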

Thanks for the comments. I forgot to mention it, but yes, I do have "cache=none" configured for the migrated domain, as

<driver name='qemu' type='raw' cache='none'/>

in my XML file, and I double-checked it before and after the migration. In fact, if the cache is not set to "none", virsh will issue a warning and refuse to do the live migration.

Brett

Dear all,

Libvirt creates a default network called virbr0, which is used for NAT. I have two interfaces on my computer: eth0 and eth1. My question is: which interface does virbr0 forward traffic to?

Many thanks!

Regards,
zhangzhang

brctl show

On 11/26/2012 04:48 AM, Timon Wang wrote:
brctl show
That command won't show anything relevant to the question. The virbr0 bridge created by libvirt for the default network is not attached directly to any physical interface, so brctl will show nothing connected to it other than the guest tap devices.
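For illustration only (bridge id and interface names will differ from host to host), a host running the default network with one guest typically shows something like the following, with libvirt's virbr0-nic dummy device and the guest's vnet0 tap attached, and neither eth0 nor eth1 anywhere in the list:

# brctl show
bridge name     bridge id               STP enabled     interfaces
virbr0          8000.525400a1b2c3       yes             virbr0-nic
                                                        vnet0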
On 2012-11-26 at 5:10 PM, "张章" <zhang_zhang@live.com> wrote:
Dear all: Libvirt creates a default network called virbr0, which is used for NAT. I have two interfaces on my computer: eth0 and eth1. My question is: which interface does virbr0 forward traffic to?
All traffic from guests connected to a libvirt NATed network like the "default" network must go through the host's IP routing stack to get to the outside, and that is where the decision is made (on a per-packet basis) about which interface to use for egress. So the answer is: each packet will be sent out the appropriate interface for that packet's destination address, according to the host's IP routing table.

Note that you can limit the outgoing traffic from a particular network to only be allowed on a particular interface (by adding a "dev='ethX'" attribute to the <forward> element of the network), but that will only serve to block traffic that would have been forwarded via other interfaces; it won't re-route it to the allowed interface.

(BTW, please don't ask a new question as a reply to an unrelated earlier message to the list - even if you change the subject, any proper email client will bury it in the replies to the original message. Instead, create a new message.)
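To illustrate the <forward> restriction mentioned above, a rough sketch of a NAT network definition limited to one physical interface, together with a query of the host's routing table, might look like this (the interface name and addresses are placeholders; the network definition can be edited with "virsh net-edit default"):

<network>
  <name>default</name>
  <forward mode='nat' dev='eth0'/>
  <bridge name='virbr0' stp='on' delay='0'/>
  <ip address='192.168.122.1' netmask='255.255.255.0'>
    <dhcp>
      <range start='192.168.122.2' end='192.168.122.254'/>
    </dhcp>
  </ip>
</network>

# ask the host which egress interface it would pick for a given destination
ip route get 8.8.8.8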
participants (5):
- Henrik Ahlgren
- Laine Stump
- Timon Wang
- Xinglong Wu
- 张章