The patch works well. Thanks for your reply.
Sorry for the late reply. I have been busy with other work recently.
-----Original Message-----
From: Daniel P. Berrange [mailto:berrange@redhat.com]
Sent: Monday, October 28, 2013 7:55 PM
To: Wangyufei (A)
Cc: libvir-list(a)redhat.com; Xuchao (H); Wangrui (K)
Subject: Re: [libvirt] When vm's status file being left over, some persistent
but inactive vms will be lost by libvirtd after libvirtd rebooting.
On Fri, Oct 18, 2013 at 03:00:22AM +0000, Wangyufei (A) wrote:
> Hello,
> I found the following problem:
> A VM's status file may be left over in /var/run/libvirt/qemu in some
> situations, such as a host reboot. When a VM's status file is left over,
> some persistent but inactive VMs are lost by libvirtd after it is
> restarted. You can reproduce the problem as follows:
> 1. Create a VM and start it with the commands: virsh define vm-xml and
>    virsh start vm-name.
> 2. Stop libvirtd with the command: service libvirtd stop.
> 3. Kill the qemu process related to the VM, so that the VM's status file
>    is left over.
> 4. Start libvirtd.
> After starting the libvirtd service, "virsh list --all" no longer shows
> the VM; libvirtd has lost it.
> We expect the VM to be shown with shutoff status. Is that correct?
>
> The reason for the problem is:
> During startup, libvirtd first loads the VM status files under
> /var/run/libvirt/qemu, creates a virDomainObj for each VM, and adds it to
> the driver->domains list.
> Then it creates a thread to connect to the related qemu process for each
> virDomainObj in the domains list. Because the qemu process has been
> killed, connecting to qemu fails. When the connection fails, the
> connection thread does the following:
> 1. Check whether vm->persistent is 1.
> 2. If vm->persistent is not 1, call qemuDomainRemoveInactive() to remove
>    the virDomainObj.
> 3. The following call sequence then occurs:
>    qemuDomainRemoveInactive() --> virDomainObjListRemove() -->
>    virHashRemoveEntry(). Around virHashRemoveEntry(), domlist and dom
>    are locked and unlocked in sequence.
> The problem with the above steps is that vm->persistent may already have
> been set to 1 by the libvirtd main thread by the time the connection
> thread calls virHashRemoveEntry() to remove the dom. That is, a
> persistent virDomainObj is removed during libvirtd startup.
>
> There are two ways to resolve the problem:
> 1. Expand the scope of the virDomainObj and virDomainObjList locks: lock
>    both objects in the connection thread before checking vm->persistent.
> 2. Check vm->persistent again before calling virHashRemoveEntry().
>
> Do you agree that what is described above is a problem? Which of the two
> ways listed above is more suitable for resolving it, or is there a better
> idea? Any suggestions?
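
The check-then-remove window described in the report, and option 2 (re-checking the flag under the locks), can be illustrated with a small Python model. This is only a sketch and not libvirt code: Domain, DomainList, reconnect_failed() and the between_check_and_remove hook are hypothetical stand-ins for virDomainObj, virDomainObjList, the connection thread and the point where the main thread happens to run. The interleaving is forced deterministically instead of relying on real thread timing.

```python
import threading


class Domain:
    """Hypothetical stand-in for virDomainObj; only the fields the race needs."""

    def __init__(self, name):
        self.name = name
        self.persistent = False
        self.lock = threading.Lock()


class DomainList:
    """Hypothetical stand-in for virDomainObjList."""

    def __init__(self):
        self.lock = threading.Lock()
        self.doms = {}

    def add(self, dom):
        with self.lock:
            self.doms[dom.name] = dom

    def remove_unchecked(self, dom):
        # Reported behaviour: remove without re-reading dom.persistent.
        with self.lock:
            self.doms.pop(dom.name, None)

    def remove_if_transient(self, dom):
        # Option 2: re-check persistent while holding both locks.
        with self.lock, dom.lock:
            if dom.persistent:
                return False
            self.doms.pop(dom.name, None)
            return True


def mark_persistent(dom):
    # Models the main thread loading the persistent config.
    with dom.lock:
        dom.persistent = True


def reconnect_failed(domlist, dom, between_check_and_remove, recheck):
    # Models the connection thread after connecting to qemu has failed.
    with dom.lock:
        stale_view = dom.persistent
    between_check_and_remove()  # window in which the main thread may run
    if stale_view:
        return
    if recheck:
        domlist.remove_if_transient(dom)
    else:
        domlist.remove_unchecked(dom)


def run(recheck):
    """Return True if the domain survives startup in this model."""
    domlist, dom = DomainList(), Domain("vm1")
    domlist.add(dom)
    reconnect_failed(domlist, dom, lambda: mark_persistent(dom), recheck)
    return "vm1" in domlist.doms
```

In this model run(False) loses the domain even though it was marked persistent in the window, while run(True) keeps it because the flag is read again under the same locks that guard the removal.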
The problem here is really that we should have loaded the persistent
configs before we started the thread to reconnect. That ensures that
the VM is marked persistent before the thread runs.
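
As a rough illustration of that ordering (a toy Python model, not the actual patch; startup() and its arguments are hypothetical names), loading the persistent configs before the reconnect thread is created means the thread can never observe a stale persistent flag:

```python
import threading


def startup(status_names, persistent_names, running_qemu):
    """Toy model of the startup order; all names here are hypothetical."""
    domains = {}

    # 1. Load status files: each one yields a domain assumed to be transient.
    for name in status_names:
        domains[name] = {"persistent": False}

    # 2. Load persistent configs BEFORE any reconnect thread exists.
    for name in persistent_names:
        domains.setdefault(name, {"persistent": False})["persistent"] = True

    def reconnect():
        # 3. Reconnect only to domains that had a status file.
        for name in status_names:
            if name in running_qemu:
                continue  # qemu is still alive, the domain stays active
            if not domains[name]["persistent"]:
                del domains[name]  # transient and dead: nothing to keep

    worker = threading.Thread(target=reconnect)
    worker.start()
    worker.join()
    return domains
```

With a leftover status file for a persistent VM and no qemu process, startup(["vm1"], ["vm1"], set()) keeps vm1 in the returned dictionary, whereas a leftover status file for a transient VM is dropped.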
Can you test the patch I've just sent for this?
BTW, also please configure your email client to add line breaks at 80
characters or less.
Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|