On Thu, Apr 22, 2010 at 08:54:30AM -0400, Chris Lalancette wrote:
On 04/22/2010 08:34 AM, Daniel P. Berrange wrote:
> On Wed, Apr 21, 2010 at 05:16:21PM -0400, Chris Lalancette wrote:
>> On 04/21/2010 04:34 PM, Stephen Shaw wrote:
>>> I'm getting a seg fault when running virsh snapshot-create 1, but only
>>> when virt-manager is open and connected.
>>>
>>> Here is some of the debug info I was able to come up with -
>>>
>>> http://fpaste.org/9GO6/ (bt)
>>>
>>> http://fpaste.org/7gkH/ ('thread apply all bt')
>>>
>>> * After the crash
>>> (gdb) p mon->msg
>>> $1 = (qemuMonitorMessagePtr) 0x0
>>>
>>>
>>> nibbler:~ # libvirtd --version
>>> libvirtd (libvirt) 0.8.0
>>>
>>>
>>> Please let me know if there is any other information you need.
>>> Stephen
>>
>> Thanks for the report. To be perfectly honest, I can't see how what
>> happened could happen :). But I'll take a closer look at it and see
>> if I can reproduce and see what is going on with it.
>
> I see thread locking problems in the code
>
> - qemuDomainSnapshotCreateXML() is calling monitor commands, but has
> not run qemuDomainObjBeginJobWithDriver() to ensure exclusive
> access to the monitor
>
> - qemuDomainSnapshotDiscard has same problem

Yep, just fixing those now. I didn't quite understand the ObjBeginJob
before, but I think I'm understanding it now. This is probably the source of
the problems.

There are some notes about the rules in src/qemu/THREADS.txt.

You must acquire locks on objects in the following order, not missing
any steps:

   qemud_driver          (qemudDriverLock)
   virDomainObjPtr       (implicit via virDomainObjFindByXXX)
   qemuMonitorPrivatePtr (qemuDomainObjBeginJob)
   qemuMonitorPtr        (implicit via qemuDomainObjEnterMonitor)

Note that qemuDomainObjEnterMonitor() will release the locks on the
qemud_driver & virDomainObjPtr objects once it has acquired the lock
on qemuMonitorPtr. The qemuMonitorPrivatePtr object has a condition
variable that ensures continued mutual exclusion, even though the
qemud_driver and virDomainObjPtr objects are now unlocked.

So by missing qemuDomainObjBeginJob(), the condition variable was not
acquired, and mutual exclusion was not assured.
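
To make the ordering concrete, here is a rough sketch of the same pattern
using plain pthreads. It is not libvirt code: demo_domain, demo_begin_job(),
demo_enter_monitor() and the other demo_* names are hypothetical stand-ins,
and the qemud_driver lock is left out to keep it short. The point it tries to
show is that the job flag, guarded by the object lock and waited on through a
condition variable, keeps other threads out even while the object lock is
dropped inside the monitor.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for a domain object; not libvirt's real struct. */
typedef struct {
    pthread_mutex_t lock;        /* role of the virDomainObjPtr lock */
    pthread_cond_t  job_cond;    /* role of the job condition */
    bool            job_active;  /* true while some thread owns the job */
    pthread_mutex_t mon_lock;    /* role of the qemuMonitorPtr lock */
    int             holders;     /* current job owners (guarded by lock) */
    int             max_holders; /* highest value of holders ever seen */
    int             commands;    /* commands run (guarded by mon_lock) */
} demo_domain;

enum { DEMO_THREADS = 4, DEMO_ITERS = 1000 };

/* Caller holds d->lock. Sleeps until no other job is active. */
static void demo_begin_job(demo_domain *d)
{
    while (d->job_active)
        pthread_cond_wait(&d->job_cond, &d->lock);
    d->job_active = true;
}

/* Caller holds d->lock. Wakes one waiter, if any. */
static void demo_end_job(demo_domain *d)
{
    d->job_active = false;
    pthread_cond_signal(&d->job_cond);
}

/* Caller holds d->lock and owns the job.
 * Takes the monitor lock, then drops the object lock. */
static void demo_enter_monitor(demo_domain *d)
{
    pthread_mutex_lock(&d->mon_lock);
    pthread_mutex_unlock(&d->lock);
}

/* Drops the monitor lock, then retakes the object lock. */
static void demo_exit_monitor(demo_domain *d)
{
    pthread_mutex_unlock(&d->mon_lock);
    pthread_mutex_lock(&d->lock);
}

static void *demo_worker(void *arg)
{
    demo_domain *d = arg;

    for (int i = 0; i < DEMO_ITERS; i++) {
        pthread_mutex_lock(&d->lock);
        demo_begin_job(d);
        d->holders++;
        assert(d->holders == 1);      /* nobody else owns the job */
        if (d->holders > d->max_holders)
            d->max_holders = d->holders;

        demo_enter_monitor(d);        /* object lock is released here */
        d->commands++;                /* pretend to run a monitor command */
        demo_exit_monitor(d);         /* object lock is held again */

        d->holders--;
        demo_end_job(d);
        pthread_mutex_unlock(&d->lock);
    }
    return NULL;
}

/* Runs the workers; returns the peak number of simultaneous job owners. */
int demo_run(int *commands_out)
{
    demo_domain d = { .job_active = false };
    pthread_t t[DEMO_THREADS];
    int started = 0;

    pthread_mutex_init(&d.lock, NULL);
    pthread_mutex_init(&d.mon_lock, NULL);
    pthread_cond_init(&d.job_cond, NULL);

    for (int i = 0; i < DEMO_THREADS; i++)
        if (pthread_create(&t[started], NULL, demo_worker, &d) == 0)
            started++;
    for (int i = 0; i < started; i++)
        pthread_join(t[i], NULL);

    *commands_out = d.commands;
    int peak = d.max_holders;

    pthread_cond_destroy(&d.job_cond);
    pthread_mutex_destroy(&d.mon_lock);
    pthread_mutex_destroy(&d.lock);
    return peak;
}
```

Calling demo_run() from a small main() should report a peak of one job
owner. If a worker skipped demo_begin_job(), as the snapshot code did with
qemuDomainObjBeginJob(), nothing would stop a second thread from entering the
monitor path while the first had the object lock dropped.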
Regards,
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://deltacloud.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|