
On 07/07/2011 05:34 PM, Jiri Denemark wrote:
> This series is also available at
> https://gitorious.org/~jirka/libvirt/jirka-staging/commits/migration-recover...
>
> The series does several things:
> - persists current job and its phase in status xml
> - allows safe monitor commands to be run during migration/save/dump jobs
> - implements recovery when libvirtd is restarted while a job is active
> - consolidates some code and fixes bugs I found when working in the area

git bisect is pointing to this series as the cause of a regression in
'virsh managedsave dom' triggering libvirtd core dumps if some other
process is actively making queries on the domain at the same time
(virt-manager is a great process for fitting that bill). I'm trying to
further narrow down which patch introduced the regression, and see if I
can plug the race. It is probably a case of not checking whether the
monitor still exists when getting the condition for an asynchronous job:
the whole point of virsh [managed]save is that the domain will go away
when the save completes, but the save is time-consuming enough that we
want to query domain state in the meantime.

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff06d0700 (LWP 11419)]
0x00000000004b9ad8 in qemuMonitorSend (mon=0x7fffe815c060, msg=0x7ffff06cf380)
    at qemu/qemu_monitor.c:801
801         while (!mon->msg->finished) {
(gdb) bt
#0  0x00000000004b9ad8 in qemuMonitorSend (mon=0x7fffe815c060,
    msg=0x7ffff06cf380) at qemu/qemu_monitor.c:801
#1  0x00000000004c77ae in qemuMonitorJSONCommandWithFd (mon=0x7fffe815c060,
    cmd=0x7fffd8000940, scm_fd=-1, reply=0x7ffff06cf480)
    at qemu/qemu_monitor_json.c:225
#2  0x00000000004c78e5 in qemuMonitorJSONCommand (mon=0x7fffe815c060,
    cmd=0x7fffd8000940, reply=0x7ffff06cf480) at qemu/qemu_monitor_json.c:254
#3  0x00000000004cc19c in qemuMonitorJSONGetMigrationStatus (
    mon=0x7fffe815c060, status=0x7ffff06cf580, transferred=0x7ffff06cf570,
    remaining=0x7ffff06cf568, total=0x7ffff06cf560)
    at qemu/qemu_monitor_json.c:1920
#4  0x00000000004bc1b3 in qemuMonitorGetMigrationStatus (mon=0x7fffe815c060,
    status=0x7ffff06cf580, transferred=0x7ffff06cf570,
    remaining=0x7ffff06cf568, total=0x7ffff06cf560) at qemu/qemu_monitor.c:1532
#5  0x00000000004b201b in qemuMigrationUpdateJobStatus (driver=0x7fffe80089f0,
    vm=0x7fffe8015cd0, job=0x5427b6 "domain save job")
    at qemu/qemu_migration.c:765
#6  0x00000000004b2383 in qemuMigrationWaitForCompletion (
    driver=0x7fffe80089f0, vm=0x7fffe8015cd0) at qemu/qemu_migration.c:846
#7  0x00000000004b7806 in qemuMigrationToFile (driver=0x7fffe80089f0,
    vm=0x7fffe8015cd0, fd=27, offset=4096,
    path=0x7fffd8000990 "/var/lib/libvirt/qemu/save/fedora_12.save",
    compressor=0x0, is_reg=true, bypassSecurityDriver=true)
    at qemu/qemu_migration.c:2766
#8  0x000000000046a90d in qemuDomainSaveInternal (driver=0x7fffe80089f0,
    dom=0x7fffd8000ad0, vm=0x7fffe8015cd0,
    path=0x7fffd8000990 "/var/lib/libvirt/qemu/save/fedora_12.save",
    compressed=0, bypass_cache=false) at qemu/qemu_driver.c:2386

--
Eric Blake   eblake@redhat.com    +1-801-349-2682
Libvirt virtualization library http://libvirt.org
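
To illustrate the kind of check suspected to be missing, here is a minimal
sketch of a condition wait that re-validates its state on every wakeup
instead of dereferencing an in-flight message unconditionally. This is not
libvirt code and not a proposed patch: every name in it (demo_monitor,
demo_msg, and the demo_monitor_* functions) is hypothetical, and it assumes
the crash comes from the message or monitor being torn down while a waiter
is blocked, which the message above only guesses at. It also does not cover
the monitor structure itself being freed; that would additionally need
reference counting.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; NOT libvirt's real qemuMonitor structures. */
struct demo_msg {
    bool finished;              /* set once a reply has arrived */
};

struct demo_monitor {
    pthread_mutex_t lock;
    pthread_cond_t notify;      /* signalled on reply or on close */
    struct demo_msg *msg;       /* in-flight message, NULL after close */
    bool closed;                /* true once the monitor is torn down */
};

int demo_monitor_init(struct demo_monitor *mon)
{
    if (pthread_mutex_init(&mon->lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&mon->notify, NULL) != 0) {
        pthread_mutex_destroy(&mon->lock);
        return -1;
    }
    mon->msg = NULL;
    mon->closed = false;
    return 0;
}

/* Mark the in-flight message (if any) as finished and wake waiters. */
void demo_monitor_finish(struct demo_monitor *mon)
{
    pthread_mutex_lock(&mon->lock);
    if (mon->msg)
        mon->msg->finished = true;
    pthread_cond_broadcast(&mon->notify);
    pthread_mutex_unlock(&mon->lock);
}

/* Tear down the monitor, e.g. because the domain went away. */
void demo_monitor_close(struct demo_monitor *mon)
{
    pthread_mutex_lock(&mon->lock);
    mon->closed = true;
    mon->msg = NULL;
    pthread_cond_broadcast(&mon->notify);
    pthread_mutex_unlock(&mon->lock);
}

/*
 * Wait for the in-flight message to finish.
 * Returns 0 if a reply arrived, -1 if the monitor was closed or there
 * is no message to wait on. The loop re-checks 'closed' and 'msg'
 * after every wakeup, so 'msg' is never dereferenced once it is gone.
 */
int demo_monitor_wait(struct demo_monitor *mon)
{
    int ret = -1;

    pthread_mutex_lock(&mon->lock);
    while (!mon->closed && mon->msg && !mon->msg->finished)
        pthread_cond_wait(&mon->notify, &mon->lock);
    if (!mon->closed && mon->msg)
        ret = 0;
    pthread_mutex_unlock(&mon->lock);
    return ret;
}
```

A caller polling job status would treat a -1 return as "the domain is gone"
and stop issuing monitor commands, rather than retrying.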