Hi,
this is kind of a follow-up to an older question/discussion:
https://www.redhat.com/archives/libvir-list/2010-July/msg00267.html
As a result of that, I use a second thread for monitoring the live
migration, taking actions (setting maxdowntime to a value that fits
the situation) if necessary.
Although I call getJobInfo() at a fairly low frequency (once a
second), problems occur frequently, roughly every 10th or 15th
live migration. They range from exceptions saying that the domain
is not running anymore to complete JVM crashes ->
http://pastebin.com/jT6sXubu
Recovery from exceptions doesn't seem to work perfectly either:
afterwards, connections to a host can't be shut down properly
because there are still open references.
Of course, in my monitoring thread I'm checking in every monitoring
iteration if the domain object is not null, is still active, if the
jobInfo is available yet etc. But, as I cannot synchronize with
vm.migrate(), there is still a reasonable chance that migrate() just
invalidates the current domain while I'm accessing it, no matter
what I do.
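
For illustration, the monitoring loop described above might be
structured roughly as follows. This is only a sketch under stated
assumptions: JobProbe is a hypothetical stand-in for a call such as
getJobInfo(), and the "done" flag is assumed to be set by the thread
that calls migrate() once that call returns. Nothing here is the
libvirt Java API, and it is not claimed to prevent the crashes.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Polling loop for a second thread that watches a running migration. */
public class MigrationMonitor implements Runnable {

    /** Hypothetical stand-in for a call such as getJobInfo(). */
    public interface JobProbe {
        long remainingBytes() throws Exception;
    }

    private final JobProbe probe;
    private final AtomicBoolean migrationDone; // set by the migrating thread
    private final long intervalMillis;
    private int polls = 0;
    private Exception lastError = null;

    public MigrationMonitor(JobProbe probe, AtomicBoolean migrationDone,
                            long intervalMillis) {
        this.probe = probe;
        this.migrationDone = migrationDone;
        this.intervalMillis = intervalMillis;
    }

    @Override
    public void run() {
        while (!migrationDone.get()) {
            try {
                probe.remainingBytes(); // e.g. decide here whether to adjust downtime
                polls++;
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                // Asked to stop: restore the interrupt flag and leave.
                Thread.currentThread().interrupt();
                return;
            } catch (Exception e) {
                // The domain may have gone away mid-poll: record it and stop
                // polling instead of retrying on a possibly invalid object.
                lastError = e;
                return;
            }
        }
    }

    public int polls() { return polls; }

    public Exception lastError() { return lastError; }
}
```

The migrating thread would set the flag after migrate() returns and
join the monitor thread before closing the connection, so that no
poll is still in flight at that point.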
At the C level every API in libvirt is threadsafe. The only key point
is that if you use objects (eg virDomainPtr) from multiple threads
you ought to hold an extra reference on them (virDomainRef) per
thread to ensure that one thread does not delete an object that is
in use by the other thread.
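
To make the rule concrete, here is a toy model of per-thread
reference counting, written in Java only because that is the binding
under discussion. RefCountedHandle is a hypothetical class, not
libvirt code: ref() plays the role virDomainRef has in C, and
unref() the role of virDomainFree, which drops one reference.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of a reference-counted handle. Hypothetical; not libvirt code. */
public class RefCountedHandle {

    // The creator of the handle implicitly holds the first reference.
    private final AtomicInteger refs = new AtomicInteger(1);

    /** Take an extra reference, e.g. one per additional thread. */
    public void ref() {
        while (true) {
            int n = refs.get();
            if (n == 0) {
                throw new IllegalStateException("handle already released");
            }
            if (refs.compareAndSet(n, n + 1)) {
                return;
            }
        }
    }

    /**
     * Drop one reference.
     * Returns true if this call released the last reference.
     */
    public boolean unref() {
        while (true) {
            int n = refs.get();
            if (n == 0) {
                throw new IllegalStateException("handle already released");
            }
            if (refs.compareAndSet(n, n - 1)) {
                return n == 1;
            }
        }
    }

    /** True once every holder has dropped its reference. */
    public boolean isReleased() {
        return refs.get() == 0;
    }
}
```

With one extra ref() taken for the monitoring thread, an unref() by
the migrating thread leaves the handle alive until the monitor drops
its own reference as well.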
At the Java level, this reference handling ought to be working
automatically so you wouldn't need to do anything special to safely
do migration with 2 threads as you describe. So I don't really have
any explanation for what you see.
Daniel