
On Thu, Oct 22, 2015 at 13:52:29 +0100, Daniel P. Berrange wrote: ...
On a related topic, we don't have great error reporting in the (usually unlikely) scenario that we get a stuck job / timeout. I've long thought it could be desirable to record some metadata when we start jobs, such as the __FUNC__ of the method which started the job, so when we report an error we can include that info as a diagnostic aid.
Do you mean something like

    virsh # resume cd
    error: Failed to resume domain cd
    error: Timed out during operation: cannot acquire state change lock (held by remoteDispatchDomainSuspend)

This was implemented by v1.2.13-295-gb79f25e
This would have to be against the qemuDomainObjPrivPtr struct. This makes me think that using the separate bool inJob/inMonitor stack variables is not required.
We could just add

    int threadid;
    bool inJob;
    bool inMonitor;
    const char *jobfunc;

to qemuDomainObjPrivPtr. That way you don't need to modify the Enter/Exit functions to add extra arguments - we just track everything internally. When exiting, we'd compare against the threadid, to make sure we don't accidentally release a different thread's job.
Yeah, as long as we can make sure threadid is unique and stable:

    /* These next two functions are for debugging only, since they are not
     * guaranteed to give unique values for distinct threads on all
     * architectures, nor are the two functions guaranteed to give the same
     * value for the same thread. */
    unsigned long long virThreadSelfID(void);
    unsigned long long virThreadID(virThreadPtr thread);

so far we avoided using thread IDs for anything critical.

Jirka
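
For illustration only, the tracking idea discussed in this thread could look roughly like the sketch below. Everything in it is hypothetical: DemoJobState, demoJobBegin, demoJobEnd and demoJobHolder are made-up names, not libvirt code, and the struct is not the real qemu private data layout. It also assumes POSIX threads and swaps the proposed int threadid for a pthread_t compared with pthread_equal(), which is one way to avoid relying on the debugging-only virThreadSelfID(). Locking is left out; the sketch assumes the caller already holds whatever lock protects the state.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the fields proposed in the thread;
 * NOT the real libvirt structure. */
typedef struct {
    bool inJob;           /* a job is currently active */
    bool inMonitor;       /* the job owner is inside the monitor */
    pthread_t owner;      /* meaningful only while inJob is true */
    const char *jobfunc;  /* __func__ of whoever started the job */
} DemoJobState;

/* Start a job and remember who started it.
 * Returns 0 on success, -1 if a job is already active. */
static int demoJobBegin(DemoJobState *st, const char *func)
{
    if (st->inJob)
        return -1;
    st->inJob = true;
    st->inMonitor = false;
    st->owner = pthread_self();
    st->jobfunc = func;
    return 0;
}

/* End the job, refusing to release one owned by another thread.
 * Returns 0 on success, -1 if there is no job or the caller is not
 * the owner. */
static int demoJobEnd(DemoJobState *st)
{
    if (!st->inJob || !pthread_equal(st->owner, pthread_self()))
        return -1;
    st->inJob = false;
    st->inMonitor = false;
    st->jobfunc = NULL;
    return 0;
}

/* Name to include in a "held by ..." diagnostic. */
static const char *demoJobHolder(const DemoJobState *st)
{
    return st->inJob ? st->jobfunc : "(none)";
}
```

A caller would invoke demoJobBegin(&st, __func__) on entry and demoJobEnd(&st) on exit, and a timeout path could print demoJobHolder(&st) in its error message. Whether pthread_equal() is acceptable here, or whether the job should be tracked some other way, is exactly the open question raised above.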