
On Thu, Oct 22, 2015 at 01:47:28PM +0200, Jiri Denemark wrote:
Hi all.
Looking at the qemu driver code, we have a lot of code with the following pattern:
if (doSomething() < 0) goto cleanup;
if (qemuDomainObjBeginJob() < 0) goto cleanup;
if (doSomethingElse() < 0) goto endjob;
qemuDomainObjEnterMonitor();
if (qemuMonitorSomething() < 0) { qemuDomainObjExitMonitor(); goto endjob; }
if (qemuMonitorSomethingElse() < 0) { qemuDomainObjExitMonitor(); goto endjob; }
if (qemuDomainObjExitMonitor() < 0) goto endjob;
...
ret = 0;
endjob: qemuDomainObjEndJob();
cleanup: ... return ret;
Sometimes qemuDomainObjExitMonitor is in its own label instead of being explicitly called on every single error path. Sometimes qemuDomainObjEndJob is called explicitly followed by goto cleanup. In either case, it's pretty easy to get this wrong. It's too easy to jump to a wrong label (especially when moving code around) or forget to call the appropriate exit function before jumping to a label.
So what if we make the code more robust by changing who needs to track whether we are in a monitor or have a job. Now it's the caller that tracks it while I think we could teach the job/monitor APIs themselves to track the state:
bool inJob = false; bool inMonitor = false;
if (doSomething() < 0) goto cleanup;
if (qemuDomainObjBeginJob(..., &inJob) < 0) goto cleanup;
if (doSomethingElse() < 0) goto cleanup;
qemuDomainObjEnterMonitor(..., &inMonitor);
if (qemuMonitorSomething() < 0) goto cleanup;
if (qemuMonitorSomethingElse() < 0) goto cleanup;
if (qemuDomainObjExitMonitor(..., &inMonitor) < 0) goto cleanup;
...
ret = 0;
cleanup: if (qemuDomainObjExitMonitor(..., &inMonitor) < 0) ret = -1; qemuDomainObjEndJob(..., &inJob); ... return ret;
It's not a lot shorter or longer but it saves us from jumping to different labels and it makes sure we always exit the monitor and end the job.
So is it worth the effort or do you thing it's even worse than before or do you have any other ideas?
On a related topic, we don't have great error reporting in the (usually unlikely) scenario that we get a stuck job / timeout. I've long thought it could be desirable to record some metadata when we start jobs, such as the __FUNC__ of the method which started the job, so when we report an error we can include that info as a diagnostic aid. This would have to be against the qemuDomainObjPrivPtr struct. THis makes me think that using the separate bool inJob/inMonitor stack variables is not required. We could just add int threadid; bool inJob; bool inMonitor; const char *jobfunc; to qemuDomainObjPrivPtr. That way you don't need to modify the Enter/Exit functions to add extra arguments - we just track everything internally. When exiting, we'd compare against the threadid, to make sure we don't accidentally relaase a different thread's job. Regards, Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|