On 05/05/2013 05:01 AM, Michal Privoznik wrote:
>> @@ -228,7 +228,7 @@ virCapabilitiesAddHostFeature(virCapsPtr
caps,
>> caps->host.nfeatures, 1) < 0)
>> return -1;
>>
>> - if ((caps->host.features[caps->host.nfeatures] = strdup(name)) ==
NULL)
>> + if (VIR_STRDUP(caps->host.features[caps->host.nfeatures], name) <
0)
>> return -1;
>
> This function was previously silent on error; which means its callers
> are now reporting an error where they used to be silent (probably GOOD
> in most cases) or reporting doubled errors (annoying, and worth fixing
> now while we are looking at it). I'd feel more comfortable if we:
>
I don't quite agree. In the next serie I will be:
1) turning VIR_ALLOC* and virAsprintf into OOM reporting
2) removing virReportOOMError from everywhere
Removing OOMError from everywhere will indeed clean up double OOM
reporting. But what I'm worried about is cases where we had no error
reporting, but now we do - and are there any cases where the new error
reporting could lead to spurious problems? Or conversely, are there any
callers that should have been reporting errors, but weren't, and where
we won't notice that those callers are still buggy when we make our pass
to clean up double OOM reporting?
These two steps will effectively cut off double OOM error reporting and
move the place where error is reported closer to it's actual origin.
So in this concrete example, callers of virCapabilitiesAddHostFeature are:
libxl_conf.c:80: libxlBuildCapabilities: does call virReportOOMError (in
fact, its parent does)
src/test/test_driver.c:175: testBuildCapabilities: does call
virReportOOMError
xen_hypervisor.c:2293: xenHypervisorBuildCapabilities does call
virReportOOMError (in fact, its parent does)
After my patches, these wouldn't call virReportOOMError at all. I agree,
that the future patches I am talking about will require more inspection
if callee reports OOM error on every path, but I would not bother with
it for now. I mean, this chunk of code we have here will report OOM
error at no cost - compared to your suggestion where I'd have to
manually go through every possible callers. Twice.
Okay, on that front, you've convinced me that we want to minimize the
effort to get your series in - both the effort for you to get to the
final end goal, and the time I spend on reviewing it :)
Moreover, I don't think moving from silent -> reporting is no bad at
all. The worst case scenario is, users will see one more line in the
libvirtd log. The best case scenario is - all forgotten places where we
currently don't catch OOM errors, or do it in bad way (remember
make_nonnull_domain?) will now report the correct error. Again, at no price.
For the case of src/conf/capabilities.c, I think we are fine (those
functions were always used synchronously, so the user will eventually
see the error); for the case of src/conf/domain_event.c, I'm less
convinced (the OOM happens while trying to send an async message, so
where we are previously silent, we are now polluting the thread-local
error object, and the next synchronous call that happens to use the same
worker thread will now have an active but unrelated OOM error object in
the way - if we are going to silently drop failed attempts to send an
async event, then we should not be leaving an error behind in
thread-local storage).
What I really fear about here is, that I don't change semantics. But
that's what review is for, isn't it :)
Yep - which is why I've been flagging any call that changes from
silent->reporting as a potential semantic change. The problem is that I
don't know the best way to divide the task so that we aren't reviewing
the same code through multiple revisions of churn, while still avoiding
the problem of a semantic change that we will regret. If anyone else
has ideas on what we should do here, speak up!
I guess at this point, I'll continue reviewing the next couple of
patches in the series, and see how many (or how few) instances of
silent->reporting changes we really deal with. It may not be as bad as
I'm worried about, in which case it may be easier to just note the
problems, commit your series as is, and deal with the fallout manually
at the end of your series, instead of making you rebase yet another
round (after all, on patch 3/34, I only found 2 of the 10 files where I
identified a silent->reporting change).
--
Eric Blake eblake redhat com +1-919-301-3266
Libvirt virtualization library
http://libvirt.org