On 17.05.2018 15:25, Erik Skultety wrote:
> On Thu, May 17, 2018 at 02:49:12PM +0300, Nikolay Shirokovskiy wrote:
>>
>>
>> On 17.05.2018 14:01, Erik Skultety wrote:
>>> On Thu, May 17, 2018 at 01:42:36PM +0300, Nikolay Shirokovskiy wrote:
>>>>
>>>>
>>>> On 17.05.2018 13:11, Nikolay Shirokovskiy wrote:
>>>>> virCopyLastError is intended to be used after last error is set.
>>>>> However due to virLastErrorObject failures (very unlikely through
>>>>> as thread local error is allocated on first use) we can have zero
>>>>> fields in a copy as a result. In particular code field can be set
>>>>> to VIR_ERR_OK.
>>>>>
>>>>> In some places (qemu monitor, qemu agent and qemu migaration code
>>>>> for example) we use copy result as a flag and this leads to bugs.
>>>>>
>>>>> Let's set copy result to generic error ("cause
unknown") in this case.
>>>>> Approach is based on John Ferlan comments in [1] and [2] to initial
>>>>> attempt which fixes issue in much less generic way.
>>>>>
>>>>> [1]
https://www.redhat.com/archives/libvir-list/2018-April/msg02700.html
>>>>> [2]
https://www.redhat.com/archives/libvir-list/2018-May/msg00124.html
>>>>> ---
>>>>> src/util/virerror.c | 2 +-
>>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/src/util/virerror.c b/src/util/virerror.c
>>>>> index c000b00..9f158af 100644
>>>>> --- a/src/util/virerror.c
>>>>> +++ b/src/util/virerror.c
>>>>> @@ -343,7 +343,7 @@ virCopyLastError(virErrorPtr to)
>>>>> if (err)
>>>>> virCopyError(err, to);
>>>>> else
>>>>> - virResetError(to);
>>>>> + virErrorGenericFailure(err);
>>>>
>>>> virErrorGenericFailure(to) of course
>>>
>>> Yeah, I just prepared a mail to respond to the original patch, nevermind
:)...
>>> Although I still don't see how this solves anything. First of all, the
whole
>>> else branch is useless, since we "memset'd" the caller-passed
object right
>>> before we could potentially hit the 'else' branch. As you've
noted in the
>>> commit message, virLastErrorObject() can only fail with NULL on OOM, yet one
of
>>> the things virErrorGenericFailure will try to do is strdup(), which I
highly
>>> doubt would succeed after a previous OOM, IOW if libvirt encounters OOM
it's
>>> pretty much stuck in a loop of failed attempts to report an OOM error
>>> encountering some more. If you really want to return something, I'd say
we
>>> should return VIR_ERR_NO_MEMORY instead, probably checking the validity of
the
>>
>> This can be misleading. The original error is lost due OOM but it can be
>
> Although true, how does VIR_ERR_UNKNOWN help to tell you anything about the
> original root cause, if you at least report OOM error, of course the root cause
> might have been different, but at least it conveys some information that is
> true, especially with OOM which tells you to "buckle up" because things
are
> going haywire - on the contrary, if we report UNKNOWN, you convey no
> information plus people will treat is as an unhandled exception, shrug their
> shoulders and file a bugzilla and paste the error there, which without any
> debug logs is good as nothing AND, since we're talking OOM here, things still
Well, makes sense.
> will go haywire. I don't think this patch offers any improvement over the
> existing problem.
Wait, but we need to set .code to some error otherwise the logic of callers
will be broken (see qemu monitor for example).
Please check another approach suggested in my another reply.
Sure, I'll have a look at it tomorrow and respond in order to move this forward.
Thanks,
Erik