On 09/25/2014 10:39 PM, Kevin Wolf wrote:
On 25.09.2014 at 14:29, Alexey Kardashevskiy wrote:
> On 09/25/2014 08:20 PM, Kevin Wolf wrote:
>> On 25.09.2014 at 11:55, Alexey Kardashevskiy wrote:
>>> Right. Cool. So is the patch below what was suggested? I am double-checking,
>>> as it does not solve the original issue: the bottom half is called first and
>>> then nbd_trip() crashes in qcow2_co_flush_to_os().
>>>
>>> diff --git a/block.c b/block.c
>>> index d06dd51..1e6dfd1 100644
>>> --- a/block.c
>>> +++ b/block.c
>>> @@ -5037,20 +5037,22 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
>>>      if (local_err) {
>>>          error_propagate(errp, local_err);
>>>          return;
>>>      }
>>>
>>>      ret = refresh_total_sectors(bs, bs->total_sectors);
>>>      if (ret < 0) {
>>>          error_setg_errno(errp, -ret, "Could not refresh total sector count");
>>>          return;
>>>      }
>>> +
>>> +    bdrv_drain_all();
>>>  }
>>
>> Try moving the bdrv_drain_all() call to the top of the function (at
>> least it must be called before bs->drv->bdrv_invalidate_cache).
>
>
> Ok, I did. Did not help.
>
>
>>
>>> +static QEMUBH *migration_complete_bh;
>>> +static void process_incoming_migration_complete(void *opaque);
>>> +
>>>  static void process_incoming_migration_co(void *opaque)
>>>  {
>>>      QEMUFile *f = opaque;
>>> -    Error *local_err = NULL;
>>>      int ret;
>>>
>>>      ret = qemu_loadvm_state(f);
>>>      qemu_fclose(f);
>>
>> Paolo suggested moving everything starting from here, but as far as I
>> can tell, leaving the next few lines here shouldn't hurt.
>
>
> Ouch. I was looking at the wrong qcow2_fclose() all this time :)
> Anyway, what you suggested did not help:
> bdrv_co_flush() calls qemu_coroutine_yield() while this BH is being
> executed, and the situation is still the same.

Hm, do you have a backtrace? The idea with the BH was that it would be
executed _outside_ coroutine context and therefore wouldn't be able to
yield. If it's still executed in coroutine context, it would be
interesting to see who that caller is.

Like this?
process_incoming_migration_complete
bdrv_invalidate_cache_all
bdrv_drain_all
aio_dispatch
node->io_read (which is nbd_read)
nbd_trip
bdrv_co_flush
[...]
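
To make the call chain above easier to follow, here is a toy model in plain C. It is not QEMU code: every name in it (model_drain_all, model_nbd_read, simulate and so on) is a hypothetical stand-in, and it only sketches the nesting seen in the backtrace, under the assumption that the drain loop dispatches whatever I/O handler happens to be ready. It does not model coroutines or the crash itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration only; none of these are QEMU APIs. */
typedef void (*io_handler_fn)(void);

static io_handler_fn pending_read;  /* models node->io_read (nbd_read) */
static int bh_depth;                /* > 0 while the bottom half is running */
static bool flush_ran_inside_bh;    /* set if the NBD path ran nested in the BH */

/* Models nbd_read -> nbd_trip -> bdrv_co_flush: records where it was called from. */
static void model_nbd_read(void)
{
    if (bh_depth > 0) {
        flush_ran_inside_bh = true;
    }
}

/* Models bdrv_drain_all -> aio_dispatch: runs any handler that is ready. */
static void model_drain_all(void)
{
    while (pending_read != NULL) {
        io_handler_fn fn = pending_read;
        pending_read = NULL;  /* one-shot, so the loop terminates */
        fn();
    }
}

/* Models process_incoming_migration_complete, i.e. the bottom half. */
static void model_migration_complete_bh(void)
{
    bh_depth++;
    model_drain_all();  /* stands in for bdrv_invalidate_cache_all */
    bh_depth--;
}

/* Returns true if the NBD request was serviced from within the bottom half. */
static bool simulate(bool nbd_request_pending)
{
    flush_ran_inside_bh = false;
    bh_depth = 0;
    pending_read = nbd_request_pending ? model_nbd_read : NULL;
    model_migration_complete_bh();
    assert(bh_depth == 0);
    return flush_ran_inside_bh;
}
```

Calling simulate(true) from a main() reports that the NBD handler ran nested inside the bottom half, which matches the order of frames in the backtrace; simulate(false) reports no nesting because nothing was pending when the drain loop ran.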
--
Alexey