13.04.2018 00:35, John Snow wrote:
> On 04/12/2018 08:26 AM, Vladimir Sementsov-Ogievskiy wrote:
>> 1. It looks unsafe to use an NBD server + backup(sync=none) on the same
>> node; synchronization is needed, like in block/replication, which uses
>> backup_wait_for_overlapping_requests, backup_cow_request_begin and
>> backup_cow_request_end. We have a filter driver for this, not yet in
>> upstream.
> Is it the case that blockdev-backup sync=none can race with read
> requests on the NBD server?
> i.e. we can get temporarily inconsistent data before the COW completes?
> Can you elaborate?
I'm not sure, but it looks possible:
1. An NBD read starts, finds a hole in the temporary image, decides to
read from the active image (or even starts that read) and yields.
2. The guest writes to the same area (COW happens, but it doesn't help).
3. The read from point (1) resumes and reads invalid data, already
updated by (2).
The similar place in block/replication, which uses backup(sync=none) too,
is protected from this situation, roughly like in the sketch below.
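Simplified from block/replication.c (from memory, so the exact signatures
may differ a bit): the read of the secondary disk is bracketed by
backup_wait_for_overlapping_requests() and backup_cow_request_begin()/
backup_cow_request_end(), so it cannot interleave with a guest-triggered
COW on the same range:

/*
 * Simplified sketch of the secondary-side read path in
 * block/replication.c; error handling and the primary-mode path are
 * omitted.  The read is registered as a CowRequest, so a concurrent
 * guest write waits for it (and the read itself first waits for any
 * in-flight COW) before touching the same range.
 */
static coroutine_fn int replication_co_readv(BlockDriverState *bs,
                                             int64_t sector_num,
                                             int remaining_sectors,
                                             QEMUIOVector *qiov)
{
    BDRVReplicationState *s = bs->opaque;
    BdrvChild *child = s->secondary_disk;
    CowRequest req;
    int ret;

    if (child && child->bs && child->bs->job) {
        int64_t offset = sector_num * BDRV_SECTOR_SIZE;
        uint64_t bytes = remaining_sectors * BDRV_SECTOR_SIZE;

        /* Wait for any COW in flight on this range ... */
        backup_wait_for_overlapping_requests(child->bs->job, offset, bytes);
        /* ... and register our own request so a new write waits for us. */
        backup_cow_request_begin(&req, child->bs->job, offset, bytes);
        ret = bdrv_co_readv(bs->file, sector_num, remaining_sectors, qiov);
        backup_cow_request_end(&req);
        return ret;
    }

    return bdrv_co_readv(bs->file, sector_num, remaining_sectors, qiov);
}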
>> 2. If we use a filter driver anyway, it may be better not to use backup
>> at all and do all the needed things in the filter driver.
> If blockdev-backup sync=none isn't sufficient to get the semantics we
> want, it may indeed be more appropriate to just leave the entire task to
> a new filter node.
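Agree, that may be cleaner. Just to illustrate the idea, here is a rough
skeleton of such a filter's write path. All names are made up for the
example (this is not our out-of-tree driver), and fleecing_do_cow() is a
hypothetical helper:

/*
 * Hypothetical copy-before-write filter.  bs->file is the active image,
 * s->target is the temporary image exported over NBD.  Reads done on
 * behalf of the NBD client would take the same lock, so they cannot
 * interleave with the copy-before-write; a real driver would track
 * per-range requests (like block/backup.c does) instead of one CoMutex.
 */
typedef struct BDRVFleecingState {
    BdrvChild *target;     /* temporary (fleecing) image */
    HBitmap *copy_bitmap;  /* one bit per cluster that still needs COW */
    int64_t cluster_size;
    CoMutex lock;
} BDRVFleecingState;

static coroutine_fn int fleecing_co_pwritev(BlockDriverState *bs,
                                            uint64_t offset, uint64_t bytes,
                                            QEMUIOVector *qiov, int flags)
{
    BDRVFleecingState *s = bs->opaque;
    int ret;

    qemu_co_mutex_lock(&s->lock);
    /*
     * Hypothetical helper: copy the not-yet-copied clusters in
     * [offset, offset + bytes) from bs->file to s->target and clear
     * them in s->copy_bitmap.
     */
    ret = fleecing_do_cow(bs, offset, bytes);
    qemu_co_mutex_unlock(&s->lock);
    if (ret < 0) {
        return ret;
    }

    /* Only then let the guest write go through to the active image. */
    return bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
}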
>> 3. It may be interesting to implement something like READ_ONCE for NBD,
>> which means that we will never read these clusters again. After such a
>> command, we don't need to copy the corresponding clusters to the
>> temporary image if the guest decides to write to them (as we know that
>> the client has already read them and isn't going to read them again).
> That would be a very interesting optimization indeed, but I don't think
> we have any kind of infrastructure for such things currently. It's
> almost like a TRIM that controls which regions still need to perform COW
> for the BlockSnapshot.
Hmm, READ+TRIM may be used too. And the TRIM may be naturally implemented
in a special filter driver, something like the sketch below.
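Continuing the hypothetical filter sketch from above (the bitmap handling
is an assumption, nothing like this exists upstream): the client reads a
range through the export and then trims it, and the filter simply drops
the corresponding clusters from its "needs COW" bitmap, so a later guest
write to them skips the copy:

static coroutine_fn int fleecing_co_pdiscard(BlockDriverState *bs,
                                             int64_t offset, int bytes)
{
    BDRVFleecingState *s = bs->opaque;
    /* Only whole clusters that were fully read may be dropped. */
    int64_t start = QEMU_ALIGN_UP(offset, s->cluster_size);
    int64_t end = QEMU_ALIGN_DOWN(offset + bytes, s->cluster_size);

    if (start < end) {
        hbitmap_reset(s->copy_bitmap, start / s->cluster_size,
                      (end - start) / s->cluster_size);
    }

    /*
     * Nothing is discarded on the real images; the request is only a
     * hint that this range will not be read through the export again.
     */
    return 0;
}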
--
Best regards,
Vladimir