[PATCH v2 00/17] Allow sparse streams for block devices

v2 of: https://www.redhat.com/archives/libvir-list/2020-July/msg00145.html diff to v1: - Switch virfdstream to glib (patches 1-6) - Document the feature in NEWS.rst - Introduced test cases for virStringIsNull() - Included what was WIP patch in v1 => patch 16 which ensures block devices aren't read twice Michal Prívozník (17): virfdstream: Use g_autofree in virFDStreamThreadDoRead() virFDStreamMsgQueuePush: Clear pointer to passed message virfdstream: Use autoptr for virFDStreamMsg virfdstream: Use g_new0() instead of VIR_ALLOC() virfdstream: Use VIR_AUTOCLOSE() virfdstream: Drop some needless labels libvirt-storage: Document volume upload/download stream format virstring: Introduce virStringIsNull() virfile: Introduce virFileInDataDetectZeroes() virsh: Pass virshStreamCallbackDataPtr to virshStreamSink() and virshStreamSkip() virsh: Track if vol-upload or vol-download work over a block device virshStreamSkip: Emulate skip for block devices virfdstream: Allow sparse stream vol-download virshStreamInData: Handle block devices virfdstream: Emulate skip for block devices stream: Don't read block device twice news: Document sparse streams for block devcies NEWS.rst | 7 ++ src/libvirt-storage.c | 8 +- src/libvirt_private.syms | 2 + src/util/virfdstream.c | 179 ++++++++++++++++++++++----------------- src/util/virfile.c | 50 +++++++++++ src/util/virfile.h | 5 ++ src/util/virstring.c | 38 +++++++++ src/util/virstring.h | 2 + tests/virstringtest.c | 47 ++++++++++ tools/virsh-util.c | 90 +++++++++++++++++--- tools/virsh-util.h | 4 + tools/virsh-volume.c | 23 ++++- 12 files changed, 360 insertions(+), 95 deletions(-) -- 2.26.2

The buffer that allocated in the virFDStreamThreadDoRead() can be automatically freed, or if saved into the message structure it can be stolen. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/src/util/virfdstream.c b/src/util/virfdstream.c index 1c32be47a9..e29c95690b 100644 --- a/src/util/virfdstream.c +++ b/src/util/virfdstream.c @@ -431,7 +431,7 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, virFDStreamMsgPtr msg = NULL; int inData = 0; long long sectionLen = 0; - char *buf = NULL; + g_autofree char *buf = NULL; ssize_t got; if (sparse && *dataLen == 0) { @@ -483,9 +483,8 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, } msg->type = VIR_FDSTREAM_MSG_TYPE_DATA; - msg->stream.data.buf = buf; + msg->stream.data.buf = g_steal_pointer(&buf); msg->stream.data.len = got; - buf = NULL; if (sparse) *dataLen -= got; } @@ -496,7 +495,6 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, return got; error: - VIR_FREE(buf); virFDStreamMsgFree(msg); return -1; } -- 2.26.2

On Tue, Jul 07, 2020 at 21:46:19 +0200, Michal Privoznik wrote:
The buffer that allocated in the virFDStreamThreadDoRead() can be automatically freed, or if saved into the message structure it can be stolen.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
Reviewed-by: Peter Krempa <pkrempa@redhat.com>

All callers of virFDStreamMsgQueuePush() have the same pattern: they explicitly set @msg passed to NULL to avoid freeing it later on. Well, the function can take address of the pointer and clear it for them. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-) diff --git a/src/util/virfdstream.c b/src/util/virfdstream.c index e29c95690b..6efe6c17ad 100644 --- a/src/util/virfdstream.c +++ b/src/util/virfdstream.c @@ -140,7 +140,7 @@ VIR_ONCE_GLOBAL_INIT(virFDStreamData); static int virFDStreamMsgQueuePush(virFDStreamDataPtr fdst, - virFDStreamMsgPtr msg, + virFDStreamMsgPtr *msg, int fd, const char *fdname) { @@ -150,7 +150,7 @@ virFDStreamMsgQueuePush(virFDStreamDataPtr fdst, while (*tmp) tmp = &(*tmp)->next; - *tmp = msg; + *tmp = g_steal_pointer(msg); virCondSignal(&fdst->threadCond); if (safewrite(fd, &c, sizeof(c)) != sizeof(c)) { @@ -489,8 +489,7 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, *dataLen -= got; } - virFDStreamMsgQueuePush(fdst, msg, fdout, fdoutname); - msg = NULL; + virFDStreamMsgQueuePush(fdst, &msg, fdout, fdoutname); return got; @@ -814,8 +813,7 @@ static int virFDStreamWrite(virStreamPtr st, const char *bytes, size_t nbytes) msg->stream.data.buf = buf; msg->stream.data.len = nbytes; - virFDStreamMsgQueuePush(fdst, msg, fdst->fd, "pipe"); - msg = NULL; + virFDStreamMsgQueuePush(fdst, &msg, fdst->fd, "pipe"); ret = nbytes; } else { retry: @@ -1010,8 +1008,7 @@ virFDStreamSendHole(virStreamPtr st, msg->type = VIR_FDSTREAM_MSG_TYPE_HOLE; msg->stream.hole.len = length; - virFDStreamMsgQueuePush(fdst, msg, fdst->fd, "pipe"); - msg = NULL; + virFDStreamMsgQueuePush(fdst, &msg, fdst->fd, "pipe"); } } else { off = lseek(fdst->fd, length, SEEK_CUR); -- 2.26.2

On Tue, Jul 07, 2020 at 21:46:20 +0200, Michal Privoznik wrote:
All callers of virFDStreamMsgQueuePush() have the same pattern: they explicitly set @msg passed to NULL to avoid freeing it later on. Well, the function can take address of the pointer and clear it for them.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 13 +++++-------- 1 file changed, 5 insertions(+), 8 deletions(-)
Reviewed-by: Peter Krempa <pkrempa@redhat.com>

A cleanup function can be declared for virFDStreamMsg type so that the structure doesn't have to be freed explicitly. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/src/util/virfdstream.c b/src/util/virfdstream.c index 6efe6c17ad..25661736ca 100644 --- a/src/util/virfdstream.c +++ b/src/util/virfdstream.c @@ -208,6 +208,8 @@ virFDStreamMsgFree(virFDStreamMsgPtr msg) VIR_FREE(msg); } +G_DEFINE_AUTOPTR_CLEANUP_FUNC(virFDStreamMsg, virFDStreamMsgFree); + static void virFDStreamMsgQueueFree(virFDStreamMsgPtr *queue) @@ -428,7 +430,7 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, size_t *dataLen, size_t buflen) { - virFDStreamMsgPtr msg = NULL; + g_autoptr(virFDStreamMsg) msg = NULL; int inData = 0; long long sectionLen = 0; g_autofree char *buf = NULL; @@ -494,7 +496,6 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, return got; error: - virFDStreamMsgFree(msg); return -1; } @@ -761,7 +762,7 @@ virFDStreamAbort(virStreamPtr st) static int virFDStreamWrite(virStreamPtr st, const char *bytes, size_t nbytes) { virFDStreamDataPtr fdst = st->privateData; - virFDStreamMsgPtr msg = NULL; + g_autoptr(virFDStreamMsg) msg = NULL; int ret = -1; if (nbytes > INT_MAX) { @@ -838,7 +839,6 @@ static int virFDStreamWrite(virStreamPtr st, const char *bytes, size_t nbytes) cleanup: virObjectUnlock(fdst); - virFDStreamMsgFree(msg); return ret; } @@ -960,7 +960,7 @@ virFDStreamSendHole(virStreamPtr st, unsigned int flags) { virFDStreamDataPtr fdst = st->privateData; - virFDStreamMsgPtr msg = NULL; + g_autoptr(virFDStreamMsg) msg = NULL; off_t off; int ret = -1; @@ -1028,7 +1028,6 @@ virFDStreamSendHole(virStreamPtr st, ret = 0; cleanup: virObjectUnlock(fdst); - virFDStreamMsgFree(msg); return ret; } -- 2.26.2

On Tue, Jul 07, 2020 at 21:46:21 +0200, Michal Privoznik wrote:
A cleanup function can be declared for virFDStreamMsg type so that the structure doesn't have to be freed explicitly.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
Reviewed-by: Peter Krempa <pkrempa@redhat.com>

This switch allow us to save a few lines of code. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 20 +++++++------------- 1 file changed, 7 insertions(+), 13 deletions(-) diff --git a/src/util/virfdstream.c b/src/util/virfdstream.c index 25661736ca..c85dee05c3 100644 --- a/src/util/virfdstream.c +++ b/src/util/virfdstream.c @@ -452,8 +452,7 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, buflen > length - total) buflen = length - total; - if (VIR_ALLOC(msg) < 0) - goto error; + msg = g_new0(virFDStreamMsg, 1); if (sparse && *dataLen == 0) { msg->type = VIR_FDSTREAM_MSG_TYPE_HOLE; @@ -474,8 +473,7 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, buflen > *dataLen) buflen = *dataLen; - if (VIR_ALLOC_N(buf, buflen) < 0) - goto error; + buf = g_new0(char, buflen); if ((got = saferead(fdin, buf, buflen)) < 0) { virReportSystemError(errno, @@ -805,9 +803,8 @@ static int virFDStreamWrite(virStreamPtr st, const char *bytes, size_t nbytes) goto cleanup; } - if (VIR_ALLOC(msg) < 0 || - VIR_ALLOC_N(buf, nbytes) < 0) - goto cleanup; + msg = g_new0(virFDStreamMsg, 1); + buf = g_new0(char, nbytes); memcpy(buf, bytes, nbytes); msg->type = VIR_FDSTREAM_MSG_TYPE_DATA; @@ -1003,8 +1000,7 @@ virFDStreamSendHole(virStreamPtr st, virFDStreamMsgQueuePop(fdst, fdst->fd, "pipe"); } else { - if (VIR_ALLOC(msg) < 0) - goto cleanup; + msg = g_new0(virFDStreamMsg, 1); msg->type = VIR_FDSTREAM_MSG_TYPE_HOLE; msg->stream.hole.len = length; @@ -1123,8 +1119,7 @@ static int virFDStreamOpenInternal(virStreamPtr st, /* Create the thread after fdst and st were initialized. * The thread worker expects them to be that way. */ - if (VIR_ALLOC(fdst->thread) < 0) - goto error; + fdst->thread = g_new0(virThread, 1); if (virCondInit(&fdst->threadCond) < 0) { virReportSystemError(errno, "%s", @@ -1277,8 +1272,7 @@ virFDStreamOpenFileInternal(virStreamPtr st, if (virPipe(pipefds) < 0) goto error; - if (VIR_ALLOC(threadData) < 0) - goto error; + threadData = g_new0(virFDStreamThreadData, 1); threadData->st = virObjectRef(st); threadData->length = length; -- 2.26.2

On Tue, Jul 07, 2020 at 21:46:22 +0200, Michal Privoznik wrote:
This switch allow us to save a few lines of code.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 20 +++++++------------- 1 file changed, 7 insertions(+), 13 deletions(-)
Reviewed-by: Peter Krempa <pkrempa@redhat.com>

Again, instead of closing FDs explicitly, we can automatically close them when they go out of their respective scopes. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/src/util/virfdstream.c b/src/util/virfdstream.c index c85dee05c3..bac1c95c0a 100644 --- a/src/util/virfdstream.c +++ b/src/util/virfdstream.c @@ -571,9 +571,9 @@ virFDStreamThread(void *opaque) virStreamPtr st = data->st; size_t length = data->length; bool sparse = data->sparse; - int fdin = data->fdin; + VIR_AUTOCLOSE fdin = data->fdin; char *fdinname = data->fdinname; - int fdout = data->fdout; + VIR_AUTOCLOSE fdout = data->fdout; char *fdoutname = data->fdoutname; virFDStreamDataPtr fdst = st->privateData; bool doRead = fdst->threadDoRead; @@ -633,8 +633,6 @@ virFDStreamThread(void *opaque) virObjectUnref(fdst); if (virFDStreamDataDisposed) st->privateData = NULL; - VIR_FORCE_CLOSE(fdin); - VIR_FORCE_CLOSE(fdout); virFDStreamThreadDataFree(data); return; @@ -1160,9 +1158,10 @@ int virFDStreamConnectUNIX(virStreamPtr st, { struct sockaddr_un sa; virTimeBackOffVar timeout; + VIR_AUTOCLOSE fd = -1; int ret; - int fd = socket(AF_UNIX, SOCK_STREAM, 0); + fd = socket(AF_UNIX, SOCK_STREAM, 0); if (fd < 0) { virReportSystemError(errno, "%s", _("Unable to open UNIX socket")); goto error; @@ -1197,10 +1196,11 @@ int virFDStreamConnectUNIX(virStreamPtr st, if (virFDStreamOpenInternal(st, fd, NULL, 0) < 0) goto error; + + fd = -1; return 0; error: - VIR_FORCE_CLOSE(fd); return -1; } -- 2.26.2

On Tue, Jul 07, 2020 at 21:46:23 +0200, Michal Privoznik wrote:
Again, instead of closing FDs explicitly, we can automatically close them when they go out of their respective scopes.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
[...]
@@ -1160,9 +1158,10 @@ int virFDStreamConnectUNIX(virStreamPtr st, { struct sockaddr_un sa; virTimeBackOffVar timeout; + VIR_AUTOCLOSE fd = -1; int ret;
- int fd = socket(AF_UNIX, SOCK_STREAM, 0); + fd = socket(AF_UNIX, SOCK_STREAM, 0); if (fd < 0) { virReportSystemError(errno, "%s", _("Unable to open UNIX socket")); goto error; @@ -1197,10 +1196,11 @@ int virFDStreamConnectUNIX(virStreamPtr st,
if (virFDStreamOpenInternal(st, fd, NULL, 0) < 0) goto error; + + fd = -1; return 0;
Please move the clearing of 'fd' closer towards the call to 'virFDStreamOpenInternal' which consumes it than to the return statement so that it's visualy clearer that it's consumed by the call. Reviewed-by: Peter Krempa <pkrempa@redhat.com>

After previous cleanups, some labels in some functions have nothing but 'return' statement in them. Drop the labels and replace 'goto'-s with respective return statements. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 24 +++++++++--------------- 1 file changed, 9 insertions(+), 15 deletions(-) diff --git a/src/util/virfdstream.c b/src/util/virfdstream.c index bac1c95c0a..50209a8bd4 100644 --- a/src/util/virfdstream.c +++ b/src/util/virfdstream.c @@ -438,7 +438,7 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, if (sparse && *dataLen == 0) { if (virFileInData(fdin, &inData, §ionLen) < 0) - goto error; + return -1; if (length && sectionLen > length - total) @@ -466,7 +466,7 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, virReportSystemError(errno, _("unable to seek in %s"), fdinname); - goto error; + return -1; } } else { if (sparse && @@ -479,7 +479,7 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, virReportSystemError(errno, _("Unable to read %s"), fdinname); - goto error; + return -1; } msg->type = VIR_FDSTREAM_MSG_TYPE_DATA; @@ -492,9 +492,6 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, virFDStreamMsgQueuePush(fdst, &msg, fdout, fdoutname); return got; - - error: - return -1; } @@ -1164,22 +1161,22 @@ int virFDStreamConnectUNIX(virStreamPtr st, fd = socket(AF_UNIX, SOCK_STREAM, 0); if (fd < 0) { virReportSystemError(errno, "%s", _("Unable to open UNIX socket")); - goto error; + return -1; } memset(&sa, 0, sizeof(sa)); sa.sun_family = AF_UNIX; if (abstract) { if (virStrcpy(sa.sun_path+1, path, sizeof(sa.sun_path)-1) < 0) - goto error; + return -1; sa.sun_path[0] = '\0'; } else { if (virStrcpyStatic(sa.sun_path, path) < 0) - goto error; + return -1; } if (virTimeBackOffStart(&timeout, 1, 3*1000 /* ms */) < 0) - goto error; + return -1; while (virTimeBackOffWait(&timeout)) { ret = connect(fd, (struct sockaddr *)&sa, sizeof(sa)); if (ret == 0) @@ -1191,17 +1188,14 @@ int virFDStreamConnectUNIX(virStreamPtr st, continue; } - goto error; + return -1; } if (virFDStreamOpenInternal(st, fd, NULL, 0) < 0) - goto error; + return -1; fd = -1; return 0; - - error: - return -1; } -- 2.26.2

On Tue, Jul 07, 2020 at 21:46:24 +0200, Michal Privoznik wrote:
After previous cleanups, some labels in some functions have nothing but 'return' statement in them. Drop the labels and replace 'goto'-s with respective return statements.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 24 +++++++++--------------- 1 file changed, 9 insertions(+), 15 deletions(-)
Reviewed-by: Peter Krempa <pkrempa@redhat.com>

For libvirt, the volume is just a binary blob and it doesn't interpret data on volume upload/download. But as it turns out, this unspoken assumption is not clear to our users. Document it explicitly. Suggested in: https://bugzilla.redhat.com/show_bug.cgi?id=1851023#c17 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt-storage.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/src/libvirt-storage.c b/src/libvirt-storage.c index a45c8b98c1..8738f6dd14 100644 --- a/src/libvirt-storage.c +++ b/src/libvirt-storage.c @@ -1590,7 +1590,9 @@ virStorageVolCreateXMLFrom(virStoragePoolPtr pool, * * Download the content of the volume as a stream. If @length * is zero, then the remaining contents of the volume after - * @offset will be downloaded. + * @offset will be downloaded. Please note, that the data is + * not interpreted and therefore data received by stream + * client are in the very same format the volume is in. * * If VIR_STORAGE_VOL_DOWNLOAD_SPARSE_STREAM is set in @flags * effective transmission of holes is enabled. This assumes using @@ -1663,7 +1665,9 @@ virStorageVolDownload(virStorageVolPtr vol, * will fail if @offset + @length exceeds the size of the * volume. Otherwise, if @length is non-zero, an error * will be raised if an attempt is made to upload greater - * than @length bytes of data. + * than @length bytes of data. Please note, that the data is + * not interpreted and therefore data sent by stream client + * must be in the very same format the volume is in. * * If VIR_STORAGE_VOL_UPLOAD_SPARSE_STREAM is set in @flags * effective transmission of holes is enabled. This assumes using -- 2.26.2

On Tue, Jul 07, 2020 at 21:46:25 +0200, Michal Privoznik wrote:
For libvirt, the volume is just a binary blob and it doesn't interpret data on volume upload/download. But as it turns out, this unspoken assumption is not clear to our users. Document it explicitly.
Suggested in: https://bugzilla.redhat.com/show_bug.cgi?id=1851023#c17
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt-storage.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/src/libvirt-storage.c b/src/libvirt-storage.c index a45c8b98c1..8738f6dd14 100644 --- a/src/libvirt-storage.c +++ b/src/libvirt-storage.c @@ -1590,7 +1590,9 @@ virStorageVolCreateXMLFrom(virStoragePoolPtr pool, * * Download the content of the volume as a stream. If @length * is zero, then the remaining contents of the volume after - * @offset will be downloaded. + * @offset will be downloaded. Please note, that the data is + * not interpreted and therefore data received by stream + * client are in the very same format the volume is in.
I don't think this wording clarifies it that much as it's not obvious what's meant by 'interpreted'. How about: Please note that the stream transports the volume itself as is, so the downloaded data may not correspond to guest OS visible state in cases when a complex storage format such as qcow2 or vmdk is used.

On 8/20/20 10:35 AM, Peter Krempa wrote:
On Tue, Jul 07, 2020 at 21:46:25 +0200, Michal Privoznik wrote:
For libvirt, the volume is just a binary blob and it doesn't interpret data on volume upload/download. But as it turns out, this unspoken assumption is not clear to our users. Document it explicitly.
Suggested in: https://bugzilla.redhat.com/show_bug.cgi?id=1851023#c17
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt-storage.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/src/libvirt-storage.c b/src/libvirt-storage.c index a45c8b98c1..8738f6dd14 100644 --- a/src/libvirt-storage.c +++ b/src/libvirt-storage.c @@ -1590,7 +1590,9 @@ virStorageVolCreateXMLFrom(virStoragePoolPtr pool, * * Download the content of the volume as a stream. If @length * is zero, then the remaining contents of the volume after - * @offset will be downloaded. + * @offset will be downloaded. Please note, that the data is + * not interpreted and therefore data received by stream + * client are in the very same format the volume is in.
I don't think this wording clarifies it that much as it's not obvious what's meant by 'interpreted'.
How about:
Please note that the stream transports the volume itself as is, so the downloaded data may not correspond to guest OS visible state in cases when a complex storage format such as qcow2 or vmdk is used.
Fine by me. Michal

On Thu, Aug 20, 2020 at 14:29:21 +0200, Michal Privoznik wrote:
On 8/20/20 10:35 AM, Peter Krempa wrote:
On Tue, Jul 07, 2020 at 21:46:25 +0200, Michal Privoznik wrote:
For libvirt, the volume is just a binary blob and it doesn't interpret data on volume upload/download. But as it turns out, this unspoken assumption is not clear to our users. Document it explicitly.
Suggested in: https://bugzilla.redhat.com/show_bug.cgi?id=1851023#c17
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt-storage.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/src/libvirt-storage.c b/src/libvirt-storage.c index a45c8b98c1..8738f6dd14 100644 --- a/src/libvirt-storage.c +++ b/src/libvirt-storage.c @@ -1590,7 +1590,9 @@ virStorageVolCreateXMLFrom(virStoragePoolPtr pool, * * Download the content of the volume as a stream. If @length * is zero, then the remaining contents of the volume after - * @offset will be downloaded. + * @offset will be downloaded. Please note, that the data is + * not interpreted and therefore data received by stream + * client are in the very same format the volume is in.
I don't think this wording clarifies it that much as it's not obvious what's meant by 'interpreted'.
How about:
Please note that the stream transports the volume itself as is, so the downloaded data may not correspond to guest OS visible state in cases when a complex storage format such as qcow2 or vmdk is used.
Fine by me.
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Michal

This function will be used to detect zero buffers (which are going to be interpreted as hole in virStream later). I shamelessly took inspiration from coreutils. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virstring.c | 38 ++++++++++++++++++++++++++++++++ src/util/virstring.h | 2 ++ tests/virstringtest.c | 47 ++++++++++++++++++++++++++++++++++++++++ 4 files changed, 88 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index ae0e253ab9..1d80aeb833 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -3197,6 +3197,7 @@ virStringHasChars; virStringHasControlChars; virStringHasSuffix; virStringIsEmpty; +virStringIsNull; virStringIsPrintable; virStringListAdd; virStringListAutoFree; diff --git a/src/util/virstring.c b/src/util/virstring.c index e9e792f3bf..c26bc770d4 100644 --- a/src/util/virstring.c +++ b/src/util/virstring.c @@ -1404,3 +1404,41 @@ int virStringParseYesNo(const char *str, bool *result) return 0; } + + +/** + * virStringIsNull: + * @buf: buffer to check + * @len: the length of the buffer + * + * For given buffer @buf and its size @len determine whether + * it contains only zero bytes (NUL) or not. + * + * Returns: true if buffer is full of zero bytes, + * false otherwise. + */ +bool virStringIsNull(const char *buf, size_t len) +{ + const char *p = buf; + + if (!len) + return true; + + /* Check up to 16 first bytes. */ + for (;;) { + if (*p) + return false; + + p++; + len--; + + if (!len) + return true; + + if ((len & 0xf) == 0) + break; + } + + /* Now we know first 16 bytes are NUL, memcmp with self. */ + return memcmp(buf, p, len) == 0; +} diff --git a/src/util/virstring.h b/src/util/virstring.h index 360c68395c..d0e54358ac 100644 --- a/src/util/virstring.h +++ b/src/util/virstring.h @@ -185,6 +185,8 @@ int virStringParsePort(const char *str, int virStringParseYesNo(const char *str, bool *result) G_GNUC_WARN_UNUSED_RESULT; +bool virStringIsNull(const char *buf, size_t len); + /** * VIR_AUTOSTRINGLIST: * diff --git a/tests/virstringtest.c b/tests/virstringtest.c index bee49e6cb6..5838a57574 100644 --- a/tests/virstringtest.c +++ b/tests/virstringtest.c @@ -649,6 +649,27 @@ static int testFilterChars(const void *args) return ret; } +struct testIsNulData { + const char *buf; + size_t len; + bool nul; +}; + +static int +testIsNul(const void *args) +{ + const struct testIsNulData *data = args; + int rc; + + rc = virStringIsNull(data->buf, data->len); + if (rc != data->nul) { + fprintf(stderr, "Returned %d, expected %d\n", rc, data->nul); + return -1; + } + + return 0; +} + static int mymain(void) { @@ -977,6 +998,32 @@ mymain(void) TEST_FILTER_CHARS(NULL, NULL, NULL); TEST_FILTER_CHARS("hello 123 hello", "helo", "hellohello"); +#define TEST_IS_NUL(expect, ...) \ + do { \ + const char buf[] = {__VA_ARGS__ }; \ + struct testIsNulData isNulData = { \ + .buf = buf, \ + .len = G_N_ELEMENTS(buf), \ + .nul = expect \ + }; \ + if (virTestRun("isNul check", \ + testIsNul, &isNulData) < 0) \ + ret = -1; \ + } while (0) + + TEST_IS_NUL(true); + TEST_IS_NUL(true, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00); + TEST_IS_NUL(false, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01); + TEST_IS_NUL(false, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, + 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x01); + TEST_IS_NUL(false, + 0x00, 0x01, 0x00, 0x01, 0x00, 0x01, 0x00, 0x01, + 0x00, 0x01, 0x00, 0x01, 0x00, 0x01, 0x00, 0x01, + 0x00); + return ret == 0 ? EXIT_SUCCESS : EXIT_FAILURE; } -- 2.26.2

On Tue, Jul 07, 2020 at 21:46:26 +0200, Michal Privoznik wrote:
This function will be used to detect zero buffers (which are going to be interpreted as hole in virStream later).
I shamelessly took inspiration from coreutils.
Coreutils is proudly GPLv3 ...
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virstring.c | 38 ++++++++++++++++++++++++++++++++ src/util/virstring.h | 2 ++ tests/virstringtest.c | 47 ++++++++++++++++++++++++++++++++++++++++ 4 files changed, 88 insertions(+)
[...]
diff --git a/src/util/virstring.c b/src/util/virstring.c index e9e792f3bf..c26bc770d4 100644 --- a/src/util/virstring.c +++ b/src/util/virstring.c @@ -1404,3 +1404,41 @@ int virStringParseYesNo(const char *str, bool *result)
return 0; } + + +/** + * virStringIsNull:
IsNull might indicate that this does a check if the pointer is NULL. You are checking for NUL bytes.
+ * @buf: buffer to check + * @len: the length of the buffer + * + * For given buffer @buf and its size @len determine whether + * it contains only zero bytes (NUL) or not.
Given the semantics of C strings being terminated by the NUL byte I don't think this function qualifies as a string helper and thus should probably reside somewhere outside of virstring.h
+ * + * Returns: true if buffer is full of zero bytes, + * false otherwise. + */ +bool virStringIsNull(const char *buf, size_t len) +{ + const char *p = buf; + + if (!len) + return true; + + /* Check up to 16 first bytes. */ + for (;;) { + if (*p) + return false; + + p++; + len--; + + if (!len) + return true; + + if ((len & 0xf) == 0) + break; + }
Do we really need to do this optimization? We could arguably simplify this to: if (*buf != '\0') return false; return memcmp(buf, buf + 1, len - 1); You can then use the saved lines to explain that comparing a piece of memory with itself shifted by any position just ensures that there are repeating sequences of itself in the remainder and by shifting it by 1 it means that it checks that the strings are just the same byte. The check above then ensuers that the one byte is NUL.

On Thu, Aug 20, 2020 at 01:02:55PM +0200, Peter Krempa wrote:
On Tue, Jul 07, 2020 at 21:46:26 +0200, Michal Privoznik wrote:
This function will be used to detect zero buffers (which are going to be interpreted as hole in virStream later).
I shamelessly took inspiration from coreutils.
Coreutils is proudly GPLv3 ...
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virstring.c | 38 ++++++++++++++++++++++++++++++++ src/util/virstring.h | 2 ++ tests/virstringtest.c | 47 ++++++++++++++++++++++++++++++++++++++++ 4 files changed, 88 insertions(+)
[...]
diff --git a/src/util/virstring.c b/src/util/virstring.c index e9e792f3bf..c26bc770d4 100644 --- a/src/util/virstring.c +++ b/src/util/virstring.c @@ -1404,3 +1404,41 @@ int virStringParseYesNo(const char *str, bool *result)
return 0; } + + +/** + * virStringIsNull:
IsNull might indicate that this does a check if the pointer is NULL. You are checking for NUL bytes.
+ * @buf: buffer to check + * @len: the length of the buffer + * + * For given buffer @buf and its size @len determine whether + * it contains only zero bytes (NUL) or not.
Given the semantics of C strings being terminated by the NUL byte I don't think this function qualifies as a string helper and thus should probably reside somewhere outside of virstring.h
+ * + * Returns: true if buffer is full of zero bytes, + * false otherwise. + */ +bool virStringIsNull(const char *buf, size_t len) +{ + const char *p = buf; + + if (!len) + return true; + + /* Check up to 16 first bytes. */ + for (;;) { + if (*p) + return false; + + p++; + len--; + + if (!len) + return true; + + if ((len & 0xf) == 0) + break; + }
Do we really need to do this optimization? We could arguably simplify this to:
if (*buf != '\0') return false;
return memcmp(buf, buf + 1, len - 1);
Depends whether we care about this sparsification having goood performance or not. As a point of reference, QEMU has invested tonnes of effort in its impl, using highly tuned impls for SSE, AVX, etc switched at runtime. Regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Thu, Aug 20, 2020 at 13:17:42 +0100, Daniel Berrange wrote:
On Thu, Aug 20, 2020 at 01:02:55PM +0200, Peter Krempa wrote:
On Tue, Jul 07, 2020 at 21:46:26 +0200, Michal Privoznik wrote:
This function will be used to detect zero buffers (which are going to be interpreted as hole in virStream later).
I shamelessly took inspiration from coreutils.
Coreutils is proudly GPLv3 ...
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> ---
[...]
Do we really need to do this optimization? We could arguably simplify this to:
if (*buf != '\0') return false;
return memcmp(buf, buf + 1, len - 1);
Depends whether we care about this sparsification having goood performance or not. As a point of reference, QEMU has invested tonnes of effort in its impl, using highly tuned impls for SSE, AVX, etc switched at runtime.
Well, comments to other patches in this series question whether we actually want to do explicit sparsification at all. Users always can explicitly sparsify their storage using the qemu tools and the implementation this series adds (the sparsification on read) is IMO actually stretching the semantics of the _SPARSE_STREAM flag too much and creating different behaviour depending on whether a block device is used (full sparsification) or a file is used (sparseness of the file is preserved, explicitly allocated but zeroed blocks are transported in full).

On 8/20/20 1:02 PM, Peter Krempa wrote:
On Tue, Jul 07, 2020 at 21:46:26 +0200, Michal Privoznik wrote:
This function will be used to detect zero buffers (which are going to be interpreted as hole in virStream later).
I shamelessly took inspiration from coreutils.
Coreutils is proudly GPLv3 ...
Sure. But it was discussed in v1 and I think we agreed that the algorithm is generic enough that it can be used in Libvirt too: https://www.redhat.com/archives/libvir-list/2020-July/msg00170.html
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virstring.c | 38 ++++++++++++++++++++++++++++++++ src/util/virstring.h | 2 ++ tests/virstringtest.c | 47 ++++++++++++++++++++++++++++++++++++++++ 4 files changed, 88 insertions(+)
[...]
diff --git a/src/util/virstring.c b/src/util/virstring.c index e9e792f3bf..c26bc770d4 100644 --- a/src/util/virstring.c +++ b/src/util/virstring.c @@ -1404,3 +1404,41 @@ int virStringParseYesNo(const char *str, bool *result)
return 0; } + + +/** + * virStringIsNull:
IsNull might indicate that this does a check if the pointer is NULL. You are checking for NUL bytes.
Fair enough. I'm out of ideas though. Do you have a suggestion?
+ * @buf: buffer to check + * @len: the length of the buffer + * + * For given buffer @buf and its size @len determine whether + * it contains only zero bytes (NUL) or not.
Given the semantics of C strings being terminated by the NUL byte I don't think this function qualifies as a string helper and thus should probably reside somewhere outside of virstring.h
Alright. I will try to find better location. But since I want to use this function from virsh too I'd like to have it in utils.
+ * + * Returns: true if buffer is full of zero bytes, + * false otherwise. + */ +bool virStringIsNull(const char *buf, size_t len) +{ + const char *p = buf; + + if (!len) + return true; + + /* Check up to 16 first bytes. */ + for (;;) { + if (*p) + return false; + + p++; + len--; + + if (!len) + return true; + + if ((len & 0xf) == 0) + break; + }
Do we really need to do this optimization? We could arguably simplify this to:
if (*buf != '\0') return false;
return memcmp(buf, buf + 1, len - 1);
You can then use the saved lines to explain that comparing a piece of memory with itself shifted by any position just ensures that there are repeating sequences of itself in the remainder and by shifting it by 1 it means that it checks that the strings are just the same byte. The check above then ensuers that the one byte is NUL.
The idea is to pass aligned address to memcmp(). But I guess we can let memcmp() deal with that. Michal

On Thu, Aug 20, 2020 at 14:36:15 +0200, Michal Privoznik wrote:
On 8/20/20 1:02 PM, Peter Krempa wrote:
On Tue, Jul 07, 2020 at 21:46:26 +0200, Michal Privoznik wrote:
This function will be used to detect zero buffers (which are going to be interpreted as hole in virStream later).
I shamelessly took inspiration from coreutils.
Coreutils is proudly GPLv3 ...
Sure. But it was discussed in v1 and I think we agreed that the algorithm is generic enough that it can be used in Libvirt too:
https://www.redhat.com/archives/libvir-list/2020-July/msg00170.html
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virstring.c | 38 ++++++++++++++++++++++++++++++++ src/util/virstring.h | 2 ++ tests/virstringtest.c | 47 ++++++++++++++++++++++++++++++++++++++++ 4 files changed, 88 insertions(+)
[...]
diff --git a/src/util/virstring.c b/src/util/virstring.c index e9e792f3bf..c26bc770d4 100644 --- a/src/util/virstring.c +++ b/src/util/virstring.c @@ -1404,3 +1404,41 @@ int virStringParseYesNo(const char *str, bool *result) return 0; } + + +/** + * virStringIsNull:
IsNull might indicate that this does a check if the pointer is NULL. You are checking for NUL bytes.
Fair enough. I'm out of ideas though. Do you have a suggestion?
+ * @buf: buffer to check + * @len: the length of the buffer + * + * For given buffer @buf and its size @len determine whether + * it contains only zero bytes (NUL) or not.
Given the semantics of C strings being terminated by the NUL byte I don't think this function qualifies as a string helper and thus should probably reside somewhere outside of virstring.h
Alright. I will try to find better location. But since I want to use this function from virsh too I'd like to have it in utils.
+ * + * Returns: true if buffer is full of zero bytes, + * false otherwise. + */ +bool virStringIsNull(const char *buf, size_t len) +{ + const char *p = buf; + + if (!len) + return true; + + /* Check up to 16 first bytes. */ + for (;;) { + if (*p) + return false; + + p++; + len--; + + if (!len) + return true; + + if ((len & 0xf) == 0) + break; + }
Do we really need to do this optimization? We could arguably simplify this to:
if (*buf != '\0') return false;
return memcmp(buf, buf + 1, len - 1);
You can then use the saved lines to explain that comparing a piece of memory with itself shifted by any position just ensures that there are repeating sequences of itself in the remainder and by shifting it by 1 it means that it checks that the strings are just the same byte. The check above then ensuers that the one byte is NUL.
The idea is to pass aligned address to memcmp(). But I guess we can let memcmp() deal with that.
Well, this explanation might justify the algorithm above, but it's certainly not obvious, so please add a comment.

For given file descriptor determine if the current position it is in plus 1MiB (arbitrary chosen value) consists solely from zero bytes or not. This is a block device friendly version of virFileInData(). Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/libvirt_private.syms | 1 + src/util/virfile.c | 67 ++++++++++++++++++++++++++++++++++++++++ src/util/virfile.h | 4 +++ 3 files changed, 72 insertions(+) diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms index 1d80aeb833..6b5a751788 100644 --- a/src/libvirt_private.syms +++ b/src/libvirt_private.syms @@ -2070,6 +2070,7 @@ virFileGetMountSubtree; virFileGetXAttr; virFileGetXAttrQuiet; virFileInData; +virFileInDataDetectZeroes; virFileIsCDROM; virFileIsDir; virFileIsExecutable; diff --git a/src/util/virfile.c b/src/util/virfile.c index c034df5931..a35d9ccb7a 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -4064,6 +4064,73 @@ virFileInData(int fd G_GNUC_UNUSED, #endif /* !HAVE_DECL_SEEK_HOLE */ +#define DETECT_ZEORES_BLOCK_SIZE (1 * 1024 * 1024) + +/* + * virFileInDataDetectZeroes: + * @fd: file to check + * @inData: true if current position in the @fd is in data section + * @length: amount of bytes until the end of the current section + * + * This behaves exactly like virFileInData() except it doesn't use SEEK_DATA + * and SEEK_HOLE rather than a buffer into which it reads data from @fd and + * detects if the buffer is full of zeroes. Therefore, it is safe to use on + * special files (e.g. block devices). On the other hand it sees only + * DETECT_ZEORES_BLOCK_SIZE ahead. + * + * Returns: 0 on success, + * -1 otherwise. + */ +int +virFileInDataDetectZeroes(int fd, + int *inData, + long long *length) +{ + const size_t buflen = DETECT_ZEORES_BLOCK_SIZE; + g_autofree char *bytes = NULL; + off_t cur; + ssize_t r; + int ret = -1; + + /* Get current position */ + cur = lseek(fd, 0, SEEK_CUR); + if (cur == (off_t) -1) { + virReportSystemError(errno, "%s", + _("Unable to get current position in file")); + goto cleanup; + } + + bytes = g_new0(char, buflen); + + if ((r = saferead(fd, bytes, buflen)) < 0) { + virReportSystemError(errno, "%s", + _("Unable to read from file")); + goto cleanup; + } + + *inData = !virStringIsNull(bytes, r); + *length = r; + ret = 0; + + cleanup: + /* At any rate, reposition back to where we started. */ + if (cur != (off_t) -1) { + int theerrno = errno; + + if (lseek(fd, cur, SEEK_SET) == (off_t) -1) { + virReportSystemError(errno, "%s", + _("unable to restore position in file")); + ret = -1; + if (theerrno == 0) + theerrno = errno; + } + + errno = theerrno; + } + return ret; +} + + /** * virFileReadValueInt: * @value: pointer to int to be filled in with the value diff --git a/src/util/virfile.h b/src/util/virfile.h index 7a92364a5c..9a5eade609 100644 --- a/src/util/virfile.h +++ b/src/util/virfile.h @@ -352,6 +352,10 @@ int virFileInData(int fd, int *inData, long long *length); +int virFileInDataDetectZeroes(int fd, + int *inData, + long long *length); + G_DEFINE_AUTOPTR_CLEANUP_FUNC(virFileWrapperFd, virFileWrapperFdFree); int virFileGetXAttr(const char *path, -- 2.26.2

These callback will need to know more that the FD they are working on. Pass the structure that is passed to other stream callbacks (e.g. virshStreamSource() or virshStreamSourceSkip()) instead of inventing a new one. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- tools/virsh-util.c | 10 +++++----- tools/virsh-volume.c | 6 +++++- 2 files changed, 10 insertions(+), 6 deletions(-) diff --git a/tools/virsh-util.c b/tools/virsh-util.c index 932d6d0849..89f15efd08 100644 --- a/tools/virsh-util.c +++ b/tools/virsh-util.c @@ -146,9 +146,9 @@ virshStreamSink(virStreamPtr st G_GNUC_UNUSED, size_t nbytes, void *opaque) { - int *fd = opaque; + virshStreamCallbackDataPtr cbData = opaque; - return safewrite(*fd, bytes, nbytes); + return safewrite(cbData->fd, bytes, nbytes); } @@ -186,13 +186,13 @@ virshStreamSkip(virStreamPtr st G_GNUC_UNUSED, long long offset, void *opaque) { - int *fd = opaque; + virshStreamCallbackDataPtr cbData = opaque; off_t cur; - if ((cur = lseek(*fd, offset, SEEK_CUR)) == (off_t) -1) + if ((cur = lseek(cbData->fd, offset, SEEK_CUR)) == (off_t) -1) return -1; - if (ftruncate(*fd, cur) < 0) + if (ftruncate(cbData->fd, cur) < 0) return -1; return 0; diff --git a/tools/virsh-volume.c b/tools/virsh-volume.c index 72394915d8..5cbc2efb7a 100644 --- a/tools/virsh-volume.c +++ b/tools/virsh-volume.c @@ -793,6 +793,7 @@ cmdVolDownload(vshControl *ctl, const vshCmd *cmd) unsigned long long offset = 0, length = 0; bool created = false; virshControlPtr priv = ctl->privData; + virshStreamCallbackData cbData; unsigned int flags = 0; if (vshCommandOptULongLong(ctl, cmd, "offset", &offset) < 0) @@ -820,6 +821,9 @@ cmdVolDownload(vshControl *ctl, const vshCmd *cmd) created = true; } + cbData.ctl = ctl; + cbData.fd = fd; + if (!(st = virStreamNew(priv->conn, 0))) { vshError(ctl, _("cannot create a new stream")); goto cleanup; @@ -830,7 +834,7 @@ cmdVolDownload(vshControl *ctl, const vshCmd *cmd) goto cleanup; } - if (virStreamSparseRecvAll(st, virshStreamSink, virshStreamSkip, &fd) < 0) { + if (virStreamSparseRecvAll(st, virshStreamSink, virshStreamSkip, &cbData) < 0) { vshError(ctl, _("cannot receive data from volume %s"), name); goto cleanup; } -- 2.26.2

On Tue, Jul 07, 2020 at 21:46:28 +0200, Michal Privoznik wrote:
These callback will need to know more that the FD they are working on. Pass the structure that is passed to other stream callbacks (e.g. virshStreamSource() or virshStreamSourceSkip()) instead of inventing a new one.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- tools/virsh-util.c | 10 +++++----- tools/virsh-volume.c | 6 +++++- 2 files changed, 10 insertions(+), 6 deletions(-)
Reviewed-by: Peter Krempa <pkrempa@redhat.com>

We can't use virFileInData() with block devices, but we could use new virFileInDataDetectZeroes(). But to decide we need to know if the FD we are reading data from / writing data to is a block device. Store this information in _virshStreamCallbackData. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- tools/virsh-util.h | 1 + tools/virsh-volume.c | 14 ++++++++++++++ 2 files changed, 15 insertions(+) diff --git a/tools/virsh-util.h b/tools/virsh-util.h index 72653d9735..9ef28cfe0a 100644 --- a/tools/virsh-util.h +++ b/tools/virsh-util.h @@ -72,6 +72,7 @@ typedef virshStreamCallbackData *virshStreamCallbackDataPtr; struct _virshStreamCallbackData { vshControl *ctl; int fd; + bool isBlock; }; int diff --git a/tools/virsh-volume.c b/tools/virsh-volume.c index 5cbc2efb7a..20823e2d10 100644 --- a/tools/virsh-volume.c +++ b/tools/virsh-volume.c @@ -679,6 +679,7 @@ cmdVolUpload(vshControl *ctl, const vshCmd *cmd) virshControlPtr priv = ctl->privData; unsigned int flags = 0; virshStreamCallbackData cbData; + struct stat sb; if (vshCommandOptULongLong(ctl, cmd, "offset", &offset) < 0) return false; @@ -697,8 +698,14 @@ cmdVolUpload(vshControl *ctl, const vshCmd *cmd) goto cleanup; } + if (fstat(fd, &sb) < 0) { + vshError(ctl, _("unable to stat %s"), file); + goto cleanup; + } + cbData.ctl = ctl; cbData.fd = fd; + cbData.isBlock = !!S_ISBLK(sb.st_mode); if (vshCommandOptBool(cmd, "sparse")) flags |= VIR_STORAGE_VOL_UPLOAD_SPARSE_STREAM; @@ -795,6 +802,7 @@ cmdVolDownload(vshControl *ctl, const vshCmd *cmd) virshControlPtr priv = ctl->privData; virshStreamCallbackData cbData; unsigned int flags = 0; + struct stat sb; if (vshCommandOptULongLong(ctl, cmd, "offset", &offset) < 0) return false; @@ -821,8 +829,14 @@ cmdVolDownload(vshControl *ctl, const vshCmd *cmd) created = true; } + if (fstat(fd, &sb) < 0) { + vshError(ctl, _("unable to stat %s"), file); + goto cleanup; + } + cbData.ctl = ctl; cbData.fd = fd; + cbData.isBlock = !!S_ISBLK(sb.st_mode); if (!(st = virStreamNew(priv->conn, 0))) { vshError(ctl, _("cannot create a new stream")); -- 2.26.2

On Tue, Jul 07, 2020 at 21:46:29 +0200, Michal Privoznik wrote:
We can't use virFileInData() with block devices, but we could use new virFileInDataDetectZeroes(). But to decide we need to know if the FD we are reading data from / writing data to is a block device. Store this information in _virshStreamCallbackData.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- tools/virsh-util.h | 1 + tools/virsh-volume.c | 14 ++++++++++++++ 2 files changed, 15 insertions(+)
Reviewed-by: Peter Krempa <pkrempa@redhat.com>

This callback is called when the server sends us STREAM_HOLE meaning there is no real data, only zeroes. For regular files we would just seek() beyond EOF and ftruncate() to create the hole. But for block devices this won't work. Not only we can't seek() beyond EOF, and ftruncate() will fail, this approach won't fill the device with zeroes. We have to do it manually. Partially resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1852528 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- tools/virsh-util.c | 30 ++++++++++++++++++++++++++---- 1 file changed, 26 insertions(+), 4 deletions(-) diff --git a/tools/virsh-util.c b/tools/virsh-util.c index 89f15efd08..884261eb49 100644 --- a/tools/virsh-util.c +++ b/tools/virsh-util.c @@ -189,11 +189,33 @@ virshStreamSkip(virStreamPtr st G_GNUC_UNUSED, virshStreamCallbackDataPtr cbData = opaque; off_t cur; - if ((cur = lseek(cbData->fd, offset, SEEK_CUR)) == (off_t) -1) - return -1; + if (cbData->isBlock) { + g_autofree char * buf = NULL; + const size_t buflen = 1 * 1024 * 1024; /* 1MiB */ - if (ftruncate(cbData->fd, cur) < 0) - return -1; + /* While for files it's enough to lseek() and ftruncate() to create + * a hole which would emulate zeroes on read(), for block devices + * we have to write zeroes to read() zeroes. And we have to write + * @got bytes of zeroes. Do that in smaller chunks though.*/ + + buf = g_new0(char, buflen); + + while (offset) { + size_t count = MIN(offset, buflen); + ssize_t r; + + if ((r = safewrite(cbData->fd, buf, count)) < 0) + return -1; + + offset -= r; + } + } else { + if ((cur = lseek(cbData->fd, offset, SEEK_CUR)) == (off_t) -1) + return -1; + + if (ftruncate(cbData->fd, cur) < 0) + return -1; + } return 0; } -- 2.26.2

On Tue, Jul 07, 2020 at 21:46:30 +0200, Michal Privoznik wrote:
This callback is called when the server sends us STREAM_HOLE meaning there is no real data, only zeroes. For regular files we would just seek() beyond EOF and ftruncate() to create the hole. But for block devices this won't work. Not only we can't seek() beyond EOF, and ftruncate() will fail, this approach won't fill the device with zeroes. We have to do it manually.
Partially resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1852528
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- tools/virsh-util.c | 30 ++++++++++++++++++++++++++---- 1 file changed, 26 insertions(+), 4 deletions(-)
Reviewed-by: Peter Krempa <pkrempa@redhat.com>

When handling sparse stream, a thread is executed. This thread runs a read() or write() loop (depending what API is called; in this case it's virStorageVolDownload() and this the thread run read() loop). The read() is handled in virFDStreamThreadDoRead() which is then data/hole section aware, meaning it uses virFileInData() to detect data and hole sections and sends TYPE_DATA or TYPE_HOLE virStream messages accordingly. However, virFileInData() does not work with block devices. Simply because block devices don't have data and hole sections. But we can use new virFileInDataDetectZeroes() which is block device friendly for that. Partially resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1852528 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/src/util/virfdstream.c b/src/util/virfdstream.c index 50209a8bd4..44b4855549 100644 --- a/src/util/virfdstream.c +++ b/src/util/virfdstream.c @@ -398,6 +398,7 @@ struct _virFDStreamThreadData { size_t length; bool doRead; bool sparse; + bool isBlock; int fdin; char *fdinname; int fdout; @@ -421,6 +422,7 @@ virFDStreamThreadDataFree(virFDStreamThreadDataPtr data) static ssize_t virFDStreamThreadDoRead(virFDStreamDataPtr fdst, bool sparse, + bool isBlock, const int fdin, const int fdout, const char *fdinname, @@ -437,8 +439,13 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, ssize_t got; if (sparse && *dataLen == 0) { - if (virFileInData(fdin, &inData, §ionLen) < 0) - return -1; + if (isBlock) { + if (virFileInDataDetectZeroes(fdin, &inData, §ionLen) < 0) + return -1; + } else { + if (virFileInData(fdin, &inData, §ionLen) < 0) + return -1; + } if (length && sectionLen > length - total) @@ -568,6 +575,7 @@ virFDStreamThread(void *opaque) virStreamPtr st = data->st; size_t length = data->length; bool sparse = data->sparse; + bool isBlock = data->isBlock; VIR_AUTOCLOSE fdin = data->fdin; char *fdinname = data->fdinname; VIR_AUTOCLOSE fdout = data->fdout; @@ -604,7 +612,7 @@ virFDStreamThread(void *opaque) } if (doRead) - got = virFDStreamThreadDoRead(fdst, sparse, + got = virFDStreamThreadDoRead(fdst, sparse, isBlock, fdin, fdout, fdinname, fdoutname, length, total, @@ -1271,6 +1279,7 @@ virFDStreamOpenFileInternal(virStreamPtr st, threadData->st = virObjectRef(st); threadData->length = length; threadData->sparse = sparse; + threadData->isBlock = !!S_ISBLK(sb.st_mode); if ((oflags & O_ACCMODE) == O_RDONLY) { threadData->fdin = fd; -- 2.26.2

On Tue, Jul 07, 2020 at 21:46:31 +0200, Michal Privoznik wrote:
When handling sparse stream, a thread is executed. This thread runs a read() or write() loop (depending what API is called; in this case it's virStorageVolDownload() and this the thread run read() loop). The read() is handled in virFDStreamThreadDoRead() which is then data/hole section aware, meaning it uses virFileInData() to detect data and hole sections and sends TYPE_DATA or TYPE_HOLE virStream messages accordingly.
However, virFileInData() does not work with block devices. Simply because block devices don't have data and hole sections. But we can use new virFileInDataDetectZeroes() which is block device friendly for that.
Partially resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1852528
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)
IMO this goes against the semantics of the _SPARSE_STREAM flag. A block device by definition is not sparse, so there are no holes to send. What you've implemented is a way to sparsify a block device, but that IMO should not be considered by default when a block device is used. If a file is not sparse, the previous code doesn't actually transmit holes either. If you want to achieve sparsification on the source side of the transmission, this IMO needs an explicit flag to opt-in and then we should sparsify also regular files using the same algorithm.

On 8/20/20 1:57 PM, Peter Krempa wrote:
On Tue, Jul 07, 2020 at 21:46:31 +0200, Michal Privoznik wrote:
When handling sparse stream, a thread is executed. This thread runs a read() or write() loop (depending what API is called; in this case it's virStorageVolDownload() and this the thread run read() loop). The read() is handled in virFDStreamThreadDoRead() which is then data/hole section aware, meaning it uses virFileInData() to detect data and hole sections and sends TYPE_DATA or TYPE_HOLE virStream messages accordingly.
However, virFileInData() does not work with block devices. Simply because block devices don't have data and hole sections. But we can use new virFileInDataDetectZeroes() which is block device friendly for that.
Partially resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1852528
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)
IMO this goes against the semantics of the _SPARSE_STREAM flag. A block device by definition is not sparse, so there are no holes to send.
What you've implemented is a way to sparsify a block device, but that IMO should not be considered by default when a block device is used. If a file is not sparse, the previous code doesn't actually transmit holes either.
If you want to achieve sparsification on the source side of the transmission, this IMO needs an explicit flag to opt-in and then we should sparsify also regular files using the same algorithm.
Fair enough. So how about I'll send v3 where: a) in the first patches I make our stream read/write functions handle block devices for _SPARSE_STREAM without any zero block detection. Only thing that will happen is that if the source is a sparse regular file and thus the stream receiver gets a HOLE packet and it is writing the data into a block device it will have to emulate the hole by writing block of zeroes. However, if the stream source is a block device then no HOLE shall ever be sent. b) in next patches I'll introduce _DETECT_ZEROES flag (and possibly make it require _SPARSE_STREAM too) which will handle the case where the stream source is a block device with zero blocks, at which point it will try to detect them and be allowed to send HOLE down the stream. Does this sound reasonable? BTW: I've pushed the first N patches which were just switching FDStream to glib. Michal

On Thu, Aug 20, 2020 at 15:31:28 +0200, Michal Privoznik wrote:
On 8/20/20 1:57 PM, Peter Krempa wrote:
On Tue, Jul 07, 2020 at 21:46:31 +0200, Michal Privoznik wrote:
When handling sparse stream, a thread is executed. This thread runs a read() or write() loop (depending what API is called; in this case it's virStorageVolDownload() and this the thread run read() loop). The read() is handled in virFDStreamThreadDoRead() which is then data/hole section aware, meaning it uses virFileInData() to detect data and hole sections and sends TYPE_DATA or TYPE_HOLE virStream messages accordingly.
However, virFileInData() does not work with block devices. Simply because block devices don't have data and hole sections. But we can use new virFileInDataDetectZeroes() which is block device friendly for that.
Partially resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1852528
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)
IMO this goes against the semantics of the _SPARSE_STREAM flag. A block device by definition is not sparse, so there are no holes to send.
What you've implemented is a way to sparsify a block device, but that IMO should not be considered by default when a block device is used. If a file is not sparse, the previous code doesn't actually transmit holes either.
If you want to achieve sparsification on the source side of the transmission, this IMO needs an explicit flag to opt-in and then we should sparsify also regular files using the same algorithm.
Fair enough. So how about I'll send v3 where:
a) in the first patches I make our stream read/write functions handle block devices for _SPARSE_STREAM without any zero block detection. Only thing that will happen is that if the source is a sparse regular file and thus the stream receiver gets a HOLE packet and it is writing the data into a block device it will have to emulate the hole by writing block of zeroes. However, if the stream source is a block device then no HOLE shall ever be sent.
AFAIK I've R-b'd enough patches to fix this portion and provided that there aren't any merge conflicts you can already commit those. I'm completely fine with that portion as-is.
b) in next patches I'll introduce _DETECT_ZEROES flag (and possibly make it require _SPARSE_STREAM too) which will handle the case where the stream source is a block device with zero blocks, at which point it will try to detect them and be allowed to send HOLE down the stream.
On this topic, I agree that it's a sensible approach for the rest of the series and it at least unifies the behaviour. I'm unsure though whether it's worth even doing _DETECT_ZEROES feature at all though. To me it feels that the users are better off using other tools rather than re-implementing yet another thing in libvirt. If possible provide some additional justification here.

On 8/20/20 3:42 PM, Peter Krempa wrote:
On Thu, Aug 20, 2020 at 15:31:28 +0200, Michal Privoznik wrote:
On 8/20/20 1:57 PM, Peter Krempa wrote:
On Tue, Jul 07, 2020 at 21:46:31 +0200, Michal Privoznik wrote:
When handling sparse stream, a thread is executed. This thread runs a read() or write() loop (depending what API is called; in this case it's virStorageVolDownload() and this the thread run read() loop). The read() is handled in virFDStreamThreadDoRead() which is then data/hole section aware, meaning it uses virFileInData() to detect data and hole sections and sends TYPE_DATA or TYPE_HOLE virStream messages accordingly.
However, virFileInData() does not work with block devices. Simply because block devices don't have data and hole sections. But we can use new virFileInDataDetectZeroes() which is block device friendly for that.
Partially resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1852528
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)
IMO this goes against the semantics of the _SPARSE_STREAM flag. A block device by definition is not sparse, so there are no holes to send.
What you've implemented is a way to sparsify a block device, but that IMO should not be considered by default when a block device is used. If a file is not sparse, the previous code doesn't actually transmit holes either.
If you want to achieve sparsification on the source side of the transmission, this IMO needs an explicit flag to opt-in and then we should sparsify also regular files using the same algorithm.
Fair enough. So how about I'll send v3 where:
a) in the first patches I make our stream read/write functions handle block devices for _SPARSE_STREAM without any zero block detection. Only thing that will happen is that if the source is a sparse regular file and thus the stream receiver gets a HOLE packet and it is writing the data into a block device it will have to emulate the hole by writing block of zeroes. However, if the stream source is a block device then no HOLE shall ever be sent.
AFAIK I've R-b'd enough patches to fix this portion and provided that there aren't any merge conflicts you can already commit those.
I'm completely fine with that portion as-is.
Almost :-) For instance this very patch uses virFileInDataDetectZeroes() to detect zero blocks on block devices. It needs to be changed to always assume data section and some length. The same applies to the next patch 14/17. But the diff is trivial: iff --git c/src/util/virfdstream.c w/src/util/virfdstream.c index 9968cdc623..39514ef555 100644 --- c/src/util/virfdstream.c +++ w/src/util/virfdstream.c @@ -440,8 +440,15 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, if (sparse && *dataLen == 0) { if (isBlock) { - if (virFileInDataDetectZeroes(fdin, &inData, §ionLen) < 0) - return -1; + /* Block devices are always in data section by definition. The + * @sectionLen is slightly more tricky. While we could try and get + * how much bytes is there left until EOF, we can pretend there is + * always X bytes left and let the saferead() below hit EOF (which + * is then handled gracefully anyway). Worst case scenario, this + * branch is called more than once. + * X was chosen to be 1MiB but it has ho special meaning. */ + inData = 1; + sectionLen = 1 * 1024 * 1024; And the same for virsh case. Do you want me to resend those two patches?
b) in next patches I'll introduce _DETECT_ZEROES flag (and possibly make it require _SPARSE_STREAM too) which will handle the case where the stream source is a block device with zero blocks, at which point it will try to detect them and be allowed to send HOLE down the stream.
On this topic, I agree that it's a sensible approach for the rest of the series and it at least unifies the behaviour.
I'm unsure though whether it's worth even doing _DETECT_ZEROES feature at all though. To me it feels that the users are better off using other tools rather than re-implementing yet another thing in libvirt.
Alright. Fair enough I guess.
If possible provide some additional justification here.
It was discussed in the bz https://bugzilla.redhat.com/show_bug.cgi?id=1852528 VDSM is doing a thin provisioning and as a part of that they are copying files onto block devices. But for that zero detection shouldn't be needed. Michal

On Fri, Aug 21, 2020 at 11:19:40 +0200, Michal Privoznik wrote:
On 8/20/20 3:42 PM, Peter Krempa wrote:
On Thu, Aug 20, 2020 at 15:31:28 +0200, Michal Privoznik wrote:
On 8/20/20 1:57 PM, Peter Krempa wrote:
On Tue, Jul 07, 2020 at 21:46:31 +0200, Michal Privoznik wrote:
When handling sparse stream, a thread is executed. This thread runs a read() or write() loop (depending what API is called; in this case it's virStorageVolDownload() and this the thread run read() loop). The read() is handled in virFDStreamThreadDoRead() which is then data/hole section aware, meaning it uses virFileInData() to detect data and hole sections and sends TYPE_DATA or TYPE_HOLE virStream messages accordingly.
However, virFileInData() does not work with block devices. Simply because block devices don't have data and hole sections. But we can use new virFileInDataDetectZeroes() which is block device friendly for that.
Partially resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1852528
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-)
IMO this goes against the semantics of the _SPARSE_STREAM flag. A block device by definition is not sparse, so there are no holes to send.
What you've implemented is a way to sparsify a block device, but that IMO should not be considered by default when a block device is used. If a file is not sparse, the previous code doesn't actually transmit holes either.
If you want to achieve sparsification on the source side of the transmission, this IMO needs an explicit flag to opt-in and then we should sparsify also regular files using the same algorithm.
Fair enough. So how about I'll send v3 where:
a) in the first patches I make our stream read/write functions handle block devices for _SPARSE_STREAM without any zero block detection. Only thing that will happen is that if the source is a sparse regular file and thus the stream receiver gets a HOLE packet and it is writing the data into a block device it will have to emulate the hole by writing block of zeroes. However, if the stream source is a block device then no HOLE shall ever be sent.
AFAIK I've R-b'd enough patches to fix this portion and provided that there aren't any merge conflicts you can already commit those.
I'm completely fine with that portion as-is.
Almost :-) For instance this very patch uses virFileInDataDetectZeroes() to detect zero blocks on block devices. It needs to be changed to always assume data section and some length. The same applies to the next patch 14/17. But the diff is trivial:
iff --git c/src/util/virfdstream.c w/src/util/virfdstream.c index 9968cdc623..39514ef555 100644 --- c/src/util/virfdstream.c +++ w/src/util/virfdstream.c @@ -440,8 +440,15 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst,
if (sparse && *dataLen == 0) { if (isBlock) { - if (virFileInDataDetectZeroes(fdin, &inData, §ionLen) < 0) - return -1; + /* Block devices are always in data section by definition. The + * @sectionLen is slightly more tricky. While we could try and get + * how much bytes is there left until EOF, we can pretend there is + * always X bytes left and let the saferead() below hit EOF (which + * is then handled gracefully anyway). Worst case scenario, this + * branch is called more than once. + * X was chosen to be 1MiB but it has ho special meaning. */ + inData = 1; + sectionLen = 1 * 1024 * 1024;
And the same for virsh case. Do you want me to resend those two patches?
Well, for a substantial change like this it should be the default approach. You can use: Reviewed-by: Peter Krempa <pkrempa@redhat.com> Don't forget to send the final version to the list though.

This is very similar to previous commit. The virshStreamInData() callback is used by virStreamSparseSendAll() to detect whether the file the data is read from is in data or hole section. The SendAll() will then send corresponding type of virStream message to make server create a hole or write actual data. But the callback uses virFileInData() even for block devices, which results in an error. Switch to virFileInDataDetectZeroes() for block devices. Partially resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1852528 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- tools/virsh-util.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-) diff --git a/tools/virsh-util.c b/tools/virsh-util.c index 884261eb49..867f69577f 100644 --- a/tools/virsh-util.c +++ b/tools/virsh-util.c @@ -230,12 +230,20 @@ virshStreamInData(virStreamPtr st G_GNUC_UNUSED, virshStreamCallbackDataPtr cbData = opaque; vshControl *ctl = cbData->ctl; int fd = cbData->fd; - int ret; - if ((ret = virFileInData(fd, inData, offset)) < 0) - vshError(ctl, "%s", _("Unable to get current position in stream")); + if (cbData->isBlock) { + if (virFileInDataDetectZeroes(fd, inData, offset) < 0) { + vshError(ctl, "%s", _("Unable to get current position in stream")); + return -1; + } + } else { + if (virFileInData(fd, inData, offset) < 0) { + vshError(ctl, "%s", _("Unable to get current position in stream")); + return -1; + } + } - return ret; + return 0; } -- 2.26.2

This is similar to one of previous patches. When receiving stream (on virStorageVolUpload() and subsequent virStreamSparseSendAll()) we may receive a hole. If the volume we are saving the incoming data into is a regular file we just lseek() and ftruncate() to create the hole. But this won't work if the file is a block device. If that is the case we must write zeroes so that any subsequent reader reads nothing just zeroes (just like they would from a hole in a regular file). Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1852528 Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 59 +++++++++++++++++++++++++++++++----------- 1 file changed, 44 insertions(+), 15 deletions(-) diff --git a/src/util/virfdstream.c b/src/util/virfdstream.c index 44b4855549..40825a8811 100644 --- a/src/util/virfdstream.c +++ b/src/util/virfdstream.c @@ -505,6 +505,7 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, static ssize_t virFDStreamThreadDoWrite(virFDStreamDataPtr fdst, bool sparse, + bool isBlock, const int fdin, const int fdout, const char *fdinname, @@ -512,7 +513,6 @@ virFDStreamThreadDoWrite(virFDStreamDataPtr fdst, { ssize_t got = 0; virFDStreamMsgPtr msg = fdst->msg; - off_t off; bool pop = false; switch (msg->type) { @@ -540,19 +540,48 @@ virFDStreamThreadDoWrite(virFDStreamDataPtr fdst, } got = msg->stream.hole.len; - off = lseek(fdout, got, SEEK_CUR); - if (off == (off_t) -1) { - virReportSystemError(errno, - _("unable to seek in %s"), - fdoutname); - return -1; - } - - if (ftruncate(fdout, off) < 0) { - virReportSystemError(errno, - _("unable to truncate %s"), - fdoutname); - return -1; + if (isBlock) { + g_autofree char * buf = NULL; + const size_t buflen = 1 * 1024 * 1024; /* 1MiB */ + size_t toWrite = got; + + /* While for files it's enough to lseek() and ftruncate() to create + * a hole which would emulate zeroes on read(), for block devices + * we have to write zeroes to read() zeroes. And we have to write + * @got bytes of zeroes. Do that in smaller chunks though.*/ + + buf = g_new0(char, buflen); + + while (toWrite) { + size_t count = MIN(toWrite, buflen); + ssize_t r; + + if ((r = safewrite(fdout, buf, count)) < 0) { + virReportSystemError(errno, + _("Unable to write %s"), + fdoutname); + return -1; + } + + toWrite -= r; + } + } else { + off_t off; + + off = lseek(fdout, got, SEEK_CUR); + if (off == (off_t) -1) { + virReportSystemError(errno, + _("unable to seek in %s"), + fdoutname); + return -1; + } + + if (ftruncate(fdout, off) < 0) { + virReportSystemError(errno, + _("unable to truncate %s"), + fdoutname); + return -1; + } } pop = true; @@ -618,7 +647,7 @@ virFDStreamThread(void *opaque) length, total, &dataLen, buflen); else - got = virFDStreamThreadDoWrite(fdst, sparse, + got = virFDStreamThreadDoWrite(fdst, sparse, isBlock, fdin, fdout, fdinname, fdoutname); -- 2.26.2

On Tue, Jul 07, 2020 at 21:46:33 +0200, Michal Privoznik wrote:
This is similar to one of previous patches.
When receiving stream (on virStorageVolUpload() and subsequent virStreamSparseSendAll()) we may receive a hole. If the volume we are saving the incoming data into is a regular file we just lseek() and ftruncate() to create the hole. But this won't work if the file is a block device. If that is the case we must write zeroes so that any subsequent reader reads nothing just zeroes (just like they would from a hole in a regular file).
Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=1852528
Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 59 +++++++++++++++++++++++++++++++----------- 1 file changed, 44 insertions(+), 15 deletions(-)
Reviewed-by: Peter Krempa <pkrempa@redhat.com>

The aim of virFileInDataDetectZeroes() is to mimic virFileInData() so that it can be used in sparse virStream with very little change to logic. This was implemented in previous commits. These two functions act alike - the passed FD is rewound back to the position it was in when either of the functions was called. For regular files (when virFileInData()) is used, this is not a problem - rewinding an FD is usually cheap and data is read() only from data section and it's read only once. But this is not the case for virFileInDataDetectZeroes(), which doesn't use cheap lseek() to learn data/hole sections, but uses read() and zero buffer detection. Worst case scenario, virFileInDataDetectZeroes() reads 1MiB of bytes, finds it is non-zero data, rewinds FD that 1MiB back only so that virFDStreamThreadDoRead() (in case of daemon) or virshStreamSource() (in case of client) can read the same data again. Possibly, this is not that big problem because kernel is likely to keep the data still in cache. But let's not rely on that and store the buffer for later use if we learned it contains actual data. Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virfdstream.c | 27 +++++++++++++--------- src/util/virfile.c | 51 ++++++++++++++---------------------------- src/util/virfile.h | 3 ++- tools/virsh-util.c | 40 +++++++++++++++++++++++++++++---- tools/virsh-util.h | 3 +++ tools/virsh-volume.c | 3 ++- 6 files changed, 76 insertions(+), 51 deletions(-) diff --git a/src/util/virfdstream.c b/src/util/virfdstream.c index 40825a8811..9d617c71da 100644 --- a/src/util/virfdstream.c +++ b/src/util/virfdstream.c @@ -440,7 +440,7 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, if (sparse && *dataLen == 0) { if (isBlock) { - if (virFileInDataDetectZeroes(fdin, &inData, §ionLen) < 0) + if (virFileInDataDetectZeroes(fdin, &inData, §ionLen, &buf) < 0) return -1; } else { if (virFileInData(fdin, &inData, §ionLen) < 0) @@ -468,7 +468,7 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, /* HACK: The message queue is one directional. So caller * cannot make us skip the hole. Do that for them instead. */ - if (sectionLen && + if (sectionLen && !isBlock && lseek(fdin, sectionLen, SEEK_CUR) == (off_t) -1) { virReportSystemError(errno, _("unable to seek in %s"), @@ -476,17 +476,22 @@ virFDStreamThreadDoRead(virFDStreamDataPtr fdst, return -1; } } else { - if (sparse && - buflen > *dataLen) - buflen = *dataLen; + if (sparse && isBlock) { + /* Data already read by virFileInDataDetectZeroes() */ + got = sectionLen; + } else { + if (sparse && + buflen > *dataLen) + buflen = *dataLen; - buf = g_new0(char, buflen); + buf = g_new0(char, buflen); - if ((got = saferead(fdin, buf, buflen)) < 0) { - virReportSystemError(errno, - _("Unable to read %s"), - fdinname); - return -1; + if ((got = saferead(fdin, buf, buflen)) < 0) { + virReportSystemError(errno, + _("Unable to read %s"), + fdinname); + return -1; + } } msg->type = VIR_FDSTREAM_MSG_TYPE_DATA; diff --git a/src/util/virfile.c b/src/util/virfile.c index a35d9ccb7a..e33a1c280c 100644 --- a/src/util/virfile.c +++ b/src/util/virfile.c @@ -4071,12 +4071,18 @@ virFileInData(int fd G_GNUC_UNUSED, * @fd: file to check * @inData: true if current position in the @fd is in data section * @length: amount of bytes until the end of the current section + * @buf: data read if in data section * - * This behaves exactly like virFileInData() except it doesn't use SEEK_DATA - * and SEEK_HOLE rather than a buffer into which it reads data from @fd and - * detects if the buffer is full of zeroes. Therefore, it is safe to use on - * special files (e.g. block devices). On the other hand it sees only - * DETECT_ZEORES_BLOCK_SIZE ahead. + * For given @fd it reads up to DETECT_ZEORES_BLOCK_SIZE bytes from it. + * If all bytes read are zero then @inData is set to 0, otherwise @inData + * is set to 1 and @buf is set to the read data. In either case, the + * number of bytes read is stored in @length. + * + * If @fd is at EOF, then this function sets @inData = 0 and @length = 0. + * + * Note, in contrast to virFileInData(), this function is safe to use on + * special files (e.g. block devices), does not rewind @fd back to its + * starting position, and sees only DETECT_ZEORES_BLOCK_SIZE bytes ahead. * * Returns: 0 on success, * -1 otherwise. @@ -4084,50 +4090,27 @@ virFileInData(int fd G_GNUC_UNUSED, int virFileInDataDetectZeroes(int fd, int *inData, - long long *length) + long long *length, + char **buf) { const size_t buflen = DETECT_ZEORES_BLOCK_SIZE; g_autofree char *bytes = NULL; - off_t cur; ssize_t r; - int ret = -1; - - /* Get current position */ - cur = lseek(fd, 0, SEEK_CUR); - if (cur == (off_t) -1) { - virReportSystemError(errno, "%s", - _("Unable to get current position in file")); - goto cleanup; - } bytes = g_new0(char, buflen); if ((r = saferead(fd, bytes, buflen)) < 0) { virReportSystemError(errno, "%s", _("Unable to read from file")); - goto cleanup; + return -1; } *inData = !virStringIsNull(bytes, r); *length = r; - ret = 0; + if (*inData) + *buf = g_steal_pointer(&bytes); - cleanup: - /* At any rate, reposition back to where we started. */ - if (cur != (off_t) -1) { - int theerrno = errno; - - if (lseek(fd, cur, SEEK_SET) == (off_t) -1) { - virReportSystemError(errno, "%s", - _("unable to restore position in file")); - ret = -1; - if (theerrno == 0) - theerrno = errno; - } - - errno = theerrno; - } - return ret; + return 0; } diff --git a/src/util/virfile.h b/src/util/virfile.h index 9a5eade609..06ee0bd801 100644 --- a/src/util/virfile.h +++ b/src/util/virfile.h @@ -354,7 +354,8 @@ int virFileInData(int fd, int virFileInDataDetectZeroes(int fd, int *inData, - long long *length); + long long *length, + char **buf); G_DEFINE_AUTOPTR_CLEANUP_FUNC(virFileWrapperFd, virFileWrapperFdFree); diff --git a/tools/virsh-util.c b/tools/virsh-util.c index 867f69577f..dd2f4fb6a3 100644 --- a/tools/virsh-util.c +++ b/tools/virsh-util.c @@ -160,8 +160,25 @@ virshStreamSource(virStreamPtr st G_GNUC_UNUSED, { virshStreamCallbackDataPtr cbData = opaque; int fd = cbData->fd; + int r; - return saferead(fd, bytes, nbytes); + if (cbData->isBlock) { + size_t bufavail = cbData->buflen - cbData->bufoff; + + r = MIN(nbytes, bufavail); + memcpy(bytes, cbData->buf + cbData->bufoff, r); + cbData->bufoff += r; + + if (cbData->bufoff == cbData->buflen) { + VIR_FREE(cbData->buf); + cbData->buflen = 0; + cbData->bufoff = 0; + } + } else { + r = saferead(fd, bytes, nbytes); + } + + return r; } @@ -172,9 +189,11 @@ virshStreamSourceSkip(virStreamPtr st G_GNUC_UNUSED, { virshStreamCallbackDataPtr cbData = opaque; int fd = cbData->fd; - off_t cur; - if ((cur = lseek(fd, offset, SEEK_CUR)) == (off_t) -1) + /* Don't seek if the @fd is a block device. It was left at + * the desired position by virshStreamInData(). */ + if (!cbData->isBlock && + (lseek(fd, offset, SEEK_CUR)) == (off_t) -1) return -1; return 0; @@ -232,10 +251,23 @@ virshStreamInData(virStreamPtr st G_GNUC_UNUSED, int fd = cbData->fd; if (cbData->isBlock) { - if (virFileInDataDetectZeroes(fd, inData, offset) < 0) { + g_autofree char *buf = NULL; + + if (virFileInDataDetectZeroes(fd, inData, offset, &buf) < 0) { vshError(ctl, "%s", _("Unable to get current position in stream")); return -1; } + + if (*inData) { + cbData->buf = g_steal_pointer(&buf); + cbData->buflen = *offset; + cbData->bufoff = 0; + } + + /* Intentionally leaving @fd in a different position than on entering + * because either we've found a hole and virshStreamSourceSkip() would + * want to seek forward, or we've read some data and would need to + * imitate seek in virshStreamSource(). Skip seeking back and forth. */ } else { if (virFileInData(fd, inData, offset) < 0) { vshError(ctl, "%s", _("Unable to get current position in stream")); diff --git a/tools/virsh-util.h b/tools/virsh-util.h index 9ef28cfe0a..b2b5d6bcd2 100644 --- a/tools/virsh-util.h +++ b/tools/virsh-util.h @@ -73,6 +73,9 @@ struct _virshStreamCallbackData { vshControl *ctl; int fd; bool isBlock; + char *buf; + size_t buflen; + long long bufoff; }; int diff --git a/tools/virsh-volume.c b/tools/virsh-volume.c index 20823e2d10..9fd1674a9f 100644 --- a/tools/virsh-volume.c +++ b/tools/virsh-volume.c @@ -678,7 +678,7 @@ cmdVolUpload(vshControl *ctl, const vshCmd *cmd) unsigned long long offset = 0, length = 0; virshControlPtr priv = ctl->privData; unsigned int flags = 0; - virshStreamCallbackData cbData; + virshStreamCallbackData cbData = { 0 }; struct stat sb; if (vshCommandOptULongLong(ctl, cmd, "offset", &offset) < 0) @@ -752,6 +752,7 @@ cmdVolUpload(vshControl *ctl, const vshCmd *cmd) virStorageVolFree(vol); if (st) virStreamFree(st); + VIR_FREE(cbData.buf); VIR_FORCE_CLOSE(fd); return ret; } -- 2.26.2

Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- NEWS.rst | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/NEWS.rst b/NEWS.rst index 232387ebdc..e225cbbcd3 100644 --- a/NEWS.rst +++ b/NEWS.rst @@ -15,6 +15,13 @@ v6.6.0 (unreleased) * **Improvements** + * Allow sparse streams for block devices + + Sparse streams (e.g. ``virsh vol-download --sparse`` or ``virsh vol-upload + --sparse``) now handle if one of the stream ends is a block device. A zero + block detection is performed so that zero block are not transferred and + thus bandwidth is saved. + * **Bug fixes** -- 2.26.2
participants (3)
-
Daniel P. Berrangé
-
Michal Privoznik
-
Peter Krempa