
On 4/28/22 3:01 PM, Daniel P. Berrangé wrote:
On Wed, Apr 27, 2022 at 11:13:20PM +0200, Claudio Fontana wrote:
this is in preparation for a minor refactoring of the copy function itself out of runIO().
Signed-off-by: Claudio Fontana <cfontana@suse.de> --- src/util/iohelper.c | 95 +++++++++++++++++++++++++-------------------- 1 file changed, 53 insertions(+), 42 deletions(-)
diff --git a/src/util/iohelper.c b/src/util/iohelper.c index 2c91bf4f93..c13746a547 100644 --- a/src/util/iohelper.c +++ b/src/util/iohelper.c @@ -45,6 +45,16 @@ # define O_DIRECT 0 #endif
+struct runIOParams { + bool isBlockDev; + bool isDirect; + bool isWrite; + int fdin; + const char *fdinname; + int fdout; + const char *fdoutname; +}; + static int runIO(const char *path, int fd, int oflags) { @@ -53,13 +63,9 @@ runIO(const char *path, int fd, int oflags) size_t buflen = 1024*1024; intptr_t alignMask = 64*1024 - 1; int ret = -1; - int fdin, fdout; - const char *fdinname, *fdoutname; - unsigned long long total = 0; - bool direct = O_DIRECT && ((oflags & O_DIRECT) != 0); - off_t end = 0; + off_t total = 0; struct stat sb; - bool isBlockDev = false; + struct runIOParams p;
#if WITH_POSIX_MEMALIGN if (posix_memalign(&base, alignMask + 1, buflen)) @@ -77,34 +83,23 @@ runIO(const char *path, int fd, int oflags) fd, path); goto cleanup; } - isBlockDev = S_ISBLK(sb.st_mode); + p.isBlockDev = S_ISBLK(sb.st_mode); + p.isDirect = O_DIRECT && (oflags & O_DIRECT);
switch (oflags & O_ACCMODE) { case O_RDONLY: - fdin = fd; - fdinname = path; - fdout = STDOUT_FILENO; - fdoutname = "stdout"; - /* To make the implementation simpler, we give up on any - * attempt to use O_DIRECT in a non-trivial manner. */ - if (!isBlockDev && direct && ((end = lseek(fd, 0, SEEK_CUR)) != 0)) { - virReportSystemError(end < 0 ? errno : EINVAL, "%s", - _("O_DIRECT read needs entire seekable file")); - goto cleanup; - } + p.isWrite = false; + p.fdin = fd; + p.fdinname = path; + p.fdout = STDOUT_FILENO; + p.fdoutname = "stdout"; break; case O_WRONLY: - fdin = STDIN_FILENO; - fdinname = "stdin"; - fdout = fd; - fdoutname = path; - /* To make the implementation simpler, we give up on any - * attempt to use O_DIRECT in a non-trivial manner. */ - if (!isBlockDev && direct && (end = lseek(fd, 0, SEEK_END)) != 0) { - virReportSystemError(end < 0 ? errno : EINVAL, "%s", - _("O_DIRECT write needs empty seekable file")); - goto cleanup; - } + p.isWrite = true; + p.fdin = STDIN_FILENO; + p.fdinname = "stdin"; + p.fdout = fd; + p.fdoutname = path; break;
case O_RDWR: @@ -114,6 +109,22 @@ runIO(const char *path, int fd, int oflags) (oflags & O_ACCMODE)); goto cleanup; } + /* To make the implementation simpler, we give up on any + * attempt to use O_DIRECT in a non-trivial manner. */ + if (!p.isBlockDev && p.isDirect) { + off_t off; + if (p.isWrite) { + if ((off = lseek(fd, 0, SEEK_END)) != 0) { + virReportSystemError(off < 0 ? errno : EINVAL, "%s", + _("O_DIRECT write needs empty seekable file")); + goto cleanup; + } + } else if ((off = lseek(fd, 0, SEEK_CUR)) != 0) { + virReportSystemError(off < 0 ? errno : EINVAL, "%s", + _("O_DIRECT read needs entire seekable file")); + goto cleanup; + }
I'm puzzelled about the current code doing SEEK_END in one case and SEEK_CUR in the other case. I think this method should simply use SEEK_CUR only as a sanity check for being ablt to use O_DIRECT.
I think the answer is in the original commit: commit 12291656b135ed788e41dadbd2d15e613ddea9b5 Author: Eric Blake <eblake@redhat.com> Date: Tue Jul 12 08:35:05 2011 -0600 save: let iohelper work on O_DIRECT fds Required for a coming patch where iohelper will operate on O_DIRECT fds. There, the user-space memory must be aligned to file system boundaries (at least 512, but using page-aligned works better, and some file systems prefer 64k). Made tougher by the fact that VIR_ALLOC won't work on void *, but posix_memalign won't work on char * and isn't available everywhere. This patch makes some simplifying assumptions - namely, output to an O_DIRECT fd will only be attempted on an empty seekable file (hence, no need to worry about preserving existing data on a partial block, and ftruncate will work to undo the effects of having to round up the size of the last block written), and input from an O_DIRECT fd will only be attempted on a complete seekable file with the only possible short read at EOF. ---- Here the relevant part being "empty seekable file". The check not only ensures the file is seekable, but also double checks that it is empty. I can add a comment to make it explicit.
If we actually need to position the file offset, then the caller should do a SEEK_END itself.
+ }
while (1) { ssize_t got; @@ -124,16 +135,16 @@ runIO(const char *path, int fd, int oflags) * writes will be aligned. * In other cases using saferead reduces number of syscalls. */ - if (fdin == fd && direct) { - if ((got = read(fdin, buf, buflen)) < 0 && + if (!p.isWrite && p.isDirect) { + if ((got = read(p.fdin, buf, buflen)) < 0 && errno == EINTR) continue; } else { - got = saferead(fdin, buf, buflen); + got = saferead(p.fdin, buf, buflen); }
The distinction for read vs saferead is for correctness with O_DIRECT.
We only want to use read on the plain file / blockdev, and saferead on the socket/pipe.
Yes that's clear.
We already call fstat to let us figure out the fd type, so can use that decide whether to use read vs saferead.
This ties into my comment on the later patch about simplifying the parameters passed in, to remove the distinction of read vs write.
As mentioned elsewhere I don't think it can be simplified quite that way, but I think there is an alternative that could work, will propose in the next spin. Thanks Claudio