On 20.09.2017 17:48, Daniel P. Berrange wrote:
On Wed, Sep 20, 2017 at 02:58:55PM +0300, Nikolay Shirokovskiy
wrote:
> saferead is not suitable for direct reads. If file size is not multiple
> of align size then we get EINVAL on the read(2) that is supposed to
> return 0 because read buffer will not be aligned at this point.
>
> Let's not read again after partial read and check that we read
> everything by comparing the number of total bytes read against file size.
> ---
>
> Diff from v1:
> - a couple of cosmetic changes
>
> src/util/iohelper.c | 42 +++++++++++++++++++++++++++++++++++++-----
> 1 file changed, 37 insertions(+), 5 deletions(-)
>
> diff --git a/src/util/iohelper.c b/src/util/iohelper.c
> index fe15a92..9aa6ae0 100644
> --- a/src/util/iohelper.c
> +++ b/src/util/iohelper.c
> @@ -56,6 +56,7 @@ runIO(const char *path, int fd, int oflags)
> unsigned long long total = 0;
> bool direct = O_DIRECT && ((oflags & O_DIRECT) != 0);
> off_t end = 0;
> + off_t cur;
>
> #if HAVE_POSIX_MEMALIGN
> if (posix_memalign(&base, alignMask + 1, buflen)) {
> @@ -78,10 +79,22 @@ runIO(const char *path, int fd, int oflags)
> fdoutname = "stdout";
> /* To make the implementation simpler, we give up on any
> * attempt to use O_DIRECT in a non-trivial manner. */
> - if (direct && ((end = lseek(fd, 0, SEEK_CUR)) != 0)) {
> - virReportSystemError(end < 0 ? errno : EINVAL, "%s",
> - _("O_DIRECT read needs entire seekable
file"));
> - goto cleanup;
> + if (direct) {
> + if ((cur = lseek(fd, 0, SEEK_CUR)) != 0) {
> + virReportSystemError(cur < 0 ? errno : EINVAL, "%s",
> + _("O_DIRECT read needs entire seekable
file"));
> + goto cleanup;
> + }
> +
> + if ((end = lseek(fd, 0, SEEK_END)) < 0) {
> + virReportSystemError(errno, "%s", _("can not seek
file end"));
> + goto cleanup;
> + }
> +
> + if (lseek(fd, cur, SEEK_SET) < 0) {
> + virReportSystemError(errno, "%s", _("can not seek
file begin"));
> + goto cleanup;
> + }
> }
> break;
> case O_WRONLY:
> @@ -109,7 +122,26 @@ runIO(const char *path, int fd, int oflags)
> while (1) {
> ssize_t got;
>
> - if ((got = saferead(fdin, buf, buflen)) < 0) {
> + /* in case of O_DIRECT we cannot read again to check for EOF
> + * after partial buffer read as it is done in saferead */
> + if (direct && fdin == fd && end - total < buflen) {
> + if (total == end)
> + break;
> +
> + while ((got = read(fd, buf, buflen)) < 0 && errno == EINTR)
> + ;
> +
> + if (got < end - total) {
> + virReportError(VIR_ERR_INTERNAL_ERROR,
> + _("Unable to read last chunk from %s"),
> + fdinname);
> + goto cleanup;
> + }
> + } else {
> + got = saferead(fdin, buf, buflen);
> + }
> +
> + if (got < 0) {
> virReportSystemError(errno, _("Unable to read %s"),
fdinname);
> goto cleanup;
> }
So the ultimate problem here is that when saferead() hits EOF, it tries
to read some more, expecting to see ret == 0, but the buffer that
saferead is passing into read() is not suitably aligned, so we get an
error report despite not having any data to read.
What's fun is that the runIO method already handles the fact that
saferead() may return less data than was asked for, so using
the saferead() function is pretty much pointless. There is thus
no need to have separate codepaths for O_DIRECT vs normal, if
we change iohelper to just use read() unconditionally and let
it handle all return values with its existing code.
It depends on whether reads on files/block devices never return partial
reads or not. At least there is not clear statement in read(2) AFAIR.
IOW can simplify your patch to just this I believe:>
diff --git a/src/util/iohelper.c b/src/util/iohelper.c
index fe15a92e6..5416d4506 100644
--- a/src/util/iohelper.c
+++ b/src/util/iohelper.c
@@ -109,7 +109,9 @@ runIO(const char *path, int fd, int oflags)
while (1) {
ssize_t got;
- if ((got = saferead(fdin, buf, buflen)) < 0) {
+ if ((got = read(fdin, buf, buflen)) < 0) {
+ if (errno == EINTR)
+ continue;
virReportSystemError(errno, _("Unable to read %s"), fdinname);
goto cleanup;
}
But we still read half the buffer at the file end and then try misaligned read
after that.
We could consider short-read as a sign of EOF but i'm not sure this is true so I
decided to add check using file size measured thru lseek(2).
Nikolay