
On 3/21/22 8:55 AM, Andrea Righi wrote:
On Fri, Mar 18, 2022 at 02:34:29PM +0100, Claudio Fontana wrote: ...
I have lots of questions here, so I tried to involve Jiri and Andrea Righi, who a long time ago proposed a POSIX_FADV_NOREUSE implementation.
1) What is the reason iohelper was introduced?
2) Was Jiri's comment about the missing Linux implementation of POSIX_FADV_NOREUSE?
3) If using O_DIRECT is the only reason for iohelper to exist (...?), would replacing it with posix_fadvise remove the need for iohelper?
4) What has stopped Andrea's or another POSIX_FADV_NOREUSE implementation in the kernel?
From what I remember (it was a long time ago, sorry), I stopped pursuing the POSIX_FADV_NOREUSE idea because we thought that moving to a memcg-based solution was a better and more flexible approach, assuming memcg would eventually provide some form of specific page cache control. As of today I think we still don't have any specific page cache control feature in memcg, so maybe we could reconsider the FADV_NOREUSE idea (or something similar)?
Maybe we could even introduce a separate FADV_<something> flag, if we don't want to bind a specific implementation of this feature to a standard POSIX flag (even though FADV_NOREUSE is still implemented as a no-op in the kernel).
The thing that I liked about the fadvise approach is its simplicity from an application perspective, because it's just a syscall and that's it, without having to deal with any other subsystems (cgroups, sysfs, and similar).
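For illustration, here is a minimal standalone sketch of what this looks like from the application side (a made-up test program, not libvirt code; the chunk size, the dummy data volume and the fdatasync + FADV_DONTNEED fallback are my own arbitrary choices, the fallback being there because FADV_NOREUSE is currently a no-op on Linux):

/*
 * Minimal sketch (not libvirt code): stream data to a file while
 * keeping it out of the page cache, without O_DIRECT.
 * POSIX_FADV_NOREUSE is accepted but currently a no-op on Linux,
 * so the working pattern today is to flush and then drop the pages
 * we know we will not read again with POSIX_FADV_DONTNEED.
 */
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK (1024 * 1024)          /* arbitrary 1 MiB write size */

int main(int argc, char **argv)
{
    if (argc < 2) {
        fprintf(stderr, "usage: %s <outfile>\n", argv[0]);
        return 1;
    }

    int fd = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    /* Declare the access pattern up front; harmless where it's a no-op. */
    posix_fadvise(fd, 0, 0, POSIX_FADV_NOREUSE);

    char *buf = calloc(1, CHUNK);
    off_t done = 0;

    for (int i = 0; i < 64; i++) {   /* 64 MiB of dummy data */
        if (write(fd, buf, CHUNK) != CHUNK) {  /* short writes ignored for brevity */
            perror("write");
            return 1;
        }
        done += CHUNK;
        /* Every 16 MiB, flush and drop the cached pages already written. */
        if (done % (16 * CHUNK) == 0) {
            fdatasync(fd);
            posix_fadvise(fd, 0, done, POSIX_FADV_DONTNEED);
        }
    }

    free(buf);
    close(fd);
    return 0;
}

The whole thing stays on the normal buffered-I/O path and only adds advisory calls, with none of the buffer and offset alignment constraints that O_DIRECT brings.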
-Andrea
Thanks Andrea. I guess for this specific use case I am still missing some key understanding of the role of iohelper in libvirt. Jiri Denemark's comment seems to suggest that an implementation of FADV_NOREUSE would remove the need for iohelper entirely, so I assume this would also remove the extra copy of the data, which seems to impose a substantial throughput penalty when migrating to a file. I am hoping Jiri, or anyone with a clear understanding of this matter, will weigh in.

Ciao,
Claudio
In the above tests with libvirt, were you using the --bypass-cache flag or not?
No, I was not. Tests with a ramdisk did not show a notable difference for me,
but tests with /dev/null were not possible, since the command line is not accepted:
# virsh save centos7 /dev/null
Domain 'centos7' saved to /dev/null [OK]

# virsh save centos7 /dev/null --bypass-cache
error: Failed to save domain 'centos7' to /dev/null
error: Failed to create file '/dev/null': Invalid argument
Hopefully use of O_DIRECT doesn't make a difference for /dev/null, since the I/O is being immediately thrown away and so ought to never go into I/O cache.
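For what it's worth, the "Invalid argument" above is what open(2) returns when O_DIRECT is requested on a target that cannot do direct I/O, and /dev/null cannot. A quick standalone check (my own sketch, assuming --bypass-cache maps to an O_DIRECT open):

/*
 * Quick standalone check (assumption: --bypass-cache makes libvirt
 * open the output with O_DIRECT). The null device has no direct-I/O
 * support, so the open itself is expected to fail with EINVAL.
 */
#define _GNU_SOURCE              /* for O_DIRECT */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/null", O_WRONLY | O_DIRECT);

    if (fd < 0) {
        /* Expected to print "Invalid argument" here. */
        printf("open(/dev/null, O_DIRECT): %s\n", strerror(errno));
    } else {
        printf("open(/dev/null, O_DIRECT) unexpectedly succeeded\n");
        close(fd);
    }
    return 0;
}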
In terms of the comparison, we still have libvirt iohelper giving QEMU a pipe, while your test above gives QEMU a UNIX socket.
So I still wonder if the delta is caused by the pipe vs socket difference, as opposed to netcat vs libvirt iohelper code.
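One way to check that in isolation would be to time the same bulk transfer over both transports. A rough sketch (my own, with an arbitrary 64 KiB buffer and 1 GiB volume, making no attempt to reproduce QEMU's or iohelper's actual I/O pattern):

/*
 * Rough sketch: push the same amount of data through a pipe and
 * through a UNIX socketpair and time each. Buffer size and total
 * volume are arbitrary.
 */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define BUF   (1 << 16)              /* 64 KiB per write */
#define TOTAL (1LL << 30)            /* 1 GiB per run */

static double run(int rfd, int wfd)
{
    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);

    pid_t pid = fork();
    if (pid == 0) {                  /* child: drain the read side */
        char rbuf[BUF];
        close(wfd);
        while (read(rfd, rbuf, sizeof(rbuf)) > 0)
            ;
        _exit(0);
    }

    char wbuf[BUF];
    memset(wbuf, 'x', sizeof(wbuf));
    close(rfd);
    /* Blocking descriptors; short writes ignored for brevity. */
    for (long long sent = 0; sent < TOTAL; sent += BUF) {
        if (write(wfd, wbuf, BUF) < 0) {
            perror("write");
            break;
        }
    }
    close(wfd);                      /* EOF for the reader */
    waitpid(pid, NULL, 0);

    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
}

int main(void)
{
    int p[2], s[2];

    if (pipe(p) < 0) {
        perror("pipe");
        return 1;
    }
    printf("pipe:       %.2f s per GiB\n", run(p[0], p[1]));

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, s) < 0) {
        perror("socketpair");
        return 1;
    }
    printf("socketpair: %.2f s per GiB\n", run(s[0], s[1]));
    return 0;
}

The defaults also differ between the two (a pipe's in-kernel buffer is typically 64 KiB, while UNIX socket buffer sizes are governed by SO_SNDBUF/SO_RCVBUF), so buffer sizing alone could account for part of any delta.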
I'll look into this aspect, thanks!