On 3/21/22 8:55 AM, Andrea Righi wrote:
On Fri, Mar 18, 2022 at 02:34:29PM +0100, Claudio Fontana wrote:
...
> I have lots of questions here, and I tried to involve Jiri and Andrea Righi, who a long time ago proposed a POSIX_FADV_NOREUSE implementation.
>
> 1) What is the reason iohelper was introduced?
>
> 2) Was Jiri's comment about the missing Linux implementation of POSIX_FADV_NOREUSE?
>
> 3) If using O_DIRECT is the only reason for iohelper to exist (...?), would replacing it with posix_fadvise remove the need for iohelper?
>
> 4) What has stopped Andrea's or another POSIX_FADV_NOREUSE implementation in the kernel?
From what I remember (it was a long time ago, sorry) I stopped pursuing
the POSIX_FADV_NOREUSE idea, because we thought that moving to a
memcg-based solution was a better and more flexible approach, assuming
memcg would provide some form of specific page cache control. As of
today I think we still don't have any specific page cache control
feature in memcg, so maybe we could reconsider the FADV_NOREUSE idea (or
something similar)?
Maybe we could even introduce a separate FADV_<something> flag if we don't
want to bind a specific implementation of this feature to a standard POSIX
flag (even though FADV_NOREUSE is still implemented as a no-op in the
kernel).
What I liked about the fadvise approach is its simplicity from an
application perspective: it's just a syscall and that's it, without
having to deal with any other subsystems (cgroups, sysfs, and the like).
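As a minimal, purely illustrative sketch of that application-side simplicity (the file name, write size, and error handling here are assumptions, not code from libvirt or QEMU):

/* Illustrative sketch: write a chunk of "migration" data and advise the
 * kernel that it will not be reused.  Note that on current Linux kernels
 * POSIX_FADV_NOREUSE is accepted but acts as a no-op. */
#define _POSIX_C_SOURCE 200112L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    static const char buf[1 << 20];    /* stand-in for migration data */
    int fd = open("savefile.img", O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0) {
        perror("open");
        return 1;
    }

    ssize_t n = write(fd, buf, sizeof(buf));
    if (n < 0) {
        perror("write");
        return 1;
    }

    /* Tell the kernel this byte range is a poor caching candidate. */
    int err = posix_fadvise(fd, 0, n, POSIX_FADV_NOREUSE);
    if (err != 0)
        fprintf(stderr, "posix_fadvise: %s\n", strerror(err));

    close(fd);
    return 0;
}

The advice applies per fd and per byte range, so a writer could issue it right after each chunk it knows it will not read back.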
-Andrea
Thanks Andrea,
I guess for this specific use case I am still missing some key understanding of the role of iohelper in libvirt.
Jiri Denemark's comment seems to suggest that having an implementation of FADV_NOREUSE would remove the need for iohelper entirely,
so I assume this would also remove the extra copy of the data, which seems to impose a substantial throughput penalty when migrating to a file.
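To make that extra copy concrete, here is a rough, purely illustrative sketch of what a pipe-bridging helper fundamentally does (this is not libvirt's actual iohelper code; the block size and error handling are assumptions, and the O_DIRECT alignment handling a real helper would need when bypassing the page cache is omitted):

/* Purely illustrative, NOT libvirt's iohelper: a helper sitting between
 * the producer's pipe (stdin) and the output file has to stage every
 * block in its own buffer, i.e. one extra userspace copy per block.
 * When bypassing the page cache, a real helper would also open the file
 * with O_DIRECT, which adds buffer/length alignment requirements that
 * are omitted here. */
#define _POSIX_C_SOURCE 200112L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

#define BLOCK (1024 * 1024)    /* assumed block size */

int main(int argc, char **argv)
{
    static char buf[BLOCK];
    ssize_t n;
    int out;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <output-file>\n", argv[0]);
        return 1;
    }

    out = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0) {
        perror("open");
        return 1;
    }

    while ((n = read(STDIN_FILENO, buf, sizeof(buf))) > 0) {  /* copy 1: pipe -> buffer */
        if (write(out, buf, n) != n) {                        /* copy 2: buffer -> file */
            perror("write");
            return 1;
        }
    }
    if (n < 0) {
        perror("read");
        return 1;
    }

    close(out);
    return 0;
}

Compared with QEMU writing to the file directly, every block takes an extra trip through the helper's buffer, plus the pipe transfer itself.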
I guess I am hoping for Jiri to weigh in on this, or anyone with a clear understanding of
this matter.
Ciao,
Claudio
>
> Lots of questions..
>
> Thanks for all your insight,
>
> Claudio
>
>>
>> Dave
>>
>>> Ciao,
>>>
>>> C
>>>
>>>>>
>>>>> In the above tests with libvirt, were you using the
>>>>> --bypass-cache flag or not?
>>>>
>>>> No, I do not. Tests with ramdisk did not show a notable difference for me,
>>>>
>>>> but tests with /dev/null were not possible, since the command line is not accepted:
>>>>
>>>> # virsh save centos7 /dev/null
>>>> Domain 'centos7' saved to /dev/null
>>>> [OK]
>>>>
>>>> # virsh save centos7 /dev/null --bypass-cache
>>>> error: Failed to save domain 'centos7' to /dev/null
>>>> error: Failed to create file '/dev/null': Invalid argument
>>>>
>>>>
>>>>>
>>>>> Hopefully use of O_DIRECT doesn't make a difference for
>>>>> /dev/null, since the I/O is being immediately thrown
>>>>> away and so ought to never go into I/O cache.
>>>>>
>>>>> In terms of the comparison, we still have libvirt iohelper
>>>>> giving QEMU a pipe, while your test above gives QEMU a
>>>>> UNIX socket.
>>>>>
>>>>> So I still wonder if the delta is caused by the pipe vs socket
>>>>> difference, as opposed to netcat vs libvirt iohelper code.
>>>>
>>>> I'll look into this aspect, thanks!
>>>
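For looking into the pipe vs UNIX socket question in isolation, a rough standalone sketch like the following could help quantify the raw transport difference (illustrative only; the buffer size and total transfer size are arbitrary assumptions):

/* Illustrative only: compare raw pipe vs UNIX socketpair throughput to
 * see whether the transport alone could explain the delta. */
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/socket.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define BUFSZ  (1024 * 1024)                      /* 1 MiB per write */
#define TOTAL  (4ULL * 1024 * 1024 * 1024)        /* 4 GiB per run */

static void run(const char *name, int rfd, int wfd)
{
    static char buf[BUFSZ];
    struct timespec t0, t1;
    unsigned long long sent;
    pid_t pid = fork();

    if (pid < 0) {
        perror("fork");
        return;
    }
    if (pid == 0) {                               /* child: drain the read end */
        close(wfd);
        while (read(rfd, buf, sizeof(buf)) > 0)
            ;
        _exit(0);
    }

    close(rfd);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (sent = 0; sent < TOTAL; ) {
        ssize_t w = write(wfd, buf, sizeof(buf));
        if (w < 0) {
            perror("write");
            break;
        }
        sent += (unsigned long long)w;
    }
    close(wfd);                                   /* EOF for the child */
    wait(NULL);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("%-12s %.2f GiB/s\n", name, sent / secs / (1024.0 * 1024 * 1024));
}

int main(void)
{
    int p[2], s[2];

    if (pipe(p) < 0 || socketpair(AF_UNIX, SOCK_STREAM, 0, s) < 0) {
        perror("pipe/socketpair");
        return 1;
    }
    run("pipe:", p[0], p[1]);
    run("unix socket:", s[0], s[1]);
    return 0;
}

If the two transports show similar numbers, the delta more likely comes from the iohelper copy itself rather than from pipe vs socket.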