On Fri, Feb 10, 2017 at 09:58:19AM +0800, Eli Qiao wrote:
--
Eli Qiao
Sent with Sparrow (
http://www.sparrowmailapp.com/?sig)
On Thursday, 9 February 2017 at 10:25 PM, Marcelo Tosatti wrote:
> On Thu, Feb 09, 2017 at 03:43:01PM +0800, Eli Qiao wrote:
> > Addressed comment from v2 -> v3:
> >
> > Daniel:
> > * Fixed coding style, passed `make check` and `make syntax-check`
> >
> > * Variables renaming and move from header file to c file.
> >
> > * For locking/mutex support, no progress.
> >
> > There are some discussion from mailing list, but I can not find a better
> > way to add locking support without performance impact.
> >
> > I'll explain the process and please help to advice what shoud we do.
> >
> > VM create:
> > 1) Get the cache left value on each bank of the host. This should be
> > shared amount all VMs.
> > 2) Calculate the schemata on the bank based on all created resctrl
> > domain's schemata
> > 3) Calculate the default schemata by scaning all domain's schemata.
> > 4) Flush default schemata to /sys/fs/resctrl/schemata
> >
>
>
> Yes, this 4 steps have to be serialized and cannot occur in parallel.
>
> >
> > VM destroy:
> > 1) Remove the resctrl domain of that VM
> > 2) Recalculate default schemata
> > 3) Flush default schemata to /sys/fs/resctrl/schemata
> >
> > The key point is that all VMs shares /sys/fs/resctrl/schemata, and
> > when a VM create a resctrl domain, the schemata of that VM depends on
> > the default schemata and all other exsited schematas. So a global
> > mutex is reqired.
> >
> > Before calculate a schemata or update default schemata, libvirt
> > should gain this global mutex.
> >
> > I will try to think more about how to support this gracefully in next
> > patch set.
> >
>
>
> There are two things you need to protect:
>
> 1) Access to the filesystem (and schematas). For this you should use
> the fcntl lock as mentioned previously:
>
https://lkml.org/lkml/2016/12/14/377
>
> Other applications that use resctrlfs should use the same lock.
>
> All you have to do is to grab and release the fcntl lock as follows:
>
> +#include <sys/file.h>
> +#include <stdlib.h>
> +
> +void resctrl_take_shared_lock(int fd)
> +{
> + int ret;
> +
> + /* take shared lock on resctrl filesystem */
> + ret = flock(fd, LOCK_SH);
> + if (ret) {
> + perror("flock");
> + exit(-1);
> + }
> +}
> +
> +void resctrl_take_exclusive_lock(int fd)
> +{
> + int ret;
> +
> + /* release lock on resctrl filesystem */
> + ret = flock(fd, LOCK_EX);
> + if (ret) {
> + perror("flock");
> + exit(-1);
> + }
> +}
> +
> +void resctrl_release_lock(int fd)
> +{
> + int ret;
> +
> + /* take shared lock on resctrl filesystem */
> + ret = flock(fd, LOCK_UN);
> + if (ret) {
> + perror("flock");
> + exit(-1);
> + }
> +}
> +
> +void main(void)
> +{
> + int fd, ret;
> +
> + fd = open("/sys/fs/resctrl", O_DIRECTORY);
> + if (fd == -1) {
> + perror("open");
> + exit(-1);
> + }
> + resctrl_take_shared_lock(fd);
> + /* code to read directory contents */
> + resctrl_release_lock(fd);
> +
> + resctrl_take_exclusive_lock(fd);
> + /* code to read and write directory contents */
> + resctrl_release_lock(fd);
> +}
>
>
Thank Marcelo, libvirt have a util named virFileLock
https://github.com/libvirt/libvirt/blob/master/src/util/virfile.c#L388
would that be same with your flock?
No, virFileLock uses:
Advisory record locking
Linux implements traditional ("process-associated") UNIX record
locks, as standardized by POSIX. For a Linux-specific
alternative
with better semantics, see the discussion of open file
description
locks below.
F_SETLK, F_SETLKW, and F_GETLK are used to acquire, release, and
test
for the existence of record locks (also known as byte-range,
file-
segment, or file-region locks). The third argument, lock, is a
pointer to a structure that has at least the following fields (in
unspecified order).
struct flock {
...
short l_type; /* Type of lock: F_RDLCK,
F_WRLCK, F_UNLCK */
short l_whence; /* How to interpret l_start:
SEEK_SET, SEEK_CUR, SEEK_END */
off_t l_start; /* Starting offset for lock */
off_t l_len; /* Number of bytes to lock */
pid_t l_pid; /* PID of process blocking our lock
(set by F_GETLK and F_OFD_GETLK) */
...
};
The l_whence, l_start, and l_len fields of this structure specify
the
range of bytes we wish to lock. Bytes past the end of the file
may
be locked, but not bytes before the start of the file.
Which is different than LOCK_SH/LOCK_EX. Please use LOCK_SH/LOCK_EX.
>
> 2) The internal data structures in libvirt. I suppose you can
> simply add a mutex to protect all of the "struct _virResCtrl"
> since there should be no contention on that lock.
>
>
Sure, I will try to refine my code.
> > Marcelo:
> > * Added vcpu support for cachetune, this will allow user to define which
> > vcpu using which cache allocation bank.
> >
> > <cachetune id='0' host_id=0 size='3072' unit='KiB'
vcpus='0,1'/>
> >
> > vcpus is a cpumap, the vcpu pids will be added to tasks file
> >
> > * Added cdp compatible, user can specify l3 cache even host enable cdp.
> > See patch 8.
> > On a cdp enabled host, specify l3code/l3data by
> >
> > <cachetune id='0' host_id='0' type='l3'
size='3072' unit='KiB'/>
> > rcelo@amt patchesv3]$ cd ..
> >
>
>
> > This will create a schemata like:
> > L3data:0=0xff00;...
> > L3code:0=0xff00;...
> >
> > * Would you please help to test if the functions work.
>
> Will do.
>
> > Martin:
> > * Xml test case, I have no time to work on this yet, would you please
> > show me an example, would like to amend it later.
> >
> > This series patches are for supportting CAT featues, which also
> > called cache tune in libvirt.
> >
> > First to expose cache information which could be tuned in capabilites XML.
> > Then add new domain xml element support to add cacahe bank which will apply
> > on this libvirt domain.
> >
> > This series patches add a util file `resctrl.c/h`, an interface to talk with
> > linux kernel's sys fs.
> >
> > There are still some TODOs such as expose new public interface to get free
> > cache information.
> >
>
>
> Can you please list the TODOs?
Sure, there’s one public API missing for query cache free on each bank.
And that is all?
(testing now, will let you know the results soon).