
On Thu, Jan 12, 2017 at 08:47:58AM -0200, Marcelo Tosatti wrote:
On Thu, Jan 12, 2017 at 09:44:36AM +0800, 乔立勇(Eli Qiao) wrote:
Hi, it's really good to have you involved in supporting CAT in libvirt/OpenStack. Replies inline.
2017-01-11 20:19 GMT+08:00 Marcelo Tosatti <mtosatti@redhat.com>:
Hi,
Comments/questions related to: https://www.redhat.com/archives/libvir-list/2017-January/msg00354.html
1) root s2600wt:~/linux# virsh cachetune kvm02 --l3.count 2
How does allocation of code/data look like?
My plan is to expose new options:
virsh cachetune kvm02 --l3data.count 2 --l3code.count 2
Please note: you can use either l3 alone, or l3data/l3code (if CDP is enabled when mounting the resctrl fs).
Fine. However, you should be able to emulate a type=both (non-CDP) reservation by writing a schemata file with the same CBM bits:
L3code:0=0x000ff;1=0x000ff L3data:0=0x000ff;1=0x000ff
(*)
I don't see how this interface enables that possibility.
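A minimal sketch of that emulation, assuming the standard resctrl schemata format; a scratch directory stands in for /sys/fs/resctrl (which would normally be mounted with "-o cdp"), and the group name "kvm02" is illustrative:

```shell
# Emulating a "both" (non-CDP) reservation on a CDP-enabled host by writing
# identical CBMs for L3code and L3data. A scratch directory stands in for the
# real resctrl mount so the example is runnable anywhere.
RESCTRL=$(mktemp -d)
GROUP="$RESCTRL/kvm02"           # one resctrl group per VM (name illustrative)
mkdir -p "$GROUP"
# Same 8-way mask (0x000ff) on both cache ids, for code and data alike:
cat > "$GROUP/schemata" <<'EOF'
L3code:0=0x000ff;1=0x000ff
L3data:0=0x000ff;1=0x000ff
EOF
grep -c '=0x000ff' "$GROUP/schemata"   # 2: code and data lines share the mask
```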
I suppose it would be easier for mgmt software to have it done automatically:
virsh cachetune kvm02 --l3 size_in_kbytes.
This would create the reservations as in (*) in the resctrl fs, in case the host is CDP enabled.
(also please use kbytes, or give a reason to not use kbytes).
Note: exposing the unit size is fine as mgmt software might decide a placement of VMs which reduces the amount of L3 cache reservation rounding (although i doubt anyone is going to care about that in practice).
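The rounding in question can be sketched as follows; the numbers are assumptions for one example host (a 56320 KiB L3 with a 20-bit CBM), where the real values would come from the cache size in sysfs and /sys/fs/resctrl/info/L3/cbm_mask:

```shell
# Rounding a kbytes request up to whole cache "units" (ways) — the reason
# exposing the unit size to mgmt software can matter. Example values only.
l3_kb=56320
cbm_bits=20
unit_kb=$((l3_kb / cbm_bits))                 # 2816 KiB per way
req_kb=4096                                   # user asks for 4 MiB
ways=$(( (req_kb + unit_kb - 1) / unit_kb ))  # round up to whole ways
granted_kb=$((ways * unit_kb))                # KiB actually reserved
echo "$ways ways, $granted_kb KiB"
```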
2) 'nodecachestats' command:
3. Add new virsh command 'nodecachestats': This API exposes how much cache resource is left on each piece of hardware (CPU socket). It will be formatted as: <resource_type>.<resource_id>: left size KiB
Does this take into account that only contiguous regions of cbm masks can be used for allocations?
Yes, it accounts for contiguous CBM regions; in other words, it is the cache value represented by the default CBM.
resctrl doesn't allow setting a non-contiguous CBM (a hardware restriction).
OK.
Also, it should return the amount of free cache on each cacheid.
yes, it is. resource_id == cacheid
OK.
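A sketch of such a nodecachestats-style computation for one cache id: free cache is the set of ways no reservation group claims, times the way size. The masks and the 2816 KiB way size are illustrative; a real implementation would OR together every group's mask read from /sys/fs/resctrl/*/schemata:

```shell
# Free L3 on one cache id = (ways unclaimed by any group) * way size.
full=$((0xfffff))                # 20-way CBM from info/L3/cbm_mask (example)
used=$(( 0x000ff | 0x0ff00 ))    # union of all groups' masks on this cache id
free=$(( full & ~used ))         # 0xf0000: 4 contiguous free ways
n=0; m=$free                     # popcount of the free mask
while [ "$m" -ne 0 ]; do n=$((n + (m & 1))); m=$((m >> 1)); done
unit_kb=2816                     # KiB per way (example host)
# Note: a single new allocation still needs its bits to be contiguous.
echo "free: $((n * unit_kb)) KiB"
```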
3) The interface should support different sizes for different cache-ids. See the KVM-RT use case at https://www.redhat.com/archives/libvir-list/2017-January/msg00415.html "WHAT THE USER NEEDS TO SPECIFY FOR VIRTUALIZATION (KVM-RT)".
I don't think it's good to let the user specify cache ids when doing cache allocation.
This is necessary for our usecase.
The cache ids used should depend on the CPU affinity the VM has set.
The cache ids configuration should match the cpu affinity configuration.
eg.
1. For hosts with only one cache id (single-socket hosts), we don't need to set a cache id.
Right.
2. With multiple cache ids (sockets), the user should set a vcpu -> pcpu mapping (define a cpuset for the VM); then we (libvirt) need to compute how much cache to set on which cache id. That is, the user should set CPU affinity before cache allocation.
Most CAT use cases that I know of are for NFV. As far as I know, NFV uses NUMA and CPU pinning (vcpu -> pcpu mapping), so we don't need to worry about which cache id the cache size is set on.
So just let the user specify the cache size (my proposal is a cache unit count) and let libvirt detect on which cache id to set how much cache.
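The lookup libvirt would need before programming a CBM is which L3 cache id a pinned pcpu belongs to. On a real host this is /sys/devices/system/cpu/cpuN/cache/index3/id; a mock sysfs tree is built here so the sketch runs anywhere, and the values (pcpu 3 on cache id 1) are assumptions:

```shell
# Mapping a pinned pcpu to its L3 cache id via the cache topology sysfs
# layout. SYSFS stands in for /sys/devices/system/cpu; values are mocked.
SYSFS=$(mktemp -d)
mkdir -p "$SYSFS/cpu3/cache/index3"
echo 1 > "$SYSFS/cpu3/cache/index3/id"     # pretend pcpu3 sits on cache id 1
cpu=3                                      # the pcpu the vcpu is pinned to
cache_id=$(cat "$SYSFS/cpu$cpu/cache/index3/id")
echo "pcpu$cpu -> L3 cache id $cache_id"
```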
OK, fine; it's OK to not expose this to the user and calculate it internally in libvirt, as long as you recompute the schematas whenever CPU affinity changes. But using different cache ids in the schemata is necessary for our use case.
Hmm, thinking about this again: it needs to be per-vcpu. So for the NFV use case you want: vcpu0: no reservation (belongs to the default group); vcpu1: a reservation of a particular size. Then, if a vcpu is pinned, "trim" the reservation down to the particular cache id it is pinned to. This is important because it allows vcpu0's workload to not interfere with the realtime workload running on vcpu1.
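The per-vcpu layout described above can be sketched as follows. Group name, masks, and the pinning are illustrative, and a scratch directory again stands in for /sys/fs/resctrl: vcpu0's thread stays in the default group, while vcpu1 gets its own group whose L3 mask is "trimmed" to the cache id of the pcpu it is pinned to:

```shell
# Per-vcpu resctrl groups for the KVM-RT case: vcpu0 -> default group,
# vcpu1 -> dedicated group trimmed to its pinned cache id.
RESCTRL=$(mktemp -d)
RT="$RESCTRL/kvm02-vcpu1"
mkdir -p "$RT"
# vcpu1 pinned to a pcpu on cache id 1: 4 dedicated ways there, while cache
# id 0 (where vcpu1 never runs) keeps the full shared mask:
cat > "$RT/schemata" <<'EOF'
L3:0=0xfffff;1=0x0000f
EOF
# echo <tid-of-vcpu1> > "$RT/tasks"   # vcpu0's thread stays in the default group
grep 'L3:' "$RT/schemata"
```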