Re: [libvirt] [V3] RFC for support cache tune in libvirt

Wednesday, 11 January 2017

On Wed, Jan 11, 2017 at 10:05:26AM +0000, Daniel P. Berrange wrote:
...
 On Tue, Jan 10, 2017 at 07:42:59AM +0000, Qiao, Liyong wrote:
 > Add support for cache allocation.
 > 
 > Thanks Martin for the comments on the previous version. This is the v3 version of the RFC.
 > I have some PoC code [2]. The following changes are partly implemented by the PoC.
 > 
 > # Proposed Changes
 > 
 > ## virsh command line
 > 
 > 1. Extend the output of nodeinfo to expose the L3 (last-level) cache size.
 > 
 > This will expose how much cache on a host can be used.
 > 
 > root@s2600wt:~/linux# virsh nodeinfo | grep L3
 > L3 cache size:       56320 KiB

 Ok, as previously discussed, we should include this in the capabilities
 XML instead and have info about all the caches. We likely also want to
 relate which CPUs are associated with which cache in some way.

 eg if we have this topology

     <topology>
       <cells num='2'>
         <cell id='0'>
           <cpus num='6'>
             <cpu id='0' socket_id='0' core_id='0' siblings='0'/>
             <cpu id='1' socket_id='0' core_id='2' siblings='1'/>
             <cpu id='2' socket_id='0' core_id='4' siblings='2'/>
             <cpu id='6' socket_id='0' core_id='1' siblings='6'/>
             <cpu id='7' socket_id='0' core_id='3' siblings='7'/>
             <cpu id='8' socket_id='0' core_id='5' siblings='8'/>
           </cpus>
         </cell>
         <cell id='1'>
           <cpus num='6'>
             <cpu id='3' socket_id='1' core_id='0' siblings='3'/>
             <cpu id='4' socket_id='1' core_id='2' siblings='4'/>
             <cpu id='5' socket_id='1' core_id='4' siblings='5'/>
             <cpu id='9' socket_id='1' core_id='1' siblings='9'/>
             <cpu id='10' socket_id='1' core_id='3' siblings='10'/>
             <cpu id='11' socket_id='1' core_id='5' siblings='11'/>
           </cpus>
         </cell>
       </cells>
     </topology>

 We might have something like this cache info

     <cache>
       <bank type="l3" size="56320" units="KiB" cpus="0,1,2,6,7,8"/>
       <bank type="l3" size="56320" units="KiB" cpus="3,4,5,9,10,11"/>
       <bank type="l2" size="256" units="KiB" cpus="0"/>
       <bank type="l2" size="256" units="KiB" cpus="1"/>
       <bank type="l2" size="256" units="KiB" cpus="2"/>
       <bank type="l2" size="256" units="KiB" cpus="3"/>
       <bank type="l2" size="256" units="KiB" cpus="4"/>
       <bank type="l2" size="256" units="KiB" cpus="5"/>
       <bank type="l2" size="256" units="KiB" cpus="6"/>
       <bank type="l2" size="256" units="KiB" cpus="7"/>
       <bank type="l2" size="256" units="KiB" cpus="8"/>
       <bank type="l2" size="256" units="KiB" cpus="9"/>
       <bank type="l2" size="256" units="KiB" cpus="10"/>
       <bank type="l2" size="256" units="KiB" cpus="11"/>
       <bank type="l1i" size="256" units="KiB" cpus="0"/>
       <bank type="l1i" size="256" units="KiB" cpus="1"/>
       <bank type="l1i" size="256" units="KiB" cpus="2"/>
       <bank type="l1i" size="256" units="KiB" cpus="3"/>
       <bank type="l1i" size="256" units="KiB" cpus="4"/>
       <bank type="l1i" size="256" units="KiB" cpus="5"/>
       <bank type="l1i" size="256" units="KiB" cpus="6"/>
       <bank type="l1i" size="256" units="KiB" cpus="7"/>
       <bank type="l1i" size="256" units="KiB" cpus="8"/>
       <bank type="l1i" size="256" units="KiB" cpus="9"/>
       <bank type="l1i" size="256" units="KiB" cpus="10"/>
       <bank type="l1i" size="256" units="KiB" cpus="11"/>
       <bank type="l1d" size="256" units="KiB" cpus="0"/>
       <bank type="l1d" size="256" units="KiB" cpus="1"/>
       <bank type="l1d" size="256" units="KiB" cpus="2"/>
       <bank type="l1d" size="256" units="KiB" cpus="3"/>
       <bank type="l1d" size="256" units="KiB" cpus="4"/>
       <bank type="l1d" size="256" units="KiB" cpus="5"/>
       <bank type="l1d" size="256" units="KiB" cpus="6"/>
       <bank type="l1d" size="256" units="KiB" cpus="7"/>
       <bank type="l1d" size="256" units="KiB" cpus="8"/>
       <bank type="l1d" size="256" units="KiB" cpus="9"/>
       <bank type="l1d" size="256" units="KiB" cpus="10"/>
       <bank type="l1d" size="256" units="KiB" cpus="11"/>
     </cache>

 which shows each socket has its own dedicated L3 cache, and each
 core has its own L2 & L1 cache.

We also need to include the host cache ID value in the XML so that we
can reliably distinguish between, and associate with, different cache
banks when placing guests, if there are multiple caches of the
same type associated with the same CPU.

     <cache>
       <bank id="0" type="l3" size="56320" units="KiB" cpus="0,1,2,6,7,8"/>
       <bank id="1" type="l3" size="56320" units="KiB" cpus="0,1,2,6,7,8"/>
       <bank id="2" type="l3" size="56320" units="KiB" cpus="3,4,5,9,10,11"/>
       <bank id="3" type="l3" size="56320" units="KiB" cpus="3,4,5,9,10,11"/>
       <bank id="4" type="l2" size="256" units="KiB" cpus="0"/>
       ....
     </cache>
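
As a rough illustration of how a management application might consume this, here is a minimal Python sketch that maps each host CPU to the cache banks serving it. The XML is a trimmed copy of the format proposed above (a proposal, not a released libvirt schema), the CPU lists follow the topology shown earlier, and the helper name banks_by_cpu is made up for this example.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Trimmed sample in the *proposed* format; not a released libvirt schema.
CACHE_XML = """
<cache>
  <bank id="0" type="l3" size="56320" units="KiB" cpus="0,1,2,6,7,8"/>
  <bank id="1" type="l3" size="56320" units="KiB" cpus="0,1,2,6,7,8"/>
  <bank id="2" type="l3" size="56320" units="KiB" cpus="3,4,5,9,10,11"/>
  <bank id="3" type="l3" size="56320" units="KiB" cpus="3,4,5,9,10,11"/>
  <bank id="4" type="l2" size="256" units="KiB" cpus="0"/>
</cache>
"""


def banks_by_cpu(xml_text):
    """Map each CPU id to the list of (cache type, bank id) pairs serving it.

    Hypothetical helper, for illustration only.
    """
    result = defaultdict(list)
    for bank in ET.fromstring(xml_text).iter("bank"):
        key = (bank.get("type"), int(bank.get("id")))
        for cpu in bank.get("cpus").split(","):
            result[int(cpu)].append(key)
    return dict(result)


mapping = banks_by_cpu(CACHE_XML)
print(mapping[0])  # [('l3', 0), ('l3', 1), ('l2', 4)]
print(mapping[3])  # [('l3', 2), ('l3', 3)]
```

CPU 0 ends up with two L3 banks of the same type, which is exactly the case where the id attribute is needed to tell them apart.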

...
 > 3. Add a new virsh command 'nodecachestats':
 > This API exposes the various cache resources left on each piece of hardware (CPU socket).
 > 
 > It will be formatted as:
 > 
 > <resource_type>.<resource_id>: left size KiB
 > 
 > For example, I have a 2-socket host and I've enabled only the cat_l3 feature:
 > 
 > root@s2600wt:~/linux# virsh nodecachestats
 > L3.0 : 56320 KiB
 > L3.1 : 56320 KiB
 > 
 >   P.S. resource_type can be L3, L3DATA, L3CODE, L2 for now.
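
For reference, the proposed output format is simple enough to parse mechanically. The Python sketch below assumes the exact layout shown in the sample above ('nodecachestats' is only a proposed command, and parse_cache_stats is a name made up for this example); it accepts the four resource types listed in the P.S.

```python
import re

# Sample text copied from the proposed 'nodecachestats' output above.
SAMPLE = """\
L3.0 : 56320 KiB
L3.1 : 56320 KiB
"""

# <resource_type>.<resource_id> : <left size> KiB
# Longer type names come first so "L3DATA" is not cut short at "L3".
LINE_RE = re.compile(r"^(L3DATA|L3CODE|L3|L2)\.(\d+)\s*:\s*(\d+)\s+KiB$")


def parse_cache_stats(text):
    """Return {(resource_type, resource_id): free_kib} for each output line.

    Hypothetical helper, for illustration only.
    """
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognised line: {line!r}")
        stats[(match.group(1), int(match.group(2)))] = int(match.group(3))
    return stats


stats = parse_cache_stats(SAMPLE)
print(stats)  # {('L3', 0): 56320, ('L3', 1): 56320}
```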

 This feels like something we should have in the capabilities XML too
 rather than a new command

     <cache>
       <bank type="l3" size="56320" units="KiB" cpus="0,1,2,6,7,8">
           <control unit="KiB" min="2816" avail="56320"/>
       </bank>
       <bank type="l3" size="56320" units="KiB" cpus="3,4,5,9,10,11">
           <control unit="KiB" min="2816" avail="56320"/>
       </bank>
     </cache>

Oops, ignore this. I remember that the reason we always report available
resources separately from physically present resources is that we
don't want to regenerate the capabilities XML every time the available
resources change.

So, yes, we do need some API like virNodeFreeCache() / virsh nodefreecache.
We probably want to use a 2D array of typed parameters. The first level of
the array would represent the cache bank, the second level would represent
the parameters for that bank. E.g. if we had 3 cache banks, we'd report a
3x3 typed parameter array, with parameters for the cache ID, its type and
the available / free size:

   id=0
   type=l3
   avail=56320

   id=1
   type=l3
   avail=56320

   id=2
   type=l3
   avail=56320
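
To make the shape concrete, here is a minimal Python sketch of what a caller might receive and how it could be consumed. Everything here is an assumption for illustration: virNodeFreeCache() does not exist yet, node_free_cache_stub stands in for it, and each bank's typed parameters are modelled as a plain dict.

```python
from typing import Dict, List, Union

# One bank's typed parameters, modelled as a name -> value dict.
TypedParams = Dict[str, Union[int, str]]


def node_free_cache_stub() -> List[TypedParams]:
    """Hypothetical stand-in for the proposed virNodeFreeCache() result.

    Outer list: one entry per cache bank.
    Inner dict: the typed parameters for that bank (id, type, avail in KiB).
    """
    return [
        {"id": 0, "type": "l3", "avail": 56320},
        {"id": 1, "type": "l3", "avail": 56320},
        {"id": 2, "type": "l3", "avail": 56320},
    ]


def free_by_type(banks: List[TypedParams]) -> Dict[str, int]:
    """Sum the available size (KiB) over all banks of each cache type."""
    totals: Dict[str, int] = {}
    for params in banks:
        cache_type = str(params["type"])
        totals[cache_type] = totals.get(cache_type, 0) + int(params["avail"])
    return totals


banks = node_free_cache_stub()
print(len(banks), len(banks[0]))  # 3 3  (the 3x3 shape described above)
print(free_by_type(banks))        # {'l3': 168960}
```

Keeping one parameter set per bank means new fields can be added later without changing the shape of the outer array.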

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://entangle-photo.org       -o-    http://search.cpan.org/~danberr/ :|
