[libvirt] RFC: APIs to manage KSM

Triggered by the requirement to control the new sys knob /sys/kernel/mm/ksm/merge_nodes for a NUMA aware host by patch: http://permalink.gmane.org/gmane.linux.kernel.mm/81497 It shows there is no methods to manage the KSM in libvirt. This proposes the new APIs like following, thanks for suggestions/idea in advance. == The macros and enums for the params' fields /* virNodeKSMRunMode: * * Represents the modes of /sys/kernel/mm/ksm/run. */ typedef enum { /* Stop ksmd from running but keep merged pages */ VIR_NODE_STOP_KSM_KEEP_MERGED_PAGES = 0, /* Start ksmd */ VIR_NODE_START_KSM = 1, /* Stop ksmd and and unmerge all pages currently merged */ VIR_NODE_STOP_KSM_UNMERGE_PAGES = 2, } virNodeKSMRunMode; /* * VIR_NODE_KSM_PAGES_TO_SCAN: * * Macro for typed parameter that represents how many present pages * to scan before ksmd goes to sleep. */ # define VIR_NODE_KSM_PAGES_TO_SCAN "pages_to_scan" /* * VIR_NODE_KSM_SLEEP_MILLISECS: * * Macro for typed parameter that represents how many milliseconds * ksmd should sleep before next scan. */ # define VIR_NODE_KSM_SLEEP_MILLISECS "sleep_millisecs" /* * VIR_NODE_KSM_RUN: * * Macro for typed parameter that is used to control ksmd's state, as * an int containing a virNodeKSMRunMode value. */ # define VIR_NODE_KSM_RUN "run" /* * VIR_NODE_KSM_PAGES_SHARED: * * Macro for typed parameter that represents how many ksm shared pages * are being used. */ # define VIR_NODE_KSM_PAGES_SHARED "pages_shared" /* * VIR_NODE_KSM_PAGES_SHARING: * * Macro for typed parameter that represents how many sites are * sharing the pages i.e. how much saved. */ # define VIR_NODE_KSM_PAGES_SHARING "pages_sharing" /* VIR_NODE_KSM_PAGES_UNSHARED: * * Macro for typed parameter that represents how many pages unique * but repeatedly checked for merging. */ # define VIR_NODE_KSM_PAGES_UNSHARED "pages_unshared" /* VIR_NODE_KSM_PAGES_VOLATILE: * * Macro for typed parameter that represents how many pages changing * too fast to be placed in a tree. */ # define VIR_NODE_KSM_PAGES_VOLATILE "pages_volatile" /* VIR_NODE_KSM_FULL_SCAN: * * Macro for typed parameter that represents how many times all * mergeable areas have been scanned. */ # define VIR_NODE_KSM_FULL_SCAN "full_scan" == API to Get KSM parameters. /* * virNodeGetKSMParameters: * @conn: pointer to the hypervisor connection * @params: pointer to memory parameter object * (return value, allocated by the caller) * @nparams: pointer to number of memory parameters; input and output * @flags: extra flags; not used yet, so callers should always pass 0 * * ... Detailed comments ommitted ... * * Returns 0 in case of success, and -1 in case of failure. */ int virNodeGetKSMParameters(virConnectPtr conn, virTypedParameterPtr params, int *nparams, unsigned int flags); == API to set KSM parameters. Currently only "run", "pages_to_scan", and "sleep_millisecs" are writable. The new boolearn sys knob "merge_nodes" could be added once it's acceptted by kernel upstream. /* * virNodeSetKSMParameters: * @conn: pointer to the hypervisor connection * @params: pointer to scheduler parameter objects * @naprams: number of scheduler parameter objects * (this value can be the same or less than the returned * value nparams of virDomainGetSchedulerType) * @flags: extra flags; not used yet, so callers should always pass 0 * * ... Detailed comments ommitted ... * * Returns 0 in case of success, -1 in case of failure. */ int virNodeSetKSMParameters(virConnectPtr conn, virTypedParameterPtr params, int nparams, unsigned int flags); BTW, the other thing libvirt misses is to tell ksmtuned to retune when a domain is started or destroyed. Which is outlined in the Red Hat Virtualization Administration Guide doc, but it's not true. So either we have a hole in libvirt, or lacks a fix for the document. <...> The KSM tuning service The ksmtuned service does not have any options. The ksmtuned service loops and adjusts ksm. The ksmtuned service is notified by libvirt when a guest is created or destroyed. </...> https://access.redhat.com/knowledge/docs/en-US/Red_Hat_Enterprise_Linux/6/ht... Regards, Osier

On Fri, Aug 31, 2012 at 04:56:01PM +0800, Osier Yang wrote:
Triggered by the requirement to control the new sys knob /sys/kernel/mm/ksm/merge_nodes for a NUMA aware host by patch:
http://permalink.gmane.org/gmane.linux.kernel.mm/81497
It shows there is no methods to manage the KSM in libvirt. This proposes the new APIs like following, thanks for suggestions/idea in advance.
== The macros and enums for the params' fields
/* virNodeKSMRunMode: * * Represents the modes of /sys/kernel/mm/ksm/run. */ typedef enum { /* Stop ksmd from running but keep merged pages */ VIR_NODE_STOP_KSM_KEEP_MERGED_PAGES = 0,
/* Start ksmd */ VIR_NODE_START_KSM = 1,
/* Stop ksmd and and unmerge all pages currently merged */ VIR_NODE_STOP_KSM_UNMERGE_PAGES = 2, } virNodeKSMRunMode;
See my next comment about this...
/* * VIR_NODE_KSM_PAGES_TO_SCAN: * * Macro for typed parameter that represents how many present pages * to scan before ksmd goes to sleep. */ # define VIR_NODE_KSM_PAGES_TO_SCAN "pages_to_scan"
/* * VIR_NODE_KSM_SLEEP_MILLISECS: * * Macro for typed parameter that represents how many milliseconds * ksmd should sleep before next scan. */ # define VIR_NODE_KSM_SLEEP_MILLISECS "sleep_millisecs"
/* * VIR_NODE_KSM_RUN: * * Macro for typed parameter that is used to control ksmd's state, as * an int containing a virNodeKSMRunMode value. */ # define VIR_NODE_KSM_RUN "run"
The other constants here are all tunable parameters. This one however is an action. IMHO we should not use a {Get,Set}Parameters style APIs for actions. We should have an explicit API for invoking them.
/* * VIR_NODE_KSM_PAGES_SHARED: * * Macro for typed parameter that represents how many ksm shared pages * are being used. */ # define VIR_NODE_KSM_PAGES_SHARED "pages_shared"
/* * VIR_NODE_KSM_PAGES_SHARING: * * Macro for typed parameter that represents how many sites are * sharing the pages i.e. how much saved. */ # define VIR_NODE_KSM_PAGES_SHARING "pages_sharing"
/* VIR_NODE_KSM_PAGES_UNSHARED: * * Macro for typed parameter that represents how many pages unique * but repeatedly checked for merging. */ # define VIR_NODE_KSM_PAGES_UNSHARED "pages_unshared"
/* VIR_NODE_KSM_PAGES_VOLATILE: * * Macro for typed parameter that represents how many pages changing * too fast to be placed in a tree. */ # define VIR_NODE_KSM_PAGES_VOLATILE "pages_volatile"
/* VIR_NODE_KSM_FULL_SCAN: * * Macro for typed parameter that represents how many times all * mergeable areas have been scanned. */ # define VIR_NODE_KSM_FULL_SCAN "full_scan"
== API to Get KSM parameters.
/* * virNodeGetKSMParameters: * @conn: pointer to the hypervisor connection * @params: pointer to memory parameter object * (return value, allocated by the caller) * @nparams: pointer to number of memory parameters; input and output * @flags: extra flags; not used yet, so callers should always pass 0 * * ... Detailed comments ommitted ... * * Returns 0 in case of success, and -1 in case of failure. */ int virNodeGetKSMParameters(virConnectPtr conn, virTypedParameterPtr params, int *nparams, unsigned int flags);
== API to set KSM parameters.
Currently only "run", "pages_to_scan", and "sleep_millisecs" are writable. The new boolearn sys knob "merge_nodes" could be added once it's acceptted by kernel upstream.
/* * virNodeSetKSMParameters: * @conn: pointer to the hypervisor connection * @params: pointer to scheduler parameter objects * @naprams: number of scheduler parameter objects * (this value can be the same or less than the returned * value nparams of virDomainGetSchedulerType) * @flags: extra flags; not used yet, so callers should always pass 0 * * ... Detailed comments ommitted ... * * Returns 0 in case of success, -1 in case of failure. */ int virNodeSetKSMParameters(virConnectPtr conn, virTypedParameterPtr params, int nparams, unsigned int flags);
BTW, the other thing libvirt misses is to tell ksmtuned to retune when a domain is started or destroyed. Which is outlined in the Red Hat Virtualization Administration Guide doc, but it's not true. So either we have a hole in libvirt, or lacks a fix for the document.
As a general point, we should not use the term 'KSM' in the API names, since that is an implementation specific name. We want some more generic name. eg virNodeSetMemorySharingParameters likewise for all the enum/constant names. Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 08/31/2012 07:46 AM, Daniel P. Berrange wrote:
On Fri, Aug 31, 2012 at 04:56:01PM +0800, Osier Yang wrote:
Triggered by the requirement to control the new sys knob /sys/kernel/mm/ksm/merge_nodes for a NUMA aware host by patch:
As a general point, we should not use the term 'KSM' in the API names, since that is an implementation specific name. We want some more generic name. eg
virNodeSetMemorySharingParameters
likewise for all the enum/constant names.
For that matter, why can't we reuse the existing virDomainSetMemoryParameters() for the enum/constants? That is, whether or not to use ksm is a memory parameter. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On Fri, Aug 31, 2012 at 07:54:31AM -0700, Eric Blake wrote:
On 08/31/2012 07:46 AM, Daniel P. Berrange wrote:
On Fri, Aug 31, 2012 at 04:56:01PM +0800, Osier Yang wrote:
Triggered by the requirement to control the new sys knob /sys/kernel/mm/ksm/merge_nodes for a NUMA aware host by patch:
As a general point, we should not use the term 'KSM' in the API names, since that is an implementation specific name. We want some more generic name. eg
virNodeSetMemorySharingParameters
likewise for all the enum/constant names.
For that matter, why can't we reuse the existing virDomainSetMemoryParameters() for the enum/constants? That is, whether or not to use ksm is a memory parameter.
Actually this is a host level action. There is a separate non-upstreamed patch which lets us control whether KSM is used per-guest, which is waiting for a fix to QEMU Daniel -- |: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :| |: http://libvirt.org -o- http://virt-manager.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|

On 08/31/2012 07:57 AM, Daniel P. Berrange wrote:
some more generic name. eg
virNodeSetMemorySharingParameters
likewise for all the enum/constant names.
For that matter, why can't we reuse the existing virDomainSetMemoryParameters() for the enum/constants? That is, whether or not to use ksm is a memory parameter.
Actually this is a host level action. There is a separate non-upstreamed patch which lets us control whether KSM is used per-guest, which is waiting for a fix to QEMU
Oh, I missed virNode vs. virDomain. Still, consistent naming argues for virNodeSetMemoryParameters (not MemorySharingParameters), so that we can add future MemoryParameters unrelated to sharing. -- Eric Blake eblake@redhat.com +1-919-301-3266 Libvirt virtualization library http://libvirt.org

On 2012年08月31日 23:00, Eric Blake wrote:
On 08/31/2012 07:57 AM, Daniel P. Berrange wrote:
some more generic name. eg
virNodeSetMemorySharingParameters
likewise for all the enum/constant names.
For that matter, why can't we reuse the existing virDomainSetMemoryParameters() for the enum/constants? That is, whether or not to use ksm is a memory parameter.
Actually this is a host level action. There is a separate non-upstreamed patch which lets us control whether KSM is used per-guest, which is waiting for a fix to QEMU
Oh, I missed virNode vs. virDomain. Still, consistent naming argues for virNodeSetMemoryParameters (not MemorySharingParameters), so that we can add future MemoryParameters unrelated to sharing.
I'm afraid there could be too much parameter fields in future if using MemoryParameters. That is, do we want to have one "big" API, or smaller ones?

On 2012年08月31日 22:46, Daniel P. Berrange wrote:
On Fri, Aug 31, 2012 at 04:56:01PM +0800, Osier Yang wrote:
Triggered by the requirement to control the new sys knob /sys/kernel/mm/ksm/merge_nodes for a NUMA aware host by patch:
http://permalink.gmane.org/gmane.linux.kernel.mm/81497
It shows there is no methods to manage the KSM in libvirt. This proposes the new APIs like following, thanks for suggestions/idea in advance.
== The macros and enums for the params' fields
/* virNodeKSMRunMode: * * Represents the modes of /sys/kernel/mm/ksm/run. */ typedef enum { /* Stop ksmd from running but keep merged pages */ VIR_NODE_STOP_KSM_KEEP_MERGED_PAGES = 0,
/* Start ksmd */ VIR_NODE_START_KSM = 1,
/* Stop ksmd and and unmerge all pages currently merged */ VIR_NODE_STOP_KSM_UNMERGE_PAGES = 2, } virNodeKSMRunMode;
See my next comment about this...
/* * VIR_NODE_KSM_PAGES_TO_SCAN: * * Macro for typed parameter that represents how many present pages * to scan before ksmd goes to sleep. */ # define VIR_NODE_KSM_PAGES_TO_SCAN "pages_to_scan"
/* * VIR_NODE_KSM_SLEEP_MILLISECS: * * Macro for typed parameter that represents how many milliseconds * ksmd should sleep before next scan. */ # define VIR_NODE_KSM_SLEEP_MILLISECS "sleep_millisecs"
/* * VIR_NODE_KSM_RUN: * * Macro for typed parameter that is used to control ksmd's state, as * an int containing a virNodeKSMRunMode value. */ # define VIR_NODE_KSM_RUN "run"
The other constants here are all tunable parameters. This one however is an action. IMHO we should not use a {Get,Set}Parameters style APIs for actions. We should have an explicit API for invoking them.
Understand, and I tend to agree, how about a separate API like "virNodeManageSharedMemory"?
/* * VIR_NODE_KSM_PAGES_SHARED: * * Macro for typed parameter that represents how many ksm shared pages * are being used. */ # define VIR_NODE_KSM_PAGES_SHARED "pages_shared"
/* * VIR_NODE_KSM_PAGES_SHARING: * * Macro for typed parameter that represents how many sites are * sharing the pages i.e. how much saved. */ # define VIR_NODE_KSM_PAGES_SHARING "pages_sharing"
/* VIR_NODE_KSM_PAGES_UNSHARED: * * Macro for typed parameter that represents how many pages unique * but repeatedly checked for merging. */ # define VIR_NODE_KSM_PAGES_UNSHARED "pages_unshared"
/* VIR_NODE_KSM_PAGES_VOLATILE: * * Macro for typed parameter that represents how many pages changing * too fast to be placed in a tree. */ # define VIR_NODE_KSM_PAGES_VOLATILE "pages_volatile"
/* VIR_NODE_KSM_FULL_SCAN: * * Macro for typed parameter that represents how many times all * mergeable areas have been scanned. */ # define VIR_NODE_KSM_FULL_SCAN "full_scan"
== API to Get KSM parameters.
/* * virNodeGetKSMParameters: * @conn: pointer to the hypervisor connection * @params: pointer to memory parameter object * (return value, allocated by the caller) * @nparams: pointer to number of memory parameters; input and output * @flags: extra flags; not used yet, so callers should always pass 0 * * ... Detailed comments ommitted ... * * Returns 0 in case of success, and -1 in case of failure. */ int virNodeGetKSMParameters(virConnectPtr conn, virTypedParameterPtr params, int *nparams, unsigned int flags);
== API to set KSM parameters.
Currently only "run", "pages_to_scan", and "sleep_millisecs" are writable. The new boolearn sys knob "merge_nodes" could be added once it's acceptted by kernel upstream.
/* * virNodeSetKSMParameters: * @conn: pointer to the hypervisor connection * @params: pointer to scheduler parameter objects * @naprams: number of scheduler parameter objects * (this value can be the same or less than the returned * value nparams of virDomainGetSchedulerType) * @flags: extra flags; not used yet, so callers should always pass 0 * * ... Detailed comments ommitted ... * * Returns 0 in case of success, -1 in case of failure. */ int virNodeSetKSMParameters(virConnectPtr conn, virTypedParameterPtr params, int nparams, unsigned int flags);
BTW, the other thing libvirt misses is to tell ksmtuned to retune when a domain is started or destroyed. Which is outlined in the Red Hat Virtualization Administration Guide doc, but it's not true. So either we have a hole in libvirt, or lacks a fix for the document.
As a general point, we should not use the term 'KSM' in the API names, since that is an implementation specific name. We want some more generic name. eg
Completely agreed. :-)
virNodeSetMemorySharingParameters
But I recall we once used term "sharedMemory" before. So to be consistent, "SharedMemory" should be better. Regards, Osier
participants (3)
-
Daniel P. Berrange
-
Eric Blake
-
Osier Yang