[Libvir] The problem of the definition of tuning informations

I promised this mail for the beginning of the week, but I am still having a very hard time formulating a good plan of action. I'm still stuck in a dilemma, see below.

What is it?
-----------

I think tuning information is the set of parameters associated with a domain or a host which are not strictly needed to get the domain(s) working but improve their runtime behaviour. To me this includes:

  - scheduling parameters; the scope may be host/hypervisor/domain
  - vcpu affinity, i.e. to which set of physical CPUs each of the vcpus may be bound
  - and possibly others ...

The problem:
------------

People would like to associate these with the XML domain information, the goal being to be able to restore them when a domain (re-)starts. I have been objecting to this so far because I think this information doesn't have the same lifetime and scope as the other domain information saved in the XML. Since it is not needed to start the domain, and once the domain is started the existing domain API can be used to change it, it is better to keep it separate.

However I got objections from David Lutterkort [1], Jim Fehlig [2], and John Levon [3], plus of course the initial request for it from Tatsuro Enokura (and the Fujitsu people in general) [4].

The problem to me comes from 2 things:

  1/ storing tuning information in domain descriptions is not sufficient
  2/ if we store it there we also need to always save it when exporting the XML domain file

2/ is fairly important to avoid a lot of problems, as we have experienced before, for example with console information. If the input for virsh create differs from the output of dumpxml, a lot of rather annoying things happen in practice, and it certainly generates confusion. So we really need to output the tuning data if we put it in.

Also I strongly believe in 1/, i.e. tuning information is cross-domain and very likely to change fast as soon as management applications get deployed. Even in relatively small deployments, tuning is rather per-host information, which may depend on the current workload of the machine. I don't believe in tuning being loaded at create time and never changing later. Even in my own very basic usage that doesn't match what I do, which is to load and stop domains on demand for short periods of time.

My opinion:
-----------

We need better tools, even for simple use cases, to be able to save an existing tuning for a domain or a full machine and reload it when needed. This is IMHO better done on top of the existing API, which already has the entry points to implement them. My idea is to provide tuning commands in virsh [5].

If you implement tuning both at creation time and in the tool, you either make them different, in which case you have no coherency between what you say when you create a domain or save its config and what you do at the virsh level; or you don't make them different (for example by trying to use the same kind of XML syntax), in which case you need code for doing this both in the tool and in the library itself, or you export the tuning load and save as a new API. Exporting as a parallel API what we already have for scheduling and VCPU affinity makes the API more complex and less coherent.

I don't want to force the decision one way or another, and it is probable I missed something, but I don't think adding tuning information to the domain configuration file is really that convenient. It could be done better, and with fewer associated problems, by keeping them separate.
Daniel

[1] https://www.redhat.com/archives/libvir-list/2007-October/msg00250.html
[2] https://www.redhat.com/archives/libvir-list/2007-November/msg00003.html
[3] https://www.redhat.com/archives/libvir-list/2007-October/msg00046.html
[4] https://www.redhat.com/archives/libvir-list/2007-October/msg00221.html
[5] https://www.redhat.com/archives/libvir-list/2007-October/msg00245.html

--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/
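As an illustration of the "save an existing tuning and reload it when needed" idea in the message above, here is a minimal Python sketch. Everything in it is a hypothetical assumption made for illustration: the `TuningProfile` structure, its field names and the JSON layout are not a libvirt or virsh format. A real tool would presumably fill such a profile through the existing scheduler and vcpu affinity entry points and apply it once the domain has been created.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class TuningProfile:
    """Hypothetical per-domain tuning snapshot, kept apart from the domain XML."""
    domain: str
    # Scheduler parameters as name -> value (names are illustrative only).
    scheduler: dict = field(default_factory=dict)
    # vcpu number -> list of physical CPUs that vcpu may run on.
    vcpu_pins: dict = field(default_factory=dict)


def save_profile(profile: TuningProfile) -> str:
    """Serialize the profile to JSON text."""
    return json.dumps(asdict(profile), indent=2, sort_keys=True)


def load_profile(text: str) -> TuningProfile:
    """Rebuild a profile from the JSON text produced by save_profile()."""
    raw = json.loads(text)
    # JSON object keys are always strings, so restore integer vcpu numbers.
    pins = {int(vcpu): cpus for vcpu, cpus in raw["vcpu_pins"].items()}
    return TuningProfile(raw["domain"], raw["scheduler"], pins)
```

Keeping the snapshot in its own file, rather than in the domain XML, means the input to virsh create and the output of dumpxml are not affected by it.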

Daniel Veillard wrote: <snip>
My opinion:
-----------
We need better tools, even for simple use cases, to be able to save an existing tuning for a domain or a full machine and reload it when needed. This is IMHO better done on top of the existing API, which already has the entry points to implement them. My idea is to provide tuning commands in virsh [5]. If you implement tuning both at creation time and in the tool, you either make them different, in which case you have no coherency between what you say when you create a domain or save its config and what you do at the virsh level; or you don't make them different (for example by trying to use the same kind of XML syntax), in which case you need code for doing this both in the tool and in the library itself, or you export the tuning load and save as a new API. Exporting as a parallel API what we already have for scheduling and VCPU affinity makes the API more complex and less coherent.
Daniel,

Just to be sure I understand, are you suggesting removing tuning information from any configuration file and making it a runtime exercise to set it up? (That is, after the domain has been started)

--
Elizabeth Kon (Beth)
IBM Linux Technology Center
Open Hypervisor Team
email: eak@us.ibm.com

On Thu, Nov 08, 2007 at 02:34:05PM -0500, beth kon wrote:
Daniel,
Just to be sure I understand, are you suggesting removing tuning information from any configuration file and making it a runtime exercise to set it up? (That is, after the domain has been started)
It would not be removing, as I don't think we have any at this point. The only exceptions would be the 'currentMemory' parameter and the cpuset information; both are optional obviously, but may be absolutely necessary at creation time to avoid a broken or failed setup of the domain. And yes, tuning would be a runtime exercise (which an application can activate immediately after the completion of the Create command).

Daniel

On Thu, Nov 08, 2007 at 04:31:49PM -0500, Daniel Veillard wrote:
On Thu, Nov 08, 2007 at 02:34:05PM -0500, beth kon wrote:
Just to be sure I understand, are you suggesting removing tuning information from any configuration file and making it a runtime exercise to set it up? (That is, after the domain has been started)
It would not be removing, as I don't think we have any at this point. The only exceptions would be the 'currentMemory' parameter and the cpuset information; both are optional obviously, but may be absolutely necessary at creation time to avoid a broken or failed setup of the domain.
Hum, I'm afraid I may have been unclear there: currentMemory and cpuset aren't really tuning parameters and need to stay in the XML config. They are just 2 pieces of information which are on the edge.

Daniel

* Daniel Veillard <veillard@redhat.com> [2007-11-08 10:08]:
I promised that mail for the beginning of the week but I still have
I think tuning information is the set of parameters associated with a domain or a host which are not strictly needed to get the domain(s) working but improve their runtime behaviour. To me this includes:
  - scheduling parameters; the scope may be host/hypervisor/domain
  - vcpu affinity, i.e. to which set of physical CPUs each of the vcpus may be bound
  - and possibly others ...
The problem:
------------
People would like to associate these with the XML domain information, the goal being to be able to restore them when a domain (re-)starts. I have been objecting to this so far because I think this information doesn't have the same lifetime and scope as the other domain information saved in the XML. Since it is not needed to start the domain, and once the domain is started the existing domain API can be used to change it, it is better to keep it separate.
For at least (maybe only) Xen NUMA systems, applying "tuning" information after a domain is started does not achieve the same effect as including the information during the initial construction of the domain. In particular, Xen needs to know which physical cpus are being used in order to determine from which NUMA node it will allocate memory. Adjusting affinity after the domain has allocated memory doesn't allow libvirt or any management app to control from which node domains pull memory.

I don't have any objection to separating "tuning" information as long as we have the ability to merge permanent domain parameters with its "tuning" information prior to domain construction.

--
Ryan Harper
Software Engineer; Linux Technology Center
IBM Corp., Austin, Tx
(512) 838-9253 T/L: 678-9253
ryanh@us.ibm.com

On Thu, Nov 08, 2007 at 02:00:10PM -0600, Ryan Harper wrote:
For at least (maybe only) Xen NUMA systems, applying "tuning" information after a domain is started does not achieve the same effect as including the information during the initial construction of the domain. In particular, Xen needs to know which physical cpus are being used in order to determine from which NUMA node it will allocate memory. Adjusting affinity after the domain has allocated memory doesn't allow libvirt or any management app to control from which node domains pull memory.
yes, I understand, and that's why I agreed to add the cpuset information: at that point it's more than tuning because it may be irreversible for the lifetime of the domain, so this really should be in the XML. I'm not suggesting going back on 'cpu affinity', i.e. to which physical CPUs a domain should be bound, but 'vcpu affinity', i.e. how the virtual CPUs of the domain are then mapped onto that cpu set, can change dynamically without (serious) performance penalty.
I don't have any objection to separating "tuning" information as long as we have the ability to merge permanent domain parameters with its "tuning" information prior to domain construction.
My point is that you don't need the tuning information to create the domain; if you need it, it's not tuning. When you say you want to merge them, do you want this in order to create the domain? It should not be necessary (or I'd take a counterexample that would help me), right?

Daniel

* Daniel Veillard <veillard@redhat.com> [2007-11-08 15:27]:
yes, I understand, and that's why I agreed to add the cpuset information: at that point it's more than tuning because it may be irreversible for the lifetime of the domain, so this really should be in the XML. I'm not suggesting going back on 'cpu affinity', i.e. to which physical CPUs a domain should be bound, but 'vcpu affinity', i.e. how the virtual CPUs of the domain are then mapped onto that cpu set, can change
OK, I see your distinction here.
dynamically without (serious) performance penalty.
At least for Xen, the 'cpu' affinity specified with a domain is only accessible via the xen config file and is not enforced in any way that would prevent someone from "tuning" a domain to use physical cpus outside of the specified cpumap. Users can certainly specify a cpu outside of the original cpuset from the config file, which in a NUMA scenario has the potential for serious performance penalties.
I don't have any objection to separating "tuning" information as long as we have the ability to merge permanent domain parameters with its "tuning" information prior to domain construction.
My point is that you don't need the tuning information to create the domain; if you need it, it's not tuning. When you say you want to merge them, do you want this in order to create the domain? It should not be necessary (or I'd take a counterexample that would help me), right?
I agree here. I was lumping cpuset info into your tunable category, but you clarified the distinction above. I just want to ensure that the initial cpuset mapping is present prior to constructing a domain, as that is integral for proper Xen NUMA memory allocation.

-- Ryan Harper
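The concern raised in this message, that nothing stops a later "tuning" step from pinning a vcpu outside the cpuset the domain was created with, could be caught by a management tool with a check along these lines. This is only a sketch: the function name and the plain set/dict inputs are assumptions made for illustration, not an existing libvirt or Xen interface.

```python
def pins_outside_cpuset(cpuset, vcpu_pins):
    """Return {vcpu: [physical CPUs outside cpuset]} for each offending vcpu.

    cpuset    -- iterable of physical CPU numbers the domain was created with
    vcpu_pins -- mapping of vcpu number -> iterable of physical CPUs it is
                 about to be pinned to
    """
    allowed = set(cpuset)
    offenders = {}
    for vcpu, cpus in vcpu_pins.items():
        stray = sorted(set(cpus) - allowed)
        if stray:
            offenders[vcpu] = stray
    return offenders
```

A tool could warn when the result is non-empty, since on a NUMA host such a pin may move a vcpu away from the node its memory was allocated on.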

On Thu, Nov 08, 2007 at 03:41:12PM -0600, Ryan Harper wrote:
* Daniel Veillard <veillard@redhat.com> [2007-11-08 15:27]:
yes, I understand, and that's why I agreed to add the cpuset information: at that point it's more than tuning because it may be irreversible for the lifetime of the domain, so this really should be in the XML. I'm not suggesting going back on 'cpu affinity', i.e. to which physical CPUs a domain should be bound, but 'vcpu affinity', i.e. how the virtual CPUs of the domain are then mapped onto that cpu set, can change
OK, I see your distinction here.
okay, good that this is clarified; bear with me, it's not always simple to explain this kind of thing :-)
dynamically without (serious) performance penalty.
At least for Xen, the 'cpu' affinity specified with a domain is only accessible via the xen config file and is not enforced in any way that would prevent someone from "tuning" a domain to use physical cpus outside of the specified cpumap. Users can certainly specify a cpu outside of the original cpuset from the config file, which in a NUMA scenario has the potential for serious performance penalties.
Well, all tuning parameters I can think of can actually harm the system; if there were no possible drawback they would be integrated in the system's default mechanism, I guess :-)
I don't have any objection to separating "tuning" information as long as we have the ability to merge permanent domain parameters with its "tuning" information prior to domain construction.
My point is that you don't need the tuning information to create the domain; if you need it, it's not tuning. When you say you want to merge them, do you want this in order to create the domain? It should not be necessary (or I'd take a counterexample that would help me), right?
I agree here. I was lumping cpuset info into your tunable category, but you clarified the distinction above. I just want to ensure that the initial cpuset mapping is present prior to constructing a domain, as that is integral for proper Xen NUMA memory allocation.
okay, sure, that was clear in my mind but wasn't clear in my wording. I hope there is no other issue.

Daniel

Daniel Veillard wrote:
My point is that you don't need the tuning information to create the domain; if you need it, it's not tuning.
Well said Daniel - a simple point I was missing. I withdraw my objection [1].

Thanks,
Jim

[1] https://www.redhat.com/archives/libvir-list/2007-November/msg00003.html

On Thu, Nov 08, 2007 at 06:06:24PM -0700, Jim Fehlig wrote:
Daniel Veillard wrote:
On Thu, Nov 08, 2007 at 02:00:10PM -0600, Ryan Harper wrote:
I don't have any objection to separating "tuning" information as long as we have the ability to merge permanent domain parameters with its "tuning" information prior to domain construction.
My point is that you don't need the tuning information to create the domain; if you need it, it's not tuning.
Well said Daniel - a simple point I was missing. I withdraw my objection [1].
Hey Jim, thanks for the update! And welcome back :-)

Daniel

Daniel Veillard wrote:
yes, I understand, and that's why I agreed to add the cpuset information: at that point it's more than tuning because it may be irreversible for the lifetime of the domain, so this really should be in the XML. I'm not suggesting going back on 'cpu affinity', i.e. to which physical CPUs a domain should be bound, but 'vcpu affinity', i.e. how the virtual CPUs of the domain are then mapped onto that cpu set, can change dynamically without (serious) performance penalty.
I don't have any objection to separating "tuning" information as long as we have the ability to merge permanent domain parameters with its "tuning" information prior to domain construction.
My point is that you don't need the tuning information to create the domain; if you need it, it's not tuning. When you say you want to merge them, do you want this in order to create the domain? It should not be necessary (or I'd take a counterexample that would help me), right?
It seems to me that the only reason cpuset information is being treated as more than tuning is due to an artifact of Xen (i.e., it must be specified at domain creation). For KVM, for example, I believe this can be specified after domain creation.

From a libvirt perspective, I think the XML config/tuning split should be hypervisor-neutral, and based solely on what is required to get a domain running (ignoring performance):

1) XML contains arguments absolutely needed to start a domain in any hypervisor. This could be thought of as the minimum requirements for starting a domain.
2) Tuning information contains arguments that affect performance, and may be changed.

When a domain is started, the caller can specify a minimal start (XML only) or a tuned start (XML plus tuning). Lower level libvirt code would understand the specifics of the hypervisor well enough to know whether it had to include some of the tuning information at domain creation time.
-- Elizabeth Kon (Beth)

beth kon wrote:
It seems to me that the only reason cpuset information is being treated as more than tuning is due to an artifact of Xen (i.e., it must be specified at domain creation). For KVM, for example, I believe this can be specified after domain creation.
From a libvirt perspective, I think the XML config/tuning split should be hypervisor-neutral, and based solely on what is required to get a domain running (ignoring performance):
1) XML contains arguments absolutely needed to start a domain in any hypervisor. This could be thought of as the minimum requirements for starting a domain.
2) Tuning information contains arguments that affect performance, and may be changed.
When a domain is started, the caller can specify a minimal start (XML only) or a tuned start (XML plus tuning). Lower level libvirt code would understand the specifics of the hypervisor well enough to know whether it had to include some of the tuning information at domain creation time.
Daniel and I have been discussing this a bit on IRC, so I will dump that information on the list... (correct me if I misstate something here, Daniel :-)

Daniel wants to have the xml contain all parameters that must be specified at domain creation in order to achieve proper function, and cannot be tuned later. I agree this is a reasonable definition. In this case, cpuset would need to be in the xml.

My concern is that currently Xen will fail a domain create request if the cpu is out of range with the error "invalid argument", so the user will not have enough information to correct the problem in the xml and try again. We can pursue getting a more explicit error message from Xen. Or Xen could ignore the cpuset and start the domain, perhaps with a warning message.

My thinking was that ideally it might be good to have libvirt provide 2 start methods - minimal and tuned, but Daniel thinks it is not worth the complexity. It should be up to the user to correct issues in the xml and try again.
-- Elizabeth Kon (Beth)

On Tue, Nov 13, 2007 at 11:51:09AM -0500, beth kon wrote:
beth kon wrote:
When a domain is started, the caller can specify a minimal start (XML only) or a tuned start (XML plus tuning). Lower level libvirt code would understand the specifics of the hypervisor well enough to know whether it had to include some of the tuning information at domain creation time.
Daniel and I have been discussing this a bit on IRC, so I will dump that information on the list... (correct me if I misstate something here, Daniel :-)
Daniel wants to have the xml contain all parameters that must be specified at domain creation in order to achieve proper function, and cannot be tuned later. I agree this is a reasonable definition. In this case, cpuset would need to be in the xml.
Yup, that's what I tried to express :-)
My concern is that currently Xen will fail a domain create request if the cpu is out of range with the error "invalid argument", so the user will not have enough information to correct the problem in the xml and try again. We can pursue getting a more explicit error message from Xen. Or Xen could ignore the cpuset and start the domain, perhaps with a warning message.
My thinking was that ideally it might be good to have libvirt provide 2 start methods - minimal and tuned, but Daniel thinks it is not worth the complexity. It should be up to the user to correct issues in the xml and try again.
We really need to get some reporting. Usually 'invalid argument' arises, for example, if you try to start an x86_64 guest on an i686 host, i.e. something as broken as a wrong architecture. One thing we can do in libvirt is check the input parameters: we know how many physical CPUs are on the box, and we do parse the cpuset attribute, so we could either:
 - drop the information about cpuset and give an error
 - abort the create operation and also give the error
Unfortunately we can't do anything about predefined domains already in the xen database. But I think this should cope with most cases in libvirt.

Oh and of course ideally we should get a xend patch upstream to give back a correct error message, but that's not something we can control.

thanks !

Daniel

-- 
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard | virtualization library http://libvirt.org/
veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/
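The pre-create check suggested in this message could look roughly like the sketch below. It is written in Python for brevity and is not libvirt code: the function names `parse_cpuset` and `check_cpuset`, the accepted syntax (comma-separated CPU numbers and ranges such as `0-3,5`), and the wording of the messages are all assumptions made for illustration.

```python
def parse_cpuset(spec):
    """Parse a cpuset string such as "0-3,5" into a sorted list of CPU numbers.

    Assumed syntax: comma-separated non-negative integers or "low-high" ranges.
    Raises ValueError on anything else.
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError("empty element in cpuset %r" % spec)
        if "-" in part:
            low_s, high_s = part.split("-", 1)
            low, high = int(low_s), int(high_s)
            if low > high:
                raise ValueError("reversed range %r in cpuset %r" % (part, spec))
            cpus.update(range(low, high + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def check_cpuset(spec, host_cpus):
    """Return None if every CPU in spec exists on the host, else an error string.

    host_cpus is the number of physical CPUs, numbered 0 .. host_cpus - 1.
    """
    try:
        cpus = parse_cpuset(spec)
    except ValueError as exc:
        return "malformed cpuset: %s" % exc
    bad = [c for c in cpus if c >= host_cpus]
    if bad:
        return "cpuset %r names CPU(s) %s but the host only has CPUs 0-%d" % (
            spec, ", ".join(str(c) for c in bad), host_cpus - 1)
    return None
```

A caller could run `check_cpuset` on the cpuset attribute before handing the domain to the hypervisor, then either abort the create with the returned message or drop the cpuset and report the message, which are the two options listed above.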

Hi, Daniel

Daniel Veillard wrote:
I promised that mail for the beginning of the week but I still have a very hard time trying to formulate a good plan of action. I'm still stuck in a dilemma, see below.
What is it?
-----------
I think tuning information is that set of parameters associated with a domain or a host which are not strictly needed to get the domain(s) working but improve their runtime behaviour. To me this includes:
 - scheduling parameters: the scope may be host/hypervisor/domain
 - vcpu affinity, i.e. to which set of physical CPUs each of the vcpus may be bound
 - and possibly others ...
The problem:
------------
People would like to associate those with the XML domain information, the goal being to be able to restore that information when a domain (re-)starts. I have been objecting to it so far because I think that information doesn't have the same lifetime and scope as the other domain information saved in the XML. Since it is not needed to start the domain, and since once the domain is started the existing domain API can be used to change it, it is better to keep it separate. However I got objections from David Lutterkort [1], Jim Fehlig [2], and John Levon [3], plus of course the initial request for it from Tatsuro Enokura (and the Fujitsu people in general) [4].

The problem to me comes from 2 things:
 1/ storing tuning information in domain descriptions is not sufficient
 2/ if we store it there we also need to always save it when exporting the XML domain file
2/ is fairly important to avoid a lot of problems, as we have experienced before, for example with console information. If the input for virsh create differs from the output of dumpxml, a lot of rather annoying things happen in practice, and it certainly generates confusion. So we really need to output that tuning data if we put it in.

Also I strongly believe in 1/, i.e. tuning information is cross-domain and is very likely to change fast as soon as the management applications get deployed. Even in a relatively small deployment the tuning is rather per-host information, which may depend on the current workload of the machine. I don't believe in tuning being loaded at create time and never changing later. Even in my own very basic usage that doesn't match my use, which leads me to load and stop domains on demand for short periods of time.
My opinion:
-----------
We need better tools, even for simple use cases, to be able to save an existing tuning for a domain or a full machine and reload it when needed. This is IMHO better done on top of the existing API, which already has the entry points to implement them. My idea is to provide tuning commands in virsh [5]. If you implement tuning both at creation time and in the tool, this means you either make them different, in which case you have no coherency between what you say when you create a domain or save its config and what you do at the virsh level, or you don't make them different (for example by trying to use the same kind of XML syntax), in which case you need code for doing this both in the tool and in the library itself, or you export the tuning load and save as a new API. Exporting as a parallel API what we already have for scheduling and VCPU affinity makes the API more complex and less coherent.

I don't want to force the decision one way or another, and it is probable I missed something, but I don't think adding tuning information to the domain configuration file is really that convenient. It could be done better, and with fewer associated problems, by keeping those separate.
I agree that weight/cap information is not necessary at boot. I also agree that this information should be stored as tuning information rather than in the XML format, since its lifetime is different.

I hope two things will be considered:
 1) tuning information can be set at boot
 2) tuning information can be set via the libvirt API while the domain is shut off.

Tatsuro Enokura
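As an illustration of keeping tuning separate from the domain XML, the following sketch saves a domain's scheduler parameters (such as weight/cap) and vcpu pinning to a JSON string and re-applies them later. It assumes a domain object with the methods `schedulerParameters()`, `setSchedulerParameters()`, `vcpus()` and `pinVcpu()` as exposed by the libvirt Python bindings; the JSON layout, the helper names `save_tuning` and `load_tuning`, and the `FakeDomain` stand-in are hypothetical and exist only so the sketch can run without a hypervisor.

```python
import json


def save_tuning(dom):
    """Capture scheduler parameters and per-vcpu pinning as a JSON string."""
    # Assumption: vcpus() returns (vcpu_info_list, cpumaps), where each cpumap
    # is a sequence of booleans, one per physical CPU.
    _info, cpumaps = dom.vcpus()
    return json.dumps({
        "scheduler": dom.schedulerParameters(),
        "pinning": [list(m) for m in cpumaps],
    }, sort_keys=True)


def load_tuning(dom, text):
    """Re-apply tuning previously captured by save_tuning() to a running domain."""
    data = json.loads(text)
    dom.setSchedulerParameters(data["scheduler"])
    for vcpu, cpumap in enumerate(data["pinning"]):
        dom.pinVcpu(vcpu, tuple(cpumap))


class FakeDomain:
    """Hypothetical in-memory stand-in for a domain object (illustration only)."""

    def __init__(self, sched, cpumaps):
        self.sched = dict(sched)
        self.cpumaps = [tuple(m) for m in cpumaps]

    def schedulerParameters(self):
        return dict(self.sched)

    def setSchedulerParameters(self, params):
        self.sched.update(params)

    def vcpus(self):
        return ([], list(self.cpumaps))

    def pinVcpu(self, vcpu, cpumap):
        self.cpumaps[vcpu] = tuple(cpumap)
```

Note that this sketch only applies tuning to a domain that is already running; it does not address point 2) above, setting tuning while the domain is shut off, which would need the values to be stored somewhere and applied at the next start.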
participants (5): beth kon, Daniel Veillard, Jim Fehlig, Ryan Harper, Tatsuro Enokura