Add option to skip cpu feature and model verification in libvirt and rely on qemu 'enforce' option

Hi Everyone, We had a requirement to skip all CPU feature validations from libvirt and totally rely on Qemu for following reasons. * As libvirt always lies behind in term of cpu feature management compare to qemu, so we may not be able to use all of the latest support provided by qemu till it is updated in libvirt too. For example like if new cpu features are added in qemu but not yet updated in libvirt. * Libvirt does not support cpu model versioning as Qemu. * Libvirt checks if a features is supported or not based on cpuid, a feature supported by host may not be supported by qemu or there can be some features which are emulated by Qemu so libvirt level checking does not make sense in those cases. * Qemu already provide an option 'enforce' to validate if features with which vm is started is exactly same as one provided and nothing is silently dropped. Basically we want another check option like 'qemu-enforce' which will skip all feature and cpu model verfications in libvirt and pass 'enforce' option to qemu. It will work similar to check 'none' i mean no check to verify if provide features are supported by host, also on top of that skip things like virCPUValidateFeatures where ever required. I understand this is specific to just Qemu and libvirt has to take care for many other emulator, so wanted to check if something like that makes sense as it will be very useful for Qemu atleast. If so we can send a patch. Thanks Manish Mishra

Hi. On Wed, Nov 16, 2022 at 22:53:18 +0530, manish.mishra wrote:
We had a requirement to skip all CPU feature validations from libvirt and totally rely on Qemu for following reasons.
You are mixing two different things here. The "enforce" CPU checking for making sure a guest gets exactly what the user asked QEMU for and missing CPU models or features support in libvirt.
* As libvirt always lies behind in term of cpu feature management compare to qemu, so we may not be able to use all of the latest support provided by qemu till it is updated in libvirt too. For example like if new cpu features are added in qemu but not yet updated in libvirt.
So you effectively ask for a "passthrough" configuration, that is, CPU model and features would be passed directly to QEMU without libvirt even looking at them, right? Anyway, the current goal is to make libvirt ask for supported CPU models including their definition so that we don't have to keep our own definitions. That is, libvirt would support all models as they are added to QEMU without any additional work. Not sure if CPU features would follow, but adding them is fairly easy and what's more important they don't change in time (unlike CPU models).
* Libvirt checks if a features is supported or not based on cpuid, a feature supported by host may not be supported by qemu or there can be some features which are emulated by Qemu so libvirt level checking does not make sense in those cases.
This has not been the case for several years already. Libvirt asks QEMU what CPU features it can provide on the host (either natively or emulated) and checks against them. So no issue here. Well, expect that some features are strange and QEMU does not report them as enabled for -cpu host even though it would enable them for some specific CPU models. But that's more in a "bug" area which needs to be solved between QEMU and libvirt rather than being it a principal issue.
* Qemu already provide an option 'enforce' to validate if features with which vm is started is exactly same as one provided and nothing is silently dropped.
Right, but it's not enough. In addition to removed features libvirt also checks for unexpectedly added features. And you really need to do both. Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and QEMU says everything is fine, the guest might see more than what you asked. For example, if a feature is enabled only if a host supports it you may or may not get it without any complains from QEMU. But if you get it you really need to explicitly ask for it during migration, otherwise the feature can just silently disappear. Of course, this would be a really bad behavior from QEMU, but that does not mean it can't happen (I think SVM is a bit problematic in this way) and the whole point of libvirt's checks is to prevent this kind of issues. So once QEMU starts we checks all features enabled in the vCPU and note those that were (unexpectedly) added so that we can explicitly ask for them during migration and also check that no other features were added on top. And for this whole process to work, we need to understand, i.e., have explicit support, for all CPU features (QEMU report a lot of CPU properties and we need to know which one needs to be stored as enabled features) and CPU models to make sure both sides of migration understand the CPU definition in domain XML in the same way.
Basically we want another check option like 'qemu-enforce' which will skip all feature and cpu model verfications in libvirt and pass 'enforce' option to qemu. It will work similar to check 'none' i mean no check to verify if provide features are supported by host, also on top of that skip things like virCPUValidateFeatures where ever required.
In short, I don't think this is a good idea and we should or could support it and still maintain migration safety. If you don't care about migration safety, you don't need to care about "enforce" QEMU check either and can just use host-passthrough and get whatever QEMU supports on the host. Jirka

On Fri, Nov 18, 2022 at 10:52:32AM +0100, Jiri Denemark wrote:
* Qemu already provide an option 'enforce' to validate if features with which vm is started is exactly same as one provided and nothing is silently dropped.
Right, but it's not enough. In addition to removed features libvirt also checks for unexpectedly added features. And you really need to do both. Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and QEMU says everything is fine, the guest might see more than what you asked. For example, if a feature is enabled only if a host supports it you may or may not get it without any complains from QEMU. But if you get it you really need to explicitly ask for it during migration, otherwise the feature can just silently disappear. Of course, this would be a really bad behavior from QEMU, but that does not mean it can't happen (I think SVM is a bit problematic in this way) and the whole point of libvirt's checks is to prevent this kind of issues.
Hi Jiri, I'm not following this very well. I think you're saying that qemu has had bugs previously where features get silently enabled, and it's libvirt's job/role to paper over those issues? Do you have some specific cases of this? thanks john

On Fri, Nov 18, 2022 at 10:18:59 +0000, John Levon wrote:
On Fri, Nov 18, 2022 at 10:52:32AM +0100, Jiri Denemark wrote:
* Qemu already provide an option 'enforce' to validate if features with which vm is started is exactly same as one provided and nothing is silently dropped.
Right, but it's not enough. In addition to removed features libvirt also checks for unexpectedly added features. And you really need to do both. Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and QEMU says everything is fine, the guest might see more than what you asked. For example, if a feature is enabled only if a host supports it you may or may not get it without any complains from QEMU. But if you get it you really need to explicitly ask for it during migration, otherwise the feature can just silently disappear. Of course, this would be a really bad behavior from QEMU, but that does not mean it can't happen (I think SVM is a bit problematic in this way) and the whole point of libvirt's checks is to prevent this kind of issues.
Hi Jiri, I'm not following this very well. I think you're saying that qemu has had bugs previously where features get silently enabled,
Personally, I haven't actually witnessed any bug in this area (as far as I can remember, which is not that far :-)), but got some reports of at least one, even though without any proof. Specifically, SVM is quite strange as it is included in all AMD CPU models in QEMU and yet if you try to start it on a host without nesting enabled "enforce" does not complain. I saw the feature is enabled with older machine types, but I was told the magic behind this feature looks like not only machine type but even host configuration itself is involved. Anyway, although I saw reports of that the feature could be enabled without an explicit request even with new machine types I still haven't seen any proof of this happening. So I hope it just does not happen and users are only afraid of this possibility.
and it's libvirt's job/role to paper over those issues? Do you have some specific cases of this?
This is not about papering. It's actually the opposite, that is about detecting if something like this happens and reporting it as a failure rather than papering it and hoping everything goes well. So I think doing this is a good idea even though I don't think we actually saw any issue of this kind. Jirka

On 18/11/22 5:00 pm, Jiri Denemark wrote:
On Fri, Nov 18, 2022 at 10:18:59 +0000, John Levon wrote:
On Fri, Nov 18, 2022 at 10:52:32AM +0100, Jiri Denemark wrote:
* Qemu already provide an option 'enforce' to validate if features with which vm is started is exactly same as one provided and nothing is silently dropped. Right, but it's not enough. In addition to removed features libvirt also checks for unexpectedly added features. And you really need to do both. Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and QEMU says everything is fine, the guest might see more than what you asked. For example, if a feature is enabled only if a host supports it you may or may not get it without any complains from QEMU. But if you get it you really need to explicitly ask for it during migration, otherwise the feature can just silently disappear. Of course, this would be a really bad behavior from QEMU, but that does not mean it can't happen (I think SVM is a bit problematic in this way) and the whole point of libvirt's checks is to prevent this kind of issues. Hi Jiri, I'm not following this very well. I think you're saying that qemu has had bugs previously where features get silently enabled, Personally, I haven't actually witnessed any bug in this area (as far as I can remember, which is not that far :-)), but got some reports of at least one, even though without any proof.
Specifically, SVM is quite strange as it is included in all AMD CPU models in QEMU and yet if you try to start it on a host without nesting enabled "enforce" does not complain. I saw the feature is enabled with older machine types, but I was told the magic behind this feature looks like not only machine type but even host configuration itself is involved. Anyway, although I saw reports of that the feature could be enabled without an explicit request even with new machine types I still haven't seen any proof of this happening. So I hope it just does not happen and users are only afraid of this possibility.
and it's libvirt's job/role to paper over those issues? Do you have some specific cases of this? This is not about papering. It's actually the opposite, that is about detecting if something like this happens and reporting it as a failure rather than papering it and hoping everything goes well. So I think doing this is a good idea even though I don't think we actually saw any issue of this kind.
Hi Jiri, I see now with libvirt master, with check=='full' we verify both silently dropped as well as added features. But as you already stated Qemu silently adding feature is a Qemu bug and libvirt just reports that bug, so it should be very unlikely, i agree that is not a good reasoning :). Our requirement is that we want to use CPU Models and features which are defined in Qemu but not in libvirt for e.g if we want to use Icelake-Server-V4 directly or newly added vmx feature, libvirt does not allow. Currently we take help of qemu to do validations but for cpu feature verfication and model definations we still use libvirt defined definations which prevent us to use anything which is not defined in libvirt. I see there are already efforts going on to get all model and feature defination from qemu itself, but not sure how much time it will take. Till that happens we thought safest option is to have an option to remove all validations from libvirt and rely on qemu 'enforce' for migration safetly. I understand qemu-enforce does not check for silently added features, but that case we can assume is very unlikely and Qemu should fix otherwise VMs will not poweron anyway with check=='full'. Basically we want it as an modification of check='none' but also skipping things like virCPUValidateFeatures and passing option 'enforce' to Qemu. Or if silently adding features is that big concern we can have a check in Qemu itself? I understand current qemu-enforce is not as migration safe as check=='full' but probably suitable for our use case for time being? Thanks Manish Mishra
Jirka

On Mon, Nov 21, 2022 at 12:47:58 +0530, manish.mishra wrote:
On 18/11/22 5:00 pm, Jiri Denemark wrote:
On Fri, Nov 18, 2022 at 10:18:59 +0000, John Levon wrote:
On Fri, Nov 18, 2022 at 10:52:32AM +0100, Jiri Denemark wrote:
* Qemu already provide an option 'enforce' to validate if features with which vm is started is exactly same as one provided and nothing is silently dropped. Right, but it's not enough. In addition to removed features libvirt also checks for unexpectedly added features. And you really need to do both. Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and QEMU says everything is fine, the guest might see more than what you asked. For example, if a feature is enabled only if a host supports it you may or may not get it without any complains from QEMU. But if you get it you really need to explicitly ask for it during migration, otherwise the feature can just silently disappear. Of course, this would be a really bad behavior from QEMU, but that does not mean it can't happen (I think SVM is a bit problematic in this way) and the whole point of libvirt's checks is to prevent this kind of issues. Hi Jiri, I'm not following this very well. I think you're saying that qemu has had bugs previously where features get silently enabled, Personally, I haven't actually witnessed any bug in this area (as far as I can remember, which is not that far :-)), but got some reports of at least one, even though without any proof.
Specifically, SVM is quite strange as it is included in all AMD CPU models in QEMU and yet if you try to start it on a host without nesting enabled "enforce" does not complain. I saw the feature is enabled with older machine types, but I was told the magic behind this feature looks like not only machine type but even host configuration itself is involved. Anyway, although I saw reports of that the feature could be enabled without an explicit request even with new machine types I still haven't seen any proof of this happening. So I hope it just does not happen and users are only afraid of this possibility.
and it's libvirt's job/role to paper over those issues? Do you have some specific cases of this? This is not about papering. It's actually the opposite, that is about detecting if something like this happens and reporting it as a failure rather than papering it and hoping everything goes well. So I think doing this is a good idea even though I don't think we actually saw any issue of this kind.
Hi Jiri, I see now with libvirt master, with check=='full' we verify both silently dropped as well as added features. But as you already stated Qemu silently adding feature is a Qemu bug and libvirt just reports that bug, so it should be very unlikely, i agree that is not a good reasoning :). Our requirement is that we want to use CPU Models and features which are defined in Qemu but not in libvirt for e.g if we want to use Icelake-Server-V4 directly or newly added vmx feature, libvirt does not allow. Currently we take help of qemu to do validations but for cpu feature verfication and model definations we still use libvirt defined definations which prevent us to use anything which is not defined in libvirt. I see there are already efforts going on to get all model and feature defination from qemu itself, but not sure how much time it will take. Till that happens we thought safest option is to have an option to remove all validations from libvirt and rely on qemu 'enforce' for migration safetly. I understand qemu-enforce does not check for silently added features, but that case we can assume is very unlikely and Qemu should fix otherwise VMs will not poweron anyway with check=='full'. Basically we want it as an modification of check='none' but also skipping things like virCPUValidateFeatures and passing option 'enforce' to Qemu. Or if silently adding features is that big concern we can have a check in Qemu itself? I understand current qemu-enforce is not as migration safe as check=='full' but probably suitable for our use case for time being?
I understand why you want this, but on the other hand adding just a plain pass-through for CPU model and features is quite dangerous as it can be used to set any -cpu option even if it's not a feature. And especially the parts that are configured in other areas of domain XML (such as hypervisor features). So I think instead of adding a new mode for <cpu> we should go another way. For things that are not yet supported by libvirt we support various elements in qemu namespace, e.g., setting QEMU command line options, environment variables, or even overriding options libvirt would normally set on the command line. So I guess we could have a special <qemu:cpu> element which would be used when a user wants full control over the -cpu option without any libvirt intervention. Jirka

On 22/11/22 6:18 pm, Jiri Denemark wrote:
On 18/11/22 5:00 pm, Jiri Denemark wrote:
On Fri, Nov 18, 2022 at 10:18:59 +0000, John Levon wrote:
On Fri, Nov 18, 2022 at 10:52:32AM +0100, Jiri Denemark wrote:
* Qemu already provide an option 'enforce' to validate if features with which vm is started is exactly same as one provided and nothing is silently dropped.
Right, but it's not enough. In addition to removed features libvirt also checks for unexpectedly added features. And you really need to do both. Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and QEMU says everything is fine, the guest might see more than what you asked. For example, if a feature is enabled only if a host supports it you may or may not get it without any complains from QEMU. But if you get it you really need to explicitly ask for it during migration, otherwise the feature can just silently disappear. Of course, this would be a really bad behavior from QEMU, but that does not mean it can't happen (I think SVM is a bit problematic in this way) and the whole point of libvirt's checks is to prevent this kind of issues. Hi Jiri, I'm not following this very well. I think you're saying that qemu has had bugs previously where features get silently enabled, Personally, I haven't actually witnessed any bug in this area (as far as I can remember, which is not that far :-)), but got some reports of at least one, even though without any proof.
Specifically, SVM is quite strange as it is included in all AMD CPU models in QEMU and yet if you try to start it on a host without nesting enabled "enforce" does not complain. I saw the feature is enabled with older machine types, but I was told the magic behind this feature looks like not only machine type but even host configuration itself is involved. Anyway, although I saw reports of that the feature could be enabled without an explicit request even with new machine types I still haven't seen any proof of this happening. So I hope it just does not happen and users are only afraid of this possibility.
and it's libvirt's job/role to paper over those issues? Do you have some specific cases of this? This is not about papering. It's actually the opposite, that is about detecting if something like this happens and reporting it as a failure rather than papering it and hoping everything goes well. So I think doing this is a good idea even though I don't think we actually saw any issue of this kind.
Hi Jiri, I see now with libvirt master, with check=='full' we verify both silently dropped as well as added features. But as you already stated Qemu silently adding feature is a Qemu bug and libvirt just reports that bug, so it should be very unlikely, i agree that is not a good reasoning :). Our requirement is that we want to use CPU Models and features which are defined in Qemu but not in libvirt for e.g if we want to use Icelake-Server-V4 directly or newly added vmx feature, libvirt does not allow. Currently we take help of qemu to do validations but for cpu feature verfication and model definations we still use libvirt defined definations which prevent us to use anything which is not defined in libvirt. I see there are already efforts going on to get all model and feature defination from qemu itself, but not sure how much time it will take. Till that happens we thought safest option is to have an option to remove all validations from libvirt and rely on qemu 'enforce' for migration safetly. I understand qemu-enforce does not check for silently added features, but that case we can assume is very unlikely and Qemu should fix otherwise VMs will not poweron anyway with check=='full'. Basically we want it as an modification of check='none' but also skipping things like virCPUValidateFeatures and passing option 'enforce' to Qemu. Or if silently adding features is that big concern we can have a check in Qemu itself? I understand current qemu-enforce is not as migration safe as check=='full' but probably suitable for our use case for time being? I understand why you want this, but on the other hand adding just a
On Mon, Nov 21, 2022 at 12:47:58 +0530, manish.mishra wrote: plain pass-through for CPU model and features is quite dangerous as it can be used to set any -cpu option even if it's not a feature. And especially the parts that are configured in other areas of domain XML (such as hypervisor features). So I think instead of adding a new mode for <cpu> we should go another way. For things that are not yet supported by libvirt we support various elements in qemu namespace, e.g., setting QEMU command line options, environment variables, or even overriding options libvirt would normally set on the command line. So I guess we could have a special <qemu:cpu> element which would be used when a user wants full control over the -cpu option without any libvirt intervention.
Jirka
Hi Jiri, Thanks, We discussed internally, yes something like that works for us. We wanted to dicusss how it will look like roughly before sending patch. Does something like below works. <qemu:cpu> will be a sub-property within <cpu> and it will contains details only about cpu-model, cpu-features and check everything else can be as it is. So for CPU feature and model either user can define those in <cpu> itself as in example 1 or in <qemu:cpu> as in example 2. Also either example 1 or example 2 are valid, if user mixes up both, i mean define cpu feature or models in both <qemu:cpu> and <cpu> it will be a parsing error to avoid conflicts. In feature policy we as of now need only require/disable for <qemu:cpu> case and no need of options like match, mode if <qemu:cpu> is defined. Please let us know if something like that works. It may vary a bit while working on patch, so probaly it will be more clear when patch is out, I just wanted to put our requirements roughly before starting working on patch :). Example 1: <cpu mode='custom' match='exact' check='partial'> <model fallback='forbid'>Icelake-Server</model> <vendor>Intel</vendor> <topology sockets='240' dies='1' cores='1' threads='1'/> <cache level='3' mode='emulate'/> <feature policy='require' name='smap'/> <feature policy='require' name='avx'/> <feature policy='require' name='xsaveopt'/> <feature policy='disable' name='tsc_adjust'/> <numa> <cell id='0' cpus='0-239' memory='8388608' unit='KiB' memAccess='shared'/> </numa> </cpu> Example 2: <cpu> <vendor>Intel</vendor> <topology sockets='240' dies='1' cores='1' threads='1'/> <cache level='3' mode='emulate'/> <qemu:cpu check='enfoce'> <model>Icelake-Server-V4</model> <feature policy='require' name='smap'/> <feature policy='require' name='avx'/> <feature policy='require' name='xsaveopt'/> <feature policy='disable' name='tsc_adjust'/> <qemu:cpu /> <numa> <cell id='0' cpus='0-239' memory='8388608' unit='KiB' memAccess='shared'/> </numa> </cpu> Thanks Manish Mishra

On Wed, Nov 23, 2022 at 13:40:21 +0530, manish.mishra wrote:
Hi Jiri, Thanks, We discussed internally, yes something like that works for us. We wanted to dicusss how it will look like roughly before sending patch. Does something like below works. <qemu:cpu> will be a sub-property within <cpu> and it will contains details only about cpu-model, cpu-features and check everything else can be as it is. So for CPU feature and model either user can define those in <cpu> itself as in example 1 or in <qemu:cpu> as in example 2. Also either example 1 or example 2 are valid, if user mixes up both, i mean define cpu feature or models in both <qemu:cpu> and <cpu> it will be a parsing error to avoid conflicts. In feature policy we as of now need only require/disable for <qemu:cpu> case and no need of options like match, mode if <qemu:cpu> is defined. Please let us know if something like that works. It may vary a bit while working on patch, so probaly it will be more clear when patch is out, I just wanted to put our requirements roughly before starting working on patch :).
<cpu> <vendor>Intel</vendor> <topology sockets='240' dies='1' cores='1' threads='1'/> <cache level='3' mode='emulate'/> <qemu:cpu check='enfoce'> <model>Icelake-Server-V4</model> <feature policy='require' name='smap'/> <feature policy='require' name='avx'/> <feature policy='require' name='xsaveopt'/> <feature policy='disable' name='tsc_adjust'/> <qemu:cpu /> <numa> <cell id='0' cpus='0-239' memory='8388608' unit='KiB' memAccess='shared'/> </numa> </cpu>
I think this would be quite hard to notice in the XML as it looks almost like a normal CPU definition and it would also complicate the <cpu> parser. We should keep this completely separated similarly to other elements in qemu namespace (as described in https://libvirt.org/drvqemu.html#pass-through-of-arbitrary-qemu-commands So for example, <domain type='qemu' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'> ... <cpu> <topology sockets='240' dies='1' cores='1' threads='1'/> <cache level='3' mode='emulate'/> <numa>...</numa> ... </cpu> <qemu:cpu> <qemu:model name='Icelake-Server-V4'/> <qemu:option name='enforce' value='on'/> <qemu:option name='smap' value='on'/> <qemu:option name='avx' value='on'/> <qemu:option name='tsc-adjust' value='off'/> <qemu:option name='strange-option' value='1234'/> </qemu:cpu> <devices> ... </devices> </domain> and this would directly translate to -cpu Icelake-Server-V4,enforce=on,smap=on,avx=on,tsc-adjust=off,strange-option=1234 In other words, make it a complete passthrough of the values to QEMU without any extra parsing or translating on libvirt side. This way you can express anything QEMU supports for the -cpu command line option and libvirt has not implemented yet. Defining a CPU model both in <cpu> and <qemu:cpu> would be an error indeed. But eware, even usage of hypervisor features or other XML configuration outside of <cpu> that are translated into -cpu options could be unsupported and overridden by the content of <qemu:cpu>. We have several options for dealing with this: 1. check all of this and report an error 2. document the new <qemu:cpu> element to completely override anything libvirt would pass to -cpu and it's up to the user to include anything they need there (such as hypervisor features) 3. use the content of <qemu:cpu> as a base for -cpu and add other staff there as usual with a proper documentation that mentioning a specific option both in <qemu:cpu> and elsewhere results in an undefined behavior Personally, I would vote for 2 or 3 and perhaps slightly more for 3 as it is a bit more user friendly in case the goal is to override the CPU model but keep using normal configuration for hypervisor features. And of course, using <qemu:cpu> element would result in the domain being tainted with a new VIR_DOMAIN_TAINT_CUSTOM_CPU. Jirka

On Tue, Nov 22, 2022 at 01:48:39PM +0100, Jiri Denemark wrote:
On Mon, Nov 21, 2022 at 12:47:58 +0530, manish.mishra wrote:
On 18/11/22 5:00 pm, Jiri Denemark wrote:
On Fri, Nov 18, 2022 at 10:18:59 +0000, John Levon wrote:
On Fri, Nov 18, 2022 at 10:52:32AM +0100, Jiri Denemark wrote:
* Qemu already provide an option 'enforce' to validate if features with which vm is started is exactly same as one provided and nothing is silently dropped. Right, but it's not enough. In addition to removed features libvirt also checks for unexpectedly added features. And you really need to do both. Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and QEMU says everything is fine, the guest might see more than what you asked. For example, if a feature is enabled only if a host supports it you may or may not get it without any complains from QEMU. But if you get it you really need to explicitly ask for it during migration, otherwise the feature can just silently disappear. Of course, this would be a really bad behavior from QEMU, but that does not mean it can't happen (I think SVM is a bit problematic in this way) and the whole point of libvirt's checks is to prevent this kind of issues. Hi Jiri, I'm not following this very well. I think you're saying that qemu has had bugs previously where features get silently enabled, Personally, I haven't actually witnessed any bug in this area (as far as I can remember, which is not that far :-)), but got some reports of at least one, even though without any proof.
Specifically, SVM is quite strange as it is included in all AMD CPU models in QEMU and yet if you try to start it on a host without nesting enabled "enforce" does not complain. I saw the feature is enabled with older machine types, but I was told the magic behind this feature looks like not only machine type but even host configuration itself is involved. Anyway, although I saw reports of that the feature could be enabled without an explicit request even with new machine types I still haven't seen any proof of this happening. So I hope it just does not happen and users are only afraid of this possibility.
and it's libvirt's job/role to paper over those issues? Do you have some specific cases of this? This is not about papering. It's actually the opposite, that is about detecting if something like this happens and reporting it as a failure rather than papering it and hoping everything goes well. So I think doing this is a good idea even though I don't think we actually saw any issue of this kind.
Hi Jiri, I see now with libvirt master, with check=='full' we verify both silently dropped as well as added features. But as you already stated Qemu silently adding feature is a Qemu bug and libvirt just reports that bug, so it should be very unlikely, i agree that is not a good reasoning :). Our requirement is that we want to use CPU Models and features which are defined in Qemu but not in libvirt for e.g if we want to use Icelake-Server-V4 directly or newly added vmx feature, libvirt does not allow. Currently we take help of qemu to do validations but for cpu feature verfication and model definations we still use libvirt defined definations which prevent us to use anything which is not defined in libvirt. I see there are already efforts going on to get all model and feature defination from qemu itself, but not sure how much time it will take. Till that happens we thought safest option is to have an option to remove all validations from libvirt and rely on qemu 'enforce' for migration safetly. I understand qemu-enforce does not check for silently added features, but that case we can assume is very unlikely and Qemu should fix otherwise VMs will not poweron anyway with check=='full'. Basically we want it as an modification of check='none' but also skipping things like virCPUValidateFeatures and passing option 'enforce' to Qemu. Or if silently adding features is that big concern we can have a check in Qemu itself? I understand current qemu-enforce is not as migration safe as check=='full' but probably suitable for our use case for time being?
I understand why you want this, but on the other hand adding just a plain pass-through for CPU model and features is quite dangerous as it can be used to set any -cpu option even if it's not a feature. And especially the parts that are configured in other areas of domain XML (such as hypervisor features). So I think instead of adding a new mode for <cpu> we should go another way. For things that are not yet supported by libvirt we support various elements in qemu namespace, e.g., setting QEMU command line options, environment variables, or even overriding options libvirt would normally set on the command line. So I guess we could have a special <qemu:cpu> element which would be used when a user wants full control over the -cpu option without any libvirt intervention.
I really don't think we should allow this, especially not for something which is clearly intended to be used for production deployment. Our hypervisor CLI passthrough is there to allow for short term workarounds for bugs, or to experiment with a feature before it is mapped into libvirt in a supported manner. If there are aspects related to QEMU configuration thiat are not in libvirt yet, we need to address those gaps, not provide yet another way to bypass libvirt mapping. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Wed, Nov 23, 2022 at 11:18:01 +0000, Daniel P. Berrangé wrote:
On Tue, Nov 22, 2022 at 01:48:39PM +0100, Jiri Denemark wrote:
On Mon, Nov 21, 2022 at 12:47:58 +0530, manish.mishra wrote:
On 18/11/22 5:00 pm, Jiri Denemark wrote:
On Fri, Nov 18, 2022 at 10:18:59 +0000, John Levon wrote:
On Fri, Nov 18, 2022 at 10:52:32AM +0100, Jiri Denemark wrote:
> * Qemu already provide an option 'enforce' to validate if features > with which vm is started is exactly same as one provided and nothing > is silently dropped. Right, but it's not enough. In addition to removed features libvirt also checks for unexpectedly added features. And you really need to do both. Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and QEMU says everything is fine, the guest might see more than what you asked. For example, if a feature is enabled only if a host supports it you may or may not get it without any complains from QEMU. But if you get it you really need to explicitly ask for it during migration, otherwise the feature can just silently disappear. Of course, this would be a really bad behavior from QEMU, but that does not mean it can't happen (I think SVM is a bit problematic in this way) and the whole point of libvirt's checks is to prevent this kind of issues. Hi Jiri, I'm not following this very well. I think you're saying that qemu has had bugs previously where features get silently enabled, Personally, I haven't actually witnessed any bug in this area (as far as I can remember, which is not that far :-)), but got some reports of at least one, even though without any proof.
Specifically, SVM is quite strange as it is included in all AMD CPU models in QEMU and yet if you try to start it on a host without nesting enabled "enforce" does not complain. I saw the feature is enabled with older machine types, but I was told the magic behind this feature looks like not only machine type but even host configuration itself is involved. Anyway, although I saw reports of that the feature could be enabled without an explicit request even with new machine types I still haven't seen any proof of this happening. So I hope it just does not happen and users are only afraid of this possibility.
and it's libvirt's job/role to paper over those issues? Do you have some specific cases of this? This is not about papering. It's actually the opposite, that is about detecting if something like this happens and reporting it as a failure rather than papering it and hoping everything goes well. So I think doing this is a good idea even though I don't think we actually saw any issue of this kind.
Hi Jiri, I see now with libvirt master, with check=='full' we verify both silently dropped as well as added features. But as you already stated Qemu silently adding feature is a Qemu bug and libvirt just reports that bug, so it should be very unlikely, i agree that is not a good reasoning :). Our requirement is that we want to use CPU Models and features which are defined in Qemu but not in libvirt for e.g if we want to use Icelake-Server-V4 directly or newly added vmx feature, libvirt does not allow. Currently we take help of qemu to do validations but for cpu feature verfication and model definations we still use libvirt defined definations which prevent us to use anything which is not defined in libvirt. I see there are already efforts going on to get all model and feature defination from qemu itself, but not sure how much time it will take. Till that happens we thought safest option is to have an option to remove all validations from libvirt and rely on qemu 'enforce' for migration safetly. I understand qemu-enforce does not check for silently added features, but that case we can assume is very unlikely and Qemu should fix otherwise VMs will not poweron anyway with check=='full'. Basically we want it as an modification of check='none' but also skipping things like virCPUValidateFeatures and passing option 'enforce' to Qemu. Or if silently adding features is that big concern we can have a check in Qemu itself? I understand current qemu-enforce is not as migration safe as check=='full' but probably suitable for our use case for time being?
I understand why you want this, but on the other hand adding just a plain pass-through for CPU model and features is quite dangerous as it can be used to set any -cpu option even if it's not a feature. And especially the parts that are configured in other areas of domain XML (such as hypervisor features). So I think instead of adding a new mode for <cpu> we should go another way. For things that are not yet supported by libvirt we support various elements in qemu namespace, e.g., setting QEMU command line options, environment variables, or even overriding options libvirt would normally set on the command line. So I guess we could have a special <qemu:cpu> element which would be used when a user wants full control over the -cpu option without any libvirt intervention.
I really don't think we should allow this, especially not for something which is clearly intended to be used for production deployment. Our hypervisor CLI passthrough is there to allow for short term workarounds for bugs, or to experiment with a feature before it is mapped into libvirt in a supported manner.
If there are aspects related to QEMU configuration thiat are not in libvirt yet, we need to address those gaps, not provide yet another way to bypass libvirt mapping.
Indeed, this was definitely meant as a short term workaround for stuff we don't have support for yet, for testing or experimental purposes. The supported approach is for implement the missing parts in libvirt (and QEMU) as soon as possible. I don't see why would a properly documented support for experiments with -cpu would be an issue. If anyone thinks it's a good idea to use such stuff for anything even remotely close to production then that's their call. We can't really prevent that. Just as with any other <qemu:...> stuff we already have. Jirka

On Wed, Nov 23, 2022 at 12:29:24PM +0100, Jiri Denemark wrote:
I really don't think we should allow this, especially not for something which is clearly intended to be used for production deployment. Our hypervisor CLI passthrough is there to allow for short term workarounds for bugs, or to experiment with a feature before it is mapped into libvirt in a supported manner.
If there are aspects related to QEMU configuration thiat are not in libvirt yet, we need to address those gaps, not provide yet another way to bypass libvirt mapping.
Indeed, this was definitely meant as a short term workaround for stuff we don't have support for yet, for testing or experimental purposes. The supported approach is for implement the missing parts in libvirt (and QEMU) as soon as possible. I don't see why would a properly documented support for experiments with -cpu would be an issue.
If anyone thinks it's a good idea to use such stuff for anything even remotely close to production then that's their call. We can't really prevent that. Just as with any other <qemu:...> stuff we already have.
Having two copies of essentially the same information that must be kept perfectly in sync has been the direct cause of multiple serious bugs for us (some self-inflicted, probably, but certainly not all). I understand that "fix libvirt every time there's a disparity" is appealing but unfortunately in practice this just hasn't been the case. We're struggling to see what value the libvirt layer is bringing us in this area, as opposed to a single-source-of-truth approach. regards john

On Wed, Nov 23, 2022 at 11:34:24AM +0000, John Levon wrote:
On Wed, Nov 23, 2022 at 12:29:24PM +0100, Jiri Denemark wrote:
I really don't think we should allow this, especially not for something which is clearly intended to be used for production deployment. Our hypervisor CLI passthrough is there to allow for short term workarounds for bugs, or to experiment with a feature before it is mapped into libvirt in a supported manner.
If there are aspects related to QEMU configuration thiat are not in libvirt yet, we need to address those gaps, not provide yet another way to bypass libvirt mapping.
Indeed, this was definitely meant as a short term workaround for stuff we don't have support for yet, for testing or experimental purposes. The supported approach is for implement the missing parts in libvirt (and QEMU) as soon as possible. I don't see why would a properly documented support for experiments with -cpu would be an issue.
If anyone thinks it's a good idea to use such stuff for anything even remotely close to production then that's their call. We can't really prevent that. Just as with any other <qemu:...> stuff we already have.
Having two copies of essentially the same information that must be kept perfectly in sync has been the direct cause of multiple serious bugs for us (some self-inflicted, probably, but certainly not all). I understand that "fix libvirt every time there's a disparity" is appealing but unfortunately in practice this just hasn't been the case.
We're struggling to see what value the libvirt layer is bringing us in this area, as opposed to a single-source-of-truth approach.
Well libvirt's job is to insulate applications from the specific hypervisor implementation details. We support multiple hypervisors, and as such it is a goal that CPU feature names at least are able to be made consistent across hypervisor drivers in libvirt, but ideally CPU model names too that is more challenging to achieve, depending on what config flexibility the hypervisor offers. Still, directly exposing the hypervisor's specific names is an explicit anti-goal. Another factor is that QEMU configuration syntax changes over time, and libvirt aims to transparently adapt to such changes, that includes potential renames of flags/models. There is always a chance of bugs when mapping from the hypervisor to libvirt, but that's not a reason to abandon the core goals of libvirt to isolate apps from the raw hypervisor impl details. Our focus needs to be on reducing the chance of such bugs in libvirt and reducing any time lag between features. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Wed, Nov 23, 2022 at 11:48:17AM +0000, Daniel P. Berrangé wrote:
We're struggling to see what value the libvirt layer is bringing us in this area, as opposed to a single-source-of-truth approach.
Well libvirt's job is to insulate applications from the specific hypervisor implementation details.
Sure, of course, this isn't relevant to other drivers or something that would be turned on from (say) a virt-manager context. We're not trying to argue this should be a default or anything, but this would hardly be the first thing that's only supported by the qemu driver. regards john

On Wed, Nov 23, 2022 at 11:57:08AM +0000, John Levon wrote:
On Wed, Nov 23, 2022 at 11:48:17AM +0000, Daniel P. Berrangé wrote:
We're struggling to see what value the libvirt layer is bringing us in this area, as opposed to a single-source-of-truth approach.
Well libvirt's job is to insulate applications from the specific hypervisor implementation details.
Sure, of course, this isn't relevant to other drivers or something that would be turned on from (say) a virt-manager context. We're not trying to argue this should be a default or anything, but this would hardly be the first thing that's only supported by the qemu driver.
There's a big difference between a feature only supported by one of the hypervisor drivers, and a feature where we just blindly do passthrough of the hypervisor terminology as-is. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Wed, Nov 23, 2022 at 11:34:24 +0000, John Levon wrote:
On Wed, Nov 23, 2022 at 12:29:24PM +0100, Jiri Denemark wrote:
I really don't think we should allow this, especially not for something which is clearly intended to be used for production deployment. Our hypervisor CLI passthrough is there to allow for short term workarounds for bugs, or to experiment with a feature before it is mapped into libvirt in a supported manner.
If there are aspects related to QEMU configuration thiat are not in libvirt yet, we need to address those gaps, not provide yet another way to bypass libvirt mapping.
Indeed, this was definitely meant as a short term workaround for stuff we don't have support for yet, for testing or experimental purposes. The supported approach is for implement the missing parts in libvirt (and QEMU) as soon as possible. I don't see why would a properly documented support for experiments with -cpu would be an issue.
If anyone thinks it's a good idea to use such stuff for anything even remotely close to production then that's their call. We can't really prevent that. Just as with any other <qemu:...> stuff we already have.
Having two copies of essentially the same information that must be kept perfectly in sync has been the direct cause of multiple serious bugs for us (some self-inflicted, probably, but certainly not all). I understand that "fix libvirt every time there's a disparity" is appealing but unfortunately in practice this just hasn't been the case.
Well, but that's not how it is all supposed to work. First, libvirt and QEMU do not strictly need to be in sync with everything. The important part is that features are in sync (pretty much trivial and easy, it's just that someone needs to be motivated to send a patch). Libvirt needs to understand the features to be able to report host capabilities. Not to mention that QEMU changed names of several features and even deprecated the old spellings which is completely transparent to libvirt users as they can still use the old names no matter what version of QEMU they use. This is one of the key benefits of libvirt. CPU models need to be added to libvirt too, but their definition does not have to be in sync with QEMU. And as far as I can tell the goal is to dynamically probe the models and their details from QEMU so that we don't need to keep our own copy of CPU models. This approach should be possible (and it's actually a requirement due to the increased number of supported CPU models) with versioned CPUs, but we need some more support for this probing in QEMU. Anyway, with the currently suggested use case we should have a single source of truth and that is QEMU. In domain capabilities XML we report what CPU models QEMU can run on a given host. You can take any of these models and use them with <cpu check='none' ...> to start a domain with that CPU models whatever that means to QEMU. Libvirt then translates the created CPU to its internal (so far) definition which is stable across all versions of libvirt to make sure any version of libvirt can understand it (during migration, save/restore, snapshots, upgrade) and can ensure the CPU does not change in any way until the domain shuts down. In other words, libvirt code should be in a state when differences between QEMU and libvirt in CPU model definitions should not matter. Modulo bugs, of course, but such bugs should be reported, analyzed and fixed similarly to any other bugs in any software.
We're struggling to see what value the libvirt layer is bringing us in this area, as opposed to a single-source-of-truth approach.
The ABI stability and compatibility with different versions of QEMU and libvirt. We definitely want to minimize duplicity and use QEMU as much as possible to implement our APIs and semantics. For example, the all of the CPU model related functionality is offloaded to QEMU on s390. But x86_64 lacks the required parts no to mention that the CPU modelling there is much more complicated. Jirka

On Wed, Nov 23, 2022 at 02:12:59PM +0100, Jiri Denemark wrote:
Not to mention that QEMU changed names of several features and even deprecated the old spellings which is completely transparent to libvirt users as they can still use the old names no matter what version of QEMU they use. This is one of the key benefits of libvirt.
It's certainly very unfortunate that qemu did that (and we are still paying the price for the arch-facilities/arch-capabilities thing). If I'm reading your email right, you're saying that libvirt is (trying to?) move in a direction where instead of having say src/cpu_map/x86_Icelake-Client.xml, instead there is a very minimal layer that knows that some qemu feature got renamed, and is effectively just a filter "papering over" such qemu issues. And in particular it's no longer the case that every single cpu model, cpu model version, and cpu feature needs commits to both qemu and libvirt, and they must be in sync. I'm presuming this will also mean plumbing through qemu model versions too, as that's an essential part I think. Do I have this right at all? regards john

On Wed, Nov 23, 2022 at 01:35:42PM +0000, John Levon wrote:
On Wed, Nov 23, 2022 at 02:12:59PM +0100, Jiri Denemark wrote:
Not to mention that QEMU changed names of several features and even deprecated the old spellings which is completely transparent to libvirt users as they can still use the old names no matter what version of QEMU they use. This is one of the key benefits of libvirt.
It's certainly very unfortunate that qemu did that (and we are still paying the price for the arch-facilities/arch-capabilities thing).
Note, it depends on your POV of 'unfortunate'. Of course any time QEMU changes something, it has an impact on things using QEMU, but libvirt is here precisely to give isolation from those changes. Indeed, the very fact that libvirt exists and is widely used, is what allows QEMU to deprecated and rename stuff on a pretty aggressive schedule. If libvirt wasn't providing this isolation, then it is unlikely QEMU would have adopted its 2 release deprecated cycle, and would be locked into their mistakes for a much longer period. IOW, libvirt is allowing QEMU to fix their own technical debt on a more aggressive timeframe, albeit at some cost to libvirt maintainers.
And in particular it's no longer the case that every single cpu model, cpu model version, and cpu feature needs commits to both qemu and libvirt, and they must be in sync.
I would *always* expect there to be libvirt work to define the CPU feature names, and possibly CPU model names. IIUC, we don't need to copy over the entire CPU model expansion though, since we can query that direct from QEMU given the model name. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Wed, Nov 23, 2022 at 01:45:37PM +0000, Daniel P. Berrangé wrote:
Indeed, the very fact that libvirt exists and is widely used, is what allows QEMU to deprecated and rename stuff on a pretty aggressive schedule. If libvirt wasn't providing this isolation, then it is unlikely QEMU would have adopted its 2 release deprecated cycle, and would be locked into their mistakes for a much longer period.
My understanding was that qemu no longer change existing CPU model definitions, but instead introduce new versions (like IcelakeServer-V6), and that's how they can make changes as needed. Or are they still arbitrarily changing/breaking *existing* definitions in incompatible ways?
IOW, libvirt is allowing QEMU to fix their own technical debt on a more aggressive timeframe, albeit at some cost to libvirt maintainers.
I'd be interested in one or two specific examples so I can understand this properly. thanks john

On Wed, Nov 23, 2022 at 01:54:50PM +0000, John Levon wrote:
On Wed, Nov 23, 2022 at 01:45:37PM +0000, Daniel P. Berrangé wrote:
Indeed, the very fact that libvirt exists and is widely used, is what allows QEMU to deprecated and rename stuff on a pretty aggressive schedule. If libvirt wasn't providing this isolation, then it is unlikely QEMU would have adopted its 2 release deprecated cycle, and would be locked into their mistakes for a much longer period.
My understanding was that qemu no longer change existing CPU model definitions, but instead introduce new versions (like IcelakeServer-V6), and that's how they can make changes as needed. Or are they still arbitrarily changing/breaking *existing* definitions in incompatible ways?
The existing CPU model versions should not be getting changed by machine types any more. It hsould be purely additive with new versions being introduced, at least for x86. Other arches handle CPUs differently in QEMU :-(
IOW, libvirt is allowing QEMU to fix their own technical debt on a more aggressive timeframe, albeit at some cost to libvirt maintainers.
I'd be interested in one or two specific examples so I can understand this properly.
The poster child example of QEMU evolving is the block layer which has many syntax versions * -hda / -sda / etc * -drive if=ide,..... * -drive if=none,... -device virtio-blk * -blockdev driver=... -device virtio-blk * -blockdev json:{...} -device virtio-blk though at the same time its a bad example, as QEMU has failed to actually get around to deleting the old syntaxes so far. There is a quite a long list of things where QEMU has actually remove the old syntax though: https://www.qemu.org/docs/master/about/removed-features.html and libvirt has adapted to a great many of those. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On Wed, Nov 23, 2022 at 01:59:54PM +0000, Daniel P. Berrangé wrote:
The poster child example of QEMU evolving is the block layer which has many syntax versions
Sure, those sorts of things have indeed changed an awful lot, and libvirt obviously brings value in that interface. But that feels IMHO quite different from the CPU model/feature stuff. For one thing, it's unavoidable if you're going to support multiple qemu versions.
https://www.qemu.org/docs/master/about/removed-features.html
The wholesale removal of Icelake-Client is the only salient example listed here, right? Given this is in libvirt too, how did libvirt help with this? regards john

On Wed, Nov 23, 2022 at 13:35:42 +0000, John Levon wrote:
On Wed, Nov 23, 2022 at 02:12:59PM +0100, Jiri Denemark wrote:
Not to mention that QEMU changed names of several features and even deprecated the old spellings which is completely transparent to libvirt users as they can still use the old names no matter what version of QEMU they use. This is one of the key benefits of libvirt.
It's certainly very unfortunate that qemu did that (and we are still paying the price for the arch-facilities/arch-capabilities thing). If I'm reading your email right, you're saying that libvirt is (trying to?) move in a direction where instead of having say src/cpu_map/x86_Icelake-Client.xml
That's mostly where I hope we can get. I'd love to see CPU models being probed dynamically from QEMU. And especially the details of each CPU models, i.e., what features it's going to enable. Well, for proper host CPU detection we will still most likely need to have a mapping between CPU signatures and models, but this is unrelated to the issue we're discussing in the thread.
instead there is a very minimal layer that knows that some qemu feature got renamed, and is effectively just a filter "papering over" such qemu issues.
This depends on what you mean by a very minimal layer :-) See below...
And in particular it's no longer the case that every single cpu model, cpu model version (needs commits to both qemu and libvirt)
Yes, I hope so.
and cpu feature needs commits to both qemu and libvirt
No, libvirt will still need each CPU feature to be added explicitly. But this is pretty much a non issue. Defining a new feature is fast and simple (well, unless Intel invents yet another way of advertising supported features) and the features never change.
and they must be in sync.
Features need to be in sync, but CPU models do not (and of course, there's even nothing to sync if we probe the definition in runtime).
I'm presuming this will also mean plumbing through qemu model versions too, as that's an essential part I think.
Right. Jirka

On Wed, Nov 23, 2022 at 12:29:24PM +0100, Jiri Denemark wrote:
On Wed, Nov 23, 2022 at 11:18:01 +0000, Daniel P. Berrangé wrote:
On Tue, Nov 22, 2022 at 01:48:39PM +0100, Jiri Denemark wrote:
On Mon, Nov 21, 2022 at 12:47:58 +0530, manish.mishra wrote:
On 18/11/22 5:00 pm, Jiri Denemark wrote:
On Fri, Nov 18, 2022 at 10:18:59 +0000, John Levon wrote:
On Fri, Nov 18, 2022 at 10:52:32AM +0100, Jiri Denemark wrote:
>> * Qemu already provide an option 'enforce' to validate if features >> with which vm is started is exactly same as one provided and nothing >> is silently dropped. > Right, but it's not enough. In addition to removed features libvirt also > checks for unexpectedly added features. And you really need to do both. > Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and QEMU > says everything is fine, the guest might see more than what you asked. > For example, if a feature is enabled only if a host supports it you may > or may not get it without any complains from QEMU. But if you get it you > really need to explicitly ask for it during migration, otherwise the > feature can just silently disappear. Of course, this would be a really > bad behavior from QEMU, but that does not mean it can't happen (I think > SVM is a bit problematic in this way) and the whole point of libvirt's > checks is to prevent this kind of issues. Hi Jiri, I'm not following this very well. I think you're saying that qemu has had bugs previously where features get silently enabled, Personally, I haven't actually witnessed any bug in this area (as far as I can remember, which is not that far :-)), but got some reports of at least one, even though without any proof.
Specifically, SVM is quite strange as it is included in all AMD CPU models in QEMU and yet if you try to start it on a host without nesting enabled "enforce" does not complain. I saw the feature is enabled with older machine types, but I was told the magic behind this feature looks like not only machine type but even host configuration itself is involved. Anyway, although I saw reports of that the feature could be enabled without an explicit request even with new machine types I still haven't seen any proof of this happening. So I hope it just does not happen and users are only afraid of this possibility.
and it's libvirt's job/role to paper over those issues? Do you have some specific cases of this? This is not about papering. It's actually the opposite, that is about detecting if something like this happens and reporting it as a failure rather than papering it and hoping everything goes well. So I think doing this is a good idea even though I don't think we actually saw any issue of this kind.
Hi Jiri, I see now with libvirt master, with check=='full' we verify both silently dropped as well as added features. But as you already stated Qemu silently adding feature is a Qemu bug and libvirt just reports that bug, so it should be very unlikely, i agree that is not a good reasoning :). Our requirement is that we want to use CPU Models and features which are defined in Qemu but not in libvirt for e.g if we want to use Icelake-Server-V4 directly or newly added vmx feature, libvirt does not allow. Currently we take help of qemu to do validations but for cpu feature verfication and model definations we still use libvirt defined definations which prevent us to use anything which is not defined in libvirt. I see there are already efforts going on to get all model and feature defination from qemu itself, but not sure how much time it will take. Till that happens we thought safest option is to have an option to remove all validations from libvirt and rely on qemu 'enforce' for migration safetly. I understand qemu-enforce does not check for silently added features, but that case we can assume is very unlikely and Qemu should fix otherwise VMs will not poweron anyway with check=='full'. Basically we want it as an modification of check='none' but also skipping things like virCPUValidateFeatures and passing option 'enforce' to Qemu. Or if silently adding features is that big concern we can have a check in Qemu itself? I understand current qemu-enforce is not as migration safe as check=='full' but probably suitable for our use case for time being?
I understand why you want this, but on the other hand adding just a plain pass-through for CPU model and features is quite dangerous as it can be used to set any -cpu option even if it's not a feature. And especially the parts that are configured in other areas of domain XML (such as hypervisor features). So I think instead of adding a new mode for <cpu> we should go another way. For things that are not yet supported by libvirt we support various elements in qemu namespace, e.g., setting QEMU command line options, environment variables, or even overriding options libvirt would normally set on the command line. So I guess we could have a special <qemu:cpu> element which would be used when a user wants full control over the -cpu option without any libvirt intervention.
I really don't think we should allow this, especially not for something which is clearly intended to be used for production deployment. Our hypervisor CLI passthrough is there to allow for short term workarounds for bugs, or to experiment with a feature before it is mapped into libvirt in a supported manner.
If there are aspects related to QEMU configuration thiat are not in libvirt yet, we need to address those gaps, not provide yet another way to bypass libvirt mapping.
Indeed, this was definitely meant as a short term workaround for stuff we don't have support for yet, for testing or experimental purposes. The supported approach is for implement the missing parts in libvirt (and QEMU) as soon as possible. I don't see why would a properly documented support for experiments with -cpu would be an issue.
Why can't it just use the exsting QEMU passthrough syntax we have. I don't think we should be adding specifial support just for CPUs With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 23/11/22 5:11 pm, Daniel P. Berrangé wrote:
On Wed, Nov 23, 2022 at 12:29:24PM +0100, Jiri Denemark wrote:
On Wed, Nov 23, 2022 at 11:18:01 +0000, Daniel P. Berrangé wrote:
On 18/11/22 5:00 pm, Jiri Denemark wrote:
On Fri, Nov 18, 2022 at 10:18:59 +0000, John Levon wrote: > On Fri, Nov 18, 2022 at 10:52:32AM +0100, Jiri Denemark wrote: > >>> * Qemu already provide an option 'enforce' to validate if features >>> with which vm is started is exactly same as one provided and nothing >>> is silently dropped. >> Right, but it's not enough. In addition to removed features libvirt also >> checks for unexpectedly added features. And you really need to do both. >> Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and QEMU >> says everything is fine, the guest might see more than what you asked. >> For example, if a feature is enabled only if a host supports it you may >> or may not get it without any complains from QEMU. But if you get it you >> really need to explicitly ask for it during migration, otherwise the >> feature can just silently disappear. Of course, this would be a really >> bad behavior from QEMU, but that does not mean it can't happen (I think >> SVM is a bit problematic in this way) and the whole point of libvirt's >> checks is to prevent this kind of issues. > Hi Jiri, I'm not following this very well. I think you're saying that qemu has > had bugs previously where features get silently enabled, Personally, I haven't actually witnessed any bug in this area (as far as I can remember, which is not that far :-)), but got some reports of at least one, even though without any proof.
Specifically, SVM is quite strange as it is included in all AMD CPU models in QEMU and yet if you try to start it on a host without nesting enabled "enforce" does not complain. I saw the feature is enabled with older machine types, but I was told the magic behind this feature looks like not only machine type but even host configuration itself is involved. Anyway, although I saw reports of that the feature could be enabled without an explicit request even with new machine types I still haven't seen any proof of this happening. So I hope it just does not happen and users are only afraid of this possibility.
> and it's libvirt's job/role to paper over those issues? Do you have > some specific cases of this? This is not about papering. It's actually the opposite, that is about detecting if something like this happens and reporting it as a failure rather than papering it and hoping everything goes well. So I think doing this is a good idea even though I don't think we actually saw any issue of this kind.
Hi Jiri, I see now with libvirt master, with check=='full' we verify both silently dropped as well as added features. But as you already stated Qemu silently adding feature is a Qemu bug and libvirt just reports that bug, so it should be very unlikely, i agree that is not a good reasoning :). Our requirement is that we want to use CPU Models and features which are defined in Qemu but not in libvirt for e.g if we want to use Icelake-Server-V4 directly or newly added vmx feature, libvirt does not allow. Currently we take help of qemu to do validations but for cpu feature verfication and model definations we still use libvirt defined definations which prevent us to use anything which is not defined in libvirt. I see there are already efforts going on to get all model and feature defination from qemu itself, but not sure how much time it will take. Till that happens we thought safest option is to have an option to remove all validations from libvirt and rely on qemu 'enforce' for migration safetly. I understand qemu-enforce does not check for silently added features, but that case we can assume is very unlikely and Qemu should fix otherwise VMs will not poweron anyway with check=='full'. Basically we want it as an modification of check='none' but also skipping things like virCPUValidateFeatures and passing option 'enforce' to Qemu. Or if silently adding features is that big concern we can have a check in Qemu itself? I understand current qemu-enforce is not as migration safe as check=='full' but probably suitable for our use case for time being? I understand why you want this, but on the other hand adding just a
On Mon, Nov 21, 2022 at 12:47:58 +0530, manish.mishra wrote: plain pass-through for CPU model and features is quite dangerous as it can be used to set any -cpu option even if it's not a feature. And especially the parts that are configured in other areas of domain XML (such as hypervisor features). So I think instead of adding a new mode for <cpu> we should go another way. For things that are not yet supported by libvirt we support various elements in qemu namespace, e.g., setting QEMU command line options, environment variables, or even overriding options libvirt would normally set on the command line. So I guess we could have a special <qemu:cpu> element which would be used when a user wants full control over the -cpu option without any libvirt intervention. I really don't think we should allow this, especially not for something which is clearly intended to be used for production deployment. Our hypervisor CLI passthrough is there to allow for short term workarounds for bugs, or to experiment with a feature before it is mapped into
On Tue, Nov 22, 2022 at 01:48:39PM +0100, Jiri Denemark wrote: libvirt in a supported manner.
If there are aspects related to QEMU configuration thiat are not in libvirt yet, we need to address those gaps, not provide yet another way to bypass libvirt mapping. Indeed, this was definitely meant as a short term workaround for stuff we don't have support for yet, for testing or experimental purposes. The supported approach is for implement the missing parts in libvirt (and QEMU) as soon as possible. I don't see why would a properly documented support for experiments with -cpu would be an issue. Why can't it just use the exsting QEMU passthrough syntax we have. I don't think we should be adding specifial support just for CPUs
With regards, Daniel
Daniel, We actually want migratibility of VM too, i mean best set of migratable of features supported by both source and destination Qemu. Sometimes important cpu model/version or features take a lot of time to be updated in libvirt, also other class of issues like what John mentioned multiple source of truths is very common. Thanks Manish Mishra

On Wed, Nov 23, 2022 at 11:41:03 +0000, Daniel P. Berrangé wrote:
On Wed, Nov 23, 2022 at 12:29:24PM +0100, Jiri Denemark wrote:
On Wed, Nov 23, 2022 at 11:18:01 +0000, Daniel P. Berrangé wrote:
On Tue, Nov 22, 2022 at 01:48:39PM +0100, Jiri Denemark wrote:
On Mon, Nov 21, 2022 at 12:47:58 +0530, manish.mishra wrote:
On 18/11/22 5:00 pm, Jiri Denemark wrote:
On Fri, Nov 18, 2022 at 10:18:59 +0000, John Levon wrote: > On Fri, Nov 18, 2022 at 10:52:32AM +0100, Jiri Denemark wrote: > >>> * Qemu already provide an option 'enforce' to validate if features >>> with which vm is started is exactly same as one provided and nothing >>> is silently dropped. >> Right, but it's not enough. In addition to removed features libvirt also >> checks for unexpectedly added features. And you really need to do both. >> Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and QEMU >> says everything is fine, the guest might see more than what you asked. >> For example, if a feature is enabled only if a host supports it you may >> or may not get it without any complains from QEMU. But if you get it you >> really need to explicitly ask for it during migration, otherwise the >> feature can just silently disappear. Of course, this would be a really >> bad behavior from QEMU, but that does not mean it can't happen (I think >> SVM is a bit problematic in this way) and the whole point of libvirt's >> checks is to prevent this kind of issues. > Hi Jiri, I'm not following this very well. I think you're saying that qemu has > had bugs previously where features get silently enabled, Personally, I haven't actually witnessed any bug in this area (as far as I can remember, which is not that far :-)), but got some reports of at least one, even though without any proof.
Specifically, SVM is quite strange as it is included in all AMD CPU models in QEMU and yet if you try to start it on a host without nesting enabled "enforce" does not complain. I saw the feature is enabled with older machine types, but I was told the magic behind this feature looks like not only machine type but even host configuration itself is involved. Anyway, although I saw reports of that the feature could be enabled without an explicit request even with new machine types I still haven't seen any proof of this happening. So I hope it just does not happen and users are only afraid of this possibility.
> and it's libvirt's job/role to paper over those issues? Do you have > some specific cases of this? This is not about papering. It's actually the opposite, that is about detecting if something like this happens and reporting it as a failure rather than papering it and hoping everything goes well. So I think doing this is a good idea even though I don't think we actually saw any issue of this kind.
Hi Jiri, I see now with libvirt master, with check=='full' we verify both silently dropped as well as added features. But as you already stated Qemu silently adding feature is a Qemu bug and libvirt just reports that bug, so it should be very unlikely, i agree that is not a good reasoning :). Our requirement is that we want to use CPU Models and features which are defined in Qemu but not in libvirt for e.g if we want to use Icelake-Server-V4 directly or newly added vmx feature, libvirt does not allow. Currently we take help of qemu to do validations but for cpu feature verfication and model definations we still use libvirt defined definations which prevent us to use anything which is not defined in libvirt. I see there are already efforts going on to get all model and feature defination from qemu itself, but not sure how much time it will take. Till that happens we thought safest option is to have an option to remove all validations from libvirt and rely on qemu 'enforce' for migration safetly. I understand qemu-enforce does not check for silently added features, but that case we can assume is very unlikely and Qemu should fix otherwise VMs will not poweron anyway with check=='full'. Basically we want it as an modification of check='none' but also skipping things like virCPUValidateFeatures and passing option 'enforce' to Qemu. Or if silently adding features is that big concern we can have a check in Qemu itself? I understand current qemu-enforce is not as migration safe as check=='full' but probably suitable for our use case for time being?
I understand why you want this, but on the other hand adding just a plain pass-through for CPU model and features is quite dangerous as it can be used to set any -cpu option even if it's not a feature. And especially the parts that are configured in other areas of domain XML (such as hypervisor features). So I think instead of adding a new mode for <cpu> we should go another way. For things that are not yet supported by libvirt we support various elements in qemu namespace, e.g., setting QEMU command line options, environment variables, or even overriding options libvirt would normally set on the command line. So I guess we could have a special <qemu:cpu> element which would be used when a user wants full control over the -cpu option without any libvirt intervention.
I really don't think we should allow this, especially not for something which is clearly intended to be used for production deployment. Our hypervisor CLI passthrough is there to allow for short term workarounds for bugs, or to experiment with a feature before it is mapped into libvirt in a supported manner.
If there are aspects related to QEMU configuration thiat are not in libvirt yet, we need to address those gaps, not provide yet another way to bypass libvirt mapping.
Indeed, this was definitely meant as a short term workaround for stuff we don't have support for yet, for testing or experimental purposes. The supported approach is for implement the missing parts in libvirt (and QEMU) as soon as possible. I don't see why would a properly documented support for experiments with -cpu would be an issue.
Why can't it just use the exsting QEMU passthrough syntax we have. I don't think we should be adding specifial support just for CPUs
That would be nice, but the QEMU passthrough syntax cannot be used for changing options that libvirt already passes to QEMU. So using it would likely result in two separate -cpu options on QEMU command line. And it would not rule out the CPU verification code in libvirt. Remember, we add a default -cpu model in case there's none configured in the XML. Jirka

On Wed, Nov 23, 2022 at 12:51:39PM +0100, Jiri Denemark wrote:
On Wed, Nov 23, 2022 at 11:41:03 +0000, Daniel P. Berrangé wrote:
On Wed, Nov 23, 2022 at 12:29:24PM +0100, Jiri Denemark wrote:
On Wed, Nov 23, 2022 at 11:18:01 +0000, Daniel P. Berrangé wrote:
On Tue, Nov 22, 2022 at 01:48:39PM +0100, Jiri Denemark wrote:
On Mon, Nov 21, 2022 at 12:47:58 +0530, manish.mishra wrote:
On 18/11/22 5:00 pm, Jiri Denemark wrote: > On Fri, Nov 18, 2022 at 10:18:59 +0000, John Levon wrote: >> On Fri, Nov 18, 2022 at 10:52:32AM +0100, Jiri Denemark wrote: >> >>>> * Qemu already provide an option 'enforce' to validate if features >>>> with which vm is started is exactly same as one provided and nothing >>>> is silently dropped. >>> Right, but it's not enough. In addition to removed features libvirt also >>> checks for unexpectedly added features. And you really need to do both. >>> Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and QEMU >>> says everything is fine, the guest might see more than what you asked. >>> For example, if a feature is enabled only if a host supports it you may >>> or may not get it without any complains from QEMU. But if you get it you >>> really need to explicitly ask for it during migration, otherwise the >>> feature can just silently disappear. Of course, this would be a really >>> bad behavior from QEMU, but that does not mean it can't happen (I think >>> SVM is a bit problematic in this way) and the whole point of libvirt's >>> checks is to prevent this kind of issues. >> Hi Jiri, I'm not following this very well. I think you're saying that qemu has >> had bugs previously where features get silently enabled, > Personally, I haven't actually witnessed any bug in this area (as far as > I can remember, which is not that far :-)), but got some reports of at > least one, even though without any proof. > > Specifically, SVM is quite strange as it is included in all AMD CPU > models in QEMU and yet if you try to start it on a host without nesting > enabled "enforce" does not complain. I saw the feature is enabled with > older machine types, but I was told the magic behind this feature looks > like not only machine type but even host configuration itself is > involved. Anyway, although I saw reports of that the feature could be > enabled without an explicit request even with new machine types I still > haven't seen any proof of this happening. So I hope it just does not > happen and users are only afraid of this possibility. > >> and it's libvirt's job/role to paper over those issues? Do you have >> some specific cases of this? > This is not about papering. It's actually the opposite, that is about > detecting if something like this happens and reporting it as a failure > rather than papering it and hoping everything goes well. So I think > doing this is a good idea even though I don't think we actually saw any > issue of this kind.
Hi Jiri, I see now with libvirt master, with check=='full' we verify both silently dropped as well as added features. But as you already stated Qemu silently adding feature is a Qemu bug and libvirt just reports that bug, so it should be very unlikely, i agree that is not a good reasoning :). Our requirement is that we want to use CPU Models and features which are defined in Qemu but not in libvirt for e.g if we want to use Icelake-Server-V4 directly or newly added vmx feature, libvirt does not allow. Currently we take help of qemu to do validations but for cpu feature verfication and model definations we still use libvirt defined definations which prevent us to use anything which is not defined in libvirt. I see there are already efforts going on to get all model and feature defination from qemu itself, but not sure how much time it will take. Till that happens we thought safest option is to have an option to remove all validations from libvirt and rely on qemu 'enforce' for migration safetly. I understand qemu-enforce does not check for silently added features, but that case we can assume is very unlikely and Qemu should fix otherwise VMs will not poweron anyway with check=='full'. Basically we want it as an modification of check='none' but also skipping things like virCPUValidateFeatures and passing option 'enforce' to Qemu. Or if silently adding features is that big concern we can have a check in Qemu itself? I understand current qemu-enforce is not as migration safe as check=='full' but probably suitable for our use case for time being?
I understand why you want this, but on the other hand adding just a plain pass-through for CPU model and features is quite dangerous as it can be used to set any -cpu option even if it's not a feature. And especially the parts that are configured in other areas of domain XML (such as hypervisor features). So I think instead of adding a new mode for <cpu> we should go another way. For things that are not yet supported by libvirt we support various elements in qemu namespace, e.g., setting QEMU command line options, environment variables, or even overriding options libvirt would normally set on the command line. So I guess we could have a special <qemu:cpu> element which would be used when a user wants full control over the -cpu option without any libvirt intervention.
I really don't think we should allow this, especially not for something which is clearly intended to be used for production deployment. Our hypervisor CLI passthrough is there to allow for short term workarounds for bugs, or to experiment with a feature before it is mapped into libvirt in a supported manner.
If there are aspects related to QEMU configuration thiat are not in libvirt yet, we need to address those gaps, not provide yet another way to bypass libvirt mapping.
Indeed, this was definitely meant as a short term workaround for stuff we don't have support for yet, for testing or experimental purposes. The supported approach is for implement the missing parts in libvirt (and QEMU) as soon as possible. I don't see why would a properly documented support for experiments with -cpu would be an issue.
Why can't it just use the exsting QEMU passthrough syntax we have. I don't think we should be adding specifial support just for CPUs
That would be nice, but the QEMU passthrough syntax cannot be used for changing options that libvirt already passes to QEMU. So using it would likely result in two separate -cpu options on QEMU command line. And it would not rule out the CPU verification code in libvirt. Remember, we add a default -cpu model in case there's none configured in the XML.
Two -cpu options isn't a problem, the latter will override the former, which is fine from the level of support intended for QEMU passthrough usage. For the libvirt checking, isn't the 'check=none' attr sufficient to skip checks libvirt does. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 23/11/22 5:44 pm, Daniel P. Berrangé wrote:
On Wed, Nov 23, 2022 at 12:51:39PM +0100, Jiri Denemark wrote:
On Wed, Nov 23, 2022 at 12:29:24PM +0100, Jiri Denemark wrote:
On Wed, Nov 23, 2022 at 11:18:01 +0000, Daniel P. Berrangé wrote:
On Mon, Nov 21, 2022 at 12:47:58 +0530, manish.mishra wrote: > On 18/11/22 5:00 pm, Jiri Denemark wrote: >> On Fri, Nov 18, 2022 at 10:18:59 +0000, John Levon wrote: >>> On Fri, Nov 18, 2022 at 10:52:32AM +0100, Jiri Denemark wrote: >>> >>>>> * Qemu already provide an option 'enforce' to validate if features >>>>> with which vm is started is exactly same as one provided and nothing >>>>> is silently dropped. >>>> Right, but it's not enough. In addition to removed features libvirt also >>>> checks for unexpectedly added features. And you really need to do both. >>>> Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and QEMU >>>> says everything is fine, the guest might see more than what you asked. >>>> For example, if a feature is enabled only if a host supports it you may >>>> or may not get it without any complains from QEMU. But if you get it you >>>> really need to explicitly ask for it during migration, otherwise the >>>> feature can just silently disappear. Of course, this would be a really >>>> bad behavior from QEMU, but that does not mean it can't happen (I think >>>> SVM is a bit problematic in this way) and the whole point of libvirt's >>>> checks is to prevent this kind of issues. >>> Hi Jiri, I'm not following this very well. I think you're saying that qemu has >>> had bugs previously where features get silently enabled, >> Personally, I haven't actually witnessed any bug in this area (as far as >> I can remember, which is not that far :-)), but got some reports of at >> least one, even though without any proof. >> >> Specifically, SVM is quite strange as it is included in all AMD CPU >> models in QEMU and yet if you try to start it on a host without nesting >> enabled "enforce" does not complain. I saw the feature is enabled with >> older machine types, but I was told the magic behind this feature looks >> like not only machine type but even host configuration itself is >> involved. Anyway, although I saw reports of that the feature could be >> enabled without an explicit request even with new machine types I still >> haven't seen any proof of this happening. So I hope it just does not >> happen and users are only afraid of this possibility. >> >>> and it's libvirt's job/role to paper over those issues? Do you have >>> some specific cases of this? >> This is not about papering. It's actually the opposite, that is about >> detecting if something like this happens and reporting it as a failure >> rather than papering it and hoping everything goes well. So I think >> doing this is a good idea even though I don't think we actually saw any >> issue of this kind. > > Hi Jiri, I see now with libvirt master, with check=='full' we verify > both silently dropped as well as added features. But as you already > stated Qemu silently adding feature is a Qemu bug and libvirt just > reports that bug, so it should be very unlikely, i agree that is not a > good reasoning :). Our requirement is that we want to use CPU Models > and features which are defined in Qemu but not in libvirt for e.g if > we want to use Icelake-Server-V4 directly or newly added vmx feature, > libvirt does not allow. Currently we take help of qemu to do > validations but for cpu feature verfication and model definations we > still use libvirt defined definations which prevent us to use anything > which is not defined in libvirt. I see there are already efforts going > on to get all model and feature defination from qemu itself, but not > sure how much time it will take. Till that happens we thought safest > option is to have an option to remove all validations from libvirt and > rely on qemu 'enforce' for migration safetly. I understand > qemu-enforce does not check for silently added features, but that case > we can assume is very unlikely and Qemu should fix otherwise VMs will > not poweron anyway with check=='full'. Basically we want it as an > modification of check='none' but also skipping things like > virCPUValidateFeatures and passing option 'enforce' to Qemu. Or if > silently adding features is that big concern we can have a check in > Qemu itself? I understand current qemu-enforce is not as migration > safe as check=='full' but probably suitable for our use case for time > being? I understand why you want this, but on the other hand adding just a plain pass-through for CPU model and features is quite dangerous as it can be used to set any -cpu option even if it's not a feature. And especially the parts that are configured in other areas of domain XML (such as hypervisor features). So I think instead of adding a new mode for <cpu> we should go another way. For things that are not yet supported by libvirt we support various elements in qemu namespace, e.g., setting QEMU command line options, environment variables, or even overriding options libvirt would normally set on the command line. So I guess we could have a special <qemu:cpu> element which would be used when a user wants full control over the -cpu option without any libvirt intervention. I really don't think we should allow this, especially not for something which is clearly intended to be used for production deployment. Our hypervisor CLI passthrough is there to allow for short term workarounds for bugs, or to experiment with a feature before it is mapped into
On Tue, Nov 22, 2022 at 01:48:39PM +0100, Jiri Denemark wrote: libvirt in a supported manner.
If there are aspects related to QEMU configuration thiat are not in libvirt yet, we need to address those gaps, not provide yet another way to bypass libvirt mapping. Indeed, this was definitely meant as a short term workaround for stuff we don't have support for yet, for testing or experimental purposes. The supported approach is for implement the missing parts in libvirt (and QEMU) as soon as possible. I don't see why would a properly documented support for experiments with -cpu would be an issue. Why can't it just use the exsting QEMU passthrough syntax we have. I don't think we should be adding specifial support just for CPUs That would be nice, but the QEMU passthrough syntax cannot be used for changing options that libvirt already passes to QEMU. So using it would
On Wed, Nov 23, 2022 at 11:41:03 +0000, Daniel P. Berrangé wrote: likely result in two separate -cpu options on QEMU command line. And it would not rule out the CPU verification code in libvirt. Remember, we add a default -cpu model in case there's none configured in the XML. Two -cpu options isn't a problem, the latter will override the former, which is fine from the level of support intended for QEMU passthrough usage.
For the libvirt checking, isn't the 'check=none' attr sufficient to skip checks libvirt does.
With regards, Daniel
Actually even within check='none' libvirt does few checks e.g. virCPUValidateFeatures if feature string passed is proper and defined in libvirt. We could extend check='none' to remove even these kind of things. So Jiri suggested that making such kind of changes in general path is not good idea so discussion went to using separate cpu class <qemu:cpu>. Thanks Manish Mishra

On Wed, Nov 23, 2022 at 05:58:41PM +0530, manish.mishra wrote:
On 23/11/22 5:44 pm, Daniel P. Berrangé wrote:
On Wed, Nov 23, 2022 at 12:51:39PM +0100, Jiri Denemark wrote:
Why can't it just use the exsting QEMU passthrough syntax we have. I don't think we should be adding specifial support just for CPUs That would be nice, but the QEMU passthrough syntax cannot be used for changing options that libvirt already passes to QEMU. So using it would
On Wed, Nov 23, 2022 at 11:41:03 +0000, Daniel P. Berrangé wrote: likely result in two separate -cpu options on QEMU command line. And it would not rule out the CPU verification code in libvirt. Remember, we add a default -cpu model in case there's none configured in the XML. Two -cpu options isn't a problem, the latter will override the former, which is fine from the level of support intended for QEMU passthrough usage.
For the libvirt checking, isn't the 'check=none' attr sufficient to skip checks libvirt does.
With regards, Daniel
Actually even within check='none' libvirt does few checks e.g. virCPUValidateFeatures if feature string passed is proper and defined in libvirt. We could extend check='none' to remove even these kind of things. So Jiri suggested that making such kind of changes in general path is not good idea so discussion went to using separate cpu class <qemu:cpu>.
There is no need to remove any such check, as you wouldn't be specifying any <cpu> element with custom features set, so the virCPUValidateFeuatres check should not be creating any problems at that point. With regards, Daniel -- |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| |: https://libvirt.org -o- https://fstop138.berrange.com :| |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|

On 23/11/22 6:00 pm, Daniel P. Berrangé wrote:
On Wed, Nov 23, 2022 at 05:58:41PM +0530, manish.mishra wrote:
On Wed, Nov 23, 2022 at 12:51:39PM +0100, Jiri Denemark wrote:
Why can't it just use the exsting QEMU passthrough syntax we have. I don't think we should be adding specifial support just for CPUs That would be nice, but the QEMU passthrough syntax cannot be used for changing options that libvirt already passes to QEMU. So using it would
On Wed, Nov 23, 2022 at 11:41:03 +0000, Daniel P. Berrangé wrote: likely result in two separate -cpu options on QEMU command line. And it would not rule out the CPU verification code in libvirt. Remember, we add a default -cpu model in case there's none configured in the XML. Two -cpu options isn't a problem, the latter will override the former, which is fine from the level of support intended for QEMU passthrough usage.
For the libvirt checking, isn't the 'check=none' attr sufficient to skip checks libvirt does.
With regards, Daniel Actually even within check='none' libvirt does few checks e.g. virCPUValidateFeatures if feature string passed is proper and defined in libvirt. We could extend check='none' to remove even
On 23/11/22 5:44 pm, Daniel P. Berrangé wrote: these kind of things. So Jiri suggested that making such kind of changes in general path is not good idea so discussion went to using separate cpu class <qemu:cpu>. There is no need to remove any such check, as you wouldn't be specifying any <cpu> element with custom features set, so the virCPUValidateFeuatres check should not be creating any problems at that point.
oh i see you meant something like this, <cpu mode='custom' match='exact' check='none'> <numa> <cell id='0' cpus='0-239' memory='8388608' unit='KiB' memAccess='shared'/> </numa> </cpu> <qemu:commandline> <qemu:arg value='-cpu'/> <qemu:arg value='enforce,Icelake-v4,x2apic=on,hv-time,kvm-pv-eoi=on'/> </qemu:commandline> oh ok gives us sometime to evaluate this will revert soon, I am more worried about if we have to deal with other cpu related things too other than just cpu feature and model. Our current usecase was only around cpu features and cpu model :). Thanks Manish Mishra
With regards, Daniel

On Wed, Nov 23, 2022 at 18:34:38 +0530, manish.mishra wrote:
oh i see you meant something like this,
<cpu mode='custom' match='exact' check='none'> <numa> <cell id='0' cpus='0-239' memory='8388608' unit='KiB' memAccess='shared'/> </numa> </cpu>
<qemu:commandline> <qemu:arg value='-cpu'/> <qemu:arg value='enforce,Icelake-v4,x2apic=on,hv-time,kvm-pv-eoi=on'/> </qemu:commandline>
This won't work as libvirt should add a default CPU model there and replace check='none' with check='full' once the domain starts. But we actually have a way to do what you need even without any change in libvirt: <cpu mode='host-passthrough' check='none'> <topology .../> <numa> ... </numa> ... </cpu> <qemu:commandline> <qemu:arg value='-cpu'/> <qemu:arg value='Icelake-Server-v4,enforce,x2apic=on,hv-time,kvm-pv-eoi=on'/> </qemu:commandline> This will result in -cpu host,migratable=on ... -cpu Icelake-Server-v4,enforce,x2apic=on,hv-time,kvm-pv-eoi=on and libvirt will not touch the check='none' attribute once the domain starts and it would not complain about any unknown CPU features or models as there's nothing to check in <cpu/>. That is, there's no need to come up with anything special to support passing arbitrary options to -cpu. Jirka

On 24/11/22 5:40 pm, Jiri Denemark wrote:
On Wed, Nov 23, 2022 at 18:34:38 +0530, manish.mishra wrote:
oh i see you meant something like this,
<cpu mode='custom' match='exact' check='none'> <numa> <cell id='0' cpus='0-239' memory='8388608' unit='KiB' memAccess='shared'/> </numa> </cpu>
<qemu:commandline> <qemu:arg value='-cpu'/> <qemu:arg value='enforce,Icelake-v4,x2apic=on,hv-time,kvm-pv-eoi=on'/> </qemu:commandline> This won't work as libvirt should add a default CPU model there and replace check='none' with check='full' once the domain starts.
But we actually have a way to do what you need even without any change in libvirt:
<cpu mode='host-passthrough' check='none'> <topology .../> <numa> ... </numa> ... </cpu>
<qemu:commandline> <qemu:arg value='-cpu'/> <qemu:arg value='Icelake-Server-v4,enforce,x2apic=on,hv-time,kvm-pv-eoi=on'/> </qemu:commandline>
This will result in
-cpu host,migratable=on ... -cpu Icelake-Server-v4,enforce,x2apic=on,hv-time,kvm-pv-eoi=on
and libvirt will not touch the check='none' attribute once the domain starts and it would not complain about any unknown CPU features or models as there's nothing to check in <cpu/>.
That is, there's no need to come up with anything special to support passing arbitrary options to -cpu.
Jirka
Thank you so much Jirka, let me give this a try with few different cases. I will let you know how this goes or if we need anything extra. Thanks Manish Mishra

On Wed, Nov 23, 2022 at 12:14:45 +0000, Daniel P. Berrangé wrote:
On Wed, Nov 23, 2022 at 12:51:39PM +0100, Jiri Denemark wrote:
On Wed, Nov 23, 2022 at 11:41:03 +0000, Daniel P. Berrangé wrote:
On Wed, Nov 23, 2022 at 12:29:24PM +0100, Jiri Denemark wrote:
On Wed, Nov 23, 2022 at 11:18:01 +0000, Daniel P. Berrangé wrote:
On Tue, Nov 22, 2022 at 01:48:39PM +0100, Jiri Denemark wrote:
On Mon, Nov 21, 2022 at 12:47:58 +0530, manish.mishra wrote: > > On 18/11/22 5:00 pm, Jiri Denemark wrote: > > On Fri, Nov 18, 2022 at 10:18:59 +0000, John Levon wrote: > >> On Fri, Nov 18, 2022 at 10:52:32AM +0100, Jiri Denemark wrote: > >> > >>>> * Qemu already provide an option 'enforce' to validate if features > >>>> with which vm is started is exactly same as one provided and nothing > >>>> is silently dropped. > >>> Right, but it's not enough. In addition to removed features libvirt also > >>> checks for unexpectedly added features. And you really need to do both. > >>> Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and QEMU > >>> says everything is fine, the guest might see more than what you asked. > >>> For example, if a feature is enabled only if a host supports it you may > >>> or may not get it without any complains from QEMU. But if you get it you > >>> really need to explicitly ask for it during migration, otherwise the > >>> feature can just silently disappear. Of course, this would be a really > >>> bad behavior from QEMU, but that does not mean it can't happen (I think > >>> SVM is a bit problematic in this way) and the whole point of libvirt's > >>> checks is to prevent this kind of issues. > >> Hi Jiri, I'm not following this very well. I think you're saying that qemu has > >> had bugs previously where features get silently enabled, > > Personally, I haven't actually witnessed any bug in this area (as far as > > I can remember, which is not that far :-)), but got some reports of at > > least one, even though without any proof. > > > > Specifically, SVM is quite strange as it is included in all AMD CPU > > models in QEMU and yet if you try to start it on a host without nesting > > enabled "enforce" does not complain. I saw the feature is enabled with > > older machine types, but I was told the magic behind this feature looks > > like not only machine type but even host configuration itself is > > involved. Anyway, although I saw reports of that the feature could be > > enabled without an explicit request even with new machine types I still > > haven't seen any proof of this happening. So I hope it just does not > > happen and users are only afraid of this possibility. > > > >> and it's libvirt's job/role to paper over those issues? Do you have > >> some specific cases of this? > > This is not about papering. It's actually the opposite, that is about > > detecting if something like this happens and reporting it as a failure > > rather than papering it and hoping everything goes well. So I think > > doing this is a good idea even though I don't think we actually saw any > > issue of this kind. > > > Hi Jiri, I see now with libvirt master, with check=='full' we verify > both silently dropped as well as added features. But as you already > stated Qemu silently adding feature is a Qemu bug and libvirt just > reports that bug, so it should be very unlikely, i agree that is not a > good reasoning :). Our requirement is that we want to use CPU Models > and features which are defined in Qemu but not in libvirt for e.g if > we want to use Icelake-Server-V4 directly or newly added vmx feature, > libvirt does not allow. Currently we take help of qemu to do > validations but for cpu feature verfication and model definations we > still use libvirt defined definations which prevent us to use anything > which is not defined in libvirt. I see there are already efforts going > on to get all model and feature defination from qemu itself, but not > sure how much time it will take. Till that happens we thought safest > option is to have an option to remove all validations from libvirt and > rely on qemu 'enforce' for migration safetly. I understand > qemu-enforce does not check for silently added features, but that case > we can assume is very unlikely and Qemu should fix otherwise VMs will > not poweron anyway with check=='full'. Basically we want it as an > modification of check='none' but also skipping things like > virCPUValidateFeatures and passing option 'enforce' to Qemu. Or if > silently adding features is that big concern we can have a check in > Qemu itself? I understand current qemu-enforce is not as migration > safe as check=='full' but probably suitable for our use case for time > being?
I understand why you want this, but on the other hand adding just a plain pass-through for CPU model and features is quite dangerous as it can be used to set any -cpu option even if it's not a feature. And especially the parts that are configured in other areas of domain XML (such as hypervisor features). So I think instead of adding a new mode for <cpu> we should go another way. For things that are not yet supported by libvirt we support various elements in qemu namespace, e.g., setting QEMU command line options, environment variables, or even overriding options libvirt would normally set on the command line. So I guess we could have a special <qemu:cpu> element which would be used when a user wants full control over the -cpu option without any libvirt intervention.
I really don't think we should allow this, especially not for something which is clearly intended to be used for production deployment. Our hypervisor CLI passthrough is there to allow for short term workarounds for bugs, or to experiment with a feature before it is mapped into libvirt in a supported manner.
If there are aspects related to QEMU configuration thiat are not in libvirt yet, we need to address those gaps, not provide yet another way to bypass libvirt mapping.
Indeed, this was definitely meant as a short term workaround for stuff we don't have support for yet, for testing or experimental purposes. The supported approach is for implement the missing parts in libvirt (and QEMU) as soon as possible. I don't see why would a properly documented support for experiments with -cpu would be an issue.
Why can't it just use the exsting QEMU passthrough syntax we have. I don't think we should be adding specifial support just for CPUs
That would be nice, but the QEMU passthrough syntax cannot be used for changing options that libvirt already passes to QEMU. So using it would likely result in two separate -cpu options on QEMU command line. And it would not rule out the CPU verification code in libvirt. Remember, we add a default -cpu model in case there's none configured in the XML.
Two -cpu options isn't a problem, the latter will override the former, which is fine from the level of support intended for QEMU passthrough usage.
Oh, OK. In that case this would be usable. But see below.
For the libvirt checking, isn't the 'check=none' attr sufficient to skip checks libvirt does.
It does, although once a domain is started, we update the CPU definition in the XML to use check='full'. So check='none' is only usable for starting a domain. Migration will always use check='full'. We would need to come up with a new check value to completely turn off this functionality. Jirka

Hi Jiri, Thank you so much for response. On 18/11/22 3:22 pm, Jiri Denemark wrote:
Hi.
On Wed, Nov 16, 2022 at 22:53:18 +0530, manish.mishra wrote:
We had a requirement to skip all CPU feature validations from libvirt and totally rely on Qemu for following reasons. You are mixing two different things here. The "enforce" CPU checking for making sure a guest gets exactly what the user asked QEMU for and missing CPU models or features support in libvirt.
I actually meant, if we do checks in libvirt then if some cpu features or models are missing in libvirt we can not use those. 'enforce' was just fallback for checks in that case as atleast someone need to verify if what we passed is taken so that live migrations does not have issues.
* As libvirt always lies behind in term of cpu feature management compare to qemu, so we may not be able to use all of the latest support provided by qemu till it is updated in libvirt too. For example like if new cpu features are added in qemu but not yet updated in libvirt. So you effectively ask for a "passthrough" configuration, that is, CPU model and features would be passed directly to QEMU without libvirt even looking at them, right?
Yes but we want live migrations too.
Anyway, the current goal is to make libvirt ask for supported CPU models including their definition so that we don't have to keep our own definitions. That is, libvirt would support all models as they are added to QEMU without any additional work. Not sure if CPU features would follow, but adding them is fairly easy and what's more important they don't change in time (unlike CPU models).
Jiri, you mean libvirt will ask qemu for all supported models and features instead of maintaining its own list? Is it there in libvirt currently? We also had similar idea in one of our KVM forum talk but thought it is little long task so may take some time, till then we thought we can have this workaround.
* Libvirt checks if a features is supported or not based on cpuid, a feature supported by host may not be supported by qemu or there can be some features which are emulated by Qemu so libvirt level checking does not make sense in those cases. This has not been the case for several years already. Libvirt asks QEMU what CPU features it can provide on the host (either natively or emulated) and checks against them. So no issue here. Well, expect that some features are strange and QEMU does not report them as enabled for -cpu host even though it would enable them for some specific CPU models. But that's more in a "bug" area which needs to be solved between QEMU and libvirt rather than being it a principal issue.
You mean with check =='full', i never tried that but wondered how it works for cpu feature strings which are not defined in libvirt, because if some feature is not defined in libvirt we fail in basic checks like virCPUValidateFeatures. Currently also i agree libvirt asks Qemu for CPU features but i thought that is just for domcapability output or qemuCapability cache file but it was not used in checks.
* Qemu already provide an option 'enforce' to validate if features with which vm is started is exactly same as one provided and nothing is silently dropped. Right, but it's not enough. In addition to removed features libvirt also checks for unexpectedly added features. And you really need to do both. Because if you ask for -cpu Model,feat1=on,feat2=on,enforce and QEMU says everything is fine, the guest might see more than what you asked. For example, if a feature is enabled only if a host supports it you may or may not get it without any complains from QEMU. But if you get it you really need to explicitly ask for it during migration, otherwise the feature can just silently disappear. Of course, this would be a really bad behavior from QEMU, but that does not mean it can't happen (I think SVM is a bit problematic in this way) and the whole point of libvirt's checks is to prevent this kind of issues.
So once QEMU starts we checks all features enabled in the vCPU and note those that were (unexpectedly) added so that we can explicitly ask for them during migration and also check that no other features were added on top. And for this whole process to work, we need to understand, i.e., have explicit support, for all CPU features (QEMU report a lot of CPU properties and we need to know which one needs to be stored as enabled features) and CPU models to make sure both sides of migration understand the CPU definition in domain XML in the same way.
Our assumption was that qemu-enforce will make sure live migrations are safe as nothing can be silently dropped with enforce and it will be user's resposnsibilty to always pass suported features. But i was not aware of this, that Qemu can drop some features silently even with enforce, Jiri can you please point us to some of the example for this. How libvirt handles these cases as of now with check == 'full' ? Then again i had same doubt does it work fine for features which are defined in qemu but not in libvirt?
Basically we want another check option like 'qemu-enforce' which will skip all feature and cpu model verfications in libvirt and pass 'enforce' option to qemu. It will work similar to check 'none' i mean no check to verify if provide features are supported by host, also on top of that skip things like virCPUValidateFeatures where ever required. In short, I don't think this is a good idea and we should or could support it and still maintain migration safety. If you don't care about migration safety, you don't need to care about "enforce" QEMU check either and can just use host-passthrough and get whatever QEMU supports on the host.
Jirka
Thanks Manish Mishra
participants (4)
-
Daniel P. Berrangé
-
Jiri Denemark
-
John Levon
-
manish.mishra