[libvirt] [RFC] Support for CPUID masking

Hi,

We need to provide support for CPU ID masking. Xen and VMware ESX are examples of current hypervisors which support such masking.

My proposal is to define a new 'cpuid' feature advertised in guest capabilities:

    <guest>
      ...
      <features>
        <cpuid/>
      </features>
    </guest>

When a driver supports the cpuid feature, one can use it to mask/check for specific bits returned by CPU ID as follows:

    <domain>
      ...
      <features>
        <cpuid>
          <mask level='hex' register='eax|ebx|ecx|edx'>MASK</mask>
        </cpuid>
      </features>
      ...
    </domain>

Where
- level is a hexadecimal number used as an input to CPU ID (i.e. eax register),
- register is one of eax, ebx, ecx or edx,
- and MASK is a string with the following format: mmmm:mmmm:mmmm:mmmm:mmmm:mmmm:mmmm:mmmm with m being a 1-bit mask for the corresponding bit in the register.

There are three possibilities of specifying what values can be used for 'm':
- let it be driver-specific,
- define all possible values,
- define a common set of values and allow drivers to specify their own additional values.

I think the third is the way to go as it lowers the confusion of different values used by different drivers for the same purpose while maintaining the flexibility to support driver-specific masks.

The following could be a good set of predefined common values:
- 1  force the bit to be 1
- 0  force the bit to be 0
- x  don't care, i.e., use driver's default value
- T  require the bit to be 1
- F  require the bit to be 0

Some examples of what it could look like follow:

    <capabilities>
      ...
      <guest>
        <os_type>xen</os_type>
        ...
        <features>
          <pae/>
          <cpuid/>
        </features>
      </guest>
      ...
    </capabilities>

    <domain type='xen' id='42'>
      ...
      <features>
        <pae/>
        <acpi/>
        <apic/>
        <cpuid>
          <mask level='1' register='ebx'>
            xxxx:xxxx:0000:1010:xxxx:xxxx:xxxx:xxxx
          </mask>
          <mask level='1' register='ecx'>
            xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xx1x:xxxx
          </mask>
          <mask level='1' register='edx'>
            xxx1:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx
          </mask>
          <mask level='80000001' register='ecx'>
            xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xx1x
          </mask>
          <mask level='80000008' register='ecx'>
            xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xx00:1001
          </mask>
        </cpuid>
      </features>
      ...
    </domain>

What are your opinions about this?

Thanks for all comments.

Jirka
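To make the semantics of the five mask characters concrete, here is a minimal Python sketch of how a driver might interpret one MASK string against a host register value. It is not part of the proposal: the function names are hypothetical, the leftmost character is assumed to describe bit 31, and 'x' is assumed to mean "pass the host bit through", which is only one possible driver default.

```python
VALID_CHARS = set("10xTF")


def parse_mask(mask):
    """Drop whitespace and ':' separators and validate the 32 mask characters."""
    chars = "".join(mask.split()).replace(":", "")
    if len(chars) != 32:
        raise ValueError("mask must describe exactly 32 bits")
    bad = set(chars) - VALID_CHARS
    if bad:
        raise ValueError("unknown mask characters: %s" % sorted(bad))
    return chars


def apply_mask(mask, host_value):
    """Return the register value a guest would see for the given host value.

    Assumption: leftmost character is bit 31; 'x' keeps the host bit.
    'T'/'F' never change a bit, they only reject a host that does not match.
    """
    guest = host_value & 0xFFFFFFFF
    for index, m in enumerate(parse_mask(mask)):
        bit = 1 << (31 - index)
        host_bit = bool(host_value & bit)
        if m == "1":
            guest |= bit
        elif m == "0":
            guest &= ~bit
        elif m == "T" and not host_bit:
            raise ValueError("bit %d is required to be 1" % (31 - index))
        elif m == "F" and host_bit:
            raise ValueError("bit %d is required to be 0" % (31 - index))
    return guest
```

With the first example mask above, `apply_mask("xxxx:xxxx:0000:1010:xxxx:xxxx:xxxx:xxxx", host_ebx)` would overwrite bits 23..16 with 0x0A and leave every other bit as the host reports it.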

2009/9/2 Jiri Denemark <jdenemar@redhat.com>:
Hi,
We need to provide support for CPU ID masking. Xen and VMware ESX are examples of current hypervisors which support such masking.
[...]
    <domain type='xen' id='42'>
      ...
      <features>
        <pae/>
        <acpi/>
        <apic/>
        <cpuid>
          <mask level='1' register='ebx'>
            xxxx:xxxx:0000:1010:xxxx:xxxx:xxxx:xxxx
          </mask>
          <mask level='1' register='ecx'>
            xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xx1x:xxxx
          </mask>
          <mask level='1' register='edx'>
            xxx1:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx
          </mask>
          <mask level='80000001' register='ecx'>
            xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xx1x
          </mask>
          <mask level='80000008' register='ecx'>
            xxxx:xxxx:xxxx:xxxx:xxxx:xxxx:xx00:1001
          </mask>
        </cpuid>
      </features>
      ...
    </domain>
I like the proposed mapping for the domain XML, because it's a 1:1 mapping of what VMware uses in the VI API [1] and the VMX config. Besides that, VMware has two more possible values for the CPUID bits: H and R. Both are used to define how to handle/interpret those bits in the context of VMotion (migration).

For example, the domain XML snippet above maps to this VMX snippet:

    cpuid.1.ebx = "XXXXXXXX00001010XXXXXXXXXXXXXXXX"
    cpuid.1.ecx = "XXXXXXXXXXXXXXXXXXXXXXXXXX1XXXXX"
    cpuid.1.edx = "XXX1XXXXXXXXXXXXXXXXXXXXXXXXXXXX"
    cpuid.80000001.ecx = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXX1X"
    cpuid.80000008.ecx = "XXXXXXXXXXXXXXXXXXXXXXXXXX001001"

Matthias

[1] http://www.vmware.com/support/developer/vc-sdk/visdk400pubs/ReferenceGuide/v...
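The 1:1 mapping shown in the VMX snippet can be sketched in a few lines of Python. This is only an illustration inferred from the five example lines in this mail: the function name is hypothetical, and the assumption that every mask character simply carries over in upper case (including T and F) is not confirmed anywhere in the thread.

```python
def mask_to_vmx_line(level, register, mask):
    """Render one <mask> element as a VMX-style 'cpuid.<level>.<reg>' line.

    Assumption: the VMX value is the mask with ':' separators removed
    and 'x' written as 'X', as in the examples quoted in this thread.
    """
    chars = "".join(mask.split()).replace(":", "")
    if len(chars) != 32:
        raise ValueError("mask must describe exactly 32 bits")
    return 'cpuid.%s.%s = "%s"' % (level, register, chars.upper())


if __name__ == "__main__":
    print(mask_to_vmx_line("1", "ebx", "xxxx:xxxx:0000:1010:xxxx:xxxx:xxxx:xxxx"))
```

The reverse direction (VMX line back to a mask element) would only need to re-insert a ':' after every fourth character.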

Jiri Denemark wrote:
Hi,
We need to provide support for CPU ID masking. Xen and VMware ESX are examples of current hypervisors which support such masking.
My proposal is to define new 'cpuid' feature advertised in guest capabilities:
...
    <domain type='xen' id='42'>
      ...
      <features>
        <pae/>
        <acpi/>
        <apic/>
        <cpuid>
          <mask level='1' register='ebx'>
            xxxx:xxxx:0000:1010:xxxx:xxxx:xxxx:xxxx
          </mask>
...
What are your opinions about this?
I think it's too low-level, and the structure is x86-specific. QEMU and KVM compute their CPUID response based on the value passed to the -cpu argument, e.g.:

    -cpu core2duo,model=23,+ssse3,+lahf_lm

I think a similar structure makes more sense for libvirt, where the configuration generally avoids big blocks of binary data, and the XML format should suit other architectures as well.

-jim
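To show how little structure the -cpu form carries compared with raw masks, here is a small Python sketch that splits such a value into a model name, key=value properties and flag toggles. The function and dictionary key names are hypothetical, and handling of a leading '-' for disabled flags is an assumption mirrored from the '+' syntax in the example.

```python
def parse_cpu_arg(value):
    """Split a '-cpu' style value into model name, properties and flags."""
    name, *items = value.split(",")
    result = {"name": name, "properties": {}, "enable": [], "disable": []}
    for item in items:
        if item.startswith("+"):
            result["enable"].append(item[1:])
        elif item.startswith("-"):
            # Assumed counterpart of '+': switch a feature off.
            result["disable"].append(item[1:])
        elif "=" in item:
            key, val = item.split("=", 1)
            result["properties"][key] = val
        else:
            raise ValueError("unrecognised item: %r" % item)
    return result


if __name__ == "__main__":
    print(parse_cpu_arg("core2duo,model=23,+ssse3,+lahf_lm"))
```

A libvirt XML form along these lines would need one element for the model and one per flag, which is the structure the later replies converge on.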

On Wed, Sep 02, 2009 at 11:59:39AM -0400, Jim Paris wrote:
I think it's too low-level, and the structure is x86-specific. QEMU and KVM compute their CPUID response based on arguments to the -cpu argument, e.g.:
-cpu core2duo,model=23,+ssse3,+lahf_lm
I think a similar structure makes more sense for libvirt, where the configuration generally avoids big blocks of binary data, and the XML format should suit other architectures as well.
I'm going back & forth on this too. We essentially have 3 options

 - Named CPU + flags/features
 - CPUID masks
 - Allow either

If we do either of the first two, we have to translate between the two formats for one or more of the hypervisors. For the last one we are just punting the problem off to applications.

If we chose CPUID, and made the QEMU driver convert to named CPU + flags, we'd be stuck for non-x86 as you say.

If we chose named CPU + flags, and made VMware/Xen convert to raw CPUID, we'd potentially lose information if the user had defined a config with a raw CPUID mask outside the context of libvirt.

The other thing to remember is that CPUID also encodes sockets/cores/threads topology data, and it'd be very desirable to expose that in a sensible fashion (i.e. not a bitmask).

On balance I'm currently leaning to named CPU + flags + explicit topology data because although it's harder to implement for Xen/VMware I think it's much nicer to applications & users. We might lose a tiny bit of data in the CPUID -> named CPU/flags conversion for Xen/VMware but I reckon we can get it good enough that most people won't really care about that.

Daniel
-- 
|: Red Hat, Engineering, London   -o-   http://people.redhat.com/berrange/ :|
|: http://libvirt.org  -o-  http://virt-manager.org  -o-  http://ovirt.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505  -o-  F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
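The "named flags to raw CPUID" translation that the Xen/VMware drivers would need can be sketched as a table lookup. The Python below is an illustration only: the function name and the tiny table are hypothetical, the bit positions are the publicly documented x86 ones for these six flags, and the output uses the mask syntax from the original proposal with the leftmost character assumed to be bit 31.

```python
# (CPUID leaf, register, bit) for a handful of x86 feature flags.
FEATURE_BITS = {
    "sse": (0x1, "edx", 25),
    "sse2": (0x1, "edx", 26),
    "sse3": (0x1, "ecx", 0),
    "ssse3": (0x1, "ecx", 9),
    "lahf_lm": (0x80000001, "ecx", 0),
    "nx": (0x80000001, "edx", 20),
}


def flags_to_masks(enable=(), disable=()):
    """Build one mask string per (leaf, register) touched by the given flags."""
    masks = {}
    for flags, char in ((enable, "1"), (disable, "0")):
        for flag in flags:
            leaf, register, bit = FEATURE_BITS[flag]
            bits = masks.setdefault((leaf, register), ["x"] * 32)
            bits[31 - bit] = char
    return {
        key: ":".join("".join(bits[i:i + 4]) for i in range(0, 32, 4))
        for key, bits in masks.items()
    }


if __name__ == "__main__":
    for (leaf, register), mask in flags_to_masks(enable=["ssse3", "lahf_lm"]).items():
        print("%x %s %s" % (leaf, register, mask))
```

Going the other way (mask to flags) is where information can be lost: any bit set in a mask that has no entry in such a table has no name to map to.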

On 09/02/2009 07:09 PM, Daniel P. Berrange wrote:
I'm going back & forth on this too. We essentially have 3 options
 - Named CPU + flags/features
 - CPUID masks
 - Allow either
If we do either of the first two, we have to translate between the two formats for one or more of the hypervisors. For the last one we are just punting the problem off to applications.
If we chose CPUID, and made the QEMU driver convert to named CPU + flags, we'd be stuck for non-x86 as you say.
Why is that? cpu model + flags may apply for other arch too.
If we chose named CPU + flags, and made VMware/Xen convert to raw CPUID, we'd potentially lose information if the user had defined a config with a raw CPUID mask outside the context of libvirt.
The other thing to remember is that CPUID also encodes sockets/cores/threads topology data, and it'd be very desirable to expose that in a sensible fashion (i.e. not a bitmask).
On balance I'm currently leaning to named CPU + flags + explicit topology data because although it's harder to implement for Xen/VMware I think it's much nicer to applications & users. We might lose a tiny bit of data in the CPUID -> named CPU/flags conversion for Xen/VMware but I reckon we can get it good enough that most people won't really care about that.
Daniel
There are 2 more issues to consider:

1. The VMware approach with all the cpuid bits might be ok; the problem is mapping it into the qemu model. Will libvirt do that?
2. If we use the qemu approach, the host information (cpuids) needs to travel to the higher mgmt level in order to allow computation of the greatest common denominator (the feature set shared by all hosts).

Regards,
Dor
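The "greatest common denominator" computation in point 2 amounts to a set intersection over the feature flags each host reports. A minimal Python sketch, assuming the management layer has already collected one flag list per host (the function name and host names are hypothetical):

```python
def common_features(hosts):
    """Return the flags that every host in the mapping reports.

    `hosts` maps a host name to an iterable of feature flag names.
    """
    flag_sets = [set(flags) for flags in hosts.values()]
    if not flag_sets:
        return set()
    return set.intersection(*flag_sets)


if __name__ == "__main__":
    cluster = {
        "host-a": ["sse", "sse2", "ssse3", "lahf_lm"],
        "host-b": ["sse", "sse2", "lahf_lm"],
    }
    print(sorted(common_features(cluster)))
```

A guest restricted to the resulting set could, in principle, be migrated between any of the listed hosts, which is why the host data has to be visible above libvirt.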

On Thu, Sep 03, 2009 at 11:19:47AM +0300, Dor Laor wrote:
On 09/02/2009 07:09 PM, Daniel P. Berrange wrote:
I'm going back & forth on this too. We essentially have 3 options
 - Named CPU + flags/features
 - CPUID masks
 - Allow either
If we do either of the first two, we have to translate between the two formats for one or more of the hypervisors. For the last one we are just punting the problem off to applications.
If we chose CPUID, and made the QEMU driver convert to named CPU + flags, we'd be stuck for non-x86 as you say.
Why is that? cpu model + flags may apply for other arch too.
If we have CPUID in the XML, there is no meaningful CPUID register representation for sparc/ppc/arm/etc. It is an x86 concept, which is almost certainly why QEMU uses named CPU models + named flags instead of CPUID as its public-facing config. Xen/VMware of course don't have this limitation since they only really care about x86. So really QEMU's CPU model + flags approach is more generic, albeit much more verbose to achieve the same level of expressivity.
If we chose named CPU + flags, and made VMware/Xen convert to raw CPUID, we'd potentially lose information if the user had defined a config with a raw CPUID mask outside the context of libvirt.
The other thing to remember is that CPUID also encodes sockets/cores/threads topology data, and it'd be very desirable to expose that in a sensible fashion (i.e. not a bitmask).
On balance I'm currently leaning to named CPU + flags + explicit topology data because although it's harder to implement for Xen/VMware I think it's much nicer to applications & users. We might lose a tiny bit of data in the CPUID -> named CPU/flags conversion for Xen/VMware but I reckon we can get it good enough that most people won't really care about that.
Daniel
There are 2 more issues to consider: 1. The VMware approach with all the cpuid bits might be ok; the problem is mapping it into the qemu model. Will libvirt do that?
The problem is that CPUID is not viable for non-x86 archs so it can't really be used as our master representation.
2. If we use the qemu approach, the host information (cpuids) needs to travel to the higher mgmt level in order to allow computation of the greatest common denominator.
Yes, whatever we decide for exposing guest CPU model/flags/etc should be equally applied to the libvirt capabilities XML so that apps can query physical host data

Regards,
Daniel

Speaking from an x86 angle, providing an ability to enable or disable high level constructs like SSE instead of low level constructs, will make it easy to understand.

Thanks
Mukesh
-- Libvir-list mailing list Libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list

On 10/09/2009 12:58 PM, Mukesh G wrote:
Speaking from an x86 angle, providing an ability to enable or disable high level constructs like SSE instead of low level constructs, will make it easy to understand.

Isn't SSE a low-level construct too?

What about another approach for the cpuid issue:

I think that dealing with specific flags is pretty error prone on all levels - virt-mgr, libvirt, qemu, migration, and even the guest. We can't just 'invent' new cpu models that will be a combination of flags we have on the host. It might work for some kernels/apps but it might break others. In addition to cpuid we have the stepping and more MSRs that need to be hidden/exposed.

The offer:
Libvirt, virt-mgr and qemu will not deal with low-level bits of the cpuid. We'll use predefined cpu models to be emulated. These predefined models will represent a real model that was once shipped by Intel/AMD.

We'll write an additional tool, not under libvirt, that will be able to calculate all the virtual cpus that can run over the physical cpu. This tool will be the one messing with /proc/cpuinfo, etc.

Example (theoretical): Physical cpu is "Intel(R) Core(TM)2 Duo CPU T9600 @ 2.80GHz"; the output of the tool will be:

    shell> computeVirtCpuCapabilities
    core2duo core solo 486 pentium3 ..

Libvirt will only expose the real physical cpu and all of the outputs above. Higher level mgmt will compute the best -cpu to pick for the VM, and it will take into account the user's needs for performance or for migration flexibility.

The pros:
- Simple for all levels
- Fast implementation
- Accurate representation of real cpus

The cons:
- None, we should write the tool anyway
- Maybe we hide some of the possibilities, but as said above, it is safer and friendlier.

I also recommend adding a field next to the cpu for flexibility and possible future changes + dealing with problematic bios with NX bit disabled.

Regards,
Dor
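The tool described above could, in its simplest form, compare the host's flag list against a table of per-model requirements. The Python sketch below is purely illustrative: computeVirtCpuCapabilities is only a theoretical name in this mail, the function names are hypothetical, and the requirement table is a made-up placeholder (a real one would need many more flags per model, plus the stepping and MSR details mentioned earlier).

```python
# Placeholder requirements; NOT an accurate description of these CPUs.
MODEL_FLAGS = {
    "486": set(),
    "pentium3": {"sse"},
    "coreduo": {"sse", "sse2", "pni"},
    "core2duo": {"sse", "sse2", "pni", "ssse3"},
}


def host_flags(cpuinfo_text):
    """Extract the first 'flags' line from /proc/cpuinfo-style text (x86 Linux)."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def runnable_models(flags):
    """List every model whose required flags are all present on the host."""
    return [name for name, needed in MODEL_FLAGS.items() if needed <= flags]


if __name__ == "__main__":
    with open("/proc/cpuinfo") as handle:
        print(" ".join(runnable_models(host_flags(handle.read()))))
```

Libvirt would then only have to publish the resulting list alongside the physical CPU model, leaving the choice among them to the management layer.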

On Sun, Oct 11, 2009 at 11:44:04AM +0200, Dor Laor wrote:
On 10/09/2009 12:58 PM, Mukesh G wrote:
Speaking from an x86 angle, providing an ability to enable or disable high level constructs like SSE instead of low level constructs, will make it easy to understand.
Isn't SSE a low-level construct too?
What about another approach for the cpuid issue: I think that dealing with specific flags is pretty error prone on all levels - virt-mgr, libvirt, qemu, migration, and even the guest. We can't just 'invent' new cpu models that will be a combination of flags we have on the host. It might work for some kernels/apps but it might break others. In addition to cpuid we have the stepping and more MSRs that need to be hidden/exposed.
The offer: Libvirt, virt-mgr and qemu will not deal with low-level bits of the cpuid. We'll use predefined cpu models to be emulated. These predefined models will represent a real model that was once shipped by Intel/AMD.
We'll write an additional tool, not under libvirt, that will be able to calculate all the virtual cpus that can run over the physical cpu. This tool will be the one messing with /proc/cpuinfo, etc.
Example (theoretical): Physical cpu is "Intel(R) Core(TM)2 Duo CPU T9600 @ 2.80GHz"; the output of the tool will be: shell> computeVirtCpuCapabilities core2duo core solo 486 pentium3 ..
Libvirt will only expose the real physical cpu and all of the outputs above. Higher level mgmt will compute the best -cpu to pick for the VM, and it will take into account the user's needs for performance or for migration flexibility.
The pros: - Simple for all levels - Fast implementation - Accurate representation of real cpus
The cons: - None, we should write the tool anyway - Maybe we hide some of the possibilities, but as said above, it is safer and friendlier. I also recommend adding a field next to the cpu for flexibility and possible future changes + dealing with problematic bios with NX bit disabled.
Regards, Dor
Dan was discussing something quite like this early on, IIRC... --Hugh

Dor Laor wrote:
What about another approach for the cpuid issue: I think that dealing with specific flags is pretty error prone on all levels - virt-mgr, libvirt, qemu, migration, and even the guest.
..and performance verification, QA, and the average end user. Unless we reduce all possible combinations of knob settings into a few well thought out lumped models the complexity can be overwhelming. It is probably a reasonable compromise to initially support the most obvious use cases with more fringe models added on an as-needed, as-justified basis.
We can't just 'invent' new cpu models that will be a combination of flags we have on the host. It might work for some kernels/apps but it might break others. In addition to cpuid we have the stepping and more MSRs that need to be hidden/exposed.
In whatever abstraction, exposing all of that low-level detail as the sole configuration means will needlessly exaggerate the complexity as above.
The offer: Libvirt, virt-mgr and qemu will not deal with low-level bits of the cpuid. We'll use predefined cpu models to be emulated. These predefined models will represent a real model that was once shipped by Intel/AMD.
We'll write an additional tool, not under libvirt, that will be able to calculate all the virtual cpus that can run over the physical cpu. This tool will be the one messing with /proc/cpuinfo, etc.
Example (theoretical): Physical cpu is "Intel(R) Core(TM)2 Duo CPU T9600 @ 2.80GHz"; the output of the tool will be: shell> computeVirtCpuCapabilities core2duo core solo 486 pentium3 ..
Libvirt will only expose the real physical cpu and all of the outputs above. Higher level mgmt will compute the best -cpu to pick for the VM, and it will take into account the user's needs for performance or for migration flexibility.
The pros: - Simple for all levels - Fast implementation - Accurate representation of real cpus
The cons: - None, we should write the tool anyway - Maybe we hide some of the possibilities, but as said above, it is safer and friendlier.
I think paving over the complexity to export the most common use cases is a reasonable approach. We can allow some sort of fingers-in-the-gears mode for experimentation and tuning as needed. But supporting features such as safe migration could be a non-goal in these scenarios. -john
I also recommend of adding a field to be added next to the cpu for flexibility and possible future changes + dealing with problematic bios with NX bit disabled.
Regards, Dor
Thanks
Mukesh
On Thu, Sep 3, 2009 at 3:09 PM, Daniel P. Berrange<berrange@redhat.com> wrote:
On Thu, Sep 03, 2009 at 11:19:47AM +0300, Dor Laor wrote:
On 09/02/2009 07:09 PM, Daniel P. Berrange wrote:
On Wed, Sep 02, 2009 at 11:59:39AM -0400, Jim Paris wrote:
Jiri Denemark wrote: > Hi, > > We need to provide support for CPU ID masking. Xen and VMware ESX > are > examples > of current hypervisors which support such masking. > > My proposal is to define new 'cpuid' feature advertised in guest > capabilities: ... > <domain type='xen' id='42'> > ... > <features> > <pae/> > <acpi/> > <apic/> > <cpuid> > <mask level='1' register='ebx'> > xxxx:xxxx:0000:1010:xxxx:xxxx:xxxx:xxxx > </mask> ... > What are your opinions about this?
I think it's too low-level, and the structure is x86-specific. QEMU and KVM compute their CPUID response based on arguments to the -cpu argument, e.g.:
-cpu core2duo,model=23,+ssse3,+lahf_lm
I think a similar structure makes more sense for libvirt, where the configuration generally avoids big blocks of binary data, and the XML format should suit other architectures as well.
I'm going back& forth on this too. We essentially have 3 options
- Named CPU + flags/features - CPUID masks - Allow either
If we do either of the first two, we have to translate between the two formats for one or more of the hypervisors. For the last one we are just punting the problem off to applications.
If we choose CPUID, and made QEMU driver convert to named CPU + flags we'd be stuck for non-x86 as you say.
Why is that? cpu model + flags may apply for other arch too.
If we have CPUID in the XML, there is no meaningful CPUID register representation for sparc/ppc/arm/etc. It is an x86 concept, which is almost certainly why QEMU uses named CPU models + named flags instead of CPUID as its public-facing config.
Xen/VMWare of course don't have this limitation since they only really care about x86.
So really QEMU's CPU model + flags approach is more generic albeit being much more verbose to achieve the same level of expressivity.
If we chose named CPU + flags, and made VMWare/Xen convert to raw CPUID, we'd potentially lose information if the user had defined a config with a raw CPUID mask outside the context of libvirt.
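As a rough sketch of the flags-to-CPUID direction, the following turns a few named flags into mask strings in the format from the original proposal. It assumes the leftmost character of the mask is bit 31, which the proposal does not state explicitly. The FEATURE_BITS table and flags_to_masks are hypothetical names, and the table is only a small illustrative subset of x86 feature bits.

```python
# Illustrative subset of x86 CPUID feature bits: name -> (leaf, register, bit).
FEATURE_BITS = {
    "ssse3":   (0x00000001, "ecx", 9),
    "lahf_lm": (0x80000001, "ecx", 0),
    "nx":      (0x80000001, "edx", 20),
}


def flags_to_masks(force_on=(), force_off=()):
    """Build per-(leaf, register) mask strings; 'x' means driver default.

    Assumption: the leftmost mask character corresponds to bit 31.
    """
    masks = {}
    for value, names in (("1", force_on), ("0", force_off)):
        for name in names:
            leaf, reg, bit = FEATURE_BITS[name]
            bits = masks.setdefault((leaf, reg), ["x"] * 32)
            bits[31 - bit] = value
    # Group the 32 characters into eight colon-separated nibbles.
    return {
        key: ":".join("".join(bits[i:i + 4]) for i in range(0, 32, 4))
        for key, bits in masks.items()
    }


masks = flags_to_masks(force_on=["ssse3", "lahf_lm"], force_off=["nx"])
for (leaf, reg), mask in sorted(masks.items()):
    print(f"level={leaf:x} register={reg} {mask}")
```

Going the other way, from an arbitrary mask back to names, only works for bits that appear in such a table, which is where the information loss mentioned above would come from.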
The other thing to remember is that CPUID also encodes sockets/cores/ threads topology data, and it'd be very desirable to expose that in a sensible fashion (ie not a bitmask).
On balance I'm currently leaning to named CPU + flags + explicit topology data because although it's harder to implement for Xen/VMWare I think it's much nicer to applications & users. We might lose a tiny bit of data in the CPU -> named/flags conversion for Xen/VMWare but I reckon we can get it good enough that most people won't really care about that.
Daniel
There are 2 more issues to consider: 1. The VMW approach with all the cpuid bits might be ok; the problem is mapping it into the qemu model. Will libvirt do that?
The problem is that CPUID is not viable for non-x86 archs, so it can't really be used as our master representation.
2. If we use the qemu approach, the host information (cpuids) needs to travel to the higher mgmt level in order to allow computation of the greatest common denominator.
Yes, whatever we decide for exposing guest CPU model/flags/etc should be equally applied to the libvirt capabilities XML so that apps can query physical host data
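The "greatest common denominator" computation a management layer would do can be sketched as a plain set intersection over the flags each host reports. The host names and flag sets below are made up for illustration, and common_cpu_flags is a hypothetical helper, not a libvirt API.

```python
def common_cpu_flags(hosts):
    """Return the CPU flags present on every host.

    `hosts` maps a host name to an iterable of flag names.
    """
    flag_sets = [set(flags) for flags in hosts.values()]
    if not flag_sets:
        return set()
    return set.intersection(*flag_sets)


# Made-up example data; host-c stands in for a machine with NX
# switched off in the BIOS.
hosts = {
    "host-a": {"pae", "nx", "ssse3", "lahf_lm"},
    "host-b": {"pae", "nx", "lahf_lm"},
    "host-c": {"pae", "ssse3", "lahf_lm"},
}
baseline = common_cpu_flags(hosts)
print(sorted(baseline))
```

A guest restricted to the resulting baseline would only see flags that every host in the pool can provide, which is the property needed for safe migration between them.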
Regards, Daniel -- |: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :| |: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :| |: http://autobuild.org -o- http://search.cpan.org/~danberr/ :| |: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
-- Libvir-list mailing list Libvir-list@redhat.com https://www.redhat.com/mailman/listinfo/libvir-list
-- john.cooper@redhat.com

On Tue, Oct 13, 2009 at 02:56:41AM -0400, john cooper wrote:
Dor Laor wrote:
What about another approach for the cpuid issue: I think that dealing with specific flags is pretty error prone on all levels - virt-mgr, libvirt, qemu, migration, and even the guest.
..and performance verification, QA, and the average end user. Unless we reduce all possible combinations of knob settings into a few well thought out lumped models the complexity can be overwhelming.
That is a policy decision for applications to make. libvirt should expose the fine grained named CPU models + arbitrary flags, and other bits of info as appropriate (eg formal model for core/socket topology). Apps can decide whether they want to turn that into a higher level concept where admins pick one of a handful of common setups, or expose the full level of control
Daniel

On 10/13/2009 11:40 AM, Daniel P. Berrange wrote:
On Tue, Oct 13, 2009 at 02:56:41AM -0400, john cooper wrote:
Dor Laor wrote:
What about another approach for the cpuid issue: I think that dealing with specific flags is pretty error prone on all levels - virt-mgr, libvirt, qemu, migration, and even the guest.
..and performance verification, QA, and the average end user. Unless we reduce all possible combinations of knob settings into a few well thought out lumped models the complexity can be overwhelming.
That is a policy decision for applications to make. libvirt should expose the fine grained named CPU models + arbitrary flags, and other bits of info as appropriate (eg formal model for core/socket topology). Apps can decide whether they want to turn that into a higher level concept where admins pick one of a handful of common setups, or expose the full level of control
As long as the cpu model is exposed (both host and guest) it works for me. My guess is that most apps will try to be as dumb as possible. Some might only use problematic flags like the NX bit that might be off in the bios menu.
Daniel

We need to provide support for CPU ID masking. Xen and VMware ESX are examples of current hypervisors which support such masking.
My proposal is to define new 'cpuid' feature advertised in guest capabilities: ... <domain type='xen' id='42'> ... <features> <pae/> <acpi/> <apic/> <cpuid> <mask level='1' register='ebx'> xxxx:xxxx:0000:1010:xxxx:xxxx:xxxx:xxxx </mask> ... What are your opinions about this?
I think it's too low-level, and the structure is x86-specific.
Hmm, right, it's only for x86... I'll come up with something better. Jirka
participants (8)
- Daniel P. Berrange
- Dor Laor
- Hugh O. Brock
- Jim Paris
- Jiri Denemark
- john cooper
- Matthias Bolte
- Mukesh G