[Libvir] XML format for QEMU / KVM driver

Since KVM is the new "shiny thing" in the virtualization world for Linux, I figure it's time to resurrect & finish the prototype QEMU backend I started developing for libvirt. Thanks to the design of KVM, it ought to be possible to support KVM & QEMU in a single driver, with near-zero extra effort to add the bits for KVM support.

There are a couple of interesting decisions wrt additions in the XML format describing guests.

 1. QEMU can be built with several different CPU accelerators, eg KQEMU, QVM86 or KVM. These are only available if the guest CPU matches the host CPU architecture though.

 2. QEMU can emulate many different CPU types: x86, x86_64, mips, sparc, ppc, arm, etc.

 3. QEMU can emulate many different 'machine' types for each CPU type, eg for x86 CPUs:

      $ qemu -M ?
      Supported machines are:
      pc         Standard PC (default)
      isapc      ISA-only PC

Some of these options are activated by specifying a particular command line flag, eg -M for machine type. Others require you to use a different qemu binary, eg 'qemu-system-arm' or 'qemu-system-ppc' instead of 'qemu'.

QEMU is basically a fully-virt solution, so the XML description for the <os> block would be superficially similar to Xen HVM support, in that it will typically try to boot from the MBR of the first hard disk (unless you specify an alternate boot device). Unlike Xen HVM support, it can also have an explicit kernel, initrd & arguments specified via the command line. The other difference is that Xen has a separate binary for the loader vs the device model, while QEMU/KVM does it all in one binary.

So, my initial thoughts are:

 1. Use the <type> element within the <os> block to describe the CPU accelerator to use, accepting the values 'qemu', 'kvm', 'kqemu' or 'qvm86'.

 2. For CPU architecture, there are a couple of choices:

    a) Add an 'arch' attribute to the <type> element to select between x86, ppc, sparc, etc.
    b) Add an <arch> element within the <os> block to switch between architectures.
    c) Just use the <loader> element to point to the QEMU binary matching the desired architecture.
    d) Use the <emulator> element within the <devices> section to point to the QEMU binary matching the desired architecture.
    e) Option a), and also allow use of <loader> to override the default QEMU binary.
    f) Option b), and also allow use of <loader> to override the default QEMU binary.
    g) Option a), and also allow use of <emulator> within the <devices> block to override the default QEMU binary.
    h) Option b), and also allow use of <emulator> within the <devices> block to override the default QEMU binary.

    At this time, I'm leaning towards option e). This would give examples like:

    * Using PPC with regular non-accelerated QEMU:

        <os>
          <type arch="ppc">qemu</type>
        </os>

    * Using non-accelerated QEMU, and guest architecture matching the host:

        <os>
          <type>qemu</type>
        </os>

    * Using accelerated KVM:

        <os>
          <type>kvm</type>
        </os>

    * Using accelerated KVM, but with an alternate binary:

        <os>
          <type>kvm</type>
          <loader>/home/berrange/work/kvm-devel/qemu-kvm</loader>
        </os>

 3. For machine type, again there are a couple of options:

    a) Add a 'machine' attribute to the <type> element.
    b) Add a <machine> element within the <os> block.

    For this I think we should just follow the same scheme that we use to specify architecture, so I'd lean towards a):

    * Using PPC with a Mac-99 machine:

        <os>
          <type arch="ppc" machine="mac99">qemu</type>
        </os>

    * Using non-accelerated QEMU, guest architecture matching the host, and a non-PCI machine:

        <os>
          <type machine="isapc">qemu</type>
        </os>

All the other bits of XML, like disk/network/console configuration, are pretty straightforward, following a very similar structure to Xen. So there isn't really much interesting stuff to discuss there.

Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston.  +1 978 392 2496 -=|
|=-           Perl modules: http://search.cpan.org/~danberr/              -=|
|=-               Projects: http://freshmeat.net/~danielpb/               -=|
|=-  GnuPG: 7D3B9505   F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505  -=|
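
To make the proposal above concrete, here is a minimal sketch of how a driver might turn such an <os> block into a QEMU command line under option e) for architecture and option a) for machine type. It is only an illustration, not libvirt code: the function name build_argv, the host_arch parameter, and the general 'qemu-system-<arch>' naming pattern are assumptions (the proposal only names 'qemu', 'qemu-system-arm' and 'qemu-system-ppc').

```python
import xml.etree.ElementTree as ET


def build_argv(os_xml, host_arch="i686"):
    """Turn an <os> block into a QEMU argv list (hypothetical helper)."""
    os_el = ET.fromstring(os_xml)
    type_el = os_el.find("type")
    if type_el is None:
        raise ValueError("<os> block has no <type> element")

    # A missing 'arch' attribute is taken to mean "same as the host".
    arch = type_el.get("arch", host_arch)
    machine = type_el.get("machine")
    loader = os_el.findtext("loader")

    if loader:
        # An explicit <loader> overrides the default binary (option e).
        binary = loader.strip()
    elif arch == host_arch:
        binary = "qemu"
    else:
        # Assumed naming pattern, generalised from 'qemu-system-arm'
        # and 'qemu-system-ppc' mentioned in the proposal.
        binary = "qemu-system-" + arch

    argv = [binary]
    if machine:
        # Machine type is selected with the -M flag.
        argv += ["-M", machine]
    return argv


print(build_argv('<os><type arch="ppc" machine="mac99">qemu</type></os>'))
# prints ['qemu-system-ppc', '-M', 'mac99']
```

The text content of <type> (the accelerator) is deliberately ignored here, since how it should be expressed is exactly what the rest of the thread debates.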

On Wed, 2007-01-03 at 15:11 +0000, Daniel P. Berrange wrote:
> Since KVM is the new "shiny thing" in the virtualization world for Linux, I figure it's time to resurrect & finish the prototype QEMU backend I started developing for libvirt. Thanks to the design of KVM, it ought to be possible to support KVM & QEMU in a single driver, with near-zero extra effort to add the bits for KVM support.

Note that KVM can conceivably be used with something other than qemu as the device model. Although there isn't anything else currently doing so, someone on the kvm list is working on something. Also, there is some investigation underway on ways to do some paravirt with KVM.

[snip]

> 1. Use the <type> element within the <os> block to describe the CPU accelerator to use, accepting the values 'qemu', 'kvm', 'kqemu' or 'qvm86'.

Given the above, I'd lean a bit towards doing something like

    <type accel="kvm">qemu</type>

instead. This feels a little bit more consistent with the rest of the proposal.

Jeremy

On Wed, Jan 03, 2007 at 10:27:03AM -0500, Jeremy Katz wrote:
> On Wed, 2007-01-03 at 15:11 +0000, Daniel P. Berrange wrote:
> > Since KVM is the new "shiny thing" in the virtualization world for Linux, I figure it's time to resurrect & finish the prototype QEMU backend I started developing for libvirt. Thanks to the design of KVM, it ought to be possible to support KVM & QEMU in a single driver, with near-zero extra effort to add the bits for KVM support.
>
> Note that KVM can conceivably be used with something other than qemu as the device model. Although there isn't anything else currently doing so, someone on the kvm list is working on something. Also, there is some investigation underway on ways to do some paravirt with KVM.

Well, KVM with a non-QEMU device model would require a different libvirt backend, because the manner of starting/stopping/managing the device model would be completely different. So in this scenario we're only really interested in differentiating the different means of using QEMU itself, of which KVM is one option.

> [snip]
> > 1. Use the <type> element within the <os> block to describe the CPU accelerator to use, accepting the values 'qemu', 'kvm', 'kqemu' or 'qvm86'.
>
> Given the above, I'd lean a bit towards doing something like
>
>     <type accel="kvm">qemu</type>
>
> instead. This feels a little bit more consistent with the rest of the proposal.

Having the contents of <type> always be 'qemu' is a little redundant, since we already specify 'qemu' as the guest type in the top-level XML element. I see 'type' as referring to the virtualization strategy for the driver backend, eg Xen has the 'linux' or 'hvm' approaches, while QEMU has the 'qemu', 'kvm' or 'kqemu' approaches.

Dan.

On Wed, Jan 03, 2007 at 03:11:58PM +0000, Daniel P. Berrange wrote:
> QEMU is basically a fully-virt solution, so the XML description for the <os> block would be superficially similar to Xen HVM support, in that it will typically try to boot from the MBR of the first hard disk (unless you specify an alternate boot device). Unlike Xen HVM support, it can also have an explicit kernel, initrd & arguments specified via the command line. The other difference is that Xen has a separate binary for the loader vs the device model, while QEMU/KVM does it all in one binary.
>
> So, my initial thoughts are:
>
> 1. Use the <type> element within the <os> block to describe the CPU accelerator to use, accepting the values 'qemu', 'kvm', 'kqemu' or 'qvm86'.

Hum, there is also the <domain type='...'> value on the root element. I think it's relatively important to keep the differentiator early on. Moreover, since there is only a relatively limited set of values to pick from, it's good to have it in an attribute (it helps validation, say if you're using DTDs). The type in the <os> block, to me, indicates the type of OS (say linux, freebsd, or hvm in case of full virt). To be consistent I would keep os/type as hvm, because in any case this is full virtualization (or rather CPU emulation).

> 2. For CPU architecture, there are a couple of choices:
>
>    a) Add an 'arch' attribute to the <type> element to select between x86, ppc, sparc, etc.
>    b) Add an <arch> element within the <os> block to switch between architectures.
>    c) Just use the <loader> element to point to the QEMU binary matching the desired architecture.
>    d) Use the <emulator> element within the <devices> section to point to the QEMU binary matching the desired architecture.
>    e) Option a), and also allow use of <loader> to override the default QEMU binary.
>    f) Option b), and also allow use of <loader> to override the default QEMU binary.
>    g) Option a), and also allow use of <emulator> within the <devices> block to override the default QEMU binary.
>    h) Option b), and also allow use of <emulator> within the <devices> block to override the default QEMU binary.
>
> At this time, I'm leaning towards option e). This would give examples like:

Yes, I like this too; this is consistent with previous uses (and so eases validation), and again the arch is a limited list which can be enumerated statically, so having it as an attribute value is nice. IIRC though, qemu versus kqemu is not really noticeable at the command line level: if QEMU was compiled with kqemu support it will be on by default (if the module is loaded/loadable) and off otherwise; one can just disable it if one wants to avoid it when compiled in.

> * Using PPC with regular non-accelerated QEMU:
>     <os> <type arch="ppc">qemu</type> </os>
> * Using non-accelerated QEMU, and guest architecture matching the host:
>     <os> <type>qemu</type> </os>
> * Using accelerated KVM:
>     <os> <type>kvm</type> </os>
> * Using accelerated KVM, but with an alternate binary:
>     <os> <type>kvm</type> <loader>/home/berrange/work/kvm-devel/qemu-kvm</loader> </os>

I assume that libvirt will have default compiled-in values for the loaders of all supported types.

> 3. For machine type, again there are a couple of options:
>
>    a) Add a 'machine' attribute to the <type> element.
>    b) Add a <machine> element within the <os> block.
>
> For this I think we should just follow the same scheme that we use to specify architecture, so I'd lean towards a):
>
> * Using PPC with a Mac-99 machine:
>     <os> <type arch="ppc" machine="mac99">qemu</type> </os>
> * Using non-accelerated QEMU, guest architecture matching the host, and a non-PCI machine:
>     <os> <type machine="isapc">qemu</type> </os>

Yes, except that I would expect the top-level type attribute to define the engine used, and have os/type be hvm (or some common emulation value), but it's fine to have os/type/@arch and os/type/@machine as differentiators there.

> All the other bits of XML, like disk/network/console configuration, are pretty straightforward, following a very similar structure to Xen. So there isn't really much interesting stuff to discuss there.

Yeah, that should be fairly close, or easily mappable.

Daniel

--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard      | virtualization library  http://libvirt.org/
veillard@redhat.com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine  http://rpmfind.net/

On Wed, Jan 03, 2007 at 10:35:47AM -0500, Daniel Veillard wrote:
> On Wed, Jan 03, 2007 at 03:11:58PM +0000, Daniel P. Berrange wrote:
> > QEMU is basically a fully-virt solution, so the XML description for the <os> block would be superficially similar to Xen HVM support, in that it will typically try to boot from the MBR of the first hard disk (unless you specify an alternate boot device). Unlike Xen HVM support, it can also have an explicit kernel, initrd & arguments specified via the command line. The other difference is that Xen has a separate binary for the loader vs the device model, while QEMU/KVM does it all in one binary.
> >
> > So, my initial thoughts are:
> >
> > 1. Use the <type> element within the <os> block to describe the CPU accelerator to use, accepting the values 'qemu', 'kvm', 'kqemu' or 'qvm86'.
>
> Hum, there is also the <domain type='...'> value on the root element. I think it's relatively important to keep the differentiator early on. Moreover, since there is only a relatively limited set of values to pick from, it's good to have it in an attribute (it helps validation, say if you're using DTDs).

Yes, I forgot to mention I'd already assumed the 'type' attribute on the top-level <domain> tag would be 'qemu' to indicate the qemu driver.

> The type in the <os> block, to me, indicates the type of OS (say linux, freebsd, or hvm in case of full virt). To be consistent I would keep os/type as hvm, because in any case this is full virtualization (or rather CPU emulation).

It just feels a little wrong to use 'hvm' (Hardware Virtual Machine) as the term for qemu, in which only the kvm variant is actually hardware accelerated, with all the others just being software emulated. As per my other reply to Jeremy, I felt <type> to be referring to the virtualization strategy for the backend, and so implementation specific. For the Xen world we have 'linux' or 'hvm', while for the QEMU world we have the different 'qemu', 'kvm', 'kqemu' options.

> > 2. For CPU architecture, there are a couple of choices:
> >
> >    a) Add an 'arch' attribute to the <type> element to select between x86, ppc, sparc, etc.
> >    b) Add an <arch> element within the <os> block to switch between architectures.
> >    c) Just use the <loader> element to point to the QEMU binary matching the desired architecture.
> >    d) Use the <emulator> element within the <devices> section to point to the QEMU binary matching the desired architecture.
> >    e) Option a), and also allow use of <loader> to override the default QEMU binary.
> >    f) Option b), and also allow use of <loader> to override the default QEMU binary.
> >    g) Option a), and also allow use of <emulator> within the <devices> block to override the default QEMU binary.
> >    h) Option b), and also allow use of <emulator> within the <devices> block to override the default QEMU binary.
> >
> > At this time, I'm leaning towards option e). This would give examples like:
>
> Yes, I like this too; this is consistent with previous uses (and so eases validation), and again the arch is a limited list which can be enumerated statically, so having it as an attribute value is nice. IIRC though, qemu versus kqemu is not really noticeable at the command line level: if QEMU was compiled with kqemu support it will be on by default (if the module is loaded/loadable) and off otherwise; one can just disable it if one wants to avoid it when compiled in.

Well, yes, you need the '--no-kqemu' command line argument to turn off the kqemu acceleration. We definitely need to support both having it on and off, because some OSes will crash badly with kqemu due to emulation bugs.
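
As a rough illustration of the on/off switch described here, the requested mode could be mapped to extra command-line arguments as below. This is a sketch only: accel_args is a hypothetical helper, and it assumes the qemu binary in use was built with kqemu support, so that the '--no-kqemu' argument mentioned above is recognised.

```python
def accel_args(accel):
    """Extra QEMU arguments for a requested mode (hypothetical helper)."""
    if accel == "qemu":
        # Plain emulation: explicitly switch kqemu off.
        return ["--no-kqemu"]
    if accel == "kqemu":
        # Per the discussion above, kqemu is used by default when it is
        # compiled in and the module is loaded, so nothing extra is needed.
        return []
    # Other modes (eg 'kvm') are not covered by this sketch.
    raise ValueError("unhandled acceleration mode: %r" % accel)


print(accel_args("qemu"))
# prints ['--no-kqemu']
```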

> > * Using PPC with regular non-accelerated QEMU:
> >     <os> <type arch="ppc">qemu</type> </os>
> > * Using non-accelerated QEMU, and guest architecture matching the host:
> >     <os> <type>qemu</type> </os>
> > * Using accelerated KVM:
> >     <os> <type>kvm</type> </os>
> > * Using accelerated KVM, but with an alternate binary:
> >     <os> <type>kvm</type> <loader>/home/berrange/work/kvm-devel/qemu-kvm</loader> </os>
>
> I assume that libvirt will have default compiled-in values for the loaders of all supported types.

Yes, there'd either be a list of predefined binaries for each architecture, or we can generate binary names with some reasonable naming scheme, so we'd only need the explicit <loader> option if you wanted to point to an instance outside of the standard /usr/bin, /bin, or $PATH locations.
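
The lookup described above might be sketched as follows. The helper name find_emulator, its parameters, and the 'qemu-system-<arch>' naming scheme are assumptions made for illustration; the thread does not fix a naming scheme. The exists parameter is only there so the search can be exercised without real binaries installed.

```python
import os


def find_emulator(arch, loader=None, search_dirs=None, exists=os.path.exists):
    """Locate a QEMU binary for 'arch' (hypothetical helper)."""
    if loader:
        # An explicit <loader> always wins, wherever it points.
        return loader

    # Assumed naming scheme for generated binary names.
    name = "qemu-system-" + arch

    if search_dirs is None:
        # Standard locations first, then whatever is on $PATH.
        path_dirs = os.environ.get("PATH", "").split(os.pathsep)
        search_dirs = ["/usr/bin", "/bin"] + path_dirs

    for directory in search_dirs:
        if not directory:
            continue
        candidate = os.path.join(directory, name)
        if exists(candidate):
            return candidate

    # Nothing found: the caller would have to ask for an explicit <loader>.
    return None
```

For example, find_emulator("ppc", loader="/home/berrange/work/kvm-devel/qemu-kvm") returns the loader path unchanged, without searching any directory.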

> > 3. For machine type, again there are a couple of options:
> >
> >    a) Add a 'machine' attribute to the <type> element.
> >    b) Add a <machine> element within the <os> block.
> >
> > For this I think we should just follow the same scheme that we use to specify architecture, so I'd lean towards a):
> >
> > * Using PPC with a Mac-99 machine:
> >     <os> <type arch="ppc" machine="mac99">qemu</type> </os>
> > * Using non-accelerated QEMU, guest architecture matching the host, and a non-PCI machine:
> >     <os> <type machine="isapc">qemu</type> </os>
>
> Yes, except that I would expect the top-level type attribute to define the engine used, and have os/type be hvm (or some common emulation value), but it's fine to have os/type/@arch and os/type/@machine as differentiators there.

I thought the top-level 'type' attribute referred to the name of the libvirt driver backend, so it would always be 'qemu' in this case. Hence I thought of using the <type> element to specify the acceleration method.

Regards,
Dan.

On Wed, Jan 03, 2007 at 04:06:29PM +0000, Daniel P. Berrange wrote:
> On Wed, Jan 03, 2007 at 10:35:47AM -0500, Daniel Veillard wrote:
> > On Wed, Jan 03, 2007 at 03:11:58PM +0000, Daniel P. Berrange wrote:
> > > QEMU is basically a fully-virt solution, so the XML description for the <os> block would be superficially similar to Xen HVM support, in that it will typically try to boot from the MBR of the first hard disk (unless you specify an alternate boot device). Unlike Xen HVM support, it can also have an explicit kernel, initrd & arguments specified via the command line. The other difference is that Xen has a separate binary for the loader vs the device model, while QEMU/KVM does it all in one binary.
> > >
> > > So, my initial thoughts are:
> > >
> > > 1. Use the <type> element within the <os> block to describe the CPU accelerator to use, accepting the values 'qemu', 'kvm', 'kqemu' or 'qvm86'.
> >
> > Hum, there is also the <domain type='...'> value on the root element. I think it's relatively important to keep the differentiator early on. Moreover, since there is only a relatively limited set of values to pick from, it's good to have it in an attribute (it helps validation, say if you're using DTDs).
>
> Yes, I forgot to mention I'd already assumed the 'type' attribute on the top-level <domain> tag would be 'qemu' to indicate the qemu driver.

The problem is that they don't all share the qemu driver; even Xen HVM uses qemu. To me the type in the top-level domain element should be sufficient to tell what virtualization technology is used. It's basically the big switch telling what is allowed or not inside the XML description. I really prefer to have as complete information there as possible.

> > The type in the <os> block, to me, indicates the type of OS (say linux, freebsd, or hvm in case of full virt). To be consistent I would keep os/type as hvm, because in any case this is full virtualization (or rather CPU emulation).
>
> It just feels a little wrong to use 'hvm' (Hardware Virtual Machine) as the term for qemu,

Okay, use another terminology there if you want.

> in which only the kvm variant is actually hardware accelerated, with all the others just being software emulated. As per my other reply to Jeremy, I felt <type> to be referring to the virtualization strategy for the backend, and so implementation specific. For the Xen world we have 'linux' or 'hvm', while for the QEMU world we have the different 'qemu', 'kvm', 'kqemu' options.

And 6 months from now people will start developing special paravirt drivers and kernels for KVM (those kernel guys won't resist gaining 5% of performance sooner or later :-) and then you will run a specific OS type in a kvm domain:

    <domain type="kvm">
      <os>
        <type>linux</type>

will then be just fine. But if we override the os/type value now with something which is not related to the running OS, we will lose consistency later. Sorry, I'm still disagreeing; I really prefer to see

    <domain type="qemu">
    <domain type="kvm">
    <domain type="kqemu">

and keep a generic value in <os> <type> as long as the actual OS is indifferent, but still have provision to indicate a specific one there later if needed.

> > Yes, I like this too; this is consistent with previous uses (and so eases validation), and again the arch is a limited list which can be enumerated statically, so having it as an attribute value is nice. IIRC though, qemu versus kqemu is not really noticeable at the command line level: if QEMU was compiled with kqemu support it will be on by default (if the module is loaded/loadable) and off otherwise; one can just disable it if one wants to avoid it when compiled in.
>
> Well, yes, you need the '--no-kqemu' command line argument to turn off the kqemu acceleration. We definitely need to support both having it on and off, because some OSes will crash badly with kqemu due to emulation bugs.

:-)

> > I assume that libvirt will have default compiled-in values for the loaders of all supported types.
>
> Yes, there'd either be a list of predefined binaries for each architecture, or we can generate binary names with some reasonable naming scheme, so we'd only need the explicit <loader> option if you wanted to point to an instance outside of the standard /usr/bin, /bin, or $PATH locations.

okay

> > Yes, except that I would expect the top-level type attribute to define the engine used, and have os/type be hvm (or some common emulation value), but it's fine to have os/type/@arch and os/type/@machine as differentiators there.
>
> I thought the top-level 'type' attribute referred to the name of the libvirt driver backend, so it would always be 'qemu' in this case. Hence I thought of using the <type> element to specify the acceleration method.

Hum, the driver back-end is an internal mapping problem; we actually already have 4 back-ends for Xen. I prefer to think of the XML in terms of user data rather than an export of libvirt internals, which anyway have been subject to many changes and will continue to be. The key is preserving the sanity of the interface despite the changes, trying to be as consistent for the user as possible, even if it messes up stuff internally a bit :-)

Daniel

On Wed, Jan 03, 2007 at 11:29:48AM -0500, Daniel Veillard wrote:
> On Wed, Jan 03, 2007 at 04:06:29PM +0000, Daniel P. Berrange wrote:
> > accelerated, with all the others just being software emulated. As per my other reply to Jeremy, I felt <type> to be referring to the virtualization strategy for the backend, and so implementation specific. For the Xen world we have 'linux' or 'hvm', while for the QEMU world we have the different 'qemu', 'kvm', 'kqemu' options.
>
> And 6 months from now people will start developing special paravirt drivers and kernels for KVM (those kernel guys won't resist gaining 5% of performance sooner or later :-) and then you will run a specific OS type in a kvm domain:
>
>     <domain type="kvm">
>       <os>
>         <type>linux</type>
>
> will then be just fine. But if we override the os/type value now with something which is not related to the running OS, we will lose consistency later. Sorry, I'm still disagreeing; I really prefer to see
>
>     <domain type="qemu">
>     <domain type="kvm">
>     <domain type="kqemu">
>
> and keep a generic value in <os> <type> as long as the actual OS is indifferent, but still have provision to indicate a specific one there later if needed.

Ok, given the possibility of doing a paravirt guest using QEMU+KVM + paravirt_ops, it does sound reasonable to use the top-level 'type' attribute to specify the different virtualization approaches.

Regards,
Dan.
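
The scheme the thread converges on (engine in the top-level 'type' attribute, OS type in <os><type>, with 'arch' and 'machine' as attributes) could be read back along these lines. This is a sketch under assumptions: parse_domain and DOMAIN_TYPES are hypothetical names, the accepted engine values are just the three listed in the discussion, and the generic os/type value is left unchecked because its name ('hvm' or something else) was not settled.

```python
import xml.etree.ElementTree as ET

# Engines named in the discussion; a fixed set like this is what makes
# attribute-level validation straightforward.
DOMAIN_TYPES = {"qemu", "kqemu", "kvm"}


def parse_domain(xml_text):
    """Extract engine, OS type, arch and machine (hypothetical helper)."""
    dom = ET.fromstring(xml_text)
    engine = dom.get("type")
    if engine not in DOMAIN_TYPES:
        raise ValueError("unsupported domain type: %r" % engine)

    type_el = dom.find("os/type")
    if type_el is None:
        raise ValueError("missing <os><type> element")

    return {
        "engine": engine,
        "os_type": (type_el.text or "").strip(),
        "arch": type_el.get("arch"),        # None means "same as host"
        "machine": type_el.get("machine"),  # None means QEMU's default
    }


print(parse_domain('<domain type="kvm"><os><type>linux</type></os></domain>'))
# prints {'engine': 'kvm', 'os_type': 'linux', 'arch': None, 'machine': None}
```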