[Libvir] Fix handling of HVM boot parameters

Jeremy mentioned that it looked like libvirt wasn't able to create an HVM domain configured to boot off cdrom, so I took a closer look at the code and indeed the code dealing with the <boot> section was both incomplete and just plain broken. Incomplete in so much as it never included details of the ISO file backing the CDROM, and broken in so much as it was doing string comparisons against the wrong variables. Digging further found lots more work relating to creation of HVM domains, so I've had a go at writing a patch to resolve matters.

 * Parsing of UUID from SEXPR assumed that the UUID fetched would always have four '-' in the usual places. Well, when you run 'dumpxml' from virsh the <uuid> element has the UUID encoded without any '-' chars. If you feed this back into 'virsh create' and then once again run 'dumpxml', parsing will fail & libvirt throws errors. So this patch allows for parsing of UUIDs without '-'.

 * The XML document size was limited to 1k - we just 'malloc(1000)'. While this was enough for common cases, if someone creates lots of disk or network devices this would overflow and libvirt would return an error. So I increased it to 4k, which ought to be enough for the foreseeable future - in any case my previous patch to fetch XML via the proxy is limited to 4k in size too.

Now onto the fine details...

When converting SEXPR into XML the current code is doing the following:

   (boot a)  ->  <boot dev='/dev/fd0'/>
   (boot c)  ->  <boot dev='hda'/>
   (boot d)  ->  <boot dev='/dev/cdrom'/>

This is rather inconsistent - the 'hda' is intended to map to an entry in the <devices> block. The '/dev/fd0' and '/dev/cdrom' entries did not map to anything.

Meanwhile, when converting from XML to SEXPR it is doing the following:

   <boot dev='hda'/>  ->  (boot a)
   <boot dev='hdd'/>  ->  (boot d)
   <boot dev='*'/>    ->  (boot c)

Obviously this sucks because these processes should be matching each other.

Secondly, the (image (hvm....)) SEXPR has three entries for defining the ISO / disk image file backing the CDROM / Floppy devices. These were just being ignored, rather than turned into <disk> entries within the <devices> block. Similarly there was no way to express a <disk> entry for CDROM/Floppy in the XML when creating a domain.

The upshot of all this is that although the last release of libvirt included HVM support it was basically unusable for domain creation unless you were using HD to boot. The XML returned was also incorrect.

Now the good news. Since it was sooo broken, we can fix it without worrying about XML compatibility since there is no way any application could be relying on it in its current state.

Now before I discuss the solution, one final point. The Xen HVM device model allows specifying the boot device in terms of 'a', 'c', 'd' which have the meaning:

   a - first connected floppy device
   c - first connected harddrive
   d - first connected cdrom

In particular it does *not* allow for expressing the boot device in terms of an explicit disk device - you can't say boot off 'hdb' or 'hdc'. So the current libvirt code which tries to fake such semantics is doomed to failure. This isn't really too bad IMHO - VMware has these same semantics, so does QEMU and so do normal bare metal BIOSes.

Thus in the patch I have attached I have implemented the following mapping:

   (boot a)  <->  <boot dev='fd'/>
   (boot c)  <->  <boot dev='hd'/>
   (boot d)  <->  <boot dev='cdrom'/>

The other part of the patch is to deal with definition of the floppy and cdrom device backing files. For this I have done the following:

   (image (hvm (fda /root/diskboot.img)))

   <devices>
     <disk type='file'>
       <source file='/root/diskboot.img'/>
       <target dev='fda'/>
     </disk>
   </devices>

And similar for 'fdb'. Then for cdroms:

   (image (hvm (cdrom /root/image.iso)))

   <devices>
     <disk type='file'>
       <source file='/root/image.iso'/>
       <target dev='cdrom'/>
     </disk>
   </devices>

The patch has a little bit of logic such that when converting the <devices> block back into an SEXPR it filters out the disk entries with a dev of 'fda', 'fdb' and 'cdrom' since they need to end up in a different part of the SEXPR.

Finally, although PV guests automatically get a serial console created for them (and I don't think you can turn it off), HVM guests need to have a serial console explicitly requested. Since I recently added support in the DumpXML method for including details of the serial console like this:

   <devices>
     <console tty="/dev/pts/5"/>
   </devices>

I decided to leverage this same structure when creating HVM domains. So if you have:

   <devices>
     <console/>
   </devices>

Then the SEXPR sent to XenD will include

   (image (hvm (serial pty)))

Which enables allocation of a PseudoTTY for the HVM's serial console.

I have tested that with this patch I can successfully create a HVM domain which boots off a floppy, harddrive or cdrom. Furthermore if you then dump the XML of this domain, the XML you get back will match the XML you fed in (with the obvious exception of domain ID, and the Pseudo TTY path).

If you have been monitoring xen-devel mailing lists you'll be aware that in 3.0.3 the way CDROM devices are configured is changing:

   http://lists.xensource.com/archives/html/xen-devel/2006-08/msg00369.html

Although the patch attached does not support the config outlined in that mail, I'm pretty confident that a small incremental patch will be able to support it without breaking compatibility with the changes I've outlined in this mail. The only tricky bit will be that we need to detect whether libvirt is running against a 3.0.2 or 3.0.3 version of XenD to decide how to convert XML -> SEXPR & vice versa.

Regards,
Dan.
--
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
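
The symmetric boot mapping described in the mail above can be sketched in a few lines. This is only an illustration in Python, not the code from the attached patch (libvirt itself is written in C); the function and table names are hypothetical, and only the letter/name pairs come from the mail.

```python
# Illustrative sketch only: names are hypothetical, the pairs are from the mail.
SEXPR_TO_XML_BOOT = {"a": "fd", "c": "hd", "d": "cdrom"}
# Build the reverse table from the forward one so the two can never disagree.
XML_TO_SEXPR_BOOT = {dev: letter for letter, dev in SEXPR_TO_XML_BOOT.items()}


def boot_sexpr_to_xml(letter):
    """Turn the letter from '(boot X)' into a <boot> element string."""
    if letter not in SEXPR_TO_XML_BOOT:
        raise ValueError("unknown boot letter: %r" % letter)
    return "<boot dev='%s'/>" % SEXPR_TO_XML_BOOT[letter]


def boot_xml_to_sexpr(dev):
    """Turn the dev attribute of <boot> into a '(boot X)' SEXPR string."""
    if dev not in XML_TO_SEXPR_BOOT:
        raise ValueError("unknown boot device: %r" % dev)
    return "(boot %s)" % XML_TO_SEXPR_BOOT[dev]
```

Deriving one table from the other is the point of the sketch: the old code kept two unrelated sets of comparisons, which is how the two directions drifted apart.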

On Wed, Aug 09, 2006 at 11:54:26PM +0100, Daniel P. Berrange wrote:
Jeremy mentioned that it looked like libvirt wasn't able to create an HVM domain configured to boot off cdrom, so I took a closer look at the code and indeed the code dealing with <boot> section was both incomplete, and just plain broken. Incomplete in so much as it never included details of the ISO file backing the CDROM, and broken in so much as it was doing string comparisons against the wrong variables. Digging further found lots more work relating to creation of HVM domains so I've had a go at writing a patch to resolve matters. [snip] I have tested that with this patch I can successfully create a HVM domain which boots off a floppy, harddrive or cdrom. Furthermore if you then dump the XML of this domain,the XML you get back will match the XML you fed in (with the obvious exception of domain ID, and the Pseudo TTY path).
I meant to include a complete example XML doc showing the changes in place, so here is a XML dump from a HVM domain which has been booted off a CDROM:

   <domain type='xen' id='9'>
     <name>too</name>
     <uuid>b5d70dd275cdaca517769660b059d8bc</uuid>
     <os>
       <type>hvm</type>
       <loader>/usr/lib/xen/boot/hvmloader</loader>
       <boot dev='cdrom'/>
     </os>
     <memory>409600</memory>
     <vcpu>1</vcpu>
     <on_poweroff>destroy</on_poweroff>
     <on_reboot>restart</on_reboot>
     <on_crash>restart</on_crash>
     <devices>
       <emulator>/usr/lib64/xen/bin/qemu-dm</emulator>
       <interface type='bridge'>
         <source bridge='xenbr0'/>
         <mac address='00:16:3e:1b:b1:47'/>
         <script path='vif-bridge'/>
       </interface>
       <disk type='file'>
         <source file='/root/foo.img'/>
         <target dev='ioemu:hda'/>
       </disk>
       <disk type='file'>
         <source file='/root/boot.iso'/>
         <target dev='cdrom'/>
       </disk>
       <graphics type='vnc' port='5909'/>
       <console tty='/dev/pts/3'/>
     </devices>
   </domain>
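
The <uuid> in this dump is written without any '-' characters, which is the form the earlier parser rejected. A minimal sketch of a parser that accepts both spellings follows; it is an illustration in Python with a hypothetical function name, not the C code from the patch.

```python
import re


def parse_uuid(text):
    """Return the 16 raw bytes of a UUID written with or without '-'.

    Simplification for the sketch: dashes are dropped wherever they occur,
    so their positions are not checked.
    """
    hexdigits = text.strip().replace("-", "")
    if not re.fullmatch(r"[0-9a-fA-F]{32}", hexdigits):
        raise ValueError("not a UUID: %r" % text)
    return bytes.fromhex(hexdigits)
```

With this, the dashed and dash-less forms of the same UUID parse to the same bytes, so feeding 'dumpxml' output back into 'create' would round-trip.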
If you have been monitoring xen-devel mailing lists you'll be aware that in 3.0.3 the way CDROM devices are configured is changing:
http://lists.xensource.com/archives/html/xen-devel/2006-08/msg00369.html
Although the patch attached does not support the config outlined in that mail, I'm pretty confident that a small incremental patch will be able to support it without breaking compatability with the changes I've outlined in this mail. The only tricky bit will be that we need to detect whether libvirt is running against a 3.0.2 or 3.0.3 version of XenD to decide how to convert XML -> SEXPR & vica verca.
Jeremy also just posted a patch to xen-devel allowing multiple boot devices to be specified:

   http://lists.xensource.com/archives/html/xen-devel/2006-08/msg00576.html

So you can do the classic installer use case of "Try harddisk, if no boot sector, then fallback to [installation] cdrom".

It would seem the obvious way to express this in libvirt XML would be to allow multiple <boot> elements, with their ordering translating to the boot ordering. So for example that use case would be expressed as:

   <domain type='xen' id='9'>
     <name>too</name>
     <uuid>b5d70dd275cdaca517769660b059d8bc</uuid>
     <os>
       <type>hvm</type>
       <loader>/usr/lib/xen/boot/hvmloader</loader>
       <boot dev='hd'/>
       <boot dev='cdrom'/>
     </os>

The other (non-compatible) change would be to allow nested device entries:

   <domain type='xen' id='9'>
     <name>too</name>
     <uuid>b5d70dd275cdaca517769660b059d8bc</uuid>
     <os>
       <type>hvm</type>
       <loader>/usr/lib/xen/boot/hvmloader</loader>
       <boot>
         <dev type='hd'/>
         <dev type='cdrom'/>
       </boot>
     </os>

I'm inclined to just go for the former.

Regards,
Dan.
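
To make the first proposal concrete, here is a small Python sketch that reads the <boot> elements in document order and produces a boot-order string. Two things are assumptions rather than facts from this thread: that the Xen side takes the letters concatenated (for example 'cd'), and the function name boot_order.

```python
import xml.etree.ElementTree as ET

# Same pairs as the single-device mapping proposed earlier in the thread.
BOOT_LETTERS = {"fd": "a", "hd": "c", "cdrom": "d"}


def boot_order(domain_xml):
    """Return boot letters in the order the <boot> elements appear under <os>.

    Assumption: multiple devices are expressed as concatenated letters.
    """
    root = ET.fromstring(domain_xml)
    # findall() returns matches in document order, which carries the priority.
    return "".join(BOOT_LETTERS[b.get("dev")] for b in root.findall("./os/boot"))
```

A document with a single <boot> element still works and yields a single letter, so existing XML would not need to change.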

On Thu, 2006-08-10 at 01:00 +0100, Daniel P. Berrange wrote:
I meant to include a complete example XML doc showing the changes in place, so here is a XML dump from a HVM domain which has been booted off a CDROM: [snip] <disk type='file'> <source file='/root/foo.img'/> <target dev='ioemu:hda'/> </disk>
Given what we know is coming, does it make sense to drop the ioemu: here and just have it be implied for HVM guests? Accept it if it's there (and then drop it if we're on xend 3.0.3), but not really show it? Then again, not 100% sure how all of this is going to interact when we start having PV drivers for HVM guests :-/
<disk type='file'> <source file='/root/boot.iso'/> <target dev='cdrom'/> </disk>
Similarly, instead of target dev='cdrom', does it make more sense to have a devicetype (or something) that's an attribute of the disk rather than a magic device? Jeremy

On Wed, Aug 09, 2006 at 09:33:11PM -0400, Jeremy Katz wrote:
On Thu, 2006-08-10 at 01:00 +0100, Daniel P. Berrange wrote:
I meant to include a complete example XML doc showing the changes in place, so here is a XML dump from a HVM domain which has been booted off a CDROM: [snip] <disk type='file'> <source file='/root/foo.img'/> <target dev='ioemu:hda'/> </disk>
Given what we know is coming, does it make sense to drop the ioemu: here and just have it be implied for HVM guests? Accept it if it's there (and then drop it if we're on xend 3.0.3), but not really show it?
Sounds sensible. The problem is detecting the version of xend. Of course you can ask xend, and you will get the exact version of the compiler used to compile it, but when it comes to the xen version itself you only get

   (xen_major 3) (xen_minor 0) (xen_extra -unstable)

which makes it a bit hard to distinguish 3.0.2 from 3.0.3 :-\ We could try to use the changeset but it's not available in our build either. Still, in spite of this I would rather not bury in the format an exotic labelling which we know will be ignored (or break things) later.
Then again, not 100% sure how all of this is going to interact when we start having PV drivers for HVM guests :-/
<disk type='file'> <source file='/root/boot.iso'/> <target dev='cdrom'/> </disk>
Similarly, instead of target dev='cdrom', does it make more sense to have a devicetype (or something) that's an attribute of the disk rather than a magic device?
There is the read-only attribute. For example UML has no specific way to indicate an emulated CD-ROM, there is just a read-only command line flag.

   <disk type='file'>
     <source file='/root/boot.iso'/>
     <target dev='hdc'/>
     <readonly/>
   </disk>

After all since we don't have hardware to tell us what kind of device it is, it is really a matter of what kind of accesses are allowed. How it is mapped underneath depends on the engine used, but should probably not affect the XML format.

Daniel
--
Red Hat Virtualization group http://redhat.com/virtualization/
Daniel Veillard | virtualization library http://libvirt.org/
veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

On Thu, 2006-08-10 at 05:08 -0400, Daniel Veillard wrote:
On Wed, Aug 09, 2006 at 09:33:11PM -0400, Jeremy Katz wrote:
On Thu, 2006-08-10 at 01:00 +0100, Daniel P. Berrange wrote:
I meant to include a complete example XML doc showing the changes in place, so here is a XML dump from a HVM domain which has been booted off a CDROM: [snip] <disk type='file'> <source file='/root/foo.img'/> <target dev='ioemu:hda'/> </disk>
Given what we know is coming, does it make sense to drop the ioemu: here and just have it be implied for HVM guests? Accept it if it's there (and then drop it if we're on xend 3.0.3), but not really show it?
Sound sensible, the problem is detecting the version of xend, of course you can ask xend, you will get the exact version of the compiler used to compile it, but when it comes to xen version itself (xen_major 3) (xen_minor 0) (xen_extra -unstable) which makes things a bit hard to distinguish 3.0.2 from 3.0.3 :-\ We could try to use the changest but it's not available in our build either.
Yeah, unfortunately, this is just going to be a general problem :(
Then again, not 100% sure how all of this is going to interact when we start having PV drivers for HVM guests :-/
<disk type='file'> <source file='/root/boot.iso'/> <target dev='cdrom'/> </disk>
Similarly, instead of target dev='cdrom', does it make more sense to have a devicetype (or something) that's an attribute of the disk rather than a magic device?
There is the read-only attribute. For example UML has no specific way to indicate an emulated CD-ROM, there is just a read-only command line flag.
<disk type='file'> <source file='/root/boot.iso'/> <target dev='hdc'/> <readonly/> </disk>
After all since we don't have hardware to tell us what kind of device it is, it is really a matter of what kind of accesses are allowed. How it is mapped underneath depends on the engine used, but should probably not affect the XML format.
But read-only isn't all that you want -- think about giving access to a CD-R drive. It's not read-only, but we still need to have it exposed as a CD device. And with things like the bios for qemu and HVM guests, if a device is a CD-ROM or a hard drive makes a large difference.

Thinking out loud, what if we went with something like

   <cdrom type='file'>
     <source file='/root/boot.iso'/>
     <target dev='hdc'/>
   </cdrom>

for CDs and then similarly <floppy .../> for floppies

Jeremy

On Thu, Aug 10, 2006 at 08:02:20AM -0400, Jeremy Katz wrote:
On Thu, 2006-08-10 at 05:08 -0400, Daniel Veillard wrote:
There is the read-only attribute. For example UML has no specific way to indicate an emulated CD-ROM, there is just a read-only command line flag.
<disk type='file'> <source file='/root/boot.iso'/> <target dev='hdc'/> <readonly/> </disk>
After all since we don't have hardware to tell us what kind of device it is, it is really a matter of what kind of accesses are allowed. How it is mapped underneath depends on the engine used, but should probably not affect the XML format.
But read-only isn't all that you want -- think about giving access to a CD-R drive. It's not read-only, but we still need to have it exposed as a CD device. And with things like the bios for qemu and HVM guests, if a device is a CD-ROM or a hard drive makes a large difference.
Thinking out loud, what if we went with something like <cdrom type='file'> <source file='/root/boot.iso'/> <target dev='hdc'/> </cdrom> for CDs and then similarly <floppy .../> for floppies
I wouldn't do this for CDROMs, since they basically share the same device namespace as disks already - with newer versions of Xen / QEMU any of hda -> hdd can be labelled as a cdrom by appending :cdrom - so they're best handled under the same XML tag as disks.

For floppy disks though we could certainly have a separate <floppy> tag name instead of <disk> - it would be clearer than distinguishing based on the value of the 'dev' attribute.

Regards,
Dan.

On Thu, Aug 10, 2006 at 02:12:26PM +0100, Daniel P. Berrange wrote:
On Thu, Aug 10, 2006 at 08:02:20AM -0400, Jeremy Katz wrote:
But read-only isn't all that you want -- think about giving access to a CD-R drive. It's not read-only, but we still need to have it exposed as a CD device. And with things like the bios for qemu and HVM guests, if a device is a CD-ROM or a hard drive makes a large difference.
Thinking out loud, what if we went with something like <cdrom type='file'> <source file='/root/boot.iso'/> <target dev='hdc'/> </cdrom> for CDs and then similarly <floppy .../> for floppies
I wouldn't do this for CDROMs, since they basically share the same device namespace as disks already - with versions Xen / QEMU any hda -> hdd can be labelled as a cdrom by appending :cdrom - so they're best handled under same XML tag as disks
For floppy disks though we could certainly have a separate <floppy> tag name instead of <disk> - it would be clearer than distinguishing based on the value of the 'dev' attribute.
Actually I take that back. There is a potentially never ending list of different disk interfaces (IDE, FD, SCSI, XVDA) - I don't think we really need to distinguish between them by having separate <floppy>, <disk>, <cdrom> tags, since the value of the 'dev' attribute is always unique.

Dan.

On Thu, Aug 10, 2006 at 04:08:30PM +0100, Daniel P. Berrange wrote:
On Thu, Aug 10, 2006 at 02:12:26PM +0100, Daniel P. Berrange wrote:
On Thu, Aug 10, 2006 at 08:02:20AM -0400, Jeremy Katz wrote:
But read-only isn't all that you want -- think about giving access to a CD-R drive. It's not read-only, but we still need to have it exposed as a CD device. And with things like the bios for qemu and HVM guests, if a device is a CD-ROM or a hard drive makes a large difference.
Thinking out loud, what if we went with something like <cdrom type='file'> <source file='/root/boot.iso'/> <target dev='hdc'/> </cdrom> for CDs and then similarly <floppy .../> for floppies
I wouldn't do this for CDROMs, since they basically share the same device namespace as disks already - with versions Xen / QEMU any hda -> hdd can be labelled as a cdrom by appending :cdrom - so they're best handled under same XML tag as disks
For floppy disks though we could certainly have a separate <floppy> tag name instead of <disk> - it would be clearer than distinguishing based on the value of the 'dev' attribute.
Actually I take that back. There is a potentially never ending list of different disk interfaces (IDE, FD, SCSI, XVDA) - I don't think we really need to dstingiush between them by having separate <floppy> <disk>, <cdrom> tags, since the value of the 'dev' attribute is always unique.
Well hdc could as well be an ide disk or an ide CD-Rom or an ide CD-RW. I thought that the readonly element should be sufficient at the emulation level, but Jeremy doesn't think so. Making new element names ain't good from an XML perspective. But we could hint at the expected kind of device as an extra attribute on <disk>:

   <disk type='file' device='cdrom'>
     <source file='/root/boot.iso'/>
     <target dev='hdc'/>
     <readonly/>
   </disk>

The problem would then be to maintain a list of supported device values, the default being a disk if the attribute is omitted. I think that would be relatively sane at the XML level; the lower layer may or may not be able to make sense of it (a UML backend would not, for example) depending on how the devices are emulated.

Daniel

On Thu, Aug 10, 2006 at 11:37:33AM -0400, Daniel Veillard wrote:
On Thu, Aug 10, 2006 at 04:08:30PM +0100, Daniel P. Berrange wrote:
Actually I take that back. There is a potentially never ending list of different disk interfaces (IDE, FD, SCSI, XVDA) - I don't think we really need to dstingiush between them by having separate <floppy> <disk>, <cdrom> tags, since the value of the 'dev' attribute is always unique.
Well hdc could as well be an ide disk or an ide CD-Rom or an ide CD-RW. I though that the readonly element should be sufficient at the emulation level, but Jeremy don't think so. making new element name
<readonly> is fine if we also have 'device=cdrom' attribute because we can reliably map back & forth with the latter. <readonly> will be needed when the new CDROM device model appears in xen 3.0.3 because that has space for an explicit 'mode' flag in the device SEXPR to distinguish 'r' and 'rw'.
ain't good from an XML perspective. But we could hint at the expected kind of device as an extra attribute on <disk>,
<disk type='file' device='cdrom'> <source file='/root/boot.iso'/> <target dev='hdc'/> <readonly/> </disk>
the problem would then to maintain a list of supported device values the default being a disk if the attribute is ommited.
Well, HAL has a similar property it assigns to storage devices, 'device_type':

   http://webcvs.freedesktop.org/hal/hal/doc/spec/hal-spec.html?view=co&pathrev=HEAD#device-properties-storage

They currently enumerate 'disk', 'cdrom', 'floppy', 'tape', and a bunch of flash memory types: 'compact_flash', 'memory_stick', 'smart_media', 'sd_mmc'.

We could just adopt the first three options, and say 'disk' is the default if it is omitted. Trying to get into finer detail like 'dvd', 'cdr', 'cdrw' is doomed to failure because there are just so many combinations, and many drives support many cd* variants at once. The 'disk', 'cdrom' and 'floppy' options are trivial to map to SEXPR in both directions.

Dan.
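
As a sketch of how the proposed 'device' attribute could be read, with 'disk' as the default and only the three values suggested above accepted, the following Python may help. The function name is hypothetical and this is not libvirt's actual parser.

```python
import xml.etree.ElementTree as ET

# The three values proposed above; anything finer grained is out of scope.
DISK_DEVICES = ("disk", "cdrom", "floppy")


def disk_device(disk_xml):
    """Return the device kind of a <disk> element, defaulting to 'disk'."""
    disk = ET.fromstring(disk_xml)
    device = disk.get("device", "disk")  # attribute omitted -> plain disk
    if device not in DISK_DEVICES:
        raise ValueError("unsupported disk device: %r" % device)
    return device
```

Rejecting unknown values outright, rather than silently treating them as 'disk', keeps a typo in the XML from turning a CDROM into a hard drive.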

On Thu, Aug 10, 2006 at 05:08:58AM -0400, Daniel Veillard wrote:
On Wed, Aug 09, 2006 at 09:33:11PM -0400, Jeremy Katz wrote:
On Thu, 2006-08-10 at 01:00 +0100, Daniel P. Berrange wrote:
I meant to include a complete example XML doc showing the changes in place, so here is a XML dump from a HVM domain which has been booted off a CDROM: [snip] <disk type='file'> <source file='/root/foo.img'/> <target dev='ioemu:hda'/> </disk>
Given what we know is coming, does it make sense to drop the ioemu: here and just have it be implied for HVM guests? Accept it if it's there (and then drop it if we're on xend 3.0.3), but not really show it?
Sound sensible, the problem is detecting the version of xend, of course you can ask xend, you will get the exact version of the compiler used to compile it, but when it comes to xen version itself (xen_major 3) (xen_minor 0) (xen_extra -unstable) which makes things a bit hard to distinguish 3.0.2 from 3.0.3 :-\
Basically trying to hook off version number is not ever really going to be reliable because we need libvirt to be able to work against development snapshots - features may be introduced during dev that need detecting before the version number is incremented.
We could try to use the changest but it's not available in our build either.
Hooking off changeset looks & smells like a nasty hack. What is really needed is a version number for the SEXPR format returned by XenD. A simple incrementing integer digit would suffice really.

   (xen_sexpr_format 4)

Which could be incremented each time a new capability is introduced, or an existing one changed. Something to propose upstream asap ?

Regards,
Dan.
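
If such a field existed, a client could branch on it instead of guessing from xen_major / xen_minor. The Python sketch below is purely hypothetical: xen_sexpr_format is only the proposal made in this mail, not a field XenD is known to return, and the function name and the default of 0 are assumptions made for illustration.

```python
import re


def sexpr_format_version(node_sexpr, default=0):
    """Extract the proposed (xen_sexpr_format N) value from a node SEXPR.

    Hypothetical: the field is a proposal from this thread. When it is
    absent, the caller-supplied default stands for "old, unversioned XenD".
    """
    match = re.search(r"\(\s*xen_sexpr_format\s+(\d+)\s*\)", node_sexpr)
    return int(match.group(1)) if match else default
```

Falling back to a default when the field is missing matters here, because every XenD released before such a change would simply not report it.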

On Thu, Aug 10, 2006 at 03:15:41PM +0100, Daniel P. Berrange wrote:
On Thu, Aug 10, 2006 at 05:08:58AM -0400, Daniel Veillard wrote:
Sound sensible, the problem is detecting the version of xend, of course you can ask xend, you will get the exact version of the compiler used to compile it, but when it comes to xen version itself (xen_major 3) (xen_minor 0) (xen_extra -unstable) which makes things a bit hard to distinguish 3.0.2 from 3.0.3 :-\
Basically trying to hook off version number is not ever really going to be reliable because we need libvirt to be able to work against development snapshots - features may be introduced during dev that need detecting before the version number is incremented.
sigh, yes
We could try to use the changest but it's not available in our build either.
Hooking off changeset looks & smells like a nasty hack.
Well that's something we know will increase, and trying to get versioning from something not versioned will be hackish
What is really needed is a version number for the SEXPR format returned by XenD. A simple incrementing integer digit would suffice really.
(xen_sexpr_format 4)
Which could be incremented each time a new capability is introduced,or an existing one changed. Something to propose upstream asap ?
I'm not sure this will be agreed upon if you present it that way, since sexpr will be deprecated "soon", but asking for a rev number in xend API changes would be logical. But that will be too late for "ioemu:" anyway ...

Daniel

On Wed, Aug 09, 2006 at 09:33:11PM -0400, Jeremy Katz wrote:
On Thu, 2006-08-10 at 01:00 +0100, Daniel P. Berrange wrote:
I meant to include a complete example XML doc showing the changes in place, so here is a XML dump from a HVM domain which has been booted off a CDROM: [snip] <disk type='file'> <source file='/root/foo.img'/> <target dev='ioemu:hda'/> </disk>
Given what we know is coming, does it make sense to drop the ioemu: here and just have it be implied for HVM guests? Accept it if it's there (and then drop it if we're on xend 3.0.3), but not really show it?
Well there are two possibilities - we could drop it from user facing XML and prepend it when we convert to SEXPR, or could leave it in relying on the fact that it will be ignored in newer XenD. The former is probably nicest long term, just hinges on reliably detecting XenD version.
<disk type='file'> <source file='/root/boot.iso'/> <target dev='cdrom'/> </disk>
Similarly, instead of target dev='cdrom', does it make more sense to have a devicetype (or something) that's an attribute of the disk rather than a magic device?
Well, in 3.0.3 the way CDROMs are expressed is changing so it will look exactly the same as specifying a harddrive - you will simply append :cdrom to the target device name:

   <disk type='file'>
     <source file='/root/boot.iso'/>
     <target dev='hdc:cdrom'/>
   </disk>

We could simply go with that format straight away, converting 'hdc:cdrom' back into the current '(hvm (cdrom...))' SEXPR. We'd just document that if you are on Xen 3.0.2 you can only use 'hdc:cdrom', but for other versions you can use any 'hda:cdrom', etc. This would keep us pretty future proof and semantically makes sense since CDROMs & Harddrives do share the same IDE bus namespace after all.

Dan.
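
A minimal Python sketch of the 'hdc:cdrom' idea: split the target into a device name and a kind, and apply the Xen 3.0.2 restriction mentioned above. The function names and the xend_is_302 flag are hypothetical, and the sketch assumes any 'ioemu:' prefix has already been removed from the target.

```python
def split_target(dev):
    """Split 'hdc:cdrom' into ('hdc', 'cdrom'); a bare 'hda' is a disk.

    Assumes no 'ioemu:' prefix is present on the target name.
    """
    name, sep, kind = dev.partition(":")
    return name, (kind if sep else "disk")


def check_cdrom_target(dev, xend_is_302):
    """Enforce the documented limit: on Xen 3.0.2 a cdrom must be 'hdc'."""
    name, kind = split_target(dev)
    if kind == "cdrom" and xend_is_302 and name != "hdc":
        raise ValueError("Xen 3.0.2 only supports a cdrom on hdc, got %r" % name)
    return name, kind
```

How xend_is_302 would be determined is exactly the open version-detection question discussed elsewhere in this thread; the sketch just takes it as an input.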

On Wed, Aug 09, 2006 at 11:54:26PM +0100, Daniel P. Berrange wrote: [ lot of details about problems ]
The upshot of all this is that although the last release of libvirt included HVM support it was basically unusable for domain creation unless you were using HD to boot. The XML returned was also incorrect.
Now the good news. Since it was sooo broken, we can fix without worrying about XML compatability since there is no way any application could be relying on it in its current state.
Agreed. And the changes suggested should not affect the very simple case which worked, except for the 'ioemu:' device prefix. We can just discard it when reading the XML, except that before xen-3.0.3 it will need to be added back, and post xen-3.0.3 this should be allowed by xend [...]
The other part of the patch is to deal with definition of the floppy and cdrom device backing files. For this I have done the following: [...] The patch has a little bit of logic such that when converting the <devices> block backinto an SEXPR it filters out the disk entries with a dev of 'fda', 'fdb' and 'cdrom' since they need to end up in a different part of the SEXPR. [...] Then the SEXPR sent to XenD will include
(image (hvm (serial pty)))
Which enables allocation of PseudoTTY for the HVM's serial console.
All this sounds good.
I have tested that with this patch I can successfully create a HVM domain which boots off a floppy, harddrive or cdrom. Furthermore if you then dump the XML of this domain,the XML you get back will match the XML you fed in (with the obvious exception of domain ID, and the Pseudo TTY path).
If you have been monitoring xen-devel mailing lists you'll be aware that in 3.0.3 the way CDROM devices are configured is changing:
http://lists.xensource.com/archives/html/xen-devel/2006-08/msg00369.html
Although the patch attached does not support the config outlined in that mail, I'm pretty confident that a small incremental patch will be able to support it without breaking compatability with the changes I've outlined in this mail. The only tricky bit will be that we need to detect whether libvirt is running against a 3.0.2 or 3.0.3 version of XenD to decide how to convert XML -> SEXPR & vica verca.
yeah, and really I don't know how to find it...

Daniel

On Thu, Aug 10, 2006 at 08:31:28AM -0400, Daniel Veillard wrote:
On Wed, Aug 09, 2006 at 11:54:26PM +0100, Daniel P. Berrange wrote: [ lot of details about problems ]
The upshot of all this is that although the last release of libvirt included HVM support it was basically unusable for domain creation unless you were using HD to boot. The XML returned was also incorrect.
Now the good news. Since it was sooo broken, we can fix without worrying about XML compatability since there is no way any application could be relying on it in its current state.
Agreed. And the changes suggested should not affect the very simple case which worked, except for the 'ioemu:' device prefix. We can just discard it when reading the XML and add it when just discard it except that before xen-3.0.3 this will need to be added back, and post xen-3.0.3 this should be allowed by xend
Ok, attached is an updated version of the patch which includes the various changes discussed in this thread - in particular the addition of a 'device' attribute to the <disk> element accepting 'disk', 'floppy', or 'cdrom' and defaulting to 'disk' if it is omitted. Best illustrated by an example XML dump:

  <domain type='xen' id='53'>
    <name>too</name>
    <uuid>b5d70dd275cdaca517769660b059d8bc</uuid>
    <os>
      <type>hvm</type>
      <loader>/usr/lib/xen/boot/hvmloader</loader>
      <boot dev='cdrom'/>
    </os>
    <memory>409600</memory>
    <vcpu>1</vcpu>
    <on_poweroff>destroy</on_poweroff>
    <on_reboot>restart</on_reboot>
    <on_crash>restart</on_crash>
    <devices>
      <emulator>/usr/lib64/xen/bin/qemu-dm</emulator>
      <interface type='bridge'>
        <source bridge='xenbr0'/>
        <mac address='00:16:3e:1b:b1:47'/>
        <script path='vif-bridge'/>
      </interface>
      <disk type='file' device='disk'>
        <source file='/root/foo.img'/>
        <target dev='hda'/>
      </disk>
      <disk type='file' device='cdrom'>
        <source file='/root/boot.iso'/>
        <target dev='hdc'/>
        <readonly/>
      </disk>
      <graphics type='vnc' port='5953'/>
      <console tty='/dev/pts/16'/>
    </devices>
  </domain>

Note that the ioemu: prefix is no longer exposed in the XML - we add it back on when creating the SEXPR, and strip it when parsing the SEXPR. Until Xen 3.0.3, when 'device=cdrom' the only valid target dev is 'hdc'; after 3.0.3 this can be relaxed to allow hda->hdd.

Regards,
Dan.
-- 
|=- Red Hat, Engineering, Emerging Technologies, Boston. +1 978 392 2496 -=|
|=- Perl modules: http://search.cpan.org/~danberr/ -=|
|=- Projects: http://freshmeat.net/~danielpb/ -=|
|=- GnuPG: 7D3B9505 F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 -=|
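For readers following along, the two rules described above (the 'device' attribute defaulting to 'disk', and the ioemu: prefix being hidden from the XML but present in the SEXPR) can be sketched roughly as follows. This is only an illustration in Python, not the C code from the patch; the function names are hypothetical and the tuple returned is an assumed stand-in for whatever the real SEXPR generator emits.

```python
import xml.etree.ElementTree as ET

IOEMU_PREFIX = "ioemu:"
VALID_DEVICES = ("disk", "floppy", "cdrom")


def strip_ioemu(dev):
    """SEXPR -> XML direction: hide the ioemu: prefix from the XML."""
    if dev.startswith(IOEMU_PREFIX):
        return dev[len(IOEMU_PREFIX):]
    return dev


def add_ioemu(target):
    """XML -> SEXPR direction: put the ioemu: prefix back (idempotent)."""
    if target.startswith(IOEMU_PREFIX):
        return target
    return IOEMU_PREFIX + target


def disk_device(disk):
    """Return the 'device' attribute of a <disk>, defaulting to 'disk'."""
    kind = disk.get("device", "disk")
    if kind not in VALID_DEVICES:
        raise ValueError("unsupported disk device type: %r" % kind)
    return kind


def disks_for_sexpr(domain_xml):
    """Hypothetical helper: list (device, sexpr_dev, source_file) per <disk>."""
    root = ET.fromstring(domain_xml)
    result = []
    for disk in root.findall("./devices/disk"):
        target = disk.find("target").get("dev")
        source = disk.find("source").get("file")
        result.append((disk_device(disk), add_ioemu(target), source))
    return result
```

Keeping the prefix handling in one pair of functions means the XML never has to change if a later xend stops requiring the prefix.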

On Thu, Aug 10, 2006 at 11:00:01PM +0100, Daniel P. Berrange wrote:
On Thu, Aug 10, 2006 at 08:31:28AM -0400, Daniel Veillard wrote:
Agreed. And the changes suggested should not affect the very simple case which worked, except for the 'ioemu:' device prefix. We can just discard it when reading the XML, except that before xen-3.0.3 it will need to be added back, and post xen-3.0.3 this should be allowed by xend.
Ok, attached is an updated version of the patch which includes the various changes discussed in this thread - in particular the addition of a 'device' attribute to the <disk> element accepting 'disk', 'floppy', or 'cdrom' and defaulting to 'disk' if it is omitted. Best illustrated by an example XML dump:
Looks just fine to me.
Note that the ioemu: prefix is now not exposed in the XML - we add it back on when creating the SEXPR, or strip it when parsing the SEXPR. Until Xen 3.0.3 when 'device=cdrom' the only valid target dev is 'hdc', after 3.0.3 this can be relaxed to allow hda->hdd
Since we can't detect the xend version, that's a good way to proceed, thanks!

Daniel

Daniel P. Berrange wrote:
Jeremy mentioned that it looked like libvirt wasn't able to create an HVM domain configured to boot off cdrom, so I took a closer look at the code and indeed the code dealing with <boot> section was both incomplete, and just plain broken. Incomplete in so much as it never included details of the ISO file backing the CDROM, and broken in so much as it was doing string comparisons against the wrong variables. Digging further found lots more work relating to creation of HVM domains so I've had a go at writing a patch to resolve matters.
* Parsing of UUID from SEXPR assumed that the UUID fetched would always have four '-' in the usual places. Well, when you run 'dumpxml' from virsh the <uuid> element has the UUID encoded without any '-' chars. If you feed this back into 'virsh create' and then once again run 'dumpxml', parsing will fail & libvirt throws errors. So this patch allows for parsing of UUIDs without '-'.

* The XML document size was limited to 1k - we just 'malloc(1000)'. While this was enough for common cases, if someone creates lots of disk or network devices this would overflow and libvirt would return an error. So I increased it to 4k which ought to be enough for the foreseeable future - in any case my previous patch to fetch XML via the proxy is limited to 4k in size too.
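The UUID point can be illustrated with a small sketch. It is written in Python rather than the C of the actual patch, `parse_uuid` is a hypothetical name, and as a simplifying assumption it drops every '-' rather than accepting them only in the four usual positions:

```python
import string


def parse_uuid(text):
    """Parse a UUID given with or without '-' separators into 16 raw bytes."""
    # Assumption: any '-' is simply ignored, wherever it appears.
    digits = text.strip().replace("-", "")
    if len(digits) != 32 or any(c not in string.hexdigits for c in digits):
        raise ValueError("malformed UUID: %r" % text)
    return bytes.fromhex(digits)
```

With this, the dashed form and the form printed by 'dumpxml' parse to the same 16 bytes, so the create/dumpxml round trip no longer trips over the missing separators.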
Now onto the fine details...
When converting SEXPR into XML current code is doing the following:
  (boot a) -> <boot dev='/dev/fd0'/>
  (boot c) -> <boot dev='hda'/>
  (boot d) -> <boot dev='/dev/cdrom'/>
This is rather inconsistent - the 'hda' is intended to map to an entry in <devices> block. The '/dev/fd0' and '/dev/cdrom' entries did not map to anything.
Meanwhile, when converting from XML to SEXPR it is doing the following:

  <boot dev='hda'/> -> (boot a)
  <boot dev='hdd'/> -> (boot d)
  <boot dev='*'/> -> (boot c)
Obviously this sucks because these processes should be matching each other.
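One way to guarantee the two directions match is to derive one lookup table from the other. The sketch below uses the mapping proposed in the original mail (a <-> fd, c <-> hd, d <-> cdrom); it is Python for illustration only, and the table and function names are hypothetical rather than taken from the patch:

```python
# Mapping proposed in the original mail: Xen boot letter -> XML boot dev.
SEXPR_TO_XML_BOOT = {"a": "fd", "c": "hd", "d": "cdrom"}

# The reverse table is derived, so the two directions cannot drift apart.
XML_TO_SEXPR_BOOT = {dev: code for code, dev in SEXPR_TO_XML_BOOT.items()}


def boot_sexpr_to_xml(code):
    """'a' -> "<boot dev='fd'/>"; raises KeyError for unknown letters."""
    return "<boot dev='%s'/>" % SEXPR_TO_XML_BOOT[code]


def boot_xml_to_sexpr(dev):
    """'cdrom' -> "(boot d)"; raises KeyError for unknown devices."""
    return "(boot %s)" % XML_TO_SEXPR_BOOT[dev]
```

Unknown values raise instead of silently falling back to 'c', which is the behaviour the old `<boot dev='*'/>` catch-all had.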
Secondly, the (image (hvm....)) SEXPR has three entries for defining the ISO / disk image file backing the CDROM / Floppy devices. These were just being ignored, rather than turned into <disk> entries within the <devices> block. Similarly there was no way to express a <disk> entry for CDROM/Floppy in the XML when creating a domain.
The upshot of all this is that although the last release of libvirt included HVM support it was basically unusable for domain creation unless you were using HD to boot. The XML returned was also incorrect.
Now the good news. Since it was sooo broken, we can fix it without worrying about XML compatibility since there is no way any application could be relying on it in its current state.
Sorry to have dumped the HVM code in such crude shape. I wanted to finish the work when returning from vacation in July but became distracted with other tasks :-(, and then went on another vacation last week :-). The patch I sent in June only worked for the most basic configurations. Perhaps I wasn't clear enough about the state of the patch when posted.

BTW, I like the resulting patch from this thread. My apologies again for not following through on the HVM support.

Regards,
Jim

On Mon, Aug 14, 2006 at 04:53:59PM -0600, Jim Fehlig wrote:
Daniel P. Berrange wrote:
Now the good news. Since it was sooo broken, we can fix it without worrying about XML compatibility since there is no way any application could be relying on it in its current state.
Sorry to have dumped the HVM code in such crude shape. I wanted to finish the work when returning from vacation in July but became distracted with other tasks :-(, and then went on another vacation last week :-). The patch I sent in June only worked for the most basic configurations. Perhaps I wasn't clear enough about the state of the patch when posted.
BTW, I like the resulting patch from this thread. My apologies again for not following through on the HVM support.
That's ok - the work you did to kickstart the HVM code was a very helpful milestone towards getting the core HVM support up & running. Thanks for reviewing the latest HVM updates too :-)

While I remember - if anyone out there is using Xen on ia64, checking that libvirt works for HVM VTi domains would be helpful - most of my testing is on x86_64 platforms so far.

Regards,
Dan.
participants (4)
- Daniel P. Berrange
- Daniel Veillard
- Jeremy Katz
- Jim Fehlig