[libvirt] [RFC] Interface for disk hotadd/remove

Hi,
I'm currently interested in implementing hard disk hot-add and hot-remove support for qemu (as opposed to controller-based hotplugging), and this raises the question of how best to support the feature in libvirt. Many SCSI controllers in real machines, for instance, allow disks to be added and removed (without adding or removing the controller itself) while the system is up and running, so it would be nice to emulate this in a virtual machine. I'm focusing on qemu on the backend side, but the problem is not specific to this backend. Rather, the question is how best to integrate such a feature into libvirt.
Before implementing the functionality, it would be great to hear the community's opinion on which route to take with respect to the interface. Essentially, I can see two options:
- Naturally, there are virDomain{At,De}tachDevice, but the pair implements drive hot-adding by adding a new controller with an attached hard disk to the system. By extending the XML description of the drive with a parameter that specifies whether controller-based or disk-based hotplugging is to be performed, it would be possible to implement the desired functionality while preserving compatibility with the existing semantics. Removing the drive would then require another new parameter in the XML description to identify the drive on the controller, which does not make the interface any prettier.
- Extend the API with a new method (for instance virDomainDiskAttach) that takes a hard disk description, a controller identifier, and a parameter that identifies the disk on the controller.
- (Theoretically, it would also be possible to implement media exchange for hard disks in qemu and re-use the media exchange infrastructure already present in libvirt for CD-ROMs, but since this is used only very occasionally on real hardware, guest operating systems are typically not prepared to handle it well.)
My preference would be to go for option 2, that is, implement a new API method.
Would there be any obstacles to adding such a patch to mainline? Or is anyone already working on similar functionality? Or can this be done in a much simpler way that I've missed? If not, I'd send patches for more detailed review before long.
Thanks,
Wolfgang

On Thu, Aug 13, 2009 at 12:54:00PM +0200, Wolfgang Mauerer wrote:
I'm currently interested in implementing hard disk hot-add and -remove support for qemu (as opposed to controller-based hotplugging), and this brings up the question how to best support this feature in libvirt. Many SCSI-Controllers in real machines, for instance, allow to add and remove disks (without adding or removing the controller itself) while the system is up and running, so it would be nice to emulate this in a virtual machine. I'm focusing on qemu on the backend side, but the problem is not related to this particular backend. Rather, the question is how to integrate such a feature best into libvirt.
Before implementing the functionality, it would be great to hear the community's opinion which route to take wrt. the interface. Essentially, I can see two options:
- Naturally, there are virDomain{At,De}tachDevice, but the pair implements drive-hotadding via adding a new controller with an attached hard disk to the system. By extending the XML description of the drive with a parameter that specifies whether controller- or disk-based hotplugging is to be performed, it would be possible to implement the desired functionality, whilst preserving compatibility with existing semantics. Removing the drive would then require another new parameter in the XML description to identify the drive on the controller, which does not really prettify the thing.
- Extend the API with a new method (for instance virDomainDiskAttach) that takes a hard disk description, a controller identifier, and a parameter that identifies the disk on the controller.
I don't think it's desirable to extend the API. The virDomainAttachDevice API's XML parameter is intended to take any XML element that is valid inside the domain <devices> XML section.
So the key to deciding how to deal with hotplug is to first decide how to represent disk controllers in the domain XML.
At boot time, if you list multiple SCSI disks in the XML, you get a single controller with multiple disks attached. As such, the current semantics of the hotplug implementation for SCSI are divergent from the semantics at boot time. This has the consequence that if you boot a guest with 2 SCSI drives, then hot-attach one more, you get 2 controllers, 3 disks. If you then shutdown & boot the guest again, you'll have 1 controller, 1 disks.
This says to me that the hotplug implementation for QEMU SCSI should be fixed so that if you supply <disk> it adds a disk to the existing SCSI controller. Obviously it requires some QEMU support for this first.
Thus my feeling would be to do something like adding a new <controller> element to represent a disk controller. Support hotplugging <controller> instances directly with obvious semantics.
Then extend the <disk> schema to allow a controller name/identifier to be provided. If no controller is listed, then plug the disk into the first available controller slot; otherwise use the explicitly requested controller. Never implicitly add new controllers, unless dealing with a legacy QEMU without the disk hotplug support that you're writing.
Regards,
Daniel
--
|: Red Hat, Engineering, London -o- http://people.redhat.com/berrange/ :|
|: http://libvirt.org -o- http://virt-manager.org -o- http://ovirt.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: GnuPG: 7D3B9505 -o- F3C9 553F A1DA 4AC2 5648 23C1 B3DF F742 7D3B 9505 :|
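The mismatch between boot-time and hotplug behaviour described in this message can be illustrated with a small toy model. This is only a sketch under stated assumptions: the `Guest` class and the `boot`, `hot_attach_current` and `hot_attach_proposed` functions are hypothetical names invented for illustration, not libvirt or QEMU code, and the model only counts controllers and disks.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Guest:
    # Hypothetical model: each inner list is one SCSI controller
    # holding the names of the disks attached to it.
    controllers: List[List[str]] = field(default_factory=list)

    def counts(self):
        """Return (number of controllers, number of disks)."""
        return len(self.controllers), sum(len(c) for c in self.controllers)


def boot(disks):
    """Boot-time behaviour: all listed SCSI disks share a single controller."""
    return Guest(controllers=[list(disks)] if disks else [])


def hot_attach_current(guest, disk):
    """Current hotplug behaviour: each hot-added disk brings a new controller."""
    guest.controllers.append([disk])


def hot_attach_proposed(guest, disk):
    """Proposed behaviour: add the disk to the existing controller."""
    if not guest.controllers:
        # The no-controller case is outside the scope of this toy model.
        raise ValueError("no controller present")
    guest.controllers[0].append(disk)


current = boot(["disk0", "disk1"])
hot_attach_current(current, "disk2")

proposed = boot(["disk0", "disk1"])
hot_attach_proposed(proposed, "disk2")

print(current.counts(), proposed.counts())
```

With two boot-time disks and one hot-attached disk, the current behaviour yields two controllers, while the proposed behaviour keeps the single controller the guest would also have after a normal boot.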

Hi,
On Thu, Aug 13, 2009 at 8:31 PM, Daniel P. Berrange <berrange@redhat.com> wrote:
I don't think it's desirable to extend the API. The virDomainAttachDevice API's XML parameter is intended to take any XML element that is valid inside the domain <devices> XML section.
okay, that's good to know.
So the key to deciding how to deal with hotplug, is to first decide how to represent disk controllers in the domain XML.
At boot time, if you list multiple SCSI disks in the XML, you get a single controller with multiple disks attached. As such, the current semantics of the hotplug implementation for SCSI are divergent from the semantics at boot time. This has the consequence that if you boot a guest with 2 SCSI drives, then hot-attach one more, you get 2 controllers, 3 disks. If you then shutdown & boot the guest again, you'll have 1 controller, 1 disks.
This says to me that the hotplug implementation for QEMU SCSI should be fixed so that if you supply <disk> it adds a disk to the existing SCSI controller. Obviously it requires some QEMU support for this first.
Qemu support for this is already there (I was not quite clear on that, the thing I'm currently adding additionally is hot-remove). I completely agree that the current semantics of libvirt-controlled disk-hotplug in qemu provide some possibilities for improvement, but I was not sure if it's okay to change the semantics of a libvirt command whilst preserving its syntax.
Thus my feeling would be to do something like adding a new <controller> element to represent a disk controller. Support hotplugging <controller> instances directly with obvious semantics.
Then extend the <disk> schema to allow a controller name/identifier to be provided. If no controller is listed, then plug the disk into the first available controller slot; otherwise use the explicitly requested controller. Never implicitly add new controllers, unless dealing with a legacy QEMU without the disk hotplug support that you're writing.
Sounds reasonable to me. Roughly, that would lead to something along the lines of (for a SCSI host on the PCI bus)
<controller type="scsi" id="my_controller" > <bus addr="00:04"/> </controller>
<disk type="scsi"> <source file="disk.img"/> <controller id="my_controller" unit="0"/> </disk>
This allows for specifying both the address of the controller on the bus and the position of the disk on the controller -- the latter is important for hot-remove if multiple "physically" identical disks are attached to a single controller.
If virDomainAttachDevice is fed with a <disk> element containing an explicitly specified controller id, libvirt can then simply add the disk if the controller exists, or add controller+disk if the controller is not yet present on the machine. If the controller is not specified and none is present in the system, follow the current behaviour. If no controller is specified, but a controller is present, just add the disk, not controller+disk, breaking current behaviour.
For hot-remove, do the reverse action (though I'll consider under which circumstances it is desirable to remove the controller when a disk is supposed to be removed, but these details are best discussed once I've done the code, I suppose).
Does what I've outlined correspond to your idea?
Thanks,
Wolfgang
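To make the proposed relationship between <disk> and <controller> concrete, the following sketch parses the XML fragments above and resolves each disk to its controller and unit. It rests on assumptions: the `<controller>`/`<bus>` layout is only the proposal from this message, not an existing libvirt schema, the fragments are wrapped in a `<devices>` element so they form one document, and `disk_placements` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The proposed fragments from the message, wrapped in <devices>.
DEVICES_XML = """
<devices>
  <controller type="scsi" id="my_controller">
    <bus addr="00:04"/>
  </controller>
  <disk type="scsi">
    <source file="disk.img"/>
    <controller id="my_controller" unit="0"/>
  </disk>
</devices>
"""


def disk_placements(devices_xml):
    """Map every <disk> to the controller id, unit and bus address it names."""
    root = ET.fromstring(devices_xml.strip())

    # Index the declared controllers by id, remembering their bus address.
    bus_addr = {}
    for ctrl in root.findall("controller"):
        bus = ctrl.find("bus")
        bus_addr[ctrl.get("id")] = bus.get("addr") if bus is not None else None

    placements = []
    for disk in root.findall("disk"):
        source = disk.find("source")
        ref = disk.find("controller")  # absent if no controller was requested
        ctrl_id = ref.get("id") if ref is not None else None
        unit = ref.get("unit") if ref is not None else None
        placements.append({
            "file": source.get("file") if source is not None else None,
            "controller": ctrl_id,
            "unit": int(unit) if unit is not None else None,
            "bus_addr": bus_addr.get(ctrl_id),
        })
    return placements


print(disk_placements(DEVICES_XML))
```

The (controller id, unit) pair is what would identify a particular disk for hot-remove when several otherwise identical disks sit on the same controller.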

On Thu, Aug 13, 2009 at 10:58:34PM +0200, Wolfgang Mauerer wrote:
On Thu, Aug 13, 2009 at 8:31 PM, Daniel P. Berrange<berrange@redhat.com> wrote:
So the key to deciding how to deal with hotplug, is to first decide how to represent disk controllers in the domain XML.
At boot time, if you list multiple SCSI disks in the XML, you get a single controller with multiple disks attached. As such, the current semantics of the hotplug implementation for SCSI are divergent from the semantics at boot time. This has the consequence that if you boot a guest with 2 SCSI drives, then hot-attach one more, you get 2 controllers, 3 disks. If you then shutdown & boot the guest again, you'll have 1 controller, 1 disks.
This says to me that the hotplug implementation for QEMU SCSI should be fixed so that if you supply <disk> it adds a disk to the existing SCSI controller. Obviously it requires some QEMU support for this first.
Qemu support for this is already there (I was not quite clear on that, the thing I'm currently adding additionally is hot-remove). I completely agree that the current semantics of libvirt-controlled disk-hotplug in qemu provide some possibilities for improvement, but I was not sure if it's okay to change the semantics of a libvirt command whilst preserving its syntax.
In this case the semantics of the hotplug command are broken, so we're merely fixing them to be consistent with semantics at normal boot. Thus it is acceptable to change it in this particular case.
Thus my feeling would be to do something like adding a new <controller> element to represent a disk controller. Support hotplugging <controller> instances directly with obvious semantics.
Then extend the <disk> schema to allow a controller name/identifier to be provided. If no controller is listed, then plug the disk into the first available controller slot; otherwise use the explicitly requested controller. Never implicitly add new controllers, unless dealing with a legacy QEMU without the disk hotplug support that you're writing.
Sounds reasonable to me. Roughly, that would lead to something along the lines of (for a SCSI host on the PCI bus)
<controller type="scsi" id="my_controller" > <bus addr="00:04"/> </controller>
<disk type="scsi"> <source file="disk.img"/> <controller id="my_controller" unit="0"/> </disk>
This allows for specifying both the address of the controller on the bus, and to identify the disk on the controller -- this is important for hot-remove if multiple "physically" identical disks are attached to a single controller.
Yes that idea sounds reasonable.
If virDomainAttachDevice is fed with a <disk> element containing an explicitly specified controller id, libvirt can then simply add the disk if the controller exists, or add controller+disk if the controller is not yet present on the machine. If the controller is not specified and none is present in the system, follow the current behaviour. If no controller is specified, but a controller is present, just add the disk, not controller+disk, breaking current behaviour. For hot-remove, do the reverse action (though I'll consider under which circumstances it is desirable to remove the controller when a disk is supposed to be removed, but these details are best discussed once I've done the code, I suppose). Does what I've outlined correspond to your idea?
Yes, with one minor change - if the disk XML contains a controller, but the controller doesn't exist, then we should return an error and require the app to explicitly add the <controller> separately.
Regards,
Daniel
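The attach rules agreed on in this exchange, including the amendment above, can be summarised as a small decision function. This is a sketch under assumptions, not libvirt code: `plan_disk_attach`, `HotplugError` and the action tuples are hypothetical names, "first available controller slot" is simplified to "first existing controller", and the legacy-QEMU fallback is left out.

```python
class HotplugError(Exception):
    """Raised when a <disk> names a controller that does not exist."""


def plan_disk_attach(existing_controllers, requested_controller=None):
    """Return the list of actions for hot-attaching one disk.

    existing_controllers: ids of the controllers already present in the guest.
    requested_controller: id named in the disk XML, or None if none was given.
    """
    if requested_controller is not None:
        if requested_controller not in existing_controllers:
            # Amendment: never create the controller implicitly; the
            # application has to hot-add the <controller> itself first.
            raise HotplugError(
                "controller %r does not exist" % requested_controller)
        return [("attach_disk", requested_controller)]

    if existing_controllers:
        # No controller named but one exists: add only the disk.
        return [("attach_disk", existing_controllers[0])]

    # No controller named and none present: keep the current behaviour
    # of adding a controller together with the disk.
    return [("add_controller", None), ("attach_disk", None)]


print(plan_disk_attach(["scsi0"], "scsi0"))
print(plan_disk_attach(["scsi0"]))
print(plan_disk_attach([]))
```

Raising an error for an unknown controller keeps the hotplug path explicit, so the controllers present at runtime are always ones the application asked for.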
participants (2)
- Daniel P. Berrange
- Wolfgang Mauerer