
On Tue, Nov 20, 2012 at 11:26:53AM -0500, Dave Allan wrote:
On Tue, Nov 20, 2012 at 10:17:11AM +0000, Daniel P. Berrange wrote:
On Mon, Nov 19, 2012 at 05:30:11PM +0800, Osier Yang wrote:
Hi,
This proposal tries to work out a solution for migrating a domain that uses a LUN behind a vHBA as a disk device (QEMU emulated disks only at this stage), along with other NPIV improvements that are unrelated to migration. I haven't been lucky enough to get an environment to test whether these ideas are workable, but I'd like to hear early on whether anyone has good ideas or suggestions.
1) Persistent vHBA support
This is useful functionality that has been missing for a long time. Assume that one has created a vHBA and done the masking/zoning, and everything works as expected. After a system reboot, however, everything is lost. To get things back, the user has to find out the previous WWNN & WWPN and create the vHBA again.
On the other hand, persistent vHBA support is actually required for a domain that uses a LUN behind a vHBA. Otherwise the domain could fail to start after a system reboot.
To support persistent vHBAs, new APIs such as virNodeDeviceDefineXML and virNodeDeviceUndefine are required. It would also be useful to introduce "autostart" for vHBAs, so that a vHBA is started automatically after a system reboot.
Proposed APIs:
virNodeDevicePtr virNodeDeviceDefineXML(virConnectPtr conn, const char *xml, unsigned int flags);
int virNodeDeviceUndefine(virConnectPtr conn, virNodeDevicePtr dev, unsigned int flags);
int virNodeDeviceSetAutostart(virNodeDevicePtr dev, int autostart, unsigned int flags);
int virNodeDeviceGetAutostart(virNodeDevicePtr dev, int *autostart, unsigned int flags);
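For reference, a persistent definition would presumably take the same kind of XML that virNodeDeviceCreateXML accepts for a transient vHBA. Below is a minimal Python sketch that only builds such a description. The parent name scsi_host5 and the WWNN are made-up values for illustration, and feeding the result to virNodeDeviceDefineXML is just the proposal above, not an existing call.

```python
import xml.etree.ElementTree as ET


def vhba_device_xml(parent: str, wwnn: str, wwpn: str) -> str:
    """Build node-device XML describing a vHBA on the given parent HBA."""
    device = ET.Element("device")
    ET.SubElement(device, "parent").text = parent
    # A vHBA is described as an fc_host capability nested in a scsi_host one.
    scsi_host = ET.SubElement(device, "capability", {"type": "scsi_host"})
    fc_host = ET.SubElement(scsi_host, "capability", {"type": "fc_host"})
    ET.SubElement(fc_host, "wwnn").text = wwnn
    ET.SubElement(fc_host, "wwpn").text = wwpn
    return ET.tostring(device, encoding="unicode")


# Hypothetical parent HBA and WWNN; the WWPN is borrowed from later in this mail.
xml_desc = vhba_device_xml("scsi_host5", "2001001b32a90004", "2101001b32a90004")
print(xml_desc)
```

Keeping the WWNN/WWPN in the stored XML is what would let the same vHBA come back after a reboot without the user looking them up again.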
I don't really like this approach much. IMHO, this should all be done via the virStoragePool APIs instead. Adding define/undefine/autostart to virNodeDevice is really just duplicating the storage pool functionality.
I like the idea of making vHBAs persist as part of pools; how do you envision it should work? Extend the scsi pools to take a vHBA descriptor and then instantiate the vHBA as part of starting the pool, or something else?
Yes, pretty much that. Create when you start the pool, delete when you destroy the pool.
If we do the mapping of HBAs to guest domains using storage pools, then at a guest level, migration requires zero work.
It is simply up to the management app to create the storage pool on the destination host with the same Name + UUID, but with the secondary WWNN/WWPN. The nice thing about this is that you don't need to hardcode details of a secondary WWNN/WWPN up-front. The management app can just decide on those at the time it performs the migration, so 99% of the time there will only need to be a single vHBA set up on the SAN. During migration the mgmt app can set up a second vHBA for the target host and, once complete, delete the original vHBA entirely.
Agreed, although there will of course need to be some degree of up-front coordination between the management app and the SAN administrators to avoid having to involve them to migrate a VM.
Yep, this is in fact why I like to push off more of this detail to the mgmt app. Libvirt is unable to talk to the SAN, so it's better if the mgmt app has more direct control of the vHBA setup/teardown via the storage APIs than to do it automagically in virDomainMigrate, where the mgmt app cannot synchronize so easily.
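To make the pool-based migration idea concrete, here is a rough Python sketch of what the mgmt app might generate for the two hosts. It assumes a scsi pool whose source carries the vHBA's WWNN/WWPN on an adapter element; that element, its attribute names, the pool name, the parent HBA names and the WWNNs are all illustrative assumptions, not something settled in this thread.

```python
import uuid
import xml.etree.ElementTree as ET


def vhba_pool_xml(name, pool_uuid, parent, wwnn, wwpn):
    """Build a scsi pool description backed by a vHBA (hypothetical schema)."""
    pool = ET.Element("pool", {"type": "scsi"})
    ET.SubElement(pool, "name").text = name
    ET.SubElement(pool, "uuid").text = pool_uuid
    source = ET.SubElement(pool, "source")
    # Assumed descriptor: starting the pool would create this vHBA,
    # destroying the pool would delete it.
    ET.SubElement(source, "adapter", {
        "type": "fc_host", "parent": parent, "wwnn": wwnn, "wwpn": wwpn,
    })
    target = ET.SubElement(pool, "target")
    ET.SubElement(target, "path").text = "/dev/disk/by-path"
    return ET.tostring(pool, encoding="unicode")


# Same Name + UUID on both hosts; only the vHBA identity differs.
pool_uuid = str(uuid.uuid4())
src_xml = vhba_pool_xml("vhba_pool", pool_uuid, "scsi_host5",
                        "2001001b32a90004", "2101001b32a90004")
dst_xml = vhba_pool_xml("vhba_pool", pool_uuid, "scsi_host7",
                        "2001001b32a90005", "2101001b32a90005")
print(src_xml)
print(dst_xml)
```

Since the guest would refer to the pool only by name, nothing in the domain XML would have to change across the migration; the secondary WWNN/WWPN is chosen by the mgmt app when it creates the destination pool.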
4) Enrich HBA's XML
With the current implementation it's hard to know which vHBAs were created from an HBA. One has to dump the XML of each (v)HBA and look for the clue in the "parent" element of the vHBAs. It would be good to introduce a new element for the HBA, such as "vports", so that one can easily see which (and how many) vHBAs have been created from it.
It would also be good to expose the maximum number of vports the HBA supports.
Besides these, other useful information should be exposed too, such as the vendor name, the HBA state, PCI address, etc.
The new XML should look like:
  <vports num='2' max='64'>
    <vport name="scsi_host40" wwpn="2101001b32a90004"/>
    <vport name="scsi_host40" wwpn="2101001b32a90005"/>
  </vports>
  <online/>
  <vendor>QLogic</vendor>
  <address type="pci" domain="0" bus="0" slot="5" function="0"/>
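As a rough illustration of how a client could consume this, the Python sketch below parses the proposed elements and pulls out the vport list and limits. The enclosing capability element is an assumption added only so that the fragment has a single root; where these elements would really live in the device XML is not decided here.

```python
import xml.etree.ElementTree as ET

# Proposed elements from this mail, wrapped in an assumed parent element.
HBA_FRAGMENT = """
<capability type='scsi_host'>
  <vports num='2' max='64'>
    <vport name="scsi_host40" wwpn="2101001b32a90004"/>
    <vport name="scsi_host40" wwpn="2101001b32a90005"/>
  </vports>
  <online/>
  <vendor>QLogic</vendor>
  <address type="pci" domain="0" bus="0" slot="5" function="0"/>
</capability>
"""


def summarize_hba(fragment: str) -> dict:
    """Collect the vHBA-related details an app would care about."""
    root = ET.fromstring(fragment)
    vports = root.find("vports")
    return {
        "vendor": root.findtext("vendor"),
        "online": root.find("online") is not None,
        "max_vports": int(vports.get("max")),
        "wwpns": [v.get("wwpn") for v in vports.findall("vport")],
    }


info = summarize_hba(HBA_FRAGMENT)
print(info)
```

With this layout a single dump of the HBA answers "which vHBAs exist and how many more can be created", instead of walking every scsi_host device and matching on "parent".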
"online", "vendor", "address" make sense to vHBA too.
I'm trying to remember how we modelled the parent/child relationship for SR-IOV PCI cards. NPIV is a very similar concept, so we should ideally seek to model the parent/child relationship in the same manner.
Physical function:
  <device>
    <name>pci_0000_01_00_0</name>
    <parent>pci_0000_00_01_0</parent>
    <driver>
      <name>igb</name>
    </driver>
    <capability type='pci'>
      <domain>0</domain>
      <bus>1</bus>
      <slot>0</slot>
      <function>0</function>
      <product id='0x10c9'>82576 Gigabit Network Connection</product>
      <vendor id='0x8086'>Intel Corporation</vendor>
      <capability type='virt_functions'>
        <address domain='0x0000' bus='0x01' slot='0x10' function='0x0'/>
        <address domain='0x0000' bus='0x01' slot='0x10' function='0x2'/>
        <address domain='0x0000' bus='0x01' slot='0x10' function='0x4'/>
        <address domain='0x0000' bus='0x01' slot='0x10' function='0x6'/>
        <address domain='0x0000' bus='0x01' slot='0x11' function='0x0'/>
        <address domain='0x0000' bus='0x01' slot='0x11' function='0x2'/>
        <address domain='0x0000' bus='0x01' slot='0x11' function='0x4'/>
      </capability>
    </capability>
  </device>
Virtual function:
  <device>
    <name>pci_0000_01_10_0</name>
    <parent>pci_0000_00_01_0</parent>
    <driver>
      <name>igbvf</name>
    </driver>
    <capability type='pci'>
      <domain>0</domain>
      <bus>1</bus>
      <slot>16</slot>
      <function>0</function>
      <product id='0x10ca'>82576 Virtual Function</product>
      <vendor id='0x8086'>Intel Corporation</vendor>
      <capability type='phys_function'>
        <address domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
      </capability>
      <capability type='virt_functions'>
      </capability>
    </capability>
  </device>
Interestingly, I think there's a bug there; the VF should not be showing <capability type='virt_functions'> but that's unrelated to the present discussion.
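For comparison with what a vHBA equivalent would need to offer, here is a small Python sketch that follows the PF to VF link in the SR-IOV dump. It assumes node device names are formed from the PCI address as pci_DDDD_BB_SS_F in hex, which is inferred from the names in the two dumps rather than from any documented guarantee, and it uses a trimmed copy of the PF XML.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the physical function XML quoted in this thread.
PF_XML = """<device>
  <name>pci_0000_01_00_0</name>
  <capability type='pci'>
    <capability type='virt_functions'>
      <address domain='0x0000' bus='0x01' slot='0x10' function='0x0'/>
      <address domain='0x0000' bus='0x01' slot='0x10' function='0x2'/>
    </capability>
  </capability>
</device>"""


def virt_function_names(pf_xml: str) -> list:
    """Map each VF address listed under the PF to a node device name."""
    root = ET.fromstring(pf_xml)
    path = "./capability[@type='pci']/capability[@type='virt_functions']/address"
    names = []
    for addr in root.findall(path):
        domain, bus, slot, function = (
            int(addr.get(key), 16) for key in ("domain", "bus", "slot", "function")
        )
        # Assumed naming scheme, inferred from the device names above.
        names.append(f"pci_{domain:04x}_{bus:02x}_{slot:02x}_{function:x}")
    return names


vf_names = virt_function_names(PF_XML)
print(vf_names)  # prints ['pci_0000_01_10_0', 'pci_0000_01_10_2']
```

The first name matches the virtual function device shown above, so the parent can enumerate its children from its own XML alone.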
Ok, so we should model vHBA relationships via some kind of <capability> then.

Daniel
--
|: http://berrange.com -o- http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org -o- http://virt-manager.org :|
|: http://autobuild.org -o- http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org -o- http://live.gnome.org/gtk-vnc :|