Hi!
Report from my multipath tests today.
My test virtual machine, which runs from an NPIV pool, is not able to use multipath.
When I pulled the cable from one of the targets, it crashed.
But, strangely, it could boot up again on the remaining path, the same one it had just
crashed on.
That tells me it can use both paths and is not limited to one of them, but because
the multipath layer isn't there, it cannot survive a path failure; it can only come up
again after a reboot.
The question is, WHY doesn't an NPIV pool support multipath? It is sort of the idea
behind FC to be redundant and to always have multiple paths to failover to. Why was the
NPIV pool designed like this?
If we could use the underlying devices and pass them directly to the guest, then we could
implement multipath in the guest.
But I am leaning toward not using NPIV anymore, since it only seems to complicate
things. In VMware you can attach the NPIV directly to the guest, which means that the
NPIV, and with it the LUNs, are easily transferred across the SAN hosts. Here, with
libvirt/qemu/kvm, we cannot attach an NPIV to the guest, which sort of makes the whole
idea fall apart, especially if it really is the case that there is no multipath support.
It would then be better to map the LUNs directly to the host, and use the multipath
devices for the guests.
If anyone else has opinions on this, or ideas that are better than mine, I would very much
like to hear them.
Regards Johan
-----libvirt-users-bounces(a)redhat.com wrote: -----
To: John Ferlan <jferlan(a)redhat.com>
From: Johan Kragsterman
Sent by: libvirt-users-bounces(a)redhat.com
Date: 2016-09-03 08:36
Cc: libvirt-users(a)redhat.com, Yuan Dan <dyuan(a)redhat.com>
Subject: [libvirt-users] Ang: Re: Ang: Ang: Re: Ang: Re: attaching storage pool error
Hi, John, and thank you!
This was a very thorough and welcome response, I was wondering where all the storage guys
were...
I will get back to you with more details later, specifically about multipath, since this
needs to be investigated thoroughly.
In the meantime I have, through trial and error, been able to attach the NPIV
pool LUN to a virtio-scsi controller, and it seems it already uses multipath when I look
at the volumes in the host.
I find this multipath pool procedure a little confusing, since an NPIV
vHBA by nature is always multipath. I will do a very simple test later today, the best
test there is: just pulling a cable from one of the FC targets and putting it back
again, and then doing the same with the other one. This will tell me whether it runs on
multipath or not.
What I have been considering is whether to implement multipath on the guest or on the
host, and I don't know which I would prefer. Simplicity is always preferable, so if it
is working fine on the host, I guess I'd prefer that.
Get back to you later...
/Johan
-----John Ferlan <jferlan(a)redhat.com> wrote: -----
To: Johan Kragsterman <johan.kragsterman(a)capvert.se>, Yuan Dan
<dyuan(a)redhat.com>
From: John Ferlan <jferlan(a)redhat.com>
Date: 2016-09-02 20:51
Cc: libvirt-users(a)redhat.com
Subject: Re: [libvirt-users] Ang: Ang: Re: Ang: Re: attaching storage pool error
On 08/24/2016 06:31 AM, Johan Kragsterman wrote:
Hi again!
I saw this last week while I was at KVM Forum, but just haven't had the
time until now to start thinking about this stuff again ... as you point
out with your questions and replies - NPIV/vHBA is tricky and
complicated... I always have to try to "clear the decks" of anything else
before trying to page how this all works back into the frontal cortex.
Once done, I quickly do a page flush.
It was also a bit confusing with respect to how the responses have been
threaded - so I just took the most recent one and started there.
-----libvirt-users-bounces(a)redhat.com wrote: -----
To: Yuan Dan <dyuan(a)redhat.com>
From: Johan Kragsterman
Sent by: libvirt-users-bounces(a)redhat.com
Date: 2016-08-24 07:52
Cc: libvirt-users(a)redhat.com
Subject: [libvirt-users] Ang: Re: Ang: Re: attaching storage pool error
Hi and thanks for your important input, Dan!
>>
>>
>> System centos7, system default libvirt version.
>>
>> I've succeeded in creating an NPIV storage pool, which I could start without
>> problems. However, I couldn't attach it to the VM; it threw errors when I
>> tried. I want to boot from it, so I need it working from the start. I read one
>> of Daniel Berrange's old (2010) blog posts about attaching an iSCSI pool, and
>> drew my conclusions from that. I haven't found any other documentation. Can
>> someone point me to more recent documentation on this?
>>
>> Are there other mailing lists in the libvirt/KVM communities that are more
>> focused on storage? I'd like to know about these, if so, since I'm a storage
>> guy, and fiddle around a lot with these things...
>>
>> There are quite a few things I'd like to know about that I doubt this list
>> cares about, or has knowledge about, like multipath devices/pools,
>> virtio-scsi in combination with an NPIV storage pool, etc.
>>
>> So anyone that can point me further....?
>
>
http://libvirt.org/formatstorage.html
>
https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/...
>
The Red Hat documentation is most up-to-date - it was sourced (more or
less) from:
http://wiki.libvirt.org/page/NPIV_in_libvirt
There's some old stuff in there and probably needs a cleanse to provide
all the "supported" options.
> Hope it can help you to get started with it.
>
>
> Unfortunately I have already gone through these documents, several times as
> well, but these are only about the creation of storage pools, not how you
> attach them to the guest.
If the pool is ready, here are some examples:
http://libvirt.org/formatdomain.html#elementsDisks
you can use it in guest like this:
<disk type='volume' device='disk'>
  <driver name='qemu' type='raw'/>
  <source pool='iscsi-pool' volume='unit:0:0:1' mode='host'/>
  <auth username='myuser'>
    <secret type='iscsi' usage='libvirtiscsi'/>
  </auth>
  <target dev='vdb' bus='virtio'/>
</disk>
This is an 'iscsi' pool format, but something similar can be crafted for
the 'scsi' pool used for fc_host devices.
As I described above, I created an NPIV pool for my FC backend. I'd also like to get
SCSI pass-through, which seems to be possible only if I use "device=lun". Is it the
case that I can NOT use "device=lun", and therefore NOT get SCSI pass-through,
if I use an NPIV storage pool? Is the only way to get SCSI pass-through to NOT
use a storage pool, and instead use the host LUNs?
So for the purposes of taking the right steps, I assume you used 'virsh
nodedev-list --cap vports' in order to find FC capable scsi_host#'s.
Then you created your vHBA based on the FC capable fc_host, using XML
such as:
<device>
  <parent>scsi_hostN</parent>
  <capability type='scsi_host'>
    <capability type='fc_host'>
    </capability>
  </capability>
</device>
where scsi_hostN (and 'N' in particular) is the FC-capable fc_host.
Then create the node device:
# virsh nodedev-create vhba.xml
Node device scsi_hostM created from vhba.xml
where 'M' is whatever the next available scsi_host# is on your host.
If you 'virsh nodedev-dumpxml scsi_hostM' you'll see the wwnn/wwpn details.
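As a rough sketch of how those wwnn/wwpn details could be carried over into a vHBA-backed 'scsi' pool definition, the Python below parses node-device XML and renders a pool. The sample XML, the WWN values, the pool name and the function names are all made up for illustration; only the element and attribute names follow the libvirt node-device and storage-pool formats as I understand them.

```python
import xml.etree.ElementTree as ET

# Hypothetical output of `virsh nodedev-dumpxml scsi_host6`; the WWNs are invented.
SAMPLE_NODEDEV_XML = """
<device>
  <name>scsi_host6</name>
  <parent>scsi_host5</parent>
  <capability type='scsi_host'>
    <host>6</host>
    <capability type='fc_host'>
      <wwnn>2001001b32a9da4e</wwnn>
      <wwpn>2101001b32a9da4e</wwpn>
    </capability>
  </capability>
</device>
"""


def vhba_identity(nodedev_xml):
    """Return (parent, wwnn, wwpn) from the node-device XML of a vHBA."""
    root = ET.fromstring(nodedev_xml)
    fc = root.find("./capability[@type='scsi_host']/capability[@type='fc_host']")
    if fc is None:
        raise ValueError("node device has no fc_host capability")
    return root.findtext("parent"), fc.findtext("wwnn"), fc.findtext("wwpn")


def vhba_pool_xml(pool_name, parent, wwnn, wwpn):
    """Render a 'scsi' pool whose source adapter is the vHBA."""
    pool = ET.Element("pool", type="scsi")
    ET.SubElement(pool, "name").text = pool_name
    source = ET.SubElement(pool, "source")
    ET.SubElement(source, "adapter", type="fc_host",
                  parent=parent, wwnn=wwnn, wwpn=wwpn)
    target = ET.SubElement(pool, "target")
    ET.SubElement(target, "path").text = "/dev/disk/by-path"
    return ET.tostring(pool, encoding="unicode")


if __name__ == "__main__":
    print(vhba_pool_xml("vhbapool_host6", *vhba_identity(SAMPLE_NODEDEV_XML)))
```

The printed XML could be saved to a file and passed to 'virsh pool-define'; whether the 'parent' attribute is accepted depends on the libvirt version in use.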
You can then create a vHBA scsi pool from that in order to ensure the
persistence of the vHBA. Although it's not required - the vHBA scsi
pool just allows you to provide a source pool and volume by unit # for
your guest rather than having to edit guests between host reboots or
other such conditions which cause
What do you think about this?:
<disk type='volume' device='disk'>
  <driver name='qemu' type='raw'/>
  <source pool='vhbapool_host8' volume='unit:0:0:1'/>
  <target dev='hda' bus='ide'/>
</disk>
But I'd prefer to be able to use something like this instead:
<disk type='volume' device='lun'>
  <driver name='qemu' type='raw'/>
  <source pool='vhbapool_host8' volume='unit:0:0:1'/>
  <target dev='vda' bus='scsi'/>
</disk>
But that might not be possible...?
The "volume" and "disk" or "volume" and "lun" syntax can be used
somewhat interchangeably. As you point out, the features for disk and
lun are slightly different. Usage of the 'lun' syntax allows addition
of the attribute "sgio='unfiltered'".
A couple of additional questions here:
* Since the target device is already defined in the pool, I don't see the reason for
defining it here as well, like in your example with the iscsi pool?
Let's forget about iscsi
* I'd like to use virtio-scsi combined with the pool, is that
possible?
Works on my test guest (ok not the order from dumpxml):
...
<controller type='scsi' index='0' model='virtio-scsi'>
  <alias name='scsi0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</controller>
...
<disk type='volume' device='lun'>
  <driver name='qemu' type='raw'/>
  <source pool='vhbapool_host3' volume='unit:0:4:0'/>
  <backingStore/>
  <target dev='sda' bus='scsi'/>
  <shareable/>
  <alias name='scsi0-0-0-0'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
...
* If so, how do I define that? I understand I can define a controller
separately, but how do I tell the guest to use that specific controller in combination with
that pool?
See above... The controller has a "type", "index", and "model"... Then
when adding the disk, use an <address> with type='drive' controller='#', where # is
the index number from your virtio-scsi controller.
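To make the link between the controller index and the disk address explicit, here is a small Python sketch that builds both elements so that the disk's controller attribute always matches the controller's index. The function names and default values are hypothetical; the element and attribute names are the ones used in the XML above.

```python
import xml.etree.ElementTree as ET


def virtio_scsi_controller(index):
    """Build a <controller> element for a virtio-scsi controller."""
    return ET.Element("controller", type="scsi",
                      index=str(index), model="virtio-scsi")


def pool_lun_disk(pool, volume, controller_index, unit, target_dev="sda"):
    """Build a pool-backed <disk device='lun'> bound to one SCSI controller.

    The binding is the <address type='drive'> element: its 'controller'
    attribute carries the index of the controller the disk hangs off.
    """
    disk = ET.Element("disk", type="volume", device="lun")
    ET.SubElement(disk, "driver", name="qemu", type="raw")
    ET.SubElement(disk, "source", pool=pool, volume=volume)
    ET.SubElement(disk, "target", dev=target_dev, bus="scsi")
    ET.SubElement(disk, "address", type="drive",
                  controller=str(controller_index),
                  bus="0", target="0", unit=str(unit))
    return disk


if __name__ == "__main__":
    print(ET.tostring(virtio_scsi_controller(0), encoding="unicode"))
    print(ET.tostring(pool_lun_disk("vhbapool_host3", "unit:0:4:0", 0, 0),
                      encoding="unicode"))
```

A second LUN on the same controller would keep controller_index the same and use a different unit value (and a different target_dev).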
* Since the NPIV pool obviously is a pool based on an FC initiator,
the FC target can/will provision more LUNs to that initiator. How will that affect
the pool and the guest's access to new LUNs? In this example the volume says
'unit:0:0:1', and I guess that will change if there are more LUNs in
there? Or is that "volume unit" the "scsi target device", which can hold
multiple LUNs?
You can use 'virsh pool-refresh $poolname' - it will find new LUNs...
Err, it *should* find new LUNs ;-) Existing 'unit:#:#:#' values
shouldn't change - they should be "tied to" the same wwnn. Use "virsh
vol-list $poolname" to see the Path. So when new ones are added they are
given new unit numbers. Reboots should find the same order.
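One way to check what a refresh actually picked up is to compare the 'virsh vol-list' output from before and after. The Python below is only a sketch: the two sample listings (including the by-path names) are invented, and the parser assumes the plain two-column "Name Path" layout.

```python
def parse_vol_list(text):
    """Parse the text output of `virsh vol-list POOL` into {name: path}."""
    volumes = {}
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, the dashed separator and the "Name Path" header.
        if len(fields) != 2 or fields == ["Name", "Path"]:
            continue
        volumes[fields[0]] = fields[1]
    return volumes


def new_luns(before, after):
    """Return the volume names present after a refresh but not before."""
    return sorted(set(after) - set(before))


# Hypothetical listings captured before and after `virsh pool-refresh`.
BEFORE = """
 Name         Path
----------------------------------------------------------------------------
 unit:0:0:0   /dev/disk/by-path/pci-0000:04:00.0-fc-0x203500a0b85ad1d7-lun-0
"""

AFTER = """
 Name         Path
----------------------------------------------------------------------------
 unit:0:0:0   /dev/disk/by-path/pci-0000:04:00.0-fc-0x203500a0b85ad1d7-lun-0
 unit:0:0:1   /dev/disk/by-path/pci-0000:04:00.0-fc-0x203500a0b85ad1d7-lun-1
"""

if __name__ == "__main__":
    print(new_luns(parse_vol_list(BEFORE), parse_vol_list(AFTER)))
```

The same comparison also shows whether any existing 'unit:#:#:#' name disappeared or changed its path, which is the thing that should not happen.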
...more...
I've found something here in the RHEL7 virt guide:
<disk type='volume' device='lun' sgio='unfiltered'>
  <driver name='qemu' type='raw'/>
  <source pool='vhbapool_host3' volume='unit:0:0:0'/>
  <target dev='sda' bus='scsi'/>
  <shareable />
</disk>
Fair warning, use of sgio='unfiltered' does require some specific
kernels... There were many "issues" with this - mostly related to kernel
support. If not supported by the kernel, you are advised :
error: Requested operation is not valid: unpriv_sgio is not supported by
this kernel
The question that shows up here is the multipath question. Since this is Fibre Channel it is
of course multipath. The "target dev" says 'sda'. In a multipath device
list it should say "/dev/mapper/mpatha".
How do I handle that?
Uhh... multipath... Not my strong suit... I'm taking an example from a
bz that you won't be able to read because it's marked private.
Once you have your vHBA and scsi_hostM for that vHBA on the host you can
use 'lsscsi' (you may have to yum/dnf install it - it's a very useful
tool)...
# lsscsi
...
//assume scsi_host6 is the new vHBA created as follows
[6:0:0:0] disk IBM 1726-4xx FAStT 0617 -
[6:0:1:0] disk IBM 1726-4xx FAStT 0617 -
[6:0:2:0] disk IBM 1726-4xx FAStT 0617 /dev/sdf
[6:0:3:0] disk IBM 1726-4xx FAStT 0617 /dev/sdg
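For picking the vHBA's entries out of a longer 'lsscsi' listing, a small filter like the Python sketch below may help. It only assumes that each device line starts with [host:channel:target:lun] and ends with the device node (or '-' when there is none); the function name is made up.

```python
import re

# First column of an lsscsi line, e.g. "[6:0:2:0]".
HCTL = re.compile(r"^\[(\d+):(\d+):(\d+):(\d+)\]$")


def devices_on_host(lsscsi_text, host):
    """Return [(h:c:t:l, device_node_or_None)] for one SCSI host number."""
    found = []
    for line in lsscsi_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # "...", comments and blank lines
        match = HCTL.match(fields[0])
        if match is None or int(match.group(1)) != host:
            continue
        node = fields[-1]
        found.append((fields[0].strip("[]"), None if node == "-" else node))
    return found


# The listing quoted above, with scsi_host6 being the vHBA.
SAMPLE = """
[6:0:0:0] disk IBM 1726-4xx FAStT 0617 -
[6:0:1:0] disk IBM 1726-4xx FAStT 0617 -
[6:0:2:0] disk IBM 1726-4xx FAStT 0617 /dev/sdf
[6:0:3:0] disk IBM 1726-4xx FAStT 0617 /dev/sdg
"""

if __name__ == "__main__":
    for hctl, node in devices_on_host(SAMPLE, 6):
        print(hctl, node)
```

Entries with no device node come back as None, so only the ones with a /dev/sdX path are candidates for multipath.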
You'll need an mpath pool:
# virsh pool-dumpxml mpath
<pool type='mpath'>
  <name>mpath</name>
  <source>
  </source>
  <target>
    <path>/dev/mapper</path>
    <permissions>
      <mode>0755</mode>
      <owner>-1</owner>
      <group>-1</group>
    </permissions>
  </target>
</pool>
# virsh pool-define mpath
# virsh pool-start mpath
# virsh vol-list mpath
Name    Path
-----------------------------------------
dm-0    /dev/mapper/3600a0b80005adb0b0000ab2d4cae9254
dm-5    /dev/mapper/3600a0b80005ad1d700002dde4fa32ca8   <=== this one is from vhba scsi_host6
Then using something like:
<disk type='block' device='lun' sgio='unfiltered'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/mapper/3600a0b80005ad1d700002dde4fa32ca8'/>
  <target dev='sda' bus='scsi'/>
  <alias name='scsi0-0-0-0'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
HTH,
John
(FWIW: I'm not sure how the leap of faith was taken that dm-5 is the
vHBA for scsi_host6... Although I think it's from the wwnn for a volume
in the vHBA as seen when using a virsh vol-list from a pool created
using the vHBA within the bz).
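One possible way to take the guesswork out of that mapping is to ask sysfs which SCSI paths sit underneath a device-mapper node. The Python below is a sketch under the assumption that the dm device is a plain multipath map whose slaves are sdX devices; the function name is made up, and sysfs_root is only a parameter so the logic can be tried against a fake tree.

```python
import os


def scsi_paths_behind(dm_name, sysfs_root="/sys"):
    """Return {slave: 'H:C:T:L'} for the block devices under a dm node.

    /sys/block/<dm>/slaves lists the underlying devices, and each
    /sys/block/<sdX>/device is a symlink whose final component is the
    SCSI address, so the leading number is the scsi_host.
    """
    slaves_dir = os.path.join(sysfs_root, "block", dm_name, "slaves")
    paths = {}
    for slave in sorted(os.listdir(slaves_dir)):
        device_link = os.path.join(sysfs_root, "block", slave, "device")
        paths[slave] = os.path.basename(os.path.realpath(device_link))
    return paths


def scsi_hosts_behind(dm_name, sysfs_root="/sys"):
    """Return the sorted set of scsi_host numbers feeding a dm node."""
    hctls = scsi_paths_behind(dm_name, sysfs_root).values()
    return sorted({int(hctl.split(":")[0]) for hctl in hctls})
```

For example, scsi_hosts_behind("dm-5") returning [6] would confirm that dm-5 is fed by scsi_host6; more than one host number in the result would mean the map has paths through several HBAs or vHBAs.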