On 22/01/14 22:02, John Ferlan wrote:
On 01/07/2014 05:37 AM, Osier Yang wrote:
> On 07/01/14 02:30, John Ferlan wrote:
>> On 01/06/2014 05:19 AM, Osier Yang wrote:
>>> The SCSI device corresponding to the vHBA might not show up in
>>> sysfs yet when we try to scan the LUNs. As a result, we end up
>>> with an empty volume set for the pool after pool-start, even if
>>> there are LUNs.
>> So what causes the "delay" to get the LUN's into sysfs?
> It's basically from the delay of udev.
>
>> Is there
>> something that can be done at creation time (or wherever) to sync that
>> sooner?
> I considered that, say at the point of "createVport". But
> "createVport" only creates the vHBA and does nothing else; the
> remaining work of making the device show up in the sysfs tree is
> then left to udev.
>
> Polling for the SCSI device in the sysfs tree right after
> "createVport" would take even more time.
>
>> Is there a way to determine that the SCSI device hasn't shown
>> up yet other than the readdir()? You're adding a delay/loop for some
>> other subsystem's inability? to sync or provide the resources.
> It's the only way AFAIK.
>
>>> Though the time for the device to show up is rather unpredictable,
>>> this is better than doing nothing: this patch introduces polling
>>> with a maximum of 5 * 1 seconds (a duration that works fine on my
>>> testing machine, at least). Note that for a pool which doesn't
>>> have any LUN, it will still take the full 5 seconds to poll, but
>>> that's not a bad trade-off: 5 seconds is not much, and in practice
>>> one rarely uses an empty pool.
>> Since the paths that call into this only seem to be via refresh,
>> perhaps a subsequent refresh would resolve things.
> Exactly, in most cases it will work, since the time window between
> pool-start and pool-refresh should *generally* be enough for the
> SCSI device to show up in the sysfs tree, but that is not
> guaranteed.
>
>> Could this be better served by documenting that, depending on
>> circumstance "X" (the answer to my first question), one may have to
>> refresh again in order to see elements in the pool? Assuming, of
>> course, that the next refresh would find them...
> I thought about this too, but the problem is that it's not
> guaranteed the volumes will be loaded after executing "pool-refresh"
> once; it may take 2, 3, ... N times. Also, the duration of each
> "pool-refresh" is not fixed; it depends on how long the
> "udevadm settle" (see virFileWaitForDevices) takes.
>
>> I guess I'm a bit cautious about adding randomly picked timeout values
>> based on some test because while it may work for you, perhaps it's 10
>> seconds for someone else. While you may consider a 5 second pause "OK"
>> and "reasonable" a customer may not consider that to be reasonable.
>> People (and testers) do strange and random things.
> Exactly, and this is not the only problem we have faced regarding
> the storage code; users keep asking why, why, and why.
>
>> Furthermore, could it be possible that you "catch" things perfectly
>> and only, say, 10 of 100 devices are found... But if you waited
>> another 5 seconds the other 90 devices would show up... I think by
>> adding this code you end up down a slippery slope of handling
>> fc_host devices specially...
> We are exactly on the same page, but the question is: what is the
> best solution we should provide? It looks ugly to add documentation
> saying one should use pool-refresh after the pool is started if the
> volumes are not loaded, when the number of pool-refresh invocations
> needed is undetermined. This patch was basically a proposal for
> discussion; I didn't expect it to be committed smoothly.
>
I'm still not convinced that the retry loop is the right way to go. We
could conceivably still have a failure even after 5 retries at one per
second.
Also is sleep() the right call or should it be usleep()? I have this
alarm (sic) going off in the back of my head about using sleep() in a
threaded context. Although I do see node_device_driver.c uses it...
Regardless of whether we (u)sleep() or not, if we still have a failure,
then we're still left documenting the fact that for fc_host type pools
there are "conditions" that need to be met, outside libvirt's scope of
influence, that would cause the pool not to be populated. The "user" is
still left trying to refresh multiple times. And we still don't know
exactly why we're not seeing the devices. The reality is this is also
true for other pools, since it's not guaranteed that processLU() is
ever called and 'retval' is always 0 (and could probably be removed
since it's pointless).
I gave up hacking around this in the code; instead, I'm writing a
patch that adds a "Note" to the virsh manual documenting this fact.
Something like this:
<...>
+B<Note>: For a pool which relies on remote resources, e.g. an "iscsi"
+type pool or a "scsi" type pool based on a (v)HBA, the corresponding
+devices for the volumes may not yet have shown up in the host's sysfs
+when the pool is started, so it may be necessary to refresh the pool
+(see B<pool-refresh>) to get the volumes correctly loaded. The number
+of refreshes needed depends on the network connection and the time
+the host takes to export the corresponding devices to sysfs.
</...>
Osier