On 22/01/14 22:02, John Ferlan wrote:
On 01/07/2014 05:37 AM, Osier Yang wrote:
> On 07/01/14 02:30, John Ferlan wrote:
>> On 01/06/2014 05:19 AM, Osier Yang wrote:
>>> The SCSI device corresponding to the vHBA might not show up in
>>> sysfs yet when we try to scan the LUNs. As a result, we end up
>>> with an empty volume set for the pool after pool-start, even if
>>> there are LUNs.
>> So what causes the "delay" to get the LUN's into sysfs?
> It's basically from the delay of udev.
>
>> Is there
>> something that can be done at creation time (or wherever) to sync that
>> sooner?
> I considered that, say at the point of "createVport". But
> "createVport" only creates the vHBA and does nothing else; the
> remaining work of making the device show up in the sysfs tree is
> then left to udev.
>
> Polling for the SCSI device in the sysfs tree right after
> "createVport" would take even more time.
>
>> Is there a way to determine that the SCSI device hasn't shown
>> up yet other than the readdir()? You're adding a delay/loop for some
>> other subsystem's inability? to sync or provide the resources.
> It's the only way AFAIK.
>
>>> Though the time for the device to show up is rather unpredictable,
>>> this is better than doing nothing: this patch introduces polling
>>> with a maximum of 5 * 1 seconds (a duration that works fine on my
>>> testing machine, at least). Note that for a pool which doesn't
>>> have any LUN, it will still take the full 5 seconds to poll, but
>>> that's not a bad trade-off: 5 seconds is not much, and in practice
>>> one rarely uses an empty pool.
>> Since the paths that call into this only seem to be via refresh,
>> perhaps a subsequent refresh would resolve things.
> Exactly, in most cases it will work, since the time window between
> pool-start and pool-refresh should *generally* be enough for the
> SCSI device to show up in the sysfs tree, but that is not
> guaranteed.
>
>> Could this be better served by documenting that, depending on
>> circumstance "X" (the answer to my first question), one may have to
>> refresh again in order to see elements in the pool? Assuming, of
>> course, that the next refresh would find them...
> I thought about this too, but the problem is that it's not
> guaranteed the volumes will be loaded after executing "pool-refresh"
> once; it may take 2, 3, ... N times. Also, the duration of each
> "pool-refresh" is not fixed; it depends on how long the
> "udevadm settle" (see virFileWaitForDevices) takes.
>
>> I guess I'm a bit cautious about adding randomly picked timeout values
>> based on some test because while it may work for you, perhaps it's 10
>> seconds for someone else. While you may consider a 5 second pause "OK"
>> and "reasonable" a customer may not consider that to be reasonable.
>> People (and testers) do strange and random things.
> Exactly, and this is not the only problem we have faced regarding
> the storage code; users keep asking why, why, and why.
>
>> Furthermore, could it be possible that you "catch" things perfectly
>> and only, say, 10 of 100 devices are found... But if you waited
>> another 5 seconds the other 90 devices would show up... I think by
>> adding this code you end up down a slippery slope of handling
>> fc_host devices specially...
> We are exactly on the same page, but the question is: what is the
> best solution we should provide? It looks ugly to add documentation
> saying one should use pool-refresh after the pool is started if the
> volumes are not loaded, when the number of pool-refresh invocations
> needed is undetermined. This patch was basically a proposal for
> discussion; I didn't expect it to be committed smoothly.
>
I'm still not convinced that the retry loop is the right way to go. We
could conceivably still have a failure even after 5 retries at one per
second.
Also is sleep() the right call or should it be usleep()? I have this
alarm (sic) going off in the back of my head about using sleep() in a
threaded context. Although I do see node_device_driver.c uses it...
Regardless of whether we (u)sleep() or not, if we still have a failure,
then we're still left documenting the fact that for fc_host type pools
there are "conditions" that need to be met, outside libvirt's scope of
influence, that would cause the pool not to be populated. The "user" is
still left trying to refresh multiple times. And we still don't know
exactly why we're not seeing the devices. The reality is this is also
true for other pools, since it's not guaranteed that processLU() is
ever called and 'retval' is always 0 (and could probably be removed
since it's pointless).
I gave up hacking around this in the code; instead, I'm writing a
patch that adds a "Note" to the virsh manual documenting this fact.
Something like this:
<...>
+B<Note>: For a pool which relies on remote resources, e.g. an "iscsi"
+type pool or a "scsi" type pool based on a (v)HBA, the corresponding
+devices for the volumes may not yet have shown up in the host's sysfs
+when the pool is started, so it may be necessary to refresh the pool
+(see B<pool-refresh>) to get the volumes correctly loaded. The number
+of refreshes needed depends on the network connection and the time
+the host takes to export the corresponding devices to sysfs.
</...>
Osier