On Mon, Dec 12, 2016 at 8:02 PM, Prasanna Kalever <pkalever(a)redhat.com> wrote:
On Wed, Dec 7, 2016 at 10:38 PM, Peter Krempa
<pkrempa(a)redhat.com> wrote:
> On Tue, Dec 06, 2016 at 22:52:01 +0530, Prasanna Kumar Kalever wrote:
>> Currently, if we have 4 extra attached disks, then for each disk we
>> need to call 'glfs_init' (over the network) and friends, which could be
>> costly.
>>
>> Additionally, the external snapshot scenario will further complicate the situation.
>
> This part was extracted to a separate patch already, thus this doesn't
> have to deal with snapshots in any way than with other gluster
> connections.
Yeah!
Sorry, this still holds good: on taking a snapshot,
virStorageFileGetMetadataRecurse will be called on each overlay,
and the respective virStorageFileInit calls will hit the cache
(in case the snapshots reside on the same volume).
Thanks,
--
Prasanna
>
>> This patch maintains a cache of glfs objects per volume, hence
>> all disks/snapshots belonging to a volume whose glfs object is cached
>> can avoid calls to 'glfs_init' and friends by simply taking a ref on
>> the existing object.
>>
>> The glfs object is shared only if the volume name and all the volfile servers match
>> (including hostname, transport and port number).
>>
>> Thanks to 'Peter Krempa' for all the inputs.
>>
>> Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever(a)redhat.com>
>> ---
>> src/storage/storage_backend_gluster.c | 279 +++++++++++++++++++++++++++++++---
>> 1 file changed, 254 insertions(+), 25 deletions(-)
>>
>> diff --git a/src/storage/storage_backend_gluster.c b/src/storage/storage_backend_gluster.c
>> index e0841ca..9f633ac 100644
>> --- a/src/storage/storage_backend_gluster.c
>> +++ b/src/storage/storage_backend_gluster.c
>> @@ -36,6 +36,20 @@
>>
>> VIR_LOG_INIT("storage.storage_backend_gluster");
>>
>> +
>> +typedef struct _virStorageBackendGlusterState virStorageBackendGlusterState;
>> +typedef virStorageBackendGlusterState *virStorageBackendGlusterStatePtr;
>> +
>> +typedef struct _virStorageFileBackendGlusterPriv virStorageFileBackendGlusterPriv;
>> +typedef virStorageFileBackendGlusterPriv *virStorageFileBackendGlusterPrivPtr;
>> +
>> +typedef struct _virStorageBackendGlusterCache virStorageBackendGlusterCache;
>> +typedef virStorageBackendGlusterCache *virStorageBackendGlusterCachePtr;
>> +
>> +typedef struct _virStorageBackendGlusterCacheConn virStorageBackendGlusterCacheConn;
>> +typedef virStorageBackendGlusterCacheConn *virStorageBackendGlusterCacheConnPtr;
>> +
>> +
>> struct _virStorageBackendGlusterState {
>> glfs_t *vol;
>>
>> @@ -47,8 +61,205 @@ struct _virStorageBackendGlusterState {
>>     char *dir; /* dir from URI, or "/"; always starts and ends in '/' */
>> };
>>
>> -typedef struct _virStorageBackendGlusterState virStorageBackendGlusterState;
>> -typedef virStorageBackendGlusterState *virStorageBackendGlusterStatePtr;
>> +struct _virStorageFileBackendGlusterPriv {
>> + glfs_t *vol;
>
> You should also keep a reference to the object in the cache, so that you
> can return it and don't have to look it up again.
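[Editor's aside: the suggestion above — cache the whole entry and return it already referenced, so it never has to be looked up twice — can be sketched in isolation. This is a minimal self-contained model with illustrative names and a plain int refcount; it is not libvirt's virObject API.]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified model: each cache entry carries its own reference count,
 * and a lookup hit bumps the count and returns the entry itself, so
 * the caller never has to walk the cache a second time.  All names
 * here are illustrative, not libvirt's. */
typedef struct {
    const char *volname;  /* gluster volume this connection belongs to */
    int refs;             /* stands in for virObjectRef/virObjectUnref */
    void *vol;            /* would be the shared glfs_t * */
} cache_entry;

typedef struct {
    cache_entry *entries;
    size_t nentries;
} conn_cache;

/* Return the matching entry with an extra reference taken, or NULL. */
static cache_entry *
cache_lookup_ref(conn_cache *cache, const char *volname)
{
    size_t i;

    for (i = 0; i < cache->nentries; i++) {
        if (strcmp(cache->entries[i].volname, volname) == 0) {
            cache->entries[i].refs++;   /* virObjectRef() in the real code */
            return &cache->entries[i];
        }
    }
    return NULL;
}
```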
>
>> + char *canonpath;
>> +};
>> +
>> +struct _virStorageBackendGlusterCacheConn {
>> + virObjectLockable parent;
>> + virStorageFileBackendGlusterPrivPtr priv;
>
> You don't really need the complete object here. You only need the glfs_t
> part. The 'canonpath' variable is different for each possible storage
> source object.
got it.
>
>> + size_t nhosts;
>> + virStorageNetHostDefPtr hosts;
>> + char *volname;
>> +};
>> +
>> +struct _virStorageBackendGlusterCache {
>> + virMutex lock;
>> + size_t nConn;
>> + virStorageBackendGlusterCacheConnPtr *conn;
>> +};
>> +
>> +
>> +static virStorageBackendGlusterCachePtr virStorageBackendGlusterGlobalCacheObj;
>> +static virClassPtr virStorageBackendGlusterCacheConnClass;
>> +static virOnceControl virStorageBackendGlusterCacheOnce = VIR_ONCE_CONTROL_INITIALIZER;
>> +static bool virStorageBackendGlusterCacheInitGlobalError;
>> +static bool virStorageBackendGlusterCacheConnCleaned;
>
> As said. Don't use global flags. You should be able to avoid both of
> them.
>
> Okay, I've seen below: they are not only avoidable, but your usage is
> outright dangerous.
Okay.
>
>> +
>> +
>> +static void virStorageBackendGlusterCacheConnDispose(void *obj);
>> +
>> +
>> +static int virStorageBackendGlusterCacheConnOnceInit(void)
>> +{
>> + if (!(virStorageBackendGlusterCacheConnClass =
>> + virClassNew(virClassForObjectLockable(),
>> +                   "virStorageBackendGlusterCacheConnClass",
>> + sizeof(virStorageBackendGlusterCacheConn),
>> + virStorageBackendGlusterCacheConnDispose)))
>
> I've noticed that you actually create a lockable object here, but don't
> use the locks. This is currently okay, since it's write only. The lock
> might come handy though ...
>
>> + return -1;
>> +
>> + return 0;
>> +}
>> +
>> +static void virStorageBackendGlusterCacheConnDispose(void *object)
>> +{
>> + virStorageBackendGlusterCacheConnPtr conn = object;
>> +
>> + glfs_fini(conn->priv->vol);
>> +
>> + VIR_FREE(conn->priv->canonpath);
>> + VIR_FREE(conn->priv);
>> + VIR_FREE(conn->volname);
>> +
>> + virStorageNetHostDefFree(conn->nhosts, conn->hosts);
>> +
>> + /* all set to delete the connection from cache */
>> + virStorageBackendGlusterCacheConnCleaned = true;
>> +}
>> +
>> +VIR_ONCE_GLOBAL_INIT(virStorageBackendGlusterCacheConn);
>> +
>> +static bool
>> +virStorageBackendGlusterCompareHosts(size_t nSrcHosts,
>> + virStorageNetHostDefPtr srcHosts,
>> + size_t nDstHosts,
>> + virStorageNetHostDefPtr dstHosts)
>> +{
>> + bool match = false;
>> + size_t i;
>> + size_t j;
>> +
>> + if (nSrcHosts != nDstHosts)
>> + return false;
>> + for (i = 0; i < nSrcHosts; i++) {
>> + for (j = 0; j < nDstHosts; j++) {
>> + if (srcHosts[i].type != dstHosts[j].type)
>> + continue;
>> + switch (srcHosts[i].type) {
>> + case VIR_STORAGE_NET_HOST_TRANS_UNIX:
>> + if (STREQ(srcHosts[i].u.uds.path, dstHosts[j].u.uds.path))
>> + match = true;
>> + break;
>> + case VIR_STORAGE_NET_HOST_TRANS_TCP:
>> + case VIR_STORAGE_NET_HOST_TRANS_RDMA:
>> +                if (STREQ(srcHosts[i].u.inet.addr, dstHosts[j].u.inet.addr) &&
>> +                    STREQ(srcHosts[i].u.inet.port, dstHosts[j].u.inet.port))
>> + match = true;
>> + break;
>> + case VIR_STORAGE_NET_HOST_TRANS_LAST:
>> + break;
>> + }
>> + if (match)
>> + break; /* out of inner for loop */
>> + }
>> + if (!match)
>> + return false;
>> + match = false; /* reset */
>> + }
>> + return true;
>> +}
>> +
>> +static int
>> +virStorageBackendGlusterCacheAdd(virStorageSourcePtr src,
>> + virStorageFileBackendGlusterPrivPtr priv)
>> +{
>> + virStorageBackendGlusterCacheConnPtr newConn = NULL;
>> + virStorageBackendGlusterCacheConnPtr conn = NULL;
>> + size_t i;
>> +
>> + for (i = 0; i < virStorageBackendGlusterGlobalCacheObj->nConn; i++) {
>
> Umm ... you iterate the cache ...
This is the problem with changing the code at the last minute.
I edited something after I did git format-patch.
Will address this.
>
>> + conn = virStorageBackendGlusterGlobalCacheObj->conn[i];
>> + if (STRNEQ(conn->volname, src->volume))
>> + continue;
>> +        if (virStorageBackendGlusterCompareHosts(conn->nhosts, conn->hosts,
>> +                                                 src->nhosts, src->hosts)) {
>> +            return 0;
>> +        }
>> + }
>> +
>> + if (virStorageBackendGlusterCacheConnInitialize() < 0)
>> + return -1;
>
> ... before attempting to initialize the global cache?!? That doesn't
> make sense. Also this call is not needed here, since the only entrypoint
> is virStorageFileBackendGlusterInit which initializes it.
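[Editor's aside: libvirt's virOnce wraps the same idea as POSIX pthread_once, which is why a single entrypoint calling it suffices and the redundant init call above can go. A minimal self-contained sketch with plain pthread_once and illustrative names, not libvirt's API:]

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* One-time global cache initialization: no matter how many callers
 * race into cache_get(), the init callback runs exactly once, so
 * scattering extra init calls through the code buys nothing. */
typedef struct {
    size_t nconn;     /* number of cached connections */
} gluster_cache;

static gluster_cache cache_storage;           /* static storage: init cannot fail */
static gluster_cache *global_cache;
static pthread_once_t cache_once = PTHREAD_ONCE_INIT;

static void
cache_init_once(void)
{
    global_cache = &cache_storage;            /* runs exactly once */
}

static gluster_cache *
cache_get(void)
{
    pthread_once(&cache_once, cache_init_once);
    return global_cache;                      /* same pointer for all callers */
}
```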
>
>> +
>> +    if (!(newConn = virObjectLockableNew(virStorageBackendGlusterCacheConnClass)))
>> + return -1;
>> +
>> + if (VIR_STRDUP(newConn->volname, src->volume) < 0)
>> + goto error;
>> +
>> + newConn->nhosts = src->nhosts;
>> +    if (!(newConn->hosts = virStorageNetHostDefCopy(src->nhosts, src->hosts)))
>> + goto error;
>> + newConn->priv = priv;
>> +
>> + if (VIR_APPEND_ELEMENT(virStorageBackendGlusterGlobalCacheObj->conn,
>> + virStorageBackendGlusterGlobalCacheObj->nConn,
>> + newConn) < 0)
>> + goto error;
>> +
>> + return 0;
>> +
>> + error:
>> + virStorageNetHostDefFree(newConn->nhosts, newConn->hosts);
>> + VIR_FREE(newConn->volname);
>
> You leak newConn in some cases.
Unref!
>
>> + return -1;
>> +}
>> +
>> +static glfs_t *
>> +virStorageBackendGlusterCacheQuery(virStorageSourcePtr src)
>
> We usually use "Lookup".
good one. My naming skills are pathetic. Mercy me ...
>
>> +{
>> + virStorageBackendGlusterCacheConnPtr conn;
>> + size_t i;
>> +
>> + if (virStorageBackendGlusterGlobalCacheObj == NULL)
>> + return NULL;
>> +
>> + for (i = 0; i < virStorageBackendGlusterGlobalCacheObj->nConn; i++) {
>> + conn = virStorageBackendGlusterGlobalCacheObj->conn[i];
>> + if (STRNEQ(conn->volname, src->volume))
>> + continue;
>> +        if (virStorageBackendGlusterCompareHosts(conn->nhosts, conn->hosts,
>> +                                                 src->nhosts, src->hosts)) {
>> + virObjectRef(conn);
>> + return conn->priv->vol;
>> + }
>> + }
>> +
>> + return NULL;
>> +}
>> +
>> +static void
>> +virStorageBackendGlusterCacheRefresh(virStorageSourcePtr src)
>
> This name is weird. Did you mean "Release"?
As you say.
>
>> +{
>> + virStorageFileBackendGlusterPrivPtr priv = src->drv->priv;
>> + virStorageBackendGlusterCacheConnPtr conn = NULL;
>> + size_t i;
>> +
>> + virMutexLock(&virStorageBackendGlusterGlobalCacheObj->lock);
>> + for (i = 0; i < virStorageBackendGlusterGlobalCacheObj->nConn; i++) {
>> + conn = virStorageBackendGlusterGlobalCacheObj->conn[i];
>> + if (STRNEQ(conn->volname, src->volume))
>> + continue;
>> +        if (virStorageBackendGlusterCompareHosts(conn->nhosts, conn->hosts,
>> +                                                 src->nhosts, src->hosts)) {
>> + virObjectUnref(conn);
>> + if (!virStorageBackendGlusterCacheConnCleaned) {
>
> Umm, nope, that's not how it works. This is totally thread unsafe. Another
> thread can modify your global flag.
>
> You probably did not check how virObjectUnref actually works. The return
> value notifies you whether the object was disposed (you removed the last
> reference) or not.
Exactly.
Okay, I realized virObjectUnref has a return value, which no one used
earlier, or I missed them all.
Bad hacker that I am, I used it the same way. I will start using the
return value.
The best part is, I did not notice your comment above and bugged the
code; after many tries I finally found that virObjectUnref has a return
value, and when I started reading the email for reply I found that you
had already given the clue above. Poor me!
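[Editor's aside: the contract being discussed can be modeled in a few lines. This is a simplified, self-contained sketch of the pattern — unref reports whether this call dropped the last reference — with illustrative names, not libvirt's implementation:]

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal model of the virObjectUnref contract: the return value
 * tells the caller whether the object was disposed, so the decision
 * to remove the cache entry is local to this call -- no shared
 * global flag that another thread could clobber. */
typedef struct {
    int refs;
} refobj;

static bool
refobj_unref(refobj *obj)
{
    if (--obj->refs > 0)
        return false;       /* others still hold references */

    /* dispose: in the real code glfs_fini() and friends run here */
    return true;            /* last reference was dropped */
}
```

A caller would then write `if (refobj_unref(conn)) { /* delete the cache slot */ }` instead of consulting a flag set from the dispose callback.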
>
>> + if (priv && priv->canonpath)
>> + VIR_FREE(priv->canonpath);
>> + goto unlock;
>> + }
>> + virStorageBackendGlusterCacheConnCleaned = false;
>> +            VIR_DELETE_ELEMENT(virStorageBackendGlusterGlobalCacheObj->conn, i,
>> +                               virStorageBackendGlusterGlobalCacheObj->nConn);
>> + VIR_FREE(src->drv);
>
> This leaks src->drv in cases where the entry was used by two objects
> simultaneously. This needs to be done in the calling function. (As also
> pointed out in one of the other patches)
Right, will make this part of virStorageFileDeinit.
>
>> + }
>> + }
>> +
>> + unlock:
>> + virMutexUnlock(&virStorageBackendGlusterGlobalCacheObj->lock);
>> +}
>>
>> static void
>> virStorageBackendGlusterClose(virStorageBackendGlusterStatePtr state)
>> @@ -538,32 +749,15 @@ virStorageBackend virStorageBackendGluster = {
>> };
>>
>>
>> -typedef struct _virStorageFileBackendGlusterPriv virStorageFileBackendGlusterPriv;
>> -typedef virStorageFileBackendGlusterPriv *virStorageFileBackendGlusterPrivPtr;
>> -
>> -struct _virStorageFileBackendGlusterPriv {
>> - glfs_t *vol;
>> - char *canonpath;
>> -};
>> -
>> -
>> static void
>> virStorageFileBackendGlusterDeinit(virStorageSourcePtr src)
>> {
>> - virStorageFileBackendGlusterPrivPtr priv = src->drv->priv;
>> -
>>     VIR_DEBUG("deinitializing gluster storage file %p (gluster://%s:%s/%s%s)",
>>               src, src->hosts->u.inet.addr,
>>               src->hosts->u.inet.port ? src->hosts->u.inet.port : "0",
>>               src->volume, src->path);
>>
>> - if (priv->vol)
>> - glfs_fini(priv->vol);
>> - VIR_FREE(priv->canonpath);
>> -
>> - VIR_FREE(priv);
>> -
>> - VIR_FREE(src->drv);
>> + virStorageBackendGlusterCacheRefresh(src);
>> }
>>
>> static int
>> @@ -610,6 +804,20 @@ virStorageFileBackendGlusterInitServer(virStorageFileBackendGlusterPrivPtr priv,
>> return 0;
>> }
>>
>> +static void
>> +virStorageBackendGlusterCacheInit(void)
>> +{
>> + if (VIR_ALLOC(virStorageBackendGlusterGlobalCacheObj) < 0) {
>> + virStorageBackendGlusterCacheInitGlobalError = false;
>> + return;
>> + }
>> +
>> +    if (virMutexInit(&virStorageBackendGlusterGlobalCacheObj->lock) < 0) {
>> + virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
>> + _("Unable to initialize mutex"));
>> + virStorageBackendGlusterCacheInitGlobalError = false;
>> + }
>> +}
>>
>> static int
>> virStorageFileBackendGlusterInit(virStorageSourcePtr src)
>> @@ -632,11 +840,32 @@ virStorageFileBackendGlusterInit(virStorageSourcePtr src)
>> src, priv, src->volume, src->path,
>>               (unsigned int)src->drv->uid, (unsigned int)src->drv->gid);
>>
>> + if (virOnce(&virStorageBackendGlusterCacheOnce,
>> + virStorageBackendGlusterCacheInit) < 0)
>> + return -1;
>> +
>> + if (virStorageBackendGlusterCacheInitGlobalError)
>> + return -1;
>> +
>> + virMutexLock(&virStorageBackendGlusterGlobalCacheObj->lock);
>> +
>> + priv->vol = virStorageBackendGlusterCacheQuery(src);
>> + if (priv->vol) {
>> + src->drv->priv = priv;
>> + virMutexUnlock(&virStorageBackendGlusterGlobalCacheObj->lock);
>> + return 0;
>> + }
>> +
>> if (!(priv->vol = glfs_new(src->volume))) {
>> virReportOOMError();
>> - goto error;
>> + goto unlock;
>> }
>>
>> + if (virStorageBackendGlusterCacheAdd(src, priv) < 0)
>> + goto unlock;
>> +
>> + virMutexUnlock(&virStorageBackendGlusterGlobalCacheObj->lock);
>> +
>
> So yet another thread-unsafe part. You correctly add the not-yet-opened
> connection to the cache, but there is no mechanism to guard stuff after
> this.
>
> If a second thread obtains the cache entry while the first one has not yet
> finished opening the connection, it will get an invalid one.
>
> Any thread obtaining an entry from the cache needs to check whether the
> connection was initialized already as the first thing to do. It can
> continue only if it's already open and otherwise needs to wait until the
> original thread finishes.
>
> The first thread that added the object into the cache needs to open the
> connection and notify all other threads potentially waiting on the
> condition that it was opened. A virCond should help.
Thanks for the clue, will need to figure out how to use virCond.
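[Editor's aside: the wait/notify pattern described above can be sketched with plain pthread mutex/cond in place of virMutex/virCond. Names and layout are illustrative, not libvirt's API: the opener inserts the entry, opens the connection, then marks it ready; any other thread that finds the entry first waits on the condition.]

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* A cache entry whose connection may still be opening.  Readers
 * block on the condition until the opening thread signals that the
 * connection is usable. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t opened;
    bool ready;             /* true once the connection is usable */
} pending_conn;

static void
conn_wait_ready(pending_conn *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->ready)                       /* loop: spurious wakeups */
        pthread_cond_wait(&c->opened, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

static void
conn_mark_ready(pending_conn *c)
{
    pthread_mutex_lock(&c->lock);
    c->ready = true;                        /* connection now open */
    pthread_cond_broadcast(&c->opened);     /* wake every waiter */
    pthread_mutex_unlock(&c->lock);
}
```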
>
>> for (i = 0; i < src->nhosts; i++) {
>>         if (virStorageFileBackendGlusterInitServer(priv, src->hosts + i) < 0)
>> goto error;
Apologies for the delay.
I got switched to an internal priority item in Gluster.
Will try to work on the reviews tomorrow.
Thanks Peter!
--
Prasanna