[PATCH 0/2] virdevmapper: Deal with kernels without DM support

*** BLURB HERE *** Michal Prívozník (2): virdevmapper: Don't error on kernels without DM support virdevmapper: Deal with unloading dm module src/util/virdevmapper.c | 59 ++++++++++++++++++++++++++++++++--------- 1 file changed, 47 insertions(+), 12 deletions(-) -- 2.26.2

In one of my latest patch (v6.6.0~30) I was trying to remove libdevmapper use in favor of our own implementation. However, the code did not take into account that device mapper can be not compiled into the kernel (e.g. be a separate module that's not loaded) in which case /proc/devices won't have the device-mapper major number and thus virDevMapperGetTargets() and/or virIsDevMapperDevice() fails. Fixes: 22494556542c676d1b9e7f1c1f2ea13ac17e1e3e Reported-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virdevmapper.c | 41 +++++++++++++++++++++++++++++++---------- 1 file changed, 31 insertions(+), 10 deletions(-) diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c index fe7f611496..cc33d8211e 100644 --- a/src/util/virdevmapper.c +++ b/src/util/virdevmapper.c @@ -47,10 +47,11 @@ G_STATIC_ASSERT(BUF_SIZE > sizeof(struct dm_ioctl)); static unsigned int virDMMajor; +static virMutex virDevMapperInitMutex = VIR_MUTEX_INITIALIZER; static int -virDevMapperOnceInit(void) +virDevMapperGetMajor(unsigned int *dmmaj) { g_autofree char *buf = NULL; VIR_AUTOSTRINGLIST lines = NULL; @@ -69,23 +70,34 @@ virDevMapperOnceInit(void) if (sscanf(lines[i], "%u %ms\n", &maj, &dev) == 2 && STREQ(dev, DM_NAME)) { - virDMMajor = maj; + *dmmaj = maj; break; } } if (!lines[i]) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find major for %s"), - DM_NAME); - return -1; + /* Don't error here. Maybe the kernel was built without + * devmapper. Let the caller decide. */ + return -2; } return 0; } -VIR_ONCE_GLOBAL_INIT(virDevMapper); +static int +virDevMapperInit(void) +{ + int rc = 0; + + virMutexLock(&virDevMapperInitMutex); + + if (virDMMajor == 0) + rc = virDevMapperGetMajor(&virDMMajor); + + virMutexUnlock(&virDevMapperInitMutex); + return rc; +} static void * @@ -220,7 +232,6 @@ virDevMapperGetTargetsImpl(int controlFD, size_t i; memset(&dm, 0, sizeof(dm)); - *devPaths_ret = NULL; if (ttl == 0) { errno = ELOOP; @@ -298,14 +309,24 @@ virDevMapperGetTargets(const char *path, char ***devPaths) { VIR_AUTOCLOSE controlFD = -1; + int rc; const unsigned int ttl = 32; /* Arbitrary limit on recursion level. A devmapper target can * consist of devices or yet another targets. If that's the * case, we have to stop recursion somewhere. */ - if (virDevMapperInitialize() < 0) + *devPaths = NULL; + + if ((rc = virDevMapperInit()) < 0) { + if (rc == -2) { + /* Devmapper is not available. Yet. Maybe the module + * will be loaded later, but do not error out for now. */ + return 0; + } + return -1; + } if ((controlFD = virDMOpen()) < 0) return -1; @@ -319,7 +340,7 @@ virIsDevMapperDevice(const char *dev_name) { struct stat buf; - if (virDevMapperInitialize() < 0) + if (virDevMapperInit() < 0) return false; if (!stat(dev_name, &buf) && -- 2.26.2

On Mon, Aug 17, 2020 at 16:26:54 +0200, Michal Privoznik wrote:
In one of my latest patch (v6.6.0~30) I was trying to remove libdevmapper use in favor of our own implementation. However, the code did not take into account that device mapper can be not compiled into the kernel (e.g. be a separate module that's not loaded) in which case /proc/devices won't have the device-mapper major number and thus virDevMapperGetTargets() and/or virIsDevMapperDevice() fails.
Fixes: 22494556542c676d1b9e7f1c1f2ea13ac17e1e3e Reported-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virdevmapper.c | 41 +++++++++++++++++++++++++++++++---------- 1 file changed, 31 insertions(+), 10 deletions(-)
diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c index fe7f611496..cc33d8211e 100644 --- a/src/util/virdevmapper.c +++ b/src/util/virdevmapper.c @@ -47,10 +47,11 @@ G_STATIC_ASSERT(BUF_SIZE > sizeof(struct dm_ioctl));
static unsigned int virDMMajor; +static virMutex virDevMapperInitMutex = VIR_MUTEX_INITIALIZER;
'virDMMajor' should now be initialized explicitly ...
static int -virDevMapperOnceInit(void) +virDevMapperGetMajor(unsigned int *dmmaj) { g_autofree char *buf = NULL; VIR_AUTOSTRINGLIST lines = NULL; @@ -69,23 +70,34 @@ virDevMapperOnceInit(void)
if (sscanf(lines[i], "%u %ms\n", &maj, &dev) == 2 && STREQ(dev, DM_NAME)) { - virDMMajor = maj;
... since it's not always initialized here.
+ *dmmaj = maj; break; } }
if (!lines[i]) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find major for %s"), - DM_NAME); - return -1; + /* Don't error here. Maybe the kernel was built without + * devmapper. Let the caller decide. */ + return -2; }
return 0; }
-VIR_ONCE_GLOBAL_INIT(virDevMapper); +static int +virDevMapperInit(void) +{ + int rc = 0; + + virMutexLock(&virDevMapperInitMutex); + + if (virDMMajor == 0)
I'm not quite persuaded with this caching here. For cases when the device mapper is loaded we cache it, but in the negative case ...
+ rc = virDevMapperGetMajor(&virDMMajor);
... we always process /proc/devices. Why is caching necessary then?
+ + virMutexUnlock(&virDevMapperInitMutex); + return rc; +}
static void * @@ -220,7 +232,6 @@ virDevMapperGetTargetsImpl(int controlFD, size_t i;
memset(&dm, 0, sizeof(dm)); - *devPaths_ret = NULL;
if (ttl == 0) { errno = ELOOP; @@ -298,14 +309,24 @@ virDevMapperGetTargets(const char *path, char ***devPaths) { VIR_AUTOCLOSE controlFD = -1; + int rc; const unsigned int ttl = 32;
/* Arbitrary limit on recursion level. A devmapper target can * consist of devices or yet another targets. If that's the * case, we have to stop recursion somewhere. */
- if (virDevMapperInitialize() < 0) + *devPaths = NULL; + + if ((rc = virDevMapperInit()) < 0) { + if (rc == -2) { + /* Devmapper is not available. Yet. Maybe the module + * will be loaded later, but do not error out for now. */ + return 0; + } + return -1; + }
if ((controlFD = virDMOpen()) < 0) return -1; @@ -319,7 +340,7 @@ virIsDevMapperDevice(const char *dev_name) { struct stat buf;
- if (virDevMapperInitialize() < 0) + if (virDevMapperInit() < 0) return false;
if (!stat(dev_name, &buf) &&
Code after this hunk reads 'virDMMajor' directly without locks. Since virDevMapperInit returns it here it should be used from the return value rather than accessed directly which creates a code-pattern open for race conditions.
-- 2.26.2

On 8/17/20 5:16 PM, Peter Krempa wrote:
On Mon, Aug 17, 2020 at 16:26:54 +0200, Michal Privoznik wrote:
In one of my latest patch (v6.6.0~30) I was trying to remove libdevmapper use in favor of our own implementation. However, the code did not take into account that device mapper can be not compiled into the kernel (e.g. be a separate module that's not loaded) in which case /proc/devices won't have the device-mapper major number and thus virDevMapperGetTargets() and/or virIsDevMapperDevice() fails.
Fixes: 22494556542c676d1b9e7f1c1f2ea13ac17e1e3e Reported-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virdevmapper.c | 41 +++++++++++++++++++++++++++++++---------- 1 file changed, 31 insertions(+), 10 deletions(-)
diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c index fe7f611496..cc33d8211e 100644 --- a/src/util/virdevmapper.c +++ b/src/util/virdevmapper.c @@ -47,10 +47,11 @@ G_STATIC_ASSERT(BUF_SIZE > sizeof(struct dm_ioctl));
static unsigned int virDMMajor; +static virMutex virDevMapperInitMutex = VIR_MUTEX_INITIALIZER;
'virDMMajor' should now be initialized explicitly ...
Aren't global static variables initialized to zero automatically (something about .bss section)?
static int -virDevMapperOnceInit(void) +virDevMapperGetMajor(unsigned int *dmmaj) { g_autofree char *buf = NULL; VIR_AUTOSTRINGLIST lines = NULL; @@ -69,23 +70,34 @@ virDevMapperOnceInit(void)
if (sscanf(lines[i], "%u %ms\n", &maj, &dev) == 2 && STREQ(dev, DM_NAME)) { - virDMMajor = maj;
... since it's not always initialized here.
+ *dmmaj = maj; break; } }
if (!lines[i]) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find major for %s"), - DM_NAME); - return -1; + /* Don't error here. Maybe the kernel was built without + * devmapper. Let the caller decide. */ + return -2; }
return 0; }
-VIR_ONCE_GLOBAL_INIT(virDevMapper); +static int +virDevMapperInit(void) +{ + int rc = 0; + + virMutexLock(&virDevMapperInitMutex); + + if (virDMMajor == 0)
I'm not quite persuaded with this caching here. For cases when the device mapper is loaded we cache it, but in the negative case ...
+ rc = virDevMapperGetMajor(&virDMMajor);
... we always process /proc/devices. Why is caching necessary then?
Are you suggesting to parse /proc/devices every time? My concern is it will burn CPU cycles needlessly. And I'm not sure how to address the negative case when DM module is not loaded. It's ineffective, I agree, I just don't see how to make it better.
+ + virMutexUnlock(&virDevMapperInitMutex); + return rc; +}
static void * @@ -220,7 +232,6 @@ virDevMapperGetTargetsImpl(int controlFD, size_t i;
memset(&dm, 0, sizeof(dm)); - *devPaths_ret = NULL;
if (ttl == 0) { errno = ELOOP; @@ -298,14 +309,24 @@ virDevMapperGetTargets(const char *path, char ***devPaths) { VIR_AUTOCLOSE controlFD = -1; + int rc; const unsigned int ttl = 32;
/* Arbitrary limit on recursion level. A devmapper target can * consist of devices or yet another targets. If that's the * case, we have to stop recursion somewhere. */
- if (virDevMapperInitialize() < 0) + *devPaths = NULL; + + if ((rc = virDevMapperInit()) < 0) { + if (rc == -2) { + /* Devmapper is not available. Yet. Maybe the module + * will be loaded later, but do not error out for now. */ + return 0; + } + return -1; + }
if ((controlFD = virDMOpen()) < 0) return -1; @@ -319,7 +340,7 @@ virIsDevMapperDevice(const char *dev_name) { struct stat buf;
- if (virDevMapperInitialize() < 0) + if (virDevMapperInit() < 0) return false;
if (!stat(dev_name, &buf) &&
Code after this hunk reads 'virDMMajor' directly without locks. Since virDevMapperInit returns it here it should be used from the return value rather than accessed directly which creates a code-pattern open for race conditions.
I'm not sure I follow. virDevMapperInit() returns -2, -1, or 0. Not the major. And reading without lock is fine because there is no way that the virDMMajor is not set at this point (in which case the virDevMapperInit() is basically a NOP. But if it makes us feel better I can put some locking around. Michal

On Mon, Aug 17, 2020 at 17:40:04 +0200, Michal Privoznik wrote:
On 8/17/20 5:16 PM, Peter Krempa wrote:
On Mon, Aug 17, 2020 at 16:26:54 +0200, Michal Privoznik wrote:
In one of my latest patch (v6.6.0~30) I was trying to remove libdevmapper use in favor of our own implementation. However, the code did not take into account that device mapper can be not compiled into the kernel (e.g. be a separate module that's not loaded) in which case /proc/devices won't have the device-mapper major number and thus virDevMapperGetTargets() and/or virIsDevMapperDevice() fails.
Fixes: 22494556542c676d1b9e7f1c1f2ea13ac17e1e3e Reported-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virdevmapper.c | 41 +++++++++++++++++++++++++++++++---------- 1 file changed, 31 insertions(+), 10 deletions(-)
diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c index fe7f611496..cc33d8211e 100644 --- a/src/util/virdevmapper.c +++ b/src/util/virdevmapper.c @@ -47,10 +47,11 @@ G_STATIC_ASSERT(BUF_SIZE > sizeof(struct dm_ioctl)); static unsigned int virDMMajor; +static virMutex virDevMapperInitMutex = VIR_MUTEX_INITIALIZER;
'virDMMajor' should now be initialized explicitly ...
Aren't global static variables initialized to zero automatically (something about .bss section)?
Yes, they are.
static int -virDevMapperOnceInit(void) +virDevMapperGetMajor(unsigned int *dmmaj) { g_autofree char *buf = NULL; VIR_AUTOSTRINGLIST lines = NULL; @@ -69,23 +70,34 @@ virDevMapperOnceInit(void) if (sscanf(lines[i], "%u %ms\n", &maj, &dev) == 2 && STREQ(dev, DM_NAME)) { - virDMMajor = maj;
... since it's not always initialized here.
+ *dmmaj = maj; break; } } if (!lines[i]) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find major for %s"), - DM_NAME); - return -1; + /* Don't error here. Maybe the kernel was built without + * devmapper. Let the caller decide. */ + return -2; } return 0; } -VIR_ONCE_GLOBAL_INIT(virDevMapper); +static int +virDevMapperInit(void) +{ + int rc = 0; + + virMutexLock(&virDevMapperInitMutex); + + if (virDMMajor == 0)
I'm not quite persuaded with this caching here. For cases when the device mapper is loaded we cache it, but in the negative case ...
+ rc = virDevMapperGetMajor(&virDMMajor);
... we always process /proc/devices. Why is caching necessary then?
Are you suggesting to parse /proc/devices every time? My concern is it will burn CPU cycles needlessly. And I'm not sure how to address the negative case when DM module is not loaded. It's ineffective, I agree, I just don't see how to make it better.
Well, at first we should establish when the value is determined/can change: 1) at kernel build time/boot time (any case when the libvirtd process will need to be restarted for it to change) In these scenarios it doesn't actually make sense to check it prior to trying to open the device mapper control socket first. You can look it up and cache it after you open the socket and don't ever need to re-do it. In that case you can also use the existing VIR_ONCE code. You then don't have to clear it when the module is ejected, because afterwards the control socket will not exist. In case the module is re-injected, given the contstraints above the value will not change so no need to re-load. 2) at module insertion time In this case it's actually wrong to cache it, because the module can be unloaded and reloaded while libvirt will not check and update the cached value. In those scenarios it should also be determined only after you open the control fd frist.
+ + virMutexUnlock(&virDevMapperInitMutex); + return rc; +} static void * @@ -220,7 +232,6 @@ virDevMapperGetTargetsImpl(int controlFD, size_t i; memset(&dm, 0, sizeof(dm)); - *devPaths_ret = NULL; if (ttl == 0) { errno = ELOOP; @@ -298,14 +309,24 @@ virDevMapperGetTargets(const char *path, char ***devPaths) { VIR_AUTOCLOSE controlFD = -1; + int rc; const unsigned int ttl = 32; /* Arbitrary limit on recursion level. A devmapper target can * consist of devices or yet another targets. If that's the * case, we have to stop recursion somewhere. */ - if (virDevMapperInitialize() < 0) + *devPaths = NULL; + + if ((rc = virDevMapperInit()) < 0) { + if (rc == -2) { + /* Devmapper is not available. Yet. Maybe the module + * will be loaded later, but do not error out for now. */ + return 0; + } + return -1; + } if ((controlFD = virDMOpen()) < 0) return -1; @@ -319,7 +340,7 @@ virIsDevMapperDevice(const char *dev_name) { struct stat buf; - if (virDevMapperInitialize() < 0) + if (virDevMapperInit() < 0) return false; if (!stat(dev_name, &buf) &&
Code after this hunk reads 'virDMMajor' directly without locks. Since virDevMapperInit returns it here it should be used from the return value rather than accessed directly which creates a code-pattern open for race conditions.
I'm not sure I follow. virDevMapperInit() returns -2, -1, or 0. Not the major. And reading without lock is fine because there is no way that the
Sorry, I thought it returned it, which would IMO make sense given the locks.
virDMMajor is not set at this point (in which case the virDevMapperInit() is basically a NOP. But if it makes us feel better I can put some locking around.
Well, right in the next patch you introduce a way which can change the value without the lock to 0. I'm not saying that the code will break, but you are adding code which is not obvious without properly documenting the implications and contstraints in the code. Once somebody else will try to see how it works, this will most certainly look wrong on the first glance given the obvius races present even when they are impossible due to constraints outside of libvirt.

On 8/17/20 5:58 PM, Peter Krempa wrote:
On Mon, Aug 17, 2020 at 17:40:04 +0200, Michal Privoznik wrote:
On 8/17/20 5:16 PM, Peter Krempa wrote:
On Mon, Aug 17, 2020 at 16:26:54 +0200, Michal Privoznik wrote:
In one of my latest patch (v6.6.0~30) I was trying to remove libdevmapper use in favor of our own implementation. However, the code did not take into account that device mapper can be not compiled into the kernel (e.g. be a separate module that's not loaded) in which case /proc/devices won't have the device-mapper major number and thus virDevMapperGetTargets() and/or virIsDevMapperDevice() fails.
Fixes: 22494556542c676d1b9e7f1c1f2ea13ac17e1e3e Reported-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virdevmapper.c | 41 +++++++++++++++++++++++++++++++---------- 1 file changed, 31 insertions(+), 10 deletions(-)
diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c index fe7f611496..cc33d8211e 100644 --- a/src/util/virdevmapper.c +++ b/src/util/virdevmapper.c @@ -47,10 +47,11 @@ G_STATIC_ASSERT(BUF_SIZE > sizeof(struct dm_ioctl)); static unsigned int virDMMajor; +static virMutex virDevMapperInitMutex = VIR_MUTEX_INITIALIZER;
'virDMMajor' should now be initialized explicitly ...
Aren't global static variables initialized to zero automatically (something about .bss section)?
Yes, they are.
static int -virDevMapperOnceInit(void) +virDevMapperGetMajor(unsigned int *dmmaj) { g_autofree char *buf = NULL; VIR_AUTOSTRINGLIST lines = NULL; @@ -69,23 +70,34 @@ virDevMapperOnceInit(void) if (sscanf(lines[i], "%u %ms\n", &maj, &dev) == 2 && STREQ(dev, DM_NAME)) { - virDMMajor = maj;
... since it's not always initialized here.
+ *dmmaj = maj; break; } } if (!lines[i]) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find major for %s"), - DM_NAME); - return -1; + /* Don't error here. Maybe the kernel was built without + * devmapper. Let the caller decide. */ + return -2; } return 0; } -VIR_ONCE_GLOBAL_INIT(virDevMapper); +static int +virDevMapperInit(void) +{ + int rc = 0; + + virMutexLock(&virDevMapperInitMutex); + + if (virDMMajor == 0)
I'm not quite persuaded with this caching here. For cases when the device mapper is loaded we cache it, but in the negative case ...
+ rc = virDevMapperGetMajor(&virDMMajor);
... we always process /proc/devices. Why is caching necessary then?
Are you suggesting to parse /proc/devices every time? My concern is it will burn CPU cycles needlessly. And I'm not sure how to address the negative case when DM module is not loaded. It's ineffective, I agree, I just don't see how to make it better.
Well, at first we should establish when the value is determined/can change:
1) at kernel build time/boot time (any case when the libvirtd process will need to be restarted for it to change)
In these scenarios it doesn't actually make sense to check it prior to trying to open the device mapper control socket first. You can look it up and cache it after you open the socket and don't ever need to re-do it. In that case you can also use the existing VIR_ONCE code.
You then don't have to clear it when the module is ejected, because afterwards the control socket will not exist. In case the module is re-injected, given the contstraints above the value will not change so no need to re-load.
2) at module insertion time
In this case it's actually wrong to cache it, because the module can be unloaded and reloaded while libvirt will not check and update the cached value. In those scenarios it should also be determined only after you open the control fd frist.
As promised yesterday, I've dived into the code and found out that the major number can be specified as a parameter to the dm module (just tested and it works). So the next thing I tried was to see how could we check whether the module was reloaded. I've tried opening /dev/mapper/control hoping that I will get EOF on module unload (which I could then use to run a callback that would invalidate the cached value). But having the file open prevents unloading the module. Loading/unloading a module results in an udev event, BUT we listen for udev events only in nodedev driver and moving the code out to a driver agnostic location and making it driver agnostic is too much code for a little gain. Then I looked whether it's possible to get the major number via an ioctl(). But haven't found anything. Hence, I think the best solution for us is to not cache anything and simply parse /proc/devices every time. As an optimization, I can firstly check whether /dev/mapper/control exists at all an if not exit early. Michal

On Tuesday, 18 August 2020 09:54:25 CEST Michal Privoznik wrote:
On 8/17/20 5:58 PM, Peter Krempa wrote:
On Mon, Aug 17, 2020 at 17:40:04 +0200, Michal Privoznik wrote:
On 8/17/20 5:16 PM, Peter Krempa wrote:
On Mon, Aug 17, 2020 at 16:26:54 +0200, Michal Privoznik wrote:
In one of my latest patch (v6.6.0~30) I was trying to remove libdevmapper use in favor of our own implementation. However, the code did not take into account that device mapper can be not compiled into the kernel (e.g. be a separate module that's not loaded) in which case /proc/devices won't have the device-mapper major number and thus virDevMapperGetTargets() and/or virIsDevMapperDevice() fails.
Fixes: 22494556542c676d1b9e7f1c1f2ea13ac17e1e3e Reported-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virdevmapper.c | 41 +++++++++++++++++++++++++++++++---------- 1 file changed, 31 insertions(+), 10 deletions(-)
diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c index fe7f611496..cc33d8211e 100644 --- a/src/util/virdevmapper.c +++ b/src/util/virdevmapper.c @@ -47,10 +47,11 @@ G_STATIC_ASSERT(BUF_SIZE > sizeof(struct dm_ioctl)); static unsigned int virDMMajor; +static virMutex virDevMapperInitMutex = VIR_MUTEX_INITIALIZER;
'virDMMajor' should now be initialized explicitly ...
Aren't global static variables initialized to zero automatically (something about .bss section)?
Yes, they are.
static int -virDevMapperOnceInit(void) +virDevMapperGetMajor(unsigned int *dmmaj) { g_autofree char *buf = NULL; VIR_AUTOSTRINGLIST lines = NULL; @@ -69,23 +70,34 @@ virDevMapperOnceInit(void) if (sscanf(lines[i], "%u %ms\n", &maj, &dev) == 2 && STREQ(dev, DM_NAME)) { - virDMMajor = maj;
... since it's not always initialized here.
+ *dmmaj = maj; break; } } if (!lines[i]) { - virReportError(VIR_ERR_INTERNAL_ERROR, - _("Unable to find major for %s"), - DM_NAME); - return -1; + /* Don't error here. Maybe the kernel was built without + * devmapper. Let the caller decide. */ + return -2; } return 0; } -VIR_ONCE_GLOBAL_INIT(virDevMapper); +static int +virDevMapperInit(void) +{ + int rc = 0; + + virMutexLock(&virDevMapperInitMutex); + + if (virDMMajor == 0)
I'm not quite persuaded with this caching here. For cases when the device mapper is loaded we cache it, but in the negative case ...
+ rc = virDevMapperGetMajor(&virDMMajor);
... we always process /proc/devices. Why is caching necessary then?
Are you suggesting to parse /proc/devices every time? My concern is it will burn CPU cycles needlessly. And I'm not sure how to address the negative case when DM module is not loaded. It's ineffective, I agree, I just don't see how to make it better.
Well, at first we should establish when the value is determined/can change:
1) at kernel build time/boot time (any case when the libvirtd process will need to be restarted for it to change)
In these scenarios it doesn't actually make sense to check it prior to trying to open the device mapper control socket first. You can look it up and cache it after you open the socket and don't ever need to re-do it. In that case you can also use the existing VIR_ONCE code.
You then don't have to clear it when the module is ejected, because afterwards the control socket will not exist. In case the module is re-injected, given the contstraints above the value will not change so no need to re-load.
2) at module insertion time
In this case it's actually wrong to cache it, because the module can be unloaded and reloaded while libvirt will not check and update the cached value. In those scenarios it should also be determined only after you open the control fd frist.
As promised yesterday, I've dived into the code and found out that the major number can be specified as a parameter to the dm module (just tested and it works). So the next thing I tried was to see how could we check whether the module was reloaded. I've tried opening /dev/mapper/control hoping that I will get EOF on module unload (which I could then use to run a callback that would invalidate the cached value). But having the file open prevents unloading the module.
Loading/unloading a module results in an udev event, BUT we listen for udev events only in nodedev driver and moving the code out to a driver agnostic location and making it driver agnostic is too much code for a little gain.
Then I looked whether it's possible to get the major number via an ioctl(). But haven't found anything.
What about stat()ing /dev/mapper/control? That should give you the major/minor of that special character device. -- Pino Toscano

On 8/18/20 10:10 AM, Pino Toscano wrote:
What about stat()ing /dev/mapper/control? That should give you the major/minor of that special character device.
That won't help. We need to get the major of devmapper targets. For instance, if there is /dev/dm-0 node, it's going to be a block device with major of devmapper targets. All devmapper targets will have the same major number. And this number can be defined when loading the module. Then, /dev/mapper/control is a character device with completely different major:minor numbers. Michal

On Tue, Aug 18, 2020 at 09:54:25 +0200, Michal Privoznik wrote: [...]
As promised yesterday, I've dived into the code and found out that the major number can be specified as a parameter to the dm module (just tested and it works). So the next thing I tried was to see how could we check whether the module was reloaded. I've tried opening /dev/mapper/control hoping that I will get EOF on module unload (which I could then use to run a callback that would invalidate the cached value). But having the file open prevents unloading the module.
Loading/unloading a module results in an udev event, BUT we listen for udev events only in nodedev driver and moving the code out to a driver agnostic location and making it driver agnostic is too much code for a little gain.
Then I looked whether it's possible to get the major number via an ioctl(). But haven't found anything.
Hence, I think the best solution for us is to not cache anything and simply parse /proc/devices every time. As an optimization, I can firstly check whether /dev/mapper/control exists at all an if not exit early.
I agree with this solution. In the end, loading a <1k file from procfs and looking for a string doesn't really add too much to the code. In the end libvirt mainly just shovels strings from one pile to another most of the time. (see domain definition copy function for example).

The following situation can happen: we've initialized DM major successfully. But later, the devmapper module(s) was removed and thus kernel lost its ability to handle multipath devices. More importantly, the /dev/mapper/control file is removed and thus our attempt to open it will fail. Note, we will attempt to open it only after we've found devmapper major number successfully in one of the previous runs. Anyway, if it so happens that the module was unloaded then we have to reset the major number we remembered from one of the previous runs and not report error. Fixes: 22494556542c676d1b9e7f1c1f2ea13ac17e1e3e Reported-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virdevmapper.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-) diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c index cc33d8211e..642f1d9bdb 100644 --- a/src/util/virdevmapper.c +++ b/src/util/virdevmapper.c @@ -143,8 +143,13 @@ virDMOpen(void) memset(&dm, 0, sizeof(dm)); - if ((controlFD = open(CONTROL_PATH, O_RDWR)) < 0) + if ((controlFD = open(CONTROL_PATH, O_RDWR)) < 0) { + if (errno == ENOENT) + return -2; + + virReportSystemError(errno, _("Unable to open %s"), CONTROL_PATH); return -1; + } if (!virDMIoctl(controlFD, DM_VERSION, &dm, &tmp)) { virReportSystemError(errno, "%s", @@ -328,8 +333,17 @@ virDevMapperGetTargets(const char *path, return -1; } - if ((controlFD = virDMOpen()) < 0) + if ((controlFD = virDMOpen()) < 0) { + if (controlFD == -2) { + /* Devmapper was available but now it isn't. Somebody + * must have removed the module. Reset the major + * number we remember. */ + virDMMajor = 0; + return 0; + } + return -1; + } return virDevMapperGetTargetsImpl(controlFD, path, devPaths, ttl); } -- 2.26.2

On Mon, Aug 17, 2020 at 16:26:55 +0200, Michal Privoznik wrote:
The following situation can happen: we've initialized DM major successfully. But later, the devmapper module(s) was removed and thus kernel lost its ability to handle multipath devices. More importantly, the /dev/mapper/control file is removed and thus our attempt to open it will fail. Note, we will attempt to open it only after we've found devmapper major number successfully in one of the previous runs. Anyway, if it so happens that the module was unloaded then we have to reset the major number we remembered from one of the previous runs and not report error.
Fixes: 22494556542c676d1b9e7f1c1f2ea13ac17e1e3e Reported-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virdevmapper.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c index cc33d8211e..642f1d9bdb 100644 --- a/src/util/virdevmapper.c +++ b/src/util/virdevmapper.c @@ -143,8 +143,13 @@ virDMOpen(void)
memset(&dm, 0, sizeof(dm));
- if ((controlFD = open(CONTROL_PATH, O_RDWR)) < 0) + if ((controlFD = open(CONTROL_PATH, O_RDWR)) < 0) { + if (errno == ENOENT) + return -2; + + virReportSystemError(errno, _("Unable to open %s"), CONTROL_PATH); return -1; + }
if (!virDMIoctl(controlFD, DM_VERSION, &dm, &tmp)) { virReportSystemError(errno, "%s", @@ -328,8 +333,17 @@ virDevMapperGetTargets(const char *path, return -1; }
- if ((controlFD = virDMOpen()) < 0) + if ((controlFD = virDMOpen()) < 0) { + if (controlFD == -2) { + /* Devmapper was available but now it isn't. Somebody + * must have removed the module. Reset the major + * number we remember. */ + virDMMajor = 0;
This overwrites 'virDMMajor' without taking the mutex. 'unsigned int' on x86_64 will be atomic, we shouldn't use a knowingly broken code pattern without at least a proper explanation why it's okay. Also as noted in review of previous patch this looks shady and error-prone. If the unloading and re-loading of the device mapper module changes the 'major' number of number we use internally for checking, then caching it without validating that somebody didn't unload and reload the module between two calls to virDMOpen is wrong as we might use the wrong number. In case when for a given kernel it can only have one value regardless of how you load the module, you can cache it without the mutex dance after calling virDMOpen and remember it for ever, you'll still be gated whether the control device path exists. In any other case, caching it is just wrong.
+ return 0; + } + return -1; + }
return virDevMapperGetTargetsImpl(controlFD, path, devPaths, ttl); } -- 2.26.2

On 8/17/20 5:28 PM, Peter Krempa wrote:
On Mon, Aug 17, 2020 at 16:26:55 +0200, Michal Privoznik wrote:
The following situation can happen: we've initialized DM major successfully. But later, the devmapper module(s) was removed and thus kernel lost its ability to handle multipath devices. More importantly, the /dev/mapper/control file is removed and thus our attempt to open it will fail. Note, we will attempt to open it only after we've found devmapper major number successfully in one of the previous runs. Anyway, if it so happens that the module was unloaded then we have to reset the major number we remembered from one of the previous runs and not report error.
Fixes: 22494556542c676d1b9e7f1c1f2ea13ac17e1e3e Reported-by: Andrea Bolognani <abologna@redhat.com> Signed-off-by: Michal Privoznik <mprivozn@redhat.com> --- src/util/virdevmapper.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c index cc33d8211e..642f1d9bdb 100644 --- a/src/util/virdevmapper.c +++ b/src/util/virdevmapper.c @@ -143,8 +143,13 @@ virDMOpen(void)
memset(&dm, 0, sizeof(dm));
- if ((controlFD = open(CONTROL_PATH, O_RDWR)) < 0) + if ((controlFD = open(CONTROL_PATH, O_RDWR)) < 0) { + if (errno == ENOENT) + return -2; + + virReportSystemError(errno, _("Unable to open %s"), CONTROL_PATH); return -1; + }
if (!virDMIoctl(controlFD, DM_VERSION, &dm, &tmp)) { virReportSystemError(errno, "%s", @@ -328,8 +333,17 @@ virDevMapperGetTargets(const char *path, return -1; }
- if ((controlFD = virDMOpen()) < 0) + if ((controlFD = virDMOpen()) < 0) { + if (controlFD == -2) { + /* Devmapper was available but now it isn't. Somebody + * must have removed the module. Reset the major + * number we remember. */ + virDMMajor = 0;
This overwrites 'virDMMajor' without taking the mutex. 'unsigned int' on x86_64 will be atomic, we shouldn't use a knowingly broken code pattern without at least a proper explanation why it's okay.
Also as noted in review of previous patch this looks shady and error-prone. If the unloading and re-loading of the device mapper module changes the 'major' number of number we use internally for checking, then caching it without validating that somebody didn't unload and reload the module between two calls to virDMOpen is wrong as we might use the wrong number.
I haven't looked into the kernel code, but IIRC even libdevmapper considered the major number immutable. Let me dive into the code.
In case when for a given kernel it can only have one value regardless of how you load the module, you can cache it without the mutex dance after calling virDMOpen and remember it for ever, you'll still be gated whether the control device path exists.
The dance with locks is to have only one thread parse /proc/devices instead of all of them at once. That's why I did not bother guarding other accesses to virDMMajor with locks. Michal

On Mon, 2020-08-17 at 17:28 +0200, Peter Krempa wrote:
On Mon, Aug 17, 2020 at 16:26:55 +0200, Michal Privoznik wrote:
- if ((controlFD = virDMOpen()) < 0) + if ((controlFD = virDMOpen()) < 0) { + if (controlFD == -2) { + /* Devmapper was available but now it isn't. Somebody + * must have removed the module. Reset the major + * number we remember. */ + virDMMajor = 0;
This overwrites 'virDMMajor' without taking the mutex. 'unsigned int' on x86_64 will be atomic, we shouldn't use a knowingly broken code pattern without at least a proper explanation why it's okay.
IIUC you're saying that you'd be okay with adding a comment that explains why this will work fine on x86_64. If so, I'd like to point out that libvirt supports many other architectures... Is the same pattern safe on those as well? -- Andrea Bolognani / Red Hat / Virtualization

On Mon, Aug 17, 2020 at 18:02:04 +0200, Andrea Bolognani wrote:
On Mon, 2020-08-17 at 17:28 +0200, Peter Krempa wrote:
On Mon, Aug 17, 2020 at 16:26:55 +0200, Michal Privoznik wrote:
- if ((controlFD = virDMOpen()) < 0) + if ((controlFD = virDMOpen()) < 0) { + if (controlFD == -2) { + /* Devmapper was available but now it isn't. Somebody + * must have removed the module. Reset the major + * number we remember. */ + virDMMajor = 0;
This overwrites 'virDMMajor' without taking the mutex. 'unsigned int' on x86_64 will be atomic, we shouldn't use a knowingly broken code pattern without at least a proper explanation why it's okay.
IIUC you're saying that you'd be okay with adding a comment that explains why this will work fine on x86_64. If so, I'd like to point out that libvirt supports many other architectures... Is the same pattern safe on those as well?
It would really depend on how the arguments in the explanation persuade me that the non-obvious/weird/wrong-looking code is justified in that particular scenario. If the justification doesn't include that it's safe in every scenario I'd protest it anyways. What I wanted to say is that I know that this works on x86 but it goes against established patterns of lock usage. That was to prevent any discussion in the direction "but it's safe to do in this case" from happening without moving towards a better code pattern or explanation why it's necessary in this case.

On Mon, Aug 17, 2020 at 4:27 PM Michal Privoznik <mprivozn@redhat.com> wrote:
*** BLURB HERE ***
Michal Prívozník (2): virdevmapper: Don't error on kernels without DM support virdevmapper: Deal with unloading dm module
Hi Michal, I know from the discussions on the patches that there likely will be some changes before this is complete. Nevertheless I did some testing with them already. In particular since I was reporting this to also affect system containers that have device-mapper loaded as module and in /proc/devices but no /dev/mapper/control exposed. I wanted to give your patches a try for this use case as well since I wasn't sure anyone else would. I can confirm that functionally your patches applied on top of 6.6 (as we work on it for Debian&Ubuntu) make it work again. Therefore: Tested-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
src/util/virdevmapper.c | 59 ++++++++++++++++++++++++++++++++--------- 1 file changed, 47 insertions(+), 12 deletions(-)
-- 2.26.2
-- Christian Ehrhardt Staff Engineer, Ubuntu Server Canonical Ltd
participants (5)
-
Andrea Bolognani
-
Christian Ehrhardt
-
Michal Privoznik
-
Peter Krempa
-
Pino Toscano