
Ok. I agree. On 2021/3/26 17:10, Michal Privoznik wrote:
There's no need to CC random developers - we are subscribed to the list.
On 3/26/21 4:21 AM, wangjian wrote:> serialize access pci_get_strings function with a mutex.> > nodedev-init thread:
nodeStateInitializeEnumerate -> udevEnumerateDevices-> udevProcessDeviceListEntry -> udevAddOneDevice -> udevGetDeviceDetails-> udevProcessPCI -> udevTranslatePCIIds -> pci_get_strings -> (libpciaccess) find_device_name -> populate_vendor -> d = realloc( vend->devices, (vend->num_devices + 1), * sizeof( struct pci_device_leaf ) ); vend->num_devices++;
udev-event thread: udevEventHandleThread -> udevHandleOneDevice -> udevAddOneDevice-> udevGetDeviceDetails-> udevProcessPCI -> udevTranslatePCIIds -> pci_get_strings -> (libpciaccess) find_device_name -> populate_vendor -> d = realloc( vend->devices, (vend->num_devices + 1), * sizeof( struct pci_device_leaf ) ); vend->num_devices++;
Since the functions provided by libpciaccess are not thread-safe, when the udev-event and nodedev-init threads of libvirt call the pci_get_strings function provided by libpaciaccess at the same time. It will appear that for the same address, realloc a large space first, and then realloc a small space, which triggers the 'invalid next size' of realloc, leading to the thread core.
Signed-off-by: WangJian <wangjian161@huawei.com> --- src/node_device/node_device_udev.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 3f28858..cc752ba 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -331,6 +331,7 @@ udevGenerateDeviceName(struct udev_device *device, return 0; }
+static pthread_mutex_t g_pciaccess_mutex = PTHREAD_MUTEX_INITIALIZER;
We tend to use virMutex (even though it is pthread_mutex under the hood).
static int udevTranslatePCIIds(unsigned int vendor, @@ -350,11 +351,13 @@ udevTranslatePCIIds(unsigned int vendor, m.match_data = 0;
/* pci_get_strings returns void */ + pthread_mutex_lock(&g_pciaccess_mutex); pci_get_strings(&m, &device_name, &vendor_name, NULL, NULL); + pthread_mutex_unlock(&g_pciaccess_mutex);
*vendor_string = g_strdup(vendor_name); *product_string = g_strdup(device_name);
Apart from that, the patch is correct. I'd squash this in and push if you agree:
diff --git i/src/node_device/node_device_udev.c w/src/node_device/node_device_udev.c index cc752bad32..3daa5c90ad 100644 --- i/src/node_device/node_device_udev.c +++ w/src/node_device/node_device_udev.c @@ -331,7 +331,7 @@ udevGenerateDeviceName(struct udev_device *device, return 0; }
-static pthread_mutex_t g_pciaccess_mutex = PTHREAD_MUTEX_INITIALIZER; +static virMutex pciaccessMutex = VIR_MUTEX_INITIALIZER;
static int udevTranslatePCIIds(unsigned int vendor, @@ -350,14 +350,14 @@ udevTranslatePCIIds(unsigned int vendor, m.device_class_mask = 0; m.match_data = 0;
- /* pci_get_strings returns void */ - pthread_mutex_lock(&g_pciaccess_mutex); + /* pci_get_strings returns void and unfortunately is not thread safe. */ + virMutexLock(&pciaccessMutex); pci_get_strings(&m, &device_name, &vendor_name, NULL, NULL); - pthread_mutex_unlock(&g_pciaccess_mutex); + virMutexUnlock(&pciaccessMutex);
*vendor_string = g_strdup(vendor_name); *product_string = g_strdup(device_name);
Michal
.