[PATCH 0/4] nodedev: adjust handling DASDs

Adjusting how DASDs are handled as recently ID_* tags are also included in the udev information which causes the problems reported by https://issues.redhat.com/browse/RHEL-39497 Removing the filtering of offline ccw devices as these devices are available in the system and also are used to set them online again. The state information is made available in the ccw capability as an optional state element. Boris Fiuczynski (4): nodedev: refactor storage type fixup nodedev: improve DASD detection nodedev: prevent invalid DASD node object creation nodedev: add ccw device state and remove fencing src/conf/node_device_conf.c | 24 +++++++++++++ src/conf/node_device_conf.h | 11 ++++++ src/conf/schemas/nodedev.rng | 8 +++++ src/node_device/node_device_udev.c | 56 +++++++++++++++++++++++++----- 4 files changed, 90 insertions(+), 9 deletions(-) -- 2.45.0

Refactor the storage type fixup into a reusable method. Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com> Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com> --- src/node_device/node_device_udev.c | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-) diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 3b85db00da..15e31d522a 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -962,6 +962,21 @@ udevProcessDASD(struct udev_device *device, } +static int +udevFixupStorageType(virNodeDeviceDef *def, const char *prefix, const char *subst) +{ + if (STRPREFIX(def->caps->data.storage.block, prefix)) { + def->caps->data.storage.drive_type = g_strdup(subst); + VIR_DEBUG("Found storage type '%s' for device with sysfs path '%s'", + def->caps->data.storage.drive_type, + def->sysfs_path); + return 1; + } + + return 0; +} + + /* This function exists to deal with the case in which a driver does * not provide a device type in the usual place, but udev told us it's * a storage device, and we can make a good guess at what kind of @@ -994,13 +1009,8 @@ udevKludgeStorageType(virNodeDeviceDef *def) def->sysfs_path); for (i = 0; i < G_N_ELEMENTS(fixups); i++) { - if (STRPREFIX(def->caps->data.storage.block, fixups[i].prefix)) { - def->caps->data.storage.drive_type = g_strdup(fixups[i].subst); - VIR_DEBUG("Found storage type '%s' for device with sysfs path '%s'", - def->caps->data.storage.drive_type, - def->sysfs_path); + if (udevFixupStorageType(def, fixups[i].prefix, fixups[i].subst)) return 0; - } } VIR_DEBUG("Could not determine storage type " -- 2.45.0

On 6/19/24 14:29, Boris Fiuczynski wrote:
Refactor the storage type fixup into a reusable method.
Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com> Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com> --- src/node_device/node_device_udev.c | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-)
diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 3b85db00da..15e31d522a 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -962,6 +962,21 @@ udevProcessDASD(struct udev_device *device, }
+static int +udevFixupStorageType(virNodeDeviceDef *def, const char *prefix, const char *subst)
Nitpick, one argument per line please.
+{ + if (STRPREFIX(def->caps->data.storage.block, prefix)) { + def->caps->data.storage.drive_type = g_strdup(subst); + VIR_DEBUG("Found storage type '%s' for device with sysfs path '%s'", + def->caps->data.storage.drive_type, + def->sysfs_path); + return 1; + } + + return 0; +} + + /* This function exists to deal with the case in which a driver does * not provide a device type in the usual place, but udev told us it's * a storage device, and we can make a good guess at what kind of @@ -994,13 +1009,8 @@ udevKludgeStorageType(virNodeDeviceDef *def) def->sysfs_path);
for (i = 0; i < G_N_ELEMENTS(fixups); i++) { - if (STRPREFIX(def->caps->data.storage.block, fixups[i].prefix)) { - def->caps->data.storage.drive_type = g_strdup(fixups[i].subst); - VIR_DEBUG("Found storage type '%s' for device with sysfs path '%s'", - def->caps->data.storage.drive_type, - def->sysfs_path); + if (udevFixupStorageType(def, fixups[i].prefix, fixups[i].subst)) return 0; - } }
VIR_DEBUG("Could not determine storage type "
Michal

In newer DASD driver versions the ID_TYPE tag is supported. This tag is missing after a system reboot but when the ccw device is set offline and online the tag is included. To fix this version independently we need to check if devices detected as type disk is actually a DASD to maintain the node object consistency and not end up with multiple node objects for DASDs. Resolves: https://issues.redhat.com/browse/RHEL-39497 Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com> Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com> --- src/node_device/node_device_udev.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index 15e31d522a..c0e258fe9c 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1065,6 +1065,10 @@ udevProcessStorage(struct udev_device *device, storage->drive_type = g_strdup("sd"); else if (udevKludgeStorageType(def) != 0) goto cleanup; + } else { + /* A detected disk might be a DASD */ + if (STREQ(def->caps->data.storage.drive_type, "disk")) + udevFixupStorageType(def, "/dev/dasd", "dasd"); } if (STREQ(def->caps->data.storage.drive_type, "cd") || -- 2.45.0

Prevent the creation of a new DASD node object when the device does not exist. Resolves: https://issues.redhat.com/browse/RHEL-39497 Reviewed-by: Marc Hartmayer <mhartmay@linux.ibm.com> Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com> --- src/node_device/node_device_udev.c | 3 +++ 1 file changed, 3 insertions(+) diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index c0e258fe9c..ad994ef0b2 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -958,6 +958,9 @@ udevProcessDASD(struct udev_device *device, udevGetStringSysfsAttr(device, "device/uid", &storage->serial); + if (!storage->serial) + return -1; + return udevProcessDisk(device, def); } -- 2.45.0

Instead of fencing offline ccw devices add the state to the ccw capability. Resolves: https://issues.redhat.com/browse/RHEL-39497 Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com> --- src/conf/node_device_conf.c | 24 ++++++++++++++++++++++++ src/conf/node_device_conf.h | 11 +++++++++++ src/conf/schemas/nodedev.rng | 8 ++++++++ src/node_device/node_device_udev.c | 27 ++++++++++++++++++++++++--- 4 files changed, 67 insertions(+), 3 deletions(-) diff --git a/src/conf/node_device_conf.c b/src/conf/node_device_conf.c index 416238bec1..08a89942ba 100644 --- a/src/conf/node_device_conf.c +++ b/src/conf/node_device_conf.c @@ -86,6 +86,12 @@ VIR_ENUM_IMPL(virNodeDevDRM, "render", ); +VIR_ENUM_IMPL(virNodeDevCCWState, + VIR_NODE_DEV_CCW_STATE_LAST, + "offline", + "online", +); + static int virNodeDevCapsDefParseString(const char *xpath, xmlXPathContextPtr ctxt, @@ -743,6 +749,10 @@ virNodeDeviceDefFormat(const virNodeDeviceDef *def, unsigned int flags) virNodeDeviceCapMdevDefFormat(&buf, data, inactive_state); break; case VIR_NODE_DEV_CAP_CCW_DEV: + if (data->ccw_dev.state != VIR_NODE_DEV_CCW_STATE_LAST) { + const char *state = virNodeDevCCWStateTypeToString(data->ccw_dev.state); + virBufferEscapeString(&buf, "<state>%s</state>\n", state); + } virNodeDeviceCapCCWDefFormat(&buf, data); break; case VIR_NODE_DEV_CAP_CSS_DEV: @@ -1189,9 +1199,23 @@ virNodeDevCapCCWParseXML(xmlXPathContextPtr ctxt, { VIR_XPATH_NODE_AUTORESTORE(ctxt) g_autofree virCCWDeviceAddress *ccw_addr = NULL; + g_autofree char *state = NULL; + int val; ctxt->node = node; + /* state is optional */ + ccw_dev->state = VIR_NODE_DEV_CCW_STATE_LAST; + + if ((state = virXPathString("string(./state[1])", ctxt))) { + if ((val = virNodeDevCCWStateTypeFromString(state)) < 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("unknown state '%1$s' for '%2$s'"), state, def->name); + return -1; + } + ccw_dev->state = val; + } + ccw_addr = g_new0(virCCWDeviceAddress, 1); if (virNodeDevCCWDeviceAddressParseXML(ctxt, node, def->name, ccw_addr) < 0) diff --git a/src/conf/node_device_conf.h b/src/conf/node_device_conf.h index 4b82636af7..32a8a5884b 100644 --- a/src/conf/node_device_conf.h +++ b/src/conf/node_device_conf.h @@ -112,6 +112,16 @@ typedef enum { VIR_NODE_DEV_CAP_FLAG_CSS_MDEV = (1 << 0), } virNodeDevCCWCapFlags; +typedef enum { + /* Keep in sync with VIR_ENUM_IMPL in node_device_conf.c */ + VIR_NODE_DEV_CCW_STATE_OFFLINE = 0, + VIR_NODE_DEV_CCW_STATE_ONLINE, + + VIR_NODE_DEV_CCW_STATE_LAST +} virNodeDevCCWStateType; + +VIR_ENUM_DECL(virNodeDevCCWState); + typedef enum { VIR_NODE_DEV_CAP_FLAG_AP_MATRIX_MDEV = (1 << 0), } virNodeDevAPMatrixCapFlags; @@ -279,6 +289,7 @@ struct _virNodeDevCapCCW { virMediatedDeviceType **mdev_types; size_t nmdev_types; virCCWDeviceAddress *channel_dev_addr; + virNodeDevCCWStateType state; }; typedef struct _virNodeDevCapVDPA virNodeDevCapVDPA; diff --git a/src/conf/schemas/nodedev.rng b/src/conf/schemas/nodedev.rng index ff07313968..42a0cdcfd9 100644 --- a/src/conf/schemas/nodedev.rng +++ b/src/conf/schemas/nodedev.rng @@ -673,6 +673,14 @@ <attribute name="type"> <value>ccw</value> </attribute> + <optional> + <element name="state"> + <choice> + <value>online</value> + <value>offline</value> + </choice> + </element> + </optional> <ref name="capccwaddress"/> </define> diff --git a/src/node_device/node_device_udev.c b/src/node_device/node_device_udev.c index ad994ef0b2..9d11261b88 100644 --- a/src/node_device/node_device_udev.c +++ b/src/node_device/node_device_udev.c @@ -1199,13 +1199,34 @@ udevGetCCWAddress(const char *sysfs_path, static int -udevProcessCCW(struct udev_device *device, - virNodeDeviceDef *def) +udevCCWGetState(struct udev_device *device, + virNodeDevCapData *data) { int online = 0; + if (udevGetIntSysfsAttr(device, "online", &online, 0) < 0 || online < 0) + return -1; + + switch (online) { + case VIR_NODE_DEV_CCW_STATE_OFFLINE: + case VIR_NODE_DEV_CCW_STATE_ONLINE: + data->ccw_dev.state = online; + break; + default: + data->ccw_dev.state = VIR_NODE_DEV_CCW_STATE_LAST; + break; + } + + return 0; +} + + +static int +udevProcessCCW(struct udev_device *device, + virNodeDeviceDef *def) +{ /* process only online devices to keep the list sane */ - if (udevGetIntSysfsAttr(device, "online", &online, 0) < 0 || online != 1) + if (udevCCWGetState(device, &def->caps->data) < 0) return -1; if (udevGetCCWAddress(def->sysfs_path, &def->caps->data) < 0) -- 2.45.0

On 6/19/24 14:29, Boris Fiuczynski wrote:
Instead of fencing offline ccw devices add the state to the ccw capability.
Resolves: https://issues.redhat.com/browse/RHEL-39497 Signed-off-by: Boris Fiuczynski <fiuczy@linux.ibm.com> --- src/conf/node_device_conf.c | 24 ++++++++++++++++++++++++ src/conf/node_device_conf.h | 11 +++++++++++ src/conf/schemas/nodedev.rng | 8 ++++++++ src/node_device/node_device_udev.c | 27 ++++++++++++++++++++++++--- 4 files changed, 67 insertions(+), 3 deletions(-)
diff --git a/src/conf/node_device_conf.c b/src/conf/node_device_conf.c index 416238bec1..08a89942ba 100644 --- a/src/conf/node_device_conf.c +++ b/src/conf/node_device_conf.c @@ -86,6 +86,12 @@ VIR_ENUM_IMPL(virNodeDevDRM, "render", );
+VIR_ENUM_IMPL(virNodeDevCCWState, + VIR_NODE_DEV_CCW_STATE_LAST, + "offline", + "online", +); + static int virNodeDevCapsDefParseString(const char *xpath, xmlXPathContextPtr ctxt, @@ -743,6 +749,10 @@ virNodeDeviceDefFormat(const virNodeDeviceDef *def, unsigned int flags) virNodeDeviceCapMdevDefFormat(&buf, data, inactive_state); break; case VIR_NODE_DEV_CAP_CCW_DEV: + if (data->ccw_dev.state != VIR_NODE_DEV_CCW_STATE_LAST) { + const char *state = virNodeDevCCWStateTypeToString(data->ccw_dev.state); + virBufferEscapeString(&buf, "<state>%s</state>\n", state); + } virNodeDeviceCapCCWDefFormat(&buf, data); break; case VIR_NODE_DEV_CAP_CSS_DEV: @@ -1189,9 +1199,23 @@ virNodeDevCapCCWParseXML(xmlXPathContextPtr ctxt, { VIR_XPATH_NODE_AUTORESTORE(ctxt) g_autofree virCCWDeviceAddress *ccw_addr = NULL; + g_autofree char *state = NULL; + int val;
ctxt->node = node;
+ /* state is optional */ + ccw_dev->state = VIR_NODE_DEV_CCW_STATE_LAST; + + if ((state = virXPathString("string(./state[1])", ctxt))) { + if ((val = virNodeDevCCWStateTypeFromString(state)) < 0) { + virReportError(VIR_ERR_CONFIG_UNSUPPORTED, + _("unknown state '%1$s' for '%2$s'"), state, def->name); + return -1; + } + ccw_dev->state = val; + } + ccw_addr = g_new0(virCCWDeviceAddress, 1);
if (virNodeDevCCWDeviceAddressParseXML(ctxt, node, def->name, ccw_addr) < 0) diff --git a/src/conf/node_device_conf.h b/src/conf/node_device_conf.h index 4b82636af7..32a8a5884b 100644 --- a/src/conf/node_device_conf.h +++ b/src/conf/node_device_conf.h @@ -112,6 +112,16 @@ typedef enum { VIR_NODE_DEV_CAP_FLAG_CSS_MDEV = (1 << 0), } virNodeDevCCWCapFlags;
+typedef enum { + /* Keep in sync with VIR_ENUM_IMPL in node_device_conf.c */
This is not needed because our VIR_ENUM_* macros do a compile time asserts to check if respective enums have the same number of members. I know you copied this over from a pre-existing code but the comment there is also unnecessary.
+ VIR_NODE_DEV_CCW_STATE_OFFLINE = 0, + VIR_NODE_DEV_CCW_STATE_ONLINE, + + VIR_NODE_DEV_CCW_STATE_LAST +} virNodeDevCCWStateType; + +VIR_ENUM_DECL(virNodeDevCCWState); +
Michal

Thank you, Boris. Is the following test scenario sufficient to describe this change, are you aware of any other scenarios that might be at risk of seeing a regression, maybe some scenario for other storage devices like SCSI? Given a DASD disk with identifier 0.0.4024 on subchannel 0.0.0030 When I deconfigure the device (chzdev -d 0.0.4024) And I configure the device (chzdev -e 0.0.4204) And I request the xml details of node devices (virsh nodedev-list --storage) Then the DASD disk is listed And its parent is a device with capability ccw and identifier 0.0.4024 And the parent of that device is another device with capability css and identifier 0.0.0030

On 6/19/24 5:10 PM, smitterl@redhat.com wrote:
Thank you, Boris.
Is the following test scenario sufficient to describe this change, are you aware of any other scenarios that might be at risk of seeing a regression, maybe some scenario for other storage devices like SCSI?
Given a DASD disk with identifier 0.0.4024 on subchannel 0.0.0030 When I deconfigure the device (chzdev -d 0.0.4024) And I configure the device (chzdev -e 0.0.4204) And I request the xml details of node devices (virsh nodedev-list --storage) I am not aware of a "storage" option on command nodedev-list. My guess is that your are referring to "nodedev-list --tree".
Then the DASD disk is listed And its parent is a device with capability ccw and identifier 0.0.4024 And the parent of that device is another device with capability css and identifier 0.0.0030
The first three patches do not actually change existing behavior but instead fix the problem described in the referenced issue. When the ccw device is configured (online) the relationships are and also should have been as you describe above. Theses patches are not modifying SCSI storage device node behavior. -- Mit freundlichen Grüßen/Kind regards Boris Fiuczynski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Wolfgang Wendt Geschäftsführung: David Faller Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294

On 6/19/24 5:10 PM, smitterl(a)redhat.com wrote: I am not aware of a "storage" option on command nodedev-list. My guess is that your are referring to "nodedev-list --tree". I meant the '--cap storage' option. But I see the '--tree' also works and it additionally displays the hierarchy that my test usually would traverse by checking dumpxml and there the parents.
The first three patches do not actually change existing behavior but instead fix the problem described in the referenced issue. When the ccw device is configured (online) the relationships are and also should have been as you describe above.
Theses patches are not modifying SCSI storage device node behavior. Understood.
I was able to confirm that with the scratch build Jiŕi Denemark provided a) the device is listed only once b) the device' parent is correctly set (ccw instead of css) c) no critical product regression hit in our test suite tp-libvirt for test set 'virsh.nodedev_list,virsh.nodedev_dumpxml'

On 6/19/24 14:29, Boris Fiuczynski wrote:
Adjusting how DASDs are handled as recently ID_* tags are also included in the udev information which causes the problems reported by https://issues.redhat.com/browse/RHEL-39497 Removing the filtering of offline ccw devices as these devices are available in the system and also are used to set them online again. The state information is made available in the ccw capability as an optional state element.
Boris Fiuczynski (4): nodedev: refactor storage type fixup nodedev: improve DASD detection nodedev: prevent invalid DASD node object creation nodedev: add ccw device state and remove fencing
src/conf/node_device_conf.c | 24 +++++++++++++ src/conf/node_device_conf.h | 11 ++++++ src/conf/schemas/nodedev.rng | 8 +++++ src/node_device/node_device_udev.c | 56 +++++++++++++++++++++++++----- 4 files changed, 90 insertions(+), 9 deletions(-)
Reviewed-by: Michal Privoznik <mprivozn@redhat.com> and merged. Michal

On 6/20/24 9:41 AM, Michal Prívozník wrote:
On 6/19/24 14:29, Boris Fiuczynski wrote:
Adjusting how DASDs are handled as recently ID_* tags are also included in the udev information which causes the problems reported by https://issues.redhat.com/browse/RHEL-39497 Removing the filtering of offline ccw devices as these devices are available in the system and also are used to set them online again. The state information is made available in the ccw capability as an optional state element.
Boris Fiuczynski (4): nodedev: refactor storage type fixup nodedev: improve DASD detection nodedev: prevent invalid DASD node object creation nodedev: add ccw device state and remove fencing
src/conf/node_device_conf.c | 24 +++++++++++++ src/conf/node_device_conf.h | 11 ++++++ src/conf/schemas/nodedev.rng | 8 +++++ src/node_device/node_device_udev.c | 56 +++++++++++++++++++++++++----- 4 files changed, 90 insertions(+), 9 deletions(-)
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
and merged.
Michal
Thanks, Michal. -- Mit freundlichen Grüßen/Kind regards Boris Fiuczynski IBM Deutschland Research & Development GmbH Vorsitzender des Aufsichtsrats: Wolfgang Wendt Geschäftsführung: David Faller Sitz der Gesellschaft: Böblingen Registergericht: Amtsgericht Stuttgart, HRB 243294
participants (3)
-
Boris Fiuczynski
-
Michal Prívozník
-
smitterl@redhat.com