[PATCH 0/2] Add locking option for disk

There is a case that Linux has a bug and unlocking does not work properly like this: https://lore.kernel.org/lkml/20230608084609.14245-1-zhangjiachen.jaycee@byte... Especiall in the situation that live migration source node has this kind of bug, destination must not locking, or otherwise VM stops when live migration runs. There commits add workaround for this kind of problems. Could you take a look? Especially following is points I could not make sure: In the first commit, I wrote explanation on file section in the document, but placed parser and formatter in virDomainStorageSourceParse and virDomainStorageSourceParse because locking can be common if added to other backends. Is it good to place here, or is there some better location to place? Hiroki Narukawa (2): conf: add locking option to disk source qemu_block: add locking option docs/formatdomain.rst | 5 ++++ src/conf/domain_conf.c | 8 +++++++ src/conf/schemas/domaincommon.rng | 5 ++++ src/conf/storage_source_conf.h | 3 +++ src/qemu/qemu_block.c | 7 ++++++ tests/qemublocktest.c | 1 + ...le-raw-aio_native-locking-off-srconly.json | 10 ++++++++ .../file-raw-aio_native-locking-off.json | 23 +++++++++++++++++++ .../file-raw-aio_native-locking-off.xml | 12 ++++++++++ 9 files changed, 74 insertions(+) create mode 100644 tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off-srconly.json create mode 100644 tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off.json create mode 100644 tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off.xml -- 2.25.1

There is a case that locking hits a bug and users wants to disable locking like bug in Linux kernel. This commit adds option to configure locking for file source. Signed-off-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp> --- docs/formatdomain.rst | 5 +++++ src/conf/domain_conf.c | 8 ++++++++ src/conf/schemas/domaincommon.rng | 5 +++++ src/conf/storage_source_conf.h | 3 +++ 4 files changed, 21 insertions(+) diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index 0a4f9d9000..95869d573c 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -2782,6 +2782,11 @@ paravirtualized driver is specified via the ``disk`` element. via the filesystem. The filename passed via ``file`` can still be used to generate paths to write into image metadata when doing block operations but libvirt will not access these natively. + + :since:`Since 10.1.0` a new optional attribute ``locking`` can be added + which can take "on" or "off". Basically this option should be kept + default, but it can be used explicitly for workaround about bug in + locking. ``block`` The ``dev`` attribute specifies the fully-qualified path to the host device to serve as the disk. :since:`Since 0.0.3` diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c index 3597959e33..b7cefffc7c 100644 --- a/src/conf/domain_conf.c +++ b/src/conf/domain_conf.c @@ -7563,6 +7563,10 @@ virDomainStorageSourceParse(xmlNodePtr node, ctxt, flags) < 0) return -1; + if (virXMLPropTristateSwitch(node, "locking", VIR_XML_PROP_NONE, + &src->locking) < 0) + return -1; + /* People sometimes pass a bogus '' source path when they mean to omit the * source element completely (e.g. CDROM without media). This is just a * little compatibility check to help those broken apps */ @@ -22544,6 +22548,10 @@ virDomainDiskSourceFormat(virBuffer *buf, if (attrIndex && src->id != 0) virBufferAsprintf(&attrBuf, " index='%u'", src->id); + if (src->locking) + virBufferEscapeString(&attrBuf, " locking='%s'", + virTristateSwitchTypeToString(src->locking)); + if (virDomainDiskSourceFormatPrivateData(&childBuf, src, flags, xmlopt) < 0) return -1; diff --git a/src/conf/schemas/domaincommon.rng b/src/conf/schemas/domaincommon.rng index df44cd9857..8ccb1dd162 100644 --- a/src/conf/schemas/domaincommon.rng +++ b/src/conf/schemas/domaincommon.rng @@ -1789,6 +1789,11 @@ <ref name="positiveInteger"/> </attribute> </optional> + <optional> + <attribute name="locking"> + <ref name="virOnOff"/> + </attribute> + </optional> <optional> <element name="slices"> <element name="slice"> diff --git a/src/conf/storage_source_conf.h b/src/conf/storage_source_conf.h index 05b4bda16c..3984614a3d 100644 --- a/src/conf/storage_source_conf.h +++ b/src/conf/storage_source_conf.h @@ -356,6 +356,9 @@ struct _virStorageSource { /* image is shared across hosts */ bool shared; + /* intended to use for workaround */ + virTristateSwitch locking; + /* backing chain of the storage source */ virStorageSource *backingStore; -- 2.25.1

On Wed, Feb 14, 2024 at 14:17:57 +0900, Hiroki Narukawa wrote:
There is a case that locking hits a bug and users wants to disable locking like bug in Linux kernel.
This commit adds option to configure locking for file source.
Signed-off-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp> --- docs/formatdomain.rst | 5 +++++ src/conf/domain_conf.c | 8 ++++++++ src/conf/schemas/domaincommon.rng | 5 +++++ src/conf/storage_source_conf.h | 3 +++ 4 files changed, 21 insertions(+)
diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index 0a4f9d9000..95869d573c 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -2782,6 +2782,11 @@ paravirtualized driver is specified via the ``disk`` element. via the filesystem. The filename passed via ``file`` can still be used to generate paths to write into image metadata when doing block operations but libvirt will not access these natively. + + :since:`Since 10.1.0` a new optional attribute ``locking`` can be added + which can take "on" or "off". Basically this option should be kept + default, but it can be used explicitly for workaround about bug in + locking.
This is not the first time this is being suggested and I'm still not persuaded that we want this. We definitelly do not want this as a generic production attribute. A possibility would be to do it as part of the qemu XML namespace along with tainting the VM that such config is being used. https://libvirt.org/drvqemu.html#pass-through-of-arbitrary-qemu-commands The main reason is that you never actually want to use this, except for the very specific case of bugs you are mentioning.
``block`` The ``dev`` attribute specifies the fully-qualified path to the host device to serve as the disk. :since:`Since 0.0.3`

On Wed, Feb 14, 2024 at 13:17:39 +0100, Peter Krempa wrote:
On Wed, Feb 14, 2024 at 14:17:57 +0900, Hiroki Narukawa wrote:
There is a case that locking hits a bug and users wants to disable locking like bug in Linux kernel.
This commit adds option to configure locking for file source.
Signed-off-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp> --- docs/formatdomain.rst | 5 +++++ src/conf/domain_conf.c | 8 ++++++++ src/conf/schemas/domaincommon.rng | 5 +++++ src/conf/storage_source_conf.h | 3 +++ 4 files changed, 21 insertions(+)
diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index 0a4f9d9000..95869d573c 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -2782,6 +2782,11 @@ paravirtualized driver is specified via the ``disk`` element. via the filesystem. The filename passed via ``file`` can still be used to generate paths to write into image metadata when doing block operations but libvirt will not access these natively. + + :since:`Since 10.1.0` a new optional attribute ``locking`` can be added + which can take "on" or "off". Basically this option should be kept + default, but it can be used explicitly for workaround about bug in + locking.
This is not the first time this is being suggested and I'm still not persuaded that we want this. We definitelly do not want this as a generic production attribute.
A possibility would be to do it as part of the qemu XML namespace along with tainting the VM that such config is being used.
https://libvirt.org/drvqemu.html#pass-through-of-arbitrary-qemu-commands
The main reason is that you never actually want to use this, except for the very specific case of bugs you are mentioning.
To elaborate on the qemu XML namespace integration: Currently we parse the namespace elements only on a global scope, so per-disk XML attributes are not supported. Thus to add per-disk or per-device XML namespace attributes a new parsing callback will be needed. A second option would be to do it similarly to how we allow override of device parameters: https://libvirt.org/drvqemu.html#overriding-properties-of-qemu-devices The image to override would be identified using the TARGET[INDEX] tuple (e.g. vda[1]) as declared in the XML.

Thank you for your comment. I will try to write so that this will not be locking specific.
On Wed, Feb 14, 2024 at 13:17:39 +0100, Peter Krempa wrote:
On Wed, Feb 14, 2024 at 14:17:57 +0900, Hiroki Narukawa wrote:
There is a case that locking hits a bug and users wants to disable locking like bug in Linux kernel.
This commit adds option to configure locking for file source.
Signed-off-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp> --- docs/formatdomain.rst | 5 +++++ src/conf/domain_conf.c | 8 ++++++++ src/conf/schemas/domaincommon.rng | 5 +++++ src/conf/storage_source_conf.h | 3 +++ 4 files changed, 21 insertions(+)
diff --git a/docs/formatdomain.rst b/docs/formatdomain.rst index 0a4f9d9000..95869d573c 100644 --- a/docs/formatdomain.rst +++ b/docs/formatdomain.rst @@ -2782,6 +2782,11 @@ paravirtualized driver is specified via the ``disk`` element. via the filesystem. The filename passed via ``file`` can still be used to generate paths to write into image metadata when doing block operations but libvirt will not access these natively. + + :since:`Since 10.1.0` a new optional attribute ``locking`` can be added + which can take "on" or "off". Basically this option should be kept + default, but it can be used explicitly for workaround about bug in + locking.
This is not the first time this is being suggested and I'm still not persuaded that we want this. We definitelly do not want this as a generic production attribute.
A possibility would be to do it as part of the qemu XML namespace along with tainting the VM that such config is being used.
irt.org%2Fdrvqemu.html%23pass-through-of-arbitrary-qemu-commands&data=
05%7C02%7Chnarukaw%40yahoo-corp.jp%7Ce8a8b57666244c52e3b108dc2d58 bcf2%
7Ca208d369cd4e4f87b11998eaf31df2c3%7C1%7C0%7C638435106350299392%7 CUnkn
own%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1h aWwi
LCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=MIFzrihGDgVq4XO8na2K%2F7c9q 708wMwKAZ
0yKU1ZJVE%3D&reserved=0
The main reason is that you never actually want to use this, except for the very specific case of bugs you are mentioning.
To elaborate on the qemu XML namespace integration:
Currently we parse the namespace elements only on a global scope, so per-disk XML attributes are not supported. Thus to add per-disk or per-device XML namespace attributes a new parsing callback will be needed.
A second option would be to do it similarly to how we allow override of device parameters:
https://libvirt.org/ %2Fdrvqemu.html%23overriding-properties-of-qemu-devices&data=05%7C02% 7Chnarukaw%40yahoo-corp.jp%7Ce8a8b57666244c52e3b108dc2d58bcf2%7Ca2 08d369cd4e4f87b11998eaf31df2c3%7C1%7C0%7C638435106350305499%7CUnk nown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1 haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=7eH0zujLapnHQRIQXqihwP ADhiAt77yInaSVjUdFez4%3D&reserved=0
The image to override would be identified using the TARGET[INDEX] tuple (e.g. vda[1]) as declared in the XML.

There is a case that locking hits a bug and users wants to disable locking like bug in Linux kernel. This commit adds actual qemu option to the domain conf added in previous commit. Signed-off-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp> --- src/qemu/qemu_block.c | 7 ++++++ tests/qemublocktest.c | 1 + ...le-raw-aio_native-locking-off-srconly.json | 10 ++++++++ .../file-raw-aio_native-locking-off.json | 23 +++++++++++++++++++ .../file-raw-aio_native-locking-off.xml | 12 ++++++++++ 5 files changed, 53 insertions(+) create mode 100644 tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off-srconly.json create mode 100644 tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off.json create mode 100644 tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off.xml diff --git a/src/qemu/qemu_block.c b/src/qemu/qemu_block.c index 738b72d7ea..5905898b38 100644 --- a/src/qemu/qemu_block.c +++ b/src/qemu/qemu_block.c @@ -852,6 +852,13 @@ qemuBlockStorageSourceGetFileProps(virStorageSource *src, "S:aio", iomode, "S:pr-manager", prManagerAlias, NULL) < 0); + + if (src->locking) { + const char *lockingStr = virTristateSwitchTypeToString(src->locking); + if (virJSONValueObjectAdd(&ret, "S:locking", lockingStr, NULL) < 0) + return NULL; + } + return ret; } diff --git a/tests/qemublocktest.c b/tests/qemublocktest.c index c581bd1748..3b005cb475 100644 --- a/tests/qemublocktest.c +++ b/tests/qemublocktest.c @@ -1116,6 +1116,7 @@ mymain(void) TEST_DISK_TO_JSON("dir-fat-readonly"); TEST_DISK_TO_JSON("dir-fat-floppy"); TEST_DISK_TO_JSON("file-raw-aio_native"); + TEST_DISK_TO_JSON("file-raw-aio_native-locking-off"); TEST_DISK_TO_JSON("file-backing_basic-aio_threads"); TEST_DISK_TO_JSON("file-backing_basic-aio_io_uring"); TEST_DISK_TO_JSON("file-raw-luks"); diff --git a/tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off-srconly.json b/tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off-srconly.json new file mode 100644 index 0000000000..88c9970ccf --- /dev/null +++ b/tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off-srconly.json @@ -0,0 +1,10 @@ +( + source only properties: + { + "driver": "file", + "filename": "/path/to/i.img", + "locking": "off" + } + backing store string: + /path/to/i.img +) diff --git a/tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off.json b/tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off.json new file mode 100644 index 0000000000..aeb483e4b4 --- /dev/null +++ b/tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off.json @@ -0,0 +1,23 @@ +{ + "node-name": "test1", + "read-only": false, + "cache": { + "direct": true, + "no-flush": false + }, + "driver": "raw", + "file": "test2" +} +{ + "driver": "file", + "filename": "/path/to/i.img", + "aio": "native", + "locking": "off", + "node-name": "test2", + "auto-read-only": true, + "discard": "unmap", + "cache": { + "direct": true, + "no-flush": false + } +} diff --git a/tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off.xml b/tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off.xml new file mode 100644 index 0000000000..9e6a9d26bc --- /dev/null +++ b/tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off.xml @@ -0,0 +1,12 @@ +<disk device='disk'> + <driver name='qemu' type='raw' io='native' cache='none'/> + <source file='/path/to/i.img' locking='off'> + <privateData> + <nodenames> + <nodename type='storage' name='test2'/> + <nodename type='format' name='test1'/> + </nodenames> + </privateData> + </source> + <target dev='vda'/> +</disk> -- 2.25.1

On Wed, Feb 14, 2024 at 14:17:58 +0900, Hiroki Narukawa wrote:
There is a case that locking hits a bug and users wants to disable locking like bug in Linux kernel.
This commit adds actual qemu option to the domain conf added in previous commit.
Signed-off-by: Hiroki Narukawa <hnarukaw@yahoo-corp.jp> --- src/qemu/qemu_block.c | 7 ++++++ tests/qemublocktest.c | 1 + ...le-raw-aio_native-locking-off-srconly.json | 10 ++++++++ .../file-raw-aio_native-locking-off.json | 23 +++++++++++++++++++ .../file-raw-aio_native-locking-off.xml | 12 ++++++++++
Rahter than qemublocktest we generally require testing via qemuxmlconftest for anything that has XML or uis formatted on the commandline.
5 files changed, 53 insertions(+) create mode 100644 tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off-srconly.json create mode 100644 tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off.json create mode 100644 tests/qemublocktestdata/xml2json/file-raw-aio_native-locking-off.xml
diff --git a/src/qemu/qemu_block.c b/src/qemu/qemu_block.c index 738b72d7ea..5905898b38 100644 --- a/src/qemu/qemu_block.c +++ b/src/qemu/qemu_block.c @@ -852,6 +852,13 @@ qemuBlockStorageSourceGetFileProps(virStorageSource *src, "S:aio", iomode, "S:pr-manager", prManagerAlias, NULL) < 0); + + if (src->locking) { + const char *lockingStr = virTristateSwitchTypeToString(src->locking); + if (virJSONValueObjectAdd(&ret, "S:locking", lockingStr, NULL) < 0) + return NULL;
The 'S:' conversion is skipped if the argument is NULL, thus no need for an extra call to virJSONValueObjectAdd.
participants (2)
-
Hiroki Narukawa
-
Peter Krempa