[PATCH 0/2] tests: fix very subtle bugs in 'read-big-pipe'

Daniel P. Berrangé (2): tests: fix hang in virshtest 'read-big-pipe' case tests: fix two off-by-1 errors in read-big-pipe test tests/virshtest.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) -- 2.43.0

The virshtest program testPipeFeeder method is doing this: mkfifo("test.fifo", 0600) ; int fd = open("test.fifo", O_RDWR); char buf[...]; memset(buf, 'a', sizeof(buf)); write(fd, buf, sizeof(buf)) == sizeof(buf)); close(fd); while the the 'virsh' child process then ends up doing: fd = open("test.fifo", O_RDONLY); read(fd, buf, sizeof(buf)) == sizeof(buf)); close(fd); The 'virsh' code hangs on open() on at least ppc64 and some other arches. It can be provoked to hang even on x86 by reducing the size of the buffer. It can be prevented from hanging on ppc64 by increasing the size of the buffer. What is happening is a result of differing page sizes, altering the overall pipe capacity size, since pipes on linux default to 16 pages in size and thus have architecture specific capacity when measured in bytes. * On x86, testPipeFeeder opens R+W, tries to write 140kb and write() blocks because the pipe is full. This gives time for virsh to start up, and it can open the pipe for O_RDONLY since testPipeFeeder still has it open for write. Everything works as intended. * On ppc64, testPipeFeeder opens R+W, tries to write 140kb and write() succeeds because the larger 64kb page size resulted in greater buffer capacity for the pipe. It thus quickly closes the pipe, removing the writer, and triggering discard of all the unread data. Now virsh starts up, tries to open the pipe for O_RDONLY and blocks waiting for a new writer to open it, which will never happen. Meson kills it the test after 30 seconds. NB, every now & then, it will not block because virsh starts up quickly enough that testPipeFeeder has not yet closed the write end of the pipe, giving the illusion of correctness. The key flaw here is that it should not have been using O_RDWR in testPipeFeeder. Synchronization is required such that both virsh and testPipeFeeder have their respective ends of the pipe open before any data is sent. This is trivially arranged by using O_WRONLY in testPipeFeeder. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- tests/virshtest.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/virshtest.c b/tests/virshtest.c index a1ae481316..7a7797647c 100644 --- a/tests/virshtest.c +++ b/tests/virshtest.c @@ -145,7 +145,7 @@ testPipeFeeder(void *opaque) g_autofree char *doc = g_new0(char, emptyspace + xmlsize + 1); VIR_AUTOCLOSE fd = -1; - if ((fd = open(pipepath, O_RDWR)) < 0) { + if ((fd = open(pipepath, O_WRONLY)) < 0) { fprintf(stderr, "\nfailed to open pipe '%s': %s\n", pipepath, g_strerror(errno)); return; } -- 2.43.0

On Wed, May 08, 2024 at 01:11:30PM GMT, Daniel P. Berrangé wrote:
* On ppc64, testPipeFeeder opens R+W, tries to write 140kb and write() succeeds because the larger 64kb page size resulted in greater buffer capacity for the pipe. It thus quickly closes the pipe, removing the writer, and triggering discard of all the unread data. Now virsh starts up, tries to open the pipe for O_RDONLY and blocks waiting for a new writer to open it, which will never happen. Meson kills it the test after 30 seconds.
s/kills it/kills/ Thanks a lot for looking into this! Reviewed-by: Andrea Bolognani <abologna@redhat.com> -- Andrea Bolognani / Red Hat / Virtualization

When testPipeFeeder copies the XML document into the padded buffer, it tells virStrcpy that 'xmlsize' bytes are available. This is under reporting size by 1 byte, and as a result it fails to copy the trailing '\n' replacing it when '\0'. The return value of virStrcpy wasn't checked, but was reporting this truncation. When testPipeFeeder then sends the padded buffer down the pipe, it askes to send 'emptyspace + xmlsize + 1' bytes, which means it sends the data, as well as the trailing '\0' terminator. Both bugs combined mean it is sending '\0\0' as the last bytes, instead of '\n' which was intended. When virFileReadAll reads data from the pipe, it ends up adding another '\0' resulting in in a very NUL terminated string ('\0\0\0'). This is all harmless, but should be fixed regardless. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> --- tests/virshtest.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tests/virshtest.c b/tests/virshtest.c index 7a7797647c..03d499b759 100644 --- a/tests/virshtest.c +++ b/tests/virshtest.c @@ -151,9 +151,9 @@ testPipeFeeder(void *opaque) } memset(doc, ' ', emptyspace); - virStrcpy(doc + emptyspace, xml, xmlsize); + g_assert(virStrcpy(doc + emptyspace, xml, xmlsize + 1) == 0); - if (safewrite(fd, doc, emptyspace + xmlsize + 1) < 0) { + if (safewrite(fd, doc, emptyspace + xmlsize) < 0) { fprintf(stderr, "\nfailed to write to pipe '%s': %s\n", pipepath, g_strerror(errno)); return; } -- 2.43.0

On Wed, May 08, 2024 at 01:11:31PM GMT, Daniel P. Berrangé wrote:
When testPipeFeeder copies the XML document into the padded buffer, it tells virStrcpy that 'xmlsize' bytes are available. This is under reporting size by 1 byte, and as a result it fails to copy the trailing '\n' replacing it when '\0'. The return value of virStrcpy wasn't
*replacing it with
checked, but was reporting this truncation.
When testPipeFeeder then sends the padded buffer down the pipe, it askes
*asks
to send 'emptyspace + xmlsize + 1' bytes, which means it sends the data, as well as the trailing '\0' terminator.
Both bugs combined mean it is sending '\0\0' as the last bytes, instead of '\n' which was intended. When virFileReadAll reads data from the pipe, it ends up adding another '\0' resulting in in a very NUL
*resulting in a very Reviewed-by: Andrea Bolognani <abologna@redhat.com> -- Andrea Bolognani / Red Hat / Virtualization
participants (2)
-
Andrea Bolognani
-
Daniel P. Berrangé