Re: [libvirt PATCH] qemu: fix potential resource leak

23 Oct 2020

On 10/23/20 1:28 AM, Peter Krempa wrote:
...
On Thu, Oct 22, 2020 at 23:20:20 -0400, Laine Stump wrote:
...
On 10/22/20 3:01 AM, Peter Krempa wrote:
...
...
On 10/21/20 5:50 PM, Jonathon Jongsma wrote:
...
Coverity reported a potential resource leak. While it's probably not
a real-world scenario, the code could technically jump to cleanup
between the time that vdpafd is opened and when it is used. Ensure that
it gets cleaned up in that case.
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
---
    src/qemu/qemu_command.c | 3 +++
    1 file changed, 3 insertions(+)

diff --git a/src/qemu/qemu_command.c b/src/qemu/qemu_command.c
index 5c4e37bd9e..cbe7a6e331 100644
--- a/src/qemu/qemu_command.c
+++ b/src/qemu/qemu_command.c
@@ -8135,6 +8135,7 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
            addfdarg = g_strdup_printf("%s,opaque=%s", fdset,
                                       net->data.vdpa.devicepath);
            virCommandAddArgList(cmd, "-add-fd", addfdarg, NULL);
+        vdpafd = -1;
        }
        if (chardev)
@@ -8204,6 +8205,8 @@ qemuBuildInterfaceCommandLine(virQEMUDriverPtr driver,
        VIR_FREE(tapfdName);
        VIR_FREE(vhostfd);
        VIR_FREE(tapfd);
+    if (vdpafd >= 0)
+        VIR_FORCE_CLOSE(vdpafd);
VIR_FORCE_CLOSE() ==>virForceCloseHelper() ==> virFileClose()
and virFileClose() is a NOP if fd < 0, so this doesn't need the conditional.
I *was* going to say "With that change,
Reviewed-by: Laine Stump <laine@redhat.com>
"
but this got me looking at the code of the entire function rather than just
the diffs in the patch, and I've got a question - is there any reason that
you only ope n the vdpa device inside the switch, and save everything else
related to it until later (at the "if (vdpafd)")? You could then device
I'd like to point out that opening anything in the command line
On Wed, Oct 21, 2020 at 22:31:09 -0400, Laine Stump wrote:
formatters is IMO bad practice.
Well... I agree that it is an ugly design, but that's pretty much what's in
place for almost everything.
Sure, but it shouldn't be used as an argument to not use a better
approach.
...
...
The command line formatter should format
the commandline and nothing more.
It would be nice if that was the case, but it already isn't. :-/
No. Please don't put it this way. My message is that while the old code
isn't compliant, it shouldn't be an excuse to make it worse!
Yes, I can see how my statement might be misunderstood as "the battle is 
already lost! Give up!!", which wasn't my intent.
...
In this instace the code is commited. We don't need to change it,
^^ that was more my intent, and it seems we agree.
...
but
any further change should be encouraged to use the better approach.
...
...
This is visible by the necessity to
have a lot of mock override functions which prevent the command line
formatter from touching resources on the host in the first place.
Better approach is to open resources in 'qemuProcessPrepareHost' and
store them in private data for command line formatting. Fake data for
tests are added in 'testCompareXMLToArgvCreateArgs'.
That's nice in the fact that it eliminates the need for mock overrides
(would be even *nicer* if that function had even a single line of
documentation included that described its purpose, and what code in the qemu
driver it should be mimicking, amirite?).
I agree, documentation is lacking in many parts. This is the
not-so-obvious techincal debt.
...
But it's bad because the code in qemuProcessPrepareHost() won't be tested
(can't be tested if there are no mocks for the system functions it calls).
IMO this statement is misrepresenting what's happening. Sure
qemuProcessPrepareHost can't be tested by unit testing. It's replaced by
another function which fakes inputs. But that is EXACTLY what the mock
functions preloaded into our tests are doing!
With qemuProcessPrepareHost() you at least have one place and one
function where all the code is aggregated
Where all the code *to prepare/setup devices* is aggregated. But in the 
process of aggregating that code, you've *split* the code for each 
individual device into multiple places with possibly no overt connection 
between them. Which of those is better depends on what you're doing. If 
you're adding a new feature, it means there is one more place you have 
to seek out and add something to. If there is something in each of those 
places that causes a build error when its omitted from the additions 
(e.g. a switch statement that needs a new case) then everything is 
great, but if it's just something that people are expected to find 
themselves by following the bread crumbs of a similar existing feature, 
then it becomes more trouble.


Keep in mind, I'm not saying that this is worse than the old way, just 
that it brings in its own new set of issues.
...
rather than spreading it
trhough the command line generator and it being very hard to audit
afterwards.
...
Basically you're trading the extra work of mocking system-level functions
for the extra work of filling in stuff in the privateData (and the
maintenance of that code), and eliminating testing of the code that's been
moved (pretty *lame* testing, I'll admit, since it's getting back canned
results from the fake system calls).
As noted before the extent of testing is exactly identical.
Not identical. With the old method of mocking system calls, the code 
that, e.g., opens a file in sysfs and processes its contents is still 
tested (although with canned data that may not be representative of 
reality). With the new method, that code is segregated off and never 
called during testing - the "mock" is still there, just at a higher 
level and done by explicitly calling a different function rather than by 
faking out the dynamic linking.


If we wanted to satisfy both the segregation of the "prepare" code and 
its testing, then we would need to still maintain the mocked system 
library functions, and initialize the privateData by calling the actual 
prepare code rather than the replacements in 
testCompareXMLToArgvCreateArgs(). (Not that I'm suggesting that's what 
we should do - there is a limit to everything!)
...
The
difference is that all the code which is touching the host is aggregated
in one place and replaced by another function vs scattered accross the
whole file and LD_PRELOAD-ed.
...
So it's not really the panacea your advocacy implies :-)
(Don't get me wrong! I've always disliked the mixing of device/file/whatever
init with the commandline generating functions.)
(actually a couple months ago I looked into putting the network interface
"prepare" stuff into privateData similar to what's done with the slirp stuff
now. In the end I gave up because it didn't provide the result I wanted - I
was trying to keep track of what device prep actions had been done for which
devices during domain startup so that the shutdown in case of startup
failure would only shutdown those things that had actually been setup; it
ended up being too complicated to make it work correctly both in the case of
an aborted startup and a normal shutdown, once you took into account the
possibility of libvirtd being restarted as part of a libvirt package update.
I'll point out that during all my searches through the code during the
experiment referenced in the previous paragraph, I never ran across
testCompareXMLToArgvCreateArgs(), and didn't know of its existence (or at
Yup, sorry I've extracted it recently. Previously we've just piled up
all the "init" code into testCompareXMLToArgv without thinking twice.
...
least didn't remember it, if I had known about it before). Is this
documented somewhere? Or is it expected to be learned by reading every patch
coming across the mailing list (I unfortunately fail at that in a major
way)?
You see, I've failed the same way pointing out that the approach used
was outdated.
Also given that the documentation of the new approach would be applied
via a patch, you could have missed it the same way as the patch
implementing the new approach to initialize host-specific data (or it
would be added at the same time). I doubt that any of us is re-reading
contribution guidelines just for fun.
yup. Information overload is a real thing. :-/
...
...
...
I'm aware though that there's a lot of "prior art" in this area though.
... and nothing in the code or the coding practices to warn against it,
point people in the other direction.
Patches are welcome! :P
Dang it!! I had that in the original draft of my message, but removed 
it! (Being the first person to call "Patches are welcome" in an OSS 
mailing list discussion is similar to being the first to call 
"Shotgun!!" when everyone is heading for the car to drive someplace (for 
non-English speakers, the "shotgun" position in a car is the front row 
passenger's seat, next to the window. I guess this comes from the days 
of stagecoaches, when the only people sitting outside were the driver, 
and a man sitting next to him with a shotgun (or some other type of 
gun), to defend the coach from roadside bandits.)
...
...
This sounds like another "saga" in the making - split all commandline
generating functions into separate "prepare device" and "generate
commandline" parts. I don't know that we should require Jonathon to change
his code that much just to fix a memory leak though ... (too bad I hadn't
kept up with the latest cool stuff so I would have pointed it out in review
of the original patch).
As said, I'm fine with prior art. We should encourage any further
contributions to avoid this pattern though. Obviously if Jonathon wants
to fix it I'm all for it. The fd leak fix is fine though.
Okay, we're in agreement then. Move along! Nothing to see here! :-)