On 01/30/2015 07:17 PM, Eric Blake wrote:
On 11/20/2014 08:08 AM, Stefan Berger wrote:
> Pass the TPM file descriptor to QEMU via command line.
> Instead of passing /dev/tpm0 we now pass /dev/fdset/10 and the additional
> parameters -add-fd set=10,fd=20.
>
> This addresses the use case when QEMU is started with non-root privileges
> and QEMU cannot open /dev/tpm0 for example.
>
> One problem is that for the passing of the file descriptor set to work,
> virCommandReorderFDs must not be called on the virCommand. This is prevented
> by setting a flag in the virCommandPassFDGetFDIndex that is checked to be
> clear when virCommandReorderFDs is run.
I'm still trying to figure out how virCommandReorderFDs() got into the
picture (I didn't write that section of the code); when I originally
Well, I also only stumbled upon this function and found it being used in
a different area of the code and thought it wouldn't interfere with what
we are doing here.
worked on virCommand, the only way to pass fds to the child was in
direct positions (same fd in child as in parent), not by remapping one
fd in the parent to another position in the child, except for the three
standard descriptors. It looks like virCommandReorderFDs was added to
allow remapping other fds and populates the LISTEN_FDS environment
variable with how many fds were thus mapped. So the two approaches
don't really mix. Do we ever use virCommandPassListenFDs() on a
virCommand that will also do raw fd passthrough? Maybe the real thing
Did not see that. When I checked I found it in a different part of the
code. Basicaly virCommandPassListenFDs causes the reordering to happen.
This seems to only be run by src/rpc/virnetsocket::virNetSocketForkDaemon().
to do is to track at virCommand build-up time that only one of the
two
passthrough methods can be used, and fail loudly if a programmer tries
to do both methods at once.
So far I think we would not run into this problem. Unfortunately the TPM
related code, that would need the order to be stable, is likely not used
frequently and the possibility could exist that some things may break
without noticing.
Or maybe we should ALWAYS prepare to remap fds in the child, so that the
child receives fds in contiguous order with no gaps. It might simplify
the code base to always reorder things, and have the mapping done up
front. That is, change the virCommandPassFD() function to return an
integer of which next consecutive fd the child will see, or -1 on
failure. Callers that then need to alter the command line of the child
will have to pay attention to the return value (something a bit
different than most virCommand build-up, which intentionally defer error
checking to the very end), but it might be worth it.
Not sure I follow. But see my question further below.
At any rate, this patch needs to be split into at least two - first, a
patch to just util/vircommand.c and tests/commandtest.c to tweak the
command passing functionality AND test that any new semantics work while
no old semantics are broken (except maybe that we now explicitly reject
a combination that previously was silently accepted but made no sense),
previously we didn't pass file descriptors /dev/fdset to QEMU, so the
problem presumably didn't exist.
and second the patch to qemu to use the enhancements in virCommand
for
the TPM feature at hand.
Yes, will do. Though let me know how you intend to change things -- took
you by your offer :-)
> Signed-off-by: Stefan Berger <stefanb(a)linux.vnet.ibm.com>
>
> v2->v3: Fixed some memory leaks
> ---
> src/libvirt_private.syms | 1 +
> src/qemu/qemu_command.c | 136 ++++++++++++++++++++++++++++++++++++++++++++---
> src/util/vircommand.c | 33 ++++++++++++
> src/util/vircommand.h | 3 ++
> 4 files changed, 166 insertions(+), 7 deletions(-)
>
> /**
> + * qemuVirCommandGetFDSet:
> + * @cmd: the command to modify
> + * @fd: fd to reassign to the child
> + *
> + * Get the parameters for the QEMU -add-fd command line option
> + * for the given file descriptor. The file descriptor must previously
> + * have been 'transferred' in a virCommandPassFD() call.
Would it make any more sense for this function to do the transfer
instead of making the caller do it?
Maybe both should exist? The only reason that this could be unpleasant
is if you wanted to do this in very different places in the code, the
one place does the transfer, the other builds up some command line, like
we do here. I guess the code could be adapted. Though there should
always be a way to find the FD index.
> + * This function for example returns "set=10,fd=20".
> + */
> +static char *
> +qemuVirCommandGetFDSet(virCommandPtr cmd, int fd)
> +{
> + char *result = NULL;
> + int idx = virCommandPassFDGetFDIndex(cmd, fd);
> +
> + if (idx >= 0) {
> + ignore_value(virAsprintf(&result, "set=%d,fd=%d", idx, fd)
< 0);
> + } else {
> + virReportError(VIR_ERR_INTERNAL_ERROR,
> + _("file descriptor %d has not been transferred"),
fd);
> + }
> +
> + return result;
> +}
> +
> +/**
> + * qemuVirCommandGetDevSet:
> + * @cmd: the command to modify
> + * @fd: fd to reassign to the child
> + *
> + * Get the parameters for the QEMU path= parameter where a file
> + * descriptor is accessed via a file descriptor set, for example
> + * /dev/fdset/10. The file descriptor must previously have been
> + * 'transferred' in a virCommandPassFD() call.
Oh, maybe not, since the caller marks the fd for passing once, but then
calls multiple helper methods to format multiple parameters to qemu that
end up working together to describe the fd used by the child.
> + */
> +static char *
> +qemuVirCommandGetDevSet(virCommandPtr cmd, int fd)
> +{
> + char *result = NULL;
> + int idx = virCommandPassFDGetFDIndex(cmd, fd);
> +
> + if (idx >= 0) {
> + ignore_value(virAsprintf(&result, "/dev/fdset/%d", idx) <
0);
> + } else {
> + virReportError(VIR_ERR_INTERNAL_ERROR,
> + _("file descriptor %d has not been transferred"),
fd);
> + }
> + return result;
> +}
> +
> +
> +/**
> * qemuPhysIfaceConnect:
> * @def: the definition of the VM (needed by 802.1Qbh and audit)
> * @driver: pointer to the driver instance
> @@ -5926,14 +5978,20 @@ qemuBuildRNGDeviceArgs(virCommandPtr cmd,
>
>
> static char *qemuBuildTPMBackendStr(const virDomainDef *def,
> + virCommandPtr cmd,
> virQEMUCapsPtr qemuCaps,
> - const char *emulator)
> + const char *emulator,
> + int *tpmfd, int *cancelfd)
> {
> const virDomainTPMDef *tpm = def->tpm;
> virBuffer buf = VIR_BUFFER_INITIALIZER;
> const char *type = virDomainTPMBackendTypeToString(tpm->type);
> - char *cancel_path;
> + char *cancel_path = NULL;
> const char *tpmdev;
> + char *devset = NULL, *cancel_devset = NULL;
> +
> + *tpmfd = -1;
> + *cancelfd = -1;
>
> virBufferAsprintf(&buf, "%s,id=tpm-%s", type,
tpm->info.alias);
>
> @@ -5946,11 +6004,49 @@ static char *qemuBuildTPMBackendStr(const virDomainDef *def,
> if (!(cancel_path = virTPMCreateCancelPath(tpmdev)))
> goto error;
>
> - virBufferAddLit(&buf, ",path=");
> - virBufferEscape(&buf, ',', ",", "%s",
tpmdev);
> + if (virQEMUCapsGet(qemuCaps, QEMU_CAPS_ADD_FD)) {
> + *tpmfd = open(tpmdev, O_RDWR);
> + if (*tpmfd < 0) {
> + virReportError(VIR_ERR_INTERNAL_ERROR,
> + _("Could not open TPM device %s"),
tpmdev);
Here, we should use virReportSystemError and pass along the errno value
from the failed open().
Ok, I will fix this.
> + goto error;
> + }
> +
> + virCommandPassFD(cmd, *tpmfd,
> + VIR_COMMAND_PASS_FD_CLOSE_PARENT);
> + devset = qemuVirCommandGetDevSet(cmd, *tpmfd);
> + if (devset == NULL)
> + goto error;
> +
> + *cancelfd = open(cancel_path, O_WRONLY);
> + if (*cancelfd < 0) {
> + virReportError(VIR_ERR_INTERNAL_ERROR,
> + _("Could not open TPM device's cancel path
"
> + "%s"), cancel_path);
> + goto error;
> + }
> +
> + virCommandPassFD(cmd, *cancelfd,
> + VIR_COMMAND_PASS_FD_CLOSE_PARENT);
> + cancel_devset = qemuVirCommandGetDevSet(cmd, *cancelfd);
> + if (cancel_devset == NULL)
> + goto error;
> +
> + virBufferAddLit(&buf, ",path=");
> + virBufferEscape(&buf, ',', ",", "%s",
devset);
> + VIR_FREE(devset);
>
> - virBufferAddLit(&buf, ",cancel-path=");
> - virBufferEscape(&buf, ',', ",", "%s",
cancel_path);
> + virBufferAddLit(&buf, ",cancel-path=");
> + virBufferEscape(&buf, ',', ",", "%s",
cancel_devset);
> + VIR_FREE(cancel_devset);
> + } else {
> + /* all test cases will use this path */
> + virBufferAddLit(&buf, ",path=");
> + virBufferEscape(&buf, ',', ",", "%s",
tpmdev);
> +
> + virBufferAddLit(&buf, ",cancel-path=");
> + virBufferEscape(&buf, ',', ",", "%s",
cancel_path);
> + }
The else clause looks redundant with the last few lines of the if
clause. Why not just sink it to occur just once after the 'if', as long
as the 'if' side does a mere tmpdev=devset?
True. Will combine it.
> VIR_FREE(cancel_path);
>
> break;
> @@ -5970,6 +6066,10 @@ static char *qemuBuildTPMBackendStr(const virDomainDef *def,
> emulator, type);
>
> error:
> + VIR_FREE(devset);
> + VIR_FREE(cancel_devset);
> + VIR_FREE(cancel_path);
> +
> virBufferFreeAndReset(&buf);
> return NULL;
> }
> @@ -9223,13 +9323,35 @@ qemuBuildCommandLine(virConnectPtr conn,
>
> if (def->tpm) {
> char *optstr;
This function is really huge. Yeah, we've not been good at breaking it
into chunks in the past, but I think this addition would be nicer if the
code inside 'if (def->tpm)' were extracted into a smaller helper
function. But if we do that refactoring, the code motion of the
existing pre-patch code should be its own patch.
Yes, can do that also.
> + int tpmfd = -1;
> + int cancelfd = -1;
> + char *fdset;
>
> - if (!(optstr = qemuBuildTPMBackendStr(def, qemuCaps, emulator)))
> + if (!(optstr = qemuBuildTPMBackendStr(def, cmd, qemuCaps, emulator,
> + &tpmfd, &cancelfd)))
> goto error;
>
> virCommandAddArgList(cmd, "-tpmdev", optstr, NULL);
> VIR_FREE(optstr);
>
> + if (tpmfd >= 0) {
> + fdset = qemuVirCommandGetFDSet(cmd, tpmfd);
> + if (!fdset)
> + goto error;
> +
> + virCommandAddArgList(cmd, "-add-fd", fdset, NULL);
> + VIR_FREE(fdset);
> + }
> +
> + if (cancelfd >= 0) {
> + fdset = qemuVirCommandGetFDSet(cmd, cancelfd);
> + if (!fdset)
> + goto error;
> +
> + virCommandAddArgList(cmd, "-add-fd", fdset, NULL);
> + VIR_FREE(fdset);
> + }
> +
> if (!(optstr = qemuBuildTPMDevStr(def, qemuCaps, emulator)))
> goto error;
>
> diff --git a/src/util/vircommand.c b/src/util/vircommand.c
> index 6527d85..2616446 100644
> --- a/src/util/vircommand.c
> +++ b/src/util/vircommand.c
> @@ -67,6 +67,7 @@ enum {
> VIR_EXEC_RUN_SYNC = (1 << 3),
> VIR_EXEC_ASYNC_IO = (1 << 4),
> VIR_EXEC_LISTEN_FDS = (1 << 5),
> + VIR_EXEC_FIXED_FDS = (1 << 6),
> };
>
> typedef struct _virCommandFD virCommandFD;
> @@ -214,6 +215,12 @@ virCommandReorderFDs(virCommandPtr cmd)
> if (!cmd || cmd->has_error || !cmd->npassfd)
> return;
>
> + if ((cmd->flags & VIR_EXEC_FIXED_FDS)) {
> + virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
> + _("The fds are fixed and cannot be reordered"));
> + goto error;
> + }
Okay, so you are enforcing that a caller can't pick two conflicting fd
passing methods on the same virCommand, but my point remains that the
testsuite should prove we can hit this error case, to avoid regressions.
Or we should redo virCommand to ALWAYS rearrange fds in the child, and
make it possible for a caller passing an fd down to know what fd the
child will see it as.
That's the part I don't quite follow. We are passing indices via
/dev/fdset/%d, right? At the time we build up the command line, we need
to know the index that the fd will have. If we shuffle things around
later in a way that we cannot determine at the point we build up the
command line, how would this work then?
> +
> for (i = 0; i < cmd->npassfd; i++)
> maxfd = MAX(cmd->passfd[i].fd, maxfd);
>
> @@ -1019,6 +1026,32 @@ virCommandPassListenFDs(virCommandPtr cmd)
> cmd->flags |= VIR_EXEC_LISTEN_FDS;
> }
>
> +/*
> + * virCommandPassFDGetFDIndex:
> + * @cmd: pointer to virCommand
> + * @fd: FD to get index of
> + *
> + * Determine the index of the FD in the transfer set.
> + *
> + * Returns index >= 0 if @set contains @fd,
> + * -1 otherwise.
> + */
> +int
> +virCommandPassFDGetFDIndex(virCommandPtr cmd, int fd)
> +{
> + size_t i = 0;
> +
> + while (i < cmd->npassfd) {
> + if (cmd->passfd[i].fd == fd) {
> + cmd->flags |= VIR_EXEC_FIXED_FDS;
> + return i;
So, the mapping you are using is that the first fd passed through the
child is eventually associated with /dev/fdset/$i, where $i represents
which slot of cmd->passfd the descriptor was found in. I know the
reason qemu added /dev/fdset/nnn was to allow the possibility of passing
in sets of descriptors (such as a read-only and read-write handle both
visiting the same file, so in the same fdset), but simplicity argues
that unless we need that complexity, always naming our fdsets by the
same fd that the child will first use is easier than trying to track the
mapping between a parent fd and which order the fd was registered in.
So I'm wondering if we change virCommandPassFD to always remap in the
child and return an integer (the next consecutive integer starting at
3), then just use '-add-fd set=3,fd=3 ... /dev/fdset/3' is any simpler
than trying to have set number unrelated to the fd number.
We are not passing all file descriptors to the child. So how would you
handle passing fd's 10,3,55 to the child? Do we pass empty slots for 4-9
and 11-54 then, like pass fd = -1, so we can pass indices 10,3, and 55
or something like that?
Regards,
Stefan