On 07/04/2012 04:00 AM, Kevin Wolf wrote:
Am 03.07.2012 19:03, schrieb Eric Blake:
>>>> 2. drive_add file=/dev/fdset/1 -> qemu_open uses the first fd from
the
>>>> set that has access flags matching the qemu_open action flags.
>>>> qemu_open increments refcount for this fd.
>>>> 3. add-fd /dev/fdset/1 FDSET={M} -> qemu adds fd to set named
>>>> "/dev/fdset/1" - command returns qemu fd to caller (e.g fd=5).
libvirt
>>>> in-use flag turned on for fd.
>>>> 3. block-commit -> qemu_open reopens "/dev/fdset/1" by using
the first
>>>> fd from the set that has access flags matching the qemu_open action
>>>> flags. qemu_open increments refcount for this fd.
>>>> 4. remove-fd /dev/fdset/1 5 -> caller requests fd==5 be removed from
the
>>>> set. turns libvirt in-use flag off marking the fd ready to be closed
>>>> when qemu is done with it.
>>>
>>> If we decided to not return the individual fd numbers to libvirt, file
>>> descriptors would be uniquely identified by an fdset/flags pair here.
>>>
>>
>> Are you saying we'd pass the fdset name and flags parameters on
>> remove-fd to somehow identify the fds to remove?
>
> Passing the flag parameters is not trivial, as that would mean the QMP
> code would have to define constants mapping to all of the O_* flags that
> qemu_open supports. It's easier to support closing by fd number.
Good point. So pass-fd or whatever it will be called needs to returnn
both fdset number and the individual fd number.
>>>> 5. qemu_close decrements refcount for fd, and closes fd when refcount is
>>>> zero and libvirt in use flag is off.
>>>
>>> The monitor could just hold another reference, then we save the
>>> additional flag. But that's a qemu implementation detail.
>>>
>>
>> I'm not sure I understand what you mean.
>
> pass-fd (or add-fd, whatever name we give it) adds an fd to an fdset,
> with initial use count of 1 (the use is the monitor). qemu_open()
> increments the use count. A new qemu_close() wrapper would decrement
> the use count. And both calling 'remove-fd', or closing the QMP monitor
> of an fd that has not yet been passed through 'remove-fd', serves as a
> way to decrement the use count. You'd still have to track whether the
> monitor is using an fd (to avoid over-decrementing on QMP monitor
> close), but by having the monitor's use also tracked under the refcount,
> then refcount reaching 0 is sufficient to auto-close an fd.
> I think
> that also means that re-establishing the client QMP connection would
> increment
This is an interesting idea. I've never thought about what to do with a
reconnect. It felt a bit odd at first, but now your proposal totally
makes sense to me.
> For some examples:
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
> use by monitor, as member of fdset1
> 2. client crashes, so all tracked fds are visited; fd=4 had not yet been
> passed to 'remove-fd', so qemu decrements refcount; refcount of fd=4 is
> now 0 so qemu closes it
Doesn't the refcount belong to an fdset rather than a single fd? As long
as the fdset has still a reference, it's possible to reach the other fds
in the set through a reopen. For example:
1. client calls 'add-fd', qemu is now tracking fd=4 (O_RDWR) as a member
of fdset1, in use by monitor, refcount 1
2. client calls 'add-fd', qemu is now tracing fd=5 (O_RDONLY) as a
member of fdset, in use by monitor, refcount is still 1
3. client calls 'device-add' with /dev/fdset/1 as the backing filename
and r/w, so qemu_open() dup()s fd 4 and increments the refcount to 2
4. client crashes, so all tracked fdsets are visited; fdset1 gets its
refcount decreased to 1, and both member fds 4 and 5 stay open.
If you had refcounted a single fd, fd 5 would be closed now and qemu
couldn't switch to O_RDONLY any more.
If the O_RDONLY is for a future command, it would make more sense to add
that fd before that future command. Or are you passing it at step 2
because it is needed for the probe re-open issue?
> 1. client calls 'add-fd', qemu is now tracking fd=4 with
refcount 1, in
> use by monitor, as member of fdset1
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> so qemu_open() increments the refcount to 2
> 3. client crashes, so all tracked fds are visited; fd=4 had not yet been
> passed to 'remove-fd', so qemu decrements refcount to 1, but leaves fd=4
> open because it is still in use by the block device
> 4. client re-establishes QMP connection, and 'query-fds' lets client
> learn about fd=4 still being open as part of fdset1, but also informs
> client that fd is not in use by the monitor
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
> use by monitor, as member of fdset1
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> so qemu_open() increments the refcount to 2
> 3. client calls 'remove-fd fdset=1 fd=4', so qemu marks fd=4 as no
> longer in use by the monitor, refcount decremented to 1 but still left
> open because it is in use by the block device
> 4. client crashes, so all tracked fds are visited; but fd=4 is already
> marked as not in use by the monitor, so its refcount is unchanged
>
> 1. client calls 'add-fd', qemu is now tracking fd=4 with refcount 1, in
> use by monitor, as member of fdset1
> 2. client calls 'device-add' with /dev/fdset/1 as the backing filename,
> but the command fails for some other reason, so the refcount is still 1
> at the end of the command (although it may have been temporarily
> incremented then decremented during the command)
> 3. client calls 'remove-fd fdset=1 fd=4' to deal with the failure (or
> QMP connection is closed), so qemu marks fd=4 as no longer in use by the
> monitor, refcount is now decremented to 0 and fd=4 is closed
Yup, apart from the comment above the examples look good to me.
> I think that covers the idea; you need a bool in_use for tracking
> monitor state (the monitor is in use until either a remove-fd or a
> monitor connection closes), as well as a ref-count.
Yes, makes sense to me.
>> We also need a query-fdsets command that lists all fdsets that exist. If
>>> we add information about single fds to the return value of it, we
>>> probably don't need a separate query-fd that operates on a single fdset.
>>>
>>
>> Yes, good point. And maybe we don't need 2 commands. query-fdsets
>> could return all the sets and all the fds that are in those sets.
>
> Yes, I think a single query command is good enough here, something like:
>
> { "execute":"query-fdsets" } =>
> { "return" : { "sets": [
> { "name": "fdset1",
> "fds": [ { "fd": 4, "monitor": true,
"refcount": 1 } ] },
> { "name": "fdset2",
> "fds": [ { "fd": 5, "monitor": false,
"refcount": 1 },
> { "fd": 6, "monitor": true, "refcount":
2 } ] } ] } }
Looks good, this is exactly what I was thinking of.
>>> In use by whom? If it's still in use in qemu (as in "in-use flag
would
>>> be set") and we have a refcount of zero, then that's a bug.
>>>
>>
>> In use by qemu. I don't think it's a bug. I think there are situations
>> where refcount gets to zero but qemu is still using the fd.
>
> I think the refcount being non-zero _is_ what defines an fd as being in
> use by qemu (including use by the monitor). Any place you have to close
> an fd before reopening it is dangerous; the safe way is always to open
> with the new permissions before closing the old permissions.
There is one case I'm aware of where we need to be careful: Before
opening an image, qemu may probe the format. In this case, the image
gets opened twice, and the first close comes before the second open. I'm
not entirely sure how hard it would be to get rid of that behaviour.
If we can't get rid of it, we have a small window that the refcount
doesn't really cover, and if we weren't careful it would become racy.
This is why I mentioned earlier that maybe we need to defer the refcount
decrease a bit. However, I can't see how the in-use flag would make a
difference there. If the refcount can't cover it, the in-use flag can't
either.
Yeah this is a problem. Could we introduce another flag to cover this?
--
Regards,
Corey