On 02/05/2018 02:21 PM, Daniel P. Berrangé wrote:
> On Mon, Feb 05, 2018 at 02:11:24PM +0100, Peter Krempa wrote:
>> On Mon, Feb 05, 2018 at 12:55:11 +0000, Daniel Berrange wrote:
>>> On Mon, Feb 05, 2018 at 01:39:56PM +0100, Michal Privoznik wrote:
>>>> On 02/01/2018 08:51 AM, Nikolay Shirokovskiy wrote:
>>>>>
>>>>>
>>>>> On 29.01.2018 09:09, Michal Privoznik wrote:
>>>>>> On 12/20/2017 07:35 AM, Nikolay Shirokovskiy wrote:
>>>>>>> Host tcp4/tcp6 ports is a global resource thus we need to
make
>>>>>>> port accounting also global or we have issues described in
[1] when
>>>>>>> port allocator ranges of different instances are overlapped
(which
>>>>>>> is by default for qemu for example).
>>>>>>>
>>>>>>> Let's have only one global port allocator object that
take care
>>>>>>> of the entire ports range (0 - 65535) and introduce port
range object
>>>>>>> for clients to specify desired auto allocation band.
>>>>>>>
>>>>>>> [1]
https://www.redhat.com/archives/libvir-list/2017-December/msg00600.html
>>>>>>> ---
>>>>>>> src/bhyve/bhyve_driver.c | 4 +-
>>>>>>> src/bhyve/bhyve_utils.h | 2 +-
>>>>>>> src/libvirt_private.syms | 3 +-
>>>>>>> src/libxl/libxl_conf.c | 8 +--
>>>>>>> src/libxl/libxl_conf.h | 8 +--
>>>>>>> src/libxl/libxl_driver.c | 18 +++---
>>>>>>> src/qemu/qemu_conf.h | 6 +-
>>>>>>> src/qemu/qemu_driver.c | 30 +++++-----
>>>>>>> src/util/virportallocator.c | 125
++++++++++++++++++++++++++++-------------
>>>>>>> src/util/virportallocator.h | 20 ++++---
>>>>>>> tests/bhyvexml2argvtest.c | 6 +-
>>>>>>> tests/libxlxml2domconfigtest.c | 8 +--
>>>>>>> tests/virportallocatortest.c | 48 ++++++++++------
>>>>>>> 13 files changed, 175 insertions(+), 111 deletions(-)
>>>>>>>
>>>>>>
>>>>>>> diff --git a/src/util/virportallocator.c
b/src/util/virportallocator.c
>>>>>>> index fcd4f74..cd64356 100644
>>>>>>> --- a/src/util/virportallocator.c
>>>>>>> +++ b/src/util/virportallocator.c
>>>>>>> @@ -35,10 +35,14 @@
>>>>>>>
>>>>>>> #define VIR_FROM_THIS VIR_FROM_NONE
>>>>>>>
>>>>>>> +typedef struct _virPortAllocator virPortAllocator;
>>>>>>> +typedef virPortAllocator *virPortAllocatorPtr;
>>>>>>> struct _virPortAllocator {
>>>>>>> virObjectLockable parent;
>>>>>>> virBitmapPtr bitmap;
>>>>>>> +};
>>>>>>>
>>>>>>> +struct _virPortRange {
>>>>>>> char *name;
>>>>>>>
>>>>>>> unsigned short start;
>>>>>>> @@ -48,6 +52,7 @@ struct _virPortAllocator {
>>>>>>> };
>>>>>>>
>>>>>>> static virClassPtr virPortAllocatorClass;
>>>>>>> +static virPortAllocatorPtr virPortAllocatorInstance;
>>>>>>
>>>>>> I wonder if this is the way to go. I mean, this
virPortAllocatorInstance
>>>>>> is going to be a global variable that will never be freed (even
if we
>>>>>> wanted to). I mean, if virPortRange had a pointer to
virPortAllocator
>>>>>
>>>>> Not sure why we need to free it. It is like global variables for
classes,
>>>>> we don't need to free them yet. As to libxlxml2domconfigtest it
can be
>>>>> fixed just like virportallocatortest by releasing all acquired
ports.
>>>>
>>>> Well, okay. Disregard my suggestion. However, what we still need to
>>>> discuss is the driver separation work of Daniel's. Dan, how badly
will
>>>> this hit you if I merged these? In another thread I suggested to turn
>>>> this into a separate deaemon (which might be overkill).
>>>
>>> The caching of the used ports in the bitmap is just an optimization, to
>>> avoid us having to retry the bind()+listen() on every port we've
previously
>>> got in use. If we split the daemon, if multiple daemons all need port
>>> allocation tracking, they'll get separate virPortAllocator bitmap
instances.
>>> Since one daemon won't see what other daemon has in use, it will mean
that
>>> we must try to bind()+listen() on ports that the other daemon has in use.
>>> Thereafter we'll have cached that usage the bitmap.
>>>
>>> The main downside is that if one daemon releases a port, the other daemon
>>> won't see that release. This is only a significant problem if the 2 (or
>>> more) daemons are using the same port range. This would, however, be
>>> exactly the same when we have a per-QEMU instance daemon. The proposed
>>> change, however, does not make life worse than it already is in this
>>> respect.
>>>
>>> IOW, we'll probably have some trouble, but that's not a reason to
reject
>>> this proposal. It is just one of many things we'll need to figure out
>>> wrt unique assignment.
>>
>> Well, you get slightly worse odds of having the same kind of race if you
>> have multiple instances of the port allocation approach in multiple
>> processes.
>>
>> Our problem is that when we bind()+listen() we still need to close that
>> port and have qemu open it again. This race window is still present but
>> will be worsened by multiple of these doing the same thing.
>>
>> When qemu will be able to accept the socket via FD passing then this
>> would be strictly an optimization, but until then it worsens the odds of
>> failure.
>
> I have patches to let QEMU accept a preopened FD for chardevs.
>
> VNC / SPICE are the other big ones we hit. I should make fixing those a
> higher priority.
Okay, so it looks like this can be merged. I mean, v2 can be merged
(which fixes other issues raised like tests failing, build fixes in some
drivers, etc.). Nikolay, do you think you can send v2?