On Wed, Aug 24, 2016 at 2:37 AM, Michal Privoznik <mprivozn@redhat.com> wrote:
> On 23.08.2016 08:57, Prasanna Kalever wrote:
>> On Tue, Aug 23, 2016 at 4:10 AM, Michal Privoznik <mprivozn@redhat.com> wrote:
>>> On 17.08.2016 11:02, Prasanna Kalever wrote:
>>>> [ oops! apologies, my previous draft missed the links ]
>>>>
>>>> Hello,
>>>>
>> Thanks Michal, that really helps.
>>
>> So if I understand it right, in the window between the destination
>> daemon handing out the port (for listening) and qemu binding to it,
>> some other external process could start using the same port (a race?
>> again, I'm not sure how big the window is)
> Yes, there's this possibility. But then again - these applications
> you're using should be configured to use different sets of ports. For
> instance libvirt can use 49152 - 49182, while gluster could use
> 49183 - 49213.
From my/our experience, it is better not to take control of the
ephemeral port range. It is strongly recommended to leave the choice
of port to the kernel, via something like 'bind(0)', and to control
the permitted ranges through the local port range and the locally
reserved ports. On Linux you can find these in
net.ipv4.ip_local_port_range and net.ipv4.ip_local_reserved_ports; on
NetBSD they are {ip.lowportmin, ip.lowportmax} and {ip.anonportmin,
ip.anonportmax}.
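
To make the 'bind(0)' idea concrete, here is a minimal sketch in plain
C (a hypothetical helper, not code from libvirt or gluster): the
kernel picks a free port from the local port range, skipping the
reserved ports, and the caller learns which one via getsockname():

#include <stdio.h>
#include <string.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int bind_any_port(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(0);   /* port 0: the kernel chooses a free port */

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        getsockname(fd, (struct sockaddr *)&addr, &len) < 0) {
        close(fd);
        return -1;
    }

    printf("kernel picked port %d\n", ntohs(addr.sin_port));
    return fd;   /* the socket stays bound, so nobody can steal the port */
}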
If we can stick to that, wonderful; otherwise we should at least honor
the local port range and the reserved ports, to avoid clashes with
third-party applications and to make firewall management easier.

Does libvirt honor the local port range and the reserved port range?
> The window is not that long. Maybe my description in the previous e-mail
> wasn't that clear. At the destination, where the port allocation takes
> place:
>
> 1) libvirt uses its internal module to find a free port
> 2) libvirt spawns qemu and tells it to bind to that port
> 3) libvirt reports back to the migration source host with the port number.
>
> So the window is not that big - probably just a couple of milliseconds, as
> it's just the part between steps 1 and 2. But there will always be some
> window - even if we implemented what I suggested earlier. That's one of
> the limitations of the kernel API: one simply cannot detect that a port is
> free and bind to it atomically.
Makes sense; still, we have a window here :-)
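
To make that race concrete, here is a hypothetical sketch (not
libvirt's actual allocator) of the "probe now, bind later" pattern
from steps 1 and 2:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

static int find_free_port(int lo, int hi)
{
    int port;

    for (port = lo; port <= hi; port++) {
        struct sockaddr_in addr = { .sin_family = AF_INET,
                                    .sin_port = htons(port) };
        int fd = socket(AF_INET, SOCK_STREAM, 0);

        if (fd < 0)
            return -1;
        if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
            close(fd);    /* the port was free a moment ago...             */
            return port;  /* ...but qemu only binds to it later (step 2),
                           * and anyone can grab it in between: the window */
        }
        close(fd);        /* in use - try the next candidate */
    }
    return -1;
}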
>>
>> Don't you think libvirt needs a fallback mechanism here, since the
>> port numbers could be in the ephemeral port range and are free to be
>> used by any application?
> You should fix the application then. Or its configuration. We may end up
> in a fallback loop if two applications with the fallback start fighting
> over the ports.
>
> I mean, if it were libvirt that allocated the migration port (socket) for
> qemu and then merely passed the FD to it, libvirt could make a couple of
> attempts to allocate the port (bind a socket to it). However, if it is
> still qemu doing that, we would need to spawn qemu each time we want to
> try the next port. That's not very efficient.
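
That FD-passing alternative might look roughly like this (a
hypothetical sketch, not actual libvirt code): the parent binds and
listens itself, retrying on EADDRINUSE, and the child inherits a
socket that is already safely bound:

#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

int bind_listening_fd(int lo, int hi)
{
    int port;

    for (port = lo; port <= hi; port++) {
        struct sockaddr_in addr = { .sin_family = AF_INET,
                                    .sin_port = htons(port) };
        int fd = socket(AF_INET, SOCK_STREAM, 0);

        if (fd < 0)
            return -1;
        if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0 &&
            listen(fd, 1) == 0)
            return fd;   /* stays bound - hand this FD to the child */
        close(fd);       /* e.g. EADDRINUSE: retry with the next port */
    }
    return -1;
}

The child would then inherit the FD across fork()/exec() (with
FD_CLOEXEC cleared); qemu, for instance, can take an incoming
migration stream over an already-open descriptor via its fd: transport.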
We have been through something like this. In our previous
implementation, glusterd (the management daemon) used to allocate a
free port and pass it to the brick process (glusterfsd), which would
then bind to it; if someone grabbed that port within the window, the
brick would simply fail. We then thought of implementing a fallback
mechanism [1] that fetches a new port whenever bind fails because of a
clash/in-use port. After that we came up with a design that respects
the local port range and the reserved port range [2]; it honors
everything, and even firewall management has become very easy for us
now. In this design we have left the port binding to glusterfsd (the
brick process) itself, so there is no window any more.

I'd suggest going through the commit messages in the links below; they
should make the design easy to appreciate.
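
As a footnote on "honoring the local port range": discovering the
range the kernel assigns from is trivial, e.g. with this tiny
Linux-only sketch (hypothetical, not gluster code), so statically
configured ports can be kept outside it:

#include <stdio.h>

int local_port_range(int *lo, int *hi)
{
    FILE *f = fopen("/proc/sys/net/ipv4/ip_local_port_range", "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%d %d", lo, hi) == 2);  /* file holds "low high" */
    fclose(f);
    return ok ? 0 : -1;
}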
> Frankly, I have no idea why we don't allocate the port ourselves. I'll
> ask our migration guru offline and let you know :)
Yeah, +1 for this :-)
[1] http://review.gluster.org/#/c/14235/
[2] http://review.gluster.org/#/c/14613/
Cheers Michal,
--
Prasanna
> Michal