Hello Michael and Peter,
Exactly, the results are not so compelling, because so far I have only run
the comparison on servers widely used for production in our data center.
Their network adapters are:
Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5720 2-port Gigabit Ethernet PCIe
InfiniBand controller: Mellanox Technologies MT27800 Family [ConnectX-5]
These do not meet our purpose. I can choose RDMA or TCP for VM migration:
on these two hosts, RDMA traffic goes through InfiniBand and TCP traffic
through Ethernet. One is standby while the other is active.
Now I'll try on a server with more recent Ethernet and InfiniBand
network adapters. One of them has:
BCM57414 NetXtreme-E 10Gb/25Gb RDMA Ethernet Controller (rev 01)
A comparison between RDMA and TCP on the same NIC could make more sense.
Best regards,
Yu Zhang @ IONOS Cloud
On Thu, May 16, 2024 at 7:30 PM Michael Galaxy <mgalaxy(a)akamai.com> wrote:
>
> These are very compelling results, no?
>
> (40gbps cards, right? Are the cards active/active? or active/standby?)
>
> - Michael
>
> On 5/14/24 10:19, Yu Zhang wrote:
> > Hello Peter and all,
> >
> > I did a comparison of the VM live-migration speeds between RDMA and
> > TCP/IP on our servers
> > and plotted the results to get an initial impression. Unfortunately,
> > the Ethernet NICs are not recent ones, so the comparison may not make
> > much sense. I can repeat it on servers with more recent Ethernet
> > NICs and keep you updated.
> >
> > It seems that the benefit of RDMA becomes obvious when the VM has
> > large memory and is running a memory-intensive workload.
> >
> > Best regards,
> > Yu Zhang @ IONOS Cloud
> >
> > On Thu, May 9, 2024 at 4:14 PM Peter Xu <peterx(a)redhat.com> wrote:
> >> On Thu, May 09, 2024 at 04:58:34PM +0800, Zheng Chuan via wrote:
> >>> It's good news to see the socket abstraction for RDMA!
> >>> When I developed the series above, the biggest pain was that RDMA
> >>> migration has no QIOChannel abstraction, so I needed to use a
> >>> 'fake channel' for it, which is awkward in the code implementation.
> >>> So, as far as I know, we can do this as follows:
> >>> i. First, evaluate whether rsocket is good enough to satisfy our
> >>> fundamental QIOChannel abstraction.
> >>> ii. If it works, see whether it gives us the opportunity to hide the
> >>> details of the RDMA protocol inside rsocket, by removing most of the
> >>> code in rdma.c and also some hacks in the main migration process.
> >>> iii. Implement advanced features like multi-fd and multi-uri for
> >>> RDMA migration.
> >>>
> >>> Since I am not familiar with rsocket, I need some time to look at it
> >>> and do some quick verification of RDMA migration based on rsocket.
> >>> But yes, I am willing to be involved in this refactoring work and to
> >>> see if we can make this migration feature better :)
> >> Based on what we have now, it looks like we'd better halt the
> >> deprecation process a bit, so I think we shouldn't need to rush it at
> >> least in 9.1 then, and we'll need to see how it goes on the refactoring.
> >>
> >> It'll be perfect if rsocket works; otherwise, supporting multifd with
> >> little overhead / exported APIs would also be a good thing in general,
> >> with whatever approach. And obviously this is all based on the fact that
> >> we can get resources from companies to support this feature first.
> >>
> >> Note that so far nobody has compared RDMA vs. NIC performance, so I hope
> >> that if any of us can provide some test results, please do so. Many
> >> people are saying RDMA is better, but I haven't yet seen any numbers
> >> comparing it with modern TCP networks. I don't want to have old
> >> impressions floating around even if things might have changed. When we
> >> have consolidated results, we should share them and also reflect that
> >> in QEMU's migration docs when an RDMA document page is ready.
> >>
> >> Chuan, please check the whole thread discussion, it may help to understand
> >> what we are looking for on rdma migrations [1]. Meanwhile please feel free
> >> to sync with Jinpu's team and see how to move forward with such a project.
> >>
> >> [1] https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/87frwatp7n...
> >>
> >> Thanks,
> >>
> >> --
> >> Peter Xu
> >>