These are very compelling results, no?
(40 Gbps cards, right? Are the cards active/active or active/standby?)
- Michael
On 5/14/24 10:19, Yu Zhang wrote:
> Hello Peter and all,
>
> I compared VM live-migration speeds between RDMA and TCP/IP on our
> servers and plotted the results to get an initial impression.
> Unfortunately, the Ethernet NICs are not recent ones, so the comparison
> may not be very meaningful. I can repeat it on servers with more recent
> Ethernet NICs and keep you updated.
>
> It seems that the benefits of RDMA become obvious when the VM has a
> large amount of memory and is running a memory-intensive workload.
>
> Best regards,
> Yu Zhang @ IONOS Cloud
>
> On Thu, May 9, 2024 at 4:14 PM Peter Xu <peterx(a)redhat.com> wrote:
>> On Thu, May 09, 2024 at 04:58:34PM +0800, Zheng Chuan via wrote:
>>> It's good news to see the socket abstraction for RDMA!
>>> When I was developing the series above, the biggest pain was that RDMA
>>> migration has no QIOChannel abstraction, so I needed to use a 'fake
>>> channel' for it, which is awkward in the code implementation.
>>> So, as far as I know, we can do this by:
>>> i. first, evaluating whether rsocket is good enough to satisfy our
>>>    fundamental QIOChannel abstraction;
>>> ii. if it works right, then seeing whether it gives us the opportunity
>>>    to hide the details of the rdma protocol inside rsocket, by removing
>>>    most of the code in rdma.c and also some hacks in the main migration
>>>    process;
>>> iii. implementing advanced features like multi-fd and multi-uri for
>>>    rdma migration.
>>>
>>> Since I am not familiar with rsocket, I need some time to look at it and
>>> do some quick verification of rdma migration based on rsocket.
>>> But yes, I am willing to be involved in this refactoring work and to see
>>> if we can make this migration feature better :)
>> Based on what we have now, it looks like we'd better halt the deprecation
>> process a bit, so I think we shouldn't need to rush it at least in 9.1
>> then, and we'll need to see how it goes on the refactoring.
>>
>> It'll be perfect if rsocket works; otherwise, supporting multifd with little
>> overhead / exported APIs would also be a good thing in general, whatever
>> the approach. And obviously this is all based on the fact that we can get
>> resources from companies to support this feature first.
>>
>> Note that so far nobody has compared rdma vs. nic perf, so if any of us
>> can provide some test results, please do so. Many people are saying RDMA
>> is better, but I haven't yet seen any numbers comparing it with modern TCP
>> networks. I don't want to have old impressions floating around even if
>> things might have changed. When we have consolidated results, we should
>> share them and also reflect that in QEMU's migration docs when an rdma
>> document page is ready.
>>
>> Chuan, please check the whole thread discussion, it may help to understand
>> what we are looking for on rdma migrations [1]. Meanwhile please feel free
>> to sync with Jinpu's team and see how to move forward with such a project.
>>
>> [1] https://urldefense.com/v3/__https://lore.kernel.org/qemu-devel/87frwatp7n...
>>
>> Thanks,
>>
>> --
>> Peter Xu
>>