Yu,
On Thu, Apr 11, 2024 at 06:36:54PM +0200, Yu Zhang wrote:
> > 1) Either a CI test covering at least the major RDMA paths, or at least
> >    periodically tests for each QEMU release will be needed.
>
> We use a batch of regression test cases for the stack, which covers the
> test for QEMU. I did such test for most of the QEMU releases planned as
> candidates for rollout.

The least I can think of is a few tests per release. That is definitely
too few if a single release can already break things.

> The migration test needs a pair of (either physical or virtual) servers with
> InfiniBand network, which makes it difficult to do on a single server. The
> nested VM could be a possible approach, for which we may need virtual
> InfiniBand network. Is SoftRoCE [1] a choice? I will try it and let you know.
>
> [1] https://enterprise-support.nvidia.com/s/article/howto-configure-soft-roce

Does it require a kernel driver? The fewer dependencies on the host
kernel, hardware, and so on, the better.

I am wondering whether there could be a library that does everything in
userspace, translating RDMA into e.g. socket messages (so ultimately maybe
something like IP->rdma->IP, just to cover the "rdma" procedures). That
would then work reliably for CI.

Please also see my full list, though, especially entry 4). Thanks already
for looking for solutions on the tests, but I don't want you to spend your
time only to find that tests are not enough even once they are ready. I
think we need people who understand this stuff well enough, have dedicated
time, and look after it.

Thanks,
--
Peter Xu