On Thu, 2012-09-06 at 12:14 +0200, Andrew Holway wrote:
On Sep 5, 2012, at 4:02 PM, Avi Kivity wrote:
> On 09/04/2012 03:04 PM, Myklebust, Trond wrote:
>> On Tue, 2012-09-04 at 11:31 +0200, Andrew Holway wrote:
>>> Hello.
>>>
>>> # Avi Kivity avi(a)redhat recommended I copy kvm in on this. It would also
>>> seem relevant to libvirt. #
>>>
>>> I have a CentOS 6.2 server and a CentOS 6.2 client.
>>>
>>> [root@store ~]# cat /etc/exports
>>> /dev/shm 10.149.0.0/16(rw,fsid=1,no_root_squash,insecure)
>>> (I have also tried with non-tmpfs targets.)
>>>
>>>
>>> [root@node001 ~]# cat /etc/fstab
>>> store.ibnet:/dev/shm /mnt nfs rdma,port=2050,defaults 0 0
>>>
>>>
>>> I wrote a little for-loop one-liner that dd'd the CentOS net install
>>> image to a file called 'hello' and then checksummed that file. Each
>>> iteration uses a different block size.
>>>
>>> Non-DIRECT_IO seems to work fine. With DIRECT_IO, the 512-byte, 1K and 2K
>>> block sizes produce corrupted files.
>>
>>
>> That is expected behaviour. DIRECT_IO over RDMA needs to be page aligned
>> so that it can use the more efficient RDMA READ and RDMA WRITE memory
>> semantics (instead of the SEND/RECEIVE channel semantics).
>
> Shouldn't subpage requests fail then? O_DIRECT block requests fail for
> subsector writes, instead of corrupting your data.

But silent data corruption is so much fun!!

A couple of RDMA folks are looking into why this is happening. I'm
hoping they will get back to me soon.
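
For anyone who wants to reproduce the test described above, here is a rough
Python sketch of that kind of loop. The original dd one-liner was not posted
in this thread, so this is an illustration under assumptions, not the
reporter's script: the function names are hypothetical, and SHA-256 stands in
for whatever checksum tool was actually used.

```python
import hashlib
import mmap
import os


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read with ordinary buffered I/O."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_with_block_size(src, dst, block_size, direct=False):
    """Copy src to dst using writes of block_size bytes.

    With direct=True the destination is opened with O_DIRECT (Linux only).
    Assumption: the source size is a multiple of block_size in that case,
    since a trailing partial block may be rejected with EINVAL.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    if direct:
        flags |= os.O_DIRECT
    # An anonymous mmap is page aligned, so the user buffer itself is
    # aligned even when block_size is smaller than a page.
    view = memoryview(mmap.mmap(-1, block_size))
    fd = os.open(dst, flags, 0o644)
    try:
        with open(src, "rb") as f:
            while True:
                n = f.readinto(view)
                if not n:
                    break
                written = 0
                while written < n:
                    written += os.write(fd, view[written:n])
    finally:
        os.close(fd)


def check_block_sizes(src, dst, block_sizes, direct=False):
    """Map each block size to True if the copy's checksum matches the source."""
    expected = sha256_of(src)
    results = {}
    for block_size in block_sizes:
        copy_with_block_size(src, dst, block_size, direct=direct)
        results[block_size] = sha256_of(dst) == expected
    return results
```

Something like check_block_sizes("netinstall.iso", "/mnt/hello", [512, 1024,
2048, 4096], direct=True) would then exercise the sub-page sizes reported as
corrupting; the ISO file name here is a placeholder.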
--
Trond Myklebust
Linux NFS client maintainer
NetApp
Trond.Myklebust(a)netapp.com
www.netapp.com