On 8/8/23 6:00 AM, Stefano Garzarella wrote:
On Mon, Aug 07, 2023 at 03:41:21PM +0200, Peter Krempa wrote:
> On Thu, Aug 03, 2023 at 09:48:01 +0200, Stefano Garzarella wrote:
>> On Wed, Aug 2, 2023 at 10:33 PM Jonathon Jongsma
>> <jjongsma@redhat.com> wrote:
>> > On 7/24/23 8:05 AM, Peter Krempa wrote:
>>
>> [...]
>>
>> > >
>> > > I've also noticed that using 'qcow2' format for the device
>> > > doesn't work:
>> > >
>> > > error: internal error: process exited while connecting to
>> > > monitor: 2023-07-24T12:54:15.818631Z qemu-system-x86_64: -blockdev
>> > > {"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-1-storage"}:
>> > > Could not read qcow2 header: Invalid argument
>> > >
>> > > If that is supposed to work, then qemu devs will probably need to
>> > > know about that; if that is not supposed to work, libvirt needs to
>> > > add a check, because the error doesn't tell much. It's also
>> > > possible I've messed up when formatting the image though, as I
>> > > didn't really try to figure out what's happening.
>> > >
>> >
>> >
>> > That's a good question, and I don't actually know the answer. Were
>> > you using an actual vdpa block device for your tests or were you
>> > using the vdpa block simulator kernel module? How did you set it up?
>> > Adding Stefano to cc for his thoughts.
>>
>> Yep, I would also like to understand how you initialized the device
>> with a qcow2 format.
>
> Naively, I simply used it as 'raw' at first and formatted it from the
> guest OS. Then I shut down the VM and started it back up with the
> image format reconfigured as qcow2. This normally works with
> real-file-backed storage, and since the vdpa simulator seems to
> persist the contents, I assumed this would work too.
Cool, I'll try that.
Can you try rebooting the VM with the device attached as `raw` and
reading the qcow2 image back from the guest OS?
Note: there could be some bugs in the simulator!
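(A quick way to check what actually landed on the device, assuming it
shows up as /dev/vdb in the guest, is to read back the qcow2 magic
bytes:

guest$ head -c 4 /dev/vdb | xxd
00000000: 5146 49fb                                QFI.

If the "QFI\xfb" magic is intact, the simulator persisted the header.)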
>
>> Theoretically, the best use case for vDPA block is that the backend
>> handles formats and QEMU sees just a virtio device; but since it is
>> a blockdev, we should be able to use formats anyway, so it should
>> work.
>
> Yeah, ideally there will be no format driver in qemu used for these
> devices (this is not yet the case; I'll need to fix libvirt to stop
> using the 'raw' driver when it's not needed).
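(For illustration, the redundant layer is the extra raw format node
that libvirt currently inserts on top of the storage node, roughly:

-blockdev '{"driver":"virtio-blk-vhost-vdpa","path":"/dev/vhost-vdpa-0","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false}}' \
-blockdev '{"driver":"raw","file":"libvirt-1-storage","node-name":"libvirt-1-format"}' \
-device virtio-blk-pci,drive=libvirt-1-format

Ideally the -device would reference libvirt-1-storage directly. This
JSON is a sketch pieced together from the error messages in this
thread, not actual libvirt output.)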
>
> Here I'm more interested in whether it is supposed to work, in which
> case we want to allow using qcow2 as a format in libvirt, or whether
> it's not supposed to work and we should forbid it before the user gets
> a suboptimal error message such as the current one.
This is a good question. We certainly haven't tested it, because it's an
uncommon scenario, but as I said before, maybe it should work. I need to
look into it more carefully.
>
>>
>> For now, while waiting for real hardware, the only way to test vDPA
>> block support in QEMU is to use the simulator in the kernel or VDUSE.
>>
>> With the kernel simulator we only have a 128 MB ramdisk available;
>> with VDUSE you can use QSD with any file:
>>
>> $ modprobe -a vhost_vdpa vduse
>> $ qemu-storage-daemon \
>>     --blockdev file,filename=/path/to/image.qcow2,cache.direct=on,aio=native,node-name=file \
>>     --blockdev qcow2,file=file,node-name=qcow2 \
>>     --export vduse-blk,id=vduse0,name=vduse0,num-queues=1,node-name=qcow2,writable=on
>>
>> $ vdpa dev add name vduse0 mgmtdev vduse
>>
>> Then you have a /dev/vhost-vdpa-X device that you can use with the
>> `virtio-blk-vhost-vdpa` blockdev (note: VDUSE requires QEMU with a
>> memory-backend with `share=on`), but using raw format since the qcow2
>> is handled by QSD.
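(As a sanity check before pointing QEMU at it, you can confirm that the
device exists and that the vhost-vdpa driver picked it up; the index in
the device node below is an assumption:

$ vdpa dev list
$ ls /dev/vhost-vdpa-*
/dev/vhost-vdpa-0)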
>> Of course, we should be able to use a raw file with QSD and qcow2 on
>> qemu (although it's not the optimal configuration), but I don't know
>> how to initialize a `virtio-blk-vhost-vdpa` blockdev with a qcow2
>> image :-(
>
> With the above qemu storage daemon you should be able to do that by
> simply dropping the qcow2 format driver and exposing a qcow2-formatted
> image directly. It works similarly with NBD:
>
> I've formatted 2 qcow2 images:
>
> # qemu-img create -f qcow2 /root/image1.qcow2 100M
> # qemu-img create -f qcow2 /root/image2.qcow2 100M
>
> And then exported them both via vduse and nbd without interpreting
> qcow2, thus making the QSD into just a dumb storage device:
>
> # qemu-storage-daemon \
>     --blockdev file,filename=/root/image1.qcow2,cache.direct=on,aio=native,node-name=file1 \
>     --export vduse-blk,id=vduse0,name=vduse0,num-queues=1,node-name=file1,writable=on \
>     --blockdev file,filename=/root/image2.qcow2,cache.direct=on,aio=native,node-name=file2 \
>     --nbd-server addr.type=unix,addr.path=/tmp/nbd.sock \
>     --export nbd,id=nbd0,node-name=file2,writable=on,name=exportname
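(Before wiring this into libvirt, the NBD export can be sanity-checked
with qemu's NBD URI syntax; probing it with qemu-img should report the
format as qcow2, since QSD exposes the raw file contents:

# qemu-img info 'nbd+unix:///exportname?socket=/tmp/nbd.sock')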
Cool! Thanks for sharing!
>
> Now when I start a VM using the NBD export in qcow2 format:
>
>     <disk type='network' device='disk'>
>       <driver name='qemu' type='qcow2'/>
>       <source protocol='nbd' name='exportname'>
>         <host transport='unix' socket='/tmp/nbd.sock'/>
>       </source>
>       <target dev='vda' bus='virtio'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
>     </disk>
>
> The VM starts fine, but when using:
>
>     <disk type='vhostvdpa' device='disk'>
>       <driver name='qemu' type='qcow2' cache='none'/>
>       <source dev='/dev/vhost-vdpa-0'/>
>       <target dev='vda' bus='virtio'/>
>       <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
>     </disk>
>
> I get:
>
> error: internal error: QEMU unexpectedly closed the monitor
> (vm='vdpa'): 2023-08-07T12:34:21.628520Z qemu-system-x86_64: -blockdev
> {"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-1-storage"}:
> Could not read qcow2 header: Invalid argument
Hmm, I just tried this scenario using QEMU directly and it worked.
These are the steps I did (qemu upstream,
commit 9400601a689a128c25fa9c21e932562e0eeb7a26):
./build/storage-daemon/qemu-storage-daemon \
    --blockdev file,filename=test.qcow2,cache.direct=on,aio=native,node-name=file \
    --export vduse-blk,id=vduse0,name=vduse0,num-queues=1,node-name=file,writable=on

vdpa dev add name vduse0 mgmtdev vduse

./build/qemu-system-x86_64 -m 512M -smp 2 \
    -M q35,accel=kvm,memory-backend=mem \
    -drive file=f38-vm-build.qcow2,format=qcow2,if=none,id=hd0 \
    -device virtio-blk-pci,drive=hd0,bootindex=1 \
    -blockdev node-name=drive_src1,driver=virtio-blk-vhost-vdpa,path=/dev/vhost-vdpa-0,cache.direct=on \
    -blockdev qcow2,node-name=qcow2,file=drive_src1 \
    -device virtio-blk-pci,id=src1,bootindex=2,drive=qcow2 \
    -object memory-backend-file,share=on,id=mem,size=512M,mem-path="/dev/hugepages"
Then I'm able to see /dev/vdb and /dev/vdb1.
(test.qcow2 has a fs on the first partition)
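(One hypothetical way to prepare such an image with qemu-nbd; the size
and filesystem type here are assumptions:

$ qemu-img create -f qcow2 test.qcow2 128M
$ modprobe nbd
$ qemu-nbd -c /dev/nbd0 test.qcow2
$ echo 'type=83' | sfdisk /dev/nbd0
$ mkfs.ext4 /dev/nbd0p1
$ qemu-nbd -d /dev/nbd0)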
I mounted vdb1 and ran md5sum on a file.
Then I turned off the machine, moved the `-blockdev qcow2...` from qemu
to QSD, repeated the same steps, and checked that the md5 was the same.
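(Concretely, following the pattern above, the QSD invocation should
become something like:

./build/storage-daemon/qemu-storage-daemon \
    --blockdev file,filename=test.qcow2,cache.direct=on,aio=native,node-name=file \
    --blockdev qcow2,file=file,node-name=qcow2 \
    --export vduse-blk,id=vduse0,name=vduse0,num-queues=1,node-name=qcow2,writable=on

with the QEMU-side `-device virtio-blk-pci,...` then pointing at
drive_src1 directly instead of the dropped qcow2 node.)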
So it seems to work, but maybe our setups differ somewhere.
My host kernel is 6.4.7-200.fc38.x86_64.
Thanks,
Stefano
By the way, I get the same "Could not read qcow2 header" error that
Peter reported when I use this direct qemu command line. My laptop is a
little bit behind, so I'm still on Fedora 37.
Jonathon