Sent: Thursday, November 02, 2023 at 2:34 PM
From: "Martin Kletzander" <mkletzan(a)redhat.com>
To: "daggs" <daggs(a)gmx.com>
Cc: users(a)lists.libvirt.org
Subject: Re: hdd kills vm
On Wed, Nov 01, 2023 at 10:24:05PM +0100, daggs wrote:
>> Sent: Wednesday, November 01, 2023 at 10:06 AM
>> From: "Martin Kletzander" <mkletzan(a)redhat.com>
>> To: "daggs" <daggs(a)gmx.com>
>> Cc: users(a)lists.libvirt.org
>> Subject: Re: hdd kills vm
>>
>> On Tue, Oct 31, 2023 at 05:58:32PM +0100, daggs wrote:
>> >> Sent: Thursday, October 26, 2023 at 9:50 AM
>> >> From: "Martin Kletzander" <mkletzan(a)redhat.com>
>> >> To: "daggs" <daggs(a)gmx.com>
>> >> Cc: libvir-list(a)redhat.com
>> >> Subject: Re: hdd kills vm
>> >>
>> >> On Wed, Oct 25, 2023 at 03:06:55PM +0200, daggs wrote:
>> >> >> Sent: Tuesday, October 24, 2023 at 5:28 PM
>> >> >> From: "Martin Kletzander"
<mkletzan(a)redhat.com>
>> >> >> To: "daggs" <daggs(a)gmx.com>
>> >> >> Cc: libvir-list(a)redhat.com
>> >> >> Subject: Re: hdd kills vm
>> >> >>
>> >> >> On Mon, Oct 23, 2023 at 04:59:08PM +0200, daggs wrote:
>> >> >> >Greetings Martin,
>> >> >> >
>> >> >> >> Sent: Sunday, October 22, 2023 at 12:37 PM
>> >> >> >> From: "Martin Kletzander"
<mkletzan(a)redhat.com>
>> >> >> >> To: "daggs" <daggs(a)gmx.com>
>> >> >> >> Cc: libvir-list(a)redhat.com
>> >> >> >> Subject: Re: hdd kills vm
>> >> >> >>
>> >> >> >> On Fri, Oct 20, 2023 at 02:42:38PM +0200, daggs wrote:
>> >> >> >> >Greetings,
>> >> >> >> >
>> >> >> >> >I have a windows 11 vm running on my Gentoo using libvirt (9.8.0) + qemu (8.1.2), I'm passing almost all available resources to the vm
>> >> >> >> >(all 16 cpus, 31 out of 32 GB, nVidia gpu is pt), but the performance is not good, system lags, takes long time to boot.
>> >> >> >>
>> >> >> >> There are a couple of things that stand out to me in your setup and I'll
>> >> >> >> assume the host has one NUMA node with 8 cores, each with 2 threads,
>> >> >> >> just like you set it up in the guest XML.
>> >> >> >thats correct, see:
>> >> >> >$ lscpu | grep -i numa
>> >> >> >NUMA node(s): 1
>> >> >> >NUMA node0 CPU(s): 0-15
>> >> >> >
>> >> >> >however:
>> >> >> >$ dmesg | grep -i numa
>> >> >> >[ 0.003783] No NUMA configuration found
>> >> >> >
>> >> >> >can that be the reason?
>> >> >> >
>> >> >>
>> >> >> no, this is fine, 1 NUMA node is not a NUMA, technically, so that's
>> >> >> perfectly fine.
>> >> >thanks for clarifying it for me
>> >> >
>> >> >>
>> >> >> >>
>> >> >> >> * When you give the guest all the CPUs the host has there is nothing
>> >> >> >>   left to run the host tasks. You might think that there "isn't
>> >> >> >>   anything running", but there is, if only your init system, the kernel
>> >> >> >>   and the QEMU which is emulating the guest. This is definitely one of
>> >> >> >>   the bottlenecks.
>> >> >> >I've tried with 12 out of 16, same behavior.
>> >> >> >
>> >> >> >>
>> >> >> >> * The pinning of vCPUs to CPUs is half-suspicious. If you are trying to
>> >> >> >>   make vCPU 0 and 1 be threads on the same core and on the host the
>> >> >> >>   threads are represented as CPUs 0 and 8, then that's fine. If that is
>> >> >> >>   just copy-pasted from somewhere, then it might not reflect the current
>> >> >> >>   situation and can be a source of many scheduling issues (even once the
>> >> >> >>   above is dealt with).
>> >> >> >I found a site that does it for you, if it is wrong, can you point me to a place I can read about it?
>> >> >> >
>> >> >>
>> >> >> Just check what the topology is on the host and try to match it with the
>> >> >> guest one. If in doubt, then try it without the pinning.
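(For reference, the host-side topology can be read off directly; these are standard util-linux / sysfs commands, nothing libvirt-specific, and the "0,8" pairing shown is only an example:)

  $ lscpu -e=CPU,CORE,SOCKET   # thread siblings are the logical CPUs sharing a CORE number
  $ cat /sys/devices/system/cpu/cpu0/topology/thread_siblings_list   # e.g. prints "0,8"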
>> >> >I can try to play with it, what I don't know is what should be the mapping logic?
>> >> >
>> >>
>> >> Threads on the same core in the guest should map to threads on the same
>> >> core in the host. Since there is no NUMA that should be enough to get
>> >> the best performance. But even misconfiguration of this will not
>> >> introduce lags in the system if it has 8 CPUs. So that's definitely not
>> >> the root cause of the main problem, it just might be suboptimal.
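To make that mapping concrete, a sketch of what the pinning could look like, assuming lscpu reports CPUs 0/8, 1/9, ... 5/13 as the thread pairs of cores 0-5 (adjust to what the host actually reports; this deliberately leaves cores 6 and 7 to the host):

  <vcpu placement='static'>12</vcpu>
  <cputune>
    <vcpupin vcpu='0' cpuset='0'/>  <!-- vCPU 0 and 1 = the two threads of host core 0 -->
    <vcpupin vcpu='1' cpuset='8'/>
    <vcpupin vcpu='2' cpuset='1'/>  <!-- vCPU 2 and 3 = host core 1 -->
    <vcpupin vcpu='3' cpuset='9'/>
    <!-- ...and so on pairwise up to vCPU 11 on host core 5... -->
  </cputune>
  <cpu mode='host-passthrough'>
    <topology sockets='1' cores='6' threads='2'/>
  </cpu>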
>> >>
>> >> >>
>> >> >> >>
>> >> >> >> * I also seem to recall that Windows had some issues with systems that
>> >> >> >>   have too many cores. I'm not sure whether that was an issue with an
>> >> >> >>   edition difference or just with some older versions, or if it just did
>> >> >> >>   not show up in the task manager, but there was something that was
>> >> >> >>   fixed by using either more sockets or cores in the topology. This is
>> >> >> >>   probably not the issue for you though.
>> >> >> >>
>> >> >> >> >after trying a few ways to fix it, I've concluded that the issue might be related to the way the hdd is defined at the vm level.
>> >> >> >> >here is the xml: https://bpa.st/MYTA
>> >> >> >> >I assume that the hdd sits on the sata ctrl causing the issue but I'm not sure what is the proper way to fix it, any ideas?
>> >> >> >> >
>> >> >> >>
>> >> >> >> It looks like your disk is on SATA, but I don't see why that would be an
>> >> >> >> issue. Passing the block device to QEMU as VirtIO shouldn't cause that
>> >> >> >> much of a difference. Try measuring the speed of the disk on the host
>> >> >> >> and then in the VM maybe. Is that SSD or NVMe? I presume that's not
>> >> >> >> spinning rust, is it.
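(A crude way to get the host-side number, assuming hdparm is installed; inside the Windows guest something like winsat disk or CrystalDiskMark would give the comparison point:)

  $ hdparm -t /dev/sdX                                         # sequential read timing; replace sdX with the disk backing the guest
  $ dd if=/dev/sdX of=/dev/null bs=1M count=1024 iflag=direct  # a second opinion using direct reads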
>> >> >> >as seen, I have 3 drives, 2 cdroms as sata and one hdd pt as virtio, I read somewhere that if the controller of the virtio
>> >> >> >device is sata, then it doesn't use the virtio optimally.
>> >> >>
>> >> >> Well it _might_ be slightly more beneficial to use virtio-scsi or even
>> >> >> <disk type='block' device='lun'>, but I can't imagine that would make
>> >> >> the system lag. I'm not that familiar with the details.
>> >> >configure virtio-scsi and sata-scsi at the same time?
>> >> >
>> >>
>> >> Yes, forgot that, sorry. Try virtio-scsi. You could also go farther
>> >> and pass through the LUN or the whole HBA (if you don't need to access
>> >> any other disk on it) to the VM. Try the information presented here:
>> >>
>> >> https://libvirt.org/formatdomain.html#usb-pci-scsi-devices
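(For the <disk device='lun'> variant mentioned above, the shape would be roughly this; a sketch based on the formatdomain page, with the by-id path as a placeholder for whichever block device ends up being handed to the guest:)

  <controller type='scsi' index='0' model='virtio-scsi'/>
  <disk type='block' device='lun'>
    <driver name='qemu' type='raw'/>
    <source dev='/dev/disk/by-id/ata-EXAMPLE'/>  <!-- placeholder, use the real stable path -->
    <target dev='sda' bus='scsi'/>
    <address type='drive' controller='0' bus='0' target='0' unit='0'/>
  </disk>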
>> >>
>> >> >>
>> >> >> >it is a spindle, nvmes are too expensive where I live, frankly, I don't need lightning fast boot, the other BM machines running windows on spindle
>> >> >> >run it quite fast and they aren't half as fast as this server
>> >> >> >
>> >> >>
>> >> >> That might actually be related. The guest might think it is a different
>> >> >> type of disk and use completely suboptimal scheduling. This might
>> >> >> actually be solved by passing it as <disk device='lun'..., but at this
>> >> >> point I'm just guessing.
>> >> >I'll look into that, thanks.
>> >
>> >so bottom line, you suggest the following:
>> >1. remove the manual cpu pin, let qemu sort that out.
>>
>> You might try it, but of course pinning it is, in the end, the better option.
>>
>> >2. add a virtio scsi controller and connect the os hdd to it
>> >3. pass the hdd via scsi pt and not dev node
>> >4. if I'm able to do #3, no need to add device='lun' as it won't use the disk option
>> >
>>
>> First try (3), then you don't need to do anything else and if that
>> succeeds you have the superior configuration. If you can pass through
>> something that will not remove anything from your host system.
>>
>> >Dagg.
>> >
>>
>
>I've decided to first try #3 as you suggested, based on this output:
>$ lsscsi
>[0:0:0:0] disk ATA WDC WD1003FZEX-0 1A01 /dev/sda
>[1:0:0:0] disk ATA WDC WD10EZEX-08W 1A02 /dev/sdb
>[2:0:0:0] disk ATA SAMSUNG HD103SJ 0001 /dev/sdc
>[3:0:0:0] disk ATA SAMSUNG HD103SJ 0001 /dev/sdd
>[4:0:0:0] disk ATA ST1000DM005 HD10 00E5 /dev/sde
>[5:0:0:0] disk ATA WDC WD10EZEX-08W 1A02 /dev/sdf
>[6:0:0:0] disk Kingston DataTraveler 3.0 0000 /dev/sdg
>[7:0:0:0] cd/dvd TS8XDVDS TRANSCEND 1.02 /dev/sr0
>
>I deduced my data is 0:0:0:0, so I've added this to the file:
I have to trust you here, the link to the XML does not lead anywhere at the moment.
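(For what it's worth, the first field of the lsscsi output, [host:channel:target:lun], can be cross-checked against sysfs before relying on it; these are plain sysfs paths, nothing libvirt-specific:)

  $ readlink -f /sys/block/sda/device   # the resolved path contains .../host0/... for [0:0:0:0]
  $ ls /sys/class/scsi_host/            # host0 host1 ... one entry per HBA/port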
><controller type='scsi' index='0' model='virtio-scsi'>
>  <address type='pci' domain='0x0000' bus='0x00' slot='0x0c' function='0x0'/>
></controller>
><hostdev mode='subsystem' type='scsi' managed='no'>
With managed='no' you are responsible for detaching and re-attaching the device
for it to be accessible to QEMU. With managed='yes' libvirt can do that for
you. But be really really sure that it is the device you want to plug to the
guest domain.
so it should work if I set managed to 'yes'?
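(If managed='yes' is honoured for SCSI host devices the way it is described above, the block under discussion would simply become the following; a sketch reusing the same adapter/address values deduced from lsscsi, not a tested config:)

  <hostdev mode='subsystem' type='scsi' managed='yes'>
    <source>
      <adapter name='scsi_host0'/>
      <address bus='0' target='0' unit='0'/>
    </source>
    <address type='drive' controller='0' bus='0' target='0' unit='0'/>
  </hostdev>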
>
> > <source>
> > <adapter name='scsi_host0'/>
> > <address bus='0' target='0' unit='0'/>
> > </source>
> > <address type='drive' controller='0' bus='0' target='0' unit='0'/>
> ></hostdev>
> >removed the previous config and tried to boot, the vm didn't boot, the qemu log shows this:
> >char device redirected to /dev/pts/0 (label charserial0)
> >2023-11-01T05:00:27.949977Z qemu-system-x86_64: vfio: Cannot reset device 0000:07:00.4, depends on group 16 which is not owned.
> >2023-11-01T05:00:28.113089Z qemu-system-x86_64: vfio: Cannot reset device 0000:07:00.4, depends on group 16 which is not owned.
> >2023-11-01T05:01:04.511969Z qemu-system-x86_64: libusb_release_interface: -99 [OTHER]
> >2023-11-01T05:01:04.511993Z qemu-system-x86_64: libusb_release_interface: -99 [OTHER]
> >2023-11-01T17:22:48.200982Z qemu-system-x86_64: libusb_release_interface: -4 [NO_DEVICE]
> >2023-11-01T17:22:48.201015Z qemu-system-x86_64: libusb_release_interface: -4 [NO_DEVICE]
> >2023-11-01T17:22:48.201025Z qemu-system-x86_64: libusb_release_interface: -4 [NO_DEVICE]
> >2023-11-01T17:22:48.201035Z qemu-system-x86_64: libusb_release_interface: -4 [NO_DEVICE]
> >libusb_release_interface: -4 [NO_DEVICE]
> >libusb_release_interface: -4 [NO_DEVICE]
> >libusb_release_interface: -4 [NO_DEVICE]
> >libusb_release_interface: -4 [NO_DEVICE]
> >2023-11-01T20:37:31.246043Z qemu-system-x86_64: vfio: Cannot reset device 0000:07:00.4, depends on group 16 which is not owned.
> >2023-11-01T20:37:31.465993Z qemu-system-x86_64: vfio: Cannot reset device 0000:07:00.4, depends on group 16 which is not owned.
> >2023-11-01T20:38:07.049875Z qemu-system-x86_64: libusb_release_interface: -99 [OTHER]
> >2023-11-01T20:38:07.049910Z qemu-system-x86_64: libusb_release_interface: -99 [OTHER]
> >2023-11-01T20:38:07.050063Z qemu-system-x86_64: libusb_set_interface_alt_setting: -99 [OTHER]
> >2023-11-01T20:47:47.400781Z qemu-system-x86_64: libusb_release_interface: -99 [OTHER]
> >2023-11-01T20:47:47.400804Z qemu-system-x86_64: libusb_release_interface: -99 [OTHER]
> >2023-11-01 20:47:57.096+0000: shutting down, reason=shutdown
> >2023-11-01 20:57:37.514+0000: shutting down, reason=failed
> >
> >if I keep the scsi part but restore the previous device pt, it boots.
> >any idea why it failed booting?
> >
> >Dagg.
> >
>
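One thing that might be worth checking first, given the vfio messages in the log (a diagnostic sketch, not a confirmed cause; that warning by itself is often harmless): see what else shares IOMMU group 16 with 0000:07:00.4 and which driver is currently bound to that device.

  $ ls /sys/kernel/iommu_groups/16/devices/   # everything that shares the group the reset depends on
  $ lspci -nnk -s 07:00.4                     # vendor/device IDs and the driver currently bound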