On 30/05/2024 18:00, Dario Faggioli via Devel wrote:
> On Thu, 2024-05-30 at 14:45 +0200, Igor Mammedov wrote:
> > Usability of huge VMs (incl. a large number of vCPUs) heavily depends
> > on the guest OS used. What works for one might not work for another.
> > For a general-purpose OS, adding an IOMMU is typically necessary to
> > make vCPUs over 254 usable; for Windows, adjusting -smp might make a
> > difference.
> >
> > On the other hand, a specially built guest with a tailored QEMU config
> > might work without an IOMMU just fine (David was the one who patched
> > KVM to that effect, if I recall correctly).
> >
> So, as far as I know, there is no way for any OS with any configuration
> to bring up more than 255 vCPUs without a vIOMMU.
> IIUC, it's a matter of the number of bits available in the I/O APIC IRQ
> destination register. With only that available, and it being only
> 8 bits wide, it's just not doable.

On VMs, there is alternatively a KVM PV feature that bumps the limit to
32k vCPUs, i.e. KVM_FEATURE_MSI_EXT_DEST_ID (on QEMU the CPU feature name
is +kvm-msi-ext-dest-id).

It extends that destination field with 7 otherwise-unused bits, giving a
15-bit destination ID and hence the 32k limit (which on hardware would
cross a page boundary in the IOAPIC entry, IIUC), without needing IOMMU
interrupt remapping. But the guest needs to understand that feature
(support has been there since Linux v5.15 or around that timeframe). I
think this is what Igor is referring to.
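
To make the bit arithmetic concrete, here is a rough sketch of how the
destination ID ends up in an MSI address with and without the extension.
It assumes the layout described in the KVM CPUID documentation (low 8 bits
of the ID in address bits 19-12, the extended 7 bits in address bits 11-5).
The function name and the constant are made up for illustration, this is
not code from Linux, KVM or QEMU.

```python
# Illustrative only: msi_address() and MSI_ADDR_BASE are hypothetical names.
# Assumed layout: destination ID bits 7:0 go in address bits 19:12, and with
# KVM_FEATURE_MSI_EXT_DEST_ID the ID bits 14:8 go in address bits 11:5.
MSI_ADDR_BASE = 0xFEE00000


def msi_address(dest_id: int, ext_dest_id: bool) -> int:
    """Pack an APIC destination ID into a 32-bit MSI address."""
    width = 15 if ext_dest_id else 8
    if not 0 <= dest_id < (1 << width):
        raise ValueError(f"destination ID {dest_id} does not fit in {width} bits")
    low = dest_id & 0xFF           # classic 8-bit destination field
    high = (dest_id >> 8) & 0x7F   # extra 7 bits, zero when ID < 256
    return MSI_ADDR_BASE | (low << 12) | (high << 5)


if __name__ == "__main__":
    print(hex(msi_address(300, ext_dest_id=True)))
    try:
        msi_address(300, ext_dest_id=False)
    except ValueError as err:
        print("without the extension:", err)
```

With only 8 bits, any ID of 256 or above is rejected, which is the limit
Dario describes. With the extra 7 bits the ceiling moves to 2^15 = 32768.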
Joao