Hi Martin,
On 3/6/24 04:41, Martin Kletzander wrote:
> On Mon, Mar 04, 2024 at 03:54:23PM -0600, Michael Galaxy wrote:
>> Hi Martin,
>>
>> Answers inline. Thanks for helping with the review and all the tips!
>>
>> On 3/1/24 04:00, Martin Kletzander wrote:
>>> On Mon, Jan 29, 2024 at 04:43:53PM -0500, mgalaxy(a)akamai.com wrote:
>>>> From: Michael Galaxy <mgalaxy(a)akamai.com>
>>>>
>>>
>>>> In our case, we almost always have two NUMA nodes, so in that
>>>> example, we have two PMEM regions which are created on the
>>>> Linux kernel command line that get mounted into those two
>>>> locations for libvirt to use.
>>>>
>>>
>>> There are PMEM devices which you then expose as filesystems to use for
>>> libvirt as a backing for VM's PMEMs. Do I understand that correctly?
>>>
>>> If yes, how are these different? Can't they be passed through?
>>>
>> So, these are very different. QEMU currently already supports passing
>> through PMEM for guest internal use (the guest puts its own filesystem
>> onto the passed-through PMEM device).
>>
>> In our case, we are using the PMEM area only in the host to place the
>> QEMU memory backing for all guests into a single PMEM area.
>>
>> To support NUMA correctly, QEMU needs to support multiple host-level
>> PMEM areas which have been pre-configured to be NUMA aware. This is
>> strictly for the
> Is this preconfiguration something that libvirt should be able to do as
> well? How would anyone know which region is tied to which NUMA node?
> Shouldn't there be some probing for that?
That's a good question, but probably not. The preconfiguration is done via
the memmap=XXXX parameter on the kernel command line, and I doubt we want
libvirt in the business of managing something like that.
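
To make that concrete, the host setup looks roughly like this (the sizes,
offsets, device names, and mount points below are made up for illustration;
the offsets have to be chosen so that each reserved range falls inside the
physical address range of the NUMA node it is meant to serve):

  # kernel command line: reserve one PMEM region per NUMA node
  memmap=16G!64G memmap=16G!272G

  # after boot, the reserved ranges typically show up as /dev/pmem0 and
  # /dev/pmem1; put a DAX-capable filesystem on each and mount them where
  # libvirt can use them
  mkfs.xfs /dev/pmem0
  mkfs.xfs /dev/pmem1
  mount -o dax /dev/pmem0 /pmem0
  mount -o dax /dev/pmem1 /pmem1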
I do have checks in the patchset to handle corner cases where the user
provides input that does not match the existing number of NUMA nodes, but
beyond that, the reserved PMEM areas are strictly configured at kernel
boot time.
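
For completeness, once those per-node mounts exist, the QEMU-level result
we are after is roughly the following (again just a sketch: the object IDs,
guest sizes, and CPU assignments are arbitrary, and the host-side locality
comes purely from which mount each backing file lives on, not from any NUMA
binding policy):

  qemu-system-x86_64 ... \
    -object memory-backend-file,id=ram-node0,size=8G,mem-path=/pmem0,share=on,prealloc=on \
    -object memory-backend-file,id=ram-node1,size=8G,mem-path=/pmem1,share=on,prealloc=on \
    -numa node,nodeid=0,cpus=0-3,memdev=ram-node0 \
    -numa node,nodeid=1,cpus=4-7,memdev=ram-node1

That is the kind of placement the series is meant to let libvirt express,
using multiple per-NUMA-node backing locations instead of a single one.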
- Michael