On Wed, Dec 21, 2016 at 12:51:29PM +0800, Feng, Shaohe wrote:

Thanks, Dolpher. Reply inline.

On 2016-12-21 11:56, Du, Dolpher wrote:

Shaohe was dropped from the loop, adding him back.

-----Original Message-----
From: He Chen [mailto:he.chen@linux.intel.com]
Sent: Friday, December 9, 2016 3:46 PM
To: Daniel P. Berrange <berrange@redhat.com>
Cc: libvir-list@redhat.com; Du, Dolpher <dolpher.du@intel.com>; Zyskowski, Robert <robert.zyskowski@intel.com>; Daniluk, Lukasz <lukasz.daniluk@intel.com>; Zang, Rui <rui.zang@intel.com>; jdenemar@redhat.com
Subject: Re: [libvirt] [RFC] phi support in libvirt

On Mon, Dec 05, 2016 at 04:12:22PM +0000, Feng, Shaohe wrote:

Hi all:

As we know, Intel® Xeon Phi targets high-performance computing and other parallel workloads. Now that QEMU supports Phi virtualization, it is time for libvirt to support Phi.

Can you provide a pointer to the relevant QEMU changes?

Xeon Phi Knights Landing (KNL) has two primary hardware features. One is up to 288 CPUs, which needs patches to support; we are pushing those. The other is Multi-Channel DRAM (MCDRAM), which does not need any changes currently.

Let me introduce more about MCDRAM. MCDRAM is on-package high-bandwidth memory (~500GB/s). On the KNL platform, the hardware exposes MCDRAM to the OS as a separate, CPU-less, remote NUMA node, so MCDRAM is not allocated by default (since the MCDRAM node has no CPU, every CPU regards it as a remote node). In this way, MCDRAM can be reserved for specific applications.

Unlike a traditional x86 server, Phi has a special NUMA node that contains Multi-Channel DRAM (MCDRAM) but no CPUs. Currently libvirt requires a non-empty cpus argument for each NUMA node, such as:

<numa>
  <cell id='0' cpus='0-239' memory='80' unit='GiB'/>
  <cell id='1' cpus='240-243' memory='16' unit='GiB'/>
</numa>

In order to support Phi virtualization, libvirt needs to allow a NUMA cell definition without the 'cpus' attribute.
Such as:

<numa>
  <cell id='0' cpus='0-239' memory='80' unit='GiB'/>
  <cell id='1' memory='16' unit='GiB'/>
</numa>

When a cell has no 'cpus' attribute, QEMU will allocate its memory from MCDRAM by default instead of DDR.

There are separate concepts at play which your description here is mixing up.

First is the question of whether the guest NUMA node can be created with only RAM or CPUs, or a mix of both.

Second is the question of what kind of host RAM (MCDRAM vs DDR) is used as the backing store for the guest.

The guest NUMA node should be created with memory only (the same as the host's), and more importantly, its memory should be bound to (come from) the host MCDRAM node.

So I suggest libvirt distinguish the MCDRAM. The MCDRAM NUMA config would be as follows, adding a "mcdram" attribute to the "cell" element:

<numa>
  <cell id='0' cpus='0-239' memory='80' unit='GiB'/>
  <cell id='1' mcdram='16' unit='GiB'/>
</numa>

No, that is not backwards compatible for applications using libvirt. We already have a place for storing info about memory backing type, which we use for huge pages. mcdram should use the same approach IMHO, e.g.

<domain>
  ...
  <memoryBacking>
    <mcdram nodeset="3-4"/>
  </memoryBacking>
</domain>

to indicate that nodes 3 & 4 should use mcdram.

Regards,
Daniel
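
A minimal sketch of how an application might read the layout discussed in this thread. It assumes both the memory-only cell (no 'cpus' attribute) and the <mcdram nodeset="..."/> element under <memoryBacking>, which are proposals from this discussion and not confirmed libvirt schema. The sample XML and the helper names parse_nodeset and describe_cells are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical domain XML combining the memory-only cell and the proposed
# <mcdram> element from this thread; neither is confirmed libvirt schema.
DOMAIN_XML = """
<domain>
  <memoryBacking>
    <mcdram nodeset="1"/>
  </memoryBacking>
  <cpu>
    <numa>
      <cell id='0' cpus='0-239' memory='80' unit='GiB'/>
      <cell id='1' memory='16' unit='GiB'/>
    </numa>
  </cpu>
</domain>
"""


def parse_nodeset(spec):
    """Expand a nodeset string such as "3-4" or "0,2-3" into a set of ints."""
    nodes = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty strings and stray commas
        lo, _, hi = part.partition("-")
        nodes.update(range(int(lo), int(hi or lo) + 1))
    return nodes


def describe_cells(xml_text):
    """Return one dict per guest NUMA cell with its CPU and backing info."""
    root = ET.fromstring(xml_text)

    # Collect guest node ids that the (proposed) <mcdram> element covers.
    mcdram_nodes = set()
    for elem in root.findall("./memoryBacking/mcdram"):
        mcdram_nodes |= parse_nodeset(elem.get("nodeset", ""))

    cells = []
    for cell in root.iter("cell"):
        cell_id = int(cell.get("id"))
        cells.append({
            "id": cell_id,
            "cpus": cell.get("cpus"),            # None for a memory-only cell
            "memory_only": cell.get("cpus") is None,
            "mcdram_backed": cell_id in mcdram_nodes,
        })
    return cells


if __name__ == "__main__":
    for info in describe_cells(DOMAIN_XML):
        print(info)
```

The sketch keeps the two questions separate: whether a cell is memory-only is read from the <numa> definition, while the kind of host memory backing it is read from <memoryBacking>.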