[libvirt-users] VMs fail to start with NUMA configuration

I am using libvirt 0.10.2.2 and qemu-kvm 1.2.2 (qemu-kvm 1.2.0 + qemu 1.2.2 applied on top, plus a number of stability patches). I am having an issue where my VMs fail to start with the following message:

kvm_init_vcpu failed: Cannot allocate memory

Following the instructions at http://libvirt.org/formatdomain.html#elementsNUMATuning I've added the following to my vCPU configuration:

<vcpu placement='auto'>2</vcpu>

which libvirt expands, as I'd expect per the documentation, to:

<vcpu placement='auto'>2</vcpu>
<numatune>
  <memory mode='strict' placement='auto'/>
</numatune>

However, the VMs won't start, and the system is not low on memory.

# numactl --hardware
available: 8 nodes (0-7)
node 0 cpus: 0 4 8 12 16 20 24 28
node 0 size: 16374 MB
node 0 free: 11899 MB
node 1 cpus: 32 36 40 44 48 52 56 60
node 1 size: 16384 MB
node 1 free: 15318 MB
node 2 cpus: 2 6 10 14 18 22 26 30
node 2 size: 16384 MB
node 2 free: 15766 MB
node 3 cpus: 34 38 42 46 50 54 58 62
node 3 size: 16384 MB
node 3 free: 15347 MB
node 4 cpus: 3 7 11 15 19 23 27 31
node 4 size: 16384 MB
node 4 free: 15041 MB
node 5 cpus: 35 39 43 47 51 55 59 63
node 5 size: 16384 MB
node 5 free: 15202 MB
node 6 cpus: 1 5 9 13 17 21 25 29
node 6 size: 16384 MB
node 6 free: 15197 MB
node 7 cpus: 33 37 41 45 49 53 57 61
node 7 size: 16368 MB
node 7 free: 15669 MB

The system has 4 Opteron 6272 CPUs, which add up to a total of 64 cores, 16 cores per socket. These are the CPUs that Dan B noticed topology issues with in the past, but I thought that had been resolved. The capabilities are posted below; you'll notice the topology is incorrect.

# virsh capabilities
<capabilities>
  <host>
    <uuid>44454c4c-5300-1038-8031-c4c04f545331</uuid>
    <cpu>
      <arch>x86_64</arch>
      <model>Opteron_G4</model>
      <vendor>AMD</vendor>
      <topology sockets='1' cores='8' threads='2'/>
      <feature name='nodeid_msr'/>
      <feature name='wdt'/>
      <feature name='skinit'/>
      <feature name='ibs'/>
      <feature name='osvw'/>
      <feature name='cr8legacy'/>
      <feature name='extapic'/>
      <feature name='cmp_legacy'/>
      <feature name='fxsr_opt'/>
      <feature name='mmxext'/>
      <feature name='osxsave'/>
      <feature name='monitor'/>
      <feature name='ht'/>
      <feature name='vme'/>
    </cpu>
    <power_management>
      <suspend_disk/>
    </power_management>
    <migration_features>
      <live/>
      <uri_transports>
        <uri_transport>tcp</uri_transport>
      </uri_transports>
    </migration_features>
    <topology>
      <cells num='8'>
        <cell id='0'>
          <cpus num='8'> <cpu id='0'/> <cpu id='4'/> <cpu id='8'/> <cpu id='12'/> <cpu id='16'/> <cpu id='20'/> <cpu id='24'/> <cpu id='28'/> </cpus>
        </cell>
        <cell id='1'>
          <cpus num='8'> <cpu id='32'/> <cpu id='36'/> <cpu id='40'/> <cpu id='44'/> <cpu id='48'/> <cpu id='52'/> <cpu id='56'/> <cpu id='60'/> </cpus>
        </cell>
        <cell id='2'>
          <cpus num='8'> <cpu id='2'/> <cpu id='6'/> <cpu id='10'/> <cpu id='14'/> <cpu id='18'/> <cpu id='22'/> <cpu id='26'/> <cpu id='30'/> </cpus>
        </cell>
        <cell id='3'>
          <cpus num='8'> <cpu id='34'/> <cpu id='38'/> <cpu id='42'/> <cpu id='46'/> <cpu id='50'/> <cpu id='54'/> <cpu id='58'/> <cpu id='62'/> </cpus>
        </cell>
        <cell id='4'>
          <cpus num='8'> <cpu id='3'/> <cpu id='7'/> <cpu id='11'/> <cpu id='15'/> <cpu id='19'/> <cpu id='23'/> <cpu id='27'/> <cpu id='31'/> </cpus>
        </cell>
        <cell id='5'>
          <cpus num='8'> <cpu id='35'/> <cpu id='39'/> <cpu id='43'/> <cpu id='47'/> <cpu id='51'/> <cpu id='55'/> <cpu id='59'/> <cpu id='63'/> </cpus>
        </cell>
        <cell id='6'>
          <cpus num='8'> <cpu id='1'/> <cpu id='5'/> <cpu id='9'/> <cpu id='13'/> <cpu id='17'/> <cpu id='21'/> <cpu id='25'/> <cpu id='29'/> </cpus>
        </cell>
        <cell id='7'>
          <cpus num='8'> <cpu id='33'/> <cpu id='37'/> <cpu id='41'/> <cpu id='45'/> <cpu id='49'/> <cpu id='53'/> <cpu id='57'/> <cpu id='61'/> </cpus>
        </cell>
      </cells>
    </topology>
    <secmodel>
      <model>none</model>
      <doi>0</doi>
    </secmodel>
    <secmodel>
      <model>dac</model>
      <doi>0</doi>
    </secmodel>
  </host>
  <guest>
    <os_type>hvm</os_type>
    <arch name='i686'>
      <wordsize>32</wordsize>
      <emulator>/usr/bin/qemu-system-x86_64</emulator>
      <machine>pc-1.2</machine> <machine canonical='pc-1.2'>pc</machine> <machine>pc-1.1</machine> <machine>pc-1.0</machine> <machine>pc-0.15</machine> <machine>pc-0.14</machine> <machine>pc-0.13</machine> <machine>pc-0.12</machine> <machine>pc-0.11</machine> <machine>pc-0.10</machine> <machine>isapc</machine> <machine>none</machine>
      <domain type='qemu'>
      </domain>
      <domain type='kvm'>
        <emulator>/usr/bin/qemu-kvm</emulator>
        <machine>pc-1.2</machine> <machine canonical='pc-1.2'>pc</machine> <machine>pc-1.1</machine> <machine>pc-1.0</machine> <machine>pc-0.15</machine> <machine>pc-0.14</machine> <machine>pc-0.13</machine> <machine>pc-0.12</machine> <machine>pc-0.11</machine> <machine>pc-0.10</machine> <machine>isapc</machine> <machine>none</machine>
      </domain>
    </arch>
    <features>
      <cpuselection/>
      <deviceboot/>
      <pae/>
      <nonpae/>
      <acpi default='on' toggle='yes'/>
      <apic default='on' toggle='no'/>
    </features>
  </guest>
  <guest>
    <os_type>hvm</os_type>
    <arch name='x86_64'>
      <wordsize>64</wordsize>
      <emulator>/usr/bin/qemu-system-x86_64</emulator>
      <machine>pc-1.2</machine> <machine canonical='pc-1.2'>pc</machine> <machine>pc-1.1</machine> <machine>pc-1.0</machine> <machine>pc-0.15</machine> <machine>pc-0.14</machine> <machine>pc-0.13</machine> <machine>pc-0.12</machine> <machine>pc-0.11</machine> <machine>pc-0.10</machine> <machine>isapc</machine> <machine>none</machine>
      <domain type='qemu'>
      </domain>
      <domain type='kvm'>
        <emulator>/usr/bin/qemu-kvm</emulator>
        <machine>pc-1.2</machine> <machine canonical='pc-1.2'>pc</machine> <machine>pc-1.1</machine> <machine>pc-1.0</machine> <machine>pc-0.15</machine> <machine>pc-0.14</machine> <machine>pc-0.13</machine> <machine>pc-0.12</machine> <machine>pc-0.11</machine> <machine>pc-0.10</machine> <machine>isapc</machine> <machine>none</machine>
      </domain>
    </arch>
    <features>
      <cpuselection/>
      <deviceboot/>
      <acpi default='on' toggle='yes'/>
      <apic default='on' toggle='no'/>
    </features>
  </guest>
</capabilities>

Any suggestions are appreciated.

-- Doug Goldstein
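(A side note on the auto placement path: with placement='auto', libvirt asks numad for an advisory nodeset sized to the guest. Assuming numad is installed, its recommendation for a guest of this size can be checked by hand; the -w vCPUs:MiB form below matches the invocation libvirt itself makes, as seen in the debug log later in this thread:)

% numad -w 2:2048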

On Wed, Jan 23, 2013 at 3:45 PM, Doug Goldstein <cardoe@gentoo.org> wrote:
I am using libvirt 0.10.2.2 and qemu-kvm 1.2.2 (qemu-kvm 1.2.0 + qemu 1.2.2 applied on top plus a number of stability patches). Having issue where my VMs fail to start with the following message:
kvm_init_vcpu failed: Cannot allocate memory
I've tried libvirt 1.0.1 on this machine and had the same results. The capabilities have been fixed with that version on this machine, however, and now display:

<topology sockets='1' cores='64' threads='1'/>

I also forgot to mention that running the daemon with LIBVIRT_DEBUG=1 and --verbose didn't produce any more detail than what I originally posted.

-- Doug Goldstein
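(For cross-checking the reported topology against what the kernel itself sees, lscpu gives a quick summary that can be compared with virsh capabilities, e.g.:)

% lscpu | grep -iE 'socket|core|numa'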

On 2013-01-24 12:11, Doug Goldstein wrote:
On Wed, Jan 23, 2013 at 3:45 PM, Doug Goldstein<cardoe@gentoo.org> wrote:
I am using libvirt 0.10.2.2 and qemu-kvm 1.2.2 (qemu-kvm 1.2.0 + qemu 1.2.2 applied on top plus a number of stability patches). Having issue where my VMs fail to start with the following message:
kvm_init_vcpu failed: Cannot allocate memory
Smells like we have a problem setting the NUMA policy (perhaps caused by the incorrect host NUMA topology), given that the system still has enough memory. Or numad (if it's installed) is doing something wrong.

Can you check whether the debug log says anything about the Nodeset used to set the policy? E.g.

% cat libvirtd.debug | grep Nodeset
% cat /var/log/libvirt/qemu/$guest.log

On the other hand, I'm wondering if there is a chance to see the full error message the kernel throws out.
I've tried libvirt-1.0.1 on this machine and had the same results. The capabilities have been fixed on that machine however and now display:
<topology sockets='1' cores='64' threads='1'/>
I also forgot to mention that LIBVIRT_DEBUG=1 and running the daemon with --verbose didn't have more details than what I originally posted.

On Wed, Jan 23, 2013 at 11:02 PM, Osier Yang <jyang@redhat.com> wrote:
On 2013-01-24 12:11, Doug Goldstein wrote:
On Wed, Jan 23, 2013 at 3:45 PM, Doug Goldstein<cardoe@gentoo.org> wrote:
I am using libvirt 0.10.2.2 and qemu-kvm 1.2.2 (qemu-kvm 1.2.0 + qemu 1.2.2 applied on top plus a number of stability patches). Having issue where my VMs fail to start with the following message:
kvm_init_vcpu failed: Cannot allocate memory
Smell likes we have problem on setting the NUMA policy (perhaps caused by the incorrect host NUMA topology), given that the system still has enough memory. Or numad (if it's installed) is doing something wrong.
Can you see if there is something about the Nodeset used to set the policy in debug log?
E.g.
% cat libvirtd.debug | grep Nodeset
Well, I don't see anything, but it's likely because I didn't do something correctly. I had LIBVIRT_DEBUG=1 exported and ran libvirtd --verbose from the command line. My /etc/libvirt/libvirtd.conf had:

log_outputs="3:syslog:libvirtd 1:file:/tmp/libvirtd.log"

But I didn't get any debug messages.
% cat /var/log/libvirt/qemu/$guest.log
Nothing in here. Just the command line:

2013-01-24 04:00:49.989+0000: starting up
LC_ALL=C PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/bin:/usr/x86_64-pc-linux-gnu/gcc-bin/4.6.3 HOME=/root USER=root LOGNAME=root QEMU_AUDIO_DRV=none /usr/bin/qemu-kvm -name bb-2.6.18-128.el5.x86_64 -S -M pc-1.2 -cpu Opteron_G4,+perfctr_nb,+perfctr_core,+topoext,+nodeid_msr,+lwp,+wdt,+skinit,+ibs,+osvw,+cr8legacy,+extapic,+cmp_legacy,+fxsr_opt,+mmxext,+osxsave,+monitor,+ht,+vme -enable-kvm -m 2048 -smp 2,sockets=1,cores=2,threads=1 -uuid 2b5990f0-649f-ae25-99f0-dc4b05f682e1 -no-user-config -nodefaults -chardev socket,id=charmonitor,path=/var/lib/libvirt/qemu/bb-2.6.18-128.el5.x86_64.monitor,server,nowait -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -no-shutdown -boot menu=off -device piix3-usb-uhci,id=usb,bus=pci.0,addr=0x1.0x2 -drive file=/dev/disk/by-path/ip-192.168.200.20:3260-iscsi-iqn.2011-07.lab.san-1:2.6.18-128.el5-x86_64-lun-0,if=none,id=drive-ide0-0-0,format=raw,cache=none,werror=stop,rerror=stop,aio=native -device ide-hd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0,bootindex=1 -netdev tap,fd=31,id=hostnet0 -device rtl8139,netdev=hostnet0,id=net0,mac=52:54:00:7c:b2:81,bus=pci.0,addr=0x7 -chardev pty,id=charserial0 -device isa-serial,chardev=charserial0,id=serial0 -vnc 0.0.0.0:10,password -vga cirrus -device AC97,id=sound0,bus=pci.0,addr=0x4 -device i6300esb,id=watchdog0,bus=pci.0,addr=0x6 -watchdog-action reset -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
char device redirected to /dev/pts/12
CPU feature perfctr_nb not found
CPU feature perfctr_core not found
CPU feature topoext not found
CPU feature lwp not found
kvm_init_vcpu failed: Cannot allocate memory
2013-01-24 04:00:50.194+0000: shutting down
On the other hand, I'm wondering if there is a chance to see the full error msg kernel throws out.
That's what I've been trying to figure out. I'm trying to get some more info out of libvirt as well. If you've got any ideas, I'll give them a shot.

-- Doug Goldstein
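(A minimal way to catch whatever the kernel reports at the moment of the failure is to dump the kernel ring buffer right after a failed start, e.g.:)

# dmesg | tail -n 50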

On 2013-01-24 14:26, Doug Goldstein wrote:
On Wed, Jan 23, 2013 at 11:02 PM, Osier Yang<jyang@redhat.com> wrote:
On 2013-01-24 12:11, Doug Goldstein wrote:
On Wed, Jan 23, 2013 at 3:45 PM, Doug Goldstein<cardoe@gentoo.org> wrote:
I am using libvirt 0.10.2.2 and qemu-kvm 1.2.2 (qemu-kvm 1.2.0 + qemu 1.2.2 applied on top plus a number of stability patches). Having issue where my VMs fail to start with the following message:
kvm_init_vcpu failed: Cannot allocate memory
Smell likes we have problem on setting the NUMA policy (perhaps caused by the incorrect host NUMA topology), given that the system still has enough memory. Or numad (if it's installed) is doing something wrong.
Can you see if there is something about the Nodeset used to set the policy in debug log?
E.g.
% cat libvirtd.debug | grep Nodeset
Well I don't see anything but its likely because I didn't do something correct. I had LIBVIRT_DEBUG=1 exported and ran libvirtd --verbose from the command line.
If the process is in the background, it's expected that you can't see anything.

My /etc/libvirt/libvirtd.conf had:
log_outputs="3:syslog:libvirtd 1:file:/tmp/libvirtd.log" But I didn't get any debug messages.
log_level=1 has to be set.

Anyway, let's simply do this:

% service libvirtd stop
% LIBVIRT_DEBUG=1 /usr/sbin/libvirtd 2>&1 | tee -a libvirtd.debug
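(For reference, a minimal /etc/libvirt/libvirtd.conf sketch that should give full debug output to a file; the log path is just an example and the filter line is optional:)

log_level = 1
log_outputs = "1:file:/var/log/libvirt/libvirtd.log"
log_filters = "1:qemu"

(Restart libvirtd after changing it, or use the LIBVIRT_DEBUG=1 foreground run above.)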
% cat /var/log/libvirt/qemu/$guest.log
Nothing in here. Just the command line:
2013-01-24 04:00:49.989+0000: starting up
CPU feature perfctr_nb not found
CPU feature perfctr_core not found
CPU feature topoext not found
CPU feature lwp not found
kvm_init_vcpu failed: Cannot allocate memory
2013-01-24 04:00:50.194+0000: shutting down
On the other hand, I'm wondering if there is a chance to see the full error msg kernel throws out.
That's what I've been trying to figure out. Trying to get some more info out of libvirt as well. If you've got any ideas I'll give them a shot.

On Thu, Jan 24, 2013 at 12:58 AM, Osier Yang <jyang@redhat.com> wrote:
On 2013-01-24 14:26, Doug Goldstein wrote:
On Wed, Jan 23, 2013 at 11:02 PM, Osier Yang<jyang@redhat.com> wrote:
On 2013-01-24 12:11, Doug Goldstein wrote:
On Wed, Jan 23, 2013 at 3:45 PM, Doug Goldstein<cardoe@gentoo.org> wrote:
I am using libvirt 0.10.2.2 and qemu-kvm 1.2.2 (qemu-kvm 1.2.0 + qemu 1.2.2 applied on top plus a number of stability patches). Having issue where my VMs fail to start with the following message:
kvm_init_vcpu failed: Cannot allocate memory
Smell likes we have problem on setting the NUMA policy (perhaps caused by the incorrect host NUMA topology), given that the system still has enough memory. Or numad (if it's installed) is doing something wrong.
Can you see if there is something about the Nodeset used to set the policy in debug log?
E.g.
% cat libvirtd.debug | grep Nodeset
Well I don't see anything but its likely because I didn't do something correct. I had LIBVIRT_DEBUG=1 exported and ran libvirtd --verbose from the command line.
If the process is in background, it's expected you can't see anything
My /etc/libvirt/libvirtd.conf had:
log_outputs="3:syslog:libvirtd 1:file:/tmp/libvirtd.log" But I didn't get any debug messages.
log_level=1 has to be set.
Anyway, let's simply do this:
% service libvirtd stop % LIBVIRT_DEBUG=1 /usr/sbin/libvirtd 2>&1 | tee -a libvirtd.debug
That's what I was doing, minus the tee (just to the console), and nothing was coming out. That's why I added the 1:file:/tmp/libvirtd.log, which also didn't get any debug messages. Turns out this instance must have been built with --disable-debug.

All I've got in the log is:

# grep -i 'numa' libvirtd.debug
2013-01-25 16:50:15.287+0000: 417: debug : virCommandRunAsync:2200 : About to run /usr/bin/numad -w 2:2048
2013-01-25 16:50:17.295+0000: 417: debug : qemuProcessStart:3614 : Nodeset returned from numad: 1

Immediately below that is:

2013-01-25 16:50:17.295+0000: 417: debug : qemuProcessStart:3622 : Setting up domain cgroup (if required)
2013-01-25 16:50:17.295+0000: 417: debug : virCgroupNew:619 : New group /libvirt/qemu/bb-2.6.35.9-i686
2013-01-25 16:50:17.295+0000: 417: debug : virCgroupDetect:273 : Detected mount/mapping 1:cpuacct at /sys/fs/cgroup/cpuacct in
2013-01-25 16:50:17.295+0000: 417: debug : virCgroupDetect:273 : Detected mount/mapping 2:cpuset at /sys/fs/cgroup/cpuset in
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupMakeGroup:537 : Make group /libvirt/qemu/bb-2.6.35.9-i686
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupMakeGroup:562 : Make controller /sys/fs/cgroup/cpuacct/libvirt/qemu/bb-2.6.35.9-i686/
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupMakeGroup:562 : Make controller /sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:469 : Setting up inheritance /libvirt/qemu -> /libvirt/qemu/bb-2.6.35.9-i686
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupGetValueStr:361 : Get value /sys/fs/cgroup/cpuset/libvirt/qemu/cpuset.cpus
2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.cpus = 0-63
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.cpus' to '0-63'
2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupGetValueStr:361 : Get value /sys/fs/cgroup/cpuset/libvirt/qemu/cpuset.mems
2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.mems = 0-7
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '0-7'
2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
2013-01-25 16:50:17.296+0000: 417: warning : qemuSetupCgroup:388 : Could not autoset a RSS limit for domain bb-2.6.35.9-i686
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '1'
2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39

Could the RSS issue be related? Some kernel-related option not playing nice or not enabled?

-- Doug Goldstein
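(As a quick cross-check of what actually landed in the cgroup, the values can be read back directly from the paths the log mentions; domain name as in the log above, cgroup v1 layout assumed:)

# cat /sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.cpus
# cat /sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems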

On 2013-01-26 01:07, Doug Goldstein wrote:
On Thu, Jan 24, 2013 at 12:58 AM, Osier Yang<jyang@redhat.com> wrote:
On 2013-01-24 14:26, Doug Goldstein wrote:
On Wed, Jan 23, 2013 at 11:02 PM, Osier Yang<jyang@redhat.com> wrote:
On 2013-01-24 12:11, Doug Goldstein wrote:
On Wed, Jan 23, 2013 at 3:45 PM, Doug Goldstein<cardoe@gentoo.org> wrote:
I am using libvirt 0.10.2.2 and qemu-kvm 1.2.2 (qemu-kvm 1.2.0 + qemu 1.2.2 applied on top plus a number of stability patches). Having issue where my VMs fail to start with the following message:
kvm_init_vcpu failed: Cannot allocate memory
Smell likes we have problem on setting the NUMA policy (perhaps caused by the incorrect host NUMA topology), given that the system still has enough memory. Or numad (if it's installed) is doing something wrong.
Can you see if there is something about the Nodeset used to set the policy in debug log?
E.g.
% cat libvirtd.debug | grep Nodeset
Well I don't see anything but its likely because I didn't do something correct. I had LIBVIRT_DEBUG=1 exported and ran libvirtd --verbose from the command line.
If the process is in background, it's expected you can't see anything
My /etc/libvirt/libvirtd.conf had:
log_outputs="3:syslog:libvirtd 1:file:/tmp/libvirtd.log" But I didn't get any debug messages.
log_level=1 has to be set.
Anyway, let's simply do this:
% service libvirtd stop % LIBVIRT_DEBUG=1 /usr/sbin/libvirtd 2>&1 | tee -a libvirtd.debug
That's what I was doing, minus the tee just to the console and nothing was coming out. Which is why I added the 1:file:/tmp/libvirtd.log, which also didn't get any debug messages. Turns out this instance must have been built with --disable-debug,
All I've got in the log is:
# grep -i 'numa' libvirtd.debug
2013-01-25 16:50:15.287+0000: 417: debug : virCommandRunAsync:2200 : About to run /usr/bin/numad -w 2:2048
2013-01-25 16:50:17.295+0000: 417: debug : qemuProcessStart:3614 : Nodeset returned from numad: 1
This looks right.
Immediately below that is
2013-01-25 16:50:17.295+0000: 417: debug : qemuProcessStart:3622 : Setting up domain cgroup (if required)
2013-01-25 16:50:17.295+0000: 417: debug : virCgroupNew:619 : New group /libvirt/qemu/bb-2.6.35.9-i686
2013-01-25 16:50:17.295+0000: 417: debug : virCgroupDetect:273 : Detected mount/mapping 1:cpuacct at /sys/fs/cgroup/cpuacct in
2013-01-25 16:50:17.295+0000: 417: debug : virCgroupDetect:273 : Detected mount/mapping 2:cpuset at /sys/fs/cgroup/cpuset in
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupMakeGroup:537 : Make group /libvirt/qemu/bb-2.6.35.9-i686
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupMakeGroup:562 : Make controller /sys/fs/cgroup/cpuacct/libvirt/qemu/bb-2.6.35.9-i686/
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupMakeGroup:562 : Make controller /sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:469 : Setting up inheritance /libvirt/qemu -> /libvirt/qemu/bb-2.6.35.9-i686
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupGetValueStr:361 : Get value /sys/fs/cgroup/cpuset/libvirt/qemu/cpuset.cpus
2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.cpus = 0-63
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.cpus' to '0-63'
This doesn't look right; it should be 0-7 instead.
2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupGetValueStr:361 : Get value /sys/fs/cgroup/cpuset/libvirt/qemu/cpuset.mems
2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.mems = 0-7
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '0-7'
This is right.
2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
2013-01-25 16:50:17.296+0000: 417: warning : qemuSetupCgroup:388 : Could not autoset a RSS limit for domain bb-2.6.35.9-i686
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '1'
And it's strange that the cpuset.mems is changed to '1' here.
2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
Could the RSS issue be related? Some kernel related option not playing nice or enabled?

On 2013-01-28 11:44, Osier Yang wrote:
On 2013-01-26 01:07, Doug Goldstein wrote:
On Thu, Jan 24, 2013 at 12:58 AM, Osier Yang<jyang@redhat.com> wrote:
On 2013-01-24 14:26, Doug Goldstein wrote:
On Wed, Jan 23, 2013 at 11:02 PM, Osier Yang<jyang@redhat.com> wrote:
On 2013-01-24 12:11, Doug Goldstein wrote:
On Wed, Jan 23, 2013 at 3:45 PM, Doug Goldstein<cardoe@gentoo.org> wrote:
I am using libvirt 0.10.2.2 and qemu-kvm 1.2.2 (qemu-kvm 1.2.0 + qemu 1.2.2 applied on top plus a number of stability patches). Having issue where my VMs fail to start with the following message:
kvm_init_vcpu failed: Cannot allocate memory
Smell likes we have problem on setting the NUMA policy (perhaps caused by the incorrect host NUMA topology), given that the system still has enough memory. Or numad (if it's installed) is doing something wrong.
Can you see if there is something about the Nodeset used to set the policy in debug log?
E.g.
% cat libvirtd.debug | grep Nodeset
Well I don't see anything but its likely because I didn't do something correct. I had LIBVIRT_DEBUG=1 exported and ran libvirtd --verbose from the command line.
If the process is in background, it's expected you can't see anything
My /etc/libvirt/libvirtd.conf had:
log_outputs="3:syslog:libvirtd 1:file:/tmp/libvirtd.log" But I didn't get any debug messages.
log_level=1 has to be set.
Anyway, let's simply do this:
% service libvirtd stop % LIBVIRT_DEBUG=1 /usr/sbin/libvirtd 2>&1 | tee -a libvirtd.debug
That's what I was doing, minus the tee just to the console and nothing was coming out. Which is why I added the 1:file:/tmp/libvirtd.log, which also didn't get any debug messages. Turns out this instance must have been built with --disable-debug,
All I've got in the log is:
# grep -i 'numa' libvirtd.debug
2013-01-25 16:50:15.287+0000: 417: debug : virCommandRunAsync:2200 : About to run /usr/bin/numad -w 2:2048
2013-01-25 16:50:17.295+0000: 417: debug : qemuProcessStart:3614 : Nodeset returned from numad: 1
This looks right.
Immediately below that is
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.cpus = 0-63
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.cpus' to '0-63'
This looks not right, it should be 0-7 instead.
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.mems = 0-7
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '0-7'
This is right.
2013-01-25 16:50:17.296+0000: 417: warning : qemuSetupCgroup:388 : Could not autoset a RSS limit for domain bb-2.6.35.9-i686
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '1'
And it's strange that the cpuset.mems is changed to '1' here.
2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
Could the RSS issue be related? Some kernel related option not playing nice or enabled?
Instead, I'm wondering if the problem is caused by the mismatch (from libvirt's point of view) between cpuset.cpus and cpuset.mems, which then causes a problem for the kernel's memory management?

Osier

On 2013-01-28 11:47, Osier Yang wrote:
On 2013-01-28 11:44, Osier Yang wrote:
On 2013-01-26 01:07, Doug Goldstein wrote:
On Thu, Jan 24, 2013 at 12:58 AM, Osier Yang<jyang@redhat.com> wrote:
On 2013-01-24 14:26, Doug Goldstein wrote:
On Wed, Jan 23, 2013 at 11:02 PM, Osier Yang<jyang@redhat.com> wrote:
On 2013-01-24 12:11, Doug Goldstein wrote:
On Wed, Jan 23, 2013 at 3:45 PM, Doug Goldstein<cardoe@gentoo.org> wrote:
I am using libvirt 0.10.2.2 and qemu-kvm 1.2.2 (qemu-kvm 1.2.0 + qemu 1.2.2 applied on top plus a number of stability patches). Having issue where my VMs fail to start with the following message:
kvm_init_vcpu failed: Cannot allocate memory
Smell likes we have problem on setting the NUMA policy (perhaps caused by the incorrect host NUMA topology), given that the system still has enough memory. Or numad (if it's installed) is doing something wrong.
Can you see if there is something about the Nodeset used to set the policy in debug log?
E.g.
% cat libvirtd.debug | grep Nodeset
Well I don't see anything but its likely because I didn't do something correct. I had LIBVIRT_DEBUG=1 exported and ran libvirtd --verbose from the command line.
If the process is in background, it's expected you can't see anything
My /etc/libvirt/libvirtd.conf had:
log_outputs="3:syslog:libvirtd 1:file:/tmp/libvirtd.log" But I didn't get any debug messages.
log_level=1 has to be set.
Anyway, let's simply do this:
% service libvirtd stop % LIBVIRT_DEBUG=1 /usr/sbin/libvirtd 2>&1 | tee -a libvirtd.debug
That's what I was doing, minus the tee just to the console and nothing was coming out. Which is why I added the 1:file:/tmp/libvirtd.log, which also didn't get any debug messages. Turns out this instance must have been built with --disable-debug,
All I've got in the log is:
# grep -i 'numa' libvirtd.debug
2013-01-25 16:50:15.287+0000: 417: debug : virCommandRunAsync:2200 : About to run /usr/bin/numad -w 2:2048
2013-01-25 16:50:17.295+0000: 417: debug : qemuProcessStart:3614 : Nodeset returned from numad: 1
This looks right.
Immediately below that is
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.cpus = 0-63
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.cpus' to '0-63'
This looks not right, it should be 0-7 instead.
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.mems = 0-7
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '0-7'
This is right.
2013-01-25 16:50:17.296+0000: 417: warning : qemuSetupCgroup:388 : Could not autoset a RSS limit for domain bb-2.6.35.9-i686
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '1'
And it's strange that the cpuset.mems is changed to '1' here.
Oh, actually this is right, cpuset.mems is about the memory nodes.
2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
Could the RSS issue be related? Some kernel related option not playing nice or enabled?
Instead, I'm wondering if the problem is caused by the mismatch (from libvirt p.o.v) between cpuset.cpus and cpuset.mems, which thus cause the problem for kernel memory management?
So, the simple way to prove the guess is to use static placement, like:

<vcpu placement='static' cpuset='0-63'>2</vcpu>
<numatune>
  <memory placement='static' nodeset='1'/>
</numatune>

Osier
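(A further variant that may help isolate a cpus/mems mismatch, purely as a sketch: pin the vCPUs to the CPUs that numactl reported for node 1 earlier in the thread, so cpuset.cpus and cpuset.mems point at the same node:)

<vcpu placement='static' cpuset='32,36,40,44,48,52,56,60'>2</vcpu>
<numatune>
  <memory mode='strict' nodeset='1'/>
</numatune>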

On Sun, Jan 27, 2013 at 10:46 PM, Osier Yang <jyang@redhat.com> wrote:
On 2013-01-28 11:47, Osier Yang wrote:
On 2013-01-28 11:44, Osier Yang wrote:
On 2013-01-26 01:07, Doug Goldstein wrote:
On Thu, Jan 24, 2013 at 12:58 AM, Osier Yang<jyang@redhat.com> wrote:
On 2013-01-24 14:26, Doug Goldstein wrote:
On Wed, Jan 23, 2013 at 11:02 PM, Osier Yang<jyang@redhat.com> wrote:
On 2013-01-24 12:11, Doug Goldstein wrote:
On Wed, Jan 23, 2013 at 3:45 PM, Doug Goldstein<cardoe@gentoo.org> wrote:
I am using libvirt 0.10.2.2 and qemu-kvm 1.2.2 (qemu-kvm 1.2.0 + qemu 1.2.2 applied on top plus a number of stability patches). Having issue where my VMs fail to start with the following message:
kvm_init_vcpu failed: Cannot allocate memory
Smell likes we have problem on setting the NUMA policy (perhaps caused by the incorrect host NUMA topology), given that the system still has enough memory. Or numad (if it's installed) is doing something wrong.
Can you see if there is something about the Nodeset used to set the policy in debug log?
E.g.
% cat libvirtd.debug | grep Nodeset
Well I don't see anything but its likely because I didn't do something correct. I had LIBVIRT_DEBUG=1 exported and ran libvirtd --verbose from the command line.
If the process is in background, it's expected you can't see anything
My /etc/libvirt/libvirtd.conf had:
log_outputs="3:syslog:libvirtd 1:file:/tmp/libvirtd.log" But I didn't get any debug messages.
log_level=1 has to be set.
Anyway, let's simply do this:
% service libvirtd stop % LIBVIRT_DEBUG=1 /usr/sbin/libvirtd 2>&1 | tee -a libvirtd.debug
That's what I was doing, minus the tee just to the console and nothing was coming out. Which is why I added the 1:file:/tmp/libvirtd.log, which also didn't get any debug messages. Turns out this instance must have been built with --disable-debug,
All I've got in the log is:
# grep -i 'numa' libvirtd.debug
2013-01-25 16:50:15.287+0000: 417: debug : virCommandRunAsync:2200 : About to run /usr/bin/numad -w 2:2048
2013-01-25 16:50:17.295+0000: 417: debug : qemuProcessStart:3614 : Nodeset returned from numad: 1
This looks right.
Immediately below that is
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.cpus = 0-63
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.cpus' to '0-63'
This looks not right, it should be 0-7 instead.
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.mems = 0-7
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '0-7'
This is right.
2013-01-25 16:50:17.296+0000: 417: warning : qemuSetupCgroup:388 : Could not autoset a RSS limit for domain bb-2.6.35.9-i686
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '1'
And it's strange that the cpuset.mems is changed to '1' here.
Oh, actually this is right, cpuset.mems is about the memory nodes.
2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
Could the RSS issue be related? Some kernel related option not playing nice or enabled?
Instead, I'm wondering if the problem is caused by the mismatch (from libvirt p.o.v) between cpuset.cpus and cpuset.mems, which thus cause the problem for kernel memory management?
So, the simple method to prove the guess is to use static placement like:
<vcpu placement='static' cpuset='0-63'>2</vcpu>
<numatune>
  <memory placement='static' nodeset='1'/>
</numatune>
Osier
Same error, which I don't know whether you expected or not.

-- Doug Goldstein

On 2013年01月29日 00:17, Doug Goldstein wrote:
On Sun, Jan 27, 2013 at 10:46 PM, Osier Yang<jyang@redhat.com> wrote:
On 2013-01-28 11:47, Osier Yang wrote:
On 2013-01-28 11:44, Osier Yang wrote:
On 2013-01-26 01:07, Doug Goldstein wrote:
On Thu, Jan 24, 2013 at 12:58 AM, Osier Yang<jyang@redhat.com> wrote:
On 2013-01-24 14:26, Doug Goldstein wrote:
On Wed, Jan 23, 2013 at 11:02 PM, Osier Yang<jyang@redhat.com> wrote:
On 2013-01-24 12:11, Doug Goldstein wrote:
On Wed, Jan 23, 2013 at 3:45 PM, Doug Goldstein<cardoe@gentoo.org> wrote:
I am using libvirt 0.10.2.2 and qemu-kvm 1.2.2 (qemu-kvm 1.2.0 + qemu 1.2.2 applied on top plus a number of stability patches). Having issue where my VMs fail to start with the following message:
kvm_init_vcpu failed: Cannot allocate memory
Smell likes we have problem on setting the NUMA policy (perhaps caused by the incorrect host NUMA topology), given that the system still has enough memory. Or numad (if it's installed) is doing something wrong.
Can you see if there is something about the Nodeset used to set the policy in debug log?
E.g.
% cat libvirtd.debug | grep Nodeset
Well I don't see anything but its likely because I didn't do something correct. I had LIBVIRT_DEBUG=1 exported and ran libvirtd --verbose from the command line.
If the process is in background, it's expected you can't see anything
My /etc/libvirt/libvirtd.conf had:
log_outputs="3:syslog:libvirtd 1:file:/tmp/libvirtd.log" But I didn't get any debug messages.
log_level=1 has to be set.
Anyway, let's simply do this:
% service libvirtd stop % LIBVIRT_DEBUG=1 /usr/sbin/libvirtd 2>&1 | tee -a libvirtd.debug
That's what I was doing, minus the tee just to the console and nothing was coming out. Which is why I added the 1:file:/tmp/libvirtd.log, which also didn't get any debug messages. Turns out this instance must have been built with --disable-debug,
All I've got in the log is:
# grep -i 'numa' libvirtd.debug
2013-01-25 16:50:15.287+0000: 417: debug : virCommandRunAsync:2200 : About to run /usr/bin/numad -w 2:2048
2013-01-25 16:50:17.295+0000: 417: debug : qemuProcessStart:3614 : Nodeset returned from numad: 1
This looks right.
Immediately below that is
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.cpus = 0-63
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.cpus' to '0-63'
This looks not right, it should be 0-7 instead.
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.mems = 0-7
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '0-7'
This is right.
2013-01-25 16:50:17.296+0000: 417: warning : qemuSetupCgroup:388 : Could not autoset a RSS limit for domain bb-2.6.35.9-i686
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '1'
And it's strange that the cpuset.mems is changed to '1' here.
Oh, actually this is right, cpuset.mems is about the memory nodes.
2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
Could the RSS issue be related? Some kernel related option not playing nice or enabled?
Instead, I'm wondering if the problem is caused by the mismatch (from libvirt p.o.v) between cpuset.cpus and cpuset.mems, which thus cause the problem for kernel memory management?
So, the simple method to prove the guess is to use static placement like:
<vcpu placement='static' cpuset='0-63'>2</vcpu>
<numatune>
  <memory placement='static' nodeset='1'/>
</numatune>
Osier
Same error. Which I don't know if you expected or didn't expect.
It's expected, as "0-63" is the final result when using "auto" placement.
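(To take libvirt and the cgroup out of the picture, one option is to re-run the qemu-kvm command line from the guest log by hand under an equivalent binding with numactl, e.g.:)

# numactl --cpunodebind=1 --membind=1 /usr/bin/qemu-kvm -m 2048 -smp 2,sockets=1,cores=2,threads=1 ...

(If kvm_init_vcpu still fails there, the problem is below libvirt.)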

On Mon, Jan 28, 2013 at 10:23 AM, Osier Yang <jyang@redhat.com> wrote:
On 2013-01-29 00:17, Doug Goldstein wrote:
On Sun, Jan 27, 2013 at 10:46 PM, Osier Yang<jyang@redhat.com> wrote:
On 2013-01-28 11:47, Osier Yang wrote:
On 2013-01-28 11:44, Osier Yang wrote:
On 2013-01-26 01:07, Doug Goldstein wrote:
On Thu, Jan 24, 2013 at 12:58 AM, Osier Yang<jyang@redhat.com> wrote:
log_level=1 has to be set.
Anyway, let's simply do this:
% service libvirtd stop
% LIBVIRT_DEBUG=1 /usr/sbin/libvirtd 2>&1 | tee -a libvirtd.debug
That's what I was doing, minus the tee just to the console and nothing was coming out. Which is why I added the 1:file:/tmp/libvirtd.log, which also didn't get any debug messages. Turns out this instance must have been built with --disable-debug,
All I've got in the log is:
# grep -i 'numa' libvirtd.debug
2013-01-25 16:50:15.287+0000: 417: debug : virCommandRunAsync:2200 : About to run /usr/bin/numad -w 2:2048
2013-01-25 16:50:17.295+0000: 417: debug : qemuProcessStart:3614 : Nodeset returned from numad: 1
This looks right.
Immediately below that is
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.cpus = 0-63
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.cpus' to '0-63'
This looks not right, it should be 0-7 instead.
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.mems = 0-7
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '0-7'
This is right.
2013-01-25 16:50:17.296+0000: 417: warning : qemuSetupCgroup:388 : Could not autoset a RSS limit for domain bb-2.6.35.9-i686
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '1'
And it's strange that the cpuset.mems is changed to '1' here.
Oh, actually this is right, cpuset.mems is about the memory nodes.
2013-01-25 16:50:17.296+0000: 417: debug : virFileClose:72 : Closed fd 39
Could the RSS issue be related? Some kernel related option not playing nice or enabled?
Instead, I'm wondering if the problem is caused by the mismatch (from libvirt p.o.v) between cpuset.cpus and cpuset.mems, which thus cause the problem for kernel memory management?
So, the simple method to prove the guess is to use static placement like:
<vcpu placement='static' cpuset='0-63'>2</vcpu>
<numatune>
  <memory placement='static' nodeset='1'/>
</numatune>
Osier
Same error. Which I don't know if you expected or didn't expect.
It's expected. as "0-63" is the final result when using "auto" placement.
Since there's another user on the libvirt-list asking about the exact same CPU I've got, I figured I'd do some poking. Oddly enough, he and I had different outputs from virsh nodeinfo. Just as background, these are AMD 6272 CPUs. I've got 4 of them in the box, and they're organized as follows:

Sockets: 4
Cores: 16
Threads: 1 per core (16)
NUMA nodes: 8
Mem per node: 16GB
Total: 128GB

# virsh nodeinfo
CPU model: x86_64
CPU(s): 64
CPU frequency: 2100 MHz
CPU socket(s): 1
Core(s) per socket: 64
Thread(s) per core: 1
NUMA cell(s): 1
Memory size: 132013200 KiB

# virsh capabilities
<snip>
<topology sockets='1' cores='64' threads='1'/>
<snip>
<topology>
  <cells num='8'>
<snip>

I've hand-verified all the values in /sys/devices/system/nodeX/cpuX/topology/physical_package_id to show that each physical package spans a pair of NUMA nodes (0&1, 2&3, 4&5, 6&7).

I need to give git a whirl, as I know it's got somewhat different code than 1.0.1, but I'll report back.

-- Doug Goldstein
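(For reference, the hand check above can be reproduced with a small loop over sysfs; this assumes the standard layout where each node directory links to its CPUs:)

for n in /sys/devices/system/node/node[0-9]*; do
  printf '%s: ' "${n##*/}"
  cat "$n"/cpu[0-9]*/topology/physical_package_id | sort -un | tr '\n' ' '
  echo
done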

On 01/30/2013 01:25 PM, Doug Goldstein wrote:
On Mon, Jan 28, 2013 at 10:23 AM, Osier Yang<jyang@redhat.com> wrote:
On 2013-01-29 00:17, Doug Goldstein wrote:
On Sun, Jan 27, 2013 at 10:46 PM, Osier Yang<jyang@redhat.com> wrote:
On 2013-01-28 11:47, Osier Yang wrote:
On 2013-01-28 11:44, Osier Yang wrote:
# grep -i 'numa' libvirtd.debug
2013-01-25 16:50:15.287+0000: 417: debug : virCommandRunAsync:2200 : About to run /usr/bin/numad -w 2:2048
2013-01-25 16:50:17.295+0000: 417: debug : qemuProcessStart:3614 : Nodeset returned from numad: 1
This looks right.
Immediately below that is
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.cpus = 0-63
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.cpus' to '0-63'
This looks not right, it should be 0-7 instead.
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupCpuSetInherit:482 : Inherit cpuset.mems = 0-7
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '0-7'
This is right.
2013-01-25 16:50:17.296+0000: 417: warning : qemuSetupCgroup:388 : Could not autoset a RSS limit for domain bb-2.6.35.9-i686
2013-01-25 16:50:17.296+0000: 417: debug : virCgroupSetValueStr:331 : Set value '/sys/fs/cgroup/cpuset/libvirt/qemu/bb-2.6.35.9-i686/cpuset.mems' to '1'
And it's strange that the cpuset.mems is changed to '1' here.
Oh, actually this is right, cpuset.mems is about the memory nodes.
Could the RSS issue be related? Some kernel related option not playing nice or enabled?
Instead, I'm wondering if the problem is caused by the mismatch (from libvirt p.o.v) between cpuset.cpus and cpuset.mems, which thus cause the problem for kernel memory management?
So, the simple method to prove the guess is to use static placement like:
<vcpu placement='static' cpuset='0-63'>2</vcpu> <numatune> <memory placement='static' nodeset='1'/> </numatune>
Osier
Same error. Which I don't know if you expected or didn't expect.
It's expected, as "0-63" is the final result when using "auto" placement.

Since there's another user on the libvirt-list asking about the exact same CPU I've got, I figured I'd do some poking. Oddly enough, he and I had different outputs from virsh nodeinfo. Just as background, these are AMD 6272 CPUs. I've got 4 of them in the box, but they're organized as follows:
Sockets: 4
Cores: 16
Threads: 1 per core (16)
NUMA nodes: 8
Mem per node: 16GB
Total: 128GB
# virsh nodeinfo
CPU model:           x86_64
CPU(s):              64
CPU frequency:       2100 MHz
CPU socket(s):       1
Core(s) per socket:  64
Thread(s) per core:  1
NUMA cell(s):        1
Memory size:         132013200 KiB
# virsh capabilities
<snip>
  <topology sockets='1' cores='64' threads='1'/>
<snip>
  <topology>
    <cells num='8'>
<snip>
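(For reference, and only as a sketch assuming the standard lscpu and numactl tools are installed, the same physical layout can be cross-checked outside libvirt with:

# lscpu | grep -E 'Socket|Core|Thread|NUMA'
# numactl --hardware

The grep pattern is just illustrative; it pulls out the socket/core/thread/NUMA lines from lscpu's output for comparison with virsh nodeinfo.)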
I've hand-verified all the values in /sys/devices/system/node/nodeX/cpuY/topology/physical_package_id to show that the NUMA nodes pair up by physical package (0&1, 2&3, 4&5, 6&7).
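(A quick way to dump that mapping for every CPU, assuming the standard flat sysfs layout under /sys/devices/system/cpu, is something like:

# for c in /sys/devices/system/cpu/cpu[0-9]*; do echo "$(basename $c): package $(cat $c/topology/physical_package_id)"; done

This is only a sketch; it prints each CPU together with the physical package it reports, which should line up with the per-node symlinks checked above.)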
Need to give git a whirl as I know that's got a bit different code than 1.0.1 but I'll report back.
For AMD 62xx CPUs, the output is expected.

Check out this bug:
virsh nodeinfo can't get the right info on AMD Bulldozer cpu
https://bugzilla.redhat.com/show_bug.cgi?id=874050

Wayne Sun
2013-01-30

On Wed, Jan 30, 2013 at 1:21 AM, Wayne Sun <gsun@redhat.com> wrote:
For AMD 62xx CPUs, the output is expected.
Check out this bug: virsh nodeinfo can't get the right info on AMD Bulldozer cpu https://bugzilla.redhat.com/show_bug.cgi?id=874050
Wayne Sun 2013-01-30
Wayne,

I'd argue we need to determine what format we really need the data in. Do we actually care about physical sockets, or should we care about packages? Because with this specific CPU there are 2 packages in 1 physical socket, forming 2 NUMA nodes per package.

The reason I say this is that we went from NUMA being defined for the domain and working, to the domain failing to start up with a cryptic error message, which IMHO is worse.

The flip side of the coin is that we could just strip out all the NUMA settings when starting the domain up if we know they won't work.

--
Doug Goldstein

[ CC Peter ] On 2013年01月31日 06:01, Doug Goldstein wrote:
Need to give git a whirl as I know that's got a bit different code than 1.0.1 but I'll report back.
As far as I can see, Peter committed more patches to fix the CPU topology parsing on AMD platforms. Perhaps he will know if this is fixed in a newer release.
Wayne,
I'd argue we need to determine what format we really need the data in. Do we actually care about physical sockets, or should we care about packages? Because with this specific CPU there are 2 packages in 1 physical socket, forming 2 NUMA nodes per package.
Agreed. Though the total number of CPUs is correct, which guarantees that most of the stuff related to CPU topology works, it still should be fixed.