On Tue, Jan 26, 2016 at 2:01 PM, Andrei Perietanu <andrei.perietanu@klastelecom.com> wrote:

On Tue, Jan 26, 2016 at 1:51 PM, Michal Privoznik <mprivozn@redhat.com> wrote:
On 26.01.2016 14:35, Andrei Perietanu wrote:
> On Tue, Jan 26, 2016 at 12:39 PM, Michal Privoznik <mprivozn@redhat.com>
> wrote:
>
>> On 26.01.2016 12:30, Andrei Perietanu wrote:
>>> Hi all,
>>>
>>> I am running KVM on a 3.18 kernel. The system runs on an Atom processor
>>> with 2 GB RAM.
>>>
>>> Using KVM you can obviously over-allocate your resources: say you have 4
>>> guests each configured with 1 GB RAM. Running all four at the same time,
>>> depending on the workload, can crash the system - I get a kernel trace
>>> when this happens.
>>>
>>> But let's consider a simpler case: one guest with 1.5 GB RAM, Ubuntu
>>> 14.04. During the installation the system will again crash.
>>> The memory statistics (top or /proc/meminfo) show that free memory
>>> goes down to 12 MB when this happens - which kind of makes sense
>>> considering the host will require some RAM to run.
>>>
>>> But the question is: does libvirt offer any way to prevent this from
>>> happening?
>>>
>>> Some way of not allowing the user to start a guest unless you have enough
>>> free memory. I know how much RAM each guest has configured but that is
>>> not enough. I need to know how much the system has available, and just
>>> reading the free memory statistic does not help much since that is only
>>> a snapshot - when running a guest you can have 1 GB free now, and 10 MB
>>> free 2 min later.
>>>
>>> Any ideas?
>>
>> There is one option I see: use -mem-prealloc. Either you can pass it
>> through onto the qemu command line [1] or use locked memoryBacking [2]. I
>> advocate the latter though. Not only will it allocate all the memory at
>> qemu startup, it will also lock it so it won't get swapped out.
>>
>> Michal
>>
>> 1: http://libvirt.org/drvqemu.html#qemucommand
>> 2: http://libvirt.org/formatdomain.html#elementsMemoryBacking
>>
>
> I tried memoryBacking: I added this to the domain xml:
> <memoryBacking>
>   <locked/>
> </memoryBacking>
>
> And got an error when attempting to start the VM: memory locking not
> supported by QEMU binary.
>
> Aside from that I don't really understand how this helps solve my issue -
> if you don't mind going into a bit more detail I'd appreciate it.
>

Well, it will make qemu allocate all its memory at startup. So either
the allocation succeeds and the domain runs, or it fails and qemu dies
immediately.
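
Since the locked memoryBacking element was rejected by that QEMU binary, the
other route is the command-line passthrough from [1]. Below is a minimal
Python sketch of adding -mem-prealloc to a domain XML that way. The helper
name add_mem_prealloc and the sample domain XML are hypothetical; the
namespace URI and the qemu:commandline / qemu:arg elements are the ones the
libvirt QEMU driver page describes. Treat it as an illustration under those
assumptions, not as a tested recipe for this particular host.

```python
import xml.etree.ElementTree as ET

# Namespace used by libvirt for QEMU command-line passthrough.
QEMU_NS = "http://libvirt.org/schemas/domain/qemu/1.0"
ET.register_namespace("qemu", QEMU_NS)


def add_mem_prealloc(domain_xml):
    """Return domain_xml with a qemu:arg '-mem-prealloc' added once.

    Hypothetical helper: it only edits the XML text. Feeding the result
    back to libvirt (for example via 'virsh define') is left to the caller.
    """
    root = ET.fromstring(domain_xml)
    cmdline = root.find(f"{{{QEMU_NS}}}commandline")
    if cmdline is None:
        cmdline = ET.SubElement(root, f"{{{QEMU_NS}}}commandline")
    # Do not add the argument a second time if it is already present.
    for arg in cmdline.findall(f"{{{QEMU_NS}}}arg"):
        if arg.get("value") == "-mem-prealloc":
            return ET.tostring(root, encoding="unicode")
    ET.SubElement(cmdline, f"{{{QEMU_NS}}}arg", {"value": "-mem-prealloc"})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical, heavily trimmed domain definition.
    sample = "<domain type='kvm'><name>guest1</name></domain>"
    print(add_mem_prealloc(sample))
```

With preallocation the failure happens at guest startup rather than some
minutes into the guest's workload, which is the point of the suggestion.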

On the other hand, your kernel is heavily broken too. Instead of
crashing at OOM it should kill a process (usually the one that's
consuming the most memory => qemu in this case).

Michal
Sorry, I said crashing; I meant it spits out a kernel trace. You can still use the system after it happens. I just want to prevent it from happening in the first place.
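
One way to prevent it without relying on a free-memory snapshot is to refuse to start a guest whenever the configured memory of all guests would exceed what the host can back. A minimal Python sketch of such a check follows. The function names and the 512 MiB host reserve in the example are assumptions for illustration, not libvirt features, and the configured guest size does not include QEMU's own overhead, so the reserve should be chosen generously.

```python
def parse_meminfo_kib(text, field):
    """Return the value in kB of one field from /proc/meminfo-style text."""
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name.strip() == field:
            return int(rest.split()[0])
    raise KeyError(field)


def can_start_guest(host_total_kib, running_guests_kib, new_guest_kib,
                    host_reserve_kib):
    """Admission check that forbids memory overcommit.

    It compares configured guest memory (a fixed number per guest) with
    total host RAM minus a reserve for the host, so the answer does not
    depend on how much memory happens to be free at this moment.
    """
    committed = sum(running_guests_kib) + new_guest_kib
    return committed + host_reserve_kib <= host_total_kib


if __name__ == "__main__":
    GIB = 1024 * 1024  # one GiB expressed in KiB
    with open("/proc/meminfo") as f:
        total = parse_meminfo_kib(f.read(), "MemTotal")
    # Hypothetical policy: keep 512 MiB for the host, one 1 GiB guest
    # is already running, and the user asks for another 1 GiB guest.
    print(can_start_guest(total, [GIB], GIB, GIB // 2))
```

On a 2 GiB host with that reserve, the first 1 GiB guest is admitted and the second one is refused.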

Andrei
In case it helps (maybe I didn't correctly diagnose the problem), this is what the kernel trace looks like:

Message from syslogd@trx-r2 at Thu Jan 21 04:14:29 2010 ...
trx-r2 kernel: [ 6116.371139]  ffff880044d06600 ffff88004b507918 ffffffff811124e7 ffff88004b507a08
trx-r2 kernel: [ 6116.371145]  ffff88004b5078e8 ffff88004b507803 ffff880000000015 0000000000000001
trx-r2 kernel: [ 6116.371150] Call Trace:
trx-r2 kernel: [ 6116.371161]  [<ffffffff8184cfdb>] dump_stack+0x49/0x5e
trx-r2 kernel: [ 6116.371169]  [<ffffffff811124e7>] dump_header+0x6d/0x1d4
trx-r2 kernel: [ 6116.371175]  [<ffffffff811128c4>] oom_kill_process+0x77/0x328
trx-r2 kernel: [ 6116.371183]  [<ffffffff8123db16>] ? security_capable_noaudit+0x10/0x12
trx-r2 kernel: [ 6116.371195]  [<ffffffff811176f8>] __alloc_pages_nodemask+0x6a0/0x812
trx-r2 kernel: [ 6116.371203]  [<ffffffff8114be30>] alloc_pages_current+0xd1/0xda
trx-r2 kernel: [ 6116.371208]  [<ffffffff81116389>] __get_free_pages+0x9/0x38
trx-r2 kernel: [ 6116.371239]  [<ffffffff8102fc55>] ? vmx_invpcid_supported+0x1d/0x1d
trx-r2 kernel: [ 6116.371245]  [<ffffffff81033434>] handle_ept_violation+0x13d/0x149
trx-r2 kernel: [ 6116.371251]  [<ffffffff8102fc55>] ? vmx_invpcid_supported+0x1d/0x1d
trx-r2 kernel: [ 6116.371256]  [<ffffffff81036ffc>] vmx_handle_exit+0x80c/0x86f
trx-r2 kernel: [ 6116.371274]  [<ffffffff81016fdc>] kvm_arch_vcpu_ioctl_run+0xcbd/0xf59
trx-r2 kernel: [ 6116.371288]  [<ffffffff8109ba1d>] ? set_task_cpu+0x63/0xb4
trx-r2 kernel: [ 6116.371293]  [<ffffffff810153bd>] ? kvm_arch_vcpu_load+0xb0/0x183
trx-r2 kernel: [ 6116.371299]  [<ffffffff81005e35>] kvm_vcpu_ioctl+0x12e/0x4b4
trx-r2 kernel: [ 6116.371306]  [<ffffffff810964b7>] ? resched_curr+0x6e/0x70
trx-r2 kernel: [ 6116.371312]  [<ffffffff81096692>] ? check_preempt_curr+0x41/0x6f
trx-r2 kernel: [ 6116.371318]  [<ffffffff81165c14>] do_vfs_ioctl+0x3e5/0x421
trx-r2 kernel: [ 6116.371329]  [<ffffffff81165cab>] SyS_ioctl+0x5b/0x7a
trx-r2 kernel: [ 6116.371335]  [<ffffffff8184fff2>] system_call_fastpath+0x12/0x17
trx-r2 kernel: [ 6116.371339] Mem-Info:
trx-r2 kernel: [ 6116.371342] Node 0 DMA per-cpu:
trx-r2 kernel: [ 6116.371346] CPU    0: hi:    0, btch:   1 usd:   0
trx-r2 kernel: [ 6116.371353] CPU    2: hi:    0, btch:   1 usd:   0
trx-r2 kernel: [ 6116.371356] CPU    3: hi:    0, btch:   1 usd:   0
trx-r2 kernel: [ 6116.371369] CPU    2: hi:  186, btch:  31 usd:   0
trx-r2 kernel: [ 6116.371380]  active_file:56 inactive_file:40 isolated_file:0
trx-r2 kernel: [ 6116.371389] Node 0 DMA free:7080kB min:44kB low:52kB high:64kB active_anon:7232kB inactive_anon:4kB active_file:0kB inactive_file:0kB unevictable:8kB isolated(anon):0kB isolated(file):0kB present:15980kB managed:15892kB mlocked:0kB dirty:12kB writeback:4kB mapped:0kB shmem:20kB slab_reclaimable:140kB slab_unreclaimable:112kB kernel_stack:0kB pagetables:44kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
trx-r2 kernel: [ 6116.371400] lowmem_reserve[]: 0 1760 1760 1760
trx-r2 kernel: [ 6116.371406] Node 0 DMA32 free:5276kB min:5344kB low:6680kB high:8016kB active_anon:1549604kB inactive_anon:88kB active_file:224kB inactive_file:164kB unevictable:281088kB isolated(anon):0kB isolated(file):0kB present:1956104kB managed:1907836kB mlocked:0kB dirty:0kB writeback:0kB mapped:18864kB shmem:300kB slab_reclaimable:25736kB slab_unreclaimable:13268kB kernel_stack:1968kB pagetables:4940kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:2480 all_unreclaimable? yes
.......
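
Kernel messages captured from syslogd terminal broadcasts carry a header per line and can end up out of timestamp order, which makes the call trace hard to read. Below is a minimal Python sketch for putting such lines back in order by the kernel timestamp and for checking whether the trace passes through oom_kill_process. The function names are hypothetical, and the regular expression assumes the "kernel: [ seconds.micros] text" layout seen in this trace.

```python
import re

# Matches e.g. "trx-r2 kernel: [ 6116.371150] Call Trace:"
KERNEL_LINE = re.compile(r"kernel: \[\s*(\d+\.\d+)\]\s*(.*)$")


def order_kernel_lines(raw):
    """Return (timestamp, text) pairs sorted by kernel timestamp.

    Lines that do not look like kernel messages, such as the
    "Message from syslogd@..." headers, are skipped.
    """
    entries = []
    for line in raw.splitlines():
        match = KERNEL_LINE.search(line)
        if match:
            entries.append((float(match.group(1)), match.group(2)))
    # sort() is stable, so lines sharing a timestamp keep their order.
    entries.sort(key=lambda entry: entry[0])
    return entries


def looks_like_oom_kill(entries):
    """True if any line of the trace mentions oom_kill_process."""
    return any("oom_kill_process" in text for _, text in entries)


if __name__ == "__main__":
    import sys
    ordered = order_kernel_lines(sys.stdin.read())
    for stamp, text in ordered:
        print(f"[{stamp:12.6f}] {text}")
    print("OOM killer involved:", looks_like_oom_kill(ordered))
```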
Andrei

