[Eisfair] kthreadd invoked oom-killer

Thomas Bork tom at eisfair.org
Wed May 31 19:14:23 CEST 2017


On 31.05.2017 at 16:59, Marcus Roeckrath wrote:

> To really find the culprit, one would have to change the oom-killer's
> behavior so that the very first oom-killer invocation leads to a "crash",
> and then analyze the kernel trace - that is what I found via Google.

echo 1 > /proc/sys/vm/oom_kill_allocating_task

Instead of a task selected by heuristics, this kills the task that
triggered the OOM killer:

oom_kill_allocating_task

This enables or disables killing the OOM-triggering task in
out-of-memory situations.

If this is set to zero, the OOM killer will scan through the entire
tasklist and select a task based on heuristics to kill.  This normally
selects a rogue memory-hogging task that frees up a large amount of
memory when killed.

If this is set to non-zero, the OOM killer simply kills the task that
triggered the out-of-memory condition.  This avoids the expensive
tasklist scan.

If panic_on_oom is selected, it takes precedence over whatever value
is used in oom_kill_allocating_task.

The default value is 0.
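
To check which of the two modes is active before changing anything, something
like the following could be used. This is only a sketch: the function name
oom_kill_mode and the VM_DIR variable are made up for illustration (VM_DIR
exists solely so the function can be tried against a scratch directory); only
the file /proc/sys/vm/oom_kill_allocating_task comes from the kernel.

```shell
#!/bin/sh
# Directory holding the vm sysctl files; overridable for dry runs (assumption).
VM_DIR="${VM_DIR:-/proc/sys/vm}"

# Hypothetical helper: print which victim-selection mode is configured.
oom_kill_mode() {
    value=$(cat "$VM_DIR/oom_kill_allocating_task") || return 1
    if [ "$value" = "0" ]; then
        # 0: the OOM killer scans the task list and picks a victim by heuristics
        echo "heuristic"
    else
        # non-zero: the task that triggered the OOM condition is killed
        echo "allocating-task"
    fi
}
```

Reading the value works as a normal user; changing it needs root, either with
the echo shown above or with "sysctl -w vm.oom_kill_allocating_task=1".
Neither variant survives a reboot.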


Or

echo 2 > /proc/sys/vm/panic_on_oom

panic_on_oom

This enables or disables panic on out-of-memory feature.

If this is set to 0, the kernel will kill some rogue process,
called oom_killer.  Usually, oom_killer can kill rogue processes and
system will survive.

If this is set to 1, the kernel panics when out-of-memory happens.
However, if a process limits using nodes by mempolicy/cpusets,
and those nodes become memory exhaustion status, one process
may be killed by oom-killer. No panic occurs in this case.
Because other nodes' memory may be free. This means system total status
may be not fatal yet.

If this is set to 2, the kernel panics compulsorily even on the
above-mentioned. Even oom happens under memory cgroup, the whole
system panics.

The default value is 0.
1 and 2 are for failover of clustering. Please select either
according to your policy of failover.
panic_on_oom=2+kdump gives you very strong tool to investigate
why oom happens. You can get snapshot.
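
Since a panic on every OOM is rarely wanted permanently, it may help to
remember the old value and put it back after the debugging session. A minimal
sketch, assuming the hypothetical names set_panic_on_oom and VM_DIR (neither
is part of the kernel or of eisfair; VM_DIR only allows a dry run against a
scratch directory):

```shell
#!/bin/sh
# Directory holding the vm sysctl files; overridable for dry runs (assumption).
VM_DIR="${VM_DIR:-/proc/sys/vm}"

# Hypothetical helper: set panic_on_oom and print the previous value.
# Writing to the real /proc/sys/vm/panic_on_oom requires root.
set_panic_on_oom() {
    new=$1
    case "$new" in
        0|1|2) ;;   # the only values described in the documentation above
        *) echo "panic_on_oom must be 0, 1 or 2" >&2; return 1 ;;
    esac
    old=$(cat "$VM_DIR/panic_on_oom") || return 1
    echo "$new" > "$VM_DIR/panic_on_oom" || return 1
    echo "$old"     # previous value, so the caller can restore it later
}

# Possible use (as root):
#   old=$(set_panic_on_oom 2)
#   ... reproduce the OOM situation and collect the trace ...
#   set_panic_on_oom "$old" > /dev/null
```

The value is validated before anything is written, so a typo leaves the
current setting untouched.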

-- 
der tom
[eisfair-team]

