On 2015-06-04 18:59, David Rientjes wrote:
> On Tue, 2 Jun 2015, Michal Hocko wrote:
>
>> OOM killer might be triggered externally via sysrq+f. This is supposed
>> to kill a task no matter what e.g. a task is selected even though there
>> is an OOM victim on the way to exit. This is a big hammer for an admin
>> to help to resolve a memory short condition when the system is not able
>> to cope with it on its own in a reasonable time frame (e.g. when the
>> system is trashing or the OOM killer cannot make sufficient progress).
>>
>> The forced OOM killing is currently wired into out_of_memory()
>> call which is kind of ugly because generic out_of_memory path
>> has to deal with configuration settings and heuristics which
>> are completely irrelevant to the forced OOM killer (e.g.
>> sysctl_oom_kill_allocating_task or OOM killer prevention for already
>> dying tasks). Some of those will not apply to sysrq because the handler
>> runs from the worker context.
>> check_panic_on_oom on the other hand will work and that is kind of
>> unexpected because sysrq+f should be usable to kill a mem hog whether
>> the global OOM policy is to panic or not.
>> It also doesn't make much sense to panic the system when no task cannot
>> be killed because admin has a separate sysrq for that purpose.
>>
>> Let's pull forced OOM killer code out into a separate function
>> (force_out_of_memory) which is really trivial now. Also extract the core
>> of oom_kill_process into __oom_kill_process which doesn't do any
>> OOM prevention heuristics.
>> As a bonus we can clearly state that this is a forced OOM killer in the
>> OOM message which is helpful to distinguish it from the regular OOM
>> killer.
>>
>
> I'm not sure what the benefit of this is, and it's adding more code.
> Having multiple pathways and requirements, such as constrained_alloc(), to
> oom kill a process isn't any clearer, in my opinion.  It also isn't
> intended to be optimized since the oom killer called from the page
> allocator and from sysrq aren't fastpaths.  To me, this seems like only a
> source code level change and doesn't make anything more clear but rather
> adds more code and obfuscates the entry path.

At the very least, it does make the semantics of sysrq-f much nicer for 
admins (especially the bit where it ignores the panic_on_oom setting, if 
the admin wants the system to panic, he'll use sysrq-c).  There have 
been times I've had to hit sysrq-f multiple times to get to actually 
kill anything, and this looks to me like it would eliminate that rather 
annoying issue as well.