From: Michal Hocko <mhocko@kernel.org>
To: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
Cc: linux-mm <linux-mm@kvack.org>
Subject: Re: [PATCH v3] mm,page_alloc: wait for oom_lock before retrying.
Date: Fri, 15 Feb 2019 13:18:48 +0100
Message-ID: <20190215121848.GY4525@dhcp22.suse.cz>
In-Reply-To: <87896c67-ddc9-56d9-8643-09865c6cbfe2@i-love.sakura.ne.jp>
On Fri 15-02-19 19:42:41, Tetsuo Handa wrote:
> On 2019/02/14 1:56, Michal Hocko wrote:
> > On Thu 14-02-19 01:30:28, Tetsuo Handa wrote:
> > [...]
> >> From 63c5c8ee7910fa9ef1c4067f1cb35a779e9d582c Mon Sep 17 00:00:00 2001
> >> From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> >> Date: Tue, 12 Feb 2019 20:12:35 +0900
> >> Subject: [PATCH v3] mm,page_alloc: wait for oom_lock before retrying.
> >>
> >> When many hundreds of threads concurrently trigger page faults and
> >> one of them invokes the global OOM killer, the owner of oom_lock can
> >> be preempted for minutes, because the other threads are depriving the
> >> owner of oom_lock of CPU time rather than waiting for it to make
> >> progress. We don't want to disable preemption while holding oom_lock,
> >> but we do want the owner of oom_lock to complete as soon as possible.
> >>
> >> Thus, this patch kills the dangerous assumption that sleeping for one
> >> jiffy is sufficient for allowing the owner of oom_lock to make progress.
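For reference, a minimal sketch of the idea under discussion. This is an
illustration only, not the actual v3 patch: the helper name below is made
up, while oom_lock and the trylock-and-sleep-one-jiffy pattern in
__alloc_pages_may_oom() are the existing ones.

        /* Hypothetical helper, not in mm/page_alloc.c: */
        static bool oom_trylock_or_wait(void)
        {
                if (mutex_trylock(&oom_lock))
                        return true;    /* this thread will run the OOM killer */

                /*
                 * Somebody else holds oom_lock.  Instead of doing
                 * schedule_timeout_uninterruptible(1) and retrying the
                 * allocation every jiffy, sleep until the holder is done.
                 */
                if (mutex_lock_killable(&oom_lock))
                        return false;   /* fatal signal while waiting */
                mutex_unlock(&oom_lock);        /* holder made progress for us */
                return false;           /* go back and retry the allocation */
        }

The point of the sketch is only that waiters sleep on the mutex instead of
waking up every jiffy and competing with the lock holder for CPU time.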
> >
> > How does this prevent any _other_ kernel path or even high priority
> > userspace from preempting the oom killer path? This was the essential
> > question the last time around and I do not see it covered here. I
>
> Since you already NACKed disabling preemption at
> https://marc.info/?i=20180322114002.GC23100@dhcp22.suse.cz , pointing out
> that "even high priority userspace" can "preempt the oom killer path" is invalid.
Why?
> Since printk() is very slow, dump_header() can become slow, especially when
> dump_tasks() is called. And changing dump_tasks() to use rcu_lock_break()
> does not solve this problem, because the problem is that once the current
> thread releases the CPU, it might stay preempted for minutes.
dump_tasks might be disabled for those who are concerned about the
overhead but I do not see what your actual point here is.
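For context, the cost being described above is essentially the per-task
printk loop in dump_tasks(). A simplified illustration, not the exact
mm/oom_kill.c code:

        rcu_read_lock();
        for_each_process(p) {
                if (oom_unkillable_task(p, memcg, nodemask))
                        continue;
                /*
                 * One pr_info() line per candidate task, potentially
                 * thousands of lines pushed to a slow console, all while
                 * oom_lock is held by the dumping thread.
                 */
                pr_info("[%7d] ... %s\n", task_pid_nr(p), p->comm);
        }
        rcu_read_unlock();

That loop can be turned off entirely via /proc/sys/vm/oom_dump_tasks.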
> Allowing OOM path to be preempted is what you prefer, isn't it?
No, it is just practicality. If you disable preemption you are
immediately going to fight with soft lockups.
> Then,
> spending CPU time on something useful (what you call "any _other_ kernel
> path") is a justifiable reason for delaying the OOM path. But wasting CPU
> time on allocating threads that can do nothing but wait for the owner of
> oom_lock to complete the OOM path is not a justifiable reason for delaying
> the OOM path.
>
> Thus, there is nothing more to cover in response to your "I do not see it
> covered here" remark, except how to avoid "wasting CPU time when allocating
> threads can do nothing but wait for the owner of oom_lock to complete OOM path".
>
> > strongly suspect that all these games with the locking are just
> > pointless tuning for an insane workload without fixing the underlying
> > issue.
>
> We could even change oom_lock to a local lock inside oom_kill_process(),
> because all threads in the same allocating context will select the same
> OOM victim (except in the oom_kill_allocating_task case), and the many
> threads already inside oom_kill_process() will refrain from selecting the
> next OOM victim. Although this approach wastes some CPU time needlessly
> selecting the same OOM victim many times, it can also solve this problem.
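A purely hypothetical sketch of what that alternative could look like, not
an actual patch: select_bad_process() and oom_kill_process() keep their
current roles, and the local mutex name below is made up.

        static DEFINE_MUTEX(oom_kill_lock);     /* hypothetical local lock */

        bool out_of_memory_sketch(struct oom_control *oc)
        {
                /*
                 * Every allocating context repeats the victim selection,
                 * wasting CPU on redundant scans, but no longer serializes
                 * on a global oom_lock.
                 */
                select_bad_process(oc);
                if (!oc->chosen)
                        return false;

                /* Only the kill itself is serialized. */
                mutex_lock(&oom_kill_lock);
                oom_kill_process(oc, "Out of memory");
                mutex_unlock(&oom_kill_lock);
                return true;
        }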
>
> It seems that you don't want to admit that "wasting CPU time when allocating
> threads can do nothing but wait for the owner of oom_lock to complete OOM path"
> is the underlying issue. But we can't fix it without throttling the direct
> reclaim paths. That is why this problem has remained unfixed for many years.
And yet I do not remember any _single_ bug report for a real life
workload that would be suffering from this. I am all for a better
throttling on the OOM conditions but what you have been proposing are
hacks at best without any real world workload backing them.
Please try to understand that the OOM path in the current form is quite
complex already and adding more on top without addressing a problem
which real workloads do care about is not really all that attractive.
I have no objections to simple and obviously correct changes in this
area but playing with locking with hard to evaluate side effects is not
something I will ack.
--
Michal Hocko
SUSE Labs