From: Michal Hocko <mhocko@suse.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: linux-mm@kvack.org
Subject: Re: [PATCH] mm/page_alloc: Wait for oom_lock before retrying.
Date: Wed, 7 Dec 2016 09:15:56 +0100
Message-ID: <20161207081555.GB17136@dhcp22.suse.cz>
In-Reply-To: <1481020439-5867-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp>

On Tue 06-12-16 19:33:59, Tetsuo Handa wrote:
> If the OOM killer is invoked when many threads are looping inside the
> page allocator, it is possible that the OOM killer is preempted by other
> threads.

Hmm, the only way I can see this happening is if the task which
actually manages to take the lock does not invoke the OOM killer for
whatever reason. Is this what happens in your case? Are you able to
trigger this reliably?

> As a result, the OOM killer is unable to send SIGKILL to OOM
> victims and/or wake up the OOM reaper by releasing oom_lock for minutes
> because other threads consume a lot of CPU time for pointless direct
> reclaim.
>
> ----------
> [ 2802.635229] Killed process 7267 (a.out) total-vm:4176kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB
> [ 2802.644296] oom_reaper: reaped process 7267 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> [ 2802.650237] Out of memory: Kill process 7268 (a.out) score 999 or sacrifice child
> [ 2803.653052] Killed process 7268 (a.out) total-vm:4176kB, anon-rss:84kB, file-rss:0kB, shmem-rss:0kB
> [ 2804.426183] oom_reaper: reaped process 7268 (a.out), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> [ 2804.432524] Out of memory: Kill process 7269 (a.out) score 999 or sacrifice child
> [ 2805.349380] a.out: page allocation stalls for 10047ms, order:0, mode:0x24280ca(GFP_HIGHUSER_MOVABLE|__GFP_ZERO)
> [ 2805.349383] CPU: 2 PID: 7243 Comm: a.out Not tainted 4.9.0-rc8 #62
> (...snipped...)
> [ 3540.977499] a.out 7269 22716.893359 5272 120
> [ 3540.977499] 0.000000 1447.601063 0.000000
> [ 3540.977499] 0 0
> [ 3540.977500] /autogroup-155
> ----------
>
> This patch adds extra sleeps which is effectively equivalent to
>
>   if (mutex_lock_killable(&oom_lock) == 0)
>     mutex_unlock(&oom_lock);
>
> before retrying allocation at __alloc_pages_may_oom() so that the
> OOM killer is not preempted by other threads waiting for the OOM
> killer/reaper to reclaim memory. Since the OOM reaper grabs oom_lock
> due to commit e2fe14564d3316d1 ("oom_reaper: close race with exiting
> task"), waking up other threads before the OOM reaper is woken up by
> directly waiting for oom_lock might not help so much.

So, why don't you simply s@mutex_trylock@mutex_lock_killable@ then?
The trylock is simply an optimistic heuristic to retry while the memory
is being freed. Making this part synchronous might help in the case you
are seeing.
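
For illustration, the difference between the two locking strategies can
be sketched in userspace C. This is only a sketch under stated
assumptions: a pthread mutex stands in for oom_lock, memory_freed is a
hypothetical counter standing in for whatever out_of_memory() would
reclaim, and the names may_oom_trylock() and may_oom_blocking() are made
up for this example. It is not the actual __alloc_pages_may_oom() code.

```c
#include <pthread.h>
#include <stdbool.h>

/* Assumption: a pthread mutex as a userspace stand-in for oom_lock. */
pthread_mutex_t oom_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical counter: how often "the OOM killer" ran under the lock. */
int memory_freed;

/*
 * Optimistic variant: if somebody else holds the lock, assume they are
 * making progress for us and return at once so the caller retries.
 */
bool may_oom_trylock(int *did_some_progress)
{
	*did_some_progress = 0;

	if (pthread_mutex_trylock(&oom_lock) != 0) {
		*did_some_progress = 1;	/* the holder is expected to free memory */
		return false;		/* caller goes back to the retry loop */
	}

	memory_freed++;			/* we are the one doing the OOM work */
	*did_some_progress = 1;
	pthread_mutex_unlock(&oom_lock);
	return true;
}

/*
 * Synchronous variant: sleep until the current holder drops the lock
 * instead of returning early and competing with it for CPU time.
 */
bool may_oom_blocking(int *did_some_progress)
{
	*did_some_progress = 0;

	if (pthread_mutex_lock(&oom_lock) != 0)
		return false;		/* locking failed, nothing was done */

	memory_freed++;
	*did_some_progress = 1;
	pthread_mutex_unlock(&oom_lock);
	return true;
}
```

With the trylock variant, a caller that loses the race returns
immediately and keeps running, which is where the contention with the
lock holder comes from. With the blocking variant it sleeps until the
holder is done. Note that mutex_lock_killable() would additionally
return early on a fatal signal, which the pthread version does not
model.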
> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
> ---
> mm/page_alloc.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 51cbe1e..e5c1102 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -3060,6 +3060,7 @@ void warn_alloc(gfp_t gfp_mask, const char *fmt, ...)
>  		.order = order,
>  	};
>  	struct page *page;
> +	static bool wait_more;
>
>  	*did_some_progress = 0;
>
> @@ -3070,6 +3071,9 @@ void warn_alloc(gfp_t gfp_mask, const char *fmt, ...)
>  	if (!mutex_trylock(&oom_lock)) {
>  		*did_some_progress = 1;
>  		schedule_timeout_uninterruptible(1);
> +		while (wait_more)
> +			if (schedule_timeout_killable(1) < 0)
> +				break;
>  		return NULL;
>  	}
>
> @@ -3109,6 +3113,7 @@ void warn_alloc(gfp_t gfp_mask, const char *fmt, ...)
>  		if (gfp_mask & __GFP_THISNODE)
>  			goto out;
>  	}
> +	wait_more = true;
>  	/* Exhausted what can be done so it's blamo time */
>  	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL)) {
>  		*did_some_progress = 1;
> @@ -3125,6 +3130,7 @@ void warn_alloc(gfp_t gfp_mask, const char *fmt, ...)
>  					ALLOC_NO_WATERMARKS, ac);
>  		}
>  	}
> +	wait_more = false;
>  out:
>  	mutex_unlock(&oom_lock);
>  	return page;

This is a joke, isn't it? Seriously, this made my eyes bleed.
violent NAK!
--
Michal Hocko
SUSE Labs