From: Michal Hocko <mhocko@kernel.org>
To: Yafang Shao <laoar.shao@gmail.com>
Cc: penguin-kernel@i-love.sakura.ne.jp, rientjes@google.com,
akpm@linux-foundation.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm, oom: check memcg margin for parallel oom
Date: Tue, 14 Jul 2020 14:37:26 +0200
Message-ID: <20200714123726.GI24642@dhcp22.suse.cz>
In-Reply-To: <1594728512-18969-1-git-send-email-laoar.shao@gmail.com>
On Tue 14-07-20 08:08:32, Yafang Shao wrote:
> The commit 7775face2079 ("memcg: killed threads should not invoke memcg OOM
> killer") resolves the problem that different threads in a multi-threaded
> task doing parallel memcg oom, but it doesn't solve the problem that
> different tasks doing parallel memcg oom.
>
> It may happens that many different tasks in the same memcg are waiting
> oom_lock at the same time, if one of them has already made progress and
> freed enough available memory, the others don't need to trigger the oom
> killer again. By checking memcg margin after hold oom_lock can help
> achieve it.
While the changelog makes sense, I believe it can be improved. I would
use something like the following. Feel free to use its parts.
"
Memcg oom killer invocation is synchronized by the global oom_lock and
tasks are sleeping on the lock while somebody is selecting the victim,
potentially racing with the oom_reaper releasing the victim's memory.
This can result in a pointless oom killer invocation because a waiter
might be racing with the oom_reaper:
P1                    oom_reaper              P2
                      oom_reap_task           mutex_lock(oom_lock)
                                              out_of_memory
                                              # no victim because we have one already
                      __oom_reap_task_mm      mutex_unlock(oom_lock)
mutex_lock(oom_lock)
                      set MMF_OOM_SKIP
select_bad_process
# finds a new victim
The page allocator prevents this race by attempting the allocation
after the lock can be acquired (in __alloc_pages_may_oom), which acts
as a last minute check. Moreover, the page allocator doesn't block on
the oom_lock at all and instead retries the whole reclaim process.
Memcg oom killer should do the last minute check as well. Call
mem_cgroup_margin to do that. Trylock on the oom_lock could be done as
well but this doesn't seem to be necessary at this stage.
"
> Suggested-by: Michal Hocko <mhocko@kernel.org>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
> Cc: David Rientjes <rientjes@google.com>
> ---
> mm/memcontrol.c | 19 +++++++++++++++++--
> 1 file changed, 17 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 1962232..df141e1 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1560,16 +1560,31 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
> .gfp_mask = gfp_mask,
> .order = order,
> };
> - bool ret;
> + bool ret = true;
>
> if (mutex_lock_killable(&oom_lock))
> return true;
> +
> /*
> * A few threads which were not waiting at mutex_lock_killable() can
> * fail to bail out. Therefore, check again after holding oom_lock.
> */
> - ret = should_force_charge() || out_of_memory(&oc);
> + if (should_force_charge())
> + goto out;
> +
> + /*
> + * Different tasks may be doing parallel oom, so after hold the
> + * oom_lock the task should check the memcg margin again to check
> + * whether other task has already made progress.
> + */
> + if (mem_cgroup_margin(memcg) >= (1 << order))
> + goto out;
Is there any reason why you haven't simply done this (plus your
comment, which is helpful)?
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 248e6cad0095..2c176825efe3 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1561,15 +1561,21 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
.order = order,
.chosen_points = LONG_MIN,
};
- bool ret;
+ bool ret = true;
- if (mutex_lock_killable(&oom_lock))
+ if (!mutex_trylock(&oom_lock))
return true;
+
+ if (mem_cgroup_margin(memcg) >= (1 << order))
+ goto unlock;
+
/*
* A few threads which were not waiting at mutex_lock_killable() can
* fail to bail out. Therefore, check again after holding oom_lock.
*/
ret = should_force_charge() || out_of_memory(&oc);
+
+unlock:
mutex_unlock(&oom_lock);
return ret;
}
> +
> + ret = out_of_memory(&oc);
> +
> +out:
> mutex_unlock(&oom_lock);
> +
> return ret;
> }
>
> --
> 1.8.3.1
--
Michal Hocko
SUSE Labs