From: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
To: Michal Hocko <mhocko@kernel.org>, Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org, David Rientjes <rientjes@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH 2/2] memcg: do not report racy no-eligible OOM tasks
Date: Tue, 23 Oct 2018 21:33:43 +0900
Message-ID: <a55e70bd-dc5f-9a11-72e6-7cd7b3b48ab7@I-love.SAKURA.ne.jp>
In-Reply-To: <20181023121055.GS18839@dhcp22.suse.cz>
On 2018/10/23 21:10, Michal Hocko wrote:
> On Tue 23-10-18 13:42:46, Michal Hocko wrote:
>> On Tue 23-10-18 10:01:08, Tetsuo Handa wrote:
>>> Michal Hocko wrote:
>>>> On Mon 22-10-18 20:45:17, Tetsuo Handa wrote:
>>>>>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>>>>>> index e79cb59552d9..a9dfed29967b 100644
>>>>>> --- a/mm/memcontrol.c
>>>>>> +++ b/mm/memcontrol.c
>>>>>> @@ -1380,10 +1380,22 @@ static bool mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask,
>>>>>>  		.gfp_mask = gfp_mask,
>>>>>>  		.order = order,
>>>>>>  	};
>>>>>> -	bool ret;
>>>>>> +	bool ret = true;
>>>>>>
>>>>>>  	mutex_lock(&oom_lock);
>>>>>> +
>>>>>> +	/*
>>>>>> +	 * multi-threaded tasks might race with oom_reaper and gain
>>>>>> +	 * MMF_OOM_SKIP before reaching out_of_memory which can lead
>>>>>> +	 * to out_of_memory failure if the task is the last one in
>>>>>> +	 * memcg which would be a false positive failure reported
>>>>>> +	 */
>>>>>> +	if (tsk_is_oom_victim(current))
>>>>>> +		goto unlock;
>>>>>> +
>>>>>
>>>>> This is not wrong but is strange. We can use mutex_lock_killable(&oom_lock)
>>>>> so that any killed threads no longer wait for oom_lock.
>>>>
>>>> tsk_is_oom_victim is stronger because it doesn't depend on
>>>> fatal_signal_pending which might be cleared throughout the exit process.
>>>>
>>>
>>> I still want to propose this. No need to be memcg OOM specific.
>>
>> Well, I maintain what I've said [1] about simplicity and specific fix
>> for a specific issue. Especially in the tricky code like this where all
>> the consequences are far more subtle than they seem to be.
>>
>> This is obviously a matter of taste but I don't see much point discussing
>> this back and forth for ever. Unless there is a general agreement that
>> the above is less appropriate then I am willing to consider a different
>> change but I simply do not have energy to nit pick for ever.
>>
>> [1] http://lkml.kernel.org/r/20181022134315.GF18839@dhcp22.suse.cz
>
> In other words. Having a memcg specific fix means, well, a memcg
> maintenance burden. Like any other memcg specific oom decisions we
> already have. So are you OK with that Johannes or you would like to see
> a more generic fix which might turn out to be more complex?
>
I don't know what "that Johannes" refers to.
If you don't want to affect SysRq-OOM and pagefault-OOM cases,
are you OK with having a global-OOM specific fix?
mm/page_alloc.c | 11 +++++++++++
1 file changed, 11 insertions(+)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e2ef1c1..f59f029 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3518,6 +3518,17 @@ void warn_alloc(gfp_t gfp_mask, nodemask_t *nodemask, const char *fmt, ...)
 	if (gfp_mask & __GFP_THISNODE)
 		goto out;
 
+	/*
+	 * It is possible that multi-threaded OOM victims get
+	 * task_will_free_mem(current) == false when the OOM reaper quickly
+	 * set MMF_OOM_SKIP. But since we know that tsk_is_oom_victim() == true
+	 * tasks won't loop forever (unless it is a __GFP_NOFAIL allocation
+	 * request), we don't need to select next OOM victim.
+	 */
+	if (tsk_is_oom_victim(current) && !(gfp_mask & __GFP_NOFAIL)) {
+		*did_some_progress = 1;
+		goto out;
+	}
 	/* Exhausted what can be done so it's blame time */
 	if (out_of_memory(&oc) || WARN_ON_ONCE(gfp_mask & __GFP_NOFAIL)) {
 		*did_some_progress = 1;
--
1.8.3.1
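
For readers who want to reason about the check above outside the kernel,
here is a minimal userspace sketch of the decision it makes. Everything
prefixed with model_ or MODEL_ is a hypothetical stand-in invented for this
illustration (the flag values, the task struct, and the action enum are not
kernel definitions); only the order of the tests is meant to mirror the
hunk against mm/page_alloc.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits; the real __GFP_* values differ. */
#define MODEL_GFP_NOFAIL   0x1u
#define MODEL_GFP_THISNODE 0x2u

/* Hypothetical stand-in for the state tsk_is_oom_victim(current) reads. */
struct model_task {
	bool is_oom_victim;
};

enum model_action {
	MODEL_SKIP_NO_PROGRESS,     /* "goto out" with *did_some_progress == 0 */
	MODEL_SKIP_REPORT_PROGRESS, /* "goto out" with *did_some_progress == 1 */
	MODEL_INVOKE_OOM_KILLER,    /* fall through to out_of_memory() */
};

/* Mirrors the order of the checks in the proposed hunk. */
enum model_action model_may_oom(const struct model_task *task,
				unsigned int gfp_mask)
{
	assert(task != NULL);

	/* Existing check: node-restricted allocations never trigger OOM. */
	if (gfp_mask & MODEL_GFP_THISNODE)
		return MODEL_SKIP_NO_PROGRESS;

	/*
	 * Proposed check: an OOM victim that may fail its allocation
	 * does not need another victim to be selected on its behalf.
	 */
	if (task->is_oom_victim && !(gfp_mask & MODEL_GFP_NOFAIL))
		return MODEL_SKIP_REPORT_PROGRESS;

	return MODEL_INVOKE_OOM_KILLER;
}
```

In this model an OOM victim issuing a __GFP_NOFAIL request still reaches the
OOM killer, which matches the exception spelled out in the patch comment.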
Thread overview: 30+ messages
2018-10-22 7:13 [RFC PATCH 0/2] oom, memcg: do not report racy no-eligible OOM Michal Hocko
2018-10-22 7:13 ` [RFC PATCH 1/2] mm, oom: marks all killed tasks as oom victims Michal Hocko
2018-10-22 7:58 ` Tetsuo Handa
2018-10-22 8:48 ` Michal Hocko
2018-10-22 9:42 ` Tetsuo Handa
2018-10-22 10:43 ` Michal Hocko
2018-10-22 10:56 ` Tetsuo Handa
2018-10-22 11:12 ` Michal Hocko
2018-10-22 11:16 ` [RFC PATCH v2 " Michal Hocko
2018-10-22 7:13 ` [RFC PATCH 2/2] memcg: do not report racy no-eligible OOM tasks Michal Hocko
2018-10-22 11:45 ` Tetsuo Handa
2018-10-22 12:03 ` Michal Hocko
2018-10-22 13:20 ` Tetsuo Handa
2018-10-22 13:43 ` Michal Hocko
2018-10-22 15:12 ` Tetsuo Handa
2018-10-23 1:01 ` Tetsuo Handa
2018-10-23 11:42 ` Michal Hocko
2018-10-23 12:10 ` Michal Hocko
2018-10-23 12:33 ` Tetsuo Handa [this message]
2018-10-23 12:48 ` Michal Hocko
2018-10-26 14:25 ` Johannes Weiner
2018-10-26 19:25 ` Michal Hocko
2018-10-26 19:33 ` Michal Hocko
2018-10-27 1:10 ` Tetsuo Handa
2018-11-06 9:44 ` Tetsuo Handa
2018-11-06 12:42 ` Michal Hocko
2018-11-07 9:45 ` Tetsuo Handa
2018-11-07 10:08 ` Michal Hocko
2018-12-07 12:43 ` Tetsuo Handa
2018-12-12 10:23 ` Tetsuo Handa