From: Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>
To: Michal Hocko <mhocko@kernel.org>
Cc: ytk.lee@samsung.com, "linux-mm@kvack.org" <linux-mm@kvack.org>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
Oleg Nesterov <oleg@redhat.com>,
David Rientjes <rientjes@google.com>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH] mm, oom_adj: avoid meaningless loop to find processes sharing mm
Date: Tue, 9 Oct 2018 19:00:44 +0900 [thread overview]
Message-ID: <df4b029c-16b4-755f-2672-d7ec116f78ba@i-love.sakura.ne.jp> (raw)
In-Reply-To: <20181009075015.GC8528@dhcp22.suse.cz>
On 2018/10/09 16:50, Michal Hocko wrote:
> On Tue 09-10-18 08:35:41, Michal Hocko wrote:
>> [I have only now noticed that the patch has been reposted]
>>
>> On Mon 08-10-18 18:27:39, Tetsuo Handa wrote:
>>> On 2018/10/08 17:38, Yong-Taek Lee wrote:
> [...]
>>>> Thank you for your suggestion. But i think it would be better to seperate to 2 issues. How about think these
>>>> issues separately because there are no dependency between race issue and my patch. As i already explained,
>>>> for_each_process path is meaningless if there is only one thread group with many threads(mm_users > 1 but
>>>> no other thread group sharing same mm). Do you have any other idea to avoid meaningless loop ?
>>>
>>> Yes. I suggest reverting commit 44a70adec910d692 ("mm, oom_adj: make sure processes
>>> sharing mm have same view of oom_score_adj") and commit 97fd49c2355ffded ("mm, oom:
>>> kill all tasks sharing the mm").
>>
>> This would require a lot of other work for something as border line as
>> weird threading model like this. I will think about something more
>> appropriate - e.g. we can take mmap_sem for read while doing this check
>> and that should prevent from races with [v]fork.
>
> Not really. We do not even take the mmap_sem when CLONE_VM. So this is
> not the way. Doing a proper synchronization seems much harder. So let's
> consider what is the worst case scenario. We would basically hit a race
> window between copy_signal and copy_mm and the only relevant case would
> be OOM_SCORE_ADJ_MIN which wouldn't propagate to the new "thread".
The "between copy_signal() and copy_mm()" race window is merely whether we need
to run for_each_process() loop. The race window is much larger than that; it is
between "copy_signal() copies oom_score_adj/oom_score_adj_min" and "the created
thread becomes accessible from for_each_process() loop".
> OOM
> killer could then pick up the "thread" and kill it along with the whole
> process group sharing the mm.
Just reverting commit 44a70adec910d692 and commit 97fd49c2355ffded is
sufficient.
> Well, that is unfortunate indeed and it
> breaks the OOM_SCORE_ADJ_MIN contract. There are basically two ways here
> 1) do not care and encourage users to use a saner way to set
> OOM_SCORE_ADJ_MIN because doing that externally is racy anyway e.g.
> setting it before [v]fork & exec. Btw. do we know about an actual user
> who would care?
I'm not talking about [v]fork & exec. Why are you talking about [v]fork & exec ?
> 2) add OOM_SCORE_ADJ_MIN and do not kill tasks sharing mm and do not
> reap the mm in the rare case of the race.
That is no problem. The mistake we made in 4.6 was that we updated oom_score_adj
to -1000 (and allowed unprivileged users to OOM-lockup the system). Now that we set
MMF_OOM_SKIP, there is no need to worry about "oom_score_adj != -1000" thread group
and "oom_score_adj == -1000" thread group sharing the same mm. Since updating
oom_score_adj to -1000 is a privileged operation, it is administrator's wish if
such case happened; the kernel should respect the administrator's wish.
>
> I would prefer the firs but if this race really has to be addressed then
> the 2 sounds more reasonable than the wholesale revert.
>
next prev parent reply other threads:[~2018-10-09 10:01 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <CGME20181008011931epcms1p82dd01b7e5c067ea99946418bc97de46a@epcms1p8>
2018-10-08 1:19 ` Yong-Taek Lee
2018-10-08 2:52 ` Tetsuo Handa
[not found] ` <CGME20181008011931epcms1p82dd01b7e5c067ea99946418bc97de46a@epcms1p5>
2018-10-08 6:14 ` Yong-Taek Lee
2018-10-08 6:22 ` Tetsuo Handa
[not found] ` <CGME20181008011931epcms1p82dd01b7e5c067ea99946418bc97de46a@epcms1p2>
2018-10-08 8:38 ` Yong-Taek Lee
2018-10-08 9:27 ` Tetsuo Handa
2018-10-09 6:35 ` Michal Hocko
2018-10-09 7:50 ` Michal Hocko
2018-10-09 10:00 ` Tetsuo Handa [this message]
2018-10-09 11:10 ` Michal Hocko
2018-10-09 12:52 ` Tetsuo Handa
2018-10-09 12:58 ` Michal Hocko
2018-10-09 13:14 ` Tetsuo Handa
2018-10-09 13:26 ` Michal Hocko
2018-10-09 13:51 ` Tetsuo Handa
2018-10-09 14:09 ` Michal Hocko
2018-10-09 8:02 ` Michal Hocko
[not found] <CGME20181005063208epcms1p22959cd2f771ad017996e2b18266791ea@epcms1p2>
2018-10-05 6:32 ` Yong-Taek Lee
2018-10-09 6:23 ` Michal Hocko
2018-10-09 8:03 ` Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=df4b029c-16b4-755f-2672-d7ec116f78ba@i-love.sakura.ne.jp \
--to=penguin-kernel@i-love.sakura.ne.jp \
--cc=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=oleg@redhat.com \
--cc=rientjes@google.com \
--cc=torvalds@linux-foundation.org \
--cc=vdavydov.dev@gmail.com \
--cc=ytk.lee@samsung.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox