linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Kohli, Gaurav" <gkohli@codeaurora.org>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
	David Rientjes <rientjes@google.com>
Cc: akpm@linux-foundation.org, mhocko@suse.com,
	kirill.shutemov@linux.intel.com, aarcange@redhat.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-arm-msm@vger.kernel.org
Subject: Re: [PATCH] mm: oom: Fix race condition between oom_badness and do_exit of task
Date: Fri, 9 Mar 2018 12:41:44 +0530	[thread overview]
Message-ID: <14ba6c44-d444-bd0a-0bac-0c6851b19344@codeaurora.org> (raw)
In-Reply-To: <d73682f9-f214-64c4-ce09-fd1ff3ffe252@I-love.SAKURA.ne.jp>

[-- Attachment #1: Type: text/plain, Size: 3735 bytes --]

On 3/8/2018 7:35 PM, Tetsuo Handa wrote:

> On 2018/03/08 13:51, Kohli, Gaurav wrote:
>> On 3/8/2018 2:26 AM, David Rientjes wrote:
>>
>>> On Wed, 7 Mar 2018, Gaurav Kohli wrote:
>>>
>>>> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
>>>> index 6fd9773..5f4cc4b 100644
>>>> --- a/mm/oom_kill.c
>>>> +++ b/mm/oom_kill.c
>>>> @@ -114,9 +114,11 @@ struct task_struct *find_lock_task_mm(struct task_struct *p)
>>>>  A  A A A A A  for_each_thread(p, t) {
>>>>  A A A A A A A A A  task_lock(t);
>>>> +A A A A A A A  get_task_struct(t);
>>>>  A A A A A A A A A  if (likely(t->mm))
>>>>  A A A A A A A A A A A A A  goto found;
>>>>  A A A A A A A A A  task_unlock(t);
>>>> +A A A A A A A  put_task_struct(t);
>>>>  A A A A A  }
>>>>  A A A A A  t = NULL;
>>>>  A  found:
>>> We hold rcu_read_lock() here, so perhaps only do get_task_struct() before
>>> doing rcu_read_unlock() and we have a non-NULL t?
>> Here rcu_read_lock will not help, as our task may change due to below algo:
>>
>> for_each_thread(p, t) {
>>  A A A A A A A A  task_lock(t);
>> +A A A A A A A  get_task_struct(t);
>>  A A A A A A A A  if (likely(t->mm))
>>  A A A A A A A A A A A A  goto found;
>>  A A A A A A A A  task_unlock(t);
>> +A A A A A A A  put_task_struct(t)
>>
>>
>> So only we can increase usage counter here only at the current task.
> static int proc_single_show(struct seq_file *m, void *v)
> {
> 	struct inode *inode = m->private;
> 	struct pid_namespace *ns;
> 	struct pid *pid;
> 	struct task_struct *task;
> 	int ret;
>
> 	ns = inode->i_sb->s_fs_info;
> 	pid = proc_pid(inode);
> 	task = get_pid_task(pid, PIDTYPE_PID); /* get_task_struct() is called upon success. */
> 	if (!task)
> 		return -ESRCH;
>
> 	ret = PROC_I(inode)->op.proc_show(m, ns, pid, task);
>
> 	put_task_struct(task);
> 	return ret;
> }
>
> static int proc_oom_score(struct seq_file *m, struct pid_namespace *ns,
> 			  struct pid *pid, struct task_struct *task)
> {
> 	unsigned long totalpages = totalram_pages + total_swap_pages;
> 	unsigned long points = 0;
>
> 	points = oom_badness(task, NULL, NULL, totalpages) *
> 			     1000 / totalpages; /* task->usage > 0 due to proc_single_show() */
> 	seq_printf(m, "%lu\n", points);
>
> 	return 0;
> }
>
> struct task_struct *find_lock_task_mm(struct task_struct *p) /* p->usage > 0 */
> {
> 	struct task_struct *t;
>
> 	rcu_read_lock();
>
> 	for_each_thread(p, t) {
> 		task_lock(t);
> 		if (likely(t->mm))
> 			goto found;
> 		task_unlock(t);
> 	}
> 	t = NULL;
> found:
> 	rcu_read_unlock();
>
> 	return t; /* t->usage > 0 even if t != p because t->mm != NULL */
> }
>
> t->alloc_lock is still held when leaving find_lock_task_mm(), which means
> that t->mm != NULL. But nothing prevents t from setting t->mm = NULL at
> exit_mm() from do_exit() and calling exit_creds() from __put_task_struct(t)
> after task_unlock(t) is called. Seems difficult to trigger race window. Maybe
> something has preempted because oom_badness() becomes outside of RCU grace
> period upon leaving find_lock_task_mm() when called from proc_oom_score().

Hi Tetsuo,

Yes it is not easy to reproduce seen twice till now and i agree with your analysis. But David has already fixing this in different way, So that also looks better to me:

https://patchwork.kernel.org/patch/10265641/

But if need to keep that code, So we have to bump up the task reference that's only i can think of now.

> --
> To unsubscribe from this list: send the line "unsubscribe linux-arm-msm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
-- 
Qualcomm India Private Limited, on behalf of Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum,
a Linux Foundation Collaborative Project.


[-- Attachment #2: Type: text/html, Size: 39898 bytes --]

  reply	other threads:[~2018-03-09  7:11 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-03-07 12:57 Gaurav Kohli
2018-03-07 20:56 ` David Rientjes
2018-03-08  4:51   ` Kohli, Gaurav
2018-03-08 14:05     ` Tetsuo Handa
2018-03-09  7:11       ` Kohli, Gaurav [this message]
2018-03-09 10:48         ` Tetsuo Handa
2018-03-09 12:04           ` Kohli, Gaurav
2018-03-09 12:18             ` Tetsuo Handa
2018-03-13 13:37           ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=14ba6c44-d444-bd0a-0bac-0c6851b19344@codeaurora.org \
    --to=gkohli@codeaurora.org \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-arm-msm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=penguin-kernel@I-love.SAKURA.ne.jp \
    --cc=rientjes@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox