From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from psmtp.com (na3sys010amx196.postini.com [74.125.245.196]) by kanga.kvack.org (Postfix) with SMTP id 428EB6B0002 for ; Wed, 24 Apr 2013 11:22:39 -0400 (EDT) Date: Wed, 24 Apr 2013 17:22:36 +0200 From: Michal Hocko Subject: Re: [PATCH] oom: add pending SIGKILL check for chosen victim Message-ID: <20130424152236.GB7600@dhcp22.suse.cz> References: <1366643184-3627-1-git-send-email-dserrg@gmail.com> <20130422195138.GB31098@dhcp22.suse.cz> <20130423192614.c8621a7fe1b5b3e0a2ebf74a@gmail.com> <20130423155638.GJ8001@dhcp22.suse.cz> <20130424145514.GA24997@redhat.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20130424145514.GA24997@redhat.com> Sender: owner-linux-mm@kvack.org List-ID: To: Oleg Nesterov Cc: dserrg , linux-mm@kvack.org, Andrew Morton , David Rientjes , KAMEZAWA Hiroyuki , Sha Zhengju On Wed 24-04-13 16:55:14, Oleg Nesterov wrote: > On 04/23, Michal Hocko wrote: > > > > [CCing Oleg] > > > > On Tue 23-04-13 19:26:14, dserrg wrote: > > > On Mon, 22 Apr 2013 21:51:38 +0200 > > > > > > Yes, we are holding tasklist_lock when iterating, but the thread can be deleted > > > from thread_group list _before_ that. In this case, while_each_thread loop exit > > > condition will never be true. > > Yes, while_each_thread() should be only used if know that ->thread_group > list is valid. > > For example, if you do find_task_by_vpid() under rcu_read_lock() > while_each_thread() is fine _unless_ you drop rcu lock in between. > > This is the common mistake, people often forget about this. > > But I can't understand how this patch can fix the problem, I think it > can't. > > From the changelog: > > When SIGKILL is sent to a task, it's also sent to all tasks in the same > threadgroup. This information can be used to prevent triggering further > oom killers for this threadgroup and avoid the infinite loop. > ^^^^^^^^^^^^^^^^^^^^^^^ > > How?? I guess it assumes that fatal_signal_pending() is still true even when the process is unhashed already. Which sounds like a workaround to me. > > Oleg, is there anything that would prevent from this race? Maybe we need > > to call thread_group_empty before? > > You need to check, say, pid_alive(). Or PF_EXITING. > > For example oom_kill_process() looks wrong. It does check PF_EXITING > before while_each_thread(), but this is racy because it should check > it under tasklist_lock. Thanks for the clarification. I have tried to deduce the locking rules but failed... > So I think that oom_kill_process() needs something like below, but > this code really needs the cleanups. > > Oleg. > > > --- x/mm/oom_kill.c > +++ x/mm/oom_kill.c > @@ -436,6 +436,14 @@ void oom_kill_process(struct task_struct > * parent. This attempts to lose the minimal amount of work done while > * still freeing memory. > */ > + rcu_read_lock(); > + if (!pid_alive(p)) { > + rcu_read_unlock(); > + set_tsk_thread_flag(p, TIF_MEMDIE); > + put_task_struct(p); > + return; > + } > + > read_lock(&tasklist_lock); > do { > list_for_each_entry(child, &t->children, sibling) { > @@ -458,7 +466,6 @@ void oom_kill_process(struct task_struct > } while_each_thread(p, t); > read_unlock(&tasklist_lock); > > - rcu_read_lock(); > p = find_lock_task_mm(victim); > if (!p) { > rcu_read_unlock(); Makes sense to me. -- Michal Hocko SUSE Labs -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org