From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-pa0-f51.google.com (mail-pa0-f51.google.com [209.85.220.51]) by kanga.kvack.org (Postfix) with ESMTP id B8EBB82F69 for ; Mon, 22 Feb 2016 19:47:25 -0500 (EST) Received: by mail-pa0-f51.google.com with SMTP id fy10so99720757pac.1 for ; Mon, 22 Feb 2016 16:47:25 -0800 (PST) Received: from mail-pa0-x236.google.com (mail-pa0-x236.google.com. [2607:f8b0:400e:c03::236]) by mx.google.com with ESMTPS id 65si43151181pfi.145.2016.02.22.16.47.25 for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Mon, 22 Feb 2016 16:47:25 -0800 (PST) Received: by mail-pa0-x236.google.com with SMTP id ho8so102529626pac.2 for ; Mon, 22 Feb 2016 16:47:25 -0800 (PST) Date: Mon, 22 Feb 2016 16:47:23 -0800 (PST) From: David Rientjes Subject: Re: [PATCH] mm,oom: remove shortcuts for SIGKILL and PF_EXITING cases In-Reply-To: <1456038869-7874-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp> Message-ID: References: <1456038869-7874-1-git-send-email-penguin-kernel@I-love.SAKURA.ne.jp> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: owner-linux-mm@kvack.org List-ID: To: Tetsuo Handa Cc: mhocko@kernel.org, akpm@linux-foundation.org, hannes@cmpxchg.org, mgorman@suse.de, oleg@redhat.com, torvalds@linux-foundation.org, hughd@google.com, andrea@kernel.org, riel@redhat.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org On Sun, 21 Feb 2016, Tetsuo Handa wrote: > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index ae8b81c..390ec2c 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -1253,16 +1253,6 @@ static void mem_cgroup_out_of_memory(struct mem_cgroup *memcg, gfp_t gfp_mask, > > mutex_lock(&oom_lock); > > - /* > - * If current has a pending SIGKILL or is exiting, then automatically > - * select it. The goal is to allow it to allocate so that it may > - * quickly exit and free its memory. > - */ > - if (fatal_signal_pending(current) || task_will_free_mem(current)) { > - mark_oom_victim(current); > - goto unlock; > - } > - > check_panic_on_oom(&oc, CONSTRAINT_MEMCG, memcg); > totalpages = mem_cgroup_get_limit(memcg) ? : 1; > for_each_mem_cgroup_tree(iter, memcg) { > diff --git a/mm/oom_kill.c b/mm/oom_kill.c > index d7bb9c1..5e8563a 100644 > --- a/mm/oom_kill.c > +++ b/mm/oom_kill.c > @@ -684,19 +684,6 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p, > DEFAULT_RATELIMIT_BURST); > bool can_oom_reap = true; > > - /* > - * If the task is already exiting, don't alarm the sysadmin or kill > - * its children or threads, just set TIF_MEMDIE so it can die quickly > - */ > - task_lock(p); > - if (p->mm && task_will_free_mem(p)) { > - mark_oom_victim(p); > - task_unlock(p); > - put_task_struct(p); > - return; > - } > - task_unlock(p); > - > if (__ratelimit(&oom_rs)) > dump_header(oc, p, memcg); > > @@ -759,20 +746,15 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p, > task_unlock(victim); > > /* > - * Kill all user processes sharing victim->mm in other thread groups, if > - * any. They don't get access to memory reserves, though, to avoid > - * depletion of all memory. This prevents mm->mmap_sem livelock when an > - * oom killed thread cannot exit because it requires the semaphore and > - * its contended by another thread trying to allocate memory itself. > - * That thread will now get access to memory reserves since it has a > - * pending fatal signal. > + * Kill all user processes sharing victim->mm. This reduces possibility > + * of hitting mm->mmap_sem livelock when an oom killed thread cannot > + * exit because it requires the semaphore and its contended by another > + * thread trying to allocate memory itself. > */ > rcu_read_lock(); > for_each_process(p) { > if (!process_shares_mm(p, mm)) > continue; > - if (same_thread_group(p, victim)) > - continue; > if (unlikely(p->flags & PF_KTHREAD) || is_global_init(p) || > p->signal->oom_score_adj == OOM_SCORE_ADJ_MIN) { > /* > @@ -784,6 +766,12 @@ void oom_kill_process(struct oom_control *oc, struct task_struct *p, > continue; > } > do_send_sig_info(SIGKILL, SEND_SIG_FORCED, p, true); > + for_each_thread(p, t) { > + task_lock(t); > + if (t->mm) > + mark_oom_victim(t); > + task_unlock(t); > + } > } > rcu_read_unlock(); > > @@ -860,20 +848,6 @@ bool out_of_memory(struct oom_control *oc) > return true; > > /* > - * If current has a pending SIGKILL or is exiting, then automatically > - * select it. The goal is to allow it to allocate so that it may > - * quickly exit and free its memory. > - * > - * But don't select if current has already released its mm and cleared > - * TIF_MEMDIE flag at exit_mm(), otherwise an OOM livelock may occur. > - */ > - if (current->mm && > - (fatal_signal_pending(current) || task_will_free_mem(current))) { > - mark_oom_victim(current); > - return true; > - } > - > - /* > * Check if there were limitations on the allocation (only relevant for > * NUMA) that may require different handling. > */ No, NACK. You cannot prohibit an exiting process from gaining access to memory reserves and randomly killing another process without additional chances of a livelock. The goal is for an exiting or killed process to be able to exit so it can free its memory, not kill additional processes. Please start trimming your cc list, I seriously doubt all these people are interested in this thread. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org