From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pf0-f175.google.com (mail-pf0-f175.google.com [209.85.192.175])
	by kanga.kvack.org (Postfix) with ESMTP id 43C926B0005
	for ; Tue, 1 Mar 2016 10:52:17 -0500 (EST)
Received: by mail-pf0-f175.google.com with SMTP id 124so53218988pfg.0
	for ; Tue, 01 Mar 2016 07:52:17 -0800 (PST)
Received: from mail-pf0-f178.google.com (mail-pf0-f178.google.com. [209.85.192.178])
	by mx.google.com with ESMTPS id c64si51187962pfd.70.2016.03.01.07.52.16
	for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
	Tue, 01 Mar 2016 07:52:16 -0800 (PST)
Received: by mail-pf0-f178.google.com with SMTP id 124so53218839pfg.0
	for ; Tue, 01 Mar 2016 07:52:16 -0800 (PST)
Date: Tue, 1 Mar 2016 16:52:12 +0100
From: Michal Hocko
Subject: Re: [PATCH] exit: clear TIF_MEMDIE after exit_task_work
Message-ID: <20160301155212.GJ9461@dhcp22.suse.cz>
References: <1456765329-14890-1-git-send-email-vdavydov@virtuozzo.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1456765329-14890-1-git-send-email-vdavydov@virtuozzo.com>
Sender: owner-linux-mm@kvack.org
List-ID:
To: Vladimir Davydov
Cc: Andrew Morton, Tetsuo Handa, David Rientjes, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, "Michael S. Tsirkin"

[CCing vhost-net maintainer]

On Mon 29-02-16 20:02:09, Vladimir Davydov wrote:
> An mm_struct may be pinned by a file. An example is vhost-net device
> created by a qemu/kvm (see vhost_net_ioctl -> vhost_net_set_owner ->
> vhost_dev_set_owner).

The more I think about this, the more I wonder whether it is actually
OK and correct. Why does the driver have to pin the address space?
Nothing really prevents the address space from being torn down in
parallel anyway, so the code cannot expect all the vmas to stay. Would
it be enough to pin the mm_struct only?

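To make the distinction between pinning the address space and pinning
only the mm_struct concrete, here is a minimal user-space sketch. It is
a toy model under stated assumptions, not kernel code: struct toy_mm and
every toy_* helper are hypothetical names invented for illustration, and
the two counters only loosely mirror mm_users (keeps the mappings alive)
and mm_count (keeps the structure alive).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical names, loosely mirroring mm_users/mm_count. */
struct toy_mm {
	int users;		/* references that keep the mappings alive */
	int count;		/* references that keep only the structure alive */
	bool mappings_live;	/* false once the address space is torn down */
	bool struct_live;	/* false once the structure would be freed */
};

static void toy_mm_init(struct toy_mm *mm)
{
	mm->users = 1;		/* the owning process */
	mm->count = 1;		/* one structure reference shared by all users */
	mm->mappings_live = true;
	mm->struct_live = true;
}

/* Pin only the structure; the mappings may still go away underneath. */
static void toy_pin_struct(struct toy_mm *mm)
{
	mm->count++;
}

static void toy_unpin_struct(struct toy_mm *mm)
{
	assert(mm->count > 0);
	if (--mm->count == 0)
		mm->struct_live = false;
}

/* Pin the whole address space, as a get_task_mm()-style reference does. */
static void toy_pin_users(struct toy_mm *mm)
{
	mm->users++;
}

static void toy_unpin_users(struct toy_mm *mm)
{
	assert(mm->users > 0);
	if (--mm->users == 0) {
		mm->mappings_live = false;	/* stands in for exit_mmap() */
		toy_unpin_struct(mm);		/* drop the users' shared reference */
	}
}

/*
 * The owner exits while a file still holds one reference of the given kind.
 * Returns true if the mappings were torn down at owner exit.
 */
static bool toy_owner_exit_frees_memory(bool file_pins_users)
{
	struct toy_mm mm;

	toy_mm_init(&mm);
	if (file_pins_users)
		toy_pin_users(&mm);
	else
		toy_pin_struct(&mm);

	toy_unpin_users(&mm);	/* the owner drops its own users reference */
	return !mm.mappings_live;
}
```

In this model, toy_owner_exit_frees_memory(true) reports that nothing is
freed when the owner exits, because the file's reference keeps the
mappings alive until the file is released. With a structure-only pin,
toy_owner_exit_frees_memory(false) reports that the mappings go away at
owner exit and only the structure itself lingers.
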
I am not sure I understand the code properly, but what prevents a
situation where a VHOST_SET_OWNER caller dies without calling
VHOST_RESET_OWNER, so that the mm stays pinned indefinitely?

[Keeping the rest of the email for reference]

> If such process gets OOM-killed, the reference to
> its mm_struct will only be released from exit_task_work -> ____fput ->
> __fput -> vhost_net_release -> vhost_dev_cleanup, which is called after
> exit_mmap, where TIF_MEMDIE is cleared. As a result, we can start
> selecting the next victim before giving the last one a chance to free
> its memory. In practice, this leads to killing several VMs along with
> the fattest one.
>
> Signed-off-by: Vladimir Davydov
> ---
>  kernel/exit.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/exit.c b/kernel/exit.c
> index fd90195667e1..cc50e12165f7 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -434,8 +434,6 @@ static void exit_mm(struct task_struct *tsk)
>  	task_unlock(tsk);
>  	mm_update_next_owner(mm);
>  	mmput(mm);
> -	if (test_thread_flag(TIF_MEMDIE))
> -		exit_oom_victim(tsk);
>  }
>
>  static struct task_struct *find_alive_thread(struct task_struct *p)
> @@ -746,6 +744,8 @@ void do_exit(long code)
>  		disassociate_ctty(1);
>  	exit_task_namespaces(tsk);
>  	exit_task_work(tsk);
> +	if (test_thread_flag(TIF_MEMDIE))
> +		exit_oom_victim(tsk);
>  	exit_thread();
>
>  	/*
> --
> 2.1.4

-- 
Michal Hocko
SUSE Labs