From: Joel Savitz
Date: Wed, 15 Dec 2021 10:53:15 -0500
Subject: Re: [PATCH v2] mm/oom_kill: wake futex waiters before annihilating victim shared mutex
To: Qian Cai
Cc: linux-kernel, Waiman Long, linux-mm@kvack.org, Nico Pache, Peter Zijlstra, Thomas Gleixner, Ingo Molnar, Darren Hart, Davidlohr Bueso, André Almeida, Andrew Morton, Michal Hocko
References: <20211208181714.880312-1-jsavitz@redhat.com>

> I am afraid we can't call futex_exit_release() under rcu_read_lock()
> because it might sleep.

Ah that's too bad. Is there an equivalent atomic call suitable for this purpose?

>
> BUG: sleeping function called from invalid context at kernel/locking/mutex.c:577
> in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1509, name: lsbug
> preempt_count: 1, expected: 0
> 3 locks held by lsbug/1509:
>  #0: ffff00004de99c98 (&mm->mmap_lock){++++}-{3:3}, at: do_page_fault
>  #1: ffff800010fd8308 (oom_lock){+.+.}-{3:3}, at: __alloc_pages_slowpath.constprop.0
> __alloc_pages_may_oom at /usr/src/linux-next/mm/page_alloc.c:4278
> (inlined by) __alloc_pages_slowpath at /usr/src/linux-next/mm/page_alloc.c:5058
>  #2: ffff000867b3b0c0 (&p->alloc_lock){+.+.}-{2:2}, at: find_lock_task_mm
> find_lock_task_mm at /usr/src/linux-next/mm/oom_kill.c:145
> CPU: 5 PID: 1509 Comm: lsbug Not tainted 5.16.0-rc5-next-20211214+ #172
> Hardware name: MiTAC RAPTOR EV-883832-X3-0001/RAPTOR, BIOS 1.6 06/28/2020
> Call trace:
>  dump_backtrace
>  show_stack
>  dump_stack_lvl
>  dump_stack
>  __might_resched
>  __might_sleep
>  __mutex_lock
>  mutex_lock_nested
>  futex_cleanup_begin
> futex_cleanup_begin at /usr/src/linux-next/kernel/futex/core.c:1071
>  futex_exit_release
>  __oom_kill_process
>  oom_kill_process
>  out_of_memory
>  __alloc_pages_slowpath.constprop.0
>  __alloc_pages
>  alloc_pages_vma
>  alloc_zeroed_user_highpage_movable
>  do_anonymous_page
>  __handle_mm_fault
>  handle_mm_fault
>  do_page_fault
>  do_translation_fault
>  do_mem_abort
>  el0_da
>  el0t_64_sync_handler
>  el0t_64_sync
> =============================
> [ BUG: Invalid wait context ]
> 5.16.0-rc5-next-20211214+ #172 Tainted: G W
> -----------------------------
> lsbug/1509 is trying to lock:
> ffff000867b3ba98 (&tsk->futex_exit_mutex){+.+.}-{3:3}, at: futex_cleanup_begin
> other info that might help us debug this:
> context-{4:4}
> 3 locks held by lsbug/1509:
>  #0: ffff00004de99c98 (&mm->mmap_lock){++++}-{3:3}, at: do_page_fault
>  #1: ffff800010fd8308 (oom_lock){+.+.}-{3:3}, at: __alloc_pages_slowpath.constprop.0
>  #2: ffff000867b3b0c0 (&p->alloc_lock){+.+.}-{2:2}, at: find_lock_task_mm
> stack backtrace:
> CPU: 5 PID: 1509 Comm: lsbug Tainted: G W 5.16.0-rc5-next-20211214+ #172
> Hardware name: MiTAC RAPTOR EV-883832-X3-0001/RAPTOR, BIOS 1.6 06/28/2020
> Call trace:
>  dump_backtrace
>  show_stack
>  dump_stack_lvl
>  dump_stack
>  __lock_acquire
>  lock_acquire
>  __mutex_lock
>  mutex_lock_nested
>  futex_cleanup_begin
>  futex_exit_release
>  __oom_kill_process
>  oom_kill_process
>  out_of_memory
>  __alloc_pages_slowpath.constprop.0
>  __alloc_pages
>  alloc_pages_vma
>  alloc_zeroed_user_highpage_movable
>  do_anonymous_page
>  __handle_mm_fault
>  handle_mm_fault
>  do_page_fault
>  do_translation_fault
>  do_mem_abort
>  el0_da
>  el0t_64_sync_handler
>  el0t_64_sync
>
> > ---
> >  mm/oom_kill.c | 12 ++++++++++++
> >  1 file changed, 12 insertions(+)
> >
> > diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> > index 1ddabefcfb5a..884a5f15fd06 100644
> > --- a/mm/oom_kill.c
> > +++ b/mm/oom_kill.c
> > @@ -44,6 +44,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >
> >  #include
> >  #include "internal.h"
> > @@ -885,6 +886,11 @@ static void __oom_kill_process(struct task_struct *victim, const char *message)
> >      count_vm_event(OOM_KILL);
> >      memcg_memory_event_mm(mm, MEMCG_OOM_KILL);
> >
> > +    /*
> > +     * We call futex_exit_release() on the victim task to ensure any waiters on any
> > +     * process-shared futexes held by the victim task are woken up.
> > +     */
> > +    futex_exit_release(victim);
> >      /*
> >       * We should send SIGKILL before granting access to memory reserves
> >       * in order to prevent the OOM victim from depleting the memory
> > @@ -930,6 +936,12 @@ static void __oom_kill_process(struct task_struct *victim, const char *message)
> >           */
> >          if (unlikely(p->flags & PF_KTHREAD))
> >              continue;
> > +        /*
> > +         * We call futex_exit_release() on any task p sharing the
> > +         * victim->mm to ensure any waiters on any
> > +         * process-shared futexes held by task p are woken up.
> > +         */
> > +        futex_exit_release(p);
> >          do_send_sig_info(SIGKILL, SEND_SIG_PRIV, p, PIDTYPE_TGID);
> >      }
> >      rcu_read_unlock();
> > --
> > 2.27.0
> >
>

Best,
Joel Savitz
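
For reference, the userspace-visible behaviour the patch is after can be sketched roughly as below. This is only a minimal sketch, assuming Linux with glibc robust mutexes; it does not exercise the OOM path, and the function name lock_after_owner_exit() is hypothetical and not part of the patch. A child takes a robust, process-shared mutex and exits without unlocking it. The kernel's exit-time robust-list handling is expected to mark the owner as dead, so the parent's pthread_mutex_lock() should return EOWNERDEAD instead of blocking forever.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the patch): returns the result of
 * pthread_mutex_lock() in the parent after a child exited while holding
 * a robust, process-shared mutex, or -1 if the setup failed.
 */
int lock_after_owner_exit(void)
{
    /* Place the mutex in memory that stays shared across fork(). */
    pthread_mutex_t *m = mmap(NULL, sizeof(*m), PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (m == MAP_FAILED)
        return -1;

    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0 ||
        pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) != 0 ||
        pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST) != 0 ||
        pthread_mutex_init(m, &attr) != 0) {
        munmap(m, sizeof(*m));
        return -1;
    }
    pthread_mutexattr_destroy(&attr);

    pid_t pid = fork();
    if (pid < 0) {
        munmap(m, sizeof(*m));
        return -1;
    }
    if (pid == 0) {
        /* Child: take the lock and exit while still holding it. */
        if (pthread_mutex_lock(m) != 0)
            _exit(1);
        _exit(0);
    }

    /* Parent: wait until the child is gone, then try to take the lock. */
    if (waitpid(pid, NULL, 0) < 0) {
        munmap(m, sizeof(*m));
        return -1;
    }

    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        /* Mark the protected state as recovered before unlocking. */
        pthread_mutex_consistent(m);
        pthread_mutex_unlock(m);
    } else if (rc == 0) {
        pthread_mutex_unlock(m);
    }

    pthread_mutex_destroy(m);
    munmap(m, sizeof(*m));
    return rc;
}
```

Calling lock_after_owner_exit() from a small main() and building with -pthread should be enough to try it. If the owner were torn down without the robust list being processed, a waiter on this mutex could stay blocked, which is the situation the quoted patch tries to avoid for OOM victims.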