Date: Sat, 26 Oct 2019 11:34:07 -0400
From: Sasha Levin
To: zhong jiang
Cc: Matthew Wilcox, gregkh@linuxfoundation.org, stable@vger.kernel.org, dh.herrmann@gmail.com, linux-mm@kvack.org
Subject: Re: [PATCH 4.19] memfd: Fix locking when tagging pins
Message-ID: <20191026153407.GJ31224@sasha-vm>
References: <20191025165837.22979-1-willy@infradead.org> <5DB3A985.4000903@huawei.com>
In-Reply-To: <5DB3A985.4000903@huawei.com>

On Sat, Oct 26, 2019 at 10:03:49AM +0800, zhong jiang wrote:
>On 2019/10/26 0:58, Matthew Wilcox wrote:
>> From: "Matthew Wilcox (Oracle)"
>>
>> The RCU lock is insufficient to protect the radix tree iteration as
>> a deletion from the tree can occur before we take the spinlock to
>> tag the entry. In 4.19, this has manifested as a bug with the following
>> trace:
>>
>> kernel BUG at lib/radix-tree.c:1429!
>> invalid opcode: 0000 [#1] SMP KASAN PTI
>> CPU: 7 PID: 6935 Comm: syz-executor.2 Not tainted 4.19.36 #25
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
>> RIP: 0010:radix_tree_tag_set+0x200/0x2f0 lib/radix-tree.c:1429
>> Code: 00 00 5b 5d 41 5c 41 5d 41 5e 41 5f c3 48 89 44 24 10 e8 a3 29 7e fe 48 8b 44 24 10 48 0f ab 03 e9 d2 fe ff ff e8 90 29 7e fe <0f> 0b 48 c7 c7 e0 5a 87 84 e8 f0 e7 08 ff 4c 89 ef e8 4a ff ac fe
>> RSP: 0018:ffff88837b13fb60 EFLAGS: 00010016
>> RAX: 0000000000040000 RBX: ffff8883c5515d58 RCX: ffffffff82cb2ef0
>> RDX: 0000000000000b72 RSI: ffffc90004cf2000 RDI: ffff8883c5515d98
>> RBP: ffff88837b13fb98 R08: ffffed106f627f7e R09: ffffed106f627f7e
>> R10: 0000000000000001 R11: ffffed106f627f7d R12: 0000000000000004
>> R13: ffffea000d7fea80 R14: 1ffff1106f627f6f R15: 0000000000000002
>> FS: 00007fa1b8df2700(0000) GS:ffff8883e2fc0000(0000) knlGS:0000000000000000
>> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> CR2: 00007fa1b8df1db8 CR3: 000000037d4d2001 CR4: 0000000000160ee0
>> Call Trace:
>>  memfd_tag_pins mm/memfd.c:51 [inline]
>>  memfd_wait_for_pins+0x2c5/0x12d0 mm/memfd.c:81
>>  memfd_add_seals mm/memfd.c:215 [inline]
>>  memfd_fcntl+0x33d/0x4a0 mm/memfd.c:247
>>  do_fcntl+0x589/0xeb0 fs/fcntl.c:421
>>  __do_sys_fcntl fs/fcntl.c:463 [inline]
>>  __se_sys_fcntl fs/fcntl.c:448 [inline]
>>  __x64_sys_fcntl+0x12d/0x180 fs/fcntl.c:448
>>  do_syscall_64+0xc8/0x580 arch/x86/entry/common.c:293
>>
>> The problem does not occur in mainline due to the XArray rewrite which
>> changed the locking to exclude modification of the tree during iteration.
>> At the time, nobody realised this was a bugfix. Backport the locking
>> changes to stable.
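
For readers following the locking change described above, here is a minimal userspace sketch of the same pattern: hold the lock across the whole walk so nothing can be deleted between inspecting an entry and tagging it, and release it briefly every 1024 entries so the walk does not monopolise the lock. This is an analogy only, not the kernel code. The pthread mutex stands in for xa_lock_irq(), sched_yield() for cond_resched(), a flat array for the radix tree, and the names struct entry, struct table and tag_pins() are hypothetical. Only the batch size of 1024 is taken from the patch.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

#define BATCH 1024 /* batch size used by the patch; all else is hypothetical */

struct entry {
	int refs;   /* stand-in for page_count(page) */
	int maps;   /* stand-in for page_mapcount(page) */
	int pinned; /* stand-in for the MEMFD_TAG_PINNED tag */
};

struct table {
	pthread_mutex_t lock; /* stand-in for the i_pages lock */
	struct entry *slots;
	size_t nr;
};

/*
 * Tag every entry that holds references beyond its mappings.
 * The lock is held across both the check and the tagging, and is only
 * released at batch boundaries. Returns the number of entries tagged.
 */
size_t tag_pins(struct table *t)
{
	size_t visited = 0, tagged = 0;

	pthread_mutex_lock(&t->lock);
	for (size_t i = 0; i < t->nr; i++) {
		struct entry *e = &t->slots[i];

		if (e->refs - e->maps > 1) {
			e->pinned = 1;
			tagged++;
		}

		if (++visited % BATCH)
			continue;

		/* Batch boundary: let other lock users and tasks run. */
		pthread_mutex_unlock(&t->lock);
		sched_yield();
		pthread_mutex_lock(&t->lock);
	}
	pthread_mutex_unlock(&t->lock);

	return tagged;
}
```

An array index stays meaningful across the unlock/relock window, which is why this sketch needs nothing like radix_tree_iter_resume(); the kernel iterator holds a pointer into the tree and has to be reset before the lock is dropped.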
>>
>> Cc: stable@vger.kernel.org
>> Reported-by: zhong jiang
>> Signed-off-by: Matthew Wilcox (Oracle)
>> ---
>>  mm/memfd.c | 18 ++++++++++--------
>>  1 file changed, 10 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/memfd.c b/mm/memfd.c
>> index 2bb5e257080e..5859705dafe1 100644
>> --- a/mm/memfd.c
>> +++ b/mm/memfd.c
>> @@ -34,11 +34,12 @@ static void memfd_tag_pins(struct address_space *mapping)
>>  	void __rcu **slot;
>>  	pgoff_t start;
>>  	struct page *page;
>> +	unsigned int tagged = 0;
>>
>>  	lru_add_drain();
>>  	start = 0;
>> -	rcu_read_lock();
>>
>> +	xa_lock_irq(&mapping->i_pages);
>>  	radix_tree_for_each_slot(slot, &mapping->i_pages, &iter, start) {
>>  		page = radix_tree_deref_slot(slot);
>>  		if (!page || radix_tree_exception(page)) {
>> @@ -47,18 +48,19 @@ static void memfd_tag_pins(struct address_space *mapping)
>>  				continue;
>>  			}
>>  		} else if (page_count(page) - page_mapcount(page) > 1) {
>> -			xa_lock_irq(&mapping->i_pages);
>>  			radix_tree_tag_set(&mapping->i_pages, iter.index,
>>  					   MEMFD_TAG_PINNED);
>> -			xa_unlock_irq(&mapping->i_pages);
>>  		}
>>
>> -		if (need_resched()) {
>> -			slot = radix_tree_iter_resume(slot, &iter);
>> -			cond_resched_rcu();
>> -		}
>> +		if (++tagged % 1024)
>> +			continue;
>> +
>> +		slot = radix_tree_iter_resume(slot, &iter);
>> +		xa_unlock_irq(&mapping->i_pages);
>> +		cond_resched();
>> +		xa_lock_irq(&mapping->i_pages);
>>  	}
>> -	rcu_read_unlock();
>> +	xa_unlock_irq(&mapping->i_pages);
>>  }
>>
>>  /*
>The patch looks good to me. thanks for your review and efforts.
>
>Sasha, The patch was correct, It should go into stable instead of my patch.

I've queued up this series for all respective branches (fixing up 4.19),
thanks!

-- 
Thanks,
Sasha