From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 25 Jan 2024 05:20:44 -0800
From: "Paul E. McKenney"
To: Marco Elver
Cc: Alexander Potapenko, quic_charante@quicinc.com, akpm@linux-foundation.org,
	aneesh.kumar@linux.ibm.com, dan.j.williams@intel.com, david@redhat.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	mgorman@techsingularity.net, osalvador@suse.de, vbabka@suse.cz,
	Dmitry Vyukov, kasan-dev@googlegroups.com, Ilya Leoshkevich,
	Nicholas Miehlbradt, rcu@vger.kernel.org
Subject: Re: [PATCH] mm/sparsemem: fix race in accessing memory_section->usage
Message-ID: <9d94958c-7ab3-4f0d-a718-1f72c1467925@paulmck-laptop>
Reply-To: paulmck@kernel.org
References: <1697202267-23600-1-git-send-email-quic_charante@quicinc.com>
	<20240115184430.2710652-1-glider@google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii

On Thu, Jan 18, 2024 at 10:43:06AM +0100, Marco Elver wrote:
> On Thu, Jan 18, 2024 at 10:01AM +0100, Alexander Potapenko wrote:
> > >
> > > Hrm, rcu_read_unlock_sched_notrace() can still call
> > > __preempt_schedule_notrace(), which is again instrumented by KMSAN.
> > >
> > > This patch gets me a working kernel:
> >
> > [...]
> > > Disabling interrupts is a little heavy handed - it also assumes the
> > > current RCU implementation. There is
> > > preempt_enable_no_resched_notrace(), but that might be worse because it
> > > breaks scheduling guarantees.
> > >
> > > That being said, whatever we do here should be wrapped in some
> > > rcu_read_lock/unlock_() helper.
> >
> > We could as well redefine rcu_read_lock/unlock in mm/kmsan/shadow.c
> > (or the x86-specific KMSAN header, depending on whether people are
> > seeing the problem on s390 and Power) with some header magic.
> > But that's probably more fragile than adding a helper.
> > >
> > > Is there an existing helper we can use? If not, we need a variant that
> > > can be used from extremely constrained contexts that can't even call
> > > into the scheduler.
> > > And if we want pfn_valid() to switch to it, it also
> > > should be fast.
>
> The below patch also gets me a working kernel. For pfn_valid(), using
> rcu_read_lock_sched() should be reasonable, given its critical section
> is very small and also enables it to be called from more constrained
> contexts again (like KMSAN).
>
> Within KMSAN we also have to suppress reschedules. This is again not
> ideal, but since it's limited to KMSAN should be tolerable.
>
> WDYT?

I like this one better from a purely selfish RCU perspective.  ;-)

							Thanx, Paul

> ------ >8 ------
>
> diff --git a/arch/x86/include/asm/kmsan.h b/arch/x86/include/asm/kmsan.h
> index 8fa6ac0e2d76..bbb1ba102129 100644
> --- a/arch/x86/include/asm/kmsan.h
> +++ b/arch/x86/include/asm/kmsan.h
> @@ -64,6 +64,7 @@ static inline bool kmsan_virt_addr_valid(void *addr)
>  {
>  	unsigned long x = (unsigned long)addr;
>  	unsigned long y = x - __START_KERNEL_map;
> +	bool ret;
>  
>  	/* use the carry flag to determine if x was < __START_KERNEL_map */
>  	if (unlikely(x > y)) {
> @@ -79,7 +80,21 @@ static inline bool kmsan_virt_addr_valid(void *addr)
>  		return false;
>  	}
>  
> -	return pfn_valid(x >> PAGE_SHIFT);
> +	/*
> +	 * pfn_valid() relies on RCU, and may call into the scheduler on exiting
> +	 * the critical section. However, this would result in recursion with
> +	 * KMSAN. Therefore, disable preemption here, and re-enable preemption
> +	 * below while suppressing reschedules to avoid recursion.
> +	 *
> +	 * Note, this sacrifices occasionally breaking scheduling guarantees.
> +	 * Although, a kernel compiled with KMSAN has already given up on any
> +	 * performance guarantees due to being heavily instrumented.
> +	 */
> +	preempt_disable();
> +	ret = pfn_valid(x >> PAGE_SHIFT);
> +	preempt_enable_no_resched();
> +
> +	return ret;
>  }
>  
>  #endif /* !MODULE */
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 4ed33b127821..a497f189d988 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -2013,9 +2013,9 @@ static inline int pfn_valid(unsigned long pfn)
>  	if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
>  		return 0;
>  	ms = __pfn_to_section(pfn);
> -	rcu_read_lock();
> +	rcu_read_lock_sched();
>  	if (!valid_section(ms)) {
> -		rcu_read_unlock();
> +		rcu_read_unlock_sched();
>  		return 0;
>  	}
>  	/*
> @@ -2023,7 +2023,7 @@ static inline int pfn_valid(unsigned long pfn)
>  	 * the entire section-sized span.
>  	 */
>  	ret = early_section(ms) || pfn_section_valid(ms, pfn);
> -	rcu_read_unlock();
> +	rcu_read_unlock_sched();
>  
>  	return ret;
>  }