Date: Mon, 12 May 2025 13:00:05 +0100
Subject: Re: [PATCH] arm64/mm: Disable barrier batching in interrupt contexts
To: David Hildenbrand, Catalin Marinas, Will Deacon, Pasha Tatashin,
 Andrew Morton, Uladzislau Rezki, Christoph Hellwig,
 "Matthew Wilcox (Oracle)", Mark Rutland, Anshuman Khandual,
 Alexandre Ghiti, Kevin Brodsky
Cc: linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org,
 syzbot+5c0d9392e042f41d45c5@syzkaller.appspotmail.com
From: Ryan Roberts <ryan.roberts@arm.com>
In-Reply-To: <001dfd4f-27f2-407f-bd1c-21928a754342@redhat.com>
References: <20250512102242.4156463-1-ryan.roberts@arm.com>
 <001dfd4f-27f2-407f-bd1c-21928a754342@redhat.com>

On 12/05/2025 12:07, David Hildenbrand wrote:
> On 12.05.25 12:22, Ryan Roberts wrote:
>> Commit 5fdd05efa1cd ("arm64/mm: Batch barriers when updating kernel
>> mappings") enabled arm64 kernels to track "lazy mmu mode" using TIF
>> flags in order to defer barriers until exiting the mode. At the same
>> time, it added warnings to check that pte manipulations were never
>> performed in interrupt context, because the tracking implementation
>> could not deal with nesting.
>>
>> But it turns out that some debug features (e.g. KFENCE, DEBUG_PAGEALLOC)
>> do manipulate ptes in softirq context, which triggered the warnings.
>>
>> So let's take the simplest and safest route and disable the batching
>> optimization in interrupt contexts. This makes these users no worse off
>> than prior to the optimization. Additionally, the known offenders are
>> debug features that only manipulate a single PTE, so there is no
>> performance gain anyway.
>>
>> There may be some obscure case of encrypted/decrypted DMA with
>> dma_free_coherent() called from an interrupt context, but again, this
>> is no worse off than prior to the commit.
>>
>> Some options for supporting nesting were considered, but there is a
>> difficult-to-solve problem if any code manipulates ptes within
>> interrupt context but *outside of* a lazy mmu region. If this case
>> exists, the code would expect the updates to be immediate, but because
>> the task context may have already been in lazy mmu mode, the updates
>> would be deferred, which could cause incorrect behaviour. This problem
>> is avoided by always ensuring updates within interrupt context are
>> immediate.
>>
>> Fixes: 5fdd05efa1cd ("arm64/mm: Batch barriers when updating kernel mappings")
>> Reported-by: syzbot+5c0d9392e042f41d45c5@syzkaller.appspotmail.com
>> Closes: https://lore.kernel.org/linux-arm-kernel/681f2a09.050a0220.f2294.0006.GAE@google.com/
>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>>
>> Hi Will,
>>
>> I've tested before and after with KFENCE enabled and it solves the
>> issue. I've also run all the mm-selftests, which all continue to pass.
>>
>> Catalin suggested that a Fixes patch targeting the SHA as it is in
>> for-next/mm was the preferred approach, but shout if you want something
>> different. I'm hoping that with this fix we can still make it for this
>> cycle, subject to not finding any more issues.
>>
>> Thanks,
>> Ryan
>>
>>
>>   arch/arm64/include/asm/pgtable.h | 16 ++++++++++++++--
>>   1 file changed, 14 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
>> index ab4a1b19e596..e65083ec35cb 100644
>> --- a/arch/arm64/include/asm/pgtable.h
>> +++ b/arch/arm64/include/asm/pgtable.h
>> @@ -64,7 +64,11 @@ static inline void queue_pte_barriers(void)
>>  {
>>      unsigned long flags;
>>
>> -    VM_WARN_ON(in_interrupt());
>> +    if (in_interrupt()) {
>> +        emit_pte_barriers();
>> +        return;
>> +    }
>> +
>>      flags = read_thread_flags();
>>
>>      if (flags & BIT(TIF_LAZY_MMU)) {
>> @@ -79,7 +83,9 @@ static inline void queue_pte_barriers(void)
>>  #define  __HAVE_ARCH_ENTER_LAZY_MMU_MODE
>>  static inline void arch_enter_lazy_mmu_mode(void)
>>  {
>> -    VM_WARN_ON(in_interrupt());
>> +    if (in_interrupt())
>> +        return;
>> +
>>      VM_WARN_ON(test_thread_flag(TIF_LAZY_MMU));
>>
>>      set_thread_flag(TIF_LAZY_MMU);
>> @@ -87,12 +93,18 @@ static inline void arch_enter_lazy_mmu_mode(void)
>>
>>  static inline void arch_flush_lazy_mmu_mode(void)
>>  {
>> +    if (in_interrupt())
>> +        return;
>> +
>>      if (test_and_clear_thread_flag(TIF_LAZY_MMU_PENDING))
>>          emit_pte_barriers();
>>  }
>>
>>  static inline void arch_leave_lazy_mmu_mode(void)
>>  {
>> +    if (in_interrupt())
>> +        return;
>> +
>>      arch_flush_lazy_mmu_mode();
>>      clear_thread_flag(TIF_LAZY_MMU);
>>  }
>
> I guess in all cases we could optimize out the in_interrupt() check on
> !debug configs.

I think that assumes we can easily and accurately identify all configs that
cause this? We've identified two, but I'm not confident that's the full list.
Also, KFENCE isn't really a debug config (despite me calling it that in the
commit log) - it's supposed to be something that can be enabled in production
builds.

> Hm, maybe there is an elegant way to catch all of these "problematic" users?

I'm all ears if you have any suggestions? :)

It actually looks like x86/XEN tries to solve this problem in a similar way:

enum xen_lazy_mode xen_get_lazy_mode(void)
{
	if (in_interrupt())
		return XEN_LAZY_NONE;

	return this_cpu_read(xen_lazy_mode);
}

Although I'm not convinced it's fully robust. It also has:

static inline void enter_lazy(enum xen_lazy_mode mode)
{
	BUG_ON(this_cpu_read(xen_lazy_mode) != XEN_LAZY_NONE);

	this_cpu_write(xen_lazy_mode, mode);
}

which is called as part of its arch_enter_lazy_mmu_mode() implementation. If a
task was already in lazy mmu mode when an interrupt comes in and causes the
nested arch_enter_lazy_mmu_mode() that we saw in this bug report, surely that
BUG_ON() should trigger?
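
In case a concrete illustration helps anyone following along: below is a
standalone user-space sketch of the original hazard (not kernel code; the
TIF_* names and queue_pte_barriers() mirror the patch, but the flag storage,
the in_irq variable and the main() harness are simulated purely for
illustration). It shows why an update made in interrupt context has to emit
its barriers immediately when the interrupted task may already be in lazy
mmu mode:

/*
 * User-space sketch of the hazard; NOT kernel code. The TIF_* names and
 * queue_pte_barriers() mirror the patch, everything else is simulated.
 */
#include <stdbool.h>
#include <stdio.h>

#define TIF_LAZY_MMU		0
#define TIF_LAZY_MMU_PENDING	1

static unsigned long thread_flags;	/* stands in for the task's TIF flags */
static bool in_irq;			/* stands in for in_interrupt() */
static int barriers_emitted;

static void emit_pte_barriers(void)
{
	barriers_emitted++;
}

static void queue_pte_barriers(void)
{
	/* The fix: interrupt context never defers, it emits immediately. */
	if (in_irq) {
		emit_pte_barriers();
		return;
	}

	if (thread_flags & (1UL << TIF_LAZY_MMU))
		thread_flags |= 1UL << TIF_LAZY_MMU_PENDING;	/* defer */
	else
		emit_pte_barriers();
}

int main(void)
{
	/* Task context enters lazy mmu mode... */
	thread_flags |= 1UL << TIF_LAZY_MMU;

	/* ...then a softirq (e.g. KFENCE) updates a pte and needs barriers. */
	in_irq = true;
	queue_pte_barriers();
	in_irq = false;

	/*
	 * Without the in_irq escape above, the softirq's barriers would
	 * have been deferred onto the interrupted task's pending flag,
	 * i.e. this would print 0 instead of 1.
	 */
	printf("barriers emitted in interrupt context: %d\n", barriers_emitted);
	return 0;
}

Thanks,
Ryan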