From: Alexander Gordeev <agordeev@linux.ibm.com>
To: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Ritesh Harjani <ritesh.list@gmail.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Andreas Larsson <andreas@gaisler.com>,
Andrew Morton <akpm@linux-foundation.org>,
Boris Ostrovsky <boris.ostrovsky@oracle.com>,
Borislav Petkov <bp@alien8.de>,
Catalin Marinas <catalin.marinas@arm.com>,
Christophe Leroy <christophe.leroy@csgroup.eu>,
Dave Hansen <dave.hansen@linux.intel.com>,
David Hildenbrand <david@redhat.com>,
"David S. Miller" <davem@davemloft.net>,
David Woodhouse <dwmw2@infradead.org>,
"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
Jann Horn <jannh@google.com>, Juergen Gross <jgross@suse.com>,
"Liam R. Howlett" <Liam.Howlett@oracle.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Madhavan Srinivasan <maddy@linux.ibm.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Michal Hocko <mhocko@suse.com>, Mike Rapoport <rppt@kernel.org>,
Nicholas Piggin <npiggin@gmail.com>,
Peter Zijlstra <peterz@infradead.org>,
Ryan Roberts <ryan.roberts@arm.com>,
Suren Baghdasaryan <surenb@google.com>,
Thomas Gleixner <tglx@linutronix.de>,
Vlastimil Babka <vbabka@suse.cz>, Will Deacon <will@kernel.org>,
Yeoreum Yun <yeoreum.yun@arm.com>,
linux-arm-kernel@lists.infradead.org,
linuxppc-dev@lists.ozlabs.org, sparclinux@vger.kernel.org,
xen-devel@lists.xenproject.org, x86@kernel.org
Subject: Re: [PATCH v4 07/12] mm: enable lazy_mmu sections to nest
Date: Thu, 6 Nov 2025 16:33:26 +0100
Message-ID: <d5435e75-036b-44a5-a989-722e13f94b3e-agordeev@linux.ibm.com>
In-Reply-To: <48a4ecb5-3412-4d3f-9e43-535f8bee505f@arm.com>
On Thu, Nov 06, 2025 at 10:51:43AM +0000, Kevin Brodsky wrote:
> On 05/11/2025 16:12, Alexander Gordeev wrote:
> > On Wed, Nov 05, 2025 at 02:19:03PM +0530, Ritesh Harjani wrote:
> >>> + * in_lazy_mmu_mode() can be used to check whether the lazy MMU mode is
> >>> + * currently enabled.
> >>> */
> >>> #ifdef CONFIG_ARCH_HAS_LAZY_MMU_MODE
> >>> static inline void lazy_mmu_mode_enable(void)
> >>> {
> >>> -	arch_enter_lazy_mmu_mode();
> >>> +	struct lazy_mmu_state *state = &current->lazy_mmu_state;
> >>> +
> >>> +	VM_WARN_ON_ONCE(state->nesting_level == U8_MAX);
> >>> +	/* enable() must not be called while paused */
> >>> +	VM_WARN_ON(state->nesting_level > 0 && !state->active);
> >>> +
> >>> +	if (state->nesting_level++ == 0) {
> >>> +		state->active = true;
> >>> +		arch_enter_lazy_mmu_mode();
> >>> +	}
> >>> }
> >> Some architectures disable preemption in their
> >> arch_enter_lazy_mmu_mode(). So shouldn't state->active = true
> >> happen after arch_enter_lazy_mmu_mode() has disabled preemption? i.e.
> > Do you have some scenario in mind that could cause an issue?
> > IOW, what could go wrong if the process is scheduled to another
> > CPU before preempt_disable() is called?
>
> I'm not sure I understand the issue either.
>
> >> static inline void lazy_mmu_mode_enable(void)
> >> {
> >> -	arch_enter_lazy_mmu_mode();
> >> +	struct lazy_mmu_state *state = &current->lazy_mmu_state;
> >> +
> >> +	VM_WARN_ON_ONCE(state->nesting_level == U8_MAX);
> >> +	/* enable() must not be called while paused */
> >> +	VM_WARN_ON(state->nesting_level > 0 && !state->active);
> >> +
> >> +	if (state->nesting_level++ == 0) {
> >> +		arch_enter_lazy_mmu_mode();
> >> +		state->active = true;
> >> +	}
> >> }
> >>
> >> ... I think it makes more sense to enable the state after the arch_**
> >> call, right?
> > But then in_lazy_mmu_mode() would return false if called from
> > arch_enter_lazy_mmu_mode(). Not a big problem, but still...
>
> The ordering of nesting_level/active was the way you expected in v3, but
> the conclusion of the discussion with David H [1] is that it doesn't
> really matter so I simplified the ordering in v4 - the arch hooks
> shouldn't call in_lazy_mmu_mode() or inspect lazy_mmu_state.
> arch_enter()/arch_leave() shouldn't need it anyway since they're called
> once per outer section (not in nested sections). arch_flush() could
> potentially do something different when nested, but that seems unlikely.
>
> - Kevin
>
> [1]
> https://lore.kernel.org/all/af4414b6-617c-4dc8-bddc-3ea00d1f6f3b@redhat.com/
I might be misunderstanding this conversation, but it looked to me like a
discussion about the lazy_mmu_state::nesting_level value, not
lazy_mmu_state::active.
I do use the in_lazy_mmu_mode() (lazy_mmu_state::active) check from the arch
callbacks. Here is an example (and likely the only case so far) where it hits:
static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
				      void *_data)
{
	lazy_mmu_mode_pause();
	...
	if (likely(pte_none(ptep_get(ptep)))) {
		/* Here set_pte() checks whether we are in lazy_mmu mode */
		set_pte_at(&init_mm, addr, ptep, pte);	<--- calls set_pte()
		data->pages[index] = NULL;
	}
	...
	lazy_mmu_mode_resume();
	...
}
So without the in_lazy_mmu_mode() check mentioned above, the arch-specific
set_pte() implementation takes the wrong branch, which ends up in:
[ 394.503134] Call Trace:
[ 394.503137] [<00007fffe01333f4>] dump_stack_lvl+0xbc/0xf0
[ 394.503143] [<00007fffe010298c>] vpanic+0x1cc/0x418
[ 394.503149] [<00007fffe0102c7a>] panic+0xa2/0xa8
[ 394.503154] [<00007fffe01e7a8a>] check_panic_on_warn+0x8a/0xb0
[ 394.503160] [<00007fffe082d122>] end_report+0x72/0x110
[ 394.503166] [<00007fffe082d3e6>] kasan_report+0xc6/0x100
[ 394.503171] [<00007fffe01b9556>] ipte_batch_ptep_get+0x146/0x150
[ 394.503176] [<00007fffe0830096>] kasan_populate_vmalloc_pte+0xe6/0x1e0
[ 394.503183] [<00007fffe0718050>] apply_to_pte_range+0x1a0/0x570
[ 394.503189] [<00007fffe07260fa>] __apply_to_page_range+0x3ca/0x8f0
[ 394.503195] [<00007fffe0726648>] apply_to_page_range+0x28/0x40
[ 394.503201] [<00007fffe082fe34>] __kasan_populate_vmalloc+0x324/0x340
[ 394.503207] [<00007fffe076954e>] alloc_vmap_area+0x31e/0xbf0
[ 394.503213] [<00007fffe0770106>] __get_vm_area_node+0x1a6/0x2d0
[ 394.503218] [<00007fffe07716fa>] __vmalloc_node_range_noprof+0xba/0x260
[ 394.503224] [<00007fffe0771970>] __vmalloc_node_noprof+0xd0/0x110
[ 394.503229] [<00007fffe0771a22>] vmalloc_noprof+0x32/0x40
[ 394.503234] [<00007fff604eaa42>] full_fit_alloc_test+0xb2/0x3e0 [test_vmalloc]
[ 394.503241] [<00007fff604eb478>] test_func+0x488/0x760 [test_vmalloc]
[ 394.503247] [<00007fffe025ad68>] kthread+0x368/0x630
[ 394.503253] [<00007fffe01391e0>] __ret_from_fork+0xd0/0x490
[ 394.503259] [<00007fffe24e468a>] ret_from_fork+0xa/0x30
I could have cached lazy_mmu_state::active as arch-specific data
and checked it, but then what is the point of having it generalized?
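To make the dependency concrete, here is a minimal userspace sketch of the
pause/resume scenario above. It is not kernel code and rests on assumptions:
the state is a plain static struct rather than current->lazy_mmu_state,
pause/resume are assumed to only toggle 'active' while keeping nesting_level,
and set_pte_model(), struct scenario_result and run_scenario() are
hypothetical names used purely for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for current->lazy_mmu_state from the patch. */
struct lazy_mmu_state {
	uint8_t nesting_level;
	bool active;
};

static struct lazy_mmu_state state;

/* Hypothetical counters: which branch the modeled set_pte() took. */
struct scenario_result {
	int batched;
	int direct;
};

static bool in_lazy_mmu_mode(void)
{
	return state.active;
}

static void lazy_mmu_mode_enable(void)
{
	assert(state.nesting_level != UINT8_MAX);
	if (state.nesting_level++ == 0)
		state.active = true;	/* arch_enter_lazy_mmu_mode() would run here */
}

static void lazy_mmu_mode_disable(void)
{
	assert(state.nesting_level > 0);
	if (--state.nesting_level == 0)
		state.active = false;	/* arch_leave_lazy_mmu_mode() would run here */
}

/* Assumption: pause/resume only toggle 'active' and keep nesting_level. */
static void lazy_mmu_mode_pause(void)
{
	assert(state.nesting_level > 0 && state.active);
	state.active = false;
}

static void lazy_mmu_mode_resume(void)
{
	assert(state.nesting_level > 0 && !state.active);
	state.active = true;
}

/* Models an arch set_pte() that picks its branch from the generic state. */
static void set_pte_model(struct scenario_result *r)
{
	if (in_lazy_mmu_mode())
		r->batched++;
	else
		r->direct++;
}

static struct scenario_result run_scenario(void)
{
	struct scenario_result r = { 0, 0 };

	lazy_mmu_mode_enable();		/* outer lazy MMU section */
	set_pte_model(&r);		/* batched */
	lazy_mmu_mode_pause();		/* callback pauses the mode */
	set_pte_model(&r);		/* must take the direct path */
	lazy_mmu_mode_resume();
	set_pte_model(&r);		/* batched again */
	lazy_mmu_mode_disable();
	return r;
}
```

In this model run_scenario() counts two batched updates and one direct
update. If set_pte_model() could not consult in_lazy_mmu_mode(), the update
made while paused would be treated as batched as well, which corresponds to
the wrong branch described above.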
Thanks!