linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Kevin Brodsky <kevin.brodsky@arm.com>
To: David Hildenbrand <david@redhat.com>,
	Alexander Gordeev <agordeev@linux.ibm.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andreas Larsson <andreas@gaisler.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Borislav Petkov <bp@alien8.de>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Christophe Leroy <christophe.leroy@csgroup.eu>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"David S. Miller" <davem@davemloft.net>,
	"H. Peter Anvin" <hpa@zytor.com>, Ingo Molnar <mingo@redhat.com>,
	Jann Horn <jannh@google.com>, Juergen Gross <jgross@suse.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
	Madhavan Srinivasan <maddy@linux.ibm.com>,
	Michael Ellerman <mpe@ellerman.id.au>,
	Michal Hocko <mhocko@suse.com>, Mike Rapoport <rppt@kernel.org>,
	Nicholas Piggin <npiggin@gmail.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Ryan Roberts <ryan.roberts@arm.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	Vlastimil Babka <vbabka@suse.cz>, Will Deacon <will@kernel.org>,
	Yeoreum Yun <yeoreum.yun@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	linuxppc-dev@lists.ozlabs.org, sparclinux@vger.kernel.org,
	xen-devel@lists.xenproject.org
Subject: Re: [PATCH v2 2/7] mm: introduce local state for lazy_mmu sections
Date: Tue, 9 Sep 2025 16:02:04 +0200	[thread overview]
Message-ID: <5681b377-baa7-4cd4-8e23-7314d58a7b5b@arm.com> (raw)
In-Reply-To: <47ee1df7-1602-4200-af94-475f84ca8d80@arm.com>

On 09/09/2025 15:49, Kevin Brodsky wrote:
> On 09/09/2025 13:54, David Hildenbrand wrote:
>> On 09.09.25 13:45, Alexander Gordeev wrote:
>>> On Tue, Sep 09, 2025 at 12:09:48PM +0200, David Hildenbrand wrote:
>>>> On 09.09.25 11:40, Alexander Gordeev wrote:
>>>>> On Tue, Sep 09, 2025 at 11:07:36AM +0200, David Hildenbrand wrote:
>>>>>> On 08.09.25 09:39, Kevin Brodsky wrote:
>>>>>>> arch_{enter,leave}_lazy_mmu_mode() currently have a stateless API
>>>>>>> (taking and returning no value). This is proving problematic in
>>>>>>> situations where leave() needs to restore some context back to its
>>>>>>> original state (before enter() was called). In particular, this
>>>>>>> makes it difficult to support the nesting of lazy_mmu sections -
>>>>>>> leave() does not know whether the matching enter() call occurred
>>>>>>> while lazy_mmu was already enabled, and whether to disable it or
>>>>>>> not.
>>>>>>>
>>>>>>> This patch gives all architectures the chance to store local state
>>>>>>> while inside a lazy_mmu section by making enter() return some value,
>>>>>>> storing it in a local variable, and having leave() take that value.
>>>>>>> That value is typed lazy_mmu_state_t - each architecture defining
>>>>>>> __HAVE_ARCH_ENTER_LAZY_MMU_MODE is free to define it as it sees fit.
>>>>>>> For now we define it as int everywhere, which is sufficient to
>>>>>>> support nesting.
>>>>> ...
>>>>>>> {
>>>>>>> + lazy_mmu_state_t lazy_mmu_state;
>>>>>>> ...
>>>>>>> - arch_enter_lazy_mmu_mode();
>>>>>>> + lazy_mmu_state = arch_enter_lazy_mmu_mode();
>>>>>>> ...
>>>>>>> - arch_leave_lazy_mmu_mode();
>>>>>>> + arch_leave_lazy_mmu_mode(lazy_mmu_state);
>>>>>>> ...
>>>>>>> }
>>>>>>>
>>>>>>> * In a few cases (e.g. xen_flush_lazy_mmu()), a function knows that
>>>>>>>      lazy_mmu is already enabled, and it temporarily disables it by
>>>>>>>      calling leave() and then enter() again. Here we want to ensure
>>>>>>>      that any operation between the leave() and enter() calls is
>>>>>>>      completed immediately; for that reason we pass
>>>>>>> LAZY_MMU_DEFAULT to
>>>>>>>      leave() to fully disable lazy_mmu. enter() will then
>>>>>>> re-enable it
>>>>>>>      - this achieves the expected behaviour, whether nesting
>>>>>>> occurred
>>>>>>>      before that function was called or not.
>>>>>>>
>>>>>>> Note: it is difficult to provide a default definition of
>>>>>>> lazy_mmu_state_t for architectures implementing lazy_mmu, because
>>>>>>> that definition would need to be available in
>>>>>>> arch/x86/include/asm/paravirt_types.h and adding a new generic
>>>>>>>     #include there is very tricky due to the existing header soup.
>>>>>> Yeah, I was wondering about exactly that.
>>>>>>
>>>>>> In particular because LAZY_MMU_DEFAULT etc resides somewehere
>>>>>> compeltely
>>>>>> different.
>>>>>>
>>>>>> Which raises the question: is using a new type really of any
>>>>>> benefit here?
>>>>>>
>>>>>> Can't we just use an "enum lazy_mmu_state" and call it a day?
>>>>> I could envision something completely different for this type on s390,
>>>>> e.g. a pointer to a per-cpu structure. So I would really ask to stick
>>>>> with the current approach.
> This is indeed the motivation - let every arch do whatever it sees fit.
> lazy_mmu_state_t is basically an opaque type as far as generic code is
> concerned, which also means that this API change is the first and last
> one we need (famous last words, I know). 
>
> I mentioned in the cover letter that the pkeys-based page table
> protection series [1] would have an immediate use for lazy_mmu_state_t.
> In that proposal, any helper writing to pgtables needs to modify the
> pkey register and then restore it. To reduce the overhead, lazy_mmu is
> used to set the pkey register only once in enter(), and then restore it
> in leave() [2]. This currently relies on storing the original pkey
> register value in thread_struct, which is suboptimal and most
> importantly doesn't work if lazy_mmu sections nest. With this series, we
> could instead store the pkey register value in lazy_mmu_state_t
> (enlarging it to 64 bits or more).

Forgot the references, sorry...

[1]
https://lore.kernel.org/linux-hardening/20250815085512.2182322-1-kevin.brodsky@arm.com/
[2]
https://lore.kernel.org/linux-hardening/20250815085512.2182322-19-kevin.brodsky@arm.com/

> I also considered going further and making lazy_mmu_state_t a pointer as
> Alexander suggested - more complex to manage, but also a lot more flexible.
>
>>>> Would that integrate well with LAZY_MMU_DEFAULT etc?
>>> Hmm... I though the idea is to use LAZY_MMU_* by architectures that
>>> want to use it - at least that is how I read the description above.
>>>
>>> It is only kasan_populate|depopulate_vmalloc_pte() in generic code
>>> that do not follow this pattern, and it looks as a problem to me.
> This discussion also made me realise that this is problematic, as the
> LAZY_MMU_{DEFAULT,NESTED} macros were meant only for architectures'
> convenience, not for generic code (where lazy_mmu_state_t should ideally
> be an opaque type as mentioned above). It almost feels like the kasan
> case deserves a different API, because this is not how enter() and
> leave() are meant to be used. This would mean quite a bit of churn
> though, so maybe just introduce another arch-defined value to pass to
> leave() for such a situation - for instance,
> arch_leave_lazy_mmu_mode(LAZY_MMU_FLUSH)?
>
>> Yes, that's why I am asking.
>>
>> What kind of information (pointer to a per-cpu structure) would you
>> want to return, and would handling it similar to how
>> pagefault_disable()/pagefault_enable() e.g., using a variable in
>> "current" to track the nesting level avoid having s390x to do that?
> The pagefault_disabled approach works fine for simple use-cases, but it
> doesn't scale well. The space allocated in task_struct/thread_struct to
> track that state is wasted (unused) most of the time. Worse, it does not
> truly enable states to be nested: it allows the outermost section to
> store some state, but nested sections cannot allocate extra space. This
> is really what the stack is for.
>
> - Kevin


  reply	other threads:[~2025-09-09 14:02 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-08  7:39 [PATCH v2 0/7] Nesting support for lazy MMU mode Kevin Brodsky
2025-09-08  7:39 ` [PATCH v2 1/7] mm: remove arch_flush_lazy_mmu_mode() Kevin Brodsky
2025-09-08  9:29   ` Yeoreum Yun
2025-09-09  9:00   ` David Hildenbrand
2025-09-08  7:39 ` [PATCH v2 2/7] mm: introduce local state for lazy_mmu sections Kevin Brodsky
2025-09-08  9:30   ` Yeoreum Yun
2025-09-09  5:40   ` Andrew Morton
2025-09-09  9:05     ` Kevin Brodsky
2025-09-09  9:07   ` David Hildenbrand
2025-09-09  9:40     ` Alexander Gordeev
2025-09-09 10:09       ` David Hildenbrand
2025-09-09 11:45         ` Alexander Gordeev
2025-09-09 11:54           ` David Hildenbrand
2025-09-09 13:49             ` Kevin Brodsky
2025-09-09 14:02               ` Kevin Brodsky [this message]
2025-09-09 14:28               ` David Hildenbrand
2025-09-10 15:16                 ` Kevin Brodsky
2025-09-10 15:37                   ` David Hildenbrand
2025-09-11 16:19                     ` Kevin Brodsky
2025-09-11 18:14                       ` David Hildenbrand
2025-09-12  7:26                         ` Kevin Brodsky
2025-09-12  8:04                           ` David Hildenbrand
2025-09-12  8:48                             ` Kevin Brodsky
2025-09-12  8:55                               ` David Hildenbrand
2025-09-12 12:37                                 ` Alexander Gordeev
2025-09-12 12:40                                   ` David Hildenbrand
2025-09-12 12:56                                     ` Alexander Gordeev
2025-09-12 13:02                                       ` David Hildenbrand
2025-09-12 14:05                                         ` Alexander Gordeev
2025-09-12 14:25                                           ` David Hildenbrand
2025-09-12 15:02                                             ` Kevin Brodsky
2025-09-09 14:38               ` Alexander Gordeev
2025-09-10 16:11                 ` Kevin Brodsky
2025-09-11 12:06                   ` Alexander Gordeev
2025-09-11 16:20                     ` Kevin Brodsky
2025-09-09 10:57     ` Juergen Gross
2025-09-09 14:15       ` Kevin Brodsky
2025-09-09 10:08   ` Jürgen Groß
2025-09-08  7:39 ` [PATCH v2 3/7] arm64: mm: fully support nested " Kevin Brodsky
2025-09-08  9:30   ` Yeoreum Yun
2025-09-08  7:39 ` [PATCH v2 4/7] x86/xen: support nested lazy_mmu sections (again) Kevin Brodsky
2025-09-09  9:13   ` David Hildenbrand
2025-09-09  9:37     ` Jürgen Groß
2025-09-09  9:56       ` David Hildenbrand
2025-09-09 11:28         ` Kevin Brodsky
2025-09-09  9:42   ` Jürgen Groß
2025-09-08  7:39 ` [PATCH v2 5/7] powerpc/mm: support nested lazy_mmu sections Kevin Brodsky
2025-09-08  7:39 ` [PATCH v2 6/7] sparc/mm: " Kevin Brodsky
2025-09-08  7:39 ` [PATCH v2 7/7] mm: update lazy_mmu documentation Kevin Brodsky
2025-09-08  9:30   ` Yeoreum Yun
2025-09-08 16:56 ` [PATCH v2 0/7] Nesting support for lazy MMU mode Lorenzo Stoakes
2025-09-09  9:10   ` Kevin Brodsky
2025-09-09  2:16 ` Andrew Morton
2025-09-09  9:21   ` David Hildenbrand
2025-09-09 13:59     ` Kevin Brodsky
2025-09-12 15:25     ` Kevin Brodsky
2025-09-15  6:28       ` Alexander Gordeev
2025-09-15 11:19         ` Kevin Brodsky

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5681b377-baa7-4cd4-8e23-7314d58a7b5b@arm.com \
    --to=kevin.brodsky@arm.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=agordeev@linux.ibm.com \
    --cc=akpm@linux-foundation.org \
    --cc=andreas@gaisler.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=catalin.marinas@arm.com \
    --cc=christophe.leroy@csgroup.eu \
    --cc=dave.hansen@linux.intel.com \
    --cc=davem@davemloft.net \
    --cc=david@redhat.com \
    --cc=hpa@zytor.com \
    --cc=jannh@google.com \
    --cc=jgross@suse.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=maddy@linux.ibm.com \
    --cc=mhocko@suse.com \
    --cc=mingo@redhat.com \
    --cc=mpe@ellerman.id.au \
    --cc=npiggin@gmail.com \
    --cc=peterz@infradead.org \
    --cc=rppt@kernel.org \
    --cc=ryan.roberts@arm.com \
    --cc=sparclinux@vger.kernel.org \
    --cc=surenb@google.com \
    --cc=tglx@linutronix.de \
    --cc=vbabka@suse.cz \
    --cc=will@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    --cc=yeoreum.yun@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox