From: Alexander Gordeev <agordeev@linux.ibm.com>
To: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Albert Ou <aou@eecs.berkeley.edu>,
Andreas Larsson <andreas@gaisler.com>,
Andrew Morton <akpm@linux-foundation.org>,
Catalin Marinas <catalin.marinas@arm.com>,
Dave Hansen <dave.hansen@linux.intel.com>,
"David S. Miller" <davem@davemloft.net>,
Geert Uytterhoeven <geert@linux-m68k.org>,
Linus Walleij <linus.walleij@linaro.org>,
Madhavan Srinivasan <maddy@linux.ibm.com>,
Mark Rutland <mark.rutland@arm.com>,
Matthew Wilcox <willy@infradead.org>,
Michael Ellerman <mpe@ellerman.id.au>,
"Mike Rapoport (IBM)" <rppt@kernel.org>,
Palmer Dabbelt <palmer@dabbelt.com>,
Paul Walmsley <paul.walmsley@sifive.com>,
Peter Zijlstra <peterz@infradead.org>,
Qi Zheng <zhengqi.arch@bytedance.com>,
Ryan Roberts <ryan.roberts@arm.com>,
Will Deacon <will@kernel.org>,
Yang Shi <yang@os.amperecomputing.com>,
linux-arch@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
linux-csky@vger.kernel.org, linux-m68k@lists.linux-m68k.org,
linux-openrisc@vger.kernel.org, linux-riscv@lists.infradead.org,
linux-s390@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
sparclinux@vger.kernel.org, x86@kernel.org
Subject: Re: [PATCH v2 03/12] mm: Call ctor/dtor for kernel PTEs
Date: Fri, 25 Apr 2025 18:35:22 +0200
Message-ID: <aAu5ylJPs+Oa9iQ3@li-008a6a4c-3549-11b2-a85c-c5cc2836eea2.ibm.com>
In-Reply-To: <20250408095222.860601-4-kevin.brodsky@arm.com>
On Tue, Apr 08, 2025 at 10:52:13AM +0100, Kevin Brodsky wrote:
> Since [1], constructors/destructors are expected to be called for
> all page table pages, at all levels and for both user and kernel
> pgtables. There is however one glaring exception: kernel PTEs are
> managed via separate helpers (pte_alloc_kernel/pte_free_kernel),
> which do not call the [cd]tor, at least not in the generic
> implementation.
>
> The most obvious reason for this anomaly is that init_mm is
> special-cased not to use split page table locks. As a result calling
> ptlock_init() for PTEs associated with init_mm would be wasteful,
> potentially resulting in dynamic memory allocation. However, pgtable
> [cd]tors perform other actions - currently related to
> accounting/statistics, and potentially more functionally significant
> in the future.
>
> Now that pagetable_pte_ctor() is passed the associated mm, we can
> make it skip the call to ptlock_init() for init_mm; this allows us
> to call the ctor from pte_alloc_one_kernel() too. This is matched by
> a call to the pgtable destructor in pte_free_kernel(); no
> special-casing is needed on that path, as ptlock_free() is already
> called unconditionally. (ptlock_free() is a no-op unless a ptlock
> was allocated for the given PTP.)
>
> This patch ensures that all architectures that rely on
> <asm-generic/pgalloc.h> call the [cd]tor for kernel PTEs.
> pte_free_kernel() cannot be overridden so changing the generic
> implementation is sufficient. pte_alloc_one_kernel() can be
> overridden using __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL, and a few
> architectures implement it by calling the page allocator directly.
> We amend those so that they call the generic
> __pte_alloc_one_kernel() instead, if possible, ensuring that the
> ctor is called.
>
> A few architectures do not use <asm-generic/pgalloc.h>; those will
> be taken care of separately.
>
> [1] https://lore.kernel.org/linux-mm/20250103184415.2744423-1-kevin.brodsky@arm.com/
>
> Signed-off-by: Kevin Brodsky <kevin.brodsky@arm.com>
> ---
> arch/csky/include/asm/pgalloc.h | 2 +-
> arch/microblaze/mm/pgtable.c | 2 +-
> arch/openrisc/mm/ioremap.c | 2 +-
> include/asm-generic/pgalloc.h | 7 ++++++-
> include/linux/mm.h | 2 +-
> 5 files changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/arch/csky/include/asm/pgalloc.h b/arch/csky/include/asm/pgalloc.h
> index 11055c574968..9ed2b15ffd94 100644
> --- a/arch/csky/include/asm/pgalloc.h
> +++ b/arch/csky/include/asm/pgalloc.h
> @@ -29,7 +29,7 @@ static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
> pte_t *pte;
> unsigned long i;
>
> - pte = (pte_t *) __get_free_page(GFP_KERNEL);
> + pte = __pte_alloc_one_kernel(mm);
> if (!pte)
> return NULL;
>
> diff --git a/arch/microblaze/mm/pgtable.c b/arch/microblaze/mm/pgtable.c
> index 9f73265aad4e..e96dd1b7aba4 100644
> --- a/arch/microblaze/mm/pgtable.c
> +++ b/arch/microblaze/mm/pgtable.c
> @@ -245,7 +245,7 @@ unsigned long iopa(unsigned long addr)
> __ref pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
> {
> if (mem_init_done)
> - return (pte_t *)__get_free_page(GFP_KERNEL | __GFP_ZERO);
> + return __pte_alloc_one_kernel(mm);
> else
> return memblock_alloc_try_nid(PAGE_SIZE, PAGE_SIZE,
> MEMBLOCK_LOW_LIMIT,
> diff --git a/arch/openrisc/mm/ioremap.c b/arch/openrisc/mm/ioremap.c
> index 8e63e86251ca..3b352f97fecb 100644
> --- a/arch/openrisc/mm/ioremap.c
> +++ b/arch/openrisc/mm/ioremap.c
> @@ -36,7 +36,7 @@ pte_t __ref *pte_alloc_one_kernel(struct mm_struct *mm)
> pte_t *pte;
>
> if (likely(mem_init_done)) {
> - pte = (pte_t *)get_zeroed_page(GFP_KERNEL);
> + pte = __pte_alloc_one_kernel(mm);
> } else {
> pte = memblock_alloc_or_panic(PAGE_SIZE, PAGE_SIZE);
> }
> diff --git a/include/asm-generic/pgalloc.h b/include/asm-generic/pgalloc.h
> index e164ca66f0f6..3c8ec3bfea44 100644
> --- a/include/asm-generic/pgalloc.h
> +++ b/include/asm-generic/pgalloc.h
> @@ -23,6 +23,11 @@ static inline pte_t *__pte_alloc_one_kernel_noprof(struct mm_struct *mm)
>
> if (!ptdesc)
> return NULL;
> + if (!pagetable_pte_ctor(mm, ptdesc)) {
> + pagetable_free(ptdesc);
> + return NULL;
> + }
> +
> return ptdesc_address(ptdesc);
> }
> #define __pte_alloc_one_kernel(...) alloc_hooks(__pte_alloc_one_kernel_noprof(__VA_ARGS__))
> @@ -48,7 +53,7 @@ static inline pte_t *pte_alloc_one_kernel_noprof(struct mm_struct *mm)
> */
> static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
> {
> - pagetable_free(virt_to_ptdesc(pte));
> + pagetable_dtor_free(virt_to_ptdesc(pte));
> }
>
> /**
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index f9b793cce2c1..3f48e449574a 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3103,7 +3103,7 @@ static inline void pagetable_dtor_free(struct ptdesc *ptdesc)
> static inline bool pagetable_pte_ctor(struct mm_struct *mm,
> struct ptdesc *ptdesc)
> {
> - if (!ptlock_init(ptdesc))
> + if (mm != &init_mm && !ptlock_init(ptdesc))
> return false;
> __pagetable_ctor(ptdesc);
> return true;
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com> # s390
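
For readers following the thread, the ctor/dtor pairing described in the
commit message can be modelled in user space. This is only a rough sketch,
not kernel code: every model_* name below is hypothetical, malloc() stands
in for the ptlock allocation, and a plain counter stands in for the
accounting done by __pagetable_ctor(). It assumes, as the commit message
states, that freeing the lock is a no-op when none was allocated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins for struct mm_struct and struct ptdesc. */
struct model_mm { int id; };
struct model_ptdesc { void *ptl; bool accounted; };

static struct model_mm model_init_mm;	/* plays the role of init_mm */
static long model_nr_pgtables;		/* plays the role of the pgtable stats */

/* May allocate, which is why it is wasteful for the kernel mm. */
static bool model_ptlock_init(struct model_ptdesc *pt)
{
	pt->ptl = malloc(sizeof(int));
	return pt->ptl != NULL;
}

/* No-op when no lock was allocated: free(NULL) does nothing. */
static void model_ptlock_free(struct model_ptdesc *pt)
{
	free(pt->ptl);
	pt->ptl = NULL;
}

/* Skip the lock for the kernel mm, but always do the accounting. */
static bool model_pte_ctor(struct model_mm *mm, struct model_ptdesc *pt)
{
	if (mm != &model_init_mm && !model_ptlock_init(pt))
		return false;
	pt->accounted = true;
	model_nr_pgtables++;
	return true;
}

/* No special-casing needed here: the lock free is unconditional. */
static void model_pte_dtor(struct model_ptdesc *pt)
{
	model_ptlock_free(pt);
	pt->accounted = false;
	model_nr_pgtables--;
}

/* Usage: construct and destroy one kernel PTE page in the model. */
static long model_demo(void)
{
	struct model_ptdesc pt = { 0 };

	assert(model_pte_ctor(&model_init_mm, &pt));
	assert(pt.ptl == NULL);		/* no lock for the kernel mm */
	model_pte_dtor(&pt);
	return model_nr_pgtables;	/* back to its starting value */
}
```

The point of the model is that the destructor does not need to know which
mm the page belonged to, which matches the reasoning given for
pte_free_kernel() in the commit message.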