* [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table()
@ 2024-12-14 9:02 Qi Zheng
2024-12-14 9:02 ` [PATCH 01/12] Revert "mm: pgtable: make ptlock be freed by RCU" Qi Zheng
` (12 more replies)
0 siblings, 13 replies; 32+ messages in thread
From: Qi Zheng @ 2024-12-14 9:02 UTC (permalink / raw)
To: peterz, tglx, david, jannh, hughd, yuzhao, willy, muchun.song,
vbabka, lorenzo.stoakes, akpm, rientjes
Cc: linux-mm, linux-kernel, Qi Zheng
Hi all,
As proposed [1] by Peter Zijlstra below, this patch series aims to move
pagetable_*_dtor() into __tlb_remove_table(). This will cleanup pagetable_*_dtor()
a bit and more gracefully fix the UAF issue [2] reported by syzbot.
```
Notably:
- s390 pud isn't calling the existing pagetable_pud_[cd]tor()
- none of the p4d things have pagetable_p4d_[cd]tor() (x86,arm64,s390,riscv)
and they have inconsistent accounting
- while much of the _ctor calls are in generic code, many of the _dtor
calls are in arch code for hysterial raisins, this could easily be
fixed
- if we fix ptlock_free() to handle NULL, then all the _dtor()
functions can use it, and we can observe they're all identical
and can be folded
after all that cleanup, you can move the _dtor from *_free_tlb() into
tlb_remove_table() -- which for the above case, would then have it
called from __tlb_remove_table_free().
```
And hi Andrew, I developed the code based on the latest linux-next, so I reverted
the "mm: pgtable: make ptlock be freed by RCU" first. Once the review of this
patch series is completed, the "mm: pgtable: make ptlock be freed by RCU" can be
dropped directly from mm tree, and this revert patch will not be needed.
This series is based on next-20241213. And I tested this patch series on x86 and
only cross-compiled it on arm[64], powerpc, riscv, s390 and sparc.
Comments and suggestions are welcome!
Thanks,
Qi
[1]. https://lore.kernel.org/all/20241211133433.GC12500@noisy.programming.kicks-ass.net/
[2]. https://lore.kernel.org/all/67548279.050a0220.a30f1.015b.GAE@google.com/
Qi Zheng (12):
Revert "mm: pgtable: make ptlock be freed by RCU"
mm: pgtable: introduce generic p4d_alloc_one() and p4d_free()
arm64: pgtable: use mmu gather to free p4d level page table
s390: pgtable: add statistics for PUD and P4D level page table
mm: pgtable: introduce pagetable_dtor()
arm: pgtable: move pagetable_dtor() to __tlb_remove_table()
arm64: pgtable: move pagetable_dtor() to __tlb_remove_table()
riscv: pgtable: move pagetable_dtor() to __tlb_remove_table()
x86: pgtable: move pagetable_dtor() to __tlb_remove_table()
s390: pgtable: also move pagetable_dtor() of PxD to
__tlb_remove_table()
mm: pgtable: introduce generic __tlb_remove_table()
mm: pgtable: move __tlb_remove_table_one() in x86 to generic file
Documentation/mm/split_page_table_lock.rst | 4 +-
arch/arm/include/asm/tlb.h | 10 ----
arch/arm64/include/asm/pgalloc.h | 17 +++---
arch/arm64/include/asm/tlb.h | 21 +++++---
arch/csky/include/asm/pgalloc.h | 2 +-
arch/hexagon/include/asm/pgalloc.h | 2 +-
arch/loongarch/include/asm/pgalloc.h | 2 +-
arch/m68k/include/asm/mcf_pgalloc.h | 4 +-
arch/m68k/include/asm/sun3_pgalloc.h | 2 +-
arch/m68k/mm/motorola.c | 2 +-
arch/mips/include/asm/pgalloc.h | 2 +-
arch/nios2/include/asm/pgalloc.h | 2 +-
arch/openrisc/include/asm/pgalloc.h | 2 +-
arch/powerpc/include/asm/tlb.h | 1 +
arch/powerpc/mm/book3s64/mmu_context.c | 2 +-
arch/powerpc/mm/book3s64/pgtable.c | 2 +-
arch/powerpc/mm/pgtable-frag.c | 4 +-
arch/riscv/include/asm/pgalloc.h | 51 ++++++++----------
arch/riscv/include/asm/tlb.h | 18 -------
arch/riscv/mm/init.c | 4 +-
arch/s390/include/asm/pgalloc.h | 31 +++++++----
arch/s390/include/asm/tlb.h | 37 ++++++-------
arch/s390/mm/pgalloc.c | 21 ++------
arch/sh/include/asm/pgalloc.h | 2 +-
arch/sparc/include/asm/tlb_32.h | 1 +
arch/sparc/include/asm/tlb_64.h | 1 +
arch/sparc/mm/init_64.c | 2 +-
arch/sparc/mm/srmmu.c | 2 +-
arch/um/include/asm/pgalloc.h | 6 +--
arch/x86/include/asm/pgalloc.h | 16 +++---
arch/x86/include/asm/tlb.h | 33 ------------
arch/x86/kernel/paravirt.c | 1 +
arch/x86/mm/pgtable.c | 13 ++---
include/asm-generic/pgalloc.h | 61 ++++++++++++++++++++--
include/asm-generic/tlb.h | 15 +++++-
include/linux/mm.h | 44 +++++-----------
include/linux/mm_types.h | 9 +---
mm/memory.c | 23 +++-----
mm/mmu_gather.c | 20 ++++++-
39 files changed, 233 insertions(+), 259 deletions(-)
--
2.20.1
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 01/12] Revert "mm: pgtable: make ptlock be freed by RCU"
2024-12-14 9:02 [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Qi Zheng
@ 2024-12-14 9:02 ` Qi Zheng
2024-12-14 18:29 ` Yu Zhao
2024-12-14 9:02 ` [PATCH 02/12] mm: pgtable: introduce generic p4d_alloc_one() and p4d_free() Qi Zheng
` (11 subsequent siblings)
12 siblings, 1 reply; 32+ messages in thread
From: Qi Zheng @ 2024-12-14 9:02 UTC (permalink / raw)
To: peterz, tglx, david, jannh, hughd, yuzhao, willy, muchun.song,
vbabka, lorenzo.stoakes, akpm, rientjes
Cc: linux-mm, linux-kernel, Qi Zheng
This reverts commit 2f3443770437e49abc39af26962d293851cbab6d.
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
include/linux/mm.h | 2 +-
include/linux/mm_types.h | 9 +--------
mm/memory.c | 22 ++++++----------------
3 files changed, 8 insertions(+), 25 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e7902980439cc..5e73e53c34e9e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2988,7 +2988,7 @@ void ptlock_free(struct ptdesc *ptdesc);
static inline spinlock_t *ptlock_ptr(struct ptdesc *ptdesc)
{
- return &(ptdesc->ptl->ptl);
+ return ptdesc->ptl;
}
#else /* ALLOC_SPLIT_PTLOCKS */
static inline void ptlock_cache_init(void)
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index df8f5152644ec..5d8779997266e 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -434,13 +434,6 @@ FOLIO_MATCH(flags, _flags_2a);
FOLIO_MATCH(compound_head, _head_2a);
#undef FOLIO_MATCH
-#if ALLOC_SPLIT_PTLOCKS
-struct pt_lock {
- spinlock_t ptl;
- struct rcu_head rcu;
-};
-#endif
-
/**
* struct ptdesc - Memory descriptor for page tables.
* @__page_flags: Same as page flags. Powerpc only.
@@ -485,7 +478,7 @@ struct ptdesc {
union {
unsigned long _pt_pad_2;
#if ALLOC_SPLIT_PTLOCKS
- struct pt_lock *ptl;
+ spinlock_t *ptl;
#else
spinlock_t ptl;
#endif
diff --git a/mm/memory.c b/mm/memory.c
index d9af83dd86bbf..83765632e20b0 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -7041,34 +7041,24 @@ static struct kmem_cache *page_ptl_cachep;
void __init ptlock_cache_init(void)
{
- page_ptl_cachep = kmem_cache_create("page->ptl", sizeof(struct pt_lock), 0,
+ page_ptl_cachep = kmem_cache_create("page->ptl", sizeof(spinlock_t), 0,
SLAB_PANIC, NULL);
}
bool ptlock_alloc(struct ptdesc *ptdesc)
{
- struct pt_lock *pt_lock;
+ spinlock_t *ptl;
- pt_lock = kmem_cache_alloc(page_ptl_cachep, GFP_KERNEL);
- if (!pt_lock)
+ ptl = kmem_cache_alloc(page_ptl_cachep, GFP_KERNEL);
+ if (!ptl)
return false;
- ptdesc->ptl = pt_lock;
+ ptdesc->ptl = ptl;
return true;
}
-static void ptlock_free_rcu(struct rcu_head *head)
-{
- struct pt_lock *pt_lock;
-
- pt_lock = container_of(head, struct pt_lock, rcu);
- kmem_cache_free(page_ptl_cachep, pt_lock);
-}
-
void ptlock_free(struct ptdesc *ptdesc)
{
- struct pt_lock *pt_lock = ptdesc->ptl;
-
- call_rcu(&pt_lock->rcu, ptlock_free_rcu);
+ kmem_cache_free(page_ptl_cachep, ptdesc->ptl);
}
#endif
--
2.20.1
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 02/12] mm: pgtable: introduce generic p4d_alloc_one() and p4d_free()
2024-12-14 9:02 [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Qi Zheng
2024-12-14 9:02 ` [PATCH 01/12] Revert "mm: pgtable: make ptlock be freed by RCU" Qi Zheng
@ 2024-12-14 9:02 ` Qi Zheng
2024-12-18 14:26 ` Klara Modin
2024-12-14 9:02 ` [PATCH 03/12] arm64: pgtable: use mmu gather to free p4d level page table Qi Zheng
` (10 subsequent siblings)
12 siblings, 1 reply; 32+ messages in thread
From: Qi Zheng @ 2024-12-14 9:02 UTC (permalink / raw)
To: peterz, tglx, david, jannh, hughd, yuzhao, willy, muchun.song,
vbabka, lorenzo.stoakes, akpm, rientjes
Cc: linux-mm, linux-kernel, Qi Zheng
Several architectures (arm64, riscv, x86) define p4d_alloc_one() as a
wrapper for get_zeroed_page() and p4d_free() as a wrapper for free_page().
For these architectures, provide a generic implementation in
asm-generic/pgalloc.h and convert them to use it. And like other levels
of page tables, add statistics for P4D level page table.
For s390, it also defines p4d_alloc_one() and p4d_free(), but it uses its
own logic, so skip it.
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
arch/arm64/include/asm/pgalloc.h | 15 ++++-----
arch/riscv/include/asm/pgalloc.h | 25 ++++++---------
arch/x86/include/asm/pgalloc.h | 16 ++++------
arch/x86/mm/pgtable.c | 3 ++
include/asm-generic/pgalloc.h | 55 ++++++++++++++++++++++++++++++++
include/linux/mm.h | 16 ++++++++++
6 files changed, 98 insertions(+), 32 deletions(-)
diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index e75422864d1bd..679c530549327 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -15,6 +15,8 @@
#define __HAVE_ARCH_PGD_FREE
#define __HAVE_ARCH_PUD_FREE
+#define __HAVE_ARCH_P4D_ALLOC_ONE
+#define __HAVE_ARCH_P4D_FREE
#include <asm-generic/pgalloc.h>
#define PGD_SIZE (PTRS_PER_PGD * sizeof(pgd_t))
@@ -87,19 +89,16 @@ static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr)
{
- gfp_t gfp = GFP_PGTABLE_USER;
+ if (!pgtable_l5_enabled())
+ return NULL;
- if (mm == &init_mm)
- gfp = GFP_PGTABLE_KERNEL;
- return (p4d_t *)get_zeroed_page(gfp);
+ return __p4d_alloc_one(mm, addr);
}
static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
{
- if (!pgtable_l5_enabled())
- return;
- BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
- free_page((unsigned long)p4d);
+ if (pgtable_l5_enabled())
+ __p4d_free(mm, p4d);
}
#define __p4d_free_tlb(tlb, p4d, addr) p4d_free((tlb)->mm, p4d)
diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h
index f52264304f772..bb6e1c5f1fb19 100644
--- a/arch/riscv/include/asm/pgalloc.h
+++ b/arch/riscv/include/asm/pgalloc.h
@@ -14,6 +14,8 @@
#ifdef CONFIG_MMU
#define __HAVE_ARCH_PUD_ALLOC_ONE
#define __HAVE_ARCH_PUD_FREE
+#define __HAVE_ARCH_P4D_ALLOC_ONE
+#define __HAVE_ARCH_P4D_FREE
#include <asm-generic/pgalloc.h>
static inline void riscv_tlb_remove_ptdesc(struct mmu_gather *tlb, void *pt)
@@ -118,21 +120,10 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
#define p4d_alloc_one p4d_alloc_one
static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr)
{
- if (pgtable_l5_enabled) {
- gfp_t gfp = GFP_PGTABLE_USER;
-
- if (mm == &init_mm)
- gfp = GFP_PGTABLE_KERNEL;
- return (p4d_t *)get_zeroed_page(gfp);
- }
+ if (!pgtable_l5_enabled)
+ return NULL;
- return NULL;
-}
-
-static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)
-{
- BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
- free_page((unsigned long)p4d);
+ return __p4d_alloc_one(mm, addr);
}
#define p4d_free p4d_free
@@ -145,8 +136,12 @@ static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
unsigned long addr)
{
- if (pgtable_l5_enabled)
+ if (pgtable_l5_enabled) {
+ struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
+
+ pagetable_p4d_dtor(ptdesc);
riscv_tlb_remove_ptdesc(tlb, virt_to_ptdesc(p4d));
+ }
}
#endif /* __PAGETABLE_PMD_FOLDED */
diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
index dcd836b59bebd..d9bc6cae77c9e 100644
--- a/arch/x86/include/asm/pgalloc.h
+++ b/arch/x86/include/asm/pgalloc.h
@@ -8,6 +8,8 @@
#define __HAVE_ARCH_PTE_ALLOC_ONE
#define __HAVE_ARCH_PGD_FREE
+#define __HAVE_ARCH_P4D_ALLOC_ONE
+#define __HAVE_ARCH_P4D_FREE
#include <asm-generic/pgalloc.h>
static inline int __paravirt_pgd_alloc(struct mm_struct *mm) { return 0; }
@@ -149,20 +151,16 @@ static inline void pgd_populate_safe(struct mm_struct *mm, pgd_t *pgd, p4d_t *p4
static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr)
{
- gfp_t gfp = GFP_KERNEL_ACCOUNT;
+ if (!pgtable_l5_enabled())
+ return NULL;
- if (mm == &init_mm)
- gfp &= ~__GFP_ACCOUNT;
- return (p4d_t *)get_zeroed_page(gfp);
+ return __p4d_alloc_one(mm, addr);
}
static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
{
- if (!pgtable_l5_enabled())
- return;
-
- BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
- free_page((unsigned long)p4d);
+ if (pgtable_l5_enabled())
+ return __p4d_free(mm, p4d);
}
extern void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d);
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 69a357b15974a..3d6e84da45b24 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -94,6 +94,9 @@ void ___pud_free_tlb(struct mmu_gather *tlb, pud_t *pud)
#if CONFIG_PGTABLE_LEVELS > 4
void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d)
{
+ struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
+
+ pagetable_p4d_dtor(ptdesc);
paravirt_release_p4d(__pa(p4d) >> PAGE_SHIFT);
paravirt_tlb_remove_table(tlb, virt_to_page(p4d));
}
diff --git a/include/asm-generic/pgalloc.h b/include/asm-generic/pgalloc.h
index 7c48f5fbf8aa7..dbf61819b3581 100644
--- a/include/asm-generic/pgalloc.h
+++ b/include/asm-generic/pgalloc.h
@@ -215,6 +215,61 @@ static inline void pud_free(struct mm_struct *mm, pud_t *pud)
#endif /* CONFIG_PGTABLE_LEVELS > 3 */
+#if CONFIG_PGTABLE_LEVELS > 4
+
+static inline p4d_t *__p4d_alloc_one_noprof(struct mm_struct *mm, unsigned long addr)
+{
+ gfp_t gfp = GFP_PGTABLE_USER;
+ struct ptdesc *ptdesc;
+
+ if (mm == &init_mm)
+ gfp = GFP_PGTABLE_KERNEL;
+ gfp &= ~__GFP_HIGHMEM;
+
+ ptdesc = pagetable_alloc_noprof(gfp, 0);
+ if (!ptdesc)
+ return NULL;
+
+ pagetable_p4d_ctor(ptdesc);
+ return ptdesc_address(ptdesc);
+}
+#define __p4d_alloc_one(...) alloc_hooks(__p4d_alloc_one_noprof(__VA_ARGS__))
+
+#ifndef __HAVE_ARCH_P4D_ALLOC_ONE
+/**
+ * p4d_alloc_one - allocate memory for a P4D-level page table
+ * @mm: the mm_struct of the current context
+ *
+ * Allocate memory for a page table using %GFP_PGTABLE_USER for user context
+ * and %GFP_PGTABLE_KERNEL for kernel context.
+ *
+ * Return: pointer to the allocated memory or %NULL on error
+ */
+static inline p4d_t *p4d_alloc_one_noprof(struct mm_struct *mm, unsigned long addr)
+{
+ return __p4d_alloc_one_noprof(mm, addr);
+}
+#define p4d_alloc_one(...) alloc_hooks(p4d_alloc_one_noprof(__VA_ARGS__))
+#endif
+
+static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)
+{
+ struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
+
+ BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
+ pagetable_p4d_dtor(ptdesc);
+ pagetable_free(ptdesc);
+}
+
+#ifndef __HAVE_ARCH_P4D_FREE
+static inline void p4d_free(struct mm_struct *mm, pud_t *p4d)
+{
+ __p4d_free(mm, p4d);
+}
+#endif
+
+#endif
+
#ifndef __HAVE_ARCH_PGD_FREE
static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
{
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 5e73e53c34e9e..807a12ed8ec96 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3237,6 +3237,22 @@ static inline void pagetable_pud_dtor(struct ptdesc *ptdesc)
lruvec_stat_sub_folio(folio, NR_PAGETABLE);
}
+static inline void pagetable_p4d_ctor(struct ptdesc *ptdesc)
+{
+ struct folio *folio = ptdesc_folio(ptdesc);
+
+ __folio_set_pgtable(folio);
+ lruvec_stat_add_folio(folio, NR_PAGETABLE);
+}
+
+static inline void pagetable_p4d_dtor(struct ptdesc *ptdesc)
+{
+ struct folio *folio = ptdesc_folio(ptdesc);
+
+ __folio_clear_pgtable(folio);
+ lruvec_stat_sub_folio(folio, NR_PAGETABLE);
+}
+
extern void __init pagecache_init(void);
extern void free_initmem(void);
--
2.20.1
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 03/12] arm64: pgtable: use mmu gather to free p4d level page table
2024-12-14 9:02 [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Qi Zheng
2024-12-14 9:02 ` [PATCH 01/12] Revert "mm: pgtable: make ptlock be freed by RCU" Qi Zheng
2024-12-14 9:02 ` [PATCH 02/12] mm: pgtable: introduce generic p4d_alloc_one() and p4d_free() Qi Zheng
@ 2024-12-14 9:02 ` Qi Zheng
2024-12-14 9:02 ` [PATCH 04/12] s390: pgtable: add statistics for PUD and P4D " Qi Zheng
` (9 subsequent siblings)
12 siblings, 0 replies; 32+ messages in thread
From: Qi Zheng @ 2024-12-14 9:02 UTC (permalink / raw)
To: peterz, tglx, david, jannh, hughd, yuzhao, willy, muchun.song,
vbabka, lorenzo.stoakes, akpm, rientjes
Cc: linux-mm, linux-kernel, Qi Zheng, linux-arm-kernel
Like other levels of page tables, also use mmu gather mechanism to free
p4d level page table.
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: linux-arm-kernel@lists.infradead.org
---
arch/arm64/include/asm/pgalloc.h | 2 --
arch/arm64/include/asm/tlb.h | 14 ++++++++++++++
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index 679c530549327..58cc260b6bee8 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -100,8 +100,6 @@ static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
if (pgtable_l5_enabled())
__p4d_free(mm, p4d);
}
-
-#define __p4d_free_tlb(tlb, p4d, addr) p4d_free((tlb)->mm, p4d)
#else
static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t p4dp, pgdval_t prot)
{
diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
index a947c6e784ed2..445282cde9afb 100644
--- a/arch/arm64/include/asm/tlb.h
+++ b/arch/arm64/include/asm/tlb.h
@@ -111,4 +111,18 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pudp,
}
#endif
+#if CONFIG_PGTABLE_LEVELS > 4
+static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4dp,
+ unsigned long addr)
+{
+ struct ptdesc *ptdesc = virt_to_ptdesc(p4dp);
+
+ if (!pgtable_l5_enabled())
+ return;
+
+ pagetable_p4d_dtor(ptdesc);
+ tlb_remove_ptdesc(tlb, ptdesc);
+}
+#endif
+
#endif
--
2.20.1
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 04/12] s390: pgtable: add statistics for PUD and P4D level page table
2024-12-14 9:02 [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Qi Zheng
` (2 preceding siblings ...)
2024-12-14 9:02 ` [PATCH 03/12] arm64: pgtable: use mmu gather to free p4d level page table Qi Zheng
@ 2024-12-14 9:02 ` Qi Zheng
2024-12-14 9:02 ` [PATCH 05/12] mm: pgtable: introduce pagetable_dtor() Qi Zheng
` (8 subsequent siblings)
12 siblings, 0 replies; 32+ messages in thread
From: Qi Zheng @ 2024-12-14 9:02 UTC (permalink / raw)
To: peterz, tglx, david, jannh, hughd, yuzhao, willy, muchun.song,
vbabka, lorenzo.stoakes, akpm, rientjes
Cc: linux-mm, linux-kernel, Qi Zheng, linux-s390
Like PMD and PTE level page table, also add statistics for PUD and P4D
page table.
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: linux-s390@vger.kernel.org
---
arch/s390/include/asm/pgalloc.h | 29 +++++++++++++++++++-------
arch/s390/include/asm/tlb.h | 37 +++++++++++++++++----------------
2 files changed, 40 insertions(+), 26 deletions(-)
diff --git a/arch/s390/include/asm/pgalloc.h b/arch/s390/include/asm/pgalloc.h
index 7b84ef6dc4b6d..a0c1ca5d8423c 100644
--- a/arch/s390/include/asm/pgalloc.h
+++ b/arch/s390/include/asm/pgalloc.h
@@ -53,29 +53,42 @@ static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long address)
{
unsigned long *table = crst_table_alloc(mm);
- if (table)
- crst_table_init(table, _REGION2_ENTRY_EMPTY);
+ if (!table)
+ return NULL;
+ crst_table_init(table, _REGION2_ENTRY_EMPTY);
+ pagetable_p4d_ctor(virt_to_ptdesc(table));
+
return (p4d_t *) table;
}
static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
{
- if (!mm_p4d_folded(mm))
- crst_table_free(mm, (unsigned long *) p4d);
+ if (mm_p4d_folded(mm))
+ return;
+
+ pagetable_p4d_dtor(virt_to_ptdesc(p4d));
+ crst_table_free(mm, (unsigned long *) p4d);
}
static inline pud_t *pud_alloc_one(struct mm_struct *mm, unsigned long address)
{
unsigned long *table = crst_table_alloc(mm);
- if (table)
- crst_table_init(table, _REGION3_ENTRY_EMPTY);
+
+ if (!table)
+ return NULL;
+ crst_table_init(table, _REGION3_ENTRY_EMPTY);
+ pagetable_pud_ctor(virt_to_ptdesc(table));
+
return (pud_t *) table;
}
static inline void pud_free(struct mm_struct *mm, pud_t *pud)
{
- if (!mm_pud_folded(mm))
- crst_table_free(mm, (unsigned long *) pud);
+ if (mm_pud_folded(mm))
+ return;
+
+ pagetable_pud_dtor(virt_to_ptdesc(pud));
+ crst_table_free(mm, (unsigned long *) pud);
}
static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long vmaddr)
diff --git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h
index e95b2c8081eb8..b946964afce8e 100644
--- a/arch/s390/include/asm/tlb.h
+++ b/arch/s390/include/asm/tlb.h
@@ -110,24 +110,6 @@ static inline void pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
tlb_remove_ptdesc(tlb, pmd);
}
-/*
- * p4d_free_tlb frees a pud table and clears the CRSTE for the
- * region second table entry from the tlb.
- * If the mm uses a four level page table the single p4d is freed
- * as the pgd. p4d_free_tlb checks the asce_limit against 8PB
- * to avoid the double free of the p4d in this case.
- */
-static inline void p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
- unsigned long address)
-{
- if (mm_p4d_folded(tlb->mm))
- return;
- __tlb_adjust_range(tlb, address, PAGE_SIZE);
- tlb->mm->context.flush_mm = 1;
- tlb->freed_tables = 1;
- tlb_remove_ptdesc(tlb, p4d);
-}
-
/*
* pud_free_tlb frees a pud table and clears the CRSTE for the
* region third table entry from the tlb.
@@ -140,11 +122,30 @@ static inline void pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
{
if (mm_pud_folded(tlb->mm))
return;
+ pagetable_pud_dtor(virt_to_ptdesc(pud));
tlb->mm->context.flush_mm = 1;
tlb->freed_tables = 1;
tlb->cleared_p4ds = 1;
tlb_remove_ptdesc(tlb, pud);
}
+/*
+ * p4d_free_tlb frees a p4d table and clears the CRSTE for the
+ * region second table entry from the tlb.
+ * If the mm uses a four level page table the single p4d is freed
+ * as the pgd. p4d_free_tlb checks the asce_limit against 8PB
+ * to avoid the double free of the p4d in this case.
+ */
+static inline void p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
+ unsigned long address)
+{
+ if (mm_p4d_folded(tlb->mm))
+ return;
+ pagetable_p4d_dtor(virt_to_ptdesc(p4d));
+ __tlb_adjust_range(tlb, address, PAGE_SIZE);
+ tlb->mm->context.flush_mm = 1;
+ tlb->freed_tables = 1;
+ tlb_remove_ptdesc(tlb, p4d);
+}
#endif /* _S390_TLB_H */
--
2.20.1
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 05/12] mm: pgtable: introduce pagetable_dtor()
2024-12-14 9:02 [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Qi Zheng
` (3 preceding siblings ...)
2024-12-14 9:02 ` [PATCH 04/12] s390: pgtable: add statistics for PUD and P4D " Qi Zheng
@ 2024-12-14 9:02 ` Qi Zheng
2024-12-14 9:02 ` [PATCH 06/12] arm: pgtable: move pagetable_dtor() to __tlb_remove_table() Qi Zheng
` (7 subsequent siblings)
12 siblings, 0 replies; 32+ messages in thread
From: Qi Zheng @ 2024-12-14 9:02 UTC (permalink / raw)
To: peterz, tglx, david, jannh, hughd, yuzhao, willy, muchun.song,
vbabka, lorenzo.stoakes, akpm, rientjes
Cc: linux-mm, linux-kernel, Qi Zheng
The pagetable_p*_dtor() are exactly the same except for the handling of
ptlock. If we make ptlock_free() handle the case where ptdesc->ptl is
NULL and remove VM_BUG_ON_PAGE() from pmd_ptlock_free(), we can unify
pagetable_p*_dtor() into one function. Let's introduce pagetable_dtor()
to do this.
Later, pagetable_dtor() will be moved to tlb_remove_ptdesc(), so that
ptlock and page table pages can be freed together (regardless of whether
RCU is used). This prevents the use-after-free problem where the ptlock
is freed immediately but the page table pages is freed later via RCU.
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
Documentation/mm/split_page_table_lock.rst | 4 +-
arch/arm/include/asm/tlb.h | 4 +-
arch/arm64/include/asm/tlb.h | 8 ++--
arch/csky/include/asm/pgalloc.h | 2 +-
arch/hexagon/include/asm/pgalloc.h | 2 +-
arch/loongarch/include/asm/pgalloc.h | 2 +-
arch/m68k/include/asm/mcf_pgalloc.h | 4 +-
arch/m68k/include/asm/sun3_pgalloc.h | 2 +-
arch/m68k/mm/motorola.c | 2 +-
arch/mips/include/asm/pgalloc.h | 2 +-
arch/nios2/include/asm/pgalloc.h | 2 +-
arch/openrisc/include/asm/pgalloc.h | 2 +-
arch/powerpc/mm/book3s64/mmu_context.c | 2 +-
arch/powerpc/mm/book3s64/pgtable.c | 2 +-
arch/powerpc/mm/pgtable-frag.c | 4 +-
arch/riscv/include/asm/pgalloc.h | 8 ++--
arch/riscv/mm/init.c | 4 +-
arch/s390/include/asm/pgalloc.h | 6 +--
arch/s390/include/asm/tlb.h | 6 +--
arch/s390/mm/pgalloc.c | 2 +-
arch/sh/include/asm/pgalloc.h | 2 +-
arch/sparc/mm/init_64.c | 2 +-
arch/sparc/mm/srmmu.c | 2 +-
arch/um/include/asm/pgalloc.h | 6 +--
arch/x86/mm/pgtable.c | 12 ++---
include/asm-generic/pgalloc.h | 8 ++--
include/linux/mm.h | 52 ++++------------------
mm/memory.c | 3 +-
28 files changed, 62 insertions(+), 95 deletions(-)
diff --git a/Documentation/mm/split_page_table_lock.rst b/Documentation/mm/split_page_table_lock.rst
index 581446d4a4eba..8e1ceb0a6619a 100644
--- a/Documentation/mm/split_page_table_lock.rst
+++ b/Documentation/mm/split_page_table_lock.rst
@@ -62,7 +62,7 @@ Support of split page table lock by an architecture
===================================================
There's no need in special enabling of PTE split page table lock: everything
-required is done by pagetable_pte_ctor() and pagetable_pte_dtor(), which
+required is done by pagetable_pte_ctor() and pagetable_dtor(), which
must be called on PTE table allocation / freeing.
Make sure the architecture doesn't use slab allocator for page table
@@ -73,7 +73,7 @@ PMD split lock only makes sense if you have more than two page table
levels.
PMD split lock enabling requires pagetable_pmd_ctor() call on PMD table
-allocation and pagetable_pmd_dtor() on freeing.
+allocation and pagetable_dtor() on freeing.
Allocation usually happens in pmd_alloc_one(), freeing in pmd_free() and
pmd_free_tlb(), but make sure you cover all PMD table allocation / freeing
diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
index f40d06ad5d2a3..ef79bf1e8563f 100644
--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -41,7 +41,7 @@ __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr)
{
struct ptdesc *ptdesc = page_ptdesc(pte);
- pagetable_pte_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
#ifndef CONFIG_ARM_LPAE
/*
@@ -61,7 +61,7 @@ __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp, unsigned long addr)
#ifdef CONFIG_ARM_LPAE
struct ptdesc *ptdesc = virt_to_ptdesc(pmdp);
- pagetable_pmd_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
tlb_remove_ptdesc(tlb, ptdesc);
#endif
}
diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
index 445282cde9afb..408d0f36a8a8f 100644
--- a/arch/arm64/include/asm/tlb.h
+++ b/arch/arm64/include/asm/tlb.h
@@ -82,7 +82,7 @@ static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
{
struct ptdesc *ptdesc = page_ptdesc(pte);
- pagetable_pte_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
tlb_remove_ptdesc(tlb, ptdesc);
}
@@ -92,7 +92,7 @@ static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp,
{
struct ptdesc *ptdesc = virt_to_ptdesc(pmdp);
- pagetable_pmd_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
tlb_remove_ptdesc(tlb, ptdesc);
}
#endif
@@ -106,7 +106,7 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pudp,
if (!pgtable_l4_enabled())
return;
- pagetable_pud_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
tlb_remove_ptdesc(tlb, ptdesc);
}
#endif
@@ -120,7 +120,7 @@ static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4dp,
if (!pgtable_l5_enabled())
return;
- pagetable_p4d_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
tlb_remove_ptdesc(tlb, ptdesc);
}
#endif
diff --git a/arch/csky/include/asm/pgalloc.h b/arch/csky/include/asm/pgalloc.h
index 9c84c9012e534..f1ce5b7b28f22 100644
--- a/arch/csky/include/asm/pgalloc.h
+++ b/arch/csky/include/asm/pgalloc.h
@@ -63,7 +63,7 @@ static inline pgd_t *pgd_alloc(struct mm_struct *mm)
#define __pte_free_tlb(tlb, pte, address) \
do { \
- pagetable_pte_dtor(page_ptdesc(pte)); \
+ pagetable_dtor(page_ptdesc(pte)); \
tlb_remove_page_ptdesc(tlb, page_ptdesc(pte)); \
} while (0)
diff --git a/arch/hexagon/include/asm/pgalloc.h b/arch/hexagon/include/asm/pgalloc.h
index 55988625e6fbc..40e42a0e71673 100644
--- a/arch/hexagon/include/asm/pgalloc.h
+++ b/arch/hexagon/include/asm/pgalloc.h
@@ -89,7 +89,7 @@ static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
#define __pte_free_tlb(tlb, pte, addr) \
do { \
- pagetable_pte_dtor((page_ptdesc(pte))); \
+ pagetable_dtor((page_ptdesc(pte))); \
tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte))); \
} while (0)
diff --git a/arch/loongarch/include/asm/pgalloc.h b/arch/loongarch/include/asm/pgalloc.h
index a7b9c9e73593d..7211dff8c969e 100644
--- a/arch/loongarch/include/asm/pgalloc.h
+++ b/arch/loongarch/include/asm/pgalloc.h
@@ -57,7 +57,7 @@ static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
#define __pte_free_tlb(tlb, pte, address) \
do { \
- pagetable_pte_dtor(page_ptdesc(pte)); \
+ pagetable_dtor(page_ptdesc(pte)); \
tlb_remove_page_ptdesc((tlb), page_ptdesc(pte)); \
} while (0)
diff --git a/arch/m68k/include/asm/mcf_pgalloc.h b/arch/m68k/include/asm/mcf_pgalloc.h
index 302c5bf67179e..22d6c1fcabfb4 100644
--- a/arch/m68k/include/asm/mcf_pgalloc.h
+++ b/arch/m68k/include/asm/mcf_pgalloc.h
@@ -37,7 +37,7 @@ static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pgtable,
{
struct ptdesc *ptdesc = virt_to_ptdesc(pgtable);
- pagetable_pte_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
pagetable_free(ptdesc);
}
@@ -61,7 +61,7 @@ static inline void pte_free(struct mm_struct *mm, pgtable_t pgtable)
{
struct ptdesc *ptdesc = virt_to_ptdesc(pgtable);
- pagetable_pte_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
pagetable_free(ptdesc);
}
diff --git a/arch/m68k/include/asm/sun3_pgalloc.h b/arch/m68k/include/asm/sun3_pgalloc.h
index 4a137eecb6fe4..2b626cb3ad0ae 100644
--- a/arch/m68k/include/asm/sun3_pgalloc.h
+++ b/arch/m68k/include/asm/sun3_pgalloc.h
@@ -19,7 +19,7 @@ extern const char bad_pmd_string[];
#define __pte_free_tlb(tlb, pte, addr) \
do { \
- pagetable_pte_dtor(page_ptdesc(pte)); \
+ pagetable_dtor(page_ptdesc(pte)); \
tlb_remove_page_ptdesc((tlb), page_ptdesc(pte)); \
} while (0)
diff --git a/arch/m68k/mm/motorola.c b/arch/m68k/mm/motorola.c
index c1761d309fc61..81715cece70c6 100644
--- a/arch/m68k/mm/motorola.c
+++ b/arch/m68k/mm/motorola.c
@@ -201,7 +201,7 @@ int free_pointer_table(void *table, int type)
list_del(dp);
mmu_page_dtor((void *)page);
if (type == TABLE_PTE)
- pagetable_pte_dtor(virt_to_ptdesc((void *)page));
+ pagetable_dtor(virt_to_ptdesc((void *)page));
free_page (page);
return 1;
} else if (ptable_list[type].next != dp) {
diff --git a/arch/mips/include/asm/pgalloc.h b/arch/mips/include/asm/pgalloc.h
index f4440edcd8fe2..36d9805033c4b 100644
--- a/arch/mips/include/asm/pgalloc.h
+++ b/arch/mips/include/asm/pgalloc.h
@@ -56,7 +56,7 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
#define __pte_free_tlb(tlb, pte, address) \
do { \
- pagetable_pte_dtor(page_ptdesc(pte)); \
+ pagetable_dtor(page_ptdesc(pte)); \
tlb_remove_page_ptdesc((tlb), page_ptdesc(pte)); \
} while (0)
diff --git a/arch/nios2/include/asm/pgalloc.h b/arch/nios2/include/asm/pgalloc.h
index ce6bb8e74271f..12a536b7bfbd4 100644
--- a/arch/nios2/include/asm/pgalloc.h
+++ b/arch/nios2/include/asm/pgalloc.h
@@ -30,7 +30,7 @@ extern pgd_t *pgd_alloc(struct mm_struct *mm);
#define __pte_free_tlb(tlb, pte, addr) \
do { \
- pagetable_pte_dtor(page_ptdesc(pte)); \
+ pagetable_dtor(page_ptdesc(pte)); \
tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte))); \
} while (0)
diff --git a/arch/openrisc/include/asm/pgalloc.h b/arch/openrisc/include/asm/pgalloc.h
index c6a73772a5466..596e2355824e3 100644
--- a/arch/openrisc/include/asm/pgalloc.h
+++ b/arch/openrisc/include/asm/pgalloc.h
@@ -68,7 +68,7 @@ extern pte_t *pte_alloc_one_kernel(struct mm_struct *mm);
#define __pte_free_tlb(tlb, pte, addr) \
do { \
- pagetable_pte_dtor(page_ptdesc(pte)); \
+ pagetable_dtor(page_ptdesc(pte)); \
tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte))); \
} while (0)
diff --git a/arch/powerpc/mm/book3s64/mmu_context.c b/arch/powerpc/mm/book3s64/mmu_context.c
index 1715b07c630c9..4e1e45420bd49 100644
--- a/arch/powerpc/mm/book3s64/mmu_context.c
+++ b/arch/powerpc/mm/book3s64/mmu_context.c
@@ -253,7 +253,7 @@ static void pmd_frag_destroy(void *pmd_frag)
count = ((unsigned long)pmd_frag & ~PAGE_MASK) >> PMD_FRAG_SIZE_SHIFT;
/* We allow PTE_FRAG_NR fragments from a PTE page */
if (atomic_sub_and_test(PMD_FRAG_NR - count, &ptdesc->pt_frag_refcount)) {
- pagetable_pmd_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
pagetable_free(ptdesc);
}
}
diff --git a/arch/powerpc/mm/book3s64/pgtable.c b/arch/powerpc/mm/book3s64/pgtable.c
index 3745425280808..3f28e4acd920b 100644
--- a/arch/powerpc/mm/book3s64/pgtable.c
+++ b/arch/powerpc/mm/book3s64/pgtable.c
@@ -477,7 +477,7 @@ void pmd_fragment_free(unsigned long *pmd)
BUG_ON(atomic_read(&ptdesc->pt_frag_refcount) <= 0);
if (atomic_dec_and_test(&ptdesc->pt_frag_refcount)) {
- pagetable_pmd_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
pagetable_free(ptdesc);
}
}
diff --git a/arch/powerpc/mm/pgtable-frag.c b/arch/powerpc/mm/pgtable-frag.c
index e89f64a0f24ae..713268ccb1a0e 100644
--- a/arch/powerpc/mm/pgtable-frag.c
+++ b/arch/powerpc/mm/pgtable-frag.c
@@ -25,7 +25,7 @@ void pte_frag_destroy(void *pte_frag)
count = ((unsigned long)pte_frag & ~PAGE_MASK) >> PTE_FRAG_SIZE_SHIFT;
/* We allow PTE_FRAG_NR fragments from a PTE page */
if (atomic_sub_and_test(PTE_FRAG_NR - count, &ptdesc->pt_frag_refcount)) {
- pagetable_pte_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
pagetable_free(ptdesc);
}
}
@@ -111,7 +111,7 @@ static void pte_free_now(struct rcu_head *head)
struct ptdesc *ptdesc;
ptdesc = container_of(head, struct ptdesc, pt_rcu_head);
- pagetable_pte_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
pagetable_free(ptdesc);
}
diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h
index bb6e1c5f1fb19..09c2eff571a49 100644
--- a/arch/riscv/include/asm/pgalloc.h
+++ b/arch/riscv/include/asm/pgalloc.h
@@ -112,7 +112,7 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
if (pgtable_l4_enabled) {
struct ptdesc *ptdesc = virt_to_ptdesc(pud);
- pagetable_pud_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
riscv_tlb_remove_ptdesc(tlb, ptdesc);
}
}
@@ -139,7 +139,7 @@ static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
if (pgtable_l5_enabled) {
struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
- pagetable_p4d_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
riscv_tlb_remove_ptdesc(tlb, virt_to_ptdesc(p4d));
}
}
@@ -172,7 +172,7 @@ static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
{
struct ptdesc *ptdesc = virt_to_ptdesc(pmd);
- pagetable_pmd_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
riscv_tlb_remove_ptdesc(tlb, ptdesc);
}
@@ -183,7 +183,7 @@ static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
{
struct ptdesc *ptdesc = page_ptdesc(pte);
- pagetable_pte_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
riscv_tlb_remove_ptdesc(tlb, ptdesc);
}
#endif /* CONFIG_MMU */
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index fc53ce748c804..8d703fb51b1dc 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -1558,7 +1558,7 @@ static void __meminit free_pte_table(pte_t *pte_start, pmd_t *pmd)
return;
}
- pagetable_pte_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
if (PageReserved(page))
free_reserved_page(page);
else
@@ -1580,7 +1580,7 @@ static void __meminit free_pmd_table(pmd_t *pmd_start, pud_t *pud, bool is_vmemm
}
if (!is_vmemmap)
- pagetable_pmd_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
if (PageReserved(page))
free_reserved_page(page);
else
diff --git a/arch/s390/include/asm/pgalloc.h b/arch/s390/include/asm/pgalloc.h
index a0c1ca5d8423c..5fced6d3c36b0 100644
--- a/arch/s390/include/asm/pgalloc.h
+++ b/arch/s390/include/asm/pgalloc.h
@@ -66,7 +66,7 @@ static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
if (mm_p4d_folded(mm))
return;
- pagetable_p4d_dtor(virt_to_ptdesc(p4d));
+ pagetable_dtor(virt_to_ptdesc(p4d));
crst_table_free(mm, (unsigned long *) p4d);
}
@@ -87,7 +87,7 @@ static inline void pud_free(struct mm_struct *mm, pud_t *pud)
if (mm_pud_folded(mm))
return;
- pagetable_pud_dtor(virt_to_ptdesc(pud));
+ pagetable_dtor(virt_to_ptdesc(pud));
crst_table_free(mm, (unsigned long *) pud);
}
@@ -109,7 +109,7 @@ static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
{
if (mm_pmd_folded(mm))
return;
- pagetable_pmd_dtor(virt_to_ptdesc(pmd));
+ pagetable_dtor(virt_to_ptdesc(pmd));
crst_table_free(mm, (unsigned long *) pmd);
}
diff --git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h
index b946964afce8e..74b6fba4c2ee3 100644
--- a/arch/s390/include/asm/tlb.h
+++ b/arch/s390/include/asm/tlb.h
@@ -102,7 +102,7 @@ static inline void pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
{
if (mm_pmd_folded(tlb->mm))
return;
- pagetable_pmd_dtor(virt_to_ptdesc(pmd));
+ pagetable_dtor(virt_to_ptdesc(pmd));
__tlb_adjust_range(tlb, address, PAGE_SIZE);
tlb->mm->context.flush_mm = 1;
tlb->freed_tables = 1;
@@ -122,7 +122,7 @@ static inline void pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
{
if (mm_pud_folded(tlb->mm))
return;
- pagetable_pud_dtor(virt_to_ptdesc(pud));
+ pagetable_dtor(virt_to_ptdesc(pud));
tlb->mm->context.flush_mm = 1;
tlb->freed_tables = 1;
tlb->cleared_p4ds = 1;
@@ -141,7 +141,7 @@ static inline void p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
{
if (mm_p4d_folded(tlb->mm))
return;
- pagetable_p4d_dtor(virt_to_ptdesc(p4d));
+ pagetable_dtor(virt_to_ptdesc(p4d));
__tlb_adjust_range(tlb, address, PAGE_SIZE);
tlb->mm->context.flush_mm = 1;
tlb->freed_tables = 1;
diff --git a/arch/s390/mm/pgalloc.c b/arch/s390/mm/pgalloc.c
index 58696a0c4e4ac..569de24d33761 100644
--- a/arch/s390/mm/pgalloc.c
+++ b/arch/s390/mm/pgalloc.c
@@ -182,7 +182,7 @@ unsigned long *page_table_alloc(struct mm_struct *mm)
static void pagetable_pte_dtor_free(struct ptdesc *ptdesc)
{
- pagetable_pte_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
pagetable_free(ptdesc);
}
diff --git a/arch/sh/include/asm/pgalloc.h b/arch/sh/include/asm/pgalloc.h
index 5d8577ab15911..96d938fdf2244 100644
--- a/arch/sh/include/asm/pgalloc.h
+++ b/arch/sh/include/asm/pgalloc.h
@@ -34,7 +34,7 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
#define __pte_free_tlb(tlb, pte, addr) \
do { \
- pagetable_pte_dtor(page_ptdesc(pte)); \
+ pagetable_dtor(page_ptdesc(pte)); \
tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte))); \
} while (0)
diff --git a/arch/sparc/mm/init_64.c b/arch/sparc/mm/init_64.c
index 21f8cbbd0581c..05882bca5b732 100644
--- a/arch/sparc/mm/init_64.c
+++ b/arch/sparc/mm/init_64.c
@@ -2915,7 +2915,7 @@ static void __pte_free(pgtable_t pte)
{
struct ptdesc *ptdesc = virt_to_ptdesc(pte);
- pagetable_pte_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
pagetable_free(ptdesc);
}
diff --git a/arch/sparc/mm/srmmu.c b/arch/sparc/mm/srmmu.c
index 9df51a62333d6..e3a72c884b867 100644
--- a/arch/sparc/mm/srmmu.c
+++ b/arch/sparc/mm/srmmu.c
@@ -372,7 +372,7 @@ void pte_free(struct mm_struct *mm, pgtable_t ptep)
page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
spin_lock(&mm->page_table_lock);
if (page_ref_dec_return(page) == 1)
- pagetable_pte_dtor(page_ptdesc(page));
+ pagetable_dtor(page_ptdesc(page));
spin_unlock(&mm->page_table_lock);
srmmu_free_nocache(ptep, SRMMU_PTE_TABLE_SIZE);
diff --git a/arch/um/include/asm/pgalloc.h b/arch/um/include/asm/pgalloc.h
index 04fb4e6969a46..f0af23c3aeb2b 100644
--- a/arch/um/include/asm/pgalloc.h
+++ b/arch/um/include/asm/pgalloc.h
@@ -27,7 +27,7 @@ extern pgd_t *pgd_alloc(struct mm_struct *);
#define __pte_free_tlb(tlb, pte, address) \
do { \
- pagetable_pte_dtor(page_ptdesc(pte)); \
+ pagetable_dtor(page_ptdesc(pte)); \
tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte))); \
} while (0)
@@ -35,7 +35,7 @@ do { \
#define __pmd_free_tlb(tlb, pmd, address) \
do { \
- pagetable_pmd_dtor(virt_to_ptdesc(pmd)); \
+ pagetable_dtor(virt_to_ptdesc(pmd)); \
tlb_remove_page_ptdesc((tlb), virt_to_ptdesc(pmd)); \
} while (0)
@@ -43,7 +43,7 @@ do { \
#define __pud_free_tlb(tlb, pud, address) \
do { \
- pagetable_pud_dtor(virt_to_ptdesc(pud)); \
+ pagetable_dtor(virt_to_ptdesc(pud)); \
tlb_remove_page_ptdesc((tlb), virt_to_ptdesc(pud)); \
} while (0)
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 3d6e84da45b24..a6cd9660e29ec 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -60,7 +60,7 @@ early_param("userpte", setup_userpte);
void ___pte_free_tlb(struct mmu_gather *tlb, struct page *pte)
{
- pagetable_pte_dtor(page_ptdesc(pte));
+ pagetable_dtor(page_ptdesc(pte));
paravirt_release_pte(page_to_pfn(pte));
paravirt_tlb_remove_table(tlb, pte);
}
@@ -77,7 +77,7 @@ void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
#ifdef CONFIG_X86_PAE
tlb->need_flush_all = 1;
#endif
- pagetable_pmd_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
paravirt_tlb_remove_table(tlb, ptdesc_page(ptdesc));
}
@@ -86,7 +86,7 @@ void ___pud_free_tlb(struct mmu_gather *tlb, pud_t *pud)
{
struct ptdesc *ptdesc = virt_to_ptdesc(pud);
- pagetable_pud_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
paravirt_release_pud(__pa(pud) >> PAGE_SHIFT);
paravirt_tlb_remove_table(tlb, virt_to_page(pud));
}
@@ -96,7 +96,7 @@ void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d)
{
struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
- pagetable_p4d_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
paravirt_release_p4d(__pa(p4d) >> PAGE_SHIFT);
paravirt_tlb_remove_table(tlb, virt_to_page(p4d));
}
@@ -233,7 +233,7 @@ static void free_pmds(struct mm_struct *mm, pmd_t *pmds[], int count)
if (pmds[i]) {
ptdesc = virt_to_ptdesc(pmds[i]);
- pagetable_pmd_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
pagetable_free(ptdesc);
mm_dec_nr_pmds(mm);
}
@@ -867,7 +867,7 @@ int pud_free_pmd_page(pud_t *pud, unsigned long addr)
free_page((unsigned long)pmd_sv);
- pagetable_pmd_dtor(virt_to_ptdesc(pmd));
+ pagetable_dtor(virt_to_ptdesc(pmd));
free_page((unsigned long)pmd);
return 1;
diff --git a/include/asm-generic/pgalloc.h b/include/asm-generic/pgalloc.h
index dbf61819b3581..3673e9c29504e 100644
--- a/include/asm-generic/pgalloc.h
+++ b/include/asm-generic/pgalloc.h
@@ -109,7 +109,7 @@ static inline void pte_free(struct mm_struct *mm, struct page *pte_page)
{
struct ptdesc *ptdesc = page_ptdesc(pte_page);
- pagetable_pte_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
pagetable_free(ptdesc);
}
@@ -153,7 +153,7 @@ static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
struct ptdesc *ptdesc = virt_to_ptdesc(pmd);
BUG_ON((unsigned long)pmd & (PAGE_SIZE-1));
- pagetable_pmd_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
pagetable_free(ptdesc);
}
#endif
@@ -202,7 +202,7 @@ static inline void __pud_free(struct mm_struct *mm, pud_t *pud)
struct ptdesc *ptdesc = virt_to_ptdesc(pud);
BUG_ON((unsigned long)pud & (PAGE_SIZE-1));
- pagetable_pud_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
pagetable_free(ptdesc);
}
@@ -257,7 +257,7 @@ static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)
struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
- pagetable_p4d_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
pagetable_free(ptdesc);
}
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 807a12ed8ec96..497035a78849b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3055,6 +3055,15 @@ static inline bool ptlock_init(struct ptdesc *ptdesc) { return true; }
static inline void ptlock_free(struct ptdesc *ptdesc) {}
#endif /* defined(CONFIG_SPLIT_PTE_PTLOCKS) */
+static inline void pagetable_dtor(struct ptdesc *ptdesc)
+{
+ struct folio *folio = ptdesc_folio(ptdesc);
+
+ ptlock_free(ptdesc);
+ __folio_clear_pgtable(folio);
+ lruvec_stat_sub_folio(folio, NR_PAGETABLE);
+}
+
static inline bool pagetable_pte_ctor(struct ptdesc *ptdesc)
{
struct folio *folio = ptdesc_folio(ptdesc);
@@ -3066,15 +3075,6 @@ static inline bool pagetable_pte_ctor(struct ptdesc *ptdesc)
return true;
}
-static inline void pagetable_pte_dtor(struct ptdesc *ptdesc)
-{
- struct folio *folio = ptdesc_folio(ptdesc);
-
- ptlock_free(ptdesc);
- __folio_clear_pgtable(folio);
- lruvec_stat_sub_folio(folio, NR_PAGETABLE);
-}
-
pte_t *___pte_offset_map(pmd_t *pmd, unsigned long addr, pmd_t *pmdvalp);
static inline pte_t *__pte_offset_map(pmd_t *pmd, unsigned long addr,
pmd_t *pmdvalp)
@@ -3151,14 +3151,6 @@ static inline bool pmd_ptlock_init(struct ptdesc *ptdesc)
return ptlock_init(ptdesc);
}
-static inline void pmd_ptlock_free(struct ptdesc *ptdesc)
-{
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
- VM_BUG_ON_PAGE(ptdesc->pmd_huge_pte, ptdesc_page(ptdesc));
-#endif
- ptlock_free(ptdesc);
-}
-
#define pmd_huge_pte(mm, pmd) (pmd_ptdesc(pmd)->pmd_huge_pte)
#else
@@ -3169,7 +3161,6 @@ static inline spinlock_t *pmd_lockptr(struct mm_struct *mm, pmd_t *pmd)
}
static inline bool pmd_ptlock_init(struct ptdesc *ptdesc) { return true; }
-static inline void pmd_ptlock_free(struct ptdesc *ptdesc) {}
#define pmd_huge_pte(mm, pmd) ((mm)->pmd_huge_pte)
@@ -3193,15 +3184,6 @@ static inline bool pagetable_pmd_ctor(struct ptdesc *ptdesc)
return true;
}
-static inline void pagetable_pmd_dtor(struct ptdesc *ptdesc)
-{
- struct folio *folio = ptdesc_folio(ptdesc);
-
- pmd_ptlock_free(ptdesc);
- __folio_clear_pgtable(folio);
- lruvec_stat_sub_folio(folio, NR_PAGETABLE);
-}
-
/*
* No scalability reason to split PUD locks yet, but follow the same pattern
* as the PMD locks to make it easier if we decide to. The VM should not be
@@ -3229,14 +3211,6 @@ static inline void pagetable_pud_ctor(struct ptdesc *ptdesc)
lruvec_stat_add_folio(folio, NR_PAGETABLE);
}
-static inline void pagetable_pud_dtor(struct ptdesc *ptdesc)
-{
- struct folio *folio = ptdesc_folio(ptdesc);
-
- __folio_clear_pgtable(folio);
- lruvec_stat_sub_folio(folio, NR_PAGETABLE);
-}
-
static inline void pagetable_p4d_ctor(struct ptdesc *ptdesc)
{
struct folio *folio = ptdesc_folio(ptdesc);
@@ -3245,14 +3219,6 @@ static inline void pagetable_p4d_ctor(struct ptdesc *ptdesc)
lruvec_stat_add_folio(folio, NR_PAGETABLE);
}
-static inline void pagetable_p4d_dtor(struct ptdesc *ptdesc)
-{
- struct folio *folio = ptdesc_folio(ptdesc);
-
- __folio_clear_pgtable(folio);
- lruvec_stat_sub_folio(folio, NR_PAGETABLE);
-}
-
extern void __init pagecache_init(void);
extern void free_initmem(void);
diff --git a/mm/memory.c b/mm/memory.c
index 83765632e20b0..c00fa95464b19 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -7058,7 +7058,8 @@ bool ptlock_alloc(struct ptdesc *ptdesc)
void ptlock_free(struct ptdesc *ptdesc)
{
- kmem_cache_free(page_ptl_cachep, ptdesc->ptl);
+ if (ptdesc->ptl)
+ kmem_cache_free(page_ptl_cachep, ptdesc->ptl);
}
#endif
--
2.20.1
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 06/12] arm: pgtable: move pagetable_dtor() to __tlb_remove_table()
2024-12-14 9:02 [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Qi Zheng
` (4 preceding siblings ...)
2024-12-14 9:02 ` [PATCH 05/12] mm: pgtable: introduce pagetable_dtor() Qi Zheng
@ 2024-12-14 9:02 ` Qi Zheng
2024-12-14 9:02 ` [PATCH 07/12] arm64: " Qi Zheng
` (6 subsequent siblings)
12 siblings, 0 replies; 32+ messages in thread
From: Qi Zheng @ 2024-12-14 9:02 UTC (permalink / raw)
To: peterz, tglx, david, jannh, hughd, yuzhao, willy, muchun.song,
vbabka, lorenzo.stoakes, akpm, rientjes
Cc: linux-mm, linux-kernel, Qi Zheng, linux-arm-kernel
Move pagetable_dtor() to __tlb_remove_table(), so that ptlock and page
table pages can be freed together (regardless of whether RCU is used).
This prevents the use-after-free problem where the ptlock is freed
immediately but the page table pages is freed later via RCU.
Page tables shouldn't have swap cache, so use pagetable_free() instead of
free_page_and_swap_cache() to free page table pages.
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: linux-arm-kernel@lists.infradead.org
---
arch/arm/include/asm/tlb.h | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
index ef79bf1e8563f..264ab635e807a 100644
--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -26,12 +26,14 @@
#else /* !CONFIG_MMU */
-#include <linux/swap.h>
#include <asm/tlbflush.h>
static inline void __tlb_remove_table(void *_table)
{
- free_page_and_swap_cache((struct page *)_table);
+ struct ptdesc *ptdesc = (struct ptdesc *)_table;
+
+ pagetable_dtor(ptdesc);
+ pagetable_free(ptdesc);
}
#include <asm-generic/tlb.h>
@@ -41,8 +43,6 @@ __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr)
{
struct ptdesc *ptdesc = page_ptdesc(pte);
- pagetable_dtor(ptdesc);
-
#ifndef CONFIG_ARM_LPAE
/*
* With the classic ARM MMU, a pte page has two corresponding pmd
@@ -61,7 +61,6 @@ __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp, unsigned long addr)
#ifdef CONFIG_ARM_LPAE
struct ptdesc *ptdesc = virt_to_ptdesc(pmdp);
- pagetable_dtor(ptdesc);
tlb_remove_ptdesc(tlb, ptdesc);
#endif
}
--
2.20.1
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 07/12] arm64: pgtable: move pagetable_dtor() to __tlb_remove_table()
2024-12-14 9:02 [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Qi Zheng
` (5 preceding siblings ...)
2024-12-14 9:02 ` [PATCH 06/12] arm: pgtable: move pagetable_dtor() to __tlb_remove_table() Qi Zheng
@ 2024-12-14 9:02 ` Qi Zheng
2024-12-14 9:02 ` [PATCH 08/12] riscv: " Qi Zheng
` (5 subsequent siblings)
12 siblings, 0 replies; 32+ messages in thread
From: Qi Zheng @ 2024-12-14 9:02 UTC (permalink / raw)
To: peterz, tglx, david, jannh, hughd, yuzhao, willy, muchun.song,
vbabka, lorenzo.stoakes, akpm, rientjes
Cc: linux-mm, linux-kernel, Qi Zheng, linux-arm-kernel
Move pagetable_dtor() to __tlb_remove_table(), so that ptlock and page
table pages can be freed together (regardless of whether RCU is used).
This prevents the use-after-free problem where the ptlock is freed
immediately but the page table pages is freed later via RCU.
Page tables shouldn't have swap cache, so use pagetable_free() instead of
free_page_and_swap_cache() to free page table pages.
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: linux-arm-kernel@lists.infradead.org
---
arch/arm64/include/asm/tlb.h | 10 ++++------
1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
index 408d0f36a8a8f..93591a80b5bfb 100644
--- a/arch/arm64/include/asm/tlb.h
+++ b/arch/arm64/include/asm/tlb.h
@@ -9,11 +9,13 @@
#define __ASM_TLB_H
#include <linux/pagemap.h>
-#include <linux/swap.h>
static inline void __tlb_remove_table(void *_table)
{
- free_page_and_swap_cache((struct page *)_table);
+ struct ptdesc *ptdesc = (struct ptdesc *)_table;
+
+ pagetable_dtor(ptdesc);
+ pagetable_free(ptdesc);
}
#define tlb_flush tlb_flush
@@ -82,7 +84,6 @@ static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
{
struct ptdesc *ptdesc = page_ptdesc(pte);
- pagetable_dtor(ptdesc);
tlb_remove_ptdesc(tlb, ptdesc);
}
@@ -92,7 +93,6 @@ static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp,
{
struct ptdesc *ptdesc = virt_to_ptdesc(pmdp);
- pagetable_dtor(ptdesc);
tlb_remove_ptdesc(tlb, ptdesc);
}
#endif
@@ -106,7 +106,6 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pudp,
if (!pgtable_l4_enabled())
return;
- pagetable_dtor(ptdesc);
tlb_remove_ptdesc(tlb, ptdesc);
}
#endif
@@ -120,7 +119,6 @@ static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4dp,
if (!pgtable_l5_enabled())
return;
- pagetable_dtor(ptdesc);
tlb_remove_ptdesc(tlb, ptdesc);
}
#endif
--
2.20.1
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 08/12] riscv: pgtable: move pagetable_dtor() to __tlb_remove_table()
2024-12-14 9:02 [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Qi Zheng
` (6 preceding siblings ...)
2024-12-14 9:02 ` [PATCH 07/12] arm64: " Qi Zheng
@ 2024-12-14 9:02 ` Qi Zheng
2024-12-14 9:02 ` [PATCH 09/12] x86: " Qi Zheng
` (4 subsequent siblings)
12 siblings, 0 replies; 32+ messages in thread
From: Qi Zheng @ 2024-12-14 9:02 UTC (permalink / raw)
To: peterz, tglx, david, jannh, hughd, yuzhao, willy, muchun.song,
vbabka, lorenzo.stoakes, akpm, rientjes
Cc: linux-mm, linux-kernel, Qi Zheng, linux-riscv
Move pagetable_dtor() to __tlb_remove_table(), so that ptlock and page
table pages can be freed together (regardless of whether RCU is used).
This prevents the use-after-free problem where the ptlock is freed
immediately but the page table pages is freed later via RCU.
Page tables shouldn't have swap cache, so use pagetable_free() instead of
free_page_and_swap_cache() to free page table pages.
By the way, move the comment above __tlb_remove_table() to
riscv_tlb_remove_ptdesc(), it will be more appropriate.
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: linux-riscv@lists.infradead.org
---
arch/riscv/include/asm/pgalloc.h | 38 ++++++++++++++------------------
arch/riscv/include/asm/tlb.h | 14 ++++--------
2 files changed, 21 insertions(+), 31 deletions(-)
diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h
index 09c2eff571a49..fc50d14010246 100644
--- a/arch/riscv/include/asm/pgalloc.h
+++ b/arch/riscv/include/asm/pgalloc.h
@@ -18,12 +18,22 @@
#define __HAVE_ARCH_P4D_FREE
#include <asm-generic/pgalloc.h>
+/*
+ * While riscv platforms with riscv_ipi_for_rfence as true require an IPI to
+ * perform TLB shootdown, some platforms with riscv_ipi_for_rfence as false use
+ * SBI to perform TLB shootdown. To keep software pagetable walkers safe in this
+ * case we switch to RCU based table free (MMU_GATHER_RCU_TABLE_FREE). See the
+ * comment below 'ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE' in include/asm-generic/tlb.h
+ * for more details.
+ */
static inline void riscv_tlb_remove_ptdesc(struct mmu_gather *tlb, void *pt)
{
- if (riscv_use_sbi_for_rfence())
+ if (riscv_use_sbi_for_rfence()) {
tlb_remove_ptdesc(tlb, pt);
- else
+ } else {
+ pagetable_dtor(pt);
tlb_remove_page_ptdesc(tlb, pt);
+ }
}
static inline void pmd_populate_kernel(struct mm_struct *mm,
@@ -109,12 +119,8 @@ static inline void pud_free(struct mm_struct *mm, pud_t *pud)
static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
unsigned long addr)
{
- if (pgtable_l4_enabled) {
- struct ptdesc *ptdesc = virt_to_ptdesc(pud);
-
- pagetable_dtor(ptdesc);
- riscv_tlb_remove_ptdesc(tlb, ptdesc);
- }
+ if (pgtable_l4_enabled)
+ riscv_tlb_remove_ptdesc(tlb, virt_to_ptdesc(pud));
}
#define p4d_alloc_one p4d_alloc_one
@@ -136,12 +142,8 @@ static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
unsigned long addr)
{
- if (pgtable_l5_enabled) {
- struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
-
- pagetable_dtor(ptdesc);
+ if (pgtable_l5_enabled)
riscv_tlb_remove_ptdesc(tlb, virt_to_ptdesc(p4d));
- }
}
#endif /* __PAGETABLE_PMD_FOLDED */
@@ -170,10 +172,7 @@ static inline pgd_t *pgd_alloc(struct mm_struct *mm)
static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
unsigned long addr)
{
- struct ptdesc *ptdesc = virt_to_ptdesc(pmd);
-
- pagetable_dtor(ptdesc);
- riscv_tlb_remove_ptdesc(tlb, ptdesc);
+ riscv_tlb_remove_ptdesc(tlb, virt_to_ptdesc(pmd));
}
#endif /* __PAGETABLE_PMD_FOLDED */
@@ -181,10 +180,7 @@ static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
unsigned long addr)
{
- struct ptdesc *ptdesc = page_ptdesc(pte);
-
- pagetable_dtor(ptdesc);
- riscv_tlb_remove_ptdesc(tlb, ptdesc);
+ riscv_tlb_remove_ptdesc(tlb, page_ptdesc(pte));
}
#endif /* CONFIG_MMU */
diff --git a/arch/riscv/include/asm/tlb.h b/arch/riscv/include/asm/tlb.h
index 1f6c38420d8e0..ded8724b3c4f7 100644
--- a/arch/riscv/include/asm/tlb.h
+++ b/arch/riscv/include/asm/tlb.h
@@ -11,19 +11,13 @@ struct mmu_gather;
static void tlb_flush(struct mmu_gather *tlb);
#ifdef CONFIG_MMU
-#include <linux/swap.h>
-/*
- * While riscv platforms with riscv_ipi_for_rfence as true require an IPI to
- * perform TLB shootdown, some platforms with riscv_ipi_for_rfence as false use
- * SBI to perform TLB shootdown. To keep software pagetable walkers safe in this
- * case we switch to RCU based table free (MMU_GATHER_RCU_TABLE_FREE). See the
- * comment below 'ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE' in include/asm-generic/tlb.h
- * for more details.
- */
static inline void __tlb_remove_table(void *table)
{
- free_page_and_swap_cache(table);
+ struct ptdesc *ptdesc = (struct ptdesc *)table;
+
+ pagetable_dtor(ptdesc);
+ pagetable_free(ptdesc);
}
#endif /* CONFIG_MMU */
--
2.20.1
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 09/12] x86: pgtable: move pagetable_dtor() to __tlb_remove_table()
2024-12-14 9:02 [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Qi Zheng
` (7 preceding siblings ...)
2024-12-14 9:02 ` [PATCH 08/12] riscv: " Qi Zheng
@ 2024-12-14 9:02 ` Qi Zheng
2024-12-14 9:02 ` [PATCH 10/12] s390: pgtable: also move pagetable_dtor() of PxD " Qi Zheng
` (3 subsequent siblings)
12 siblings, 0 replies; 32+ messages in thread
From: Qi Zheng @ 2024-12-14 9:02 UTC (permalink / raw)
To: peterz, tglx, david, jannh, hughd, yuzhao, willy, muchun.song,
vbabka, lorenzo.stoakes, akpm, rientjes
Cc: linux-mm, linux-kernel, Qi Zheng, x86
Move pagetable_dtor() to __tlb_remove_table(), so that ptlock and page
table pages can be freed together (regardless of whether RCU is used).
This prevents the use-after-free problem where the ptlock is freed
immediately but the page table pages is freed later via RCU.
Page tables shouldn't have swap cache, so use pagetable_free() instead of
free_page_and_swap_cache() to free page table pages.
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: x86@kernel.org
---
arch/x86/include/asm/tlb.h | 17 ++++++++++-------
arch/x86/kernel/paravirt.c | 1 +
arch/x86/mm/pgtable.c | 12 ++----------
3 files changed, 13 insertions(+), 17 deletions(-)
diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 73f0786181cc9..f64730be5ad67 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -31,24 +31,27 @@ static inline void tlb_flush(struct mmu_gather *tlb)
*/
static inline void __tlb_remove_table(void *table)
{
- free_page_and_swap_cache(table);
+ struct ptdesc *ptdesc = (struct ptdesc *)table;
+
+ pagetable_dtor(ptdesc);
+ pagetable_free(ptdesc);
}
#ifdef CONFIG_PT_RECLAIM
static inline void __tlb_remove_table_one_rcu(struct rcu_head *head)
{
- struct page *page;
+ struct ptdesc *ptdesc;
- page = container_of(head, struct page, rcu_head);
- put_page(page);
+ ptdesc = container_of(head, struct ptdesc, pt_rcu_head);
+ __tlb_remove_table(ptdesc);
}
static inline void __tlb_remove_table_one(void *table)
{
- struct page *page;
+ struct ptdesc *ptdesc;
- page = table;
- call_rcu(&page->rcu_head, __tlb_remove_table_one_rcu);
+ ptdesc = table;
+ call_rcu(&ptdesc->pt_rcu_head, __tlb_remove_table_one_rcu);
}
#define __tlb_remove_table_one __tlb_remove_table_one
#endif /* CONFIG_PT_RECLAIM */
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index 7bdcf152778c0..46d5d325483b0 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -62,6 +62,7 @@ void __init native_pv_lock_init(void)
#ifndef CONFIG_PT_RECLAIM
static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
{
+ pagetable_dtor(table);
tlb_remove_page(tlb, table);
}
#else
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index a6cd9660e29ec..a0b0e501ba663 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -23,6 +23,7 @@ EXPORT_SYMBOL(physical_mask);
static inline
void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
{
+ pagetable_dtor(table);
tlb_remove_page(tlb, table);
}
#else
@@ -60,7 +61,6 @@ early_param("userpte", setup_userpte);
void ___pte_free_tlb(struct mmu_gather *tlb, struct page *pte)
{
- pagetable_dtor(page_ptdesc(pte));
paravirt_release_pte(page_to_pfn(pte));
paravirt_tlb_remove_table(tlb, pte);
}
@@ -68,7 +68,6 @@ void ___pte_free_tlb(struct mmu_gather *tlb, struct page *pte)
#if CONFIG_PGTABLE_LEVELS > 2
void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
{
- struct ptdesc *ptdesc = virt_to_ptdesc(pmd);
paravirt_release_pmd(__pa(pmd) >> PAGE_SHIFT);
/*
* NOTE! For PAE, any changes to the top page-directory-pointer-table
@@ -77,16 +76,12 @@ void ___pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd)
#ifdef CONFIG_X86_PAE
tlb->need_flush_all = 1;
#endif
- pagetable_dtor(ptdesc);
- paravirt_tlb_remove_table(tlb, ptdesc_page(ptdesc));
+ paravirt_tlb_remove_table(tlb, virt_to_page(pmd));
}
#if CONFIG_PGTABLE_LEVELS > 3
void ___pud_free_tlb(struct mmu_gather *tlb, pud_t *pud)
{
- struct ptdesc *ptdesc = virt_to_ptdesc(pud);
-
- pagetable_dtor(ptdesc);
paravirt_release_pud(__pa(pud) >> PAGE_SHIFT);
paravirt_tlb_remove_table(tlb, virt_to_page(pud));
}
@@ -94,9 +89,6 @@ void ___pud_free_tlb(struct mmu_gather *tlb, pud_t *pud)
#if CONFIG_PGTABLE_LEVELS > 4
void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d)
{
- struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
-
- pagetable_dtor(ptdesc);
paravirt_release_p4d(__pa(p4d) >> PAGE_SHIFT);
paravirt_tlb_remove_table(tlb, virt_to_page(p4d));
}
--
2.20.1
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 10/12] s390: pgtable: also move pagetable_dtor() of PxD to __tlb_remove_table()
2024-12-14 9:02 [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Qi Zheng
` (8 preceding siblings ...)
2024-12-14 9:02 ` [PATCH 09/12] x86: " Qi Zheng
@ 2024-12-14 9:02 ` Qi Zheng
2024-12-14 9:02 ` [PATCH 11/12] mm: pgtable: introduce generic __tlb_remove_table() Qi Zheng
` (2 subsequent siblings)
12 siblings, 0 replies; 32+ messages in thread
From: Qi Zheng @ 2024-12-14 9:02 UTC (permalink / raw)
To: peterz, tglx, david, jannh, hughd, yuzhao, willy, muchun.song,
vbabka, lorenzo.stoakes, akpm, rientjes
Cc: linux-mm, linux-kernel, Qi Zheng, linux-s390
In s390, the pagetable_dtor() of PTE has long been moved to
__tlb_remove_table(). So similarly, also move the pagetable_dtor() of
PMD|PUD|P4D to __tlb_remove_table(). This prevents the use-after-free
problem where the ptlock is freed immediately but the page table pages
is freed later via RCU.
By the way, rename pagetable_pte_dtor_free() to pagetable_dtor_free().
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Cc: linux-s390@vger.kernel.org
---
arch/s390/include/asm/tlb.h | 3 ---
arch/s390/mm/pgalloc.c | 14 ++++----------
2 files changed, 4 insertions(+), 13 deletions(-)
diff --git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h
index 74b6fba4c2ee3..79df7c0932c56 100644
--- a/arch/s390/include/asm/tlb.h
+++ b/arch/s390/include/asm/tlb.h
@@ -102,7 +102,6 @@ static inline void pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
{
if (mm_pmd_folded(tlb->mm))
return;
- pagetable_dtor(virt_to_ptdesc(pmd));
__tlb_adjust_range(tlb, address, PAGE_SIZE);
tlb->mm->context.flush_mm = 1;
tlb->freed_tables = 1;
@@ -122,7 +121,6 @@ static inline void pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
{
if (mm_pud_folded(tlb->mm))
return;
- pagetable_dtor(virt_to_ptdesc(pud));
tlb->mm->context.flush_mm = 1;
tlb->freed_tables = 1;
tlb->cleared_p4ds = 1;
@@ -141,7 +139,6 @@ static inline void p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
{
if (mm_p4d_folded(tlb->mm))
return;
- pagetable_dtor(virt_to_ptdesc(p4d));
__tlb_adjust_range(tlb, address, PAGE_SIZE);
tlb->mm->context.flush_mm = 1;
tlb->freed_tables = 1;
diff --git a/arch/s390/mm/pgalloc.c b/arch/s390/mm/pgalloc.c
index 569de24d33761..c73b89811a264 100644
--- a/arch/s390/mm/pgalloc.c
+++ b/arch/s390/mm/pgalloc.c
@@ -180,7 +180,7 @@ unsigned long *page_table_alloc(struct mm_struct *mm)
return table;
}
-static void pagetable_pte_dtor_free(struct ptdesc *ptdesc)
+static void pagetable_dtor_free(struct ptdesc *ptdesc)
{
pagetable_dtor(ptdesc);
pagetable_free(ptdesc);
@@ -190,20 +190,14 @@ void page_table_free(struct mm_struct *mm, unsigned long *table)
{
struct ptdesc *ptdesc = virt_to_ptdesc(table);
- pagetable_pte_dtor_free(ptdesc);
+ pagetable_dtor_free(ptdesc);
}
void __tlb_remove_table(void *table)
{
struct ptdesc *ptdesc = virt_to_ptdesc(table);
- struct page *page = ptdesc_page(ptdesc);
- if (compound_order(page) == CRST_ALLOC_ORDER) {
- /* pmd, pud, or p4d */
- pagetable_free(ptdesc);
- return;
- }
- pagetable_pte_dtor_free(ptdesc);
+ pagetable_dtor_free(ptdesc);
}
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
@@ -211,7 +205,7 @@ static void pte_free_now(struct rcu_head *head)
{
struct ptdesc *ptdesc = container_of(head, struct ptdesc, pt_rcu_head);
- pagetable_pte_dtor_free(ptdesc);
+ pagetable_dtor_free(ptdesc);
}
void pte_free_defer(struct mm_struct *mm, pgtable_t pgtable)
--
2.20.1
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 11/12] mm: pgtable: introduce generic __tlb_remove_table()
2024-12-14 9:02 [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Qi Zheng
` (9 preceding siblings ...)
2024-12-14 9:02 ` [PATCH 10/12] s390: pgtable: also move pagetable_dtor() of PxD " Qi Zheng
@ 2024-12-14 9:02 ` Qi Zheng
2024-12-16 12:00 ` Peter Zijlstra
2024-12-14 9:02 ` [PATCH 12/12] mm: pgtable: move __tlb_remove_table_one() in x86 to generic file Qi Zheng
2024-12-14 18:55 ` [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Peter Zijlstra
12 siblings, 1 reply; 32+ messages in thread
From: Qi Zheng @ 2024-12-14 9:02 UTC (permalink / raw)
To: peterz, tglx, david, jannh, hughd, yuzhao, willy, muchun.song,
vbabka, lorenzo.stoakes, akpm, rientjes
Cc: linux-mm, linux-kernel, Qi Zheng
Several architectures (arm, arm64, riscv, s390 and x86) define exactly the
same __tlb_remove_table(), just introduce generic __tlb_remove_table() to
eliminate these duplications.
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
arch/arm/include/asm/tlb.h | 9 ---------
arch/arm64/include/asm/tlb.h | 7 -------
arch/powerpc/include/asm/tlb.h | 1 +
arch/riscv/include/asm/tlb.h | 12 ------------
arch/s390/include/asm/tlb.h | 1 -
arch/s390/mm/pgalloc.c | 7 -------
arch/sparc/include/asm/tlb_32.h | 1 +
arch/sparc/include/asm/tlb_64.h | 1 +
arch/x86/include/asm/tlb.h | 17 -----------------
include/asm-generic/tlb.h | 15 +++++++++++++--
10 files changed, 16 insertions(+), 55 deletions(-)
diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
index 264ab635e807a..ea4fbe7b17f6f 100644
--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -27,15 +27,6 @@
#else /* !CONFIG_MMU */
#include <asm/tlbflush.h>
-
-static inline void __tlb_remove_table(void *_table)
-{
- struct ptdesc *ptdesc = (struct ptdesc *)_table;
-
- pagetable_dtor(ptdesc);
- pagetable_free(ptdesc);
-}
-
#include <asm-generic/tlb.h>
static inline void
diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
index 93591a80b5bfb..8d762607285cc 100644
--- a/arch/arm64/include/asm/tlb.h
+++ b/arch/arm64/include/asm/tlb.h
@@ -10,13 +10,6 @@
#include <linux/pagemap.h>
-static inline void __tlb_remove_table(void *_table)
-{
- struct ptdesc *ptdesc = (struct ptdesc *)_table;
-
- pagetable_dtor(ptdesc);
- pagetable_free(ptdesc);
-}
#define tlb_flush tlb_flush
static void tlb_flush(struct mmu_gather *tlb);
diff --git a/arch/powerpc/include/asm/tlb.h b/arch/powerpc/include/asm/tlb.h
index 1ca7d4c4b90db..2058e8d3e0138 100644
--- a/arch/powerpc/include/asm/tlb.h
+++ b/arch/powerpc/include/asm/tlb.h
@@ -37,6 +37,7 @@ extern void tlb_flush(struct mmu_gather *tlb);
*/
#define tlb_needs_table_invalidate() radix_enabled()
+#define __HAVE_ARCH_TLB_REMOVE_TABLE
/* Get the generic bits... */
#include <asm-generic/tlb.h>
diff --git a/arch/riscv/include/asm/tlb.h b/arch/riscv/include/asm/tlb.h
index ded8724b3c4f7..50b63b5c15bd8 100644
--- a/arch/riscv/include/asm/tlb.h
+++ b/arch/riscv/include/asm/tlb.h
@@ -10,18 +10,6 @@ struct mmu_gather;
static void tlb_flush(struct mmu_gather *tlb);
-#ifdef CONFIG_MMU
-
-static inline void __tlb_remove_table(void *table)
-{
- struct ptdesc *ptdesc = (struct ptdesc *)table;
-
- pagetable_dtor(ptdesc);
- pagetable_free(ptdesc);
-}
-
-#endif /* CONFIG_MMU */
-
#define tlb_flush tlb_flush
#include <asm-generic/tlb.h>
diff --git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h
index 79df7c0932c56..7052780740349 100644
--- a/arch/s390/include/asm/tlb.h
+++ b/arch/s390/include/asm/tlb.h
@@ -22,7 +22,6 @@
* Pages used for the page tables is a different story. FIXME: more
*/
-void __tlb_remove_table(void *_table);
static inline void tlb_flush(struct mmu_gather *tlb);
static inline bool __tlb_remove_page_size(struct mmu_gather *tlb,
struct page *page, bool delay_rmap, int page_size);
diff --git a/arch/s390/mm/pgalloc.c b/arch/s390/mm/pgalloc.c
index c73b89811a264..3e002dea6278f 100644
--- a/arch/s390/mm/pgalloc.c
+++ b/arch/s390/mm/pgalloc.c
@@ -193,13 +193,6 @@ void page_table_free(struct mm_struct *mm, unsigned long *table)
pagetable_dtor_free(ptdesc);
}
-void __tlb_remove_table(void *table)
-{
- struct ptdesc *ptdesc = virt_to_ptdesc(table);
-
- pagetable_dtor_free(ptdesc);
-}
-
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
static void pte_free_now(struct rcu_head *head)
{
diff --git a/arch/sparc/include/asm/tlb_32.h b/arch/sparc/include/asm/tlb_32.h
index 5cd28a8793e39..910254867dfbd 100644
--- a/arch/sparc/include/asm/tlb_32.h
+++ b/arch/sparc/include/asm/tlb_32.h
@@ -2,6 +2,7 @@
#ifndef _SPARC_TLB_H
#define _SPARC_TLB_H
+#define __HAVE_ARCH_TLB_REMOVE_TABLE
#include <asm-generic/tlb.h>
#endif /* _SPARC_TLB_H */
diff --git a/arch/sparc/include/asm/tlb_64.h b/arch/sparc/include/asm/tlb_64.h
index 3037187482db7..1a6e694418e39 100644
--- a/arch/sparc/include/asm/tlb_64.h
+++ b/arch/sparc/include/asm/tlb_64.h
@@ -33,6 +33,7 @@ void flush_tlb_pending(void);
#define tlb_needs_table_invalidate() (false)
#endif
+#define __HAVE_ARCH_TLB_REMOVE_TABLE
#include <asm-generic/tlb.h>
#endif /* _SPARC64_TLB_H */
diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index f64730be5ad67..3858dbf75880e 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -20,23 +20,6 @@ static inline void tlb_flush(struct mmu_gather *tlb)
flush_tlb_mm_range(tlb->mm, start, end, stride_shift, tlb->freed_tables);
}
-/*
- * While x86 architecture in general requires an IPI to perform TLB
- * shootdown, enablement code for several hypervisors overrides
- * .flush_tlb_others hook in pv_mmu_ops and implements it by issuing
- * a hypercall. To keep software pagetable walkers safe in this case we
- * switch to RCU based table free (MMU_GATHER_RCU_TABLE_FREE). See the comment
- * below 'ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE' in include/asm-generic/tlb.h
- * for more details.
- */
-static inline void __tlb_remove_table(void *table)
-{
- struct ptdesc *ptdesc = (struct ptdesc *)table;
-
- pagetable_dtor(ptdesc);
- pagetable_free(ptdesc);
-}
-
#ifdef CONFIG_PT_RECLAIM
static inline void __tlb_remove_table_one_rcu(struct rcu_head *head)
{
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 709830274b756..939a813023d7e 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -153,8 +153,9 @@
*
* Useful if your architecture has non-page page directories.
*
- * When used, an architecture is expected to provide __tlb_remove_table()
- * which does the actual freeing of these pages.
+ * When used, an architecture is expected to provide __tlb_remove_table() or
+ * use the generic __tlb_remove_table(), which does the actual freeing of these
+ * pages.
*
* MMU_GATHER_RCU_TABLE_FREE
*
@@ -207,6 +208,16 @@ struct mmu_table_batch {
#define MAX_TABLE_BATCH \
((PAGE_SIZE - sizeof(struct mmu_table_batch)) / sizeof(void *))
+#ifndef __HAVE_ARCH_TLB_REMOVE_TABLE
+static inline void __tlb_remove_table(void *_table)
+{
+ struct ptdesc *ptdesc = (struct ptdesc *)_table;
+
+ pagetable_dtor(ptdesc);
+ pagetable_free(ptdesc);
+}
+#endif
+
extern void tlb_remove_table(struct mmu_gather *tlb, void *table);
#else /* !CONFIG_MMU_GATHER_HAVE_TABLE_FREE */
--
2.20.1
^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 12/12] mm: pgtable: move __tlb_remove_table_one() in x86 to generic file
2024-12-14 9:02 [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Qi Zheng
` (10 preceding siblings ...)
2024-12-14 9:02 ` [PATCH 11/12] mm: pgtable: introduce generic __tlb_remove_table() Qi Zheng
@ 2024-12-14 9:02 ` Qi Zheng
2024-12-14 18:55 ` [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Peter Zijlstra
12 siblings, 0 replies; 32+ messages in thread
From: Qi Zheng @ 2024-12-14 9:02 UTC (permalink / raw)
To: peterz, tglx, david, jannh, hughd, yuzhao, willy, muchun.song,
vbabka, lorenzo.stoakes, akpm, rientjes
Cc: linux-mm, linux-kernel, Qi Zheng
The __tlb_remove_table_one() in x86 does not contain architecture-specific
content, so move it to the generic file.
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
arch/x86/include/asm/tlb.h | 19 -------------------
mm/mmu_gather.c | 20 ++++++++++++++++++--
2 files changed, 18 insertions(+), 21 deletions(-)
diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 3858dbf75880e..77f52bc1578a7 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -20,25 +20,6 @@ static inline void tlb_flush(struct mmu_gather *tlb)
flush_tlb_mm_range(tlb->mm, start, end, stride_shift, tlb->freed_tables);
}
-#ifdef CONFIG_PT_RECLAIM
-static inline void __tlb_remove_table_one_rcu(struct rcu_head *head)
-{
- struct ptdesc *ptdesc;
-
- ptdesc = container_of(head, struct ptdesc, pt_rcu_head);
- __tlb_remove_table(ptdesc);
-}
-
-static inline void __tlb_remove_table_one(void *table)
-{
- struct ptdesc *ptdesc;
-
- ptdesc = table;
- call_rcu(&ptdesc->pt_rcu_head, __tlb_remove_table_one_rcu);
-}
-#define __tlb_remove_table_one __tlb_remove_table_one
-#endif /* CONFIG_PT_RECLAIM */
-
static inline void invlpg(unsigned long addr)
{
asm volatile("invlpg (%0)" ::"r" (addr) : "memory");
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index 1e21022bcf339..7aa6f18c500b2 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -311,13 +311,29 @@ static inline void tlb_table_invalidate(struct mmu_gather *tlb)
}
}
-#ifndef __tlb_remove_table_one
+#ifdef CONFIG_PT_RECLAIM
+static inline void __tlb_remove_table_one_rcu(struct rcu_head *head)
+{
+ struct ptdesc *ptdesc;
+
+ ptdesc = container_of(head, struct ptdesc, pt_rcu_head);
+ __tlb_remove_table(ptdesc);
+}
+
+static inline void __tlb_remove_table_one(void *table)
+{
+ struct ptdesc *ptdesc;
+
+ ptdesc = table;
+ call_rcu(&ptdesc->pt_rcu_head, __tlb_remove_table_one_rcu);
+}
+#else
static inline void __tlb_remove_table_one(void *table)
{
tlb_remove_table_sync_one();
__tlb_remove_table(table);
}
-#endif
+#endif /* CONFIG_PT_RECLAIM */
static void tlb_remove_table_one(void *table)
{
--
2.20.1
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 01/12] Revert "mm: pgtable: make ptlock be freed by RCU"
2024-12-14 9:02 ` [PATCH 01/12] Revert "mm: pgtable: make ptlock be freed by RCU" Qi Zheng
@ 2024-12-14 18:29 ` Yu Zhao
2024-12-15 6:29 ` Qi Zheng
0 siblings, 1 reply; 32+ messages in thread
From: Yu Zhao @ 2024-12-14 18:29 UTC (permalink / raw)
To: Qi Zheng
Cc: peterz, tglx, david, jannh, hughd, willy, muchun.song, vbabka,
lorenzo.stoakes, akpm, rientjes, linux-mm, linux-kernel
On Sat, Dec 14, 2024 at 2:03 AM Qi Zheng <zhengqi.arch@bytedance.com> wrote:
>
> This reverts commit 2f3443770437e49abc39af26962d293851cbab6d.
>
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
Bloating struct pt_lock is unnecessary. Glad to see it's reverted.
Acked-by: Yu Zhao <yuzhao@google.com>
> ---
> include/linux/mm.h | 2 +-
> include/linux/mm_types.h | 9 +--------
> mm/memory.c | 22 ++++++----------------
> 3 files changed, 8 insertions(+), 25 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index e7902980439cc..5e73e53c34e9e 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -2988,7 +2988,7 @@ void ptlock_free(struct ptdesc *ptdesc);
>
> static inline spinlock_t *ptlock_ptr(struct ptdesc *ptdesc)
> {
> - return &(ptdesc->ptl->ptl);
> + return ptdesc->ptl;
> }
> #else /* ALLOC_SPLIT_PTLOCKS */
> static inline void ptlock_cache_init(void)
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index df8f5152644ec..5d8779997266e 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -434,13 +434,6 @@ FOLIO_MATCH(flags, _flags_2a);
> FOLIO_MATCH(compound_head, _head_2a);
> #undef FOLIO_MATCH
>
> -#if ALLOC_SPLIT_PTLOCKS
> -struct pt_lock {
> - spinlock_t ptl;
> - struct rcu_head rcu;
> -};
> -#endif
> -
> /**
> * struct ptdesc - Memory descriptor for page tables.
> * @__page_flags: Same as page flags. Powerpc only.
> @@ -485,7 +478,7 @@ struct ptdesc {
> union {
> unsigned long _pt_pad_2;
> #if ALLOC_SPLIT_PTLOCKS
> - struct pt_lock *ptl;
> + spinlock_t *ptl;
> #else
> spinlock_t ptl;
> #endif
> diff --git a/mm/memory.c b/mm/memory.c
> index d9af83dd86bbf..83765632e20b0 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -7041,34 +7041,24 @@ static struct kmem_cache *page_ptl_cachep;
>
> void __init ptlock_cache_init(void)
> {
> - page_ptl_cachep = kmem_cache_create("page->ptl", sizeof(struct pt_lock), 0,
> + page_ptl_cachep = kmem_cache_create("page->ptl", sizeof(spinlock_t), 0,
> SLAB_PANIC, NULL);
> }
>
> bool ptlock_alloc(struct ptdesc *ptdesc)
> {
> - struct pt_lock *pt_lock;
> + spinlock_t *ptl;
>
> - pt_lock = kmem_cache_alloc(page_ptl_cachep, GFP_KERNEL);
> - if (!pt_lock)
> + ptl = kmem_cache_alloc(page_ptl_cachep, GFP_KERNEL);
> + if (!ptl)
> return false;
> - ptdesc->ptl = pt_lock;
> + ptdesc->ptl = ptl;
> return true;
> }
>
> -static void ptlock_free_rcu(struct rcu_head *head)
> -{
> - struct pt_lock *pt_lock;
> -
> - pt_lock = container_of(head, struct pt_lock, rcu);
> - kmem_cache_free(page_ptl_cachep, pt_lock);
> -}
> -
> void ptlock_free(struct ptdesc *ptdesc)
> {
> - struct pt_lock *pt_lock = ptdesc->ptl;
> -
> - call_rcu(&pt_lock->rcu, ptlock_free_rcu);
> + kmem_cache_free(page_ptl_cachep, ptdesc->ptl);
> }
> #endif
>
> --
> 2.20.1
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table()
2024-12-14 9:02 [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Qi Zheng
` (11 preceding siblings ...)
2024-12-14 9:02 ` [PATCH 12/12] mm: pgtable: move __tlb_remove_table_one() in x86 to generic file Qi Zheng
@ 2024-12-14 18:55 ` Peter Zijlstra
2024-12-15 6:23 ` Qi Zheng
12 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2024-12-14 18:55 UTC (permalink / raw)
To: Qi Zheng
Cc: tglx, david, jannh, hughd, yuzhao, willy, muchun.song, vbabka,
lorenzo.stoakes, akpm, rientjes, linux-mm, linux-kernel
On Sat, Dec 14, 2024 at 05:02:46PM +0800, Qi Zheng wrote:
> As proposed [1] by Peter Zijlstra below, this patch series aims to move
> pagetable_*_dtor() into __tlb_remove_table(). This will cleanup pagetable_*_dtor()
> a bit and more gracefully fix the UAF issue [2] reported by syzbot.
I'll go over the patches in more detail on Monday, but at least the
first few patches should probably carry something like:
Originally-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Since you extracted it from my half-baked patch, the rest could probably
do with:
Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Anyway, thanks for doing this, and so quicky! As said, I'll read more on
Monday.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table()
2024-12-14 18:55 ` [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Peter Zijlstra
@ 2024-12-15 6:23 ` Qi Zheng
0 siblings, 0 replies; 32+ messages in thread
From: Qi Zheng @ 2024-12-15 6:23 UTC (permalink / raw)
To: Peter Zijlstra
Cc: tglx, david, jannh, hughd, yuzhao, willy, muchun.song, vbabka,
lorenzo.stoakes, akpm, rientjes, linux-mm, linux-kernel
On 2024/12/15 02:55, Peter Zijlstra wrote:
> On Sat, Dec 14, 2024 at 05:02:46PM +0800, Qi Zheng wrote:
>
>> As proposed [1] by Peter Zijlstra below, this patch series aims to move
>> pagetable_*_dtor() into __tlb_remove_table(). This will cleanup pagetable_*_dtor()
>> a bit and more gracefully fix the UAF issue [2] reported by syzbot.
>
> I'll go over the patches in more detail on Monday, but at least the
> first few patches should probably carry something like:
>
> Originally-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Right! Will add this for [PATCH 02/12], [PATCH 03/12] and [PATCH 05/12],
>
> Since you extracted it from my half-baked patch, the rest could probably
> do with:
>
> Suggested-by: Peter Zijlstra (Intel) <peterz@infradead.org>
and add this for [PATCH 04/12], [PATCH 06|07|08|09|10/12].
>
> Anyway, thanks for doing this, and so quicky! As said, I'll read more on
> Monday.
Thanks!
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 01/12] Revert "mm: pgtable: make ptlock be freed by RCU"
2024-12-14 18:29 ` Yu Zhao
@ 2024-12-15 6:29 ` Qi Zheng
2024-12-16 6:10 ` Andrew Morton
0 siblings, 1 reply; 32+ messages in thread
From: Qi Zheng @ 2024-12-15 6:29 UTC (permalink / raw)
To: Yu Zhao
Cc: peterz, tglx, david, jannh, hughd, willy, muchun.song, vbabka,
lorenzo.stoakes, akpm, rientjes, linux-mm, linux-kernel
On 2024/12/15 02:29, Yu Zhao wrote:
> On Sat, Dec 14, 2024 at 2:03 AM Qi Zheng <zhengqi.arch@bytedance.com> wrote:
>>
>> This reverts commit 2f3443770437e49abc39af26962d293851cbab6d.
>>
>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>
> Bloating struct pt_lock is unnecessary. Glad to see it's reverted.
Indeed!
>
> Acked-by: Yu Zhao <yuzhao@google.com>
Thanks! Once the review of this patch series is completed, we can simply
drop "mm: pgtable: make ptlock be freed by RCU" from mm tree.
>
>
>> ---
>> include/linux/mm.h | 2 +-
>> include/linux/mm_types.h | 9 +--------
>> mm/memory.c | 22 ++++++----------------
>> 3 files changed, 8 insertions(+), 25 deletions(-)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index e7902980439cc..5e73e53c34e9e 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -2988,7 +2988,7 @@ void ptlock_free(struct ptdesc *ptdesc);
>>
>> static inline spinlock_t *ptlock_ptr(struct ptdesc *ptdesc)
>> {
>> - return &(ptdesc->ptl->ptl);
>> + return ptdesc->ptl;
>> }
>> #else /* ALLOC_SPLIT_PTLOCKS */
>> static inline void ptlock_cache_init(void)
>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>> index df8f5152644ec..5d8779997266e 100644
>> --- a/include/linux/mm_types.h
>> +++ b/include/linux/mm_types.h
>> @@ -434,13 +434,6 @@ FOLIO_MATCH(flags, _flags_2a);
>> FOLIO_MATCH(compound_head, _head_2a);
>> #undef FOLIO_MATCH
>>
>> -#if ALLOC_SPLIT_PTLOCKS
>> -struct pt_lock {
>> - spinlock_t ptl;
>> - struct rcu_head rcu;
>> -};
>> -#endif
>> -
>> /**
>> * struct ptdesc - Memory descriptor for page tables.
>> * @__page_flags: Same as page flags. Powerpc only.
>> @@ -485,7 +478,7 @@ struct ptdesc {
>> union {
>> unsigned long _pt_pad_2;
>> #if ALLOC_SPLIT_PTLOCKS
>> - struct pt_lock *ptl;
>> + spinlock_t *ptl;
>> #else
>> spinlock_t ptl;
>> #endif
>> diff --git a/mm/memory.c b/mm/memory.c
>> index d9af83dd86bbf..83765632e20b0 100644
>> --- a/mm/memory.c
>> +++ b/mm/memory.c
>> @@ -7041,34 +7041,24 @@ static struct kmem_cache *page_ptl_cachep;
>>
>> void __init ptlock_cache_init(void)
>> {
>> - page_ptl_cachep = kmem_cache_create("page->ptl", sizeof(struct pt_lock), 0,
>> + page_ptl_cachep = kmem_cache_create("page->ptl", sizeof(spinlock_t), 0,
>> SLAB_PANIC, NULL);
>> }
>>
>> bool ptlock_alloc(struct ptdesc *ptdesc)
>> {
>> - struct pt_lock *pt_lock;
>> + spinlock_t *ptl;
>>
>> - pt_lock = kmem_cache_alloc(page_ptl_cachep, GFP_KERNEL);
>> - if (!pt_lock)
>> + ptl = kmem_cache_alloc(page_ptl_cachep, GFP_KERNEL);
>> + if (!ptl)
>> return false;
>> - ptdesc->ptl = pt_lock;
>> + ptdesc->ptl = ptl;
>> return true;
>> }
>>
>> -static void ptlock_free_rcu(struct rcu_head *head)
>> -{
>> - struct pt_lock *pt_lock;
>> -
>> - pt_lock = container_of(head, struct pt_lock, rcu);
>> - kmem_cache_free(page_ptl_cachep, pt_lock);
>> -}
>> -
>> void ptlock_free(struct ptdesc *ptdesc)
>> {
>> - struct pt_lock *pt_lock = ptdesc->ptl;
>> -
>> - call_rcu(&pt_lock->rcu, ptlock_free_rcu);
>> + kmem_cache_free(page_ptl_cachep, ptdesc->ptl);
>> }
>> #endif
>>
>> --
>> 2.20.1
>>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 01/12] Revert "mm: pgtable: make ptlock be freed by RCU"
2024-12-15 6:29 ` Qi Zheng
@ 2024-12-16 6:10 ` Andrew Morton
2024-12-16 6:15 ` Qi Zheng
0 siblings, 1 reply; 32+ messages in thread
From: Andrew Morton @ 2024-12-16 6:10 UTC (permalink / raw)
To: Qi Zheng
Cc: Yu Zhao, peterz, tglx, david, jannh, hughd, willy, muchun.song,
vbabka, lorenzo.stoakes, rientjes, linux-mm, linux-kernel
On Sun, 15 Dec 2024 14:29:38 +0800 Qi Zheng <zhengqi.arch@bytedance.com> wrote:
> >
> > Acked-by: Yu Zhao <yuzhao@google.com>
>
> Thanks! Once the review of this patch series is completed, we can simply
> drop "mm: pgtable: make ptlock be freed by RCU" from mm tree.
Can we drop it now and does the remainder of the series "synchronously
scan and reclaim empty user PTE pages v4" remain valid and useful?
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 01/12] Revert "mm: pgtable: make ptlock be freed by RCU"
2024-12-16 6:10 ` Andrew Morton
@ 2024-12-16 6:15 ` Qi Zheng
2024-12-16 6:35 ` Andrew Morton
0 siblings, 1 reply; 32+ messages in thread
From: Qi Zheng @ 2024-12-16 6:15 UTC (permalink / raw)
To: Andrew Morton
Cc: Yu Zhao, peterz, tglx, david, jannh, hughd, willy, muchun.song,
vbabka, lorenzo.stoakes, rientjes, linux-mm, linux-kernel
Hi Andrew,
On 2024/12/16 14:10, Andrew Morton wrote:
> On Sun, 15 Dec 2024 14:29:38 +0800 Qi Zheng <zhengqi.arch@bytedance.com> wrote:
>
>>>
>>> Acked-by: Yu Zhao <yuzhao@google.com>
>>
>> Thanks! Once the review of this patch series is completed, we can simply
>> drop "mm: pgtable: make ptlock be freed by RCU" from mm tree.
>
> Can we drop it now and does the remainder of the series "synchronously
> scan and reclaim empty user PTE pages v4" remain valid and useful?
The "mm: pgtable: make ptlock be freed by RCU" fixes the UAF issue [1]
reported by syzbot. If it is dropped now and this patch series is not
merged, the UAF issue will reappear.
[1].
https://lore.kernel.org/lkml/67548279.050a0220.a30f1.015b.GAE@google.com/
Thanks,
Qi
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 01/12] Revert "mm: pgtable: make ptlock be freed by RCU"
2024-12-16 6:15 ` Qi Zheng
@ 2024-12-16 6:35 ` Andrew Morton
2024-12-16 6:39 ` Qi Zheng
0 siblings, 1 reply; 32+ messages in thread
From: Andrew Morton @ 2024-12-16 6:35 UTC (permalink / raw)
To: Qi Zheng
Cc: Yu Zhao, peterz, tglx, david, jannh, hughd, willy, muchun.song,
vbabka, lorenzo.stoakes, rientjes, linux-mm, linux-kernel
On Mon, 16 Dec 2024 14:15:35 +0800 Qi Zheng <zhengqi.arch@bytedance.com> wrote:
> Hi Andrew,
>
> On 2024/12/16 14:10, Andrew Morton wrote:
> > On Sun, 15 Dec 2024 14:29:38 +0800 Qi Zheng <zhengqi.arch@bytedance.com> wrote:
> >
> >>>
> >>> Acked-by: Yu Zhao <yuzhao@google.com>
> >>
> >> Thanks! Once the review of this patch series is completed, we can simply
> >> drop "mm: pgtable: make ptlock be freed by RCU" from mm tree.
> >
> > Can we drop it now and does the remainder of the series "synchronously
> > scan and reclaim empty user PTE pages v4" remain valid and useful?
>
> The "mm: pgtable: make ptlock be freed by RCU" fixes the UAF issue [1]
> reported by syzbot. If it is dropped now and this patch series is not
> merged, the UAF issue will reappear.
>
> [1].
> https://lore.kernel.org/lkml/67548279.050a0220.a30f1.015b.GAE@google.com/
OK, so as I understand it,
- the series "synchronously scan and reclaim empty user PTE pages v4"
exposes a use-after-free bug, and fixes that bug with the patch "mm:
pgtable: make ptlock be freed by RCU".
- The series "move pagetable_*_dtor() to __tlb_remove_table()" fixes
that bug in a more desirable way.
- So when the series "move pagetable_*_dtor() to
__tlb_remove_table()" is merged into mm-unstable, I drop the patch
"mm: pgtable: make ptlock be freed by RCU".
Correct?
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 01/12] Revert "mm: pgtable: make ptlock be freed by RCU"
2024-12-16 6:35 ` Andrew Morton
@ 2024-12-16 6:39 ` Qi Zheng
0 siblings, 0 replies; 32+ messages in thread
From: Qi Zheng @ 2024-12-16 6:39 UTC (permalink / raw)
To: Andrew Morton
Cc: Yu Zhao, peterz, tglx, david, jannh, hughd, willy, muchun.song,
vbabka, lorenzo.stoakes, rientjes, linux-mm, linux-kernel
On 2024/12/16 14:35, Andrew Morton wrote:
> On Mon, 16 Dec 2024 14:15:35 +0800 Qi Zheng <zhengqi.arch@bytedance.com> wrote:
>
>> Hi Andrew,
>>
>> On 2024/12/16 14:10, Andrew Morton wrote:
>>> On Sun, 15 Dec 2024 14:29:38 +0800 Qi Zheng <zhengqi.arch@bytedance.com> wrote:
>>>
>>>>>
>>>>> Acked-by: Yu Zhao <yuzhao@google.com>
>>>>
>>>> Thanks! Once the review of this patch series is completed, we can simply
>>>> drop "mm: pgtable: make ptlock be freed by RCU" from mm tree.
>>>
>>> Can we drop it now and does the remainder of the series "synchronously
>>> scan and reclaim empty user PTE pages v4" remain valid and useful?
>>
>> The "mm: pgtable: make ptlock be freed by RCU" fixes the UAF issue [1]
>> reported by syzbot. If it is dropped now and this patch series is not
>> merged, the UAF issue will reappear.
>>
>> [1].
>> https://lore.kernel.org/lkml/67548279.050a0220.a30f1.015b.GAE@google.com/
>
> OK, so as I understand it,
>
> - the series "synchronously scan and reclaim empty user PTE pages v4"
> exposes a use-after-free bug, and fixes that bug with the patch "mm:
> pgtable: make ptlock be freed by RCU".
>
> - The series "move pagetable_*_dtor() to __tlb_remove_table()" fixes
> that bug in a more desirable way.
>
> - So when the series "move pagetable_*_dtor() to
> __tlb_remove_table()" is merged into mm-unstable, I drop the patch
> "mm: pgtable: make ptlock be freed by RCU".
>
> Correct?
Right!
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 11/12] mm: pgtable: introduce generic __tlb_remove_table()
2024-12-14 9:02 ` [PATCH 11/12] mm: pgtable: introduce generic __tlb_remove_table() Qi Zheng
@ 2024-12-16 12:00 ` Peter Zijlstra
2024-12-16 12:03 ` Peter Zijlstra
2024-12-16 12:52 ` Qi Zheng
0 siblings, 2 replies; 32+ messages in thread
From: Peter Zijlstra @ 2024-12-16 12:00 UTC (permalink / raw)
To: Qi Zheng
Cc: tglx, david, jannh, hughd, yuzhao, willy, muchun.song, vbabka,
lorenzo.stoakes, akpm, rientjes, linux-mm, linux-kernel
On Sat, Dec 14, 2024 at 05:02:57PM +0800, Qi Zheng wrote:
> diff --git a/arch/s390/mm/pgalloc.c b/arch/s390/mm/pgalloc.c
> index c73b89811a264..3e002dea6278f 100644
> --- a/arch/s390/mm/pgalloc.c
> +++ b/arch/s390/mm/pgalloc.c
> @@ -193,13 +193,6 @@ void page_table_free(struct mm_struct *mm, unsigned long *table)
> pagetable_dtor_free(ptdesc);
> }
>
> -void __tlb_remove_table(void *table)
> -{
> - struct ptdesc *ptdesc = virt_to_ptdesc(table);
> -
> - pagetable_dtor_free(ptdesc);
> -}
> -
> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> static void pte_free_now(struct rcu_head *head)
> {
> diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
> index 709830274b756..939a813023d7e 100644
> --- a/include/asm-generic/tlb.h
> +++ b/include/asm-generic/tlb.h
> #define MAX_TABLE_BATCH \
> ((PAGE_SIZE - sizeof(struct mmu_table_batch)) / sizeof(void *))
>
> +#ifndef __HAVE_ARCH_TLB_REMOVE_TABLE
> +static inline void __tlb_remove_table(void *_table)
> +{
> + struct ptdesc *ptdesc = (struct ptdesc *)_table;
> +
> + pagetable_dtor(ptdesc);
> + pagetable_free(ptdesc);
> +}
> +#endif
Spot the fail...
That said, all this ptdesc stuff is another giant trainwreck. Let me
clean that up for you.
---
diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
index ea4fbe7b17f6..ac3881ec342f 100644
--- a/arch/arm/include/asm/tlb.h
+++ b/arch/arm/include/asm/tlb.h
@@ -32,8 +32,6 @@
static inline void
__pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr)
{
- struct ptdesc *ptdesc = page_ptdesc(pte);
-
#ifndef CONFIG_ARM_LPAE
/*
* With the classic ARM MMU, a pte page has two corresponding pmd
@@ -43,16 +41,14 @@ __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr)
__tlb_adjust_range(tlb, addr - PAGE_SIZE, 2 * PAGE_SIZE);
#endif
- tlb_remove_ptdesc(tlb, ptdesc);
+ tlb_remove_table(tlb, pte);
}
static inline void
__pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp, unsigned long addr)
{
#ifdef CONFIG_ARM_LPAE
- struct ptdesc *ptdesc = virt_to_ptdesc(pmdp);
-
- tlb_remove_ptdesc(tlb, ptdesc);
+ tlb_remove_table(tlb, virt_to_page(pmdp));
#endif
}
diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
index 8d762607285c..4a60569fed69 100644
--- a/arch/arm64/include/asm/tlb.h
+++ b/arch/arm64/include/asm/tlb.h
@@ -75,18 +75,14 @@ static inline void tlb_flush(struct mmu_gather *tlb)
static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
unsigned long addr)
{
- struct ptdesc *ptdesc = page_ptdesc(pte);
-
- tlb_remove_ptdesc(tlb, ptdesc);
+ tlb_remove_table(tlb, pte);
}
#if CONFIG_PGTABLE_LEVELS > 2
static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp,
unsigned long addr)
{
- struct ptdesc *ptdesc = virt_to_ptdesc(pmdp);
-
- tlb_remove_ptdesc(tlb, ptdesc);
+ tlb_remove_table(tlb, virt_to_page(pmdp));
}
#endif
@@ -94,12 +90,10 @@ static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp,
static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pudp,
unsigned long addr)
{
- struct ptdesc *ptdesc = virt_to_ptdesc(pudp);
-
if (!pgtable_l4_enabled())
return;
- tlb_remove_ptdesc(tlb, ptdesc);
+ tlb_remove_table(tlb, virt_to_page(pudp));
}
#endif
@@ -107,12 +101,10 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pudp,
static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4dp,
unsigned long addr)
{
- struct ptdesc *ptdesc = virt_to_ptdesc(p4dp);
-
if (!pgtable_l5_enabled())
return;
- tlb_remove_ptdesc(tlb, ptdesc);
+ tlb_remove_table(tlb, virt_to_page(p4dp));
}
#endif
diff --git a/arch/csky/include/asm/pgalloc.h b/arch/csky/include/asm/pgalloc.h
index f1ce5b7b28f2..2c0897624699 100644
--- a/arch/csky/include/asm/pgalloc.h
+++ b/arch/csky/include/asm/pgalloc.h
@@ -64,7 +64,7 @@ static inline pgd_t *pgd_alloc(struct mm_struct *mm)
#define __pte_free_tlb(tlb, pte, address) \
do { \
pagetable_dtor(page_ptdesc(pte)); \
- tlb_remove_page_ptdesc(tlb, page_ptdesc(pte)); \
+ tlb_remove_page(tlb, pte); \
} while (0)
extern void pagetable_init(void);
diff --git a/arch/hexagon/include/asm/pgalloc.h b/arch/hexagon/include/asm/pgalloc.h
index 40e42a0e7167..8b1550498f1b 100644
--- a/arch/hexagon/include/asm/pgalloc.h
+++ b/arch/hexagon/include/asm/pgalloc.h
@@ -90,7 +90,7 @@ static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
#define __pte_free_tlb(tlb, pte, addr) \
do { \
pagetable_dtor((page_ptdesc(pte))); \
- tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte))); \
+ tlb_remove_page((tlb), (pte)); \
} while (0)
#endif
diff --git a/arch/loongarch/include/asm/pgalloc.h b/arch/loongarch/include/asm/pgalloc.h
index 7211dff8c969..5a4f22aeb618 100644
--- a/arch/loongarch/include/asm/pgalloc.h
+++ b/arch/loongarch/include/asm/pgalloc.h
@@ -58,7 +58,7 @@ static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
#define __pte_free_tlb(tlb, pte, address) \
do { \
pagetable_dtor(page_ptdesc(pte)); \
- tlb_remove_page_ptdesc((tlb), page_ptdesc(pte)); \
+ tlb_remove_page((tlb), (pte)); \
} while (0)
#ifndef __PAGETABLE_PMD_FOLDED
diff --git a/arch/m68k/include/asm/sun3_pgalloc.h b/arch/m68k/include/asm/sun3_pgalloc.h
index 2b626cb3ad0a..63d9f95f5e3d 100644
--- a/arch/m68k/include/asm/sun3_pgalloc.h
+++ b/arch/m68k/include/asm/sun3_pgalloc.h
@@ -20,7 +20,7 @@ extern const char bad_pmd_string[];
#define __pte_free_tlb(tlb, pte, addr) \
do { \
pagetable_dtor(page_ptdesc(pte)); \
- tlb_remove_page_ptdesc((tlb), page_ptdesc(pte)); \
+ tlb_remove_page((tlb), (pte)); \
} while (0)
static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd, pte_t *pte)
diff --git a/arch/mips/include/asm/pgalloc.h b/arch/mips/include/asm/pgalloc.h
index 36d9805033c4..bbee21345154 100644
--- a/arch/mips/include/asm/pgalloc.h
+++ b/arch/mips/include/asm/pgalloc.h
@@ -57,7 +57,7 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
#define __pte_free_tlb(tlb, pte, address) \
do { \
pagetable_dtor(page_ptdesc(pte)); \
- tlb_remove_page_ptdesc((tlb), page_ptdesc(pte)); \
+ tlb_remove_page((tlb), (pte)); \
} while (0)
#ifndef __PAGETABLE_PMD_FOLDED
diff --git a/arch/nios2/include/asm/pgalloc.h b/arch/nios2/include/asm/pgalloc.h
index 12a536b7bfbd..641cec8fb2a2 100644
--- a/arch/nios2/include/asm/pgalloc.h
+++ b/arch/nios2/include/asm/pgalloc.h
@@ -31,7 +31,7 @@ extern pgd_t *pgd_alloc(struct mm_struct *mm);
#define __pte_free_tlb(tlb, pte, addr) \
do { \
pagetable_dtor(page_ptdesc(pte)); \
- tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte))); \
+ tlb_remove_page((tlb), (pte)); \
} while (0)
#endif /* _ASM_NIOS2_PGALLOC_H */
diff --git a/arch/openrisc/include/asm/pgalloc.h b/arch/openrisc/include/asm/pgalloc.h
index 596e2355824e..e9b9bc53ece0 100644
--- a/arch/openrisc/include/asm/pgalloc.h
+++ b/arch/openrisc/include/asm/pgalloc.h
@@ -69,7 +69,7 @@ extern pte_t *pte_alloc_one_kernel(struct mm_struct *mm);
#define __pte_free_tlb(tlb, pte, addr) \
do { \
pagetable_dtor(page_ptdesc(pte)); \
- tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte))); \
+ tlb_remove_page((tlb), (pte)); \
} while (0)
#endif
diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h
index fc50d1401024..baedbd2546b9 100644
--- a/arch/riscv/include/asm/pgalloc.h
+++ b/arch/riscv/include/asm/pgalloc.h
@@ -26,13 +26,13 @@
* comment below 'ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE' in include/asm-generic/tlb.h
* for more details.
*/
-static inline void riscv_tlb_remove_ptdesc(struct mmu_gather *tlb, void *pt)
+static inline void riscv_tlb_remove_table(struct mmu_gather *tlb, void *pt)
{
if (riscv_use_sbi_for_rfence()) {
- tlb_remove_ptdesc(tlb, pt);
+ tlb_remove_table(tlb, pt);
} else {
pagetable_dtor(pt);
- tlb_remove_page_ptdesc(tlb, pt);
+ tlb_remove_page(tlb, pt);
}
}
@@ -120,7 +120,7 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
unsigned long addr)
{
if (pgtable_l4_enabled)
- riscv_tlb_remove_ptdesc(tlb, virt_to_ptdesc(pud));
+ riscv_tlb_remove_table(tlb, virt_to_page(pud));
}
#define p4d_alloc_one p4d_alloc_one
@@ -143,7 +143,7 @@ static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
unsigned long addr)
{
if (pgtable_l5_enabled)
- riscv_tlb_remove_ptdesc(tlb, virt_to_ptdesc(p4d));
+ riscv_tlb_remove_table(tlb, virt_to_page(p4d));
}
#endif /* __PAGETABLE_PMD_FOLDED */
@@ -172,7 +172,7 @@ static inline pgd_t *pgd_alloc(struct mm_struct *mm)
static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
unsigned long addr)
{
- riscv_tlb_remove_ptdesc(tlb, virt_to_ptdesc(pmd));
+ riscv_tlb_remove_table(tlb, virt_to_page(pmd));
}
#endif /* __PAGETABLE_PMD_FOLDED */
@@ -180,7 +180,7 @@ static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
unsigned long addr)
{
- riscv_tlb_remove_ptdesc(tlb, page_ptdesc(pte));
+ riscv_tlb_remove_table(tlb, pte);
}
#endif /* CONFIG_MMU */
diff --git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h
index 705278074034..fba11949dd2e 100644
--- a/arch/s390/include/asm/tlb.h
+++ b/arch/s390/include/asm/tlb.h
@@ -86,7 +86,7 @@ static inline void pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
tlb->cleared_pmds = 1;
if (mm_alloc_pgste(tlb->mm))
gmap_unlink(tlb->mm, (unsigned long *)pte, address);
- tlb_remove_ptdesc(tlb, pte);
+ tlb_remove_table(tlb, pte);
}
/*
@@ -105,7 +105,7 @@ static inline void pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
tlb->mm->context.flush_mm = 1;
tlb->freed_tables = 1;
tlb->cleared_puds = 1;
- tlb_remove_ptdesc(tlb, pmd);
+ tlb_remove_table(tlb, pmd);
}
/*
@@ -123,7 +123,7 @@ static inline void pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
tlb->mm->context.flush_mm = 1;
tlb->freed_tables = 1;
tlb->cleared_p4ds = 1;
- tlb_remove_ptdesc(tlb, pud);
+ tlb_remove_table(tlb, pud);
}
/*
@@ -141,7 +141,7 @@ static inline void p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
__tlb_adjust_range(tlb, address, PAGE_SIZE);
tlb->mm->context.flush_mm = 1;
tlb->freed_tables = 1;
- tlb_remove_ptdesc(tlb, p4d);
+ tlb_remove_table(tlb, p4d);
}
#endif /* _S390_TLB_H */
diff --git a/arch/sh/include/asm/pgalloc.h b/arch/sh/include/asm/pgalloc.h
index 96d938fdf224..43812b2363ef 100644
--- a/arch/sh/include/asm/pgalloc.h
+++ b/arch/sh/include/asm/pgalloc.h
@@ -35,7 +35,7 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
#define __pte_free_tlb(tlb, pte, addr) \
do { \
pagetable_dtor(page_ptdesc(pte)); \
- tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte))); \
+ tlb_remove_page((tlb), (pte)); \
} while (0)
#endif /* __ASM_SH_PGALLOC_H */
diff --git a/arch/um/include/asm/pgalloc.h b/arch/um/include/asm/pgalloc.h
index f0af23c3aeb2..aa6063dc5b1e 100644
--- a/arch/um/include/asm/pgalloc.h
+++ b/arch/um/include/asm/pgalloc.h
@@ -28,7 +28,7 @@ extern pgd_t *pgd_alloc(struct mm_struct *);
#define __pte_free_tlb(tlb, pte, address) \
do { \
pagetable_dtor(page_ptdesc(pte)); \
- tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte))); \
+ tlb_remove_page((tlb), (pte)); \
} while (0)
#if CONFIG_PGTABLE_LEVELS > 2
@@ -36,15 +36,15 @@ do { \
#define __pmd_free_tlb(tlb, pmd, address) \
do { \
pagetable_dtor(virt_to_ptdesc(pmd)); \
- tlb_remove_page_ptdesc((tlb), virt_to_ptdesc(pmd)); \
+ tlb_remove_page((tlb), virt_page(pmd)); \
} while (0)
#if CONFIG_PGTABLE_LEVELS > 3
#define __pud_free_tlb(tlb, pud, address) \
do { \
- pagetable_dtor(virt_to_ptdesc(pud)); \
- tlb_remove_page_ptdesc((tlb), virt_to_ptdesc(pud)); \
+ pagetable_dtor(virt_to_ptdesc(pud)); \
+ tlb_remove_page((tlb), virt_to_page(pud)); \
} while (0)
#endif
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 939a813023d7..7991950e98f6 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -209,9 +209,9 @@ struct mmu_table_batch {
((PAGE_SIZE - sizeof(struct mmu_table_batch)) / sizeof(void *))
#ifndef __HAVE_ARCH_TLB_REMOVE_TABLE
-static inline void __tlb_remove_table(void *_table)
+static inline void __tlb_remove_table(void *table)
{
- struct ptdesc *ptdesc = (struct ptdesc *)_table;
+ struct ptdesc *ptdesc = page_to_ptdesc(table);
pagetable_dtor(ptdesc);
pagetable_free(ptdesc);
@@ -499,17 +499,6 @@ static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
return tlb_remove_page_size(tlb, page, PAGE_SIZE);
}
-static inline void tlb_remove_ptdesc(struct mmu_gather *tlb, void *pt)
-{
- tlb_remove_table(tlb, pt);
-}
-
-/* Like tlb_remove_ptdesc, but for page-like page directories. */
-static inline void tlb_remove_page_ptdesc(struct mmu_gather *tlb, struct ptdesc *pt)
-{
- tlb_remove_page(tlb, ptdesc_page(pt));
-}
-
static inline void tlb_change_page_size(struct mmu_gather *tlb,
unsigned int page_size)
{
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 11/12] mm: pgtable: introduce generic __tlb_remove_table()
2024-12-16 12:00 ` Peter Zijlstra
@ 2024-12-16 12:03 ` Peter Zijlstra
2024-12-16 12:05 ` Peter Zijlstra
2024-12-16 12:52 ` Qi Zheng
1 sibling, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2024-12-16 12:03 UTC (permalink / raw)
To: Qi Zheng
Cc: tglx, david, jannh, hughd, yuzhao, willy, muchun.song, vbabka,
lorenzo.stoakes, akpm, rientjes, linux-mm, linux-kernel
On Mon, Dec 16, 2024 at 01:00:43PM +0100, Peter Zijlstra wrote:
> On Sat, Dec 14, 2024 at 05:02:57PM +0800, Qi Zheng wrote:
>
> > diff --git a/arch/s390/mm/pgalloc.c b/arch/s390/mm/pgalloc.c
> > index c73b89811a264..3e002dea6278f 100644
> > --- a/arch/s390/mm/pgalloc.c
> > +++ b/arch/s390/mm/pgalloc.c
> > @@ -193,13 +193,6 @@ void page_table_free(struct mm_struct *mm, unsigned long *table)
> > pagetable_dtor_free(ptdesc);
> > }
> >
> > -void __tlb_remove_table(void *table)
> > -{
> > - struct ptdesc *ptdesc = virt_to_ptdesc(table);
> > -
> > - pagetable_dtor_free(ptdesc);
> > -}
> > -
> > #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> > static void pte_free_now(struct rcu_head *head)
> > {
>
> > diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
> > index 709830274b756..939a813023d7e 100644
> > --- a/include/asm-generic/tlb.h
> > +++ b/include/asm-generic/tlb.h
>
> > #define MAX_TABLE_BATCH \
> > ((PAGE_SIZE - sizeof(struct mmu_table_batch)) / sizeof(void *))
> >
> > +#ifndef __HAVE_ARCH_TLB_REMOVE_TABLE
> > +static inline void __tlb_remove_table(void *_table)
> > +{
> > + struct ptdesc *ptdesc = (struct ptdesc *)_table;
> > +
> > + pagetable_dtor(ptdesc);
> > + pagetable_free(ptdesc);
> > +}
> > +#endif
>
>
> Spot the fail...
>
> That said, all this ptdesc stuff is another giant trainwreck. Let me
> clean that up for you.
>
> ---
> -static inline void __tlb_remove_table(void *_table)
> +static inline void __tlb_remove_table(void *table)
> {
> - struct ptdesc *ptdesc = (struct ptdesc *)_table;
> + struct ptdesc *ptdesc = page_to_ptdesc(table);
>
> pagetable_dtor(ptdesc);
> pagetable_free(ptdesc);
And now observe that __tlb_remove_table() is identical to
asm-generic/pgalloc.h pte_free(), pmd_free(), __pud_free() and
__p4d_free().
And I'm sure we don't need 5 copies of this :-), wdyt?
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 11/12] mm: pgtable: introduce generic __tlb_remove_table()
2024-12-16 12:03 ` Peter Zijlstra
@ 2024-12-16 12:05 ` Peter Zijlstra
2024-12-16 13:53 ` Qi Zheng
0 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2024-12-16 12:05 UTC (permalink / raw)
To: Qi Zheng
Cc: tglx, david, jannh, hughd, yuzhao, willy, muchun.song, vbabka,
lorenzo.stoakes, akpm, rientjes, linux-mm, linux-kernel
On Mon, Dec 16, 2024 at 01:03:13PM +0100, Peter Zijlstra wrote:
> On Mon, Dec 16, 2024 at 01:00:43PM +0100, Peter Zijlstra wrote:
> > On Sat, Dec 14, 2024 at 05:02:57PM +0800, Qi Zheng wrote:
> >
> > > diff --git a/arch/s390/mm/pgalloc.c b/arch/s390/mm/pgalloc.c
> > > index c73b89811a264..3e002dea6278f 100644
> > > --- a/arch/s390/mm/pgalloc.c
> > > +++ b/arch/s390/mm/pgalloc.c
> > > @@ -193,13 +193,6 @@ void page_table_free(struct mm_struct *mm, unsigned long *table)
> > > pagetable_dtor_free(ptdesc);
> > > }
> > >
> > > -void __tlb_remove_table(void *table)
> > > -{
> > > - struct ptdesc *ptdesc = virt_to_ptdesc(table);
> > > -
> > > - pagetable_dtor_free(ptdesc);
> > > -}
> > > -
> > > #ifdef CONFIG_TRANSPARENT_HUGEPAGE
> > > static void pte_free_now(struct rcu_head *head)
> > > {
> >
> > > diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
> > > index 709830274b756..939a813023d7e 100644
> > > --- a/include/asm-generic/tlb.h
> > > +++ b/include/asm-generic/tlb.h
> >
> > > #define MAX_TABLE_BATCH \
> > > ((PAGE_SIZE - sizeof(struct mmu_table_batch)) / sizeof(void *))
> > >
> > > +#ifndef __HAVE_ARCH_TLB_REMOVE_TABLE
> > > +static inline void __tlb_remove_table(void *_table)
> > > +{
> > > + struct ptdesc *ptdesc = (struct ptdesc *)_table;
> > > +
> > > + pagetable_dtor(ptdesc);
> > > + pagetable_free(ptdesc);
> > > +}
> > > +#endif
> >
> >
> > Spot the fail...
> >
> > That said, all this ptdesc stuff is another giant trainwreck. Let me
> > clean that up for you.
> >
> > ---
>
> > -static inline void __tlb_remove_table(void *_table)
> > +static inline void __tlb_remove_table(void *table)
> > {
> > - struct ptdesc *ptdesc = (struct ptdesc *)_table;
> > + struct ptdesc *ptdesc = page_to_ptdesc(table);
> >
> > pagetable_dtor(ptdesc);
> > pagetable_free(ptdesc);
>
> And now observe that __tlb_remove_table() is identical to
> asm-generic/pgalloc.h pte_free(), pmd_free(), __pud_free() and
> __p4d_free().
>
> And I'm sure we don't need 5 copies of this :-), wdyt?
Argh, nearly, it has the whole page vs virt nonsense still going on.
Bah.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 11/12] mm: pgtable: introduce generic __tlb_remove_table()
2024-12-16 12:00 ` Peter Zijlstra
2024-12-16 12:03 ` Peter Zijlstra
@ 2024-12-16 12:52 ` Qi Zheng
2024-12-16 18:12 ` Peter Zijlstra
1 sibling, 1 reply; 32+ messages in thread
From: Qi Zheng @ 2024-12-16 12:52 UTC (permalink / raw)
To: Peter Zijlstra
Cc: tglx, david, jannh, hughd, yuzhao, willy, muchun.song, vbabka,
lorenzo.stoakes, akpm, rientjes, linux-mm, linux-kernel
On 2024/12/16 20:00, Peter Zijlstra wrote:
> On Sat, Dec 14, 2024 at 05:02:57PM +0800, Qi Zheng wrote:
[...]
>>
>> +#ifndef __HAVE_ARCH_TLB_REMOVE_TABLE
>> +static inline void __tlb_remove_table(void *_table)
>> +{
>> + struct ptdesc *ptdesc = (struct ptdesc *)_table;
>> +
>> + pagetable_dtor(ptdesc);
>> + pagetable_free(ptdesc);
>> +}
>> +#endif
>
>
> Spot the fail...
>
> That said, all this ptdesc stuff is another giant trainwreck. Let me
> clean that up for you.
It looks like you want to revert what was done in this patch series:
https://lore.kernel.org/all/20230807230513.102486-1-vishal.moola@gmail.com/
But why? It seems that splitting ptdesc from struct page is a good
thing?
>
> ---
> diff --git a/arch/arm/include/asm/tlb.h b/arch/arm/include/asm/tlb.h
> index ea4fbe7b17f6..ac3881ec342f 100644
> --- a/arch/arm/include/asm/tlb.h
> +++ b/arch/arm/include/asm/tlb.h
> @@ -32,8 +32,6 @@
> static inline void
> __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr)
> {
> - struct ptdesc *ptdesc = page_ptdesc(pte);
> -
> #ifndef CONFIG_ARM_LPAE
> /*
> * With the classic ARM MMU, a pte page has two corresponding pmd
> @@ -43,16 +41,14 @@ __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte, unsigned long addr)
> __tlb_adjust_range(tlb, addr - PAGE_SIZE, 2 * PAGE_SIZE);
> #endif
>
> - tlb_remove_ptdesc(tlb, ptdesc);
> + tlb_remove_table(tlb, pte);
> }
>
> static inline void
> __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp, unsigned long addr)
> {
> #ifdef CONFIG_ARM_LPAE
> - struct ptdesc *ptdesc = virt_to_ptdesc(pmdp);
> -
> - tlb_remove_ptdesc(tlb, ptdesc);
> + tlb_remove_table(tlb, virt_to_page(pmdp));
> #endif
> }
>
> diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
> index 8d762607285c..4a60569fed69 100644
> --- a/arch/arm64/include/asm/tlb.h
> +++ b/arch/arm64/include/asm/tlb.h
> @@ -75,18 +75,14 @@ static inline void tlb_flush(struct mmu_gather *tlb)
> static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
> unsigned long addr)
> {
> - struct ptdesc *ptdesc = page_ptdesc(pte);
> -
> - tlb_remove_ptdesc(tlb, ptdesc);
> + tlb_remove_table(tlb, pte);
> }
>
> #if CONFIG_PGTABLE_LEVELS > 2
> static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp,
> unsigned long addr)
> {
> - struct ptdesc *ptdesc = virt_to_ptdesc(pmdp);
> -
> - tlb_remove_ptdesc(tlb, ptdesc);
> + tlb_remove_table(tlb, virt_to_page(pmdp));
> }
> #endif
>
> @@ -94,12 +90,10 @@ static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp,
> static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pudp,
> unsigned long addr)
> {
> - struct ptdesc *ptdesc = virt_to_ptdesc(pudp);
> -
> if (!pgtable_l4_enabled())
> return;
>
> - tlb_remove_ptdesc(tlb, ptdesc);
> + tlb_remove_table(tlb, virt_to_page(pudp));
> }
> #endif
>
> @@ -107,12 +101,10 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pudp,
> static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4dp,
> unsigned long addr)
> {
> - struct ptdesc *ptdesc = virt_to_ptdesc(p4dp);
> -
> if (!pgtable_l5_enabled())
> return;
>
> - tlb_remove_ptdesc(tlb, ptdesc);
> + tlb_remove_table(tlb, virt_to_page(p4dp));
> }
> #endif
>
> diff --git a/arch/csky/include/asm/pgalloc.h b/arch/csky/include/asm/pgalloc.h
> index f1ce5b7b28f2..2c0897624699 100644
> --- a/arch/csky/include/asm/pgalloc.h
> +++ b/arch/csky/include/asm/pgalloc.h
> @@ -64,7 +64,7 @@ static inline pgd_t *pgd_alloc(struct mm_struct *mm)
> #define __pte_free_tlb(tlb, pte, address) \
> do { \
> pagetable_dtor(page_ptdesc(pte)); \
> - tlb_remove_page_ptdesc(tlb, page_ptdesc(pte)); \
> + tlb_remove_page(tlb, pte); \
> } while (0)
>
> extern void pagetable_init(void);
> diff --git a/arch/hexagon/include/asm/pgalloc.h b/arch/hexagon/include/asm/pgalloc.h
> index 40e42a0e7167..8b1550498f1b 100644
> --- a/arch/hexagon/include/asm/pgalloc.h
> +++ b/arch/hexagon/include/asm/pgalloc.h
> @@ -90,7 +90,7 @@ static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd,
> #define __pte_free_tlb(tlb, pte, addr) \
> do { \
> pagetable_dtor((page_ptdesc(pte))); \
> - tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte))); \
> + tlb_remove_page((tlb), (pte)); \
> } while (0)
>
> #endif
> diff --git a/arch/loongarch/include/asm/pgalloc.h b/arch/loongarch/include/asm/pgalloc.h
> index 7211dff8c969..5a4f22aeb618 100644
> --- a/arch/loongarch/include/asm/pgalloc.h
> +++ b/arch/loongarch/include/asm/pgalloc.h
> @@ -58,7 +58,7 @@ static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
> #define __pte_free_tlb(tlb, pte, address) \
> do { \
> pagetable_dtor(page_ptdesc(pte)); \
> - tlb_remove_page_ptdesc((tlb), page_ptdesc(pte)); \
> + tlb_remove_page((tlb), (pte)); \
> } while (0)
>
> #ifndef __PAGETABLE_PMD_FOLDED
> diff --git a/arch/m68k/include/asm/sun3_pgalloc.h b/arch/m68k/include/asm/sun3_pgalloc.h
> index 2b626cb3ad0a..63d9f95f5e3d 100644
> --- a/arch/m68k/include/asm/sun3_pgalloc.h
> +++ b/arch/m68k/include/asm/sun3_pgalloc.h
> @@ -20,7 +20,7 @@ extern const char bad_pmd_string[];
> #define __pte_free_tlb(tlb, pte, addr) \
> do { \
> pagetable_dtor(page_ptdesc(pte)); \
> - tlb_remove_page_ptdesc((tlb), page_ptdesc(pte)); \
> + tlb_remove_page((tlb), (pte)); \
> } while (0)
>
> static inline void pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmd, pte_t *pte)
> diff --git a/arch/mips/include/asm/pgalloc.h b/arch/mips/include/asm/pgalloc.h
> index 36d9805033c4..bbee21345154 100644
> --- a/arch/mips/include/asm/pgalloc.h
> +++ b/arch/mips/include/asm/pgalloc.h
> @@ -57,7 +57,7 @@ static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
> #define __pte_free_tlb(tlb, pte, address) \
> do { \
> pagetable_dtor(page_ptdesc(pte)); \
> - tlb_remove_page_ptdesc((tlb), page_ptdesc(pte)); \
> + tlb_remove_page((tlb), (pte)); \
> } while (0)
>
> #ifndef __PAGETABLE_PMD_FOLDED
> diff --git a/arch/nios2/include/asm/pgalloc.h b/arch/nios2/include/asm/pgalloc.h
> index 12a536b7bfbd..641cec8fb2a2 100644
> --- a/arch/nios2/include/asm/pgalloc.h
> +++ b/arch/nios2/include/asm/pgalloc.h
> @@ -31,7 +31,7 @@ extern pgd_t *pgd_alloc(struct mm_struct *mm);
> #define __pte_free_tlb(tlb, pte, addr) \
> do { \
> pagetable_dtor(page_ptdesc(pte)); \
> - tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte))); \
> + tlb_remove_page((tlb), (pte)); \
> } while (0)
>
> #endif /* _ASM_NIOS2_PGALLOC_H */
> diff --git a/arch/openrisc/include/asm/pgalloc.h b/arch/openrisc/include/asm/pgalloc.h
> index 596e2355824e..e9b9bc53ece0 100644
> --- a/arch/openrisc/include/asm/pgalloc.h
> +++ b/arch/openrisc/include/asm/pgalloc.h
> @@ -69,7 +69,7 @@ extern pte_t *pte_alloc_one_kernel(struct mm_struct *mm);
> #define __pte_free_tlb(tlb, pte, addr) \
> do { \
> pagetable_dtor(page_ptdesc(pte)); \
> - tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte))); \
> + tlb_remove_page((tlb), (pte)); \
> } while (0)
>
> #endif
> diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h
> index fc50d1401024..baedbd2546b9 100644
> --- a/arch/riscv/include/asm/pgalloc.h
> +++ b/arch/riscv/include/asm/pgalloc.h
> @@ -26,13 +26,13 @@
> * comment below 'ifdef CONFIG_MMU_GATHER_RCU_TABLE_FREE' in include/asm-generic/tlb.h
> * for more details.
> */
> -static inline void riscv_tlb_remove_ptdesc(struct mmu_gather *tlb, void *pt)
> +static inline void riscv_tlb_remove_table(struct mmu_gather *tlb, void *pt)
> {
> if (riscv_use_sbi_for_rfence()) {
> - tlb_remove_ptdesc(tlb, pt);
> + tlb_remove_table(tlb, pt);
> } else {
> pagetable_dtor(pt);
> - tlb_remove_page_ptdesc(tlb, pt);
> + tlb_remove_page(tlb, pt);
> }
> }
>
> @@ -120,7 +120,7 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
> unsigned long addr)
> {
> if (pgtable_l4_enabled)
> - riscv_tlb_remove_ptdesc(tlb, virt_to_ptdesc(pud));
> + riscv_tlb_remove_table(tlb, virt_to_page(pud));
> }
>
> #define p4d_alloc_one p4d_alloc_one
> @@ -143,7 +143,7 @@ static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
> unsigned long addr)
> {
> if (pgtable_l5_enabled)
> - riscv_tlb_remove_ptdesc(tlb, virt_to_ptdesc(p4d));
> + riscv_tlb_remove_table(tlb, virt_to_page(p4d));
> }
> #endif /* __PAGETABLE_PMD_FOLDED */
>
> @@ -172,7 +172,7 @@ static inline pgd_t *pgd_alloc(struct mm_struct *mm)
> static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
> unsigned long addr)
> {
> - riscv_tlb_remove_ptdesc(tlb, virt_to_ptdesc(pmd));
> + riscv_tlb_remove_table(tlb, virt_to_page(pmd));
> }
>
> #endif /* __PAGETABLE_PMD_FOLDED */
> @@ -180,7 +180,7 @@ static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
> static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
> unsigned long addr)
> {
> - riscv_tlb_remove_ptdesc(tlb, page_ptdesc(pte));
> + riscv_tlb_remove_table(tlb, pte);
> }
> #endif /* CONFIG_MMU */
>
> diff --git a/arch/s390/include/asm/tlb.h b/arch/s390/include/asm/tlb.h
> index 705278074034..fba11949dd2e 100644
> --- a/arch/s390/include/asm/tlb.h
> +++ b/arch/s390/include/asm/tlb.h
> @@ -86,7 +86,7 @@ static inline void pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
> tlb->cleared_pmds = 1;
> if (mm_alloc_pgste(tlb->mm))
> gmap_unlink(tlb->mm, (unsigned long *)pte, address);
> - tlb_remove_ptdesc(tlb, pte);
> + tlb_remove_table(tlb, pte);
> }
>
> /*
> @@ -105,7 +105,7 @@ static inline void pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmd,
> tlb->mm->context.flush_mm = 1;
> tlb->freed_tables = 1;
> tlb->cleared_puds = 1;
> - tlb_remove_ptdesc(tlb, pmd);
> + tlb_remove_table(tlb, pmd);
> }
>
> /*
> @@ -123,7 +123,7 @@ static inline void pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
> tlb->mm->context.flush_mm = 1;
> tlb->freed_tables = 1;
> tlb->cleared_p4ds = 1;
> - tlb_remove_ptdesc(tlb, pud);
> + tlb_remove_table(tlb, pud);
> }
>
> /*
> @@ -141,7 +141,7 @@ static inline void p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
> __tlb_adjust_range(tlb, address, PAGE_SIZE);
> tlb->mm->context.flush_mm = 1;
> tlb->freed_tables = 1;
> - tlb_remove_ptdesc(tlb, p4d);
> + tlb_remove_table(tlb, p4d);
> }
>
> #endif /* _S390_TLB_H */
> diff --git a/arch/sh/include/asm/pgalloc.h b/arch/sh/include/asm/pgalloc.h
> index 96d938fdf224..43812b2363ef 100644
> --- a/arch/sh/include/asm/pgalloc.h
> +++ b/arch/sh/include/asm/pgalloc.h
> @@ -35,7 +35,7 @@ static inline void pmd_populate(struct mm_struct *mm, pmd_t *pmd,
> #define __pte_free_tlb(tlb, pte, addr) \
> do { \
> pagetable_dtor(page_ptdesc(pte)); \
> - tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte))); \
> + tlb_remove_page((tlb), (pte)); \
> } while (0)
>
> #endif /* __ASM_SH_PGALLOC_H */
> diff --git a/arch/um/include/asm/pgalloc.h b/arch/um/include/asm/pgalloc.h
> index f0af23c3aeb2..aa6063dc5b1e 100644
> --- a/arch/um/include/asm/pgalloc.h
> +++ b/arch/um/include/asm/pgalloc.h
> @@ -28,7 +28,7 @@ extern pgd_t *pgd_alloc(struct mm_struct *);
> #define __pte_free_tlb(tlb, pte, address) \
> do { \
> pagetable_dtor(page_ptdesc(pte)); \
> - tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte))); \
> + tlb_remove_page((tlb), (pte)); \
> } while (0)
>
> #if CONFIG_PGTABLE_LEVELS > 2
> @@ -36,15 +36,15 @@ do { \
> #define __pmd_free_tlb(tlb, pmd, address) \
> do { \
> pagetable_dtor(virt_to_ptdesc(pmd)); \
> - tlb_remove_page_ptdesc((tlb), virt_to_ptdesc(pmd)); \
> + tlb_remove_page((tlb), virt_page(pmd)); \
> } while (0)
>
> #if CONFIG_PGTABLE_LEVELS > 3
>
> #define __pud_free_tlb(tlb, pud, address) \
> do { \
> - pagetable_dtor(virt_to_ptdesc(pud)); \
> - tlb_remove_page_ptdesc((tlb), virt_to_ptdesc(pud)); \
> + pagetable_dtor(virt_to_ptdesc(pud)); \
> + tlb_remove_page((tlb), virt_to_page(pud)); \
> } while (0)
>
> #endif
> diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
> index 939a813023d7..7991950e98f6 100644
> --- a/include/asm-generic/tlb.h
> +++ b/include/asm-generic/tlb.h
> @@ -209,9 +209,9 @@ struct mmu_table_batch {
> ((PAGE_SIZE - sizeof(struct mmu_table_batch)) / sizeof(void *))
>
> #ifndef __HAVE_ARCH_TLB_REMOVE_TABLE
> -static inline void __tlb_remove_table(void *_table)
> +static inline void __tlb_remove_table(void *table)
> {
> - struct ptdesc *ptdesc = (struct ptdesc *)_table;
> + struct ptdesc *ptdesc = page_to_ptdesc(table);
>
> pagetable_dtor(ptdesc);
> pagetable_free(ptdesc);
> @@ -499,17 +499,6 @@ static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
> return tlb_remove_page_size(tlb, page, PAGE_SIZE);
> }
>
> -static inline void tlb_remove_ptdesc(struct mmu_gather *tlb, void *pt)
> -{
> - tlb_remove_table(tlb, pt);
> -}
> -
> -/* Like tlb_remove_ptdesc, but for page-like page directories. */
> -static inline void tlb_remove_page_ptdesc(struct mmu_gather *tlb, struct ptdesc *pt)
> -{
> - tlb_remove_page(tlb, ptdesc_page(pt));
> -}
> -
> static inline void tlb_change_page_size(struct mmu_gather *tlb,
> unsigned int page_size)
> {
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 11/12] mm: pgtable: introduce generic __tlb_remove_table()
2024-12-16 12:05 ` Peter Zijlstra
@ 2024-12-16 13:53 ` Qi Zheng
0 siblings, 0 replies; 32+ messages in thread
From: Qi Zheng @ 2024-12-16 13:53 UTC (permalink / raw)
To: Peter Zijlstra
Cc: tglx, david, jannh, hughd, yuzhao, willy, muchun.song, vbabka,
lorenzo.stoakes, akpm, rientjes, linux-mm, linux-kernel
On 2024/12/16 20:05, Peter Zijlstra wrote:
> On Mon, Dec 16, 2024 at 01:03:13PM +0100, Peter Zijlstra wrote:
>> On Mon, Dec 16, 2024 at 01:00:43PM +0100, Peter Zijlstra wrote:
>>> On Sat, Dec 14, 2024 at 05:02:57PM +0800, Qi Zheng wrote:
>>>
>>>> diff --git a/arch/s390/mm/pgalloc.c b/arch/s390/mm/pgalloc.c
>>>> index c73b89811a264..3e002dea6278f 100644
>>>> --- a/arch/s390/mm/pgalloc.c
>>>> +++ b/arch/s390/mm/pgalloc.c
>>>> @@ -193,13 +193,6 @@ void page_table_free(struct mm_struct *mm, unsigned long *table)
>>>> pagetable_dtor_free(ptdesc);
>>>> }
>>>>
>>>> -void __tlb_remove_table(void *table)
>>>> -{
>>>> - struct ptdesc *ptdesc = virt_to_ptdesc(table);
>>>> -
>>>> - pagetable_dtor_free(ptdesc);
>>>> -}
>>>> -
>>>> #ifdef CONFIG_TRANSPARENT_HUGEPAGE
>>>> static void pte_free_now(struct rcu_head *head)
>>>> {
>>>
>>>> diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
>>>> index 709830274b756..939a813023d7e 100644
>>>> --- a/include/asm-generic/tlb.h
>>>> +++ b/include/asm-generic/tlb.h
>>>
>>>> #define MAX_TABLE_BATCH \
>>>> ((PAGE_SIZE - sizeof(struct mmu_table_batch)) / sizeof(void *))
>>>>
>>>> +#ifndef __HAVE_ARCH_TLB_REMOVE_TABLE
>>>> +static inline void __tlb_remove_table(void *_table)
>>>> +{
>>>> + struct ptdesc *ptdesc = (struct ptdesc *)_table;
>>>> +
>>>> + pagetable_dtor(ptdesc);
>>>> + pagetable_free(ptdesc);
>>>> +}
>>>> +#endif
>>>
>>>
>>> Spot the fail...
>>>
>>> That said, all this ptdesc stuff is another giant trainwreck. Let me
>>> clean that up for you.
>>>
>>> ---
>>
>>> -static inline void __tlb_remove_table(void *_table)
>>> +static inline void __tlb_remove_table(void *table)
>>> {
>>> - struct ptdesc *ptdesc = (struct ptdesc *)_table;
>>> + struct ptdesc *ptdesc = page_to_ptdesc(table);
>>>
>>> pagetable_dtor(ptdesc);
>>> pagetable_free(ptdesc);
>>
>> And now observe that __tlb_remove_table() is identical to
>> asm-generic/pgalloc.h pte_free(), pmd_free(), __pud_free() and
>> __p4d_free().
>>
>> And I'm sure we don't need 5 copies of this :-), wdyt?
>
> Argh, nearly, it has the whole page vs virt nonsense still going on.
> Bah.
Yes, maybe it can be unified into struct page parameter, which seems
to be more convenient for conversion:
static inline void pagtable_dtor_free(struct mm_struct *mm, void *table)
{
struct ptdesc *ptdesc = page_to_ptdesc(table);
pagetable_dtor(ptdesc);
pagetable_free(ptdesc);
}
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 11/12] mm: pgtable: introduce generic __tlb_remove_table()
2024-12-16 12:52 ` Qi Zheng
@ 2024-12-16 18:12 ` Peter Zijlstra
2024-12-17 3:42 ` Qi Zheng
0 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2024-12-16 18:12 UTC (permalink / raw)
To: Qi Zheng
Cc: tglx, david, jannh, hughd, yuzhao, willy, muchun.song, vbabka,
lorenzo.stoakes, akpm, rientjes, linux-mm, linux-kernel
On Mon, Dec 16, 2024 at 08:52:06PM +0800, Qi Zheng wrote:
>
>
> On 2024/12/16 20:00, Peter Zijlstra wrote:
> > On Sat, Dec 14, 2024 at 05:02:57PM +0800, Qi Zheng wrote:
>
> [...]
>
> > > +#ifndef __HAVE_ARCH_TLB_REMOVE_TABLE
> > > +static inline void __tlb_remove_table(void *_table)
> > > +{
> > > + struct ptdesc *ptdesc = (struct ptdesc *)_table;
> > > +
> > > + pagetable_dtor(ptdesc);
> > > + pagetable_free(ptdesc);
> > > +}
> > > +#endif
> >
> >
> > Spot the fail...
> >
> > That said, all this ptdesc stuff is another giant trainwreck. Let me
> > clean that up for you.
>
> It looks like you want to revert what was done in this patch series:
>
> https://lore.kernel.org/all/20230807230513.102486-1-vishal.moola@gmail.com/
>
> But why? It seems that splitting ptdesc from struct page is a good
> thing?
Because we're explicitly allocating pages for the page-tables, and also,
code like:
tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte)));
static inline void tlb_remove_page_ptdesc(struct mmu_gather *tlb, struct ptdesc *pt)
{
tlb_remove_page(tlb, ptdesc_page(pt));
}
Just makes me upset.
Just bloody write tlb_remove_page() and call it a day.
All that nonsense is just obfuscation at this point.
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 11/12] mm: pgtable: introduce generic __tlb_remove_table()
2024-12-16 18:12 ` Peter Zijlstra
@ 2024-12-17 3:42 ` Qi Zheng
2024-12-17 9:02 ` Peter Zijlstra
0 siblings, 1 reply; 32+ messages in thread
From: Qi Zheng @ 2024-12-17 3:42 UTC (permalink / raw)
To: Peter Zijlstra, Vishal Moola
Cc: tglx, david, jannh, hughd, yuzhao, willy, muchun.song, vbabka,
lorenzo.stoakes, akpm, rientjes, linux-mm, linux-kernel
On 2024/12/17 02:12, Peter Zijlstra wrote:
> On Mon, Dec 16, 2024 at 08:52:06PM +0800, Qi Zheng wrote:
>>
>>
>> On 2024/12/16 20:00, Peter Zijlstra wrote:
>>> On Sat, Dec 14, 2024 at 05:02:57PM +0800, Qi Zheng wrote:
>>
>> [...]
>>
>>>> +#ifndef __HAVE_ARCH_TLB_REMOVE_TABLE
>>>> +static inline void __tlb_remove_table(void *_table)
>>>> +{
>>>> + struct ptdesc *ptdesc = (struct ptdesc *)_table;
>>>> +
>>>> + pagetable_dtor(ptdesc);
>>>> + pagetable_free(ptdesc);
>>>> +}
>>>> +#endif
>>>
>>>
>>> Spot the fail...
>>>
>>> That said, all this ptdesc stuff is another giant trainwreck. Let me
>>> clean that up for you.
>>
>> It looks like you want to revert what was done in this patch series:
>>
>> https://lore.kernel.org/all/20230807230513.102486-1-vishal.moola@gmail.com/
>>
>> But why? It seems that splitting ptdesc from struct page is a good
>> thing?
>
> Because we're explicitly allocating pages for the page-tables, and also,
> code like:
>
> tlb_remove_page_ptdesc((tlb), (page_ptdesc(pte)));
>
> static inline void tlb_remove_page_ptdesc(struct mmu_gather *tlb, struct ptdesc *pt)
> {
> tlb_remove_page(tlb, ptdesc_page(pt));
> }
>
> Just makes me upset.
Aha, this strikes me as odd too.
Also +CC Vishal Moola, the author of ptdesc, who may be able to provide
more background information. If Vishal has no objection, I will try to
remove tlb_remove_ptdesc() and tlb_remove_page_ptdesc() as Peter suggested.
>
> Just bloody write tlb_remove_page() and call it a day.
>
> All that nonsense is just obfuscation at this point.
In addition, will remove the duplicates of __tlb_remove_table,
asm-generic/pgalloc.h pte_free(), pmd_free(), __pud_free() and
__p4d_free() as follows:
diff --git a/include/asm-generic/pgalloc.h b/include/asm-generic/pgalloc.h
index 3673e9c29504e..370f5b579ff88 100644
--- a/include/asm-generic/pgalloc.h
+++ b/include/asm-generic/pgalloc.h
@@ -107,10 +107,7 @@ static inline pgtable_t pte_alloc_one_noprof(struct
mm_struct *mm)
*/
static inline void pte_free(struct mm_struct *mm, struct page *pte_page)
{
- struct ptdesc *ptdesc = page_ptdesc(pte_page);
-
- pagetable_dtor(ptdesc);
- pagetable_free(ptdesc);
+ pagetable_dtor_free(pte_page);
}
@@ -150,11 +147,7 @@ static inline pmd_t *pmd_alloc_one_noprof(struct
mm_struct *mm, unsigned long ad
#ifndef __HAVE_ARCH_PMD_FREE
static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
{
- struct ptdesc *ptdesc = virt_to_ptdesc(pmd);
-
- BUG_ON((unsigned long)pmd & (PAGE_SIZE-1));
- pagetable_dtor(ptdesc);
- pagetable_free(ptdesc);
+ pagetable_dtor_free(virt_to_page(pmd));
}
#endif
@@ -199,11 +192,7 @@ static inline pud_t *pud_alloc_one_noprof(struct
mm_struct *mm, unsigned long ad
static inline void __pud_free(struct mm_struct *mm, pud_t *pud)
{
- struct ptdesc *ptdesc = virt_to_ptdesc(pud);
-
- BUG_ON((unsigned long)pud & (PAGE_SIZE-1));
- pagetable_dtor(ptdesc);
- pagetable_free(ptdesc);
+ pagetable_dtor_free(virt_to_page(pud));
}
#ifndef __HAVE_ARCH_PUD_FREE
@@ -254,11 +243,7 @@ static inline p4d_t *p4d_alloc_one_noprof(struct
mm_struct *mm, unsigned long ad
static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)
{
- struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
-
- BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
- pagetable_dtor(ptdesc);
- pagetable_free(ptdesc);
+ pagetable_dtor_free(virt_to_page(p4d));
}
#ifndef __HAVE_ARCH_P4D_FREE
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 939a813023d7e..b34d014c23ef0 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -211,10 +211,7 @@ struct mmu_table_batch {
#ifndef __HAVE_ARCH_TLB_REMOVE_TABLE
static inline void __tlb_remove_table(void *_table)
{
- struct ptdesc *ptdesc = (struct ptdesc *)_table;
-
- pagetable_dtor(ptdesc);
- pagetable_free(ptdesc);
+ pagetable_dtor_free(_table);
}
#endif
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 497035a78849b..11829860ec05e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3064,6 +3064,14 @@ static inline void pagetable_dtor(struct ptdesc
*ptdesc)
lruvec_stat_sub_folio(folio, NR_PAGETABLE);
}
+static inline void pagetable_dtor_free(void *table)
+{
+ struct ptdesc *ptdesc = page_ptdesc((struct page *)table);
+
+ pagetable_dtor(ptdesc);
+ pagetable_dtor(ptdesc);
+}
+
static inline bool pagetable_pte_ctor(struct ptdesc *ptdesc)
{
struct folio *folio = ptdesc_folio(ptdesc);
Thanks!
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 11/12] mm: pgtable: introduce generic __tlb_remove_table()
2024-12-17 3:42 ` Qi Zheng
@ 2024-12-17 9:02 ` Peter Zijlstra
2024-12-17 9:10 ` Qi Zheng
0 siblings, 1 reply; 32+ messages in thread
From: Peter Zijlstra @ 2024-12-17 9:02 UTC (permalink / raw)
To: Qi Zheng
Cc: Vishal Moola, tglx, david, jannh, hughd, yuzhao, willy,
muchun.song, vbabka, lorenzo.stoakes, akpm, rientjes, linux-mm,
linux-kernel
On Tue, Dec 17, 2024 at 11:42:02AM +0800, Qi Zheng wrote:
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 497035a78849b..11829860ec05e 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3064,6 +3064,14 @@ static inline void pagetable_dtor(struct ptdesc
> *ptdesc)
> lruvec_stat_sub_folio(folio, NR_PAGETABLE);
> }
>
> +static inline void pagetable_dtor_free(void *table)
> +{
> + struct ptdesc *ptdesc = page_ptdesc((struct page *)table);
> +
> + pagetable_dtor(ptdesc);
> + pagetable_dtor(ptdesc);
> +}
Right, that works, except you have whitespace issues and I think you'll
find it'll work better if you don't call _dtor twice but instead replace
that last one with _free() :-)
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 11/12] mm: pgtable: introduce generic __tlb_remove_table()
2024-12-17 9:02 ` Peter Zijlstra
@ 2024-12-17 9:10 ` Qi Zheng
0 siblings, 0 replies; 32+ messages in thread
From: Qi Zheng @ 2024-12-17 9:10 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Vishal Moola, tglx, david, jannh, hughd, yuzhao, willy,
muchun.song, vbabka, lorenzo.stoakes, akpm, rientjes, linux-mm,
linux-kernel
On 2024/12/17 17:02, Peter Zijlstra wrote:
> On Tue, Dec 17, 2024 at 11:42:02AM +0800, Qi Zheng wrote:
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 497035a78849b..11829860ec05e 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -3064,6 +3064,14 @@ static inline void pagetable_dtor(struct ptdesc
>> *ptdesc)
>> lruvec_stat_sub_folio(folio, NR_PAGETABLE);
>> }
>>
>> +static inline void pagetable_dtor_free(void *table)
>> +{
>> + struct ptdesc *ptdesc = page_ptdesc((struct page *)table);
>> +
>> + pagetable_dtor(ptdesc);
>> + pagetable_dtor(ptdesc);
>> +}
>
> Right, that works, except you have whitespace issues and I think you'll
> find it'll work better if you don't call _dtor twice but instead replace
> that last one with _free() :-)
Ah, stupid thing I did, please ignore it. ;)
Will add this to v2.
Thanks!
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 02/12] mm: pgtable: introduce generic p4d_alloc_one() and p4d_free()
2024-12-14 9:02 ` [PATCH 02/12] mm: pgtable: introduce generic p4d_alloc_one() and p4d_free() Qi Zheng
@ 2024-12-18 14:26 ` Klara Modin
2024-12-18 14:42 ` Qi Zheng
0 siblings, 1 reply; 32+ messages in thread
From: Klara Modin @ 2024-12-18 14:26 UTC (permalink / raw)
To: Qi Zheng, peterz, tglx, david, jannh, hughd, yuzhao, willy,
muchun.song, vbabka, lorenzo.stoakes, akpm, rientjes
Cc: linux-mm, linux-kernel
Hi,
On 2024-12-14 10:02, Qi Zheng wrote:
> Several architectures (arm64, riscv, x86) define p4d_alloc_one() as a
> wrapper for get_zeroed_page() and p4d_free() as a wrapper for free_page().
>
> For these architectures, provide a generic implementation in
> asm-generic/pgalloc.h and convert them to use it. And like other levels
> of page tables, add statistics for P4D level page table.
>
> For s390, it also defines p4d_alloc_one() and p4d_free(), but it uses its
> own logic, so skip it.
>
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
> ---
> arch/arm64/include/asm/pgalloc.h | 15 ++++-----
> arch/riscv/include/asm/pgalloc.h | 25 ++++++---------
> arch/x86/include/asm/pgalloc.h | 16 ++++------
> arch/x86/mm/pgtable.c | 3 ++
> include/asm-generic/pgalloc.h | 55 ++++++++++++++++++++++++++++++++
> include/linux/mm.h | 16 ++++++++++
> 6 files changed, 98 insertions(+), 32 deletions(-)
>
> diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
> index e75422864d1bd..679c530549327 100644
> --- a/arch/arm64/include/asm/pgalloc.h
> +++ b/arch/arm64/include/asm/pgalloc.h
> @@ -15,6 +15,8 @@
>
> #define __HAVE_ARCH_PGD_FREE
> #define __HAVE_ARCH_PUD_FREE
> +#define __HAVE_ARCH_P4D_ALLOC_ONE
> +#define __HAVE_ARCH_P4D_FREE
> #include <asm-generic/pgalloc.h>
>
> #define PGD_SIZE (PTRS_PER_PGD * sizeof(pgd_t))
> @@ -87,19 +89,16 @@ static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgdp, p4d_t *p4dp)
>
> static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr)
> {
> - gfp_t gfp = GFP_PGTABLE_USER;
> + if (!pgtable_l5_enabled())
> + return NULL;
>
> - if (mm == &init_mm)
> - gfp = GFP_PGTABLE_KERNEL;
> - return (p4d_t *)get_zeroed_page(gfp);
> + return __p4d_alloc_one(mm, addr);
> }
>
> static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
> {
> - if (!pgtable_l5_enabled())
> - return;
> - BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
> - free_page((unsigned long)p4d);
> + if (pgtable_l5_enabled())
> + __p4d_free(mm, p4d);
> }
>
> #define __p4d_free_tlb(tlb, p4d, addr) p4d_free((tlb)->mm, p4d)
> diff --git a/arch/riscv/include/asm/pgalloc.h b/arch/riscv/include/asm/pgalloc.h
> index f52264304f772..bb6e1c5f1fb19 100644
> --- a/arch/riscv/include/asm/pgalloc.h
> +++ b/arch/riscv/include/asm/pgalloc.h
> @@ -14,6 +14,8 @@
> #ifdef CONFIG_MMU
> #define __HAVE_ARCH_PUD_ALLOC_ONE
> #define __HAVE_ARCH_PUD_FREE
> +#define __HAVE_ARCH_P4D_ALLOC_ONE
> +#define __HAVE_ARCH_P4D_FREE
> #include <asm-generic/pgalloc.h>
>
> static inline void riscv_tlb_remove_ptdesc(struct mmu_gather *tlb, void *pt)
> @@ -118,21 +120,10 @@ static inline void __pud_free_tlb(struct mmu_gather *tlb, pud_t *pud,
> #define p4d_alloc_one p4d_alloc_one
> static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr)
> {
> - if (pgtable_l5_enabled) {
> - gfp_t gfp = GFP_PGTABLE_USER;
> -
> - if (mm == &init_mm)
> - gfp = GFP_PGTABLE_KERNEL;
> - return (p4d_t *)get_zeroed_page(gfp);
> - }
> + if (!pgtable_l5_enabled)
> + return NULL;
>
> - return NULL;
> -}
> -
> -static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)
> -{
> - BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
> - free_page((unsigned long)p4d);
> + return __p4d_alloc_one(mm, addr);
> }
>
> #define p4d_free p4d_free
> @@ -145,8 +136,12 @@ static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
> static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
> unsigned long addr)
> {
> - if (pgtable_l5_enabled)
> + if (pgtable_l5_enabled) {
> + struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
> +
> + pagetable_p4d_dtor(ptdesc);
> riscv_tlb_remove_ptdesc(tlb, virt_to_ptdesc(p4d));
> + }
> }
> #endif /* __PAGETABLE_PMD_FOLDED */
>
> diff --git a/arch/x86/include/asm/pgalloc.h b/arch/x86/include/asm/pgalloc.h
> index dcd836b59bebd..d9bc6cae77c9e 100644
> --- a/arch/x86/include/asm/pgalloc.h
> +++ b/arch/x86/include/asm/pgalloc.h
> @@ -8,6 +8,8 @@
>
> #define __HAVE_ARCH_PTE_ALLOC_ONE
> #define __HAVE_ARCH_PGD_FREE
> +#define __HAVE_ARCH_P4D_ALLOC_ONE
> +#define __HAVE_ARCH_P4D_FREE
> #include <asm-generic/pgalloc.h>
>
> static inline int __paravirt_pgd_alloc(struct mm_struct *mm) { return 0; }
> @@ -149,20 +151,16 @@ static inline void pgd_populate_safe(struct mm_struct *mm, pgd_t *pgd, p4d_t *p4
>
> static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned long addr)
> {
> - gfp_t gfp = GFP_KERNEL_ACCOUNT;
> + if (!pgtable_l5_enabled())
> + return NULL;
>
> - if (mm == &init_mm)
> - gfp &= ~__GFP_ACCOUNT;
> - return (p4d_t *)get_zeroed_page(gfp);
> + return __p4d_alloc_one(mm, addr);
> }
>
> static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
> {
> - if (!pgtable_l5_enabled())
> - return;
> -
> - BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
> - free_page((unsigned long)p4d);
> + if (pgtable_l5_enabled())
> + return __p4d_free(mm, p4d);
> }
>
> extern void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d);
> diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
> index 69a357b15974a..3d6e84da45b24 100644
> --- a/arch/x86/mm/pgtable.c
> +++ b/arch/x86/mm/pgtable.c
> @@ -94,6 +94,9 @@ void ___pud_free_tlb(struct mmu_gather *tlb, pud_t *pud)
> #if CONFIG_PGTABLE_LEVELS > 4
> void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d)
> {
> + struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
> +
> + pagetable_p4d_dtor(ptdesc);
> paravirt_release_p4d(__pa(p4d) >> PAGE_SHIFT);
> paravirt_tlb_remove_table(tlb, virt_to_page(p4d));
> }
> diff --git a/include/asm-generic/pgalloc.h b/include/asm-generic/pgalloc.h
> index 7c48f5fbf8aa7..dbf61819b3581 100644
> --- a/include/asm-generic/pgalloc.h
> +++ b/include/asm-generic/pgalloc.h
> @@ -215,6 +215,61 @@ static inline void pud_free(struct mm_struct *mm, pud_t *pud)
>
> #endif /* CONFIG_PGTABLE_LEVELS > 3 */
>
> +#if CONFIG_PGTABLE_LEVELS > 4
> +
> +static inline p4d_t *__p4d_alloc_one_noprof(struct mm_struct *mm, unsigned long addr)
> +{
> + gfp_t gfp = GFP_PGTABLE_USER;
> + struct ptdesc *ptdesc;
> +
> + if (mm == &init_mm)
> + gfp = GFP_PGTABLE_KERNEL;
> + gfp &= ~__GFP_HIGHMEM;
> +
> + ptdesc = pagetable_alloc_noprof(gfp, 0);
> + if (!ptdesc)
> + return NULL;
> +
> + pagetable_p4d_ctor(ptdesc);
> + return ptdesc_address(ptdesc);
> +}
> +#define __p4d_alloc_one(...) alloc_hooks(__p4d_alloc_one_noprof(__VA_ARGS__))
> +
> +#ifndef __HAVE_ARCH_P4D_ALLOC_ONE
> +/**
> + * p4d_alloc_one - allocate memory for a P4D-level page table
> + * @mm: the mm_struct of the current context
> + *
> + * Allocate memory for a page table using %GFP_PGTABLE_USER for user context
> + * and %GFP_PGTABLE_KERNEL for kernel context.
> + *
> + * Return: pointer to the allocated memory or %NULL on error
> + */
> +static inline p4d_t *p4d_alloc_one_noprof(struct mm_struct *mm, unsigned long addr)
> +{
> + return __p4d_alloc_one_noprof(mm, addr);
> +}
> +#define p4d_alloc_one(...) alloc_hooks(p4d_alloc_one_noprof(__VA_ARGS__))
> +#endif
> +
> +static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)
> +{
> + struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
> +
> + BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
> + pagetable_p4d_dtor(ptdesc);
> + pagetable_free(ptdesc);
> +}
> +
> +#ifndef __HAVE_ARCH_P4D_FREE
> +static inline void p4d_free(struct mm_struct *mm, pud_t *p4d)
Should this perhaps be p4d_t *p4d rather than pud_t *p4d?
Otherwise I get this build error:
In file included from
/home/klara/git/linux/arch/riscv/include/asm/kfence.h:8,
from /home/klara/git/linux/mm/kfence/core.c:34:
/home/klara/git/linux/include/asm-generic/pgalloc.h: In function ‘p4d_free’:
/home/klara/git/linux/include/asm-generic/pgalloc.h:252:24: error:
passing argument 2 of ‘__p4d_free’ from incompatible pointer type
[-Wincompatible-pointer-types]
252 | __p4d_free(mm, p4d);
| ^~~
| |
| pud_t *
/home/klara/git/linux/include/asm-generic/pgalloc.h:244:60: note:
expected ‘p4d_t *’ but argument is of type ‘pud_t *’
244 | static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)
| ~~~~~~~^~~
In file included from
/home/klara/git/linux/arch/riscv/include/asm/kfence.h:8,
from /home/klara/git/linux/mm/kfence/report.c:22:
/home/klara/git/linux/include/asm-generic/pgalloc.h: In function ‘p4d_free’:
/home/klara/git/linux/include/asm-generic/pgalloc.h:252:24: error:
passing argument 2 of ‘__p4d_free’ from incompatible pointer type
[-Wincompatible-pointer-types]
252 | __p4d_free(mm, p4d);
| ^~~
| |
| pud_t *
/home/klara/git/linux/include/asm-generic/pgalloc.h:244:60: note:
expected ‘p4d_t *’ but argument is of type ‘pud_t *’
244 | static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)
| ~~~~~~~^~~
Regards,
Klara Modin
> +{
> + __p4d_free(mm, p4d);
> +}
> +#endif
> +
> +#endif
> +
> #ifndef __HAVE_ARCH_PGD_FREE
> static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
> {
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 5e73e53c34e9e..807a12ed8ec96 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3237,6 +3237,22 @@ static inline void pagetable_pud_dtor(struct ptdesc *ptdesc)
> lruvec_stat_sub_folio(folio, NR_PAGETABLE);
> }
>
> +static inline void pagetable_p4d_ctor(struct ptdesc *ptdesc)
> +{
> + struct folio *folio = ptdesc_folio(ptdesc);
> +
> + __folio_set_pgtable(folio);
> + lruvec_stat_add_folio(folio, NR_PAGETABLE);
> +}
> +
> +static inline void pagetable_p4d_dtor(struct ptdesc *ptdesc)
> +{
> + struct folio *folio = ptdesc_folio(ptdesc);
> +
> + __folio_clear_pgtable(folio);
> + lruvec_stat_sub_folio(folio, NR_PAGETABLE);
> +}
> +
> extern void __init pagecache_init(void);
> extern void free_initmem(void);
>
^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 02/12] mm: pgtable: introduce generic p4d_alloc_one() and p4d_free()
2024-12-18 14:26 ` Klara Modin
@ 2024-12-18 14:42 ` Qi Zheng
0 siblings, 0 replies; 32+ messages in thread
From: Qi Zheng @ 2024-12-18 14:42 UTC (permalink / raw)
To: Klara Modin
Cc: peterz, tglx, david, jannh, hughd, yuzhao, willy, muchun.song,
vbabka, lorenzo.stoakes, akpm, rientjes, linux-mm, linux-kernel
On 2024/12/18 22:26, Klara Modin wrote:
> Hi,
>
> On 2024-12-14 10:02, Qi Zheng wrote:
>> Several architectures (arm64, riscv, x86) define p4d_alloc_one() as a
>> wrapper for get_zeroed_page() and p4d_free() as a wrapper for
>> free_page().
>>
>> For these architectures, provide a generic implementation in
>> asm-generic/pgalloc.h and convert them to use it. And like other levels
>> of page tables, add statistics for P4D level page table.
>>
>> For s390, it also defines p4d_alloc_one() and p4d_free(), but it uses its
>> own logic, so skip it.
>>
>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>> ---
>> arch/arm64/include/asm/pgalloc.h | 15 ++++-----
>> arch/riscv/include/asm/pgalloc.h | 25 ++++++---------
>> arch/x86/include/asm/pgalloc.h | 16 ++++------
>> arch/x86/mm/pgtable.c | 3 ++
>> include/asm-generic/pgalloc.h | 55 ++++++++++++++++++++++++++++++++
>> include/linux/mm.h | 16 ++++++++++
>> 6 files changed, 98 insertions(+), 32 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/pgalloc.h
>> b/arch/arm64/include/asm/pgalloc.h
>> index e75422864d1bd..679c530549327 100644
>> --- a/arch/arm64/include/asm/pgalloc.h
>> +++ b/arch/arm64/include/asm/pgalloc.h
>> @@ -15,6 +15,8 @@
>> #define __HAVE_ARCH_PGD_FREE
>> #define __HAVE_ARCH_PUD_FREE
>> +#define __HAVE_ARCH_P4D_ALLOC_ONE
>> +#define __HAVE_ARCH_P4D_FREE
>> #include <asm-generic/pgalloc.h>
>> #define PGD_SIZE (PTRS_PER_PGD * sizeof(pgd_t))
>> @@ -87,19 +89,16 @@ static inline void pgd_populate(struct mm_struct
>> *mm, pgd_t *pgdp, p4d_t *p4dp)
>> static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned
>> long addr)
>> {
>> - gfp_t gfp = GFP_PGTABLE_USER;
>> + if (!pgtable_l5_enabled())
>> + return NULL;
>> - if (mm == &init_mm)
>> - gfp = GFP_PGTABLE_KERNEL;
>> - return (p4d_t *)get_zeroed_page(gfp);
>> + return __p4d_alloc_one(mm, addr);
>> }
>> static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
>> {
>> - if (!pgtable_l5_enabled())
>> - return;
>> - BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
>> - free_page((unsigned long)p4d);
>> + if (pgtable_l5_enabled())
>> + __p4d_free(mm, p4d);
>> }
>> #define __p4d_free_tlb(tlb, p4d, addr) p4d_free((tlb)->mm, p4d)
>> diff --git a/arch/riscv/include/asm/pgalloc.h
>> b/arch/riscv/include/asm/pgalloc.h
>> index f52264304f772..bb6e1c5f1fb19 100644
>> --- a/arch/riscv/include/asm/pgalloc.h
>> +++ b/arch/riscv/include/asm/pgalloc.h
>> @@ -14,6 +14,8 @@
>> #ifdef CONFIG_MMU
>> #define __HAVE_ARCH_PUD_ALLOC_ONE
>> #define __HAVE_ARCH_PUD_FREE
>> +#define __HAVE_ARCH_P4D_ALLOC_ONE
>> +#define __HAVE_ARCH_P4D_FREE
>> #include <asm-generic/pgalloc.h>
>> static inline void riscv_tlb_remove_ptdesc(struct mmu_gather *tlb,
>> void *pt)
>> @@ -118,21 +120,10 @@ static inline void __pud_free_tlb(struct
>> mmu_gather *tlb, pud_t *pud,
>> #define p4d_alloc_one p4d_alloc_one
>> static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned
>> long addr)
>> {
>> - if (pgtable_l5_enabled) {
>> - gfp_t gfp = GFP_PGTABLE_USER;
>> -
>> - if (mm == &init_mm)
>> - gfp = GFP_PGTABLE_KERNEL;
>> - return (p4d_t *)get_zeroed_page(gfp);
>> - }
>> + if (!pgtable_l5_enabled)
>> + return NULL;
>> - return NULL;
>> -}
>> -
>> -static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)
>> -{
>> - BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
>> - free_page((unsigned long)p4d);
>> + return __p4d_alloc_one(mm, addr);
>> }
>> #define p4d_free p4d_free
>> @@ -145,8 +136,12 @@ static inline void p4d_free(struct mm_struct *mm,
>> p4d_t *p4d)
>> static inline void __p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d,
>> unsigned long addr)
>> {
>> - if (pgtable_l5_enabled)
>> + if (pgtable_l5_enabled) {
>> + struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
>> +
>> + pagetable_p4d_dtor(ptdesc);
>> riscv_tlb_remove_ptdesc(tlb, virt_to_ptdesc(p4d));
>> + }
>> }
>> #endif /* __PAGETABLE_PMD_FOLDED */
>> diff --git a/arch/x86/include/asm/pgalloc.h
>> b/arch/x86/include/asm/pgalloc.h
>> index dcd836b59bebd..d9bc6cae77c9e 100644
>> --- a/arch/x86/include/asm/pgalloc.h
>> +++ b/arch/x86/include/asm/pgalloc.h
>> @@ -8,6 +8,8 @@
>> #define __HAVE_ARCH_PTE_ALLOC_ONE
>> #define __HAVE_ARCH_PGD_FREE
>> +#define __HAVE_ARCH_P4D_ALLOC_ONE
>> +#define __HAVE_ARCH_P4D_FREE
>> #include <asm-generic/pgalloc.h>
>> static inline int __paravirt_pgd_alloc(struct mm_struct *mm) {
>> return 0; }
>> @@ -149,20 +151,16 @@ static inline void pgd_populate_safe(struct
>> mm_struct *mm, pgd_t *pgd, p4d_t *p4
>> static inline p4d_t *p4d_alloc_one(struct mm_struct *mm, unsigned
>> long addr)
>> {
>> - gfp_t gfp = GFP_KERNEL_ACCOUNT;
>> + if (!pgtable_l5_enabled())
>> + return NULL;
>> - if (mm == &init_mm)
>> - gfp &= ~__GFP_ACCOUNT;
>> - return (p4d_t *)get_zeroed_page(gfp);
>> + return __p4d_alloc_one(mm, addr);
>> }
>> static inline void p4d_free(struct mm_struct *mm, p4d_t *p4d)
>> {
>> - if (!pgtable_l5_enabled())
>> - return;
>> -
>> - BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
>> - free_page((unsigned long)p4d);
>> + if (pgtable_l5_enabled())
>> + return __p4d_free(mm, p4d);
>> }
>> extern void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d);
>> diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
>> index 69a357b15974a..3d6e84da45b24 100644
>> --- a/arch/x86/mm/pgtable.c
>> +++ b/arch/x86/mm/pgtable.c
>> @@ -94,6 +94,9 @@ void ___pud_free_tlb(struct mmu_gather *tlb, pud_t
>> *pud)
>> #if CONFIG_PGTABLE_LEVELS > 4
>> void ___p4d_free_tlb(struct mmu_gather *tlb, p4d_t *p4d)
>> {
>> + struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
>> +
>> + pagetable_p4d_dtor(ptdesc);
>> paravirt_release_p4d(__pa(p4d) >> PAGE_SHIFT);
>> paravirt_tlb_remove_table(tlb, virt_to_page(p4d));
>> }
>> diff --git a/include/asm-generic/pgalloc.h
>> b/include/asm-generic/pgalloc.h
>> index 7c48f5fbf8aa7..dbf61819b3581 100644
>> --- a/include/asm-generic/pgalloc.h
>> +++ b/include/asm-generic/pgalloc.h
>> @@ -215,6 +215,61 @@ static inline void pud_free(struct mm_struct *mm,
>> pud_t *pud)
>> #endif /* CONFIG_PGTABLE_LEVELS > 3 */
>> +#if CONFIG_PGTABLE_LEVELS > 4
>> +
>> +static inline p4d_t *__p4d_alloc_one_noprof(struct mm_struct *mm,
>> unsigned long addr)
>> +{
>> + gfp_t gfp = GFP_PGTABLE_USER;
>> + struct ptdesc *ptdesc;
>> +
>> + if (mm == &init_mm)
>> + gfp = GFP_PGTABLE_KERNEL;
>> + gfp &= ~__GFP_HIGHMEM;
>> +
>> + ptdesc = pagetable_alloc_noprof(gfp, 0);
>> + if (!ptdesc)
>> + return NULL;
>> +
>> + pagetable_p4d_ctor(ptdesc);
>> + return ptdesc_address(ptdesc);
>> +}
>> +#define __p4d_alloc_one(...)
>> alloc_hooks(__p4d_alloc_one_noprof(__VA_ARGS__))
>> +
>> +#ifndef __HAVE_ARCH_P4D_ALLOC_ONE
>> +/**
>> + * p4d_alloc_one - allocate memory for a P4D-level page table
>> + * @mm: the mm_struct of the current context
>> + *
>> + * Allocate memory for a page table using %GFP_PGTABLE_USER for user
>> context
>> + * and %GFP_PGTABLE_KERNEL for kernel context.
>> + *
>> + * Return: pointer to the allocated memory or %NULL on error
>> + */
>> +static inline p4d_t *p4d_alloc_one_noprof(struct mm_struct *mm,
>> unsigned long addr)
>> +{
>> + return __p4d_alloc_one_noprof(mm, addr);
>> +}
>> +#define p4d_alloc_one(...)
>> alloc_hooks(p4d_alloc_one_noprof(__VA_ARGS__))
>> +#endif
>> +
>> +static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)
>> +{
>> + struct ptdesc *ptdesc = virt_to_ptdesc(p4d);
>> +
>> + BUG_ON((unsigned long)p4d & (PAGE_SIZE-1));
>> + pagetable_p4d_dtor(ptdesc);
>> + pagetable_free(ptdesc);
>> +}
>> +
>
>
>> +#ifndef __HAVE_ARCH_P4D_FREE
>> +static inline void p4d_free(struct mm_struct *mm, pud_t *p4d)
>
> Should this perhaps be p4d_t *p4d rather than pud_t *p4d?
Yes, my bad. Will fix it.
Thanks!
>
> Otherwise I get this build error:
>
> In file included from
> /home/klara/git/linux/arch/riscv/include/asm/kfence.h:8,
> from /home/klara/git/linux/mm/kfence/core.c:34:
> /home/klara/git/linux/include/asm-generic/pgalloc.h: In function
> ‘p4d_free’:
> /home/klara/git/linux/include/asm-generic/pgalloc.h:252:24: error:
> passing argument 2 of ‘__p4d_free’ from incompatible pointer type
> [-Wincompatible-pointer-types]
> 252 | __p4d_free(mm, p4d);
> | ^~~
> | |
> | pud_t *
> /home/klara/git/linux/include/asm-generic/pgalloc.h:244:60: note:
> expected ‘p4d_t *’ but argument is of type ‘pud_t *’
> 244 | static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)
> | ~~~~~~~^~~
> In file included from
> /home/klara/git/linux/arch/riscv/include/asm/kfence.h:8,
> from /home/klara/git/linux/mm/kfence/report.c:22:
> /home/klara/git/linux/include/asm-generic/pgalloc.h: In function
> ‘p4d_free’:
> /home/klara/git/linux/include/asm-generic/pgalloc.h:252:24: error:
> passing argument 2 of ‘__p4d_free’ from incompatible pointer type
> [-Wincompatible-pointer-types]
> 252 | __p4d_free(mm, p4d);
> | ^~~
> | |
> | pud_t *
> /home/klara/git/linux/include/asm-generic/pgalloc.h:244:60: note:
> expected ‘p4d_t *’ but argument is of type ‘pud_t *’
> 244 | static inline void __p4d_free(struct mm_struct *mm, p4d_t *p4d)
> | ~~~~~~~^~~
>
> Regards,
> Klara Modin
>
>> +{
>> + __p4d_free(mm, p4d);
>> +}
>> +#endif
>> +
>> +#endif
>> +
>> #ifndef __HAVE_ARCH_PGD_FREE
>> static inline void pgd_free(struct mm_struct *mm, pgd_t *pgd)
>> {
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 5e73e53c34e9e..807a12ed8ec96 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -3237,6 +3237,22 @@ static inline void pagetable_pud_dtor(struct
>> ptdesc *ptdesc)
>> lruvec_stat_sub_folio(folio, NR_PAGETABLE);
>> }
>> +static inline void pagetable_p4d_ctor(struct ptdesc *ptdesc)
>> +{
>> + struct folio *folio = ptdesc_folio(ptdesc);
>> +
>> + __folio_set_pgtable(folio);
>> + lruvec_stat_add_folio(folio, NR_PAGETABLE);
>> +}
>> +
>> +static inline void pagetable_p4d_dtor(struct ptdesc *ptdesc)
>> +{
>> + struct folio *folio = ptdesc_folio(ptdesc);
>> +
>> + __folio_clear_pgtable(folio);
>> + lruvec_stat_sub_folio(folio, NR_PAGETABLE);
>> +}
>> +
>> extern void __init pagecache_init(void);
>> extern void free_initmem(void);
>
^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2024-12-18 14:43 UTC | newest]
Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-12-14 9:02 [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Qi Zheng
2024-12-14 9:02 ` [PATCH 01/12] Revert "mm: pgtable: make ptlock be freed by RCU" Qi Zheng
2024-12-14 18:29 ` Yu Zhao
2024-12-15 6:29 ` Qi Zheng
2024-12-16 6:10 ` Andrew Morton
2024-12-16 6:15 ` Qi Zheng
2024-12-16 6:35 ` Andrew Morton
2024-12-16 6:39 ` Qi Zheng
2024-12-14 9:02 ` [PATCH 02/12] mm: pgtable: introduce generic p4d_alloc_one() and p4d_free() Qi Zheng
2024-12-18 14:26 ` Klara Modin
2024-12-18 14:42 ` Qi Zheng
2024-12-14 9:02 ` [PATCH 03/12] arm64: pgtable: use mmu gather to free p4d level page table Qi Zheng
2024-12-14 9:02 ` [PATCH 04/12] s390: pgtable: add statistics for PUD and P4D " Qi Zheng
2024-12-14 9:02 ` [PATCH 05/12] mm: pgtable: introduce pagetable_dtor() Qi Zheng
2024-12-14 9:02 ` [PATCH 06/12] arm: pgtable: move pagetable_dtor() to __tlb_remove_table() Qi Zheng
2024-12-14 9:02 ` [PATCH 07/12] arm64: " Qi Zheng
2024-12-14 9:02 ` [PATCH 08/12] riscv: " Qi Zheng
2024-12-14 9:02 ` [PATCH 09/12] x86: " Qi Zheng
2024-12-14 9:02 ` [PATCH 10/12] s390: pgtable: also move pagetable_dtor() of PxD " Qi Zheng
2024-12-14 9:02 ` [PATCH 11/12] mm: pgtable: introduce generic __tlb_remove_table() Qi Zheng
2024-12-16 12:00 ` Peter Zijlstra
2024-12-16 12:03 ` Peter Zijlstra
2024-12-16 12:05 ` Peter Zijlstra
2024-12-16 13:53 ` Qi Zheng
2024-12-16 12:52 ` Qi Zheng
2024-12-16 18:12 ` Peter Zijlstra
2024-12-17 3:42 ` Qi Zheng
2024-12-17 9:02 ` Peter Zijlstra
2024-12-17 9:10 ` Qi Zheng
2024-12-14 9:02 ` [PATCH 12/12] mm: pgtable: move __tlb_remove_table_one() in x86 to generic file Qi Zheng
2024-12-14 18:55 ` [PATCH 00/12] move pagetable_*_dtor() to __tlb_remove_table() Peter Zijlstra
2024-12-15 6:23 ` Qi Zheng
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox