linux-mm.kvack.org archive mirror
* [PATCH 0/4] LoongArch: Fix vmalloc test issue
@ 2024-10-10  3:50 Bibo Mao
  2024-10-10  3:50 ` [PATCH 1/4] LoongArch: Set pte entry with PAGE_GLOBAL for kernel space Bibo Mao
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: Bibo Mao @ 2024-10-10  3:50 UTC (permalink / raw)
  To: Huacai Chen, Andrey Ryabinin, Andrew Morton
  Cc: David Hildenbrand, Barry Song, loongarch, linux-kernel,
	kasan-dev, linux-mm

On a LoongArch 3C5000 dual-way machine with 32 CPUs and 128G RAM,
there are some errors when running the vmalloc test with a command like:
  insmod test_vmalloc.ko   nr_threads=32  run_test_mask=0x3af

Here is part of the error message:
 WARNING: CPU: 13 PID: 1457 at mm/vmalloc.c:503 vmap_small_pages_range_noflush+0x388/0x510
 CPU: 13 UID: 0 PID: 1457 Comm: vmalloc_test/15 Not tainted 6.12.0-rc2+ #93

 Trying to vfree() nonexistent vm area (000000004dec9ced)
 WARNING: CPU: 3 PID: 1444 at mm/vmalloc.c:3345 vfree+0x1e8/0x4c8
 CPU: 3 UID: 0 PID: 1444 Comm: vmalloc_test/2

 Trying to vfree() bad address (00000000fc7c9da5)
 WARNING: CPU: 10 PID: 1552 at mm/vmalloc.c:3210 remove_vm_area+0x88/0x98
 CPU: 10 UID: 0 PID: 1552 Comm: kworker/u144:3

The main problem is that set_pte() and pte_clear() are not atomic with
respect to each other, so there is contention between them: for kernel
space areas, these functions each need to modify two consecutive pte
entries to ensure that both entries have the PAGE_GLOBAL bit set.
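
A rough sketch of one problematic interleaving (hypothetical, for
illustration only; G stands for _PAGE_GLOBAL):

  /* ptes A and B are one buddy pair backing a single TLB entry.   */
  /* CPU 0: set_pte(&A, val|G)    |  CPU 1: pte_clear(.., &B)      */
  /* 0: A = val|G                 |  1: read buddy A (stale, !G)   */
  /* 0: read buddy B, B != none   |  1: B = 0  (global bit lost)   */
  /*    -> skip the buddy fixup   |                                */
  /* Result: A is global but B is not, so the HW TLB entry is no   */
  /* longer global and kernel TLB flushes can miss it.             */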

With this patchset, the vmalloc test case passes when run with the command:
  insmod test_vmalloc.ko   nr_threads=32  run_test_mask=0x3af

Bibo Mao (4):
  LoongArch: Set pte entry with PAGE_GLOBAL for kernel space
  mm/sparse-vmemmap: set pte_init when vmemmap is created
  LoongArch: Add barrier between set_pte and memory access
  LoongArch: Use atomic operation with set_pte and pte_clear function

 arch/loongarch/include/asm/cacheflush.h | 14 +++++++-
 arch/loongarch/include/asm/pgalloc.h    | 13 +++++++
 arch/loongarch/include/asm/pgtable.h    | 45 +++++++++----------------
 arch/loongarch/mm/init.c                |  4 ++-
 arch/loongarch/mm/kasan_init.c          |  4 ++-
 arch/loongarch/mm/pgtable.c             | 22 ++++++++++++
 mm/sparse-vmemmap.c                     |  5 +++
 7 files changed, 75 insertions(+), 32 deletions(-)


base-commit: 87d6aab2389e5ce0197d8257d5f8ee965a67c4cd
-- 
2.39.3



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/4] LoongArch: Set pte entry with PAGE_GLOBAL for kernel space
  2024-10-10  3:50 [PATCH 0/4] LoongArch: Fix vmalloc test issue Bibo Mao
@ 2024-10-10  3:50 ` Bibo Mao
  2024-10-12  2:15   ` Huacai Chen
  2024-10-10  3:50 ` [PATCH 2/4] mm/sparse-vmemmap: set pte_init when vmemmap is created Bibo Mao
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 14+ messages in thread
From: Bibo Mao @ 2024-10-10  3:50 UTC (permalink / raw)
  To: Huacai Chen, Andrey Ryabinin, Andrew Morton
  Cc: David Hildenbrand, Barry Song, loongarch, linux-kernel,
	kasan-dev, linux-mm

Unlike most architectures, one TLB entry covers two pages on LoongArch
systems. For kernel space, both of the two pte entries must have
PAGE_GLOBAL set, otherwise the HW treats the mapping as a non-global
TLB entry, and a non-global kernel TLB entry causes potential problems,
such as failing to flush kernel TLBs with local_flush_tlb_kernel_range(),
which only flushes TLB entries that have the global bit set.

Here the function kernel_pte_init() is added; it is used to initialize
a pte table when it is created, so that the default initial pte value
is PAGE_GLOBAL rather than zero.

Kernel space areas, including the fixmap, percpu, vmalloc and kasan
areas, get their default pte entries set with PAGE_GLOBAL.
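
For context, the two ptes of a pair are "buddies"; in the existing
LoongArch headers the buddy is found by flipping the low pointer bit,
roughly as in this sketch:

  /* The two ptes backing one TLB entry are adjacent in the table;
   * XOR with sizeof(pte_t) flips between the even and odd member. */
  #define ptep_buddy(ptep)  ((pte_t *)((unsigned long)(ptep) ^ sizeof(pte_t)))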

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
---
 arch/loongarch/include/asm/pgalloc.h | 13 +++++++++++++
 arch/loongarch/include/asm/pgtable.h |  1 +
 arch/loongarch/mm/init.c             |  4 +++-
 arch/loongarch/mm/kasan_init.c       |  4 +++-
 arch/loongarch/mm/pgtable.c          | 22 ++++++++++++++++++++++
 5 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/arch/loongarch/include/asm/pgalloc.h b/arch/loongarch/include/asm/pgalloc.h
index 4e2d6b7ca2ee..b2698c03dc2c 100644
--- a/arch/loongarch/include/asm/pgalloc.h
+++ b/arch/loongarch/include/asm/pgalloc.h
@@ -10,8 +10,21 @@
 
 #define __HAVE_ARCH_PMD_ALLOC_ONE
 #define __HAVE_ARCH_PUD_ALLOC_ONE
+#define __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL
 #include <asm-generic/pgalloc.h>
 
+static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
+{
+	pte_t *pte;
+
+	pte = (pte_t *) __get_free_page(GFP_KERNEL);
+	if (!pte)
+		return NULL;
+
+	kernel_pte_init(pte);
+	return pte;
+}
+
 static inline void pmd_populate_kernel(struct mm_struct *mm,
 				       pmd_t *pmd, pte_t *pte)
 {
diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
index 9965f52ef65b..22e3a8f96213 100644
--- a/arch/loongarch/include/asm/pgtable.h
+++ b/arch/loongarch/include/asm/pgtable.h
@@ -269,6 +269,7 @@ extern void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pm
 extern void pgd_init(void *addr);
 extern void pud_init(void *addr);
 extern void pmd_init(void *addr);
+extern void kernel_pte_init(void *addr);
 
 /*
  * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c
index 8a87a482c8f4..9f26e933a8a3 100644
--- a/arch/loongarch/mm/init.c
+++ b/arch/loongarch/mm/init.c
@@ -198,9 +198,11 @@ pte_t * __init populate_kernel_pte(unsigned long addr)
 	if (!pmd_present(pmdp_get(pmd))) {
 		pte_t *pte;
 
-		pte = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
+		pte = memblock_alloc_raw(PAGE_SIZE, PAGE_SIZE);
 		if (!pte)
 			panic("%s: Failed to allocate memory\n", __func__);
+
+		kernel_pte_init(pte);
 		pmd_populate_kernel(&init_mm, pmd, pte);
 	}
 
diff --git a/arch/loongarch/mm/kasan_init.c b/arch/loongarch/mm/kasan_init.c
index 427d6b1aec09..34988573b0d5 100644
--- a/arch/loongarch/mm/kasan_init.c
+++ b/arch/loongarch/mm/kasan_init.c
@@ -152,6 +152,8 @@ static void __init kasan_pte_populate(pmd_t *pmdp, unsigned long addr,
 		phys_addr_t page_phys = early ?
 					__pa_symbol(kasan_early_shadow_page)
 					      : kasan_alloc_zeroed_page(node);
+		if (!early)
+			kernel_pte_init(__va(page_phys));
 		next = addr + PAGE_SIZE;
 		set_pte(ptep, pfn_pte(__phys_to_pfn(page_phys), PAGE_KERNEL));
 	} while (ptep++, addr = next, addr != end && __pte_none(early, ptep_get(ptep)));
@@ -287,7 +289,7 @@ void __init kasan_init(void)
 		set_pte(&kasan_early_shadow_pte[i],
 			pfn_pte(__phys_to_pfn(__pa_symbol(kasan_early_shadow_page)), PAGE_KERNEL_RO));
 
-	memset(kasan_early_shadow_page, 0, PAGE_SIZE);
+	kernel_pte_init(kasan_early_shadow_page);
 	csr_write64(__pa_symbol(swapper_pg_dir), LOONGARCH_CSR_PGDH);
 	local_flush_tlb_all();
 
diff --git a/arch/loongarch/mm/pgtable.c b/arch/loongarch/mm/pgtable.c
index eb6a29b491a7..228ffc1db0a3 100644
--- a/arch/loongarch/mm/pgtable.c
+++ b/arch/loongarch/mm/pgtable.c
@@ -38,6 +38,28 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
 }
 EXPORT_SYMBOL_GPL(pgd_alloc);
 
+void kernel_pte_init(void *addr)
+{
+	unsigned long *p, *end;
+	unsigned long entry;
+
+	entry = (unsigned long)_PAGE_GLOBAL;
+	p = (unsigned long *)addr;
+	end = p + PTRS_PER_PTE;
+
+	do {
+		p[0] = entry;
+		p[1] = entry;
+		p[2] = entry;
+		p[3] = entry;
+		p[4] = entry;
+		p += 8;
+		p[-3] = entry;
+		p[-2] = entry;
+		p[-1] = entry;
+	} while (p != end);
+}
+
 void pgd_init(void *addr)
 {
 	unsigned long *p, *end;
-- 
2.39.3



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 2/4] mm/sparse-vmemmap: set pte_init when vmemmap is created
  2024-10-10  3:50 [PATCH 0/4] LoongArch: Fix vmalloc test issue Bibo Mao
  2024-10-10  3:50 ` [PATCH 1/4] LoongArch: Set pte entry with PAGE_GLOBAL for kernel space Bibo Mao
@ 2024-10-10  3:50 ` Bibo Mao
  2024-10-11  4:42   ` kernel test robot
                     ` (2 more replies)
  2024-10-10  3:50 ` [PATCH 3/4] LoongArch: Add barrier between set_pte and memory access Bibo Mao
  2024-10-10  3:50 ` [PATCH 4/4] LoongArch: Use atomic operation with set_pte and pte_clear function Bibo Mao
  3 siblings, 3 replies; 14+ messages in thread
From: Bibo Mao @ 2024-10-10  3:50 UTC (permalink / raw)
  To: Huacai Chen, Andrey Ryabinin, Andrew Morton
  Cc: David Hildenbrand, Barry Song, loongarch, linux-kernel,
	kasan-dev, linux-mm

Like pmd_init(), a weak function kernel_pte_init() is added; it is
only effective on LoongArch systems. When a pte table is created for
the vmemmap kernel space, kernel_pte_init() is called on it.

It has no effect on other architectures.
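
For reference, this relies on standard weak-symbol linking: the empty
weak default in common code is overridden at link time by the strong
arch definition. A minimal sketch (the arch-side body is elided):

  /* common code: weak default, a no-op on most architectures */
  void __weak __meminit kernel_pte_init(void *addr)
  {
  }

  /* arch code: a strong definition with the same name overrides the
   * weak one at link time on architectures that provide it */
  void kernel_pte_init(void *addr)
  {
          /* fill the new pte table with _PAGE_GLOBAL entries */
  }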

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
---
 mm/sparse-vmemmap.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index edcc7a6b0f6f..c0388b2e959d 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -184,6 +184,10 @@ static void * __meminit vmemmap_alloc_block_zero(unsigned long size, int node)
 	return p;
 }
 
+void __weak __meminit kernel_pte_init(void *addr)
+{
+}
+
 pmd_t * __meminit vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node)
 {
 	pmd_t *pmd = pmd_offset(pud, addr);
@@ -191,6 +195,7 @@ pmd_t * __meminit vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node)
 		void *p = vmemmap_alloc_block_zero(PAGE_SIZE, node);
 		if (!p)
 			return NULL;
+		kernel_pte_init(p);
 		pmd_populate_kernel(&init_mm, pmd, p);
 	}
 	return pmd;
-- 
2.39.3



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 3/4] LoongArch: Add barrier between set_pte and memory access
  2024-10-10  3:50 [PATCH 0/4] LoongArch: Fix vmalloc test issue Bibo Mao
  2024-10-10  3:50 ` [PATCH 1/4] LoongArch: Set pte entry with PAGE_GLOBAL for kernel space Bibo Mao
  2024-10-10  3:50 ` [PATCH 2/4] mm/sparse-vmemmap: set pte_init when vmemmap is created Bibo Mao
@ 2024-10-10  3:50 ` Bibo Mao
  2024-10-12  2:16   ` Huacai Chen
  2024-10-10  3:50 ` [PATCH 4/4] LoongArch: Use atomic operation with set_pte and pte_clear function Bibo Mao
  3 siblings, 1 reply; 14+ messages in thread
From: Bibo Mao @ 2024-10-10  3:50 UTC (permalink / raw)
  To: Huacai Chen, Andrey Ryabinin, Andrew Morton
  Cc: David Hildenbrand, Barry Song, loongarch, linux-kernel,
	kasan-dev, linux-mm

It is possible to get a spurious fault if memory is accessed right
after the pte is set. For user address space, the pte is set in kernel
space while the memory is accessed in user space, so there is a long
window for synchronization and no barrier is needed. For kernel address
space, however, memory may be accessed right after the pte is set.

Here flush_cache_vmap()/flush_cache_vmap_early() are used for the
synchronization.
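
A simplified sketch of where the barrier sits (the call sequence is
paraphrased from the generic vmalloc code, not from this patch):

  /* writer side, e.g. inside vmap()/vmalloc() */
  set_pte(ptep, pteval);          /* install the kernel mapping    */
  flush_cache_vmap(start, end);   /* now an smp_mb(): orders the   */
                                  /* pte store before the mapping  */
                                  /* address is published          */
  /* accesses through the new mapping after this point should not  */
  /* take a spurious fault                                         */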

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
---
 arch/loongarch/include/asm/cacheflush.h | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/loongarch/include/asm/cacheflush.h b/arch/loongarch/include/asm/cacheflush.h
index f8754d08a31a..53be231319ef 100644
--- a/arch/loongarch/include/asm/cacheflush.h
+++ b/arch/loongarch/include/asm/cacheflush.h
@@ -42,12 +42,24 @@ void local_flush_icache_range(unsigned long start, unsigned long end);
 #define flush_cache_dup_mm(mm)				do { } while (0)
 #define flush_cache_range(vma, start, end)		do { } while (0)
 #define flush_cache_page(vma, vmaddr, pfn)		do { } while (0)
-#define flush_cache_vmap(start, end)			do { } while (0)
 #define flush_cache_vunmap(start, end)			do { } while (0)
 #define flush_icache_user_page(vma, page, addr, len)	do { } while (0)
 #define flush_dcache_mmap_lock(mapping)			do { } while (0)
 #define flush_dcache_mmap_unlock(mapping)		do { } while (0)
 
+/*
+ * It is possible for a kernel virtual mapping access to return a spurious
+ * fault if it's accessed right after the pte is set. The page fault handler
+ * does not expect this type of fault. flush_cache_vmap is not exactly the
+ * right place to put this, but it seems to work well enough.
+ */
+static inline void flush_cache_vmap(unsigned long start, unsigned long end)
+{
+	smp_mb();
+}
+#define flush_cache_vmap flush_cache_vmap
+#define flush_cache_vmap_early	flush_cache_vmap
+
 #define cache_op(op, addr)						\
 	__asm__ __volatile__(						\
 	"	cacop	%0, %1					\n"	\
-- 
2.39.3



^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 4/4] LoongArch: Use atomic operation with set_pte and pte_clear function
  2024-10-10  3:50 [PATCH 0/4] LoongArch: Fix vmalloc test issue Bibo Mao
                   ` (2 preceding siblings ...)
  2024-10-10  3:50 ` [PATCH 3/4] LoongArch: Add barrier between set_pte and memory access Bibo Mao
@ 2024-10-10  3:50 ` Bibo Mao
  2024-10-12  2:16   ` Huacai Chen
  3 siblings, 1 reply; 14+ messages in thread
From: Bibo Mao @ 2024-10-10  3:50 UTC (permalink / raw)
  To: Huacai Chen, Andrey Ryabinin, Andrew Morton
  Cc: David Hildenbrand, Barry Song, loongarch, linux-kernel,
	kasan-dev, linux-mm

For kernel space areas on LoongArch systems, both of the two
consecutive page table entries should have the PAGE_GLOBAL bit set. So
in set_pte() and pte_clear(), the buddy pte entry is checked and set in
addition to the entry itself. However, setting the two pte entries is
not an atomic operation, which causes the failures seen in the
test_vmalloc test case.

With the previous patches, all page table entries have the PAGE_GLOBAL
bit set from the beginning. Only the pte entry itself needs to be
updated by set_pte() and pte_clear(); the buddy pte entry no longer
needs to be touched.
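
As a worked illustration of the new clear path: atomic64_fetch_and()
with a _PAGE_GLOBAL mask clears every bit except the global bit in a
single atomic step, so a racing update can never observe a half-done
clear. A sketch with made-up bit values:

  /* old = *ptep; *ptep = old & _PAGE_GLOBAL;   (one atomic op)
   * e.g. old = pfn_bits | _PAGE_VALID | _PAGE_GLOBAL
   *      new =                          _PAGE_GLOBAL  (preserved)
   * The returned old value also lets ptep_get_and_clear() hand the
   * full previous pte back to the caller. */
  old = atomic64_fetch_and(_PAGE_GLOBAL, (atomic64_t *)&pte_val(*ptep));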

Signed-off-by: Bibo Mao <maobibo@loongson.cn>
---
 arch/loongarch/include/asm/pgtable.h | 44 ++++++++++------------------
 1 file changed, 15 insertions(+), 29 deletions(-)

diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
index 22e3a8f96213..4be3f0dbecda 100644
--- a/arch/loongarch/include/asm/pgtable.h
+++ b/arch/loongarch/include/asm/pgtable.h
@@ -325,40 +325,26 @@ extern void paging_init(void);
 static inline void set_pte(pte_t *ptep, pte_t pteval)
 {
 	WRITE_ONCE(*ptep, pteval);
+}
 
-	if (pte_val(pteval) & _PAGE_GLOBAL) {
-		pte_t *buddy = ptep_buddy(ptep);
-		/*
-		 * Make sure the buddy is global too (if it's !none,
-		 * it better already be global)
-		 */
-		if (pte_none(ptep_get(buddy))) {
-#ifdef CONFIG_SMP
-			/*
-			 * For SMP, multiple CPUs can race, so we need
-			 * to do this atomically.
-			 */
-			__asm__ __volatile__(
-			__AMOR "$zero, %[global], %[buddy] \n"
-			: [buddy] "+ZB" (buddy->pte)
-			: [global] "r" (_PAGE_GLOBAL)
-			: "memory");
-
-			DBAR(0b11000); /* o_wrw = 0b11000 */
-#else /* !CONFIG_SMP */
-			WRITE_ONCE(*buddy, __pte(pte_val(ptep_get(buddy)) | _PAGE_GLOBAL));
-#endif /* CONFIG_SMP */
-		}
-	}
+static inline unsigned long __ptep_get_and_clear(pte_t *ptep)
+{
+	return atomic64_fetch_and(_PAGE_GLOBAL, (atomic64_t *)&pte_val(*ptep));
 }
 
 static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
 {
-	/* Preserve global status for the pair */
-	if (pte_val(ptep_get(ptep_buddy(ptep))) & _PAGE_GLOBAL)
-		set_pte(ptep, __pte(_PAGE_GLOBAL));
-	else
-		set_pte(ptep, __pte(0));
+	__ptep_get_and_clear(ptep);
+}
+
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
+static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
+					unsigned long addr, pte_t *ptep)
+{
+	unsigned long val;
+
+	val = __ptep_get_and_clear(ptep);
+	return __pte(val);
 }
 
 #define PGD_T_LOG2	(__builtin_ffs(sizeof(pgd_t)) - 1)
-- 
2.39.3



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] mm/sparse-vmemmap: set pte_init when vmemmap is created
  2024-10-10  3:50 ` [PATCH 2/4] mm/sparse-vmemmap: set pte_init when vmemmap is created Bibo Mao
@ 2024-10-11  4:42   ` kernel test robot
  2024-10-11  4:42   ` kernel test robot
  2024-10-12  2:17   ` Huacai Chen
  2 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2024-10-11  4:42 UTC (permalink / raw)
  To: Bibo Mao, Huacai Chen, Andrey Ryabinin, Andrew Morton
  Cc: oe-kbuild-all, Linux Memory Management List, David Hildenbrand,
	Barry Song, loongarch, linux-kernel, kasan-dev

Hi Bibo,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 87d6aab2389e5ce0197d8257d5f8ee965a67c4cd]

url:    https://github.com/intel-lab-lkp/linux/commits/Bibo-Mao/LoongArch-Set-pte-entry-with-PAGE_GLOBAL-for-kernel-space/20241010-115120
base:   87d6aab2389e5ce0197d8257d5f8ee965a67c4cd
patch link:    https://lore.kernel.org/r/20241010035048.3422527-3-maobibo%40loongson.cn
patch subject: [PATCH 2/4] mm/sparse-vmemmap: set pte_init when vmemmap is created
config: x86_64-defconfig (https://download.01.org/0day-ci/archive/20241011/202410111213.dfJ08626-lkp@intel.com/config)
compiler: gcc-11 (Debian 11.3.0-12) 11.3.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241011/202410111213.dfJ08626-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202410111213.dfJ08626-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> mm/sparse-vmemmap.c:187:23: warning: no previous prototype for 'kernel_pte_init' [-Wmissing-prototypes]
     187 | void __weak __meminit kernel_pte_init(void *addr)
         |                       ^~~~~~~~~~~~~~~


vim +/kernel_pte_init +187 mm/sparse-vmemmap.c

   186	
 > 187	void __weak __meminit kernel_pte_init(void *addr)
   188	{
   189	}
   190	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] mm/sparse-vmemmap: set pte_init when vmemmap is created
  2024-10-10  3:50 ` [PATCH 2/4] mm/sparse-vmemmap: set pte_init when vmemmap is created Bibo Mao
  2024-10-11  4:42   ` kernel test robot
@ 2024-10-11  4:42   ` kernel test robot
  2024-10-12  2:17   ` Huacai Chen
  2 siblings, 0 replies; 14+ messages in thread
From: kernel test robot @ 2024-10-11  4:42 UTC (permalink / raw)
  To: Bibo Mao, Huacai Chen, Andrey Ryabinin, Andrew Morton
  Cc: llvm, oe-kbuild-all, Linux Memory Management List,
	David Hildenbrand, Barry Song, loongarch, linux-kernel,
	kasan-dev

Hi Bibo,

kernel test robot noticed the following build warnings:

[auto build test WARNING on 87d6aab2389e5ce0197d8257d5f8ee965a67c4cd]

url:    https://github.com/intel-lab-lkp/linux/commits/Bibo-Mao/LoongArch-Set-pte-entry-with-PAGE_GLOBAL-for-kernel-space/20241010-115120
base:   87d6aab2389e5ce0197d8257d5f8ee965a67c4cd
patch link:    https://lore.kernel.org/r/20241010035048.3422527-3-maobibo%40loongson.cn
patch subject: [PATCH 2/4] mm/sparse-vmemmap: set pte_init when vmemmap is created
config: s390-allnoconfig (https://download.01.org/0day-ci/archive/20241011/202410111254.kon5pPzX-lkp@intel.com/config)
compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project 70e0a7e7e6a8541bcc46908c592eed561850e416)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241011/202410111254.kon5pPzX-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202410111254.kon5pPzX-lkp@intel.com/

All warnings (new ones prefixed by >>):

   In file included from mm/sparse-vmemmap.c:21:
   In file included from include/linux/mm.h:2213:
   include/linux/vmstat.h:518:36: warning: arithmetic between different enumeration types ('enum node_stat_item' and 'enum lru_list') [-Wenum-enum-conversion]
     518 |         return node_stat_name(NR_LRU_BASE + lru) + 3; // skip "nr_"
         |                               ~~~~~~~~~~~ ^ ~~~
   In file included from mm/sparse-vmemmap.c:23:
   In file included from include/linux/memblock.h:13:
   In file included from arch/s390/include/asm/dma.h:5:
   In file included from include/linux/io.h:14:
   In file included from arch/s390/include/asm/io.h:93:
   include/asm-generic/io.h:548:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     548 |         val = __raw_readb(PCI_IOBASE + addr);
         |                           ~~~~~~~~~~ ^
   include/asm-generic/io.h:561:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     561 |         val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:37:59: note: expanded from macro '__le16_to_cpu'
      37 | #define __le16_to_cpu(x) __swab16((__force __u16)(__le16)(x))
         |                                                           ^
   include/uapi/linux/swab.h:102:54: note: expanded from macro '__swab16'
     102 | #define __swab16(x) (__u16)__builtin_bswap16((__u16)(x))
         |                                                      ^
   In file included from mm/sparse-vmemmap.c:23:
   In file included from include/linux/memblock.h:13:
   In file included from arch/s390/include/asm/dma.h:5:
   In file included from include/linux/io.h:14:
   In file included from arch/s390/include/asm/io.h:93:
   include/asm-generic/io.h:574:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     574 |         val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
         |                                                         ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/big_endian.h:35:59: note: expanded from macro '__le32_to_cpu'
      35 | #define __le32_to_cpu(x) __swab32((__force __u32)(__le32)(x))
         |                                                           ^
   include/uapi/linux/swab.h:115:54: note: expanded from macro '__swab32'
     115 | #define __swab32(x) (__u32)__builtin_bswap32((__u32)(x))
         |                                                      ^
   In file included from mm/sparse-vmemmap.c:23:
   In file included from include/linux/memblock.h:13:
   In file included from arch/s390/include/asm/dma.h:5:
   In file included from include/linux/io.h:14:
   In file included from arch/s390/include/asm/io.h:93:
   include/asm-generic/io.h:585:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     585 |         __raw_writeb(value, PCI_IOBASE + addr);
         |                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:595:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     595 |         __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:605:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     605 |         __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
         |                                                       ~~~~~~~~~~ ^
   include/asm-generic/io.h:693:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     693 |         readsb(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:701:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     701 |         readsw(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:709:20: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     709 |         readsl(PCI_IOBASE + addr, buffer, count);
         |                ~~~~~~~~~~ ^
   include/asm-generic/io.h:718:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     718 |         writesb(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:727:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     727 |         writesw(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
   include/asm-generic/io.h:736:21: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
     736 |         writesl(PCI_IOBASE + addr, buffer, count);
         |                 ~~~~~~~~~~ ^
>> mm/sparse-vmemmap.c:187:23: warning: no previous prototype for function 'kernel_pte_init' [-Wmissing-prototypes]
     187 | void __weak __meminit kernel_pte_init(void *addr)
         |                       ^
   mm/sparse-vmemmap.c:187:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
     187 | void __weak __meminit kernel_pte_init(void *addr)
         | ^
         | static 
   14 warnings generated.


vim +/kernel_pte_init +187 mm/sparse-vmemmap.c

   186	
 > 187	void __weak __meminit kernel_pte_init(void *addr)
   188	{
   189	}
   190	

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] LoongArch: Set pte entry with PAGE_GLOBAL for kernel space
  2024-10-10  3:50 ` [PATCH 1/4] LoongArch: Set pte entry with PAGE_GLOBAL for kernel space Bibo Mao
@ 2024-10-12  2:15   ` Huacai Chen
  2024-10-12  2:40     ` maobibo
  0 siblings, 1 reply; 14+ messages in thread
From: Huacai Chen @ 2024-10-12  2:15 UTC (permalink / raw)
  To: Bibo Mao
  Cc: Andrey Ryabinin, Andrew Morton, David Hildenbrand, Barry Song,
	loongarch, linux-kernel, kasan-dev, linux-mm

Hi, Bibo,

On Thu, Oct 10, 2024 at 11:50 AM Bibo Mao <maobibo@loongson.cn> wrote:
>
> Unlike most architectures, one TLB entry covers two pages on LoongArch
> systems. For kernel space, both of the two pte entries must have
> PAGE_GLOBAL set, otherwise the HW treats the mapping as a non-global
> TLB entry, and a non-global kernel TLB entry causes potential problems,
> such as failing to flush kernel TLBs with local_flush_tlb_kernel_range(),
> which only flushes TLB entries that have the global bit set.
>
> Here the function kernel_pte_init() is added; it is used to initialize
> a pte table when it is created, so that the default initial pte value
> is PAGE_GLOBAL rather than zero.
I think kernel_pte_init() is also needed in zero_pmd_populate() in
mm/kasan/init.c. Moreover, the second patch should be squashed into
this one because they belong together. Though the second one touches
common code, I can merge it with the mm maintainer's acked-by.


Huacai

>
> Kernel space areas, including the fixmap, percpu, vmalloc and kasan
> areas, get their default pte entries set with PAGE_GLOBAL.
>
> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
> ---
>  arch/loongarch/include/asm/pgalloc.h | 13 +++++++++++++
>  arch/loongarch/include/asm/pgtable.h |  1 +
>  arch/loongarch/mm/init.c             |  4 +++-
>  arch/loongarch/mm/kasan_init.c       |  4 +++-
>  arch/loongarch/mm/pgtable.c          | 22 ++++++++++++++++++++++
>  5 files changed, 42 insertions(+), 2 deletions(-)
>
> diff --git a/arch/loongarch/include/asm/pgalloc.h b/arch/loongarch/include/asm/pgalloc.h
> index 4e2d6b7ca2ee..b2698c03dc2c 100644
> --- a/arch/loongarch/include/asm/pgalloc.h
> +++ b/arch/loongarch/include/asm/pgalloc.h
> @@ -10,8 +10,21 @@
>
>  #define __HAVE_ARCH_PMD_ALLOC_ONE
>  #define __HAVE_ARCH_PUD_ALLOC_ONE
> +#define __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL
>  #include <asm-generic/pgalloc.h>
>
> +static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
> +{
> +       pte_t *pte;
> +
> +       pte = (pte_t *) __get_free_page(GFP_KERNEL);
> +       if (!pte)
> +               return NULL;
> +
> +       kernel_pte_init(pte);
> +       return pte;
> +}
> +
>  static inline void pmd_populate_kernel(struct mm_struct *mm,
>                                        pmd_t *pmd, pte_t *pte)
>  {
> diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
> index 9965f52ef65b..22e3a8f96213 100644
> --- a/arch/loongarch/include/asm/pgtable.h
> +++ b/arch/loongarch/include/asm/pgtable.h
> @@ -269,6 +269,7 @@ extern void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pm
>  extern void pgd_init(void *addr);
>  extern void pud_init(void *addr);
>  extern void pmd_init(void *addr);
> +extern void kernel_pte_init(void *addr);
>
>  /*
>   * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
> diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c
> index 8a87a482c8f4..9f26e933a8a3 100644
> --- a/arch/loongarch/mm/init.c
> +++ b/arch/loongarch/mm/init.c
> @@ -198,9 +198,11 @@ pte_t * __init populate_kernel_pte(unsigned long addr)
>         if (!pmd_present(pmdp_get(pmd))) {
>                 pte_t *pte;
>
> -               pte = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
> +               pte = memblock_alloc_raw(PAGE_SIZE, PAGE_SIZE);
>                 if (!pte)
>                         panic("%s: Failed to allocate memory\n", __func__);
> +
> +               kernel_pte_init(pte);
>                 pmd_populate_kernel(&init_mm, pmd, pte);
>         }
>
> diff --git a/arch/loongarch/mm/kasan_init.c b/arch/loongarch/mm/kasan_init.c
> index 427d6b1aec09..34988573b0d5 100644
> --- a/arch/loongarch/mm/kasan_init.c
> +++ b/arch/loongarch/mm/kasan_init.c
> @@ -152,6 +152,8 @@ static void __init kasan_pte_populate(pmd_t *pmdp, unsigned long addr,
>                 phys_addr_t page_phys = early ?
>                                         __pa_symbol(kasan_early_shadow_page)
>                                               : kasan_alloc_zeroed_page(node);
> +               if (!early)
> +                       kernel_pte_init(__va(page_phys));
>                 next = addr + PAGE_SIZE;
>                 set_pte(ptep, pfn_pte(__phys_to_pfn(page_phys), PAGE_KERNEL));
>         } while (ptep++, addr = next, addr != end && __pte_none(early, ptep_get(ptep)));
> @@ -287,7 +289,7 @@ void __init kasan_init(void)
>                 set_pte(&kasan_early_shadow_pte[i],
>                         pfn_pte(__phys_to_pfn(__pa_symbol(kasan_early_shadow_page)), PAGE_KERNEL_RO));
>
> -       memset(kasan_early_shadow_page, 0, PAGE_SIZE);
> +       kernel_pte_init(kasan_early_shadow_page);
>         csr_write64(__pa_symbol(swapper_pg_dir), LOONGARCH_CSR_PGDH);
>         local_flush_tlb_all();
>
> diff --git a/arch/loongarch/mm/pgtable.c b/arch/loongarch/mm/pgtable.c
> index eb6a29b491a7..228ffc1db0a3 100644
> --- a/arch/loongarch/mm/pgtable.c
> +++ b/arch/loongarch/mm/pgtable.c
> @@ -38,6 +38,28 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
>  }
>  EXPORT_SYMBOL_GPL(pgd_alloc);
>
> +void kernel_pte_init(void *addr)
> +{
> +       unsigned long *p, *end;
> +       unsigned long entry;
> +
> +       entry = (unsigned long)_PAGE_GLOBAL;
> +       p = (unsigned long *)addr;
> +       end = p + PTRS_PER_PTE;
> +
> +       do {
> +               p[0] = entry;
> +               p[1] = entry;
> +               p[2] = entry;
> +               p[3] = entry;
> +               p[4] = entry;
> +               p += 8;
> +               p[-3] = entry;
> +               p[-2] = entry;
> +               p[-1] = entry;
> +       } while (p != end);
> +}
> +
>  void pgd_init(void *addr)
>  {
>         unsigned long *p, *end;
> --
> 2.39.3
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/4] LoongArch: Add barrier between set_pte and memory access
  2024-10-10  3:50 ` [PATCH 3/4] LoongArch: Add barrier between set_pte and memory access Bibo Mao
@ 2024-10-12  2:16   ` Huacai Chen
  2024-10-12  2:48     ` maobibo
  0 siblings, 1 reply; 14+ messages in thread
From: Huacai Chen @ 2024-10-12  2:16 UTC (permalink / raw)
  To: Bibo Mao
  Cc: Andrey Ryabinin, Andrew Morton, David Hildenbrand, Barry Song,
	loongarch, linux-kernel, kasan-dev, linux-mm

Hi, Bibo,

On Thu, Oct 10, 2024 at 11:50 AM Bibo Mao <maobibo@loongson.cn> wrote:
>
> It is possible to get a spurious fault if memory is accessed right
> after the pte is set. For user address space, the pte is set in kernel
> space while the memory is accessed in user space, so there is a long
> window for synchronization and no barrier is needed. For kernel address
> space, however, memory may be accessed right after the pte is set.
>
> Here flush_cache_vmap()/flush_cache_vmap_early() are used for the
> synchronization.
>
> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
> ---
>  arch/loongarch/include/asm/cacheflush.h | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/arch/loongarch/include/asm/cacheflush.h b/arch/loongarch/include/asm/cacheflush.h
> index f8754d08a31a..53be231319ef 100644
> --- a/arch/loongarch/include/asm/cacheflush.h
> +++ b/arch/loongarch/include/asm/cacheflush.h
> @@ -42,12 +42,24 @@ void local_flush_icache_range(unsigned long start, unsigned long end);
>  #define flush_cache_dup_mm(mm)                         do { } while (0)
>  #define flush_cache_range(vma, start, end)             do { } while (0)
>  #define flush_cache_page(vma, vmaddr, pfn)             do { } while (0)
> -#define flush_cache_vmap(start, end)                   do { } while (0)
>  #define flush_cache_vunmap(start, end)                 do { } while (0)
>  #define flush_icache_user_page(vma, page, addr, len)   do { } while (0)
>  #define flush_dcache_mmap_lock(mapping)                        do { } while (0)
>  #define flush_dcache_mmap_unlock(mapping)              do { } while (0)
>
> +/*
> + * It is possible for a kernel virtual mapping access to return a spurious
> + * fault if it's accessed right after the pte is set. The page fault handler
> + * does not expect this type of fault. flush_cache_vmap is not exactly the
> + * right place to put this, but it seems to work well enough.
> + */
> +static inline void flush_cache_vmap(unsigned long start, unsigned long end)
> +{
> +       smp_mb();
> +}
I don't know whether this is the best API for this, and I think
flush_cache_vunmap() should also be an smp_mb().


Huacai

> +#define flush_cache_vmap flush_cache_vmap
> +#define flush_cache_vmap_early flush_cache_vmap
> +
>  #define cache_op(op, addr)                                             \
>         __asm__ __volatile__(                                           \
>         "       cacop   %0, %1                                  \n"     \
> --
> 2.39.3
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] LoongArch: Use atomic operation with set_pte and pte_clear function
  2024-10-10  3:50 ` [PATCH 4/4] LoongArch: Use atomic operation with set_pte and pte_clear function Bibo Mao
@ 2024-10-12  2:16   ` Huacai Chen
  2024-10-12  2:51     ` maobibo
  0 siblings, 1 reply; 14+ messages in thread
From: Huacai Chen @ 2024-10-12  2:16 UTC (permalink / raw)
  To: Bibo Mao
  Cc: Andrey Ryabinin, Andrew Morton, David Hildenbrand, Barry Song,
	loongarch, linux-kernel, kasan-dev, linux-mm

Hi, Bibo,

On Thu, Oct 10, 2024 at 11:50 AM Bibo Mao <maobibo@loongson.cn> wrote:
>
> For kernel space areas on LoongArch systems, both of the two
> consecutive page table entries should have the PAGE_GLOBAL bit set. So
> in set_pte() and pte_clear(), the buddy pte entry is checked and set in
> addition to the entry itself. However, setting the two pte entries is
> not an atomic operation, which causes the failures seen in the
> test_vmalloc test case.
>
> With the previous patches, all page table entries have the PAGE_GLOBAL
> bit set from the beginning. Only the pte entry itself needs to be
> updated by set_pte() and pte_clear(); the buddy pte entry no longer
> needs to be touched.
>
> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
> ---
>  arch/loongarch/include/asm/pgtable.h | 44 ++++++++++------------------
>  1 file changed, 15 insertions(+), 29 deletions(-)
>
> diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
> index 22e3a8f96213..4be3f0dbecda 100644
> --- a/arch/loongarch/include/asm/pgtable.h
> +++ b/arch/loongarch/include/asm/pgtable.h
> @@ -325,40 +325,26 @@ extern void paging_init(void);
>  static inline void set_pte(pte_t *ptep, pte_t pteval)
>  {
>         WRITE_ONCE(*ptep, pteval);
> +}
>
> -       if (pte_val(pteval) & _PAGE_GLOBAL) {
> -               pte_t *buddy = ptep_buddy(ptep);
> -               /*
> -                * Make sure the buddy is global too (if it's !none,
> -                * it better already be global)
> -                */
> -               if (pte_none(ptep_get(buddy))) {
> -#ifdef CONFIG_SMP
> -                       /*
> -                        * For SMP, multiple CPUs can race, so we need
> -                        * to do this atomically.
> -                        */
> -                       __asm__ __volatile__(
> -                       __AMOR "$zero, %[global], %[buddy] \n"
> -                       : [buddy] "+ZB" (buddy->pte)
> -                       : [global] "r" (_PAGE_GLOBAL)
> -                       : "memory");
> -
> -                       DBAR(0b11000); /* o_wrw = 0b11000 */
> -#else /* !CONFIG_SMP */
> -                       WRITE_ONCE(*buddy, __pte(pte_val(ptep_get(buddy)) | _PAGE_GLOBAL));
> -#endif /* CONFIG_SMP */
> -               }
> -       }
> +static inline unsigned long __ptep_get_and_clear(pte_t *ptep)
> +{
> +       return atomic64_fetch_and(_PAGE_GLOBAL, (atomic64_t *)&pte_val(*ptep));
>  }
>
>  static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
>  {
> -       /* Preserve global status for the pair */
> -       if (pte_val(ptep_get(ptep_buddy(ptep))) & _PAGE_GLOBAL)
> -               set_pte(ptep, __pte(_PAGE_GLOBAL));
> -       else
> -               set_pte(ptep, __pte(0));
> +       __ptep_get_and_clear(ptep);
With the first patch, a kernel pte always takes _PAGE_GLOBAL, so we
don't need an expensive atomic operation; just
"set_pte(pte_val(ptep_get(ptep)) & _PAGE_GLOBAL)" is OK here. And then
we don't need a custom ptep_get_and_clear().
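
Spelled out, the suggested non-atomic form would be roughly:

  /* keep only the global bit; a plain store, no atomic RMW */
  set_pte(ptep, __pte(pte_val(ptep_get(ptep)) & _PAGE_GLOBAL));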


Huacai

> +}
> +
> +#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
> +static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
> +                                       unsigned long addr, pte_t *ptep)
> +{
> +       unsigned long val;
> +
> +       val = __ptep_get_and_clear(ptep);
> +       return __pte(val);
>  }
>
>  #define PGD_T_LOG2     (__builtin_ffs(sizeof(pgd_t)) - 1)
> --
> 2.39.3
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/4] mm/sparse-vmemmap: set pte_init when vmemmap is created
  2024-10-10  3:50 ` [PATCH 2/4] mm/sparse-vmemmap: set pte_init when vmemmap is created Bibo Mao
  2024-10-11  4:42   ` kernel test robot
  2024-10-11  4:42   ` kernel test robot
@ 2024-10-12  2:17   ` Huacai Chen
  2 siblings, 0 replies; 14+ messages in thread
From: Huacai Chen @ 2024-10-12  2:17 UTC (permalink / raw)
  To: Bibo Mao
  Cc: Andrey Ryabinin, Andrew Morton, David Hildenbrand, Barry Song,
	loongarch, linux-kernel, kasan-dev, linux-mm

Hi, Bibo,

Just declare kernel_pte_init() in header file. Since this series is a
bugfix, we should merge it as soon as possible. The refactoring patch
can be sent after this series.

Huacai

On Thu, Oct 10, 2024 at 11:50 AM Bibo Mao <maobibo@loongson.cn> wrote:
>
> Like pmd_init(), a weak function kernel_pte_init() is added; it is
> only effective on LoongArch systems. When a pte table is created for
> the vmemmap kernel space, kernel_pte_init() is called on it.
>
> It has no effect on other architectures.
>
> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
> ---
>  mm/sparse-vmemmap.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
> index edcc7a6b0f6f..c0388b2e959d 100644
> --- a/mm/sparse-vmemmap.c
> +++ b/mm/sparse-vmemmap.c
> @@ -184,6 +184,10 @@ static void * __meminit vmemmap_alloc_block_zero(unsigned long size, int node)
>         return p;
>  }
>
> +void __weak __meminit kernel_pte_init(void *addr)
> +{
> +}
> +
>  pmd_t * __meminit vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node)
>  {
>         pmd_t *pmd = pmd_offset(pud, addr);
> @@ -191,6 +195,7 @@ pmd_t * __meminit vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node)
>                 void *p = vmemmap_alloc_block_zero(PAGE_SIZE, node);
>                 if (!p)
>                         return NULL;
> +               kernel_pte_init(p);
>                 pmd_populate_kernel(&init_mm, pmd, p);
>         }
>         return pmd;
> --
> 2.39.3
>


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/4] LoongArch: Set pte entry with PAGE_GLOBAL for kernel space
  2024-10-12  2:15   ` Huacai Chen
@ 2024-10-12  2:40     ` maobibo
  0 siblings, 0 replies; 14+ messages in thread
From: maobibo @ 2024-10-12  2:40 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Andrey Ryabinin, Andrew Morton, David Hildenbrand, Barry Song,
	loongarch, linux-kernel, kasan-dev, linux-mm

Huacai,

On 2024/10/12 10:15 AM, Huacai Chen wrote:
> Hi, Bibo,
> 
> On Thu, Oct 10, 2024 at 11:50 AM Bibo Mao <maobibo@loongson.cn> wrote:
>>
>> Unlike most architectures, one TLB entry covers two pages on LoongArch
>> systems. For kernel space, both of the two pte entries must have
>> PAGE_GLOBAL set, otherwise the HW treats the mapping as a non-global
>> TLB entry, and a non-global kernel TLB entry causes potential problems,
>> such as failing to flush kernel TLBs with local_flush_tlb_kernel_range(),
>> which only flushes TLB entries that have the global bit set.
>>
>> Here the function kernel_pte_init() is added; it is used to initialize
>> a pte table when it is created, so that the default initial pte value
>> is PAGE_GLOBAL rather than zero.
> I think kernel_pte_init() is also needed in zero_pmd_populate() in
> mm/kasan/init.c. Moreover, the second patch should be squashed into
Yes, it is needed in zero_pmd_populate() in mm/kasan/init.c; I will
add it in the next version.

> this one because they belong together. Though the second one touches
> common code, I can merge it with the mm maintainer's acked-by.
Sure, I will merge it with the second one into a single patch.

Regards
Bibo Mao
> 
> 
> Huacai
> 
>>
>> Kernel space areas, including the fixmap, percpu, vmalloc and kasan
>> areas, get their default pte entries set with PAGE_GLOBAL.
>>
>> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
>> ---
>>   arch/loongarch/include/asm/pgalloc.h | 13 +++++++++++++
>>   arch/loongarch/include/asm/pgtable.h |  1 +
>>   arch/loongarch/mm/init.c             |  4 +++-
>>   arch/loongarch/mm/kasan_init.c       |  4 +++-
>>   arch/loongarch/mm/pgtable.c          | 22 ++++++++++++++++++++++
>>   5 files changed, 42 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/loongarch/include/asm/pgalloc.h b/arch/loongarch/include/asm/pgalloc.h
>> index 4e2d6b7ca2ee..b2698c03dc2c 100644
>> --- a/arch/loongarch/include/asm/pgalloc.h
>> +++ b/arch/loongarch/include/asm/pgalloc.h
>> @@ -10,8 +10,21 @@
>>
>>   #define __HAVE_ARCH_PMD_ALLOC_ONE
>>   #define __HAVE_ARCH_PUD_ALLOC_ONE
>> +#define __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL
>>   #include <asm-generic/pgalloc.h>
>>
>> +static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
>> +{
>> +       pte_t *pte;
>> +
>> +       pte = (pte_t *) __get_free_page(GFP_KERNEL);
>> +       if (!pte)
>> +               return NULL;
>> +
>> +       kernel_pte_init(pte);
>> +       return pte;
>> +}
>> +
>>   static inline void pmd_populate_kernel(struct mm_struct *mm,
>>                                         pmd_t *pmd, pte_t *pte)
>>   {
>> diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
>> index 9965f52ef65b..22e3a8f96213 100644
>> --- a/arch/loongarch/include/asm/pgtable.h
>> +++ b/arch/loongarch/include/asm/pgtable.h
>> @@ -269,6 +269,7 @@ extern void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pm
>>   extern void pgd_init(void *addr);
>>   extern void pud_init(void *addr);
>>   extern void pmd_init(void *addr);
>> +extern void kernel_pte_init(void *addr);
>>
>>   /*
>>    * Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
>> diff --git a/arch/loongarch/mm/init.c b/arch/loongarch/mm/init.c
>> index 8a87a482c8f4..9f26e933a8a3 100644
>> --- a/arch/loongarch/mm/init.c
>> +++ b/arch/loongarch/mm/init.c
>> @@ -198,9 +198,11 @@ pte_t * __init populate_kernel_pte(unsigned long addr)
>>          if (!pmd_present(pmdp_get(pmd))) {
>>                  pte_t *pte;
>>
>> -               pte = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
>> +               pte = memblock_alloc_raw(PAGE_SIZE, PAGE_SIZE);
>>                  if (!pte)
>>                          panic("%s: Failed to allocate memory\n", __func__);
>> +
>> +               kernel_pte_init(pte);
>>                  pmd_populate_kernel(&init_mm, pmd, pte);
>>          }
>>
>> diff --git a/arch/loongarch/mm/kasan_init.c b/arch/loongarch/mm/kasan_init.c
>> index 427d6b1aec09..34988573b0d5 100644
>> --- a/arch/loongarch/mm/kasan_init.c
>> +++ b/arch/loongarch/mm/kasan_init.c
>> @@ -152,6 +152,8 @@ static void __init kasan_pte_populate(pmd_t *pmdp, unsigned long addr,
>>                  phys_addr_t page_phys = early ?
>>                                          __pa_symbol(kasan_early_shadow_page)
>>                                                : kasan_alloc_zeroed_page(node);
>> +               if (!early)
>> +                       kernel_pte_init(__va(page_phys));
>>                  next = addr + PAGE_SIZE;
>>                  set_pte(ptep, pfn_pte(__phys_to_pfn(page_phys), PAGE_KERNEL));
>>          } while (ptep++, addr = next, addr != end && __pte_none(early, ptep_get(ptep)));
>> @@ -287,7 +289,7 @@ void __init kasan_init(void)
>>                  set_pte(&kasan_early_shadow_pte[i],
>>                          pfn_pte(__phys_to_pfn(__pa_symbol(kasan_early_shadow_page)), PAGE_KERNEL_RO));
>>
>> -       memset(kasan_early_shadow_page, 0, PAGE_SIZE);
>> +       kernel_pte_init(kasan_early_shadow_page);
>>          csr_write64(__pa_symbol(swapper_pg_dir), LOONGARCH_CSR_PGDH);
>>          local_flush_tlb_all();
>>
>> diff --git a/arch/loongarch/mm/pgtable.c b/arch/loongarch/mm/pgtable.c
>> index eb6a29b491a7..228ffc1db0a3 100644
>> --- a/arch/loongarch/mm/pgtable.c
>> +++ b/arch/loongarch/mm/pgtable.c
>> @@ -38,6 +38,28 @@ pgd_t *pgd_alloc(struct mm_struct *mm)
>>   }
>>   EXPORT_SYMBOL_GPL(pgd_alloc);
>>
>> +void kernel_pte_init(void *addr)
>> +{
>> +       unsigned long *p, *end;
>> +       unsigned long entry;
>> +
>> +       entry = (unsigned long)_PAGE_GLOBAL;
>> +       p = (unsigned long *)addr;
>> +       end = p + PTRS_PER_PTE;
>> +
>> +       do {
>> +               p[0] = entry;
>> +               p[1] = entry;
>> +               p[2] = entry;
>> +               p[3] = entry;
>> +               p[4] = entry;
>> +               p += 8;
>> +               p[-3] = entry;
>> +               p[-2] = entry;
>> +               p[-1] = entry;
>> +       } while (p != end);
>> +}
>> +
>>   void pgd_init(void *addr)
>>   {
>>          unsigned long *p, *end;
>> --
>> 2.39.3
>>



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 3/4] LoongArch: Add barrier between set_pte and memory access
  2024-10-12  2:16   ` Huacai Chen
@ 2024-10-12  2:48     ` maobibo
  0 siblings, 0 replies; 14+ messages in thread
From: maobibo @ 2024-10-12  2:48 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Andrey Ryabinin, Andrew Morton, David Hildenbrand, Barry Song,
	loongarch, linux-kernel, kasan-dev, linux-mm

Huacai,

On 2024/10/12 10:16 AM, Huacai Chen wrote:
> Hi, Bibo,
> 
> On Thu, Oct 10, 2024 at 11:50 AM Bibo Mao <maobibo@loongson.cn> wrote:
>>
>> It is possible to get a spurious fault if memory is accessed right
>> after the pte is set. For user address space, the pte is set in kernel
>> space while the memory is accessed in user space, so there is a long
>> window for synchronization and no barrier is needed. For kernel address
>> space, however, memory may be accessed right after the pte is set.
>>
>> Here flush_cache_vmap()/flush_cache_vmap_early() are used for the
>> synchronization.
>>
>> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
>> ---
>>   arch/loongarch/include/asm/cacheflush.h | 14 +++++++++++++-
>>   1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/loongarch/include/asm/cacheflush.h b/arch/loongarch/include/asm/cacheflush.h
>> index f8754d08a31a..53be231319ef 100644
>> --- a/arch/loongarch/include/asm/cacheflush.h
>> +++ b/arch/loongarch/include/asm/cacheflush.h
>> @@ -42,12 +42,24 @@ void local_flush_icache_range(unsigned long start, unsigned long end);
>>   #define flush_cache_dup_mm(mm)                         do { } while (0)
>>   #define flush_cache_range(vma, start, end)             do { } while (0)
>>   #define flush_cache_page(vma, vmaddr, pfn)             do { } while (0)
>> -#define flush_cache_vmap(start, end)                   do { } while (0)
>>   #define flush_cache_vunmap(start, end)                 do { } while (0)
>>   #define flush_icache_user_page(vma, page, addr, len)   do { } while (0)
>>   #define flush_dcache_mmap_lock(mapping)                        do { } while (0)
>>   #define flush_dcache_mmap_unlock(mapping)              do { } while (0)
>>
>> +/*
>> + * It is possible for a kernel virtual mapping access to return a spurious
>> + * fault if it's accessed right after the pte is set. The page fault handler
>> + * does not expect this type of fault. flush_cache_vmap is not exactly the
>> + * right place to put this, but it seems to work well enough.
>> + */
>> +static inline void flush_cache_vmap(unsigned long start, unsigned long end)
>> +{
>> +       smp_mb();
>> +}
> I don't know whether this is the best API for this, and I think
> flush_cache_vunmap() should also be an smp_mb().
I do not know either -:(. It seems that flush_cache_vmap() is better
than arch_sync_kernel_mappings(), since flush_cache_vmap() is used by
the vmalloc, kasan and percpu code, while arch_sync_kernel_mappings()
is only used by vmalloc.

For flush_cache_vunmap(), it is used before pte_clear(); here is a
usage example:
void vunmap_range(unsigned long addr, unsigned long end)
{
         flush_cache_vunmap(addr, end);
         vunmap_range_noflush(addr, end);
         flush_tlb_kernel_range(addr, end);
}

So I think it is not necessary to add an smp_mb() to flush_cache_vunmap().

Regards
Bibo Mao
> 
> 
> Huacai
> 
>> +#define flush_cache_vmap flush_cache_vmap
>> +#define flush_cache_vmap_early flush_cache_vmap
>> +
>>   #define cache_op(op, addr)                                             \
>>          __asm__ __volatile__(                                           \
>>          "       cacop   %0, %1                                  \n"     \
>> --
>> 2.39.3
>>



^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 4/4] LoongArch: Use atomic operation with set_pte and pte_clear function
  2024-10-12  2:16   ` Huacai Chen
@ 2024-10-12  2:51     ` maobibo
  0 siblings, 0 replies; 14+ messages in thread
From: maobibo @ 2024-10-12  2:51 UTC (permalink / raw)
  To: Huacai Chen
  Cc: Andrey Ryabinin, Andrew Morton, David Hildenbrand, Barry Song,
	loongarch, linux-kernel, kasan-dev, linux-mm


Huacai,

On 2024/10/12 10:16 AM, Huacai Chen wrote:
> Hi, Bibo,
> 
> On Thu, Oct 10, 2024 at 11:50 AM Bibo Mao <maobibo@loongson.cn> wrote:
>>
>> For kernel space areas on LoongArch systems, both of the two
>> consecutive page table entries should have the PAGE_GLOBAL bit set. So
>> in set_pte() and pte_clear(), the buddy pte entry is checked and set in
>> addition to the entry itself. However, setting the two pte entries is
>> not an atomic operation, which causes the failures seen in the
>> test_vmalloc test case.
>>
>> With the previous patches, all page table entries have the PAGE_GLOBAL
>> bit set from the beginning. Only the pte entry itself needs to be
>> updated by set_pte() and pte_clear(); the buddy pte entry no longer
>> needs to be touched.
>>
>> Signed-off-by: Bibo Mao <maobibo@loongson.cn>
>> ---
>>   arch/loongarch/include/asm/pgtable.h | 44 ++++++++++------------------
>>   1 file changed, 15 insertions(+), 29 deletions(-)
>>
>> diff --git a/arch/loongarch/include/asm/pgtable.h b/arch/loongarch/include/asm/pgtable.h
>> index 22e3a8f96213..4be3f0dbecda 100644
>> --- a/arch/loongarch/include/asm/pgtable.h
>> +++ b/arch/loongarch/include/asm/pgtable.h
>> @@ -325,40 +325,26 @@ extern void paging_init(void);
>>   static inline void set_pte(pte_t *ptep, pte_t pteval)
>>   {
>>          WRITE_ONCE(*ptep, pteval);
>> +}
>>
>> -       if (pte_val(pteval) & _PAGE_GLOBAL) {
>> -               pte_t *buddy = ptep_buddy(ptep);
>> -               /*
>> -                * Make sure the buddy is global too (if it's !none,
>> -                * it better already be global)
>> -                */
>> -               if (pte_none(ptep_get(buddy))) {
>> -#ifdef CONFIG_SMP
>> -                       /*
>> -                        * For SMP, multiple CPUs can race, so we need
>> -                        * to do this atomically.
>> -                        */
>> -                       __asm__ __volatile__(
>> -                       __AMOR "$zero, %[global], %[buddy] \n"
>> -                       : [buddy] "+ZB" (buddy->pte)
>> -                       : [global] "r" (_PAGE_GLOBAL)
>> -                       : "memory");
>> -
>> -                       DBAR(0b11000); /* o_wrw = 0b11000 */
>> -#else /* !CONFIG_SMP */
>> -                       WRITE_ONCE(*buddy, __pte(pte_val(ptep_get(buddy)) | _PAGE_GLOBAL));
>> -#endif /* CONFIG_SMP */
>> -               }
>> -       }
>> +static inline unsigned long __ptep_get_and_clear(pte_t *ptep)
>> +{
>> +       return atomic64_fetch_and(_PAGE_GLOBAL, (atomic64_t *)&pte_val(*ptep));
>>   }
>>
>>   static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
>>   {
>> -       /* Preserve global status for the pair */
>> -       if (pte_val(ptep_get(ptep_buddy(ptep))) & _PAGE_GLOBAL)
>> -               set_pte(ptep, __pte(_PAGE_GLOBAL));
>> -       else
>> -               set_pte(ptep, __pte(0));
>> +       __ptep_get_and_clear(ptep);
> With the first patch, a kernel pte always takes _PAGE_GLOBAL, so we
> don't need an expensive atomic operation; just
> "set_pte(pte_val(ptep_get(ptep)) & _PAGE_GLOBAL)" is OK here. And then
> we don't need a custom ptep_get_and_clear().
I will use the non-atomic method and test again, and will also remove
the custom ptep_get_and_clear() function.

Regards
Bibo Mao
> 
> 
> Huacai
> 
>> +}
>> +
>> +#define __HAVE_ARCH_PTEP_GET_AND_CLEAR
>> +static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
>> +                                       unsigned long addr, pte_t *ptep)
>> +{
>> +       unsigned long val;
>> +
>> +       val = __ptep_get_and_clear(ptep);
>> +       return __pte(val);
>>   }
>>
>>   #define PGD_T_LOG2     (__builtin_ffs(sizeof(pgd_t)) - 1)
>> --
>> 2.39.3
>>



^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2024-10-12  2:51 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-10-10  3:50 [PATCH 0/4] LoongArch: Fix vmalloc test issue Bibo Mao
2024-10-10  3:50 ` [PATCH 1/4] LoongArch: Set pte entry with PAGE_GLOBAL for kernel space Bibo Mao
2024-10-12  2:15   ` Huacai Chen
2024-10-12  2:40     ` maobibo
2024-10-10  3:50 ` [PATCH 2/4] mm/sparse-vmemmap: set pte_init when vmemmap is created Bibo Mao
2024-10-11  4:42   ` kernel test robot
2024-10-11  4:42   ` kernel test robot
2024-10-12  2:17   ` Huacai Chen
2024-10-10  3:50 ` [PATCH 3/4] LoongArch: Add barrier between set_pte and memory access Bibo Mao
2024-10-12  2:16   ` Huacai Chen
2024-10-12  2:48     ` maobibo
2024-10-10  3:50 ` [PATCH 4/4] LoongArch: Use atomic operation with set_pte and pte_clear function Bibo Mao
2024-10-12  2:16   ` Huacai Chen
2024-10-12  2:51     ` maobibo
