* [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess
@ 2024-04-03 8:37 Kefeng Wang
2024-04-03 8:37 ` [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault() Kefeng Wang
` (7 more replies)
0 siblings, 8 replies; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03 8:37 UTC (permalink / raw)
To: akpm
Cc: Russell King, Catalin Marinas, Will Deacon, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexander Gordeev, Gerald Schaefer, Dave Hansen,
Andy Lutomirski, Peter Zijlstra, x86, linux-arm-kernel,
linuxppc-dev, linux-riscv, linux-s390, surenb, linux-mm,
Kefeng Wang
With VMA lock-based page fault handling enabled, a bad access detected
under the per-VMA lock still falls back to mmap_lock-based handling,
which costs an unnecessary mmap_lock acquisition and a second VMA
lookup. A test from lmbench shows a 34% improvement on arm64 with these
changes:
lat_sig -P 1 prot lat_sig 0.29194 -> 0.19198
The other architectures are only build-tested.
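Each architecture patch follows the same pattern; a simplified sketch of
the per-VMA-lock path (details such as where the per-VMA lock is released
differ per architecture):

	/* before: a bad access detected under the per-VMA lock falls back */
	if (unlikely(access_error(..., vma))) {
		vma_end_read(vma);
		goto lock_mmap;
	}

	/* after: handle the error directly, no mmap_lock retry */
	if (unlikely(access_error(..., vma))) {
		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
		/*
		 * release the per-VMA lock and deliver the bad-access error,
		 * e.g. VM_FAULT_BADACCESS or the arch's bad_access helper
		 */
		return;
	}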
v2:
- improve the changelogs and describe the counting change, as suggested by
Suren Baghdasaryan
- collect Reviewed-by tags
Kefeng Wang (7):
arm64: mm: cleanup __do_page_fault()
arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS
arm: mm: accelerate pagefault when VM_FAULT_BADACCESS
powerpc: mm: accelerate pagefault when badaccess
riscv: mm: accelerate pagefault when badaccess
s390: mm: accelerate pagefault when badaccess
x86: mm: accelerate pagefault when badaccess
arch/arm/mm/fault.c | 4 +++-
arch/arm64/mm/fault.c | 31 ++++++++++---------------------
arch/powerpc/mm/fault.c | 33 ++++++++++++++++++++-------------
arch/riscv/mm/fault.c | 5 ++++-
arch/s390/mm/fault.c | 3 ++-
arch/x86/mm/fault.c | 23 ++++++++++++++---------
6 files changed, 53 insertions(+), 46 deletions(-)
--
2.27.0
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault()
2024-04-03 8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
@ 2024-04-03 8:37 ` Kefeng Wang
2024-04-09 11:14 ` Catalin Marinas
2024-04-03 8:38 ` [PATCH v2 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS Kefeng Wang
` (6 subsequent siblings)
7 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03 8:37 UTC (permalink / raw)
To: akpm
Cc: Russell King, Catalin Marinas, Will Deacon, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexander Gordeev, Gerald Schaefer, Dave Hansen,
Andy Lutomirski, Peter Zijlstra, x86, linux-arm-kernel,
linuxppc-dev, linux-riscv, linux-s390, surenb, linux-mm,
Kefeng Wang
__do_page_fault() only calls handle_mm_fault() after the vm_flags
check, and it is only called by do_page_fault(), so squash it into
do_page_fault() to clean up the code.
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
arch/arm64/mm/fault.c | 27 +++++++--------------------
1 file changed, 7 insertions(+), 20 deletions(-)
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 8251e2fea9c7..9bb9f395351a 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -486,25 +486,6 @@ static void do_bad_area(unsigned long far, unsigned long esr,
}
}
-#define VM_FAULT_BADMAP ((__force vm_fault_t)0x010000)
-#define VM_FAULT_BADACCESS ((__force vm_fault_t)0x020000)
-
-static vm_fault_t __do_page_fault(struct mm_struct *mm,
- struct vm_area_struct *vma, unsigned long addr,
- unsigned int mm_flags, unsigned long vm_flags,
- struct pt_regs *regs)
-{
- /*
- * Ok, we have a good vm_area for this memory access, so we can handle
- * it.
- * Check that the permissions on the VMA allow for the fault which
- * occurred.
- */
- if (!(vma->vm_flags & vm_flags))
- return VM_FAULT_BADACCESS;
- return handle_mm_fault(vma, addr, mm_flags, regs);
-}
-
static bool is_el0_instruction_abort(unsigned long esr)
{
return ESR_ELx_EC(esr) == ESR_ELx_EC_IABT_LOW;
@@ -519,6 +500,9 @@ static bool is_write_abort(unsigned long esr)
return (esr & ESR_ELx_WNR) && !(esr & ESR_ELx_CM);
}
+#define VM_FAULT_BADMAP ((__force vm_fault_t)0x010000)
+#define VM_FAULT_BADACCESS ((__force vm_fault_t)0x020000)
+
static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
struct pt_regs *regs)
{
@@ -617,7 +601,10 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
goto done;
}
- fault = __do_page_fault(mm, vma, addr, mm_flags, vm_flags, regs);
+ if (!(vma->vm_flags & vm_flags))
+ fault = VM_FAULT_BADACCESS;
+ else
+ fault = handle_mm_fault(vma, addr, mm_flags, regs);
/* Quick path to respond to signals */
if (fault_signal_pending(fault, regs)) {
--
2.27.0
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS
2024-04-03 8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
2024-04-03 8:37 ` [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault() Kefeng Wang
@ 2024-04-03 8:38 ` Kefeng Wang
2024-04-09 11:15 ` Catalin Marinas
2024-04-03 8:38 ` [PATCH v2 3/7] arm: " Kefeng Wang
` (5 subsequent siblings)
7 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03 8:38 UTC (permalink / raw)
To: akpm
Cc: Russell King, Catalin Marinas, Will Deacon, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexander Gordeev, Gerald Schaefer, Dave Hansen,
Andy Lutomirski, Peter Zijlstra, x86, linux-arm-kernel,
linuxppc-dev, linux-riscv, linux-s390, surenb, linux-mm,
Kefeng Wang
The vma->vm_flags are already checked under the per-VMA lock; if it is
a bad access, set fault to VM_FAULT_BADACCESS and handle the error
directly, with no need to retry under the mmap_lock. This reduces the
latency of lmbench's 'lat_sig -P 1 prot lat_sig' testcase by 34%.
Since the page fault is handled under the per-VMA lock, count it as a
vma lock event with VMA_LOCK_SUCCESS.
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
arch/arm64/mm/fault.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 9bb9f395351a..405f9aa831bd 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -572,7 +572,9 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
if (!(vma->vm_flags & vm_flags)) {
vma_end_read(vma);
- goto lock_mmap;
+ fault = VM_FAULT_BADACCESS;
+ count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+ goto done;
}
fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
--
2.27.0
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 3/7] arm: mm: accelerate pagefault when VM_FAULT_BADACCESS
2024-04-03 8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
2024-04-03 8:37 ` [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault() Kefeng Wang
2024-04-03 8:38 ` [PATCH v2 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS Kefeng Wang
@ 2024-04-03 8:38 ` Kefeng Wang
2024-04-03 8:38 ` [PATCH v2 4/7] powerpc: mm: accelerate pagefault when badaccess Kefeng Wang
` (4 subsequent siblings)
7 siblings, 0 replies; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03 8:38 UTC (permalink / raw)
To: akpm
Cc: Russell King, Catalin Marinas, Will Deacon, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexander Gordeev, Gerald Schaefer, Dave Hansen,
Andy Lutomirski, Peter Zijlstra, x86, linux-arm-kernel,
linuxppc-dev, linux-riscv, linux-s390, surenb, linux-mm,
Kefeng Wang
The vma->vm_flags are already checked under the per-VMA lock; if it is
a bad access, set fault to VM_FAULT_BADACCESS and handle the error
directly, with no need to retry under the mmap_lock. Since the page
fault is handled under the per-VMA lock, count it as a vma lock event
with VMA_LOCK_SUCCESS.
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
arch/arm/mm/fault.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm/mm/fault.c b/arch/arm/mm/fault.c
index 439dc6a26bb9..5c4b417e24f9 100644
--- a/arch/arm/mm/fault.c
+++ b/arch/arm/mm/fault.c
@@ -294,7 +294,9 @@ do_page_fault(unsigned long addr, unsigned int fsr, struct pt_regs *regs)
if (!(vma->vm_flags & vm_flags)) {
vma_end_read(vma);
- goto lock_mmap;
+ count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+ fault = VM_FAULT_BADACCESS;
+ goto bad_area;
}
fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK, regs);
if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
--
2.27.0
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 4/7] powerpc: mm: accelerate pagefault when badaccess
2024-04-03 8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
` (2 preceding siblings ...)
2024-04-03 8:38 ` [PATCH v2 3/7] arm: " Kefeng Wang
@ 2024-04-03 8:38 ` Kefeng Wang
2024-04-09 8:56 ` Michael Ellerman
2024-04-03 8:38 ` [PATCH v2 5/7] riscv: " Kefeng Wang
` (3 subsequent siblings)
7 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03 8:38 UTC (permalink / raw)
To: akpm
Cc: Russell King, Catalin Marinas, Will Deacon, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexander Gordeev, Gerald Schaefer, Dave Hansen,
Andy Lutomirski, Peter Zijlstra, x86, linux-arm-kernel,
linuxppc-dev, linux-riscv, linux-s390, surenb, linux-mm,
Kefeng Wang
The access_[pkey]_error() checks on the vma are already done under the
per-VMA lock; if it is a bad access, handle the error directly, with no
need to retry under the mmap_lock. In order to release the correct
lock, pass the mm_struct into bad_access_pkey()/bad_access(): if mm is
NULL, release the per-VMA lock, otherwise release the mmap_lock. Since
the page fault is handled under the per-VMA lock, count it as a vma
lock event with VMA_LOCK_SUCCESS.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
arch/powerpc/mm/fault.c | 33 ++++++++++++++++++++-------------
1 file changed, 20 insertions(+), 13 deletions(-)
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 53335ae21a40..215690452495 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -71,23 +71,26 @@ static noinline int bad_area_nosemaphore(struct pt_regs *regs, unsigned long add
return __bad_area_nosemaphore(regs, address, SEGV_MAPERR);
}
-static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code)
+static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code,
+ struct mm_struct *mm, struct vm_area_struct *vma)
{
- struct mm_struct *mm = current->mm;
/*
* Something tried to access memory that isn't in our memory map..
* Fix it, but check if it's kernel or user first..
*/
- mmap_read_unlock(mm);
+ if (mm)
+ mmap_read_unlock(mm);
+ else
+ vma_end_read(vma);
return __bad_area_nosemaphore(regs, address, si_code);
}
static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
+ struct mm_struct *mm,
struct vm_area_struct *vma)
{
- struct mm_struct *mm = current->mm;
int pkey;
/*
@@ -109,7 +112,10 @@ static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
*/
pkey = vma_pkey(vma);
- mmap_read_unlock(mm);
+ if (mm)
+ mmap_read_unlock(mm);
+ else
+ vma_end_read(vma);
/*
* If we are in kernel mode, bail out with a SEGV, this will
@@ -124,9 +130,10 @@ static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
return 0;
}
-static noinline int bad_access(struct pt_regs *regs, unsigned long address)
+static noinline int bad_access(struct pt_regs *regs, unsigned long address,
+ struct mm_struct *mm, struct vm_area_struct *vma)
{
- return __bad_area(regs, address, SEGV_ACCERR);
+ return __bad_area(regs, address, SEGV_ACCERR, mm, vma);
}
static int do_sigbus(struct pt_regs *regs, unsigned long address,
@@ -479,13 +486,13 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
if (unlikely(access_pkey_error(is_write, is_exec,
(error_code & DSISR_KEYFAULT), vma))) {
- vma_end_read(vma);
- goto lock_mmap;
+ count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+ return bad_access_pkey(regs, address, NULL, vma);
}
if (unlikely(access_error(is_write, is_exec, vma))) {
- vma_end_read(vma);
- goto lock_mmap;
+ count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+ return bad_access(regs, address, NULL, vma);
}
fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
@@ -521,10 +528,10 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
if (unlikely(access_pkey_error(is_write, is_exec,
(error_code & DSISR_KEYFAULT), vma)))
- return bad_access_pkey(regs, address, vma);
+ return bad_access_pkey(regs, address, mm, vma);
if (unlikely(access_error(is_write, is_exec, vma)))
- return bad_access(regs, address);
+ return bad_access(regs, address, mm, vma);
/*
* If for any reason at all we couldn't handle the fault,
--
2.27.0
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 5/7] riscv: mm: accelerate pagefault when badaccess
2024-04-03 8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
` (3 preceding siblings ...)
2024-04-03 8:38 ` [PATCH v2 4/7] powerpc: mm: accelerate pagefault when badaccess Kefeng Wang
@ 2024-04-03 8:38 ` Kefeng Wang
2024-04-10 7:32 ` Alexandre Ghiti
2024-04-03 8:38 ` [PATCH v2 6/7] s390: " Kefeng Wang
` (2 subsequent siblings)
7 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03 8:38 UTC (permalink / raw)
To: akpm
Cc: Russell King, Catalin Marinas, Will Deacon, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexander Gordeev, Gerald Schaefer, Dave Hansen,
Andy Lutomirski, Peter Zijlstra, x86, linux-arm-kernel,
linuxppc-dev, linux-riscv, linux-s390, surenb, linux-mm,
Kefeng Wang
The access_error() check on the vma is already done under the per-VMA
lock; if it is a bad access, handle the error directly, with no need to
retry under the mmap_lock. Since the page fault is handled under the
per-VMA lock, count it as a vma lock event with VMA_LOCK_SUCCESS.
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
arch/riscv/mm/fault.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index 3ba1d4dde5dd..b3fcf7d67efb 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -292,7 +292,10 @@ void handle_page_fault(struct pt_regs *regs)
if (unlikely(access_error(cause, vma))) {
vma_end_read(vma);
- goto lock_mmap;
+ count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+ tsk->thread.bad_cause = SEGV_ACCERR;
+ bad_area_nosemaphore(regs, code, addr);
+ return;
}
fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK, regs);
--
2.27.0
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 6/7] s390: mm: accelerate pagefault when badaccess
2024-04-03 8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
` (4 preceding siblings ...)
2024-04-03 8:38 ` [PATCH v2 5/7] riscv: " Kefeng Wang
@ 2024-04-03 8:38 ` Kefeng Wang
2024-04-07 17:19 ` Heiko Carstens
2024-04-03 8:38 ` [PATCH v2 7/7] x86: " Kefeng Wang
2024-04-03 20:45 ` [PATCH v2 0/7] arch/mm/fault: " Andrew Morton
7 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03 8:38 UTC (permalink / raw)
To: akpm
Cc: Russell King, Catalin Marinas, Will Deacon, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexander Gordeev, Gerald Schaefer, Dave Hansen,
Andy Lutomirski, Peter Zijlstra, x86, linux-arm-kernel,
linuxppc-dev, linux-riscv, linux-s390, surenb, linux-mm,
Kefeng Wang
The vma->vm_flags are already checked under the per-VMA lock; if it is
a bad access, handle the error directly, with no need to retry under
the mmap_lock. Since the page fault is handled under the per-VMA lock,
count it as a vma lock event with VMA_LOCK_SUCCESS.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
arch/s390/mm/fault.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
index c421dd44ffbe..162ca2576fd4 100644
--- a/arch/s390/mm/fault.c
+++ b/arch/s390/mm/fault.c
@@ -325,7 +325,8 @@ static void do_exception(struct pt_regs *regs, int access)
goto lock_mmap;
if (!(vma->vm_flags & access)) {
vma_end_read(vma);
- goto lock_mmap;
+ count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+ return handle_fault_error_nolock(regs, SEGV_ACCERR);
}
fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
--
2.27.0
^ permalink raw reply [flat|nested] 19+ messages in thread
* [PATCH v2 7/7] x86: mm: accelerate pagefault when badaccess
2024-04-03 8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
` (5 preceding siblings ...)
2024-04-03 8:38 ` [PATCH v2 6/7] s390: " Kefeng Wang
@ 2024-04-03 8:38 ` Kefeng Wang
2024-04-03 20:45 ` [PATCH v2 0/7] arch/mm/fault: " Andrew Morton
7 siblings, 0 replies; 19+ messages in thread
From: Kefeng Wang @ 2024-04-03 8:38 UTC (permalink / raw)
To: akpm
Cc: Russell King, Catalin Marinas, Will Deacon, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexander Gordeev, Gerald Schaefer, Dave Hansen,
Andy Lutomirski, Peter Zijlstra, x86, linux-arm-kernel,
linuxppc-dev, linux-riscv, linux-s390, surenb, linux-mm,
Kefeng Wang
The access_error() check on the vma is already done under the per-VMA
lock; if it is a bad access, handle the error directly, with no need to
retry under the mmap_lock. In order to release the correct lock, pass
the mm_struct into bad_area_access_error(): if mm is NULL, release the
per-VMA lock, otherwise release the mmap_lock. Since the page fault is
handled under the per-VMA lock, count it as a vma lock event with
VMA_LOCK_SUCCESS.
Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
arch/x86/mm/fault.c | 23 ++++++++++++++---------
1 file changed, 14 insertions(+), 9 deletions(-)
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index a4cc20d0036d..67b18adc75dd 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -866,14 +866,17 @@ bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
static void
__bad_area(struct pt_regs *regs, unsigned long error_code,
- unsigned long address, u32 pkey, int si_code)
+ unsigned long address, struct mm_struct *mm,
+ struct vm_area_struct *vma, u32 pkey, int si_code)
{
- struct mm_struct *mm = current->mm;
/*
* Something tried to access memory that isn't in our memory map..
* Fix it, but check if it's kernel or user first..
*/
- mmap_read_unlock(mm);
+ if (mm)
+ mmap_read_unlock(mm);
+ else
+ vma_end_read(vma);
__bad_area_nosemaphore(regs, error_code, address, pkey, si_code);
}
@@ -897,7 +900,8 @@ static inline bool bad_area_access_from_pkeys(unsigned long error_code,
static noinline void
bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
- unsigned long address, struct vm_area_struct *vma)
+ unsigned long address, struct mm_struct *mm,
+ struct vm_area_struct *vma)
{
/*
* This OSPKE check is not strictly necessary at runtime.
@@ -927,9 +931,9 @@ bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
*/
u32 pkey = vma_pkey(vma);
- __bad_area(regs, error_code, address, pkey, SEGV_PKUERR);
+ __bad_area(regs, error_code, address, mm, vma, pkey, SEGV_PKUERR);
} else {
- __bad_area(regs, error_code, address, 0, SEGV_ACCERR);
+ __bad_area(regs, error_code, address, mm, vma, 0, SEGV_ACCERR);
}
}
@@ -1357,8 +1361,9 @@ void do_user_addr_fault(struct pt_regs *regs,
goto lock_mmap;
if (unlikely(access_error(error_code, vma))) {
- vma_end_read(vma);
- goto lock_mmap;
+ bad_area_access_error(regs, error_code, address, NULL, vma);
+ count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+ return;
}
fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
@@ -1394,7 +1399,7 @@ void do_user_addr_fault(struct pt_regs *regs,
* we can handle it..
*/
if (unlikely(access_error(error_code, vma))) {
- bad_area_access_error(regs, error_code, address, vma);
+ bad_area_access_error(regs, error_code, address, mm, vma);
return;
}
--
2.27.0
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess
2024-04-03 8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
` (6 preceding siblings ...)
2024-04-03 8:38 ` [PATCH v2 7/7] x86: " Kefeng Wang
@ 2024-04-03 20:45 ` Andrew Morton
2024-04-07 7:49 ` Kefeng Wang
7 siblings, 1 reply; 19+ messages in thread
From: Andrew Morton @ 2024-04-03 20:45 UTC (permalink / raw)
To: Kefeng Wang
Cc: Russell King, Catalin Marinas, Will Deacon, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexander Gordeev, Gerald Schaefer, Dave Hansen,
Andy Lutomirski, Peter Zijlstra, x86, linux-arm-kernel,
linuxppc-dev, linux-riscv, linux-s390, surenb, linux-mm
On Wed, 3 Apr 2024 16:37:58 +0800 Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> After VMA lock-based page fault handling enabled, if bad access met
> under per-vma lock, it will fallback to mmap_lock-based handling,
> so it leads to unnessary mmap lock and vma find again. A test from
> lmbench shows 34% improve after this changes on arm64,
>
> lat_sig -P 1 prot lat_sig 0.29194 -> 0.19198
>
> Only build test on other archs except arm64.
Thanks. So we now want a bunch of architectures to runtime test this. Do
we have a selftest in place which will adequately do this?
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess
2024-04-03 20:45 ` [PATCH v2 0/7] arch/mm/fault: " Andrew Morton
@ 2024-04-07 7:49 ` Kefeng Wang
2024-04-07 17:19 ` Heiko Carstens
0 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-07 7:49 UTC (permalink / raw)
To: Andrew Morton
Cc: Russell King, Catalin Marinas, Will Deacon, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexander Gordeev, Gerald Schaefer, Dave Hansen,
Andy Lutomirski, Peter Zijlstra, x86, linux-arm-kernel,
linuxppc-dev, linux-riscv, linux-s390, surenb, linux-mm
On 2024/4/4 4:45, Andrew Morton wrote:
> On Wed, 3 Apr 2024 16:37:58 +0800 Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
>> After VMA lock-based page fault handling enabled, if bad access met
>> under per-vma lock, it will fallback to mmap_lock-based handling,
>> so it leads to unnessary mmap lock and vma find again. A test from
>> lmbench shows 34% improve after this changes on arm64,
>>
>> lat_sig -P 1 prot lat_sig 0.29194 -> 0.19198
>>
>> Only build test on other archs except arm64.
>
> Thanks. So we now want a bunch of architectures to runtime test this. Do
> we have a selftest in place which will adequately do this?
I didn't find such a selftest, and a bad access leads to a coredump, so
the performance should not matter in most scenarios; going without a
selftest seems acceptable. lmbench is easy to use to measure the
performance.
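For reference, a minimal userspace sketch of what such a test could look
like (hypothetical, not an existing kselftest): write to a read-only
mapping and expect SEGV_ACCERR instead of a coredump.

	/*
	 * Hypothetical example: trigger the bad-access (SEGV_ACCERR) path by
	 * writing to a read-only mapping, and catch the resulting SIGSEGV.
	 */
	#include <setjmp.h>
	#include <signal.h>
	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <unistd.h>

	static sigjmp_buf env;

	static void segv_handler(int sig, siginfo_t *info, void *ucontext)
	{
		/* the VMA exists but forbids the access: expect SEGV_ACCERR */
		if (info->si_code == SEGV_ACCERR)
			siglongjmp(env, 1);
		_exit(1);
	}

	int main(void)
	{
		long psz = sysconf(_SC_PAGESIZE);
		struct sigaction sa;
		char *p;

		memset(&sa, 0, sizeof(sa));
		sa.sa_sigaction = segv_handler;
		sa.sa_flags = SA_SIGINFO;
		sigaction(SIGSEGV, &sa, NULL);

		p = mmap(NULL, psz, PROT_READ, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
		if (p == MAP_FAILED) {
			perror("mmap");
			return 1;
		}

		if (sigsetjmp(env, 1) == 0)
			*p = 1;		/* bad access: write to a read-only VMA */
		else
			puts("caught SEGV_ACCERR as expected");

		return 0;
	}

Built with a plain cc invocation, this should print the message rather
than dump core; it exercises roughly the same bad-access path that the
lmbench 'lat_sig prot' case times.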
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess
2024-04-07 7:49 ` Kefeng Wang
@ 2024-04-07 17:19 ` Heiko Carstens
0 siblings, 0 replies; 19+ messages in thread
From: Heiko Carstens @ 2024-04-07 17:19 UTC (permalink / raw)
To: Kefeng Wang
Cc: Andrew Morton, Russell King, Catalin Marinas, Will Deacon,
Michael Ellerman, Nicholas Piggin, Christophe Leroy,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexander Gordeev,
Gerald Schaefer, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
x86, linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
surenb, linux-mm
On Sun, Apr 07, 2024 at 03:49:53PM +0800, Kefeng Wang wrote:
> On 2024/4/4 4:45, Andrew Morton wrote:
> > On Wed, 3 Apr 2024 16:37:58 +0800 Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> >
> > > After VMA lock-based page fault handling enabled, if bad access met
> > > under per-vma lock, it will fallback to mmap_lock-based handling,
> > > so it leads to unnessary mmap lock and vma find again. A test from
> > > lmbench shows 34% improve after this changes on arm64,
> > >
> > > lat_sig -P 1 prot lat_sig 0.29194 -> 0.19198
> > >
> > > Only build test on other archs except arm64.
> >
> > Thanks. So we now want a bunch of architectures to runtime test this. Do
> > we have a selftest in place which will adequately do this?
>
> I don't find such selftest, and badaccess would lead to coredump, the
> performance should not affect most scene, so no selftest is acceptable.
> lmbench is easy to use to measure the performance.
The rationale for this series (performance improvement) is a bit odd,
since I would expect that the changed code is usually never executed.
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 6/7] s390: mm: accelerate pagefault when badaccess
2024-04-03 8:38 ` [PATCH v2 6/7] s390: " Kefeng Wang
@ 2024-04-07 17:19 ` Heiko Carstens
0 siblings, 0 replies; 19+ messages in thread
From: Heiko Carstens @ 2024-04-07 17:19 UTC (permalink / raw)
To: Kefeng Wang
Cc: akpm, Russell King, Catalin Marinas, Will Deacon,
Michael Ellerman, Nicholas Piggin, Christophe Leroy,
Paul Walmsley, Palmer Dabbelt, Albert Ou, Alexander Gordeev,
Gerald Schaefer, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
x86, linux-arm-kernel, linuxppc-dev, linux-riscv, linux-s390,
surenb, linux-mm
On Wed, Apr 03, 2024 at 04:38:04PM +0800, Kefeng Wang wrote:
> The vm_flags of vma already checked under per-VMA lock, if it is a
> bad access, directly handle error, no need to retry with mmap_lock
> again. Since the page faut is handled under per-VMA lock, count it
> as a vma lock event with VMA_LOCK_SUCCESS.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> arch/s390/mm/fault.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/arch/s390/mm/fault.c b/arch/s390/mm/fault.c
> index c421dd44ffbe..162ca2576fd4 100644
> --- a/arch/s390/mm/fault.c
> +++ b/arch/s390/mm/fault.c
> @@ -325,7 +325,8 @@ static void do_exception(struct pt_regs *regs, int access)
> goto lock_mmap;
> if (!(vma->vm_flags & access)) {
> vma_end_read(vma);
> - goto lock_mmap;
> + count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> + return handle_fault_error_nolock(regs, SEGV_ACCERR);
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 4/7] powerpc: mm: accelerate pagefault when badaccess
2024-04-03 8:38 ` [PATCH v2 4/7] powerpc: mm: accelerate pagefault when badaccess Kefeng Wang
@ 2024-04-09 8:56 ` Michael Ellerman
0 siblings, 0 replies; 19+ messages in thread
From: Michael Ellerman @ 2024-04-09 8:56 UTC (permalink / raw)
To: Kefeng Wang, akpm
Cc: Russell King, Catalin Marinas, Will Deacon, Nicholas Piggin,
Christophe Leroy, Paul Walmsley, Palmer Dabbelt, Albert Ou,
Alexander Gordeev, Gerald Schaefer, Dave Hansen, Andy Lutomirski,
Peter Zijlstra, x86, linux-arm-kernel, linuxppc-dev, linux-riscv,
linux-s390, surenb, linux-mm, Kefeng Wang
Kefeng Wang <wangkefeng.wang@huawei.com> writes:
> The access_[pkey]_error() of vma already checked under per-VMA lock, if
> it is a bad access, directly handle error, no need to retry with mmap_lock
> again. In order to release the correct lock, pass the mm_struct into
> bad_access_pkey()/bad_access(), if mm is NULL, release vma lock, or
> release mmap_lock. Since the page faut is handled under per-VMA lock,
> count it as a vma lock event with VMA_LOCK_SUCCESS.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> arch/powerpc/mm/fault.c | 33 ++++++++++++++++++++-------------
> 1 file changed, 20 insertions(+), 13 deletions(-)
I thought there might be a nicer way to do this; plumbing the mm and vma
down through all those levels is a bit of a pain (vma->vm_mm exists,
after all).
But I couldn't come up with anything obviously better without doing
lots of refactoring first, which would be a pain to integrate into this
series.
So anyway, if the series goes ahead:
Acked-by: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
cheers
> diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
> index 53335ae21a40..215690452495 100644
> --- a/arch/powerpc/mm/fault.c
> +++ b/arch/powerpc/mm/fault.c
> @@ -71,23 +71,26 @@ static noinline int bad_area_nosemaphore(struct pt_regs *regs, unsigned long add
> return __bad_area_nosemaphore(regs, address, SEGV_MAPERR);
> }
>
> -static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code)
> +static int __bad_area(struct pt_regs *regs, unsigned long address, int si_code,
> + struct mm_struct *mm, struct vm_area_struct *vma)
> {
> - struct mm_struct *mm = current->mm;
>
> /*
> * Something tried to access memory that isn't in our memory map..
> * Fix it, but check if it's kernel or user first..
> */
> - mmap_read_unlock(mm);
> + if (mm)
> + mmap_read_unlock(mm);
> + else
> + vma_end_read(vma);
>
> return __bad_area_nosemaphore(regs, address, si_code);
> }
>
> static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
> + struct mm_struct *mm,
> struct vm_area_struct *vma)
> {
> - struct mm_struct *mm = current->mm;
> int pkey;
>
> /*
> @@ -109,7 +112,10 @@ static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
> */
> pkey = vma_pkey(vma);
>
> - mmap_read_unlock(mm);
> + if (mm)
> + mmap_read_unlock(mm);
> + else
> + vma_end_read(vma);
>
> /*
> * If we are in kernel mode, bail out with a SEGV, this will
> @@ -124,9 +130,10 @@ static noinline int bad_access_pkey(struct pt_regs *regs, unsigned long address,
> return 0;
> }
>
> -static noinline int bad_access(struct pt_regs *regs, unsigned long address)
> +static noinline int bad_access(struct pt_regs *regs, unsigned long address,
> + struct mm_struct *mm, struct vm_area_struct *vma)
> {
> - return __bad_area(regs, address, SEGV_ACCERR);
> + return __bad_area(regs, address, SEGV_ACCERR, mm, vma);
> }
>
> static int do_sigbus(struct pt_regs *regs, unsigned long address,
> @@ -479,13 +486,13 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
>
> if (unlikely(access_pkey_error(is_write, is_exec,
> (error_code & DSISR_KEYFAULT), vma))) {
> - vma_end_read(vma);
> - goto lock_mmap;
> + count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> + return bad_access_pkey(regs, address, NULL, vma);
> }
>
> if (unlikely(access_error(is_write, is_exec, vma))) {
> - vma_end_read(vma);
> - goto lock_mmap;
> + count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> + return bad_access(regs, address, NULL, vma);
> }
>
> fault = handle_mm_fault(vma, address, flags | FAULT_FLAG_VMA_LOCK, regs);
> @@ -521,10 +528,10 @@ static int ___do_page_fault(struct pt_regs *regs, unsigned long address,
>
> if (unlikely(access_pkey_error(is_write, is_exec,
> (error_code & DSISR_KEYFAULT), vma)))
> - return bad_access_pkey(regs, address, vma);
> + return bad_access_pkey(regs, address, mm, vma);
>
> if (unlikely(access_error(is_write, is_exec, vma)))
> - return bad_access(regs, address);
> + return bad_access(regs, address, mm, vma);
>
> /*
> * If for any reason at all we couldn't handle the fault,
> --
> 2.27.0
>
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault()
2024-04-03 8:37 ` [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault() Kefeng Wang
@ 2024-04-09 11:14 ` Catalin Marinas
0 siblings, 0 replies; 19+ messages in thread
From: Catalin Marinas @ 2024-04-09 11:14 UTC (permalink / raw)
To: Kefeng Wang
Cc: akpm, Russell King, Will Deacon, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexander Gordeev, Gerald Schaefer, Dave Hansen,
Andy Lutomirski, Peter Zijlstra, x86, linux-arm-kernel,
linuxppc-dev, linux-riscv, linux-s390, surenb, linux-mm
On Wed, Apr 03, 2024 at 04:37:59PM +0800, Kefeng Wang wrote:
> The __do_page_fault() only calls handle_mm_fault() after vm_flags
> checked, and it is only called by do_page_fault(), let's squash
> it into do_page_fault() to cleanup code.
>
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
As I reviewed v1 and the changes are minimal:
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS
2024-04-03 8:38 ` [PATCH v2 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS Kefeng Wang
@ 2024-04-09 11:15 ` Catalin Marinas
0 siblings, 0 replies; 19+ messages in thread
From: Catalin Marinas @ 2024-04-09 11:15 UTC (permalink / raw)
To: Kefeng Wang
Cc: akpm, Russell King, Will Deacon, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexander Gordeev, Gerald Schaefer, Dave Hansen,
Andy Lutomirski, Peter Zijlstra, x86, linux-arm-kernel,
linuxppc-dev, linux-riscv, linux-s390, surenb, linux-mm
On Wed, Apr 03, 2024 at 04:38:00PM +0800, Kefeng Wang wrote:
> The vm_flags of vma already checked under per-VMA lock, if it is a
> bad access, directly set fault to VM_FAULT_BADACCESS and handle error,
> no need to retry with mmap_lock again, the latency time reduces 34% in
> 'lat_sig -P 1 prot lat_sig' from lmbench testcase.
>
> Since the page faut is handled under per-VMA lock, count it as a vma lock
> event with VMA_LOCK_SUCCESS.
>
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 5/7] riscv: mm: accelerate pagefault when badaccess
2024-04-03 8:38 ` [PATCH v2 5/7] riscv: " Kefeng Wang
@ 2024-04-10 7:32 ` Alexandre Ghiti
2024-04-10 8:07 ` Kefeng Wang
0 siblings, 1 reply; 19+ messages in thread
From: Alexandre Ghiti @ 2024-04-10 7:32 UTC (permalink / raw)
To: Kefeng Wang, akpm
Cc: Russell King, Catalin Marinas, Will Deacon, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexander Gordeev, Gerald Schaefer, Dave Hansen,
Andy Lutomirski, Peter Zijlstra, x86, linux-arm-kernel,
linuxppc-dev, linux-riscv, linux-s390, surenb, linux-mm
Hi Kefeng,
On 03/04/2024 10:38, Kefeng Wang wrote:
> The access_error() of vma already checked under per-VMA lock, if it
> is a bad access, directly handle error, no need to retry with mmap_lock
> again. Since the page faut is handled under per-VMA lock, count it as
> a vma lock event with VMA_LOCK_SUCCESS.
>
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> arch/riscv/mm/fault.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
> index 3ba1d4dde5dd..b3fcf7d67efb 100644
> --- a/arch/riscv/mm/fault.c
> +++ b/arch/riscv/mm/fault.c
> @@ -292,7 +292,10 @@ void handle_page_fault(struct pt_regs *regs)
>
> if (unlikely(access_error(cause, vma))) {
> vma_end_read(vma);
> - goto lock_mmap;
> + count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> + tsk->thread.bad_cause = SEGV_ACCERR;
I think we should use the cause variable here instead of SEGV_ACCERR, as
bad_cause is a riscv internal status which describes the real fault that
happened.
Thanks,
Alex
> + bad_area_nosemaphore(regs, code, addr);
> + return;
> }
>
> fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK, regs);
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 5/7] riscv: mm: accelerate pagefault when badaccess
2024-04-10 7:32 ` Alexandre Ghiti
@ 2024-04-10 8:07 ` Kefeng Wang
2024-04-10 17:28 ` Alexandre Ghiti
0 siblings, 1 reply; 19+ messages in thread
From: Kefeng Wang @ 2024-04-10 8:07 UTC (permalink / raw)
To: Alexandre Ghiti, akpm
Cc: Russell King, Catalin Marinas, Will Deacon, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexander Gordeev, Gerald Schaefer, Dave Hansen,
Andy Lutomirski, Peter Zijlstra, x86, linux-arm-kernel,
linuxppc-dev, linux-riscv, linux-s390, surenb, linux-mm
On 2024/4/10 15:32, Alexandre Ghiti wrote:
> Hi Kefeng,
>
> On 03/04/2024 10:38, Kefeng Wang wrote:
>> The access_error() of vma already checked under per-VMA lock, if it
>> is a bad access, directly handle error, no need to retry with mmap_lock
>> again. Since the page faut is handled under per-VMA lock, count it as
>> a vma lock event with VMA_LOCK_SUCCESS.
>>
>> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>> arch/riscv/mm/fault.c | 5 ++++-
>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
>> index 3ba1d4dde5dd..b3fcf7d67efb 100644
>> --- a/arch/riscv/mm/fault.c
>> +++ b/arch/riscv/mm/fault.c
>> @@ -292,7 +292,10 @@ void handle_page_fault(struct pt_regs *regs)
>> if (unlikely(access_error(cause, vma))) {
>> vma_end_read(vma);
>> - goto lock_mmap;
>> + count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>> + tsk->thread.bad_cause = SEGV_ACCERR;
>
>
> I think we should use the cause variable here instead of SEGV_ACCERR, as
> bad_cause is a riscv internal status which describes the real fault that
> happened.
Oh, I see, it is the exception cause on riscv, so it should be:
diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index b3fcf7d67efb..5224f3733802 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -293,8 +293,8 @@ void handle_page_fault(struct pt_regs *regs)
if (unlikely(access_error(cause, vma))) {
vma_end_read(vma);
count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
- tsk->thread.bad_cause = SEGV_ACCERR;
- bad_area_nosemaphore(regs, code, addr);
+ tsk->thread.bad_cause = cause;
+ bad_area_nosemaphore(regs, SEGV_ACCERR, addr);
return;
}
Hi Alex, could you help check it?
Hi Andrew, please help squash it once Alex acks it.
Thanks to you both.
>
> Thanks,
>
> Alex
>
>
>> + bad_area_nosemaphore(regs, code, addr);
>> + return;
>> }
>> fault = handle_mm_fault(vma, addr, flags | FAULT_FLAG_VMA_LOCK,
>> regs);
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 5/7] riscv: mm: accelerate pagefault when badaccess
2024-04-10 8:07 ` Kefeng Wang
@ 2024-04-10 17:28 ` Alexandre Ghiti
2024-04-11 1:17 ` Kefeng Wang
0 siblings, 1 reply; 19+ messages in thread
From: Alexandre Ghiti @ 2024-04-10 17:28 UTC (permalink / raw)
To: Kefeng Wang, akpm
Cc: Russell King, Catalin Marinas, Will Deacon, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexander Gordeev, Gerald Schaefer, Dave Hansen,
Andy Lutomirski, Peter Zijlstra, x86, linux-arm-kernel,
linuxppc-dev, linux-riscv, linux-s390, surenb, linux-mm
On 10/04/2024 10:07, Kefeng Wang wrote:
>
>
> On 2024/4/10 15:32, Alexandre Ghiti wrote:
>> Hi Kefeng,
>>
>> On 03/04/2024 10:38, Kefeng Wang wrote:
>>> The access_error() of vma already checked under per-VMA lock, if it
>>> is a bad access, directly handle error, no need to retry with mmap_lock
>>> again. Since the page faut is handled under per-VMA lock, count it as
>>> a vma lock event with VMA_LOCK_SUCCESS.
>>>
>>> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>> ---
>>> arch/riscv/mm/fault.c | 5 ++++-
>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
>>> index 3ba1d4dde5dd..b3fcf7d67efb 100644
>>> --- a/arch/riscv/mm/fault.c
>>> +++ b/arch/riscv/mm/fault.c
>>> @@ -292,7 +292,10 @@ void handle_page_fault(struct pt_regs *regs)
>>> if (unlikely(access_error(cause, vma))) {
>>> vma_end_read(vma);
>>> - goto lock_mmap;
>>> + count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>>> + tsk->thread.bad_cause = SEGV_ACCERR;
>>
>>
>> I think we should use the cause variable here instead of SEGV_ACCERR,
>> as bad_cause is a riscv internal status which describes the real
>> fault that happened.
>
> Oh, I see, it is exception causes on riscv, so it should be
>
> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
> index b3fcf7d67efb..5224f3733802 100644
> --- a/arch/riscv/mm/fault.c
> +++ b/arch/riscv/mm/fault.c
> @@ -293,8 +293,8 @@ void handle_page_fault(struct pt_regs *regs)
> if (unlikely(access_error(cause, vma))) {
> vma_end_read(vma);
> count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> - tsk->thread.bad_cause = SEGV_ACCERR;
> - bad_area_nosemaphore(regs, code, addr);
> + tsk->thread.bad_cause = cause;
> + bad_area_nosemaphore(regs, SEGV_ACCERR, addr);
> return;
> }
>
> Hi Alex, could you help to check it?
>
> Hi Andrew, please help to squash it after Alex ack it.
>
> Thanks both.
So I have just tested Kefeng's fixup on my usual CI and with a simple
program that triggers such a bad access; everything went fine, so with
the fixup applied:
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Thanks,
Alex
>
>
>>
>> Thanks,
>>
>> Alex
>>
>>
>>> + bad_area_nosemaphore(regs, code, addr);
>>> + return;
>>> }
>>> fault = handle_mm_fault(vma, addr, flags |
>>> FAULT_FLAG_VMA_LOCK, regs);
^ permalink raw reply [flat|nested] 19+ messages in thread
* Re: [PATCH v2 5/7] riscv: mm: accelerate pagefault when badaccess
2024-04-10 17:28 ` Alexandre Ghiti
@ 2024-04-11 1:17 ` Kefeng Wang
0 siblings, 0 replies; 19+ messages in thread
From: Kefeng Wang @ 2024-04-11 1:17 UTC (permalink / raw)
To: Alexandre Ghiti, akpm
Cc: Russell King, Catalin Marinas, Will Deacon, Michael Ellerman,
Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
Albert Ou, Alexander Gordeev, Gerald Schaefer, Dave Hansen,
Andy Lutomirski, Peter Zijlstra, x86, linux-arm-kernel,
linuxppc-dev, linux-riscv, linux-s390, surenb, linux-mm
On 2024/4/11 1:28, Alexandre Ghiti wrote:
> On 10/04/2024 10:07, Kefeng Wang wrote:
>>
>>
>> On 2024/4/10 15:32, Alexandre Ghiti wrote:
>>> Hi Kefeng,
>>>
>>> On 03/04/2024 10:38, Kefeng Wang wrote:
>>>> The access_error() of vma already checked under per-VMA lock, if it
>>>> is a bad access, directly handle error, no need to retry with mmap_lock
>>>> again. Since the page faut is handled under per-VMA lock, count it as
>>>> a vma lock event with VMA_LOCK_SUCCESS.
>>>>
>>>> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>> ---
>>>> arch/riscv/mm/fault.c | 5 ++++-
>>>> 1 file changed, 4 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
>>>> index 3ba1d4dde5dd..b3fcf7d67efb 100644
>>>> --- a/arch/riscv/mm/fault.c
>>>> +++ b/arch/riscv/mm/fault.c
>>>> @@ -292,7 +292,10 @@ void handle_page_fault(struct pt_regs *regs)
>>>> if (unlikely(access_error(cause, vma))) {
>>>> vma_end_read(vma);
>>>> - goto lock_mmap;
>>>> + count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>>>> + tsk->thread.bad_cause = SEGV_ACCERR;
>>>
>>>
>>> I think we should use the cause variable here instead of SEGV_ACCERR,
>>> as bad_cause is a riscv internal status which describes the real
>>> fault that happened.
>>
>> Oh, I see, it is exception causes on riscv, so it should be
>>
>> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
>> index b3fcf7d67efb..5224f3733802 100644
>> --- a/arch/riscv/mm/fault.c
>> +++ b/arch/riscv/mm/fault.c
>> @@ -293,8 +293,8 @@ void handle_page_fault(struct pt_regs *regs)
>> if (unlikely(access_error(cause, vma))) {
>> vma_end_read(vma);
>> count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>> - tsk->thread.bad_cause = SEGV_ACCERR;
>> - bad_area_nosemaphore(regs, code, addr);
>> + tsk->thread.bad_cause = cause;
>> + bad_area_nosemaphore(regs, SEGV_ACCERR, addr);
>> return;
>> }
>>
>> Hi Alex, could you help to check it?
>>
>> Hi Andrew, please help to squash it after Alex ack it.
>>
>> Thanks both.
>
>
> So I have just tested Kefeng's fixup on my usual CI and with a simple
> program that triggers such bad access, everything went fine so with the
> fixup applied:
>
> Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
> Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Great, thanks.
>
> Thanks,
>
> Alex
>
>
>
>>
>>
>>>
>>> Thanks,
>>>
>>> Alex
>>>
>>>
>>>> + bad_area_nosemaphore(regs, code, addr);
>>>> + return;
>>>> }
>>>> fault = handle_mm_fault(vma, addr, flags |
>>>> FAULT_FLAG_VMA_LOCK, regs);
^ permalink raw reply [flat|nested] 19+ messages in thread
end of thread, other threads:[~2024-04-11 1:18 UTC | newest]
Thread overview: 19+ messages
2024-04-03 8:37 [PATCH v2 0/7] arch/mm/fault: accelerate pagefault when badaccess Kefeng Wang
2024-04-03 8:37 ` [PATCH v2 1/7] arm64: mm: cleanup __do_page_fault() Kefeng Wang
2024-04-09 11:14 ` Catalin Marinas
2024-04-03 8:38 ` [PATCH v2 2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS Kefeng Wang
2024-04-09 11:15 ` Catalin Marinas
2024-04-03 8:38 ` [PATCH v2 3/7] arm: " Kefeng Wang
2024-04-03 8:38 ` [PATCH v2 4/7] powerpc: mm: accelerate pagefault when badaccess Kefeng Wang
2024-04-09 8:56 ` Michael Ellerman
2024-04-03 8:38 ` [PATCH v2 5/7] riscv: " Kefeng Wang
2024-04-10 7:32 ` Alexandre Ghiti
2024-04-10 8:07 ` Kefeng Wang
2024-04-10 17:28 ` Alexandre Ghiti
2024-04-11 1:17 ` Kefeng Wang
2024-04-03 8:38 ` [PATCH v2 6/7] s390: " Kefeng Wang
2024-04-07 17:19 ` Heiko Carstens
2024-04-03 8:38 ` [PATCH v2 7/7] x86: " Kefeng Wang
2024-04-03 20:45 ` [PATCH v2 0/7] arch/mm/fault: " Andrew Morton
2024-04-07 7:49 ` Kefeng Wang
2024-04-07 17:19 ` Heiko Carstens