linux-mm.kvack.org archive mirror
* [PATCH v3] mm: Fix possible NULL pointer dereference in __swap_duplicate
@ 2025-02-15  9:05 gaoxu
  2025-02-16  1:42 ` Barry Song
  0 siblings, 1 reply; 7+ messages in thread
From: gaoxu @ 2025-02-15  9:05 UTC (permalink / raw)
  To: Andrew Morton, linux-mm
  Cc: linux-kernel, Suren Baghdasaryan, Barry Song, Yosry Ahmed, yipengxiang

Add a NULL check on the return value of swp_swap_info() in
__swap_duplicate() to prevent a crash from a NULL pointer dereference.

It is unclear why swp_swap_info() returns NULL; it may be due to CPU
cache issues or DDR bit flips. The issue is very rare. The oops we
captured is as follows:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058
[RB/E]rb_sreason_str_set: sreason_str set null_pointer
Mem abort info:
  ESR = 0x0000000096000005
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x05: level 1 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
  CM = 0, WnR = 0, TnD = 0, TagAccess = 0
  GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 39-bit VAs, pgdp=00000008a80e5000
[0000000000000058] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
Skip md ftrace buffer dump for: 0x1609e0
...
pc : swap_duplicate+0x44/0x164
lr : copy_page_range+0x508/0x1e78
sp : ffffffc0f2a699e0
x29: ffffffc0f2a699e0 x28: ffffff8a5b28d388 x27: ffffff8b06603388
x26: ffffffdf7291fe70 x25: 0000000000000006 x24: 0000000000100073
x23: 00000000002d2d2f x22: 0000000000000008 x21: 0000000000000000
x20: 00000000002d2d2f x19: 18000000002d2d2f x18: ffffffdf726faec0
x17: 0000000000000000 x16: 0010000000000001 x15: 0040000000000001
x14: 0400000000000001 x13: ff7ffffffffffb7f x12: ffeffffffffffbff
x11: ffffff8a5c7e1898 x10: 0000000000000018 x9 : 0000000000000006
x8 : 1800000000000000 x7 : 0000000000000000 x6 : ffffff8057c01f10
x5 : 000000000000a318 x4 : 0000000000000000 x3 : 0000000000000000
x2 : 0000006daf200000 x1 : 0000000000000001 x0 : 18000000002d2d2f
Call trace:
 swap_duplicate+0x44/0x164
 copy_page_range+0x508/0x1e78
 copy_process+0x1278/0x21cc
 kernel_clone+0x90/0x438
 __arm64_sys_clone+0x5c/0x8c
 invoke_syscall+0x58/0x110
 do_el0_svc+0x8c/0xe0
 el0_svc+0x38/0x9c
 el0t_64_sync_handler+0x44/0xec
 el0t_64_sync+0x1a8/0x1ac
Code: 9139c35a 71006f3f 54000568 f8797b55 (f9402ea8)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops: Fatal exception
SMP: stopping secondary CPUs

This patch is only a workaround, but there is no more effective software
solution for the bit-flip problem. It turns a system crash into a
process-level error, reducing the impact on the rest of the machine.

Signed-off-by: gao xu <gaoxu2@honor.com>
---
v1 -> v2: 
- Add WARN_ON_ONCE.
- Update the commit message.
v2 -> v3: Delete the review tags (This is my issue, and I apologize).
---

 mm/swapfile.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 7448a3876..a0bfdba94 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3521,6 +3521,8 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
 	int err, i;
 
 	si = swp_swap_info(entry);
+	if (WARN_ON_ONCE(!si))
+		return -EINVAL;
 
 	offset = swp_offset(entry);
 	VM_WARN_ON(nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER);
-- 
2.17.1

* [PATCH v3] mm: Fix possible NULL pointer dereference in __swap_duplicate
@ 2025-02-15  8:46 gaoxu
  0 siblings, 0 replies; 7+ messages in thread
From: gaoxu @ 2025-02-15  8:46 UTC (permalink / raw)
  To: Andrew Morton, linux-mm
  Cc: linux-kernel, Suren Baghdasaryan, Barry Song, Yosry Ahmed, yipengxiang

[-- Attachment #1: Type: text/plain, Size: 3457 bytes --]

Add a NULL check on the return value of swp_swap_info in __swap_duplicate
to prevent crashes caused by NULL pointer dereference.

The reason why swp_swap_info() returns NULL is unclear; it may be due to
CPU cache issues or DDR bit flips. The probability of this issue is very
small, and the stack info we encountered is as follows:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058
[RB/E]rb_sreason_str_set: sreason_str set null_pointer
Mem abort info:
  ESR = 0x0000000096000005
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x05: level 1 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
  CM = 0, WnR = 0, TnD = 0, TagAccess = 0
  GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 39-bit VAs, pgdp=00000008a80e5000
[0000000000000058] pgd=0000000000000000, p4d=0000000000000000, pud=0000000000000000
Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
Skip md ftrace buffer dump for: 0x1609e0
...
pc : swap_duplicate+0x44/0x164
lr : copy_page_range+0x508/0x1e78
sp : ffffffc0f2a699e0
x29: ffffffc0f2a699e0 x28: ffffff8a5b28d388 x27: ffffff8b06603388
x26: ffffffdf7291fe70 x25: 0000000000000006 x24: 0000000000100073
x23: 00000000002d2d2f x22: 0000000000000008 x21: 0000000000000000
x20: 00000000002d2d2f x19: 18000000002d2d2f x18: ffffffdf726faec0
x17: 0000000000000000 x16: 0010000000000001 x15: 0040000000000001
x14: 0400000000000001 x13: ff7ffffffffffb7f x12: ffeffffffffffbff
x11: ffffff8a5c7e1898 x10: 0000000000000018 x9 : 0000000000000006
x8 : 1800000000000000 x7 : 0000000000000000 x6 : ffffff8057c01f10
x5 : 000000000000a318 x4 : 0000000000000000 x3 : 0000000000000000
x2 : 0000006daf200000 x1 : 0000000000000001 x0 : 18000000002d2d2f
Call trace:
 swap_duplicate+0x44/0x164
 copy_page_range+0x508/0x1e78
 copy_process+0x1278/0x21cc
 kernel_clone+0x90/0x438
 __arm64_sys_clone+0x5c/0x8c
 invoke_syscall+0x58/0x110
 do_el0_svc+0x8c/0xe0
 el0_svc+0x38/0x9c
 el0t_64_sync_handler+0x44/0xec
 el0t_64_sync+0x1a8/0x1ac
Code: 9139c35a 71006f3f 54000568 f8797b55 (f9402ea8)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops: Fatal exception
SMP: stopping secondary CPUs

The patch seems to only provide a workaround, but there are no more
effective software solutions to handle the bit flips problem. This patch
will change the issue from a system crash to a process exception, thereby
reducing the impact on the entire machine.

Signed-off-by: gaoxu <gaoxu2@honor.com>
---
v1 -> v2:
- Add WARN_ON_ONCE as suggested by Yosry Ahmed.
- update the commit info.
v2 -> v3: Delete the review tags (This is my issue, and I apologize).
---

 mm/swapfile.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 7448a3876..a0bfdba94 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3521,6 +3521,8 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
 	int err, i;
 
 	si = swp_swap_info(entry);
+	if (WARN_ON_ONCE(!si))
+		return -EINVAL;
 
 	offset = swp_offset(entry);
 	VM_WARN_ON(nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER);
-- 
2.17.1


end of thread, other threads:[~2025-02-18  9:07 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
2025-02-15  9:05 [PATCH v3] mm: Fix possible NULL pointer dereference in __swap_duplicate gaoxu
2025-02-16  1:42 ` Barry Song
2025-02-18  2:51   ` Reply: " gaoxu
2025-02-18  5:40     ` Barry Song
2025-02-18  7:13       ` Reply: " gaoxu
2025-02-18  9:06         ` Barry Song
  -- strict thread matches above, loose matches on Subject: below --
2025-02-15  8:46 gaoxu
