* [PATCH v2] mm: Fix possible NULL pointer dereference in __swap_duplicate
From: gaoxu @ 2025-02-15 6:52 UTC
To: Andrew Morton, linux-mm
Cc: linux-kernel, Suren Baghdasaryan, Barry Song, Yosry Ahmed, yipengxiang
Add a NULL check on the return value of swp_swap_info() in
__swap_duplicate() to prevent crashes caused by a NULL pointer dereference.
It is unclear why swp_swap_info() returns NULL; possible causes include
CPU cache issues or DDR bit flips. The issue occurs very rarely, and the
oops we captured is as follows:
Unable to handle kernel NULL pointer dereference at virtual address
0000000000000058
[RB/E]rb_sreason_str_set: sreason_str set null_pointer
Mem abort info:
ESR = 0x0000000096000005
EC = 0x25: DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
FSC = 0x05: level 1 translation fault
Data abort info:
ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
CM = 0, WnR = 0, TnD = 0, TagAccess = 0
GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
user pgtable: 4k pages, 39-bit VAs, pgdp=00000008a80e5000
[0000000000000058] pgd=0000000000000000, p4d=0000000000000000,
pud=0000000000000000
Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
Skip md ftrace buffer dump for: 0x1609e0
...
pc : swap_duplicate+0x44/0x164
lr : copy_page_range+0x508/0x1e78
sp : ffffffc0f2a699e0
x29: ffffffc0f2a699e0 x28: ffffff8a5b28d388 x27: ffffff8b06603388
x26: ffffffdf7291fe70 x25: 0000000000000006 x24: 0000000000100073
x23: 00000000002d2d2f x22: 0000000000000008 x21: 0000000000000000
x20: 00000000002d2d2f x19: 18000000002d2d2f x18: ffffffdf726faec0
x17: 0000000000000000 x16: 0010000000000001 x15: 0040000000000001
x14: 0400000000000001 x13: ff7ffffffffffb7f x12: ffeffffffffffbff
x11: ffffff8a5c7e1898 x10: 0000000000000018 x9 : 0000000000000006
x8 : 1800000000000000 x7 : 0000000000000000 x6 : ffffff8057c01f10
x5 : 000000000000a318 x4 : 0000000000000000 x3 : 0000000000000000
x2 : 0000006daf200000 x1 : 0000000000000001 x0 : 18000000002d2d2f
Call trace:
swap_duplicate+0x44/0x164
copy_page_range+0x508/0x1e78
copy_process+0x1278/0x21cc
kernel_clone+0x90/0x438
__arm64_sys_clone+0x5c/0x8c
invoke_syscall+0x58/0x110
do_el0_svc+0x8c/0xe0
el0_svc+0x38/0x9c
el0t_64_sync_handler+0x44/0xec
el0t_64_sync+0x1a8/0x1ac
Code: 9139c35a 71006f3f 54000568 f8797b55 (f9402ea8)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops: Fatal exception
SMP: stopping secondary CPUs
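The register dump and the Code: line above can be cross-checked with a small user-space sketch. It assumes the arch-independent swap entry layout of mainline include/linux/swapops.h on a 64-bit kernel (type in the bits above bit 58, i.e. BITS_PER_XA_VALUE - MAX_SWAPFILES_SHIFT = 63 - 5); the kernel that produced this oops may differ. The helper names are hypothetical and exist only for this illustration.

```c
#include <stdint.h>

/*
 * ASSUMPTION: mirrors SWP_TYPE_SHIFT in mainline include/linux/swapops.h
 * for a 64-bit kernel (63 - 5). A vendor tree may use other values.
 */
#define SWP_TYPE_SHIFT  58
#define SWP_OFFSET_MASK ((UINT64_C(1) << SWP_TYPE_SHIFT) - 1)

/* Swap type: the index into the swap_info[] array. */
static unsigned int entry_type(uint64_t val)
{
	return (unsigned int)(val >> SWP_TYPE_SHIFT);
}

/* Page offset within the swap device. */
static uint64_t entry_offset(uint64_t val)
{
	return val & SWP_OFFSET_MASK;
}

/*
 * Decode an A64 "LDR Xt, [Xn, #imm]" word (64-bit load, unsigned
 * immediate offset). Returns 0 on success, -1 for any other encoding.
 */
static int decode_ldr_x_uimm(uint32_t insn, unsigned int *rt,
			     unsigned int *rn, unsigned int *byte_off)
{
	if ((insn & 0xffc00000u) != 0xf9400000u)
		return -1;
	*rt = insn & 0x1f;			/* destination register */
	*rn = (insn >> 5) & 0x1f;		/* base register */
	*byte_off = ((insn >> 10) & 0xfff) * 8;	/* imm12, scaled by 8 */
	return 0;
}
```

Under these assumptions, the entry value 0x18000000002d2d2f (x0/x19) decodes to type 6 and offset 0x2d2d2f, which agrees with x25 and x20 in the dump. The faulting word f9402ea8 decodes to a load from [x21, #88]; x21 is 0 and 88 is 0x58, the reported fault address. That is consistent with a NULL swap_info_struct pointer being dereferenced at a field offset of 0x58.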
This patch is only a workaround, but there is no more effective software
solution for handling bit flips. It turns the failure from a system crash
into an error in the affected process, thereby reducing the impact on the
machine as a whole.
Signed-off-by: gaoxu <gaoxu2@honor.com>
Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
---
v1 -> v2:
- Add WARN_ON_ONCE.
- Update the commit message.
mm/swapfile.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 7448a3876..a0bfdba94 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3521,6 +3521,8 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
 	int err, i;
 	si = swp_swap_info(entry);
+	if (WARN_ON_ONCE(!si))
+		return -EINVAL;
 	offset = swp_offset(entry);
 	VM_WARN_ON(nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER);
--
2.17.1
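For readers outside the kernel tree, the effect of the two added lines can be modelled in user space. This is a minimal sketch, not the kernel code: the table, the struct, and the function names are hypothetical stand-ins for swap_info[], struct swap_info_struct, swp_swap_info() and __swap_duplicate(), and the one-time warning printed by WARN_ON_ONCE() is not modelled.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, deliberately small table size for this model. */
#define MODEL_MAX_SWAPFILES 4

/* Stand-in for struct swap_info_struct; only a counter is kept. */
struct swap_info_model {
	int users;
};

/* Stand-in for swap_info[]; unused slots stay NULL. */
static struct swap_info_model *swap_info_table[MODEL_MAX_SWAPFILES];

/* Stand-in for swp_swap_info(): may legitimately return NULL. */
static struct swap_info_model *lookup_swap_info(unsigned int type)
{
	if (type >= MODEL_MAX_SWAPFILES)
		return NULL;
	return swap_info_table[type];
}

/*
 * Stand-in for the patched __swap_duplicate(): check the lookup
 * result before the first dereference and report an error instead
 * of faulting.
 */
static int swap_duplicate_model(unsigned int type)
{
	struct swap_info_model *si = lookup_swap_info(type);

	if (!si)
		return -EINVAL;	/* the patch also warns once here */

	si->users++;		/* first dereference of si */
	return 0;
}
```

With the guard in place, a lookup that yields NULL produces -EINVAL for the caller to handle, whereas without it the increment would dereference a NULL pointer, which is the user-space analogue of the oops reported in the commit message.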
* Re: [PATCH v2] mm: Fix possible NULL pointer dereference in __swap_duplicate
From: Yosry Ahmed @ 2025-02-15 8:24 UTC
To: gaoxu, Andrew Morton, linux-mm
Cc: linux-kernel, Suren Baghdasaryan, Barry Song, yipengxiang
February 14, 2025 at 10:52 PM, "gaoxu" <gaoxu2@honor.com> wrote:
> [...]
> Signed-off-by: gaoxu <gaoxu2@honor.com>
>
> Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
I did not review this patch; I only made a suggestion. Please add
Reviewed-by tags only when they are explicitly given.
* Re: [PATCH v2] mm: Fix possible NULL pointer dereference in __swap_duplicate
From: gaoxu @ 2025-02-15 8:31 UTC
To: Yosry Ahmed, Andrew Morton, linux-mm
Cc: linux-kernel, Suren Baghdasaryan, Barry Song, yipengxiang
>
> February 14, 2025 at 10:52 PM, "gaoxu" <gaoxu2@honor.com> wrote:
> > [...]
> > Signed-off-by: gaoxu <gaoxu2@honor.com>
> >
> > Reviewed-by: Yosry Ahmed <yosry.ahmed@linux.dev>
>
>
> I did not review this patch; I only made a suggestion. Please add
> Reviewed-by tags only when they are explicitly given.
Sorry, I will resend the patch with the Reviewed-by tag removed.