* [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte
@ 2023-09-06 12:42 Kefeng Wang
2023-09-06 12:42 ` [PATCH -rfc 1/3] mm: kasan: shadow: add cond_resched() in kasan_populate_vmalloc_pte() Kefeng Wang
` (3 more replies)
0 siblings, 4 replies; 14+ messages in thread
From: Kefeng Wang @ 2023-09-06 12:42 UTC (permalink / raw)
To: Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
Dmitry Vyukov, Vincenzo Frascino, Andrew Morton,
Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes, kasan-dev,
linux-mm
Cc: Kefeng Wang
This is an RFC; patch 3 in particular is a hack. The series tries to fix the
soft lockup issues seen when populating or depopulating the KASAN shadow PTEs
for a large vmalloc region. Looking forward to your comments and advice, thanks.
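For reference, the lockups can be triggered with a single very large vmalloc()
under CONFIG_KASAN_VMALLOC. Below is a minimal reproducer sketch (a hypothetical
test module, not part of this series; the allocation size is only illustrative
and needs a machine with enough memory and vmalloc space):

#include <linux/init.h>
#include <linux/module.h>
#include <linux/vmalloc.h>

static void *buf;

static int __init kasan_lockup_test_init(void)
{
	/* Populate path: kasan_populate_vmalloc() walks a huge shadow range. */
	buf = vmalloc(64UL << 30);	/* 64G, size chosen only for illustration */
	return buf ? 0 : -ENOMEM;
}

static void __exit kasan_lockup_test_exit(void)
{
	/*
	 * Depopulate path: kasan_release_vmalloc() walks the same shadow
	 * range when the area is lazily purged after vfree().
	 */
	vfree(buf);
}

module_init(kasan_lockup_test_init);
module_exit(kasan_lockup_test_exit);
MODULE_LICENSE("GPL");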
Kefeng Wang (3):
mm: kasan: shadow: add cond_resched() in kasan_populate_vmalloc_pte()
mm: kasan: shadow: move free_page() out of page table lock
mm: kasan: shadow: HACK add cond_resched_lock() in
kasan_depopulate_vmalloc_pte()
include/linux/kasan.h | 9 ++++++---
mm/kasan/shadow.c | 20 +++++++++++++-------
mm/vmalloc.c | 7 ++++---
3 files changed, 23 insertions(+), 13 deletions(-)
--
2.41.0
* [PATCH -rfc 1/3] mm: kasan: shadow: add cond_resched() in kasan_populate_vmalloc_pte()
2023-09-06 12:42 [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte Kefeng Wang
@ 2023-09-06 12:42 ` Kefeng Wang
2023-09-06 12:42 ` [PATCH -rfc 2/3] mm: kasan: shadow: move free_page() out of page table lock Kefeng Wang
` (2 subsequent siblings)
3 siblings, 0 replies; 14+ messages in thread
From: Kefeng Wang @ 2023-09-06 12:42 UTC (permalink / raw)
To: Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
Dmitry Vyukov, Vincenzo Frascino, Andrew Morton,
Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes, kasan-dev,
linux-mm
Cc: Kefeng Wang
kasan_populate_vmalloc() can take a long time when populating the shadow
for a large region, which leads to a soft lockup:
watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [insmod:458]
_raw_spin_unlock_irqrestore+0x50/0xb8
rmqueue_bulk+0x434/0x6b8
get_page_from_freelist+0xdd4/0x1680
__alloc_pages+0x244/0x508
alloc_pages+0xf0/0x218
__get_free_pages+0x1c/0x50
kasan_populate_vmalloc_pte+0x30/0x188
__apply_to_page_range+0x3ec/0x650
apply_to_page_range+0x1c/0x30
kasan_populate_vmalloc+0x60/0x70
alloc_vmap_area.part.67+0x328/0xe50
alloc_vmap_area+0x4c/0x78
__get_vm_area_node.constprop.76+0x130/0x240
__vmalloc_node_range+0x12c/0x340
__vmalloc_node+0x8c/0xb0
vmalloc+0x2c/0x40
Fix it by adding a cond_resched().
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
mm/kasan/shadow.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index dd772f9d0f08..fd15e38ff80e 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -317,6 +317,8 @@ static int kasan_populate_vmalloc_pte(pte_t *ptep, unsigned long addr,
unsigned long page;
pte_t pte;
+ cond_resched();
+
if (likely(!pte_none(ptep_get(ptep))))
return 0;
--
2.41.0
* [PATCH -rfc 2/3] mm: kasan: shadow: move free_page() out of page table lock
2023-09-06 12:42 [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte Kefeng Wang
2023-09-06 12:42 ` [PATCH -rfc 1/3] mm: kasan: shadow: add cond_resched() in kasan_populate_vmalloc_pte() Kefeng Wang
@ 2023-09-06 12:42 ` Kefeng Wang
2023-09-06 12:42 ` [PATCH -rfc 3/3] mm: kasan: shadow: HACK: add cond_resched_lock() in kasan_depopulate_vmalloc_pte() Kefeng Wang
2023-09-15 0:58 ` [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte Kefeng Wang
3 siblings, 0 replies; 14+ messages in thread
From: Kefeng Wang @ 2023-09-06 12:42 UTC (permalink / raw)
To: Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
Dmitry Vyukov, Vincenzo Frascino, Andrew Morton,
Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes, kasan-dev,
linux-mm
Cc: Kefeng Wang
free_page() does not need to be protected by the page table lock, and it
can take a while, so move it out of the section covered by the spinlock.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
mm/kasan/shadow.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index fd15e38ff80e..d7d6724da2e0 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -423,12 +423,13 @@ static int kasan_depopulate_vmalloc_pte(pte_t *ptep, unsigned long addr,
page = (unsigned long)__va(pte_pfn(ptep_get(ptep)) << PAGE_SHIFT);
spin_lock(&init_mm.page_table_lock);
-
- if (likely(!pte_none(ptep_get(ptep)))) {
+ if (likely(!pte_none(ptep_get(ptep))))
pte_clear(&init_mm, addr, ptep);
- free_page(page);
- }
+ else
+ page = 0;
spin_unlock(&init_mm.page_table_lock);
+ if (page)
+ free_page(page);
return 0;
}
--
2.41.0
* [PATCH -rfc 3/3] mm: kasan: shadow: HACK: add cond_resched_lock() in kasan_depopulate_vmalloc_pte()
2023-09-06 12:42 [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte Kefeng Wang
2023-09-06 12:42 ` [PATCH -rfc 1/3] mm: kasan: shadow: add cond_resched() in kasan_populate_vmalloc_pte() Kefeng Wang
2023-09-06 12:42 ` [PATCH -rfc 2/3] mm: kasan: shadow: move free_page() out of page table lock Kefeng Wang
@ 2023-09-06 12:42 ` Kefeng Wang
2023-09-13 8:48 ` kernel test robot
2023-09-15 0:58 ` [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte Kefeng Wang
3 siblings, 1 reply; 14+ messages in thread
From: Kefeng Wang @ 2023-09-06 12:42 UTC (permalink / raw)
To: Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
Dmitry Vyukov, Vincenzo Frascino, Andrew Morton,
Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes, kasan-dev,
linux-mm
Cc: Kefeng Wang
There is a similar soft lockup issue in kasan_release_vmalloc() when releasing
the shadow for a large region:
watchdog: BUG: soft lockup - CPU#6 stuck for 48s! [kworker/6:1:59]
_raw_spin_unlock_irqrestore+0x50/0xb8
free_pcppages_bulk+0x2bc/0x3e0
free_unref_page_commit+0x1fc/0x290
free_unref_page+0x184/0x250
__free_pages+0x154/0x1a0
free_pages+0x88/0xb0
kasan_depopulate_vmalloc_pte+0x58/0x80
__apply_to_page_range+0x3ec/0x650
apply_to_existing_page_range+0x1c/0x30
kasan_release_vmalloc+0xa4/0x118
__purge_vmap_area_lazy+0x4f4/0xe30
drain_vmap_area_work+0x60/0xc0
process_one_work+0x4cc/0xa38
worker_thread+0x240/0x638
kthread+0x1c8/0x1e0
ret_from_fork+0x10/0x20
It could be worked around by adding a cond_resched_lock(), but see the comment
above kasan_release_vmalloc(): the caller holds free_vmap_area_lock to protect
against concurrent updates of the free vmap areas, so briefly dropping it from
inside the PTE callback looks risky. Any advice on how to fix this properly?
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
include/linux/kasan.h | 9 ++++++---
mm/kasan/shadow.c | 9 ++++++---
mm/vmalloc.c | 7 ++++---
3 files changed, 16 insertions(+), 9 deletions(-)
diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index 3df5499f7936..6d85715c47ad 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -385,7 +385,8 @@ void kasan_populate_early_vm_area_shadow(void *start, unsigned long size);
int kasan_populate_vmalloc(unsigned long addr, unsigned long size);
void kasan_release_vmalloc(unsigned long start, unsigned long end,
unsigned long free_region_start,
- unsigned long free_region_end);
+ unsigned long free_region_end,
+ void *lock);
#else /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */
@@ -400,7 +401,8 @@ static inline int kasan_populate_vmalloc(unsigned long start,
static inline void kasan_release_vmalloc(unsigned long start,
unsigned long end,
unsigned long free_region_start,
- unsigned long free_region_end) { }
+ unsigned long free_region_end,
+ void *lock) { }
#endif /* CONFIG_KASAN_GENERIC || CONFIG_KASAN_SW_TAGS */
@@ -435,7 +437,8 @@ static inline int kasan_populate_vmalloc(unsigned long start,
static inline void kasan_release_vmalloc(unsigned long start,
unsigned long end,
unsigned long free_region_start,
- unsigned long free_region_end) { }
+ unsigned long free_region_end,
+ void *lock) { }
static inline void *kasan_unpoison_vmalloc(const void *start,
unsigned long size,
diff --git a/mm/kasan/shadow.c b/mm/kasan/shadow.c
index d7d6724da2e0..4bce98e2b30d 100644
--- a/mm/kasan/shadow.c
+++ b/mm/kasan/shadow.c
@@ -416,12 +416,14 @@ int kasan_populate_vmalloc(unsigned long addr, unsigned long size)
}
static int kasan_depopulate_vmalloc_pte(pte_t *ptep, unsigned long addr,
- void *unused)
+ void *lock)
{
unsigned long page;
page = (unsigned long)__va(pte_pfn(ptep_get(ptep)) << PAGE_SHIFT);
+ cond_resched_lock(lock);
+
spin_lock(&init_mm.page_table_lock);
if (likely(!pte_none(ptep_get(ptep))))
pte_clear(&init_mm, addr, ptep);
@@ -511,7 +513,8 @@ static int kasan_depopulate_vmalloc_pte(pte_t *ptep, unsigned long addr,
*/
void kasan_release_vmalloc(unsigned long start, unsigned long end,
unsigned long free_region_start,
- unsigned long free_region_end)
+ unsigned long free_region_end,
+ void *lock)
{
void *shadow_start, *shadow_end;
unsigned long region_start, region_end;
@@ -547,7 +550,7 @@ void kasan_release_vmalloc(unsigned long start, unsigned long end,
apply_to_existing_page_range(&init_mm,
(unsigned long)shadow_start,
size, kasan_depopulate_vmalloc_pte,
- NULL);
+ lock);
flush_tlb_kernel_range((unsigned long)shadow_start,
(unsigned long)shadow_end);
}
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 228a4a5312f2..c40ea7d1b65e 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1768,7 +1768,8 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
if (is_vmalloc_or_module_addr((void *)orig_start))
kasan_release_vmalloc(orig_start, orig_end,
- va->va_start, va->va_end);
+ va->va_start, va->va_end,
+ &free_vmap_area_lock);
atomic_long_sub(nr, &vmap_lazy_nr);
num_purged_areas++;
@@ -4198,7 +4199,7 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
&free_vmap_area_list);
if (va)
kasan_release_vmalloc(orig_start, orig_end,
- va->va_start, va->va_end);
+ va->va_start, va->va_end, NULL);
vas[area] = NULL;
}
@@ -4248,7 +4249,7 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
&free_vmap_area_list);
if (va)
kasan_release_vmalloc(orig_start, orig_end,
- va->va_start, va->va_end);
+ va->va_start, va->va_end, &free_vmap_area_lock);
vas[area] = NULL;
kfree(vms[area]);
}
--
2.41.0
* Re: [PATCH -rfc 3/3] mm: kasan: shadow: HACK: add cond_resched_lock() in kasan_depopulate_vmalloc_pte()
2023-09-06 12:42 ` [PATCH -rfc 3/3] mm: kasan: shadow: HACK: add cond_resched_lock() in kasan_depopulate_vmalloc_pte() Kefeng Wang
@ 2023-09-13 8:48 ` kernel test robot
2023-09-13 11:21 ` Kefeng Wang
0 siblings, 1 reply; 14+ messages in thread
From: kernel test robot @ 2023-09-13 8:48 UTC (permalink / raw)
To: Kefeng Wang
Cc: oe-lkp, lkp, kasan-dev, linux-mm, Andrey Ryabinin,
Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
Vincenzo Frascino, Andrew Morton, Uladzislau Rezki,
Christoph Hellwig, Lorenzo Stoakes, Kefeng Wang, oliver.sang
hi, Kefeng Wang,
we don't have enough knowledge to connect below random issues with your change,
however, by running up to 300 times, we observed the parent keeps clean.
so make out this report FYI.
if you need more tests, please let us know. Thanks.
cb588b24f0fcf515 eaf065b089545219e27e529e3d6
---------------- ---------------------------
fail:runs %reproduction fail:runs
| | |
:300 6% 17:300 dmesg.BUG:#DF_stack_guard_page_was_hit_at#(stack_is#..#)
:300 0% 1:300 dmesg.BUG:#DF_stack_guard_page_was_hit_at(____ptrval____)(stack_is(____ptrval____)..(____ptrval____))
:300 6% 18:300 dmesg.BUG:KASAN:stack-out-of-bounds_in_vsnprintf
:300 6% 17:300 dmesg.BUG:TASK_stack_guard_page_was_hit_at#(stack_is#..#)
:300 0% 1:300 dmesg.BUG:TASK_stack_guard_page_was_hit_at(____ptrval____)(stack_is(____ptrval____)..(____ptrval____))
:300 9% 28:300 dmesg.BUG:unable_to_handle_page_fault_for_address
:300 3% 8:300 dmesg.Kernel_panic-not_syncing:Fatal_exception
:300 7% 20:300 dmesg.Kernel_panic-not_syncing:Fatal_exception_in_interrupt
:300 3% 10:300 dmesg.Oops:#[##]
:300 6% 19:300 dmesg.RIP:__sanitizer_cov_trace_pc
:300 5% 14:300 dmesg.RIP:exc_page_fault
:300 6% 18:300 dmesg.WARNING:kernel_stack
:300 6% 18:300 dmesg.WARNING:stack_recursion
:300 6% 18:300 dmesg.stack_guard_page:#[##]
Hello,
kernel test robot noticed "BUG:TASK_stack_guard_page_was_hit_at#(stack_is#..#)" on:
commit: eaf065b089545219e27e529e3d6deac4c0bad525 ("[PATCH -rfc 3/3] mm: kasan: shadow: HACK: add cond_resched_lock() in kasan_depopulate_vmalloc_pte()")
url: https://github.com/intel-lab-lkp/linux/commits/Kefeng-Wang/mm-kasan-shadow-add-cond_resched-in-kasan_populate_vmalloc_pte/20230906-205407
base: https://git.kernel.org/cgit/linux/kernel/git/akpm/mm.git mm-everything
patch link: https://lore.kernel.org/all/20230906124234.134200-4-wangkefeng.wang@huawei.com/
patch subject: [PATCH -rfc 3/3] mm: kasan: shadow: HACK: add cond_resched_lock() in kasan_depopulate_vmalloc_pte()
in testcase: rcuscale
version:
with following parameters:
runtime: 300s
scale_type: srcud
compiler: gcc-9
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
(please refer to attached dmesg/kmsg for entire log/backtrace)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202309131652.3e9c0f06-oliver.sang@intel.com
[ 114.366291][ C1] BUG: TASK stack guard page was hit at 00000000d230e938 (stack is 000000004315c7ed..00000000e1c06e40)
[ 114.366312][ C1] stack guard page: 0000 [#1] SMP KASAN
[ 114.366324][ C1] CPU: 1 PID: 400 Comm: systemd-journal Tainted: G W N 6.5.0-11778-geaf065b08954 #1
[ 114.366338][ C1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 114.366345][ C1] RIP: 0010:exc_page_fault (arch/x86/mm/fault.c:1518)
[ 114.366365][ C1] Code: 89 ee e8 74 ca 7c fe 0f 1f 44 00 00 90 44 89 f6 4c 89 e7 e8 7d 0b 00 00 41 5c 41 5d 41 5e 5d c3 66 0f 1f 00 55 48 89 e5 41 57 <41> 56 41 55 49 89 f5 41 54 49 89 fc 0f 1f 44 00 00 41 0f 20 d6 65
All code
========
0: 89 ee mov %ebp,%esi
2: e8 74 ca 7c fe callq 0xfffffffffe7cca7b
7: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
c: 90 nop
d: 44 89 f6 mov %r14d,%esi
10: 4c 89 e7 mov %r12,%rdi
13: e8 7d 0b 00 00 callq 0xb95
18: 41 5c pop %r12
1a: 41 5d pop %r13
1c: 41 5e pop %r14
1e: 5d pop %rbp
1f: c3 retq
20: 66 0f 1f 00 nopw (%rax)
24: 55 push %rbp
25: 48 89 e5 mov %rsp,%rbp
28: 41 57 push %r15
2a:* 41 56 push %r14 <-- trapping instruction
2c: 41 55 push %r13
2e: 49 89 f5 mov %rsi,%r13
31: 41 54 push %r12
33: 49 89 fc mov %rdi,%r12
36: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
3b: 41 0f 20 d6 mov %cr2,%r14
3f: 65 gs
Code starting with the faulting instruction
===========================================
0: 41 56 push %r14
2: 41 55 push %r13
4: 49 89 f5 mov %rsi,%r13
7: 41 54 push %r12
9: 49 89 fc mov %rdi,%r12
c: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
11: 41 0f 20 d6 mov %cr2,%r14
15: 65 gs
[ 114.366375][ C1] RSP: 0000:ffffc90001388000 EFLAGS: 00210087
[ 114.366386][ C1] RAX: ffffc90001388018 RBX: 0000000000000000 RCX: ffffffff84801717
[ 114.366394][ C1] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffc90001388018
[ 114.366401][ C1] RBP: ffffc90001388008 R08: 0000000000000000 R09: 0000000000000000
[ 114.366409][ C1] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
[ 114.366416][ C1] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 114.366423][ C1] FS: 0000000000000000(0000) GS:ffff8883af500000(0063) knlGS:00000000f516bb40
[ 114.366433][ C1] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[ 114.366441][ C1] CR2: ffffc90001387ff8 CR3: 00000001bcfc9000 CR4: 00000000000406a0
[ 114.366451][ C1] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 114.366459][ C1] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 114.366466][ C1] Call Trace:
[ 114.366473][ C1] BUG: unable to handle page fault for address: fffff52000271002
[ 114.366479][ C1] #PF: supervisor read access in kernel mode
[ 114.366485][ C1] #PF: error_code(0x0000) - not-present page
[ 114.366491][ C1] PGD 417fdf067 P4D 417fdf067 PUD 1009ad067 PMD 14692d067 PTE 0
[ 114.366513][ C1] BUG: unable to handle page fault for address: fffff52000271000
[ 114.366518][ C1] #PF: supervisor read access in kernel mode
[ 114.366524][ C1] #PF: error_code(0x0000) - not-present page
[ 114.366529][ C1] PGD 417fdf067 P4D 417fdf067 PUD 1009ad067 PMD 14692d067 PTE 0
[ 114.366549][ C1] BUG: unable to handle page fault for address: fffff52000271000
[ 114.366554][ C1] #PF: supervisor read access in kernel mode
[ 114.366559][ C1] #PF: error_code(0x0000) - not-present page
[ 114.366565][ C1] PGD 417fdf067 P4D 417fdf067 PUD 1009ad067 PMD 14692d067 PTE 0
[ 114.366584][ C1] BUG: unable to handle page fault for address: fffff52000271000
[ 114.366589][ C1] #PF: supervisor read access in kernel mode
[ 114.366595][ C1] #PF: error_code(0x0000) - not-present page
[ 114.366600][ C1] PGD 417fdf067 P4D 417fdf067 PUD 1009ad067 PMD 14692d067 PTE 0
[ 114.366620][ C1] BUG: unable to handle page fault for address: fffff52000271000
[ 114.366625][ C1] #PF: supervisor read access in kernel mode
[ 114.366630][ C1] #PF: error_code(0x0000) - not-present page
[ 114.366635][ C1] PGD 417fdf067 P4D 417fdf067 PUD 1009ad067 PMD 14692d067 PTE 0
[ 114.366655][ C1] BUG: unable to handle page fault for address: fffff52000271000
[ 114.366660][ C1] #PF: supervisor read access in kernel mode
[ 114.366666][ C1] #PF: error_code(0x0000) - not-present page
[ 114.366671][ C1] PGD 417fdf067 P4D 417fdf067 PUD 1009ad067 PMD 14692d067 PTE 0
[ 114.366691][ C1] BUG: unable to handle page fault for address: fffff52000271000
[ 114.366695][ C1] #PF: supervisor read access in kernel mode
[ 114.366701][ C1] #PF: error_code(0x0000) - not-present page
[ 114.366706][ C1] PGD 417fdf067 P4D 417fdf067 PUD 1009ad067 PMD 14692d067 PTE 0
[ 114.366726][ C1] BUG: unable to handle page fault for address: fffff52000271000
[ 114.366731][ C1] #PF: supervisor read access in kernel mode
[ 114.366736][ C1] #PF: error_code(0x0000) - not-present page
[ 114.366741][ C1] PGD 417fdf067 P4D 417fdf067 PUD 1009ad067 PMD 14692d067 PTE 0
[ 114.366761][ C1] BUG: unable to handle page fault for address: fffff52000271000
[ 114.366766][ C1] #PF: supervisor read access in kernel mode
[ 114.366771][ C1] #PF: error_code(0x0000) - not-present page
[ 114.366776][ C1] PGD 417fdf067 P4D 417fdf067 PUD 1009ad067 PMD 14692d067 PTE 0
[ 114.366796][ C1] BUG: unable to handle page fault for address: fffff52000271000
[ 114.366801][ C1] #PF: supervisor read access in kernel mode
[ 114.366807][ C1] #PF: error_code(0x0000) - not-present page
[ 114.366811][ C1] PGD 417fdf067 P4D 417fdf067 PUD 1009ad067 PMD 14692d067 PTE 0
[ 114.366831][ C1] BUG: unable to handle page fault for address: fffff52000271000
[ 114.366836][ C1] #PF: supervisor read access in kernel mode
[ 114.366842][ C1] #PF: error_code(0x0000) - not-present page
[ 114.366847][ C1] PGD 417fdf067 P4D 417fdf067 PUD 1009ad067 PMD 14692d067 PTE 0
[ 114.366866][ C1] BUG: unable to handle page fault for address: fffff52000271000
[ 114.366871][ C1] #PF: supervisor read access in kernel mode
[ 114.366877][ C1] #PF: error_code(0x0000) - not-present page
[ 114.366882][ C1] PGD 417fdf067 P4D 417fdf067 PUD 1009ad067 PMD 14692d067 PTE 0
[ 114.366902][ C1] BUG: unable to handle page fault for address: fffff52000271000
[ 114.366907][ C1] #PF: supervisor read access in kernel mode
[ 114.366912][ C1] #PF: error_code(0x0000) - not-present page
[ 114.366917][ C1] PGD 417fdf067 P4D 417fdf067 PUD 1009ad067 PMD 14692d067 PTE 0
[ 114.366932][ C1] BUG: unable to handle page fault for address: fffff52000271000
[ 114.366937][ C1] #PF: supervisor read access in kernel mode
[ 114.366942][ C1] #PF: error_code(0x0000) - not-present page
[ 114.366947][ C1] PGD 417fdf067 P4D 417fdf067 PUD 1009ad067 PMD 14692d067 PTE 0
[ 114.366966][ C1] BUG: unable to handle page fault for address: fffff52000271000
[ 114.366971][ C1] #PF: supervisor read access in kernel mode
[ 114.366976][ C1] #PF: error_code(0x0000) - not-present page
[ 114.366981][ C1] PGD 417fdf067 P4D 417fdf067 PUD 1009ad067 PMD 14692d067 PTE 0
[ 114.367001][ C1] BUG: unable to handle page fault for address: fffff52000271000
[ 114.367006][ C1] #PF: supervisor read access in kernel mode
[ 114.367012][ C1] #PF: error_code(0x0000) - not-present page
[ 114.367016][ C1] PGD 417fdf067 P4D 417fdf067 PUD 1009ad067 PMD 14692d067 PTE 0
[ 114.367036][ C1] BUG: unable to handle page fault for address: fffff52000271000
[ 114.367042][ C1] #PF: supervisor read access in kernel mode
[ 114.367047][ C1] #PF: error_code(0x0000) - not-present page
[ 114.367052][ C1] PGD 417fdf067 P4D 417fdf067 PUD 1009ad067 PMD 14692d067 PTE 0
[ 114.367075][ C1] BUG: #DF stack guard page was hit at 0000000071957a17 (stack is 00000000d15a2314..00000000d7ec09e2)
[ 114.367086][ C1] stack guard page: 0000 [#2] SMP KASAN
[ 114.367095][ C1] CPU: 1 PID: 400 Comm: systemd-journal Tainted: G W N 6.5.0-11778-geaf065b08954 #1
[ 114.367107][ C1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 114.367121][ C1] ==================================================================
[ 114.367125][ C1] BUG: KASAN: stack-out-of-bounds in vsnprintf (lib/vsprintf.c:2851)
[ 114.367141][ C1] Read of size 8 at addr fffffe39ea66b3c0 by task systemd-journal/400
[ 114.367150][ C1]
[ 114.367153][ C1] CPU: 1 PID: 400 Comm: systemd-journal Tainted: G W N 6.5.0-11778-geaf065b08954 #1
[ 114.367165][ C1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 114.367172][ C1] Call Trace:
[ 114.367176][ C1] <#DF>
[ 114.367181][ C1] dump_stack_lvl (lib/dump_stack.c:107 (discriminator 4))
[ 114.367197][ C1] print_address_description+0x7d/0x2ee
[ 114.367219][ C1] print_report (mm/kasan/report.c:476)
[ 114.367234][ C1] ? vsnprintf (lib/vsprintf.c:2851)
[ 114.367248][ C1] ? kasan_addr_to_slab (mm/kasan/common.c:35)
[ 114.367265][ C1] ? vsnprintf (lib/vsprintf.c:2851)
[ 114.367278][ C1] kasan_report (mm/kasan/report.c:590)
[ 114.367293][ C1] ? format_decode (lib/vsprintf.c:2526)
[ 114.367308][ C1] ? vsnprintf (lib/vsprintf.c:2851)
[ 114.367327][ C1] __asan_report_load8_noabort (mm/kasan/report_generic.c:381)
[ 114.367346][ C1] vsnprintf (lib/vsprintf.c:2851)
[ 114.367365][ C1] ? pointer (lib/vsprintf.c:2749)
[ 114.367384][ C1] sprintf (lib/vsprintf.c:3017)
[ 114.367399][ C1] ? snprintf (lib/vsprintf.c:3017)
[ 114.367411][ C1] ? kallsyms_sym_address (kernel/kallsyms.c:164)
[ 114.367426][ C1] ? kallsyms_expand_symbol+0x1f1/0x231
[ 114.367443][ C1] ? __sanitizer_cov_trace_pc (kernel/kcov.c:200)
[ 114.367460][ C1] ? kallsyms_lookup_buildid (kernel/kallsyms.c:437)
[ 114.367476][ C1] __sprint_symbol+0x15b/0x1ec
[ 114.367491][ C1] ? kallsyms_lookup_buildid (kernel/kallsyms.c:482)
[ 114.367504][ C1] ? page_fault_oops (arch/x86/mm/fault.c:699)
[ 114.367516][ C1] ? fixup_exception (arch/x86/mm/extable.c:305)
[ 114.367550][ C1] ? kernelmode_fixup_or_oops (arch/x86/mm/fault.c:761)
[ 114.367566][ C1] ? __bad_area_nosemaphore (arch/x86/mm/fault.c:819)
[ 114.367579][ C1] ? __sanitizer_cov_trace_pc (kernel/kcov.c:200)
[ 114.367597][ C1] sprint_symbol (kernel/kallsyms.c:536)
[ 114.367609][ C1] ? __sanitizer_cov_trace_pc (kernel/kcov.c:200)
[ 114.367625][ C1] symbol_string (lib/vsprintf.c:1001)
[ 114.367639][ C1] ? ip4_addr_string (lib/vsprintf.c:983)
[ 114.367656][ C1] ? asm_exc_page_fault (arch/x86/include/asm/idtentry.h:570)
[ 114.367677][ C1] ? page_fault_oops (include/linux/sched/task_stack.h:31 arch/x86/mm/fault.c:699)
[ 114.367689][ C1] ? page_fault_oops (arch/x86/mm/fault.c:699)
[ 114.367706][ C1] ? dump_pagetable (arch/x86/mm/fault.c:635)
[ 114.367718][ C1] ? search_extable (lib/extable.c:115)
[ 114.367731][ C1] ? is_prefetch+0x36f/0x3b4
[ 114.367745][ C1] ? spurious_kernel_fault_check (arch/x86/mm/fault.c:122)
[ 114.367758][ C1] ? search_module_extables (arch/x86/include/asm/preempt.h:85 kernel/module/main.c:3236)
[ 114.367775][ C1] ? widen_string (lib/vsprintf.c:618)
[ 114.367792][ C1] ? widen_string (lib/vsprintf.c:618)
[ 114.367805][ C1] ? set_precision (lib/vsprintf.c:618)
[ 114.367824][ C1] ? string_nocheck (lib/vsprintf.c:640)
[ 114.367838][ C1] ? number (lib/vsprintf.c:573)
[ 114.367854][ C1] ? __sanitizer_cov_trace_pc (kernel/kcov.c:200)
[ 114.367872][ C1] pointer (lib/vsprintf.c:2416)
[ 114.367887][ C1] ? va_format+0x1a1/0x1a1
[ 114.367900][ C1] ? hex_string (lib/vsprintf.c:723)
[ 114.367919][ C1] vsnprintf (lib/vsprintf.c:2822)
[ 114.367937][ C1] ? pointer (lib/vsprintf.c:2749)
[ 114.367952][ C1] ? kvm_sched_clock_read (arch/x86/kernel/kvmclock.c:91)
[ 114.367966][ C1] ? sched_clock_noinstr (arch/x86/kernel/tsc.c:267)
[ 114.367982][ C1] vprintk_store (kernel/printk/printk.c:2193)
[ 114.367996][ C1] ? __kasan_check_write (mm/kasan/shadow.c:38)
[ 114.368011][ C1] ? printk_sprint (kernel/printk/printk.c:2158)
[ 114.368028][ C1] ? printk_sprint (kernel/printk/printk.c:2158)
[ 114.368057][ C1] vprintk_emit (kernel/printk/printk.c:2290)
[ 114.368074][ C1] vprintk_deferred (kernel/printk/printk.c:3911)
[ 114.368089][ C1] vprintk (kernel/printk/printk_safe.c:42)
[ 114.368104][ C1] _printk (kernel/printk/printk.c:2329)
[ 114.368116][ C1] ? syslog_print (kernel/printk/printk.c:2329)
[ 114.368127][ C1] ? vprintk (kernel/printk/printk_safe.c:46)
[ 114.368143][ C1] ? syslog_print (kernel/printk/printk.c:2329)
[ 114.368157][ C1] ? __sanitizer_cov_trace_pc (kernel/kcov.c:200)
[ 114.368175][ C1] show_ip (arch/x86/kernel/dumpstack.c:144)
[ 114.368188][ C1] show_iret_regs (arch/x86/kernel/dumpstack.c:150)
[ 114.368200][ C1] __show_regs (arch/x86/kernel/process_64.c:77)
[ 114.368214][ C1] ? dump_stack_print_info (lib/dump_stack.c:71)
[ 114.368231][ C1] show_regs (arch/x86/kernel/dumpstack.c:477)
[ 114.368243][ C1] __die_body (arch/x86/kernel/dumpstack.c:421)
[ 114.368256][ C1] __die (arch/x86/kernel/dumpstack.c:435)
[ 114.368268][ C1] die (arch/x86/kernel/dumpstack.c:448)
[ 114.368280][ C1] handle_stack_overflow (arch/x86/kernel/traps.c:327)
[ 114.368298][ C1] exc_double_fault (arch/x86/kernel/traps.c:464)
[ 114.368315][ C1] asm_exc_double_fault (arch/x86/include/asm/idtentry.h:611)
[ 114.368329][ C1] RIP: 0010:__sanitizer_cov_trace_pc (kernel/kcov.c:200)
[ 114.368347][ C1] Code: 00 00 48 c1 e6 38 48 21 fe 74 12 b8 01 00 00 00 48 c1 e0 38 48 39 c6 b0 00 0f 44 c2 c3 85 ff 0f 44 c1 c3 31 c0 c3 f3 0f 1e fa <55> 65 8b 05 6e 52 f0 7c 89 c1 48 89 e5 81 e1 00 01 00 00 48 8b 75
All code
========
0: 00 00 add %al,(%rax)
2: 48 c1 e6 38 shl $0x38,%rsi
6: 48 21 fe and %rdi,%rsi
9: 74 12 je 0x1d
b: b8 01 00 00 00 mov $0x1,%eax
10: 48 c1 e0 38 shl $0x38,%rax
14: 48 39 c6 cmp %rax,%rsi
17: b0 00 mov $0x0,%al
19: 0f 44 c2 cmove %edx,%eax
1c: c3 retq
1d: 85 ff test %edi,%edi
1f: 0f 44 c1 cmove %ecx,%eax
22: c3 retq
23: 31 c0 xor %eax,%eax
25: c3 retq
26: f3 0f 1e fa endbr64
2a:* 55 push %rbp <-- trapping instruction
2b: 65 8b 05 6e 52 f0 7c mov %gs:0x7cf0526e(%rip),%eax # 0x7cf052a0
32: 89 c1 mov %eax,%ecx
34: 48 89 e5 mov %rsp,%rbp
37: 81 e1 00 01 00 00 and $0x100,%ecx
3d: 48 rex.W
3e: 8b .byte 0x8b
3f: 75 .byte 0x75
Code starting with the faulting instruction
===========================================
0: 55 push %rbp
1: 65 8b 05 6e 52 f0 7c mov %gs:0x7cf0526e(%rip),%eax # 0x7cf05276
8: 89 c1 mov %eax,%ecx
a: 48 89 e5 mov %rsp,%rbp
d: 81 e1 00 01 00 00 and $0x100,%ecx
13: 48 rex.W
14: 8b .byte 0x8b
15: 75 .byte 0x75
The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20230913/202309131652.3e9c0f06-oliver.sang@intel.com
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
* Re: [PATCH -rfc 3/3] mm: kasan: shadow: HACK: add cond_resched_lock() in kasan_depopulate_vmalloc_pte()
2023-09-13 8:48 ` kernel test robot
@ 2023-09-13 11:21 ` Kefeng Wang
0 siblings, 0 replies; 14+ messages in thread
From: Kefeng Wang @ 2023-09-13 11:21 UTC (permalink / raw)
To: kernel test robot
Cc: oe-lkp, lkp, kasan-dev, linux-mm, Andrey Ryabinin,
Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
Vincenzo Frascino, Andrew Morton, Uladzislau Rezki,
Christoph Hellwig, Lorenzo Stoakes
Hi, thanks for your test. As the commit log of this patch says, it is a hack
and I don't think it is correct; I hope the kasan maintainers can give some
advice on the soft lockup issue when populating/depopulating the shadow PTEs.
On 2023/9/13 16:48, kernel test robot wrote:
>
> hi, Kefeng Wang,
>
> we don't have enough knowledge to connect below random issues with your change,
> however, by running up to 300 times, we observed the parent keeps clean.
> so make out this report FYI.
> if you need more tests, please let us know. Thanks.
* Re: [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte
2023-09-06 12:42 [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte Kefeng Wang
` (2 preceding siblings ...)
2023-09-06 12:42 ` [PATCH -rfc 3/3] mm: kasan: shadow: HACK: add cond_resched_lock() in kasan_depopulate_vmalloc_pte() Kefeng Wang
@ 2023-09-15 0:58 ` Kefeng Wang
2023-10-18 14:16 ` Kefeng Wang
3 siblings, 1 reply; 14+ messages in thread
From: Kefeng Wang @ 2023-09-15 0:58 UTC (permalink / raw)
To: Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
Dmitry Vyukov, Vincenzo Frascino, Andrew Morton,
Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes, kasan-dev,
linux-mm
Hi all, any suggestions or comments? Many thanks.
On 2023/9/6 20:42, Kefeng Wang wrote:
> This is an RFC; patch 3 in particular is a hack. The series tries to fix the
> soft lockup issues seen when populating or depopulating the KASAN shadow PTEs
> for a large vmalloc region. Looking forward to your comments and advice, thanks.
Here are the full stacks. For the populate path:
[ C3] watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [insmod:458]
[ C3] Modules linked in: test(OE+)
[ C3] irq event stamp: 320776
[ C3] hardirqs last enabled at (320775): [<ffff8000815a0c98>]
_raw_spin_unlock_irqrestore+0x98/0xb8
[ C3] hardirqs last disabled at (320776): [<ffff8000815816e0>]
el1_interrupt+0x38/0xa8
[ C3] softirqs last enabled at (318174): [<ffff800080040ba8>]
__do_softirq+0x658/0x7ac
[ C3] softirqs last disabled at (318169): [<ffff800080047fd8>]
____do_softirq+0x18/0x30
[ C3] CPU: 3 PID: 458 Comm: insmod Tainted: G OE
6.5.0+ #595
[ C3] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
[ C3] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ C3] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
[ C3] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
[ C3] sp : ffff800093386d70
[ C3] x29: ffff800093386d70 x28: 0000000000000801 x27: ffff0007ffffa9c0
[ C3] x26: 0000000000000000 x25: 000000000000003f x24: fffffc0004353708
[ C3] x23: ffff0006d476bad8 x22: fffffc0004353748 x21: 0000000000000000
[ C3] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
[ C3] x17: ffff80008024e7fc x16: ffff80008055a8f0 x15: ffff80008024ec60
[ C3] x14: ffff80008024ead0 x13: ffff80008024e7fc x12: ffff6000fffff5f9
[ C3] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
[ C3] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
[ C3] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700012670d70
[ C3] x2 : 0000000000000001 x1 : c9a5dbfae610fa24 x0 : 000000000004e507
[ C3] Call trace:
[ C3] _raw_spin_unlock_irqrestore+0x50/0xb8
[ C3] rmqueue_bulk+0x434/0x6b8
[ C3] get_page_from_freelist+0xdd4/0x1680
[ C3] __alloc_pages+0x244/0x508
[ C3] alloc_pages+0xf0/0x218
[ C3] __get_free_pages+0x1c/0x50
[ C3] kasan_populate_vmalloc_pte+0x30/0x188
[ C3] __apply_to_page_range+0x3ec/0x650
[ C3] apply_to_page_range+0x1c/0x30
[ C3] kasan_populate_vmalloc+0x60/0x70
[ C3] alloc_vmap_area.part.67+0x328/0xe50
[ C3] alloc_vmap_area+0x4c/0x78
[ C3] __get_vm_area_node.constprop.76+0x130/0x240
[ C3] __vmalloc_node_range+0x12c/0x340
[ C3] __vmalloc_node+0x8c/0xb0
[ C3] vmalloc+0x2c/0x40
[ C3] show_mem_init+0x1c/0xff8 [test]
[ C3] do_one_initcall+0xe4/0x500
[ C3] do_init_module+0x100/0x358
[ C3] load_module+0x2e64/0x2fc8
[ C3] init_module_from_file+0xec/0x148
[ C3] idempotent_init_module+0x278/0x380
[ C3] __arm64_sys_finit_module+0x88/0xf8
[ C3] invoke_syscall+0x64/0x188
[ C3] el0_svc_common.constprop.1+0xec/0x198
[ C3] do_el0_svc+0x48/0xc8
[ C3] el0_svc+0x3c/0xe8
[ C3] el0t_64_sync_handler+0xa0/0xc8
[ C3] el0t_64_sync+0x188/0x190
and for the depopulate path:
[ C6] watchdog: BUG: soft lockup - CPU#6 stuck for 48s! [kworker/6:1:59]
[ C6] Modules linked in: test(OE+)
[ C6] irq event stamp: 39458
[ C6] hardirqs last enabled at (39457): [<ffff8000815a0c98>]
_raw_spin_unlock_irqrestore+0x98/0xb8
[ C6] hardirqs last disabled at (39458): [<ffff8000815816e0>]
el1_interrupt+0x38/0xa8
[ C6] softirqs last enabled at (39420): [<ffff800080040ba8>]
__do_softirq+0x658/0x7ac
[ C6] softirqs last disabled at (39415): [<ffff800080047fd8>]
____do_softirq+0x18/0x30
[ C6] CPU: 6 PID: 59 Comm: kworker/6:1 Tainted: G OEL
6.5.0+ #595
[ C6] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
[ C6] Workqueue: events drain_vmap_area_work
[ C6] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ C6] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
[ C6] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
[ C6] sp : ffff80008fe676b0
[ C6] x29: ffff80008fe676b0 x28: fffffc000601d310 x27: ffff000edf5dfa80
[ C6] x26: ffff000edf5dfad8 x25: 0000000000000000 x24: 0000000000000006
[ C6] x23: ffff000edf5dfad4 x22: 0000000000000000 x21: 0000000000000006
[ C6] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
[ C6] x17: ffff8000805544b8 x16: ffff800080553d94 x15: ffff8000805c11b0
[ C6] x14: ffff8000805baeb0 x13: ffff800080047e10 x12: ffff6000fffff5f9
[ C6] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
[ C6] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
[ C6] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700011fcce98
[ C6] x2 : 0000000000000001 x1 : cf09d5450e2b4f7f x0 : 0000000000009a21
[ C6] Call trace:
[ C6] _raw_spin_unlock_irqrestore+0x50/0xb8
[ C6] free_pcppages_bulk+0x2bc/0x3e0
[ C6] free_unref_page_commit+0x1fc/0x290
[ C6] free_unref_page+0x184/0x250
[ C6] __free_pages+0x154/0x1a0
[ C6] free_pages+0x88/0xb0
[ C6] kasan_depopulate_vmalloc_pte+0x58/0x80
[ C6] __apply_to_page_range+0x3ec/0x650
[ C6] apply_to_existing_page_range+0x1c/0x30
[ C6] kasan_release_vmalloc+0xa4/0x118
[ C6] __purge_vmap_area_lazy+0x4f4/0xe30
[ C6] drain_vmap_area_work+0x60/0xc0
[ C6] process_one_work+0x4cc/0xa38
[ C6] worker_thread+0x240/0x638
[ C6] kthread+0x1c8/0x1e0
[ C6] ret_from_fork+0x10/0x20
>
> Kefeng Wang (3):
> mm: kasan: shadow: add cond_resched() in kasan_populate_vmalloc_pte()
> mm: kasan: shadow: move free_page() out of page table lock
> mm: kasan: shadow: HACK add cond_resched_lock() in
> kasan_depopulate_vmalloc_pte()
>
> include/linux/kasan.h | 9 ++++++---
> mm/kasan/shadow.c | 20 +++++++++++++-------
> mm/vmalloc.c | 7 ++++---
> 3 files changed, 23 insertions(+), 13 deletions(-)
>
* Re: [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte
2023-09-15 0:58 ` [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte Kefeng Wang
@ 2023-10-18 14:16 ` Kefeng Wang
2023-10-18 16:37 ` Marco Elver
0 siblings, 1 reply; 14+ messages in thread
From: Kefeng Wang @ 2023-10-18 14:16 UTC (permalink / raw)
To: Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
Dmitry Vyukov, Vincenzo Frascino, Andrew Morton,
Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes, kasan-dev,
linux-mm
The issue is easy to reproduce with a large vmalloc, kindly ping...
On 2023/9/15 8:58, Kefeng Wang wrote:
> Hi all, any suggestions or comments? Many thanks.
>
> On 2023/9/6 20:42, Kefeng Wang wrote:
>> This is an RFC; patch 3 in particular is a hack. The series tries to fix the
>> soft lockup issues seen when populating or depopulating the KASAN shadow PTEs
>> for a large vmalloc region. Looking forward to your comments and advice, thanks.
>
>>
>> Kefeng Wang (3):
>> mm: kasan: shadow: add cond_resched() in kasan_populate_vmalloc_pte()
>> mm: kasan: shadow: move free_page() out of page table lock
>> mm: kasan: shadow: HACK add cond_resched_lock() in
>> kasan_depopulate_vmalloc_pte()
>>
>> include/linux/kasan.h | 9 ++++++---
>> mm/kasan/shadow.c | 20 +++++++++++++-------
>> mm/vmalloc.c | 7 ++++---
>> 3 files changed, 23 insertions(+), 13 deletions(-)
>>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte
2023-10-18 14:16 ` Kefeng Wang
@ 2023-10-18 16:37 ` Marco Elver
2023-10-19 1:40 ` Kefeng Wang
0 siblings, 1 reply; 14+ messages in thread
From: Marco Elver @ 2023-10-18 16:37 UTC (permalink / raw)
To: Kefeng Wang
Cc: Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
Dmitry Vyukov, Vincenzo Frascino, Andrew Morton,
Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes, kasan-dev,
linux-mm
On Wed, 18 Oct 2023 at 16:16, 'Kefeng Wang' via kasan-dev
<kasan-dev@googlegroups.com> wrote:
>
> The issue is easy to reproduced with large vmalloc, kindly ping...
>
> On 2023/9/15 8:58, Kefeng Wang wrote:
> > Hi All, any suggest or comments,many thanks.
> >
> > On 2023/9/6 20:42, Kefeng Wang wrote:
> >> This is a RFC, even patch3 is a hack to fix the softlock issue when
> >> populate or depopulate pte with large region, looking forward to your
> >> reply and advise, thanks.
> >
> > Here is full stack,for populate pte,
> >
> > [ C3] watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [insmod:458]
> > [ C3] Modules linked in: test(OE+)
> > [ C3] irq event stamp: 320776
> > [ C3] hardirqs last enabled at (320775): [<ffff8000815a0c98>]
> > _raw_spin_unlock_irqrestore+0x98/0xb8
> > [ C3] hardirqs last disabled at (320776): [<ffff8000815816e0>]
> > el1_interrupt+0x38/0xa8
> > [ C3] softirqs last enabled at (318174): [<ffff800080040ba8>]
> > __do_softirq+0x658/0x7ac
> > [ C3] softirqs last disabled at (318169): [<ffff800080047fd8>]
> > ____do_softirq+0x18/0x30
> > [ C3] CPU: 3 PID: 458 Comm: insmod Tainted: G OE 6.5.0+ #595
> > [ C3] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> > [ C3] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [ C3] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
> > [ C3] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
> > [ C3] sp : ffff800093386d70
> > [ C3] x29: ffff800093386d70 x28: 0000000000000801 x27: ffff0007ffffa9c0
> > [ C3] x26: 0000000000000000 x25: 000000000000003f x24: fffffc0004353708
> > [ C3] x23: ffff0006d476bad8 x22: fffffc0004353748 x21: 0000000000000000
> > [ C3] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
> > [ C3] x17: ffff80008024e7fc x16: ffff80008055a8f0 x15: ffff80008024ec60
> > [ C3] x14: ffff80008024ead0 x13: ffff80008024e7fc x12: ffff6000fffff5f9
> > [ C3] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
> > [ C3] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
> > [ C3] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700012670d70
> > [ C3] x2 : 0000000000000001 x1 : c9a5dbfae610fa24 x0 : 000000000004e507
> > [ C3] Call trace:
> > [ C3] _raw_spin_unlock_irqrestore+0x50/0xb8
> > [ C3] rmqueue_bulk+0x434/0x6b8
> > [ C3] get_page_from_freelist+0xdd4/0x1680
> > [ C3] __alloc_pages+0x244/0x508
> > [ C3] alloc_pages+0xf0/0x218
> > [ C3] __get_free_pages+0x1c/0x50
> > [ C3] kasan_populate_vmalloc_pte+0x30/0x188
> > [ C3] __apply_to_page_range+0x3ec/0x650
> > [ C3] apply_to_page_range+0x1c/0x30
> > [ C3] kasan_populate_vmalloc+0x60/0x70
> > [ C3] alloc_vmap_area.part.67+0x328/0xe50
> > [ C3] alloc_vmap_area+0x4c/0x78
> > [ C3] __get_vm_area_node.constprop.76+0x130/0x240
> > [ C3] __vmalloc_node_range+0x12c/0x340
> > [ C3] __vmalloc_node+0x8c/0xb0
> > [ C3] vmalloc+0x2c/0x40
> > [ C3] show_mem_init+0x1c/0xff8 [test]
> > [ C3] do_one_initcall+0xe4/0x500
> > [ C3] do_init_module+0x100/0x358
> > [ C3] load_module+0x2e64/0x2fc8
> > [ C3] init_module_from_file+0xec/0x148
> > [ C3] idempotent_init_module+0x278/0x380
> > [ C3] __arm64_sys_finit_module+0x88/0xf8
> > [ C3] invoke_syscall+0x64/0x188
> > [ C3] el0_svc_common.constprop.1+0xec/0x198
> > [ C3] do_el0_svc+0x48/0xc8
> > [ C3] el0_svc+0x3c/0xe8
> > [ C3] el0t_64_sync_handler+0xa0/0xc8
> > [ C3] el0t_64_sync+0x188/0x190
> >
> > and for depopuldate pte,
> >
> > [ C6] watchdog: BUG: soft lockup - CPU#6 stuck for 48s! [kworker/6:1:59]
> > [ C6] Modules linked in: test(OE+)
> > [ C6] irq event stamp: 39458
> > [ C6] hardirqs last enabled at (39457): [<ffff8000815a0c98>]
> > _raw_spin_unlock_irqrestore+0x98/0xb8
> > [ C6] hardirqs last disabled at (39458): [<ffff8000815816e0>]
> > el1_interrupt+0x38/0xa8
> > [ C6] softirqs last enabled at (39420): [<ffff800080040ba8>]
> > __do_softirq+0x658/0x7ac
> > [ C6] softirqs last disabled at (39415): [<ffff800080047fd8>]
> > ____do_softirq+0x18/0x30
> > [ C6] CPU: 6 PID: 59 Comm: kworker/6:1 Tainted: G OEL
> > 6.5.0+ #595
> > [ C6] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> > [ C6] Workqueue: events drain_vmap_area_work
> > [ C6] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [ C6] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
> > [ C6] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
> > [ C6] sp : ffff80008fe676b0
> > [ C6] x29: ffff80008fe676b0 x28: fffffc000601d310 x27: ffff000edf5dfa80
> > [ C6] x26: ffff000edf5dfad8 x25: 0000000000000000 x24: 0000000000000006
> > [ C6] x23: ffff000edf5dfad4 x22: 0000000000000000 x21: 0000000000000006
> > [ C6] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
> > [ C6] x17: ffff8000805544b8 x16: ffff800080553d94 x15: ffff8000805c11b0
> > [ C6] x14: ffff8000805baeb0 x13: ffff800080047e10 x12: ffff6000fffff5f9
> > [ C6] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
> > [ C6] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
> > [ C6] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700011fcce98
> > [ C6] x2 : 0000000000000001 x1 : cf09d5450e2b4f7f x0 : 0000000000009a21
> > [ C6] Call trace:
> > [ C6] _raw_spin_unlock_irqrestore+0x50/0xb8
> > [ C6] free_pcppages_bulk+0x2bc/0x3e0
> > [ C6] free_unref_page_commit+0x1fc/0x290
> > [ C6] free_unref_page+0x184/0x250
> > [ C6] __free_pages+0x154/0x1a0
> > [ C6] free_pages+0x88/0xb0
> > [ C6] kasan_depopulate_vmalloc_pte+0x58/0x80
> > [ C6] __apply_to_page_range+0x3ec/0x650
> > [ C6] apply_to_existing_page_range+0x1c/0x30
> > [ C6] kasan_release_vmalloc+0xa4/0x118
> > [ C6] __purge_vmap_area_lazy+0x4f4/0xe30
> > [ C6] drain_vmap_area_work+0x60/0xc0
> > [ C6] process_one_work+0x4cc/0xa38
> > [ C6] worker_thread+0x240/0x638
> > [ C6] kthread+0x1c8/0x1e0
> > [ C6] ret_from_fork+0x10/0x20
> >
> >
> >
> >>
> >> Kefeng Wang (3):
> >> mm: kasan: shadow: add cond_resched() in kasan_populate_vmalloc_pte()
> >> mm: kasan: shadow: move free_page() out of page table lock
> >> mm: kasan: shadow: HACK add cond_resched_lock() in
> >> kasan_depopulate_vmalloc_pte()
The first 2 patches look ok, but yeah, the last is a hack. I also
don't have any better suggestions, only more questions.
Does this only happen on arm64?
Do you have a minimal reproducer you can share?
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte
2023-10-18 16:37 ` Marco Elver
@ 2023-10-19 1:40 ` Kefeng Wang
2023-10-19 6:17 ` Uladzislau Rezki
0 siblings, 1 reply; 14+ messages in thread
From: Kefeng Wang @ 2023-10-19 1:40 UTC (permalink / raw)
To: Marco Elver
Cc: Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
Dmitry Vyukov, Vincenzo Frascino, Andrew Morton,
Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes, kasan-dev,
linux-mm
On 2023/10/19 0:37, Marco Elver wrote:
> On Wed, 18 Oct 2023 at 16:16, 'Kefeng Wang' via kasan-dev
> <kasan-dev@googlegroups.com> wrote:
>>
>> The issue is easy to reproduced with large vmalloc, kindly ping...
>>
>> On 2023/9/15 8:58, Kefeng Wang wrote:
>>> Hi All, any suggest or comments,many thanks.
>>>
>>> On 2023/9/6 20:42, Kefeng Wang wrote:
>>>> This is a RFC, even patch3 is a hack to fix the softlock issue when
>>>> populate or depopulate pte with large region, looking forward to your
>>>> reply and advise, thanks.
>>>
>>> Here is full stack,for populate pte,
>>>
>>> [ C3] watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [insmod:458]
>>> [ C3] Modules linked in: test(OE+)
>>> [ C3] irq event stamp: 320776
>>> [ C3] hardirqs last enabled at (320775): [<ffff8000815a0c98>]
>>> _raw_spin_unlock_irqrestore+0x98/0xb8
>>> [ C3] hardirqs last disabled at (320776): [<ffff8000815816e0>]
>>> el1_interrupt+0x38/0xa8
>>> [ C3] softirqs last enabled at (318174): [<ffff800080040ba8>]
>>> __do_softirq+0x658/0x7ac
>>> [ C3] softirqs last disabled at (318169): [<ffff800080047fd8>]
>>> ____do_softirq+0x18/0x30
>>> [ C3] CPU: 3 PID: 458 Comm: insmod Tainted: G OE 6.5.0+ #595
>>> [ C3] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
>>> [ C3] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>> [ C3] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
>>> [ C3] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
>>> [ C3] sp : ffff800093386d70
>>> [ C3] x29: ffff800093386d70 x28: 0000000000000801 x27: ffff0007ffffa9c0
>>> [ C3] x26: 0000000000000000 x25: 000000000000003f x24: fffffc0004353708
>>> [ C3] x23: ffff0006d476bad8 x22: fffffc0004353748 x21: 0000000000000000
>>> [ C3] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
>>> [ C3] x17: ffff80008024e7fc x16: ffff80008055a8f0 x15: ffff80008024ec60
>>> [ C3] x14: ffff80008024ead0 x13: ffff80008024e7fc x12: ffff6000fffff5f9
>>> [ C3] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
>>> [ C3] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
>>> [ C3] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700012670d70
>>> [ C3] x2 : 0000000000000001 x1 : c9a5dbfae610fa24 x0 : 000000000004e507
>>> [ C3] Call trace:
>>> [ C3] _raw_spin_unlock_irqrestore+0x50/0xb8
>>> [ C3] rmqueue_bulk+0x434/0x6b8
>>> [ C3] get_page_from_freelist+0xdd4/0x1680
>>> [ C3] __alloc_pages+0x244/0x508
>>> [ C3] alloc_pages+0xf0/0x218
>>> [ C3] __get_free_pages+0x1c/0x50
>>> [ C3] kasan_populate_vmalloc_pte+0x30/0x188
>>> [ C3] __apply_to_page_range+0x3ec/0x650
>>> [ C3] apply_to_page_range+0x1c/0x30
>>> [ C3] kasan_populate_vmalloc+0x60/0x70
>>> [ C3] alloc_vmap_area.part.67+0x328/0xe50
>>> [ C3] alloc_vmap_area+0x4c/0x78
>>> [ C3] __get_vm_area_node.constprop.76+0x130/0x240
>>> [ C3] __vmalloc_node_range+0x12c/0x340
>>> [ C3] __vmalloc_node+0x8c/0xb0
>>> [ C3] vmalloc+0x2c/0x40
>>> [ C3] show_mem_init+0x1c/0xff8 [test]
>>> [ C3] do_one_initcall+0xe4/0x500
>>> [ C3] do_init_module+0x100/0x358
>>> [ C3] load_module+0x2e64/0x2fc8
>>> [ C3] init_module_from_file+0xec/0x148
>>> [ C3] idempotent_init_module+0x278/0x380
>>> [ C3] __arm64_sys_finit_module+0x88/0xf8
>>> [ C3] invoke_syscall+0x64/0x188
>>> [ C3] el0_svc_common.constprop.1+0xec/0x198
>>> [ C3] do_el0_svc+0x48/0xc8
>>> [ C3] el0_svc+0x3c/0xe8
>>> [ C3] el0t_64_sync_handler+0xa0/0xc8
>>> [ C3] el0t_64_sync+0x188/0x190
>>>
>>> and for depopuldate pte,
>>>
>>> [ C6] watchdog: BUG: soft lockup - CPU#6 stuck for 48s! [kworker/6:1:59]
>>> [ C6] Modules linked in: test(OE+)
>>> [ C6] irq event stamp: 39458
>>> [ C6] hardirqs last enabled at (39457): [<ffff8000815a0c98>]
>>> _raw_spin_unlock_irqrestore+0x98/0xb8
>>> [ C6] hardirqs last disabled at (39458): [<ffff8000815816e0>]
>>> el1_interrupt+0x38/0xa8
>>> [ C6] softirqs last enabled at (39420): [<ffff800080040ba8>]
>>> __do_softirq+0x658/0x7ac
>>> [ C6] softirqs last disabled at (39415): [<ffff800080047fd8>]
>>> ____do_softirq+0x18/0x30
>>> [ C6] CPU: 6 PID: 59 Comm: kworker/6:1 Tainted: G OEL
>>> 6.5.0+ #595
>>> [ C6] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
>>> [ C6] Workqueue: events drain_vmap_area_work
>>> [ C6] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>> [ C6] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
>>> [ C6] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
>>> [ C6] sp : ffff80008fe676b0
>>> [ C6] x29: ffff80008fe676b0 x28: fffffc000601d310 x27: ffff000edf5dfa80
>>> [ C6] x26: ffff000edf5dfad8 x25: 0000000000000000 x24: 0000000000000006
>>> [ C6] x23: ffff000edf5dfad4 x22: 0000000000000000 x21: 0000000000000006
>>> [ C6] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
>>> [ C6] x17: ffff8000805544b8 x16: ffff800080553d94 x15: ffff8000805c11b0
>>> [ C6] x14: ffff8000805baeb0 x13: ffff800080047e10 x12: ffff6000fffff5f9
>>> [ C6] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
>>> [ C6] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
>>> [ C6] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700011fcce98
>>> [ C6] x2 : 0000000000000001 x1 : cf09d5450e2b4f7f x0 : 0000000000009a21
>>> [ C6] Call trace:
>>> [ C6] _raw_spin_unlock_irqrestore+0x50/0xb8
>>> [ C6] free_pcppages_bulk+0x2bc/0x3e0
>>> [ C6] free_unref_page_commit+0x1fc/0x290
>>> [ C6] free_unref_page+0x184/0x250
>>> [ C6] __free_pages+0x154/0x1a0
>>> [ C6] free_pages+0x88/0xb0
>>> [ C6] kasan_depopulate_vmalloc_pte+0x58/0x80
>>> [ C6] __apply_to_page_range+0x3ec/0x650
>>> [ C6] apply_to_existing_page_range+0x1c/0x30
>>> [ C6] kasan_release_vmalloc+0xa4/0x118
>>> [ C6] __purge_vmap_area_lazy+0x4f4/0xe30
>>> [ C6] drain_vmap_area_work+0x60/0xc0
>>> [ C6] process_one_work+0x4cc/0xa38
>>> [ C6] worker_thread+0x240/0x638
>>> [ C6] kthread+0x1c8/0x1e0
>>> [ C6] ret_from_fork+0x10/0x20
>>>
>>>
>>>
>>>>
>>>> Kefeng Wang (3):
>>>> mm: kasan: shadow: add cond_resched() in kasan_populate_vmalloc_pte()
>>>> mm: kasan: shadow: move free_page() out of page table lock
>>>> mm: kasan: shadow: HACK add cond_resched_lock() in
>>>> kasan_depopulate_vmalloc_pte()
>
> The first 2 patches look ok, but yeah, the last is a hack. I also
> don't have any better suggestions, only more questions.
Thanks Marco, maybe we could convert free_vmap_area_lock from a spinlock
to a mutex, but only when KASAN is enabled?
>
> Does this only happen on arm64?
Our test case runs on an arm64 qemu guest (the host is x86), so it runs
much slower than on a real board.
> Do you have a minimal reproducer you can share?
Here is the code in the test driver,
void *buf = vmalloc(40UL << 30);
vfree(buf);
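
A minimal module sketch of that test driver (the show_mem_init() name is
taken from the call trace above; everything else here is assumed, not the
actual "test" module):

<snip>
#include <linux/module.h>
#include <linux/vmalloc.h>

static int __init show_mem_init(void)
{
	/*
	 * One huge allocation makes kasan_populate_vmalloc() walk ~40G worth
	 * of shadow PTEs in a single apply_to_page_range() call.
	 */
	void *buf = vmalloc(40UL << 30);

	if (!buf)
		return -ENOMEM;

	/*
	 * Freeing it queues the vmap area for lazy purge; the deferred
	 * drain_vmap_area_work then goes through kasan_release_vmalloc(),
	 * which is the second lockup above.
	 */
	vfree(buf);

	return 0;
}
module_init(show_mem_init);

MODULE_LICENSE("GPL");
<snip>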
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte
2023-10-19 1:40 ` Kefeng Wang
@ 2023-10-19 6:17 ` Uladzislau Rezki
2023-10-19 7:26 ` Kefeng Wang
0 siblings, 1 reply; 14+ messages in thread
From: Uladzislau Rezki @ 2023-10-19 6:17 UTC (permalink / raw)
To: Kefeng Wang
Cc: Marco Elver, Andrey Ryabinin, Alexander Potapenko,
Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino,
Andrew Morton, Uladzislau Rezki, Christoph Hellwig,
Lorenzo Stoakes, kasan-dev, linux-mm
On Thu, Oct 19, 2023 at 09:40:10AM +0800, Kefeng Wang wrote:
>
>
> On 2023/10/19 0:37, Marco Elver wrote:
> > On Wed, 18 Oct 2023 at 16:16, 'Kefeng Wang' via kasan-dev
> > <kasan-dev@googlegroups.com> wrote:
> > >
> > > The issue is easy to reproduced with large vmalloc, kindly ping...
> > >
> > > On 2023/9/15 8:58, Kefeng Wang wrote:
> > > > Hi All, any suggest or comments,many thanks.
> > > >
> > > > On 2023/9/6 20:42, Kefeng Wang wrote:
> > > > > This is a RFC, even patch3 is a hack to fix the softlock issue when
> > > > > populate or depopulate pte with large region, looking forward to your
> > > > > reply and advise, thanks.
> > > >
> > > > Here is full stack,for populate pte,
> > > >
> > > > [ C3] watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [insmod:458]
> > > > [ C3] Modules linked in: test(OE+)
> > > > [ C3] irq event stamp: 320776
> > > > [ C3] hardirqs last enabled at (320775): [<ffff8000815a0c98>]
> > > > _raw_spin_unlock_irqrestore+0x98/0xb8
> > > > [ C3] hardirqs last disabled at (320776): [<ffff8000815816e0>]
> > > > el1_interrupt+0x38/0xa8
> > > > [ C3] softirqs last enabled at (318174): [<ffff800080040ba8>]
> > > > __do_softirq+0x658/0x7ac
> > > > [ C3] softirqs last disabled at (318169): [<ffff800080047fd8>]
> > > > ____do_softirq+0x18/0x30
> > > > [ C3] CPU: 3 PID: 458 Comm: insmod Tainted: G OE 6.5.0+ #595
> > > > [ C3] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> > > > [ C3] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > > [ C3] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
> > > > [ C3] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
> > > > [ C3] sp : ffff800093386d70
> > > > [ C3] x29: ffff800093386d70 x28: 0000000000000801 x27: ffff0007ffffa9c0
> > > > [ C3] x26: 0000000000000000 x25: 000000000000003f x24: fffffc0004353708
> > > > [ C3] x23: ffff0006d476bad8 x22: fffffc0004353748 x21: 0000000000000000
> > > > [ C3] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
> > > > [ C3] x17: ffff80008024e7fc x16: ffff80008055a8f0 x15: ffff80008024ec60
> > > > [ C3] x14: ffff80008024ead0 x13: ffff80008024e7fc x12: ffff6000fffff5f9
> > > > [ C3] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
> > > > [ C3] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
> > > > [ C3] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700012670d70
> > > > [ C3] x2 : 0000000000000001 x1 : c9a5dbfae610fa24 x0 : 000000000004e507
> > > > [ C3] Call trace:
> > > > [ C3] _raw_spin_unlock_irqrestore+0x50/0xb8
> > > > [ C3] rmqueue_bulk+0x434/0x6b8
> > > > [ C3] get_page_from_freelist+0xdd4/0x1680
> > > > [ C3] __alloc_pages+0x244/0x508
> > > > [ C3] alloc_pages+0xf0/0x218
> > > > [ C3] __get_free_pages+0x1c/0x50
> > > > [ C3] kasan_populate_vmalloc_pte+0x30/0x188
> > > > [ C3] __apply_to_page_range+0x3ec/0x650
> > > > [ C3] apply_to_page_range+0x1c/0x30
> > > > [ C3] kasan_populate_vmalloc+0x60/0x70
> > > > [ C3] alloc_vmap_area.part.67+0x328/0xe50
> > > > [ C3] alloc_vmap_area+0x4c/0x78
> > > > [ C3] __get_vm_area_node.constprop.76+0x130/0x240
> > > > [ C3] __vmalloc_node_range+0x12c/0x340
> > > > [ C3] __vmalloc_node+0x8c/0xb0
> > > > [ C3] vmalloc+0x2c/0x40
> > > > [ C3] show_mem_init+0x1c/0xff8 [test]
> > > > [ C3] do_one_initcall+0xe4/0x500
> > > > [ C3] do_init_module+0x100/0x358
> > > > [ C3] load_module+0x2e64/0x2fc8
> > > > [ C3] init_module_from_file+0xec/0x148
> > > > [ C3] idempotent_init_module+0x278/0x380
> > > > [ C3] __arm64_sys_finit_module+0x88/0xf8
> > > > [ C3] invoke_syscall+0x64/0x188
> > > > [ C3] el0_svc_common.constprop.1+0xec/0x198
> > > > [ C3] do_el0_svc+0x48/0xc8
> > > > [ C3] el0_svc+0x3c/0xe8
> > > > [ C3] el0t_64_sync_handler+0xa0/0xc8
> > > > [ C3] el0t_64_sync+0x188/0x190
> > > >
> > > > and for depopuldate pte,
> > > >
> > > > [ C6] watchdog: BUG: soft lockup - CPU#6 stuck for 48s! [kworker/6:1:59]
> > > > [ C6] Modules linked in: test(OE+)
> > > > [ C6] irq event stamp: 39458
> > > > [ C6] hardirqs last enabled at (39457): [<ffff8000815a0c98>]
> > > > _raw_spin_unlock_irqrestore+0x98/0xb8
> > > > [ C6] hardirqs last disabled at (39458): [<ffff8000815816e0>]
> > > > el1_interrupt+0x38/0xa8
> > > > [ C6] softirqs last enabled at (39420): [<ffff800080040ba8>]
> > > > __do_softirq+0x658/0x7ac
> > > > [ C6] softirqs last disabled at (39415): [<ffff800080047fd8>]
> > > > ____do_softirq+0x18/0x30
> > > > [ C6] CPU: 6 PID: 59 Comm: kworker/6:1 Tainted: G OEL
> > > > 6.5.0+ #595
> > > > [ C6] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> > > > [ C6] Workqueue: events drain_vmap_area_work
> > > > [ C6] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > > [ C6] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
> > > > [ C6] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
> > > > [ C6] sp : ffff80008fe676b0
> > > > [ C6] x29: ffff80008fe676b0 x28: fffffc000601d310 x27: ffff000edf5dfa80
> > > > [ C6] x26: ffff000edf5dfad8 x25: 0000000000000000 x24: 0000000000000006
> > > > [ C6] x23: ffff000edf5dfad4 x22: 0000000000000000 x21: 0000000000000006
> > > > [ C6] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
> > > > [ C6] x17: ffff8000805544b8 x16: ffff800080553d94 x15: ffff8000805c11b0
> > > > [ C6] x14: ffff8000805baeb0 x13: ffff800080047e10 x12: ffff6000fffff5f9
> > > > [ C6] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
> > > > [ C6] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
> > > > [ C6] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700011fcce98
> > > > [ C6] x2 : 0000000000000001 x1 : cf09d5450e2b4f7f x0 : 0000000000009a21
> > > > [ C6] Call trace:
> > > > [ C6] _raw_spin_unlock_irqrestore+0x50/0xb8
> > > > [ C6] free_pcppages_bulk+0x2bc/0x3e0
> > > > [ C6] free_unref_page_commit+0x1fc/0x290
> > > > [ C6] free_unref_page+0x184/0x250
> > > > [ C6] __free_pages+0x154/0x1a0
> > > > [ C6] free_pages+0x88/0xb0
> > > > [ C6] kasan_depopulate_vmalloc_pte+0x58/0x80
> > > > [ C6] __apply_to_page_range+0x3ec/0x650
> > > > [ C6] apply_to_existing_page_range+0x1c/0x30
> > > > [ C6] kasan_release_vmalloc+0xa4/0x118
> > > > [ C6] __purge_vmap_area_lazy+0x4f4/0xe30
> > > > [ C6] drain_vmap_area_work+0x60/0xc0
> > > > [ C6] process_one_work+0x4cc/0xa38
> > > > [ C6] worker_thread+0x240/0x638
> > > > [ C6] kthread+0x1c8/0x1e0
> > > > [ C6] ret_from_fork+0x10/0x20
> > > >
> > > >
> > > >
> > > > >
> > > > > Kefeng Wang (3):
> > > > > mm: kasan: shadow: add cond_resched() in kasan_populate_vmalloc_pte()
> > > > > mm: kasan: shadow: move free_page() out of page table lock
> > > > > mm: kasan: shadow: HACK add cond_resched_lock() in
> > > > > kasan_depopulate_vmalloc_pte()
> >
> > The first 2 patches look ok, but yeah, the last is a hack. I also
> > don't have any better suggestions, only more questions.
>
> Thanks Marco, maybe we could convert free_vmap_area_lock from spinlock to
> mutex lock only if KASAN enabled?
>
I do not think that is a good suggestion. Could you please clarify the
reason for such a conversion?
> >
> > Does this only happen on arm64?
>
> Our test case run on arm64 qemu(host is x86), so it run much more slower
> than real board.
> > Do you have a minimal reproducer you can share?
> Here is the code in test driver,
>
> void *buf = vmalloc(40UL << 30);
> vfree(buf);
>
What is the test driver? Why do you need 40G of memory, and for what purpose?
--
Uladzislau Rezki
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte
2023-10-19 6:17 ` Uladzislau Rezki
@ 2023-10-19 7:26 ` Kefeng Wang
2023-10-19 8:53 ` Uladzislau Rezki
0 siblings, 1 reply; 14+ messages in thread
From: Kefeng Wang @ 2023-10-19 7:26 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Marco Elver, Andrey Ryabinin, Alexander Potapenko,
Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino,
Andrew Morton, Christoph Hellwig, Lorenzo Stoakes, kasan-dev,
linux-mm
On 2023/10/19 14:17, Uladzislau Rezki wrote:
> On Thu, Oct 19, 2023 at 09:40:10AM +0800, Kefeng Wang wrote:
>>
>>
>> On 2023/10/19 0:37, Marco Elver wrote:
>>> On Wed, 18 Oct 2023 at 16:16, 'Kefeng Wang' via kasan-dev
>>> <kasan-dev@googlegroups.com> wrote:
>>>>
>>>> The issue is easy to reproduced with large vmalloc, kindly ping...
>>>>
>>>> On 2023/9/15 8:58, Kefeng Wang wrote:
>>>>> Hi All, any suggest or comments,many thanks.
>>>>>
>>>>> On 2023/9/6 20:42, Kefeng Wang wrote:
>>>>>> This is a RFC, even patch3 is a hack to fix the softlock issue when
>>>>>> populate or depopulate pte with large region, looking forward to your
>>>>>> reply and advise, thanks.
>>>>>
>>>>> Here is full stack,for populate pte,
>>>>>
>>>>> [ C3] watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [insmod:458]
>>>>> [ C3] Modules linked in: test(OE+)
>>>>> [ C3] irq event stamp: 320776
>>>>> [ C3] hardirqs last enabled at (320775): [<ffff8000815a0c98>]
>>>>> _raw_spin_unlock_irqrestore+0x98/0xb8
>>>>> [ C3] hardirqs last disabled at (320776): [<ffff8000815816e0>]
>>>>> el1_interrupt+0x38/0xa8
>>>>> [ C3] softirqs last enabled at (318174): [<ffff800080040ba8>]
>>>>> __do_softirq+0x658/0x7ac
>>>>> [ C3] softirqs last disabled at (318169): [<ffff800080047fd8>]
>>>>> ____do_softirq+0x18/0x30
>>>>> [ C3] CPU: 3 PID: 458 Comm: insmod Tainted: G OE 6.5.0+ #595
>>>>> [ C3] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
>>>>> [ C3] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>>> [ C3] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
>>>>> [ C3] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
>>>>> [ C3] sp : ffff800093386d70
>>>>> [ C3] x29: ffff800093386d70 x28: 0000000000000801 x27: ffff0007ffffa9c0
>>>>> [ C3] x26: 0000000000000000 x25: 000000000000003f x24: fffffc0004353708
>>>>> [ C3] x23: ffff0006d476bad8 x22: fffffc0004353748 x21: 0000000000000000
>>>>> [ C3] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
>>>>> [ C3] x17: ffff80008024e7fc x16: ffff80008055a8f0 x15: ffff80008024ec60
>>>>> [ C3] x14: ffff80008024ead0 x13: ffff80008024e7fc x12: ffff6000fffff5f9
>>>>> [ C3] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
>>>>> [ C3] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
>>>>> [ C3] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700012670d70
>>>>> [ C3] x2 : 0000000000000001 x1 : c9a5dbfae610fa24 x0 : 000000000004e507
>>>>> [ C3] Call trace:
>>>>> [ C3] _raw_spin_unlock_irqrestore+0x50/0xb8
>>>>> [ C3] rmqueue_bulk+0x434/0x6b8
>>>>> [ C3] get_page_from_freelist+0xdd4/0x1680
>>>>> [ C3] __alloc_pages+0x244/0x508
>>>>> [ C3] alloc_pages+0xf0/0x218
>>>>> [ C3] __get_free_pages+0x1c/0x50
>>>>> [ C3] kasan_populate_vmalloc_pte+0x30/0x188
>>>>> [ C3] __apply_to_page_range+0x3ec/0x650
>>>>> [ C3] apply_to_page_range+0x1c/0x30
>>>>> [ C3] kasan_populate_vmalloc+0x60/0x70
>>>>> [ C3] alloc_vmap_area.part.67+0x328/0xe50
>>>>> [ C3] alloc_vmap_area+0x4c/0x78
>>>>> [ C3] __get_vm_area_node.constprop.76+0x130/0x240
>>>>> [ C3] __vmalloc_node_range+0x12c/0x340
>>>>> [ C3] __vmalloc_node+0x8c/0xb0
>>>>> [ C3] vmalloc+0x2c/0x40
>>>>> [ C3] show_mem_init+0x1c/0xff8 [test]
>>>>> [ C3] do_one_initcall+0xe4/0x500
>>>>> [ C3] do_init_module+0x100/0x358
>>>>> [ C3] load_module+0x2e64/0x2fc8
>>>>> [ C3] init_module_from_file+0xec/0x148
>>>>> [ C3] idempotent_init_module+0x278/0x380
>>>>> [ C3] __arm64_sys_finit_module+0x88/0xf8
>>>>> [ C3] invoke_syscall+0x64/0x188
>>>>> [ C3] el0_svc_common.constprop.1+0xec/0x198
>>>>> [ C3] do_el0_svc+0x48/0xc8
>>>>> [ C3] el0_svc+0x3c/0xe8
>>>>> [ C3] el0t_64_sync_handler+0xa0/0xc8
>>>>> [ C3] el0t_64_sync+0x188/0x190
>>>>>
>>>>> and for depopuldate pte,
>>>>>
>>>>> [ C6] watchdog: BUG: soft lockup - CPU#6 stuck for 48s! [kworker/6:1:59]
>>>>> [ C6] Modules linked in: test(OE+)
>>>>> [ C6] irq event stamp: 39458
>>>>> [ C6] hardirqs last enabled at (39457): [<ffff8000815a0c98>]
>>>>> _raw_spin_unlock_irqrestore+0x98/0xb8
>>>>> [ C6] hardirqs last disabled at (39458): [<ffff8000815816e0>]
>>>>> el1_interrupt+0x38/0xa8
>>>>> [ C6] softirqs last enabled at (39420): [<ffff800080040ba8>]
>>>>> __do_softirq+0x658/0x7ac
>>>>> [ C6] softirqs last disabled at (39415): [<ffff800080047fd8>]
>>>>> ____do_softirq+0x18/0x30
>>>>> [ C6] CPU: 6 PID: 59 Comm: kworker/6:1 Tainted: G OEL
>>>>> 6.5.0+ #595
>>>>> [ C6] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
>>>>> [ C6] Workqueue: events drain_vmap_area_work
>>>>> [ C6] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>>> [ C6] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
>>>>> [ C6] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
>>>>> [ C6] sp : ffff80008fe676b0
>>>>> [ C6] x29: ffff80008fe676b0 x28: fffffc000601d310 x27: ffff000edf5dfa80
>>>>> [ C6] x26: ffff000edf5dfad8 x25: 0000000000000000 x24: 0000000000000006
>>>>> [ C6] x23: ffff000edf5dfad4 x22: 0000000000000000 x21: 0000000000000006
>>>>> [ C6] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
>>>>> [ C6] x17: ffff8000805544b8 x16: ffff800080553d94 x15: ffff8000805c11b0
>>>>> [ C6] x14: ffff8000805baeb0 x13: ffff800080047e10 x12: ffff6000fffff5f9
>>>>> [ C6] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
>>>>> [ C6] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
>>>>> [ C6] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700011fcce98
>>>>> [ C6] x2 : 0000000000000001 x1 : cf09d5450e2b4f7f x0 : 0000000000009a21
>>>>> [ C6] Call trace:
>>>>> [ C6] _raw_spin_unlock_irqrestore+0x50/0xb8
>>>>> [ C6] free_pcppages_bulk+0x2bc/0x3e0
>>>>> [ C6] free_unref_page_commit+0x1fc/0x290
>>>>> [ C6] free_unref_page+0x184/0x250
>>>>> [ C6] __free_pages+0x154/0x1a0
>>>>> [ C6] free_pages+0x88/0xb0
>>>>> [ C6] kasan_depopulate_vmalloc_pte+0x58/0x80
>>>>> [ C6] __apply_to_page_range+0x3ec/0x650
>>>>> [ C6] apply_to_existing_page_range+0x1c/0x30
>>>>> [ C6] kasan_release_vmalloc+0xa4/0x118
>>>>> [ C6] __purge_vmap_area_lazy+0x4f4/0xe30
>>>>> [ C6] drain_vmap_area_work+0x60/0xc0
>>>>> [ C6] process_one_work+0x4cc/0xa38
>>>>> [ C6] worker_thread+0x240/0x638
>>>>> [ C6] kthread+0x1c8/0x1e0
>>>>> [ C6] ret_from_fork+0x10/0x20
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Kefeng Wang (3):
>>>>>> mm: kasan: shadow: add cond_resched() in kasan_populate_vmalloc_pte()
>>>>>> mm: kasan: shadow: move free_page() out of page table lock
>>>>>> mm: kasan: shadow: HACK add cond_resched_lock() in
>>>>>> kasan_depopulate_vmalloc_pte()
>>>
>>> The first 2 patches look ok, but yeah, the last is a hack. I also
>>> don't have any better suggestions, only more questions.
>>
>> Thanks Marco, maybe we could convert free_vmap_area_lock from spinlock to
>> mutex lock only if KASAN enabled?
>>
> I do not think it is a good suggestion. Could you please clarify the
> reason of such conversion?
See the call traces of the soft lockups: when mapping/unmapping the vmalloc
buffer, KASAN has to populate and depopulate the vmalloc shadow PTEs, which
takes far longer than on a non-KASAN kernel. For the unmap side there is
already a cond_resched_lock() in __purge_vmap_area_lazy(), but most of the
time is spent under the free_vmap_area_lock spinlock, and we cannot add a
cond_resched_lock() inside kasan_depopulate_vmalloc_pte() itself. If that
spinlock were converted to a mutex, we could add a cond_resched() into the
KASAN depopulate path; that is why I suggested such a conversion when KASAN
is enabled. The conversion may well not be correct, though; any better
solution?
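
To make the idea concrete, a very rough sketch only (based on my reading of
the current kasan_depopulate_vmalloc_pte() in mm/kasan/shadow.c, not on
patch 3), assuming free_vmap_area_lock became sleepable:

<snip>
static int kasan_depopulate_vmalloc_pte(pte_t *ptep, unsigned long addr,
					void *unused)
{
	unsigned long page;

	page = (unsigned long)__va(pte_pfn(ptep_get(ptep)) << PAGE_SHIFT);

	spin_lock(&init_mm.page_table_lock);
	if (likely(!pte_none(ptep_get(ptep)))) {
		pte_clear(&init_mm, addr, ptep);
		free_page(page);
	}
	spin_unlock(&init_mm.page_table_lock);

	/*
	 * Only legal if the caller holds no spinlock, i.e. only if
	 * free_vmap_area_lock were converted to a mutex; under the current
	 * spinlock this would be sleeping in atomic context.
	 */
	cond_resched();

	return 0;
}
<snip>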
>
>>>
>>> Does this only happen on arm64?
>>
>> Our test case run on arm64 qemu(host is x86), so it run much more slower
>> than real board.
>>> Do you have a minimal reproducer you can share?
>> Here is the code in test driver,
>>
>> void *buf = vmalloc(40UL << 30);
>> vfree(buf);
>>
> What is a test driver? Why do you need 42G of memmory, for which purpose?
This is just to accelerate reproduction of the above issue; the root cause
is the excessive time spent in kasan_populate_vmalloc() and
kasan_release_vmalloc().
Thanks.
>
> --
> Uladzislau Rezki
>
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte
2023-10-19 7:26 ` Kefeng Wang
@ 2023-10-19 8:53 ` Uladzislau Rezki
2023-10-19 9:47 ` Kefeng Wang
0 siblings, 1 reply; 14+ messages in thread
From: Uladzislau Rezki @ 2023-10-19 8:53 UTC (permalink / raw)
To: Kefeng Wang
Cc: Uladzislau Rezki, Marco Elver, Andrey Ryabinin,
Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov,
Vincenzo Frascino, Andrew Morton, Christoph Hellwig,
Lorenzo Stoakes, kasan-dev, linux-mm
On Thu, Oct 19, 2023 at 03:26:48PM +0800, Kefeng Wang wrote:
>
>
> On 2023/10/19 14:17, Uladzislau Rezki wrote:
> > On Thu, Oct 19, 2023 at 09:40:10AM +0800, Kefeng Wang wrote:
> > >
> > >
> > > On 2023/10/19 0:37, Marco Elver wrote:
> > > > On Wed, 18 Oct 2023 at 16:16, 'Kefeng Wang' via kasan-dev
> > > > <kasan-dev@googlegroups.com> wrote:
> > > > >
> > > > > The issue is easy to reproduced with large vmalloc, kindly ping...
> > > > >
> > > > > On 2023/9/15 8:58, Kefeng Wang wrote:
> > > > > > Hi All, any suggest or comments,many thanks.
> > > > > >
> > > > > > On 2023/9/6 20:42, Kefeng Wang wrote:
> > > > > > > This is a RFC, even patch3 is a hack to fix the softlock issue when
> > > > > > > populate or depopulate pte with large region, looking forward to your
> > > > > > > reply and advise, thanks.
> > > > > >
> > > > > > Here is full stack,for populate pte,
> > > > > >
> > > > > > [ C3] watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [insmod:458]
> > > > > > [ C3] Modules linked in: test(OE+)
> > > > > > [ C3] irq event stamp: 320776
> > > > > > [ C3] hardirqs last enabled at (320775): [<ffff8000815a0c98>]
> > > > > > _raw_spin_unlock_irqrestore+0x98/0xb8
> > > > > > [ C3] hardirqs last disabled at (320776): [<ffff8000815816e0>]
> > > > > > el1_interrupt+0x38/0xa8
> > > > > > [ C3] softirqs last enabled at (318174): [<ffff800080040ba8>]
> > > > > > __do_softirq+0x658/0x7ac
> > > > > > [ C3] softirqs last disabled at (318169): [<ffff800080047fd8>]
> > > > > > ____do_softirq+0x18/0x30
> > > > > > [ C3] CPU: 3 PID: 458 Comm: insmod Tainted: G OE 6.5.0+ #595
> > > > > > [ C3] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> > > > > > [ C3] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > > > > [ C3] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
> > > > > > [ C3] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
> > > > > > [ C3] sp : ffff800093386d70
> > > > > > [ C3] x29: ffff800093386d70 x28: 0000000000000801 x27: ffff0007ffffa9c0
> > > > > > [ C3] x26: 0000000000000000 x25: 000000000000003f x24: fffffc0004353708
> > > > > > [ C3] x23: ffff0006d476bad8 x22: fffffc0004353748 x21: 0000000000000000
> > > > > > [ C3] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
> > > > > > [ C3] x17: ffff80008024e7fc x16: ffff80008055a8f0 x15: ffff80008024ec60
> > > > > > [ C3] x14: ffff80008024ead0 x13: ffff80008024e7fc x12: ffff6000fffff5f9
> > > > > > [ C3] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
> > > > > > [ C3] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
> > > > > > [ C3] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700012670d70
> > > > > > [ C3] x2 : 0000000000000001 x1 : c9a5dbfae610fa24 x0 : 000000000004e507
> > > > > > [ C3] Call trace:
> > > > > > [ C3] _raw_spin_unlock_irqrestore+0x50/0xb8
> > > > > > [ C3] rmqueue_bulk+0x434/0x6b8
> > > > > > [ C3] get_page_from_freelist+0xdd4/0x1680
> > > > > > [ C3] __alloc_pages+0x244/0x508
> > > > > > [ C3] alloc_pages+0xf0/0x218
> > > > > > [ C3] __get_free_pages+0x1c/0x50
> > > > > > [ C3] kasan_populate_vmalloc_pte+0x30/0x188
> > > > > > [ C3] __apply_to_page_range+0x3ec/0x650
> > > > > > [ C3] apply_to_page_range+0x1c/0x30
> > > > > > [ C3] kasan_populate_vmalloc+0x60/0x70
> > > > > > [ C3] alloc_vmap_area.part.67+0x328/0xe50
> > > > > > [ C3] alloc_vmap_area+0x4c/0x78
> > > > > > [ C3] __get_vm_area_node.constprop.76+0x130/0x240
> > > > > > [ C3] __vmalloc_node_range+0x12c/0x340
> > > > > > [ C3] __vmalloc_node+0x8c/0xb0
> > > > > > [ C3] vmalloc+0x2c/0x40
> > > > > > [ C3] show_mem_init+0x1c/0xff8 [test]
> > > > > > [ C3] do_one_initcall+0xe4/0x500
> > > > > > [ C3] do_init_module+0x100/0x358
> > > > > > [ C3] load_module+0x2e64/0x2fc8
> > > > > > [ C3] init_module_from_file+0xec/0x148
> > > > > > [ C3] idempotent_init_module+0x278/0x380
> > > > > > [ C3] __arm64_sys_finit_module+0x88/0xf8
> > > > > > [ C3] invoke_syscall+0x64/0x188
> > > > > > [ C3] el0_svc_common.constprop.1+0xec/0x198
> > > > > > [ C3] do_el0_svc+0x48/0xc8
> > > > > > [ C3] el0_svc+0x3c/0xe8
> > > > > > [ C3] el0t_64_sync_handler+0xa0/0xc8
> > > > > > [ C3] el0t_64_sync+0x188/0x190
> > > > > >
This trace is stuck in rmqueue_bulk() because you request a huge
allocation size. It has nothing to do with free_vmap_area_lock;
it is about the bulk allocator, which gets stuck trying to satisfy
such a demand.
> > > > > > and for depopuldate pte,
> > > > > >
> > > > > > [ C6] watchdog: BUG: soft lockup - CPU#6 stuck for 48s! [kworker/6:1:59]
> > > > > > [ C6] Modules linked in: test(OE+)
> > > > > > [ C6] irq event stamp: 39458
> > > > > > [ C6] hardirqs last enabled at (39457): [<ffff8000815a0c98>]
> > > > > > _raw_spin_unlock_irqrestore+0x98/0xb8
> > > > > > [ C6] hardirqs last disabled at (39458): [<ffff8000815816e0>]
> > > > > > el1_interrupt+0x38/0xa8
> > > > > > [ C6] softirqs last enabled at (39420): [<ffff800080040ba8>]
> > > > > > __do_softirq+0x658/0x7ac
> > > > > > [ C6] softirqs last disabled at (39415): [<ffff800080047fd8>]
> > > > > > ____do_softirq+0x18/0x30
> > > > > > [ C6] CPU: 6 PID: 59 Comm: kworker/6:1 Tainted: G OEL
> > > > > > 6.5.0+ #595
> > > > > > [ C6] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> > > > > > [ C6] Workqueue: events drain_vmap_area_work
> > > > > > [ C6] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > > > > [ C6] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
> > > > > > [ C6] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
> > > > > > [ C6] sp : ffff80008fe676b0
> > > > > > [ C6] x29: ffff80008fe676b0 x28: fffffc000601d310 x27: ffff000edf5dfa80
> > > > > > [ C6] x26: ffff000edf5dfad8 x25: 0000000000000000 x24: 0000000000000006
> > > > > > [ C6] x23: ffff000edf5dfad4 x22: 0000000000000000 x21: 0000000000000006
> > > > > > [ C6] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
> > > > > > [ C6] x17: ffff8000805544b8 x16: ffff800080553d94 x15: ffff8000805c11b0
> > > > > > [ C6] x14: ffff8000805baeb0 x13: ffff800080047e10 x12: ffff6000fffff5f9
> > > > > > [ C6] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
> > > > > > [ C6] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
> > > > > > [ C6] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700011fcce98
> > > > > > [ C6] x2 : 0000000000000001 x1 : cf09d5450e2b4f7f x0 : 0000000000009a21
> > > > > > [ C6] Call trace:
> > > > > > [ C6] _raw_spin_unlock_irqrestore+0x50/0xb8
> > > > > > [ C6] free_pcppages_bulk+0x2bc/0x3e0
> > > > > > [ C6] free_unref_page_commit+0x1fc/0x290
> > > > > > [ C6] free_unref_page+0x184/0x250
> > > > > > [ C6] __free_pages+0x154/0x1a0
> > > > > > [ C6] free_pages+0x88/0xb0
> > > > > > [ C6] kasan_depopulate_vmalloc_pte+0x58/0x80
> > > > > > [ C6] __apply_to_page_range+0x3ec/0x650
> > > > > > [ C6] apply_to_existing_page_range+0x1c/0x30
> > > > > > [ C6] kasan_release_vmalloc+0xa4/0x118
> > > > > > [ C6] __purge_vmap_area_lazy+0x4f4/0xe30
> > > > > > [ C6] drain_vmap_area_work+0x60/0xc0
> > > > > > [ C6] process_one_work+0x4cc/0xa38
> > > > > > [ C6] worker_thread+0x240/0x638
> > > > > > [ C6] kthread+0x1c8/0x1e0
> > > > > > [ C6] ret_from_fork+0x10/0x20
> > > > > >
>
> See Call Trace of softlock, when map/unmap the vmalloc buf, the kasan will
> populate and depopulate vmalloc pte, those will spend more time than
> no-kasan kernel, for unmap, and there is already a cond_resched_lock() in
> __purge_vmap_area_lazy(), but with more time consumed under
> spinlock(free_vmap_area_lock), and we couldn't add cond_resched_lock in
> kasan_depopulate_vmalloc_pte(), so if spin lock converted to mutex lock, we
> could add a cond_resched into kasan depopulate, this is why make such
> conversion if kasan enabled, but this
> conversion maybe not correct, any better solution?
>
I have at least the following thoughts:
a) Add a maximum allowed threshold that a user can request via a vmalloc()
call; I do not think ~40G is a reasonable request (a purely illustrative
sketch of such a cap follows after point c) below).
b) This can fix the unmap path:
<snip>
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index ef8599d394fd..988735da5c5c 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1723,7 +1723,6 @@ static void purge_fragmented_blocks_allcpus(void);
*/
static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
{
- unsigned long resched_threshold;
unsigned int num_purged_areas = 0;
struct list_head local_purge_list;
struct vmap_area *va, *n_va;
@@ -1747,36 +1746,32 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
struct vmap_area, list)->va_end);
flush_tlb_kernel_range(start, end);
- resched_threshold = lazy_max_pages() << 1;
- spin_lock(&free_vmap_area_lock);
list_for_each_entry_safe(va, n_va, &local_purge_list, list) {
unsigned long nr = (va->va_end - va->va_start) >> PAGE_SHIFT;
unsigned long orig_start = va->va_start;
unsigned long orig_end = va->va_end;
+ if (is_vmalloc_or_module_addr((void *)orig_start))
+ kasan_release_vmalloc(orig_start, orig_end,
+ va->va_start, va->va_end);
+
/*
* Finally insert or merge lazily-freed area. It is
* detached and there is no need to "unlink" it from
* anything.
*/
+ spin_lock(&free_vmap_area_lock);
va = merge_or_add_vmap_area_augment(va, &free_vmap_area_root,
&free_vmap_area_list);
+ spin_unlock(&free_vmap_area_lock);
if (!va)
continue;
- if (is_vmalloc_or_module_addr((void *)orig_start))
- kasan_release_vmalloc(orig_start, orig_end,
- va->va_start, va->va_end);
-
atomic_long_sub(nr, &vmap_lazy_nr);
num_purged_areas++;
-
- if (atomic_long_read(&vmap_lazy_nr) < resched_threshold)
- cond_resched_lock(&free_vmap_area_lock);
}
- spin_unlock(&free_vmap_area_lock);
out:
trace_purge_vmap_area_lazy(start, end, num_purged_areas);
<snip>
c) The bulk (populate) path I have not checked, but at a high level
kasan_populate_vmalloc() should take a breath between requests.
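
For a), a purely illustrative sketch of such a cap; the 16G limit, the
WARN_ONCE and its message are invented here and nothing like this exists
upstream, the body otherwise mirrors the existing vmalloc() wrapper:

<snip>
/* hypothetical upper bound on a single vmalloc() request */
#define VMALLOC_MAX_REQUEST_SIZE	(16UL << 30)

void *vmalloc(unsigned long size)
{
	if (WARN_ONCE(size > VMALLOC_MAX_REQUEST_SIZE,
		      "vmalloc: suspicious request of %lu bytes\n", size))
		return NULL;

	return __vmalloc_node(size, 1, GFP_KERNEL, NUMA_NO_NODE,
			      __builtin_return_address(0));
}
<snip>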
--
Uladzislau Rezki
^ permalink raw reply [flat|nested] 14+ messages in thread
* Re: [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte
2023-10-19 8:53 ` Uladzislau Rezki
@ 2023-10-19 9:47 ` Kefeng Wang
0 siblings, 0 replies; 14+ messages in thread
From: Kefeng Wang @ 2023-10-19 9:47 UTC (permalink / raw)
To: Uladzislau Rezki
Cc: Marco Elver, Andrey Ryabinin, Alexander Potapenko,
Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino,
Andrew Morton, Christoph Hellwig, Lorenzo Stoakes, kasan-dev,
linux-mm
On 2023/10/19 16:53, Uladzislau Rezki wrote:
> On Thu, Oct 19, 2023 at 03:26:48PM +0800, Kefeng Wang wrote:
>>
>>
>> On 2023/10/19 14:17, Uladzislau Rezki wrote:
>>> On Thu, Oct 19, 2023 at 09:40:10AM +0800, Kefeng Wang wrote:
>>>>
>>>>
>>>> On 2023/10/19 0:37, Marco Elver wrote:
>>>>> On Wed, 18 Oct 2023 at 16:16, 'Kefeng Wang' via kasan-dev
>>>>> <kasan-dev@googlegroups.com> wrote:
>>>>>>
>>>>>> The issue is easy to reproduced with large vmalloc, kindly ping...
>>>>>>
>>>>>> On 2023/9/15 8:58, Kefeng Wang wrote:
>>>>>>> Hi All, any suggest or comments,many thanks.
>>>>>>>
>>>>>>> On 2023/9/6 20:42, Kefeng Wang wrote:
>>>>>>>> This is a RFC, even patch3 is a hack to fix the softlock issue when
>>>>>>>> populate or depopulate pte with large region, looking forward to your
>>>>>>>> reply and advise, thanks.
>>>>>>>
>>>>>>> Here is full stack,for populate pte,
>>>>>>>
>>>>>>> [ C3] watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [insmod:458]
>>>>>>> [ C3] Modules linked in: test(OE+)
>>>>>>> [ C3] irq event stamp: 320776
>>>>>>> [ C3] hardirqs last enabled at (320775): [<ffff8000815a0c98>]
>>>>>>> _raw_spin_unlock_irqrestore+0x98/0xb8
>>>>>>> [ C3] hardirqs last disabled at (320776): [<ffff8000815816e0>]
>>>>>>> el1_interrupt+0x38/0xa8
>>>>>>> [ C3] softirqs last enabled at (318174): [<ffff800080040ba8>]
>>>>>>> __do_softirq+0x658/0x7ac
>>>>>>> [ C3] softirqs last disabled at (318169): [<ffff800080047fd8>]
>>>>>>> ____do_softirq+0x18/0x30
>>>>>>> [ C3] CPU: 3 PID: 458 Comm: insmod Tainted: G OE 6.5.0+ #595
>>>>>>> [ C3] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
>>>>>>> [ C3] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>>>>> [ C3] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
>>>>>>> [ C3] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
>>>>>>> [ C3] sp : ffff800093386d70
>>>>>>> [ C3] x29: ffff800093386d70 x28: 0000000000000801 x27: ffff0007ffffa9c0
>>>>>>> [ C3] x26: 0000000000000000 x25: 000000000000003f x24: fffffc0004353708
>>>>>>> [ C3] x23: ffff0006d476bad8 x22: fffffc0004353748 x21: 0000000000000000
>>>>>>> [ C3] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
>>>>>>> [ C3] x17: ffff80008024e7fc x16: ffff80008055a8f0 x15: ffff80008024ec60
>>>>>>> [ C3] x14: ffff80008024ead0 x13: ffff80008024e7fc x12: ffff6000fffff5f9
>>>>>>> [ C3] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
>>>>>>> [ C3] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
>>>>>>> [ C3] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700012670d70
>>>>>>> [ C3] x2 : 0000000000000001 x1 : c9a5dbfae610fa24 x0 : 000000000004e507
>>>>>>> [ C3] Call trace:
>>>>>>> [ C3] _raw_spin_unlock_irqrestore+0x50/0xb8
>>>>>>> [ C3] rmqueue_bulk+0x434/0x6b8
>>>>>>> [ C3] get_page_from_freelist+0xdd4/0x1680
>>>>>>> [ C3] __alloc_pages+0x244/0x508
>>>>>>> [ C3] alloc_pages+0xf0/0x218
>>>>>>> [ C3] __get_free_pages+0x1c/0x50
>>>>>>> [ C3] kasan_populate_vmalloc_pte+0x30/0x188
>>>>>>> [ C3] __apply_to_page_range+0x3ec/0x650
>>>>>>> [ C3] apply_to_page_range+0x1c/0x30
>>>>>>> [ C3] kasan_populate_vmalloc+0x60/0x70
>>>>>>> [ C3] alloc_vmap_area.part.67+0x328/0xe50
>>>>>>> [ C3] alloc_vmap_area+0x4c/0x78
>>>>>>> [ C3] __get_vm_area_node.constprop.76+0x130/0x240
>>>>>>> [ C3] __vmalloc_node_range+0x12c/0x340
>>>>>>> [ C3] __vmalloc_node+0x8c/0xb0
>>>>>>> [ C3] vmalloc+0x2c/0x40
>>>>>>> [ C3] show_mem_init+0x1c/0xff8 [test]
>>>>>>> [ C3] do_one_initcall+0xe4/0x500
>>>>>>> [ C3] do_init_module+0x100/0x358
>>>>>>> [ C3] load_module+0x2e64/0x2fc8
>>>>>>> [ C3] init_module_from_file+0xec/0x148
>>>>>>> [ C3] idempotent_init_module+0x278/0x380
>>>>>>> [ C3] __arm64_sys_finit_module+0x88/0xf8
>>>>>>> [ C3] invoke_syscall+0x64/0x188
>>>>>>> [ C3] el0_svc_common.constprop.1+0xec/0x198
>>>>>>> [ C3] do_el0_svc+0x48/0xc8
>>>>>>> [ C3] el0_svc+0x3c/0xe8
>>>>>>> [ C3] el0t_64_sync_handler+0xa0/0xc8
>>>>>>> [ C3] el0t_64_sync+0x188/0x190
>>>>>>>
> This trace is stuck in the rmqueue_bulk() because you request a
> huge alloc size. It has nothing to do with free_vmap_area_lock,
> it is about bulk allocator. It gets stuck to accomplish such
> demand.
Yes, this is not a spinlock issue; it simply spends too much time in
kasan_populate_vmalloc(), since __apply_to_page_range() walks a very
large range, and it can be fixed by adding a cond_resched() on that
path, see patch 1.
>
>
>>>>>>> and for depopuldate pte,
>>>>>>>
>>>>>>> [ C6] watchdog: BUG: soft lockup - CPU#6 stuck for 48s! [kworker/6:1:59]
>>>>>>> [ C6] Modules linked in: test(OE+)
>>>>>>> [ C6] irq event stamp: 39458
>>>>>>> [ C6] hardirqs last enabled at (39457): [<ffff8000815a0c98>]
>>>>>>> _raw_spin_unlock_irqrestore+0x98/0xb8
>>>>>>> [ C6] hardirqs last disabled at (39458): [<ffff8000815816e0>]
>>>>>>> el1_interrupt+0x38/0xa8
>>>>>>> [ C6] softirqs last enabled at (39420): [<ffff800080040ba8>]
>>>>>>> __do_softirq+0x658/0x7ac
>>>>>>> [ C6] softirqs last disabled at (39415): [<ffff800080047fd8>]
>>>>>>> ____do_softirq+0x18/0x30
>>>>>>> [ C6] CPU: 6 PID: 59 Comm: kworker/6:1 Tainted: G OEL
>>>>>>> 6.5.0+ #595
>>>>>>> [ C6] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
>>>>>>> [ C6] Workqueue: events drain_vmap_area_work
>>>>>>> [ C6] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>>>>> [ C6] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
>>>>>>> [ C6] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
>>>>>>> [ C6] sp : ffff80008fe676b0
>>>>>>> [ C6] x29: ffff80008fe676b0 x28: fffffc000601d310 x27: ffff000edf5dfa80
>>>>>>> [ C6] x26: ffff000edf5dfad8 x25: 0000000000000000 x24: 0000000000000006
>>>>>>> [ C6] x23: ffff000edf5dfad4 x22: 0000000000000000 x21: 0000000000000006
>>>>>>> [ C6] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
>>>>>>> [ C6] x17: ffff8000805544b8 x16: ffff800080553d94 x15: ffff8000805c11b0
>>>>>>> [ C6] x14: ffff8000805baeb0 x13: ffff800080047e10 x12: ffff6000fffff5f9
>>>>>>> [ C6] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
>>>>>>> [ C6] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
>>>>>>> [ C6] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700011fcce98
>>>>>>> [ C6] x2 : 0000000000000001 x1 : cf09d5450e2b4f7f x0 : 0000000000009a21
>>>>>>> [ C6] Call trace:
>>>>>>> [ C6] _raw_spin_unlock_irqrestore+0x50/0xb8
>>>>>>> [ C6] free_pcppages_bulk+0x2bc/0x3e0
>>>>>>> [ C6] free_unref_page_commit+0x1fc/0x290
>>>>>>> [ C6] free_unref_page+0x184/0x250
>>>>>>> [ C6] __free_pages+0x154/0x1a0
>>>>>>> [ C6] free_pages+0x88/0xb0
>>>>>>> [ C6] kasan_depopulate_vmalloc_pte+0x58/0x80
>>>>>>> [ C6] __apply_to_page_range+0x3ec/0x650
>>>>>>> [ C6] apply_to_existing_page_range+0x1c/0x30
>>>>>>> [ C6] kasan_release_vmalloc+0xa4/0x118
>>>>>>> [ C6] __purge_vmap_area_lazy+0x4f4/0xe30
>>>>>>> [ C6] drain_vmap_area_work+0x60/0xc0
>>>>>>> [ C6] process_one_work+0x4cc/0xa38
>>>>>>> [ C6] worker_thread+0x240/0x638
>>>>>>> [ C6] kthread+0x1c8/0x1e0
>>>>>>> [ C6] ret_from_fork+0x10/0x20
>>>>>>>
>>
>> See Call Trace of softlock, when map/unmap the vmalloc buf, the kasan will
>> populate and depopulate vmalloc pte, those will spend more time than
>> no-kasan kernel, for unmap, and there is already a cond_resched_lock() in
>> __purge_vmap_area_lazy(), but with more time consumed under
>> spinlock(free_vmap_area_lock), and we couldn't add cond_resched_lock in
>> kasan_depopulate_vmalloc_pte(), so if spin lock converted to mutex lock, we
>> could add a cond_resched into kasan depopulate, this is why make such
>> conversion if kasan enabled, but this
>> conversion maybe not correct, any better solution?
>>
> I have at least below thoughts:
>
> a) Add a max allowed threshold that user can request over vmalloc() call.
> I do not think ~40G is a correct request.
I don't know, but maybe some driver could map a large range; we do hit
this issue in qemu, though with very low probability.
>
> b) This can fix unmap path:
>
> <snip>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index ef8599d394fd..988735da5c5c 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -1723,7 +1723,6 @@ static void purge_fragmented_blocks_allcpus(void);
> */
> static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
> {
> - unsigned long resched_threshold;
> unsigned int num_purged_areas = 0;
> struct list_head local_purge_list;
> struct vmap_area *va, *n_va;
> @@ -1747,36 +1746,32 @@ static bool __purge_vmap_area_lazy(unsigned long start, unsigned long end)
> struct vmap_area, list)->va_end);
>
> flush_tlb_kernel_range(start, end);
> - resched_threshold = lazy_max_pages() << 1;
>
> - spin_lock(&free_vmap_area_lock);
> list_for_each_entry_safe(va, n_va, &local_purge_list, list) {
> unsigned long nr = (va->va_end - va->va_start) >> PAGE_SHIFT;
> unsigned long orig_start = va->va_start;
> unsigned long orig_end = va->va_end;
>
> + if (is_vmalloc_or_module_addr((void *)orig_start))
> + kasan_release_vmalloc(orig_start, orig_end,
> + va->va_start, va->va_end);
> +
> /*
> * Finally insert or merge lazily-freed area. It is
> * detached and there is no need to "unlink" it from
> * anything.
> */
> + spin_lock(&free_vmap_area_lock);
> va = merge_or_add_vmap_area_augment(va, &free_vmap_area_root,
> &free_vmap_area_list);
> + spin_unlock(&free_vmap_area_lock);
>
> if (!va)
> continue;
>
> - if (is_vmalloc_or_module_addr((void *)orig_start))
> - kasan_release_vmalloc(orig_start, orig_end,
> - va->va_start, va->va_end);
> -
> atomic_long_sub(nr, &vmap_lazy_nr);
> num_purged_areas++;
> -
> - if (atomic_long_read(&vmap_lazy_nr) < resched_threshold)
> - cond_resched_lock(&free_vmap_area_lock);
> }
> - spin_unlock(&free_vmap_area_lock);
>
> out:
> trace_purge_vmap_area_lazy(start, end, num_purged_areas);
> <snip>
Thanks for your suggestion, but looking at kasan_release_vmalloc(), it seems
that it needs to run under free_vmap_area_lock according to the comment [1].
Marco and all KASAN maintainers, please help check the approach above.
[1] https://elixir.bootlin.com/linux/v6.6-rc6/source/mm/kasan/shadow.c#L491
>
> c) bulk-path i have not checked, but on a high level kasan_populate_vmalloc()
> should take a breath between requests.
>
> --
> Uladzislau Rezki
>
^ permalink raw reply [flat|nested] 14+ messages in thread
end of thread, other threads:[~2023-10-19 9:47 UTC | newest]
Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-09-06 12:42 [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte Kefeng Wang
2023-09-06 12:42 ` [PATCH -rfc 1/3] mm: kasan: shadow: add cond_resched() in kasan_populate_vmalloc_pte() Kefeng Wang
2023-09-06 12:42 ` [PATCH -rfc 2/3] mm: kasan: shadow: move free_page() out of page table lock Kefeng Wang
2023-09-06 12:42 ` [PATCH -rfc 3/3] mm: kasan: shadow: HACK: add cond_resched_lock() in kasan_depopulate_vmalloc_pte() Kefeng Wang
2023-09-13 8:48 ` kernel test robot
2023-09-13 11:21 ` Kefeng Wang
2023-09-15 0:58 ` [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte Kefeng Wang
2023-10-18 14:16 ` Kefeng Wang
2023-10-18 16:37 ` Marco Elver
2023-10-19 1:40 ` Kefeng Wang
2023-10-19 6:17 ` Uladzislau Rezki
2023-10-19 7:26 ` Kefeng Wang
2023-10-19 8:53 ` Uladzislau Rezki
2023-10-19 9:47 ` Kefeng Wang