* [syzbot] [mm?] WARNING in folio_large_mapcount
@ 2025-05-17 8:21 syzbot
2025-05-19 13:26 ` David Hildenbrand
0 siblings, 1 reply; 7+ messages in thread
From: syzbot @ 2025-05-17 8:21 UTC
To: Liam.Howlett, akpm, baolin.wang, david, dev.jain, linux-kernel,
linux-mm, lorenzo.stoakes, npache, ryan.roberts, syzkaller-bugs,
ziy
Hello,
syzbot found the following issue on:
HEAD commit: 627277ba7c23 Merge tag 'arm64_cbpf_mitigation_2025_05_08' ..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1150f670580000
kernel config: https://syzkaller.appspot.com/x/.config?x=5929ac65be9baf3c
dashboard link: https://syzkaller.appspot.com/bug?extid=2b99589e33edbe9475ca
compiler: Debian clang version 20.1.2 (++20250402124445+58df0ef89dd6-1~exp1~20250402004600.97), Debian LLD 20.1.2
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/0a42ae72fe0e/disk-627277ba.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/0be88297bb66/vmlinux-627277ba.xz
kernel image: https://storage.googleapis.com/syzbot-assets/31808a4b1210/bzImage-627277ba.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+2b99589e33edbe9475ca@syzkaller.appspotmail.com
------------[ cut here ]------------
WARNING: CPU: 1 PID: 38 at ./include/linux/mm.h:1335 folio_large_mapcount+0xd0/0x110 include/linux/mm.h:1335
Modules linked in:
CPU: 1 UID: 0 PID: 38 Comm: khugepaged Not tainted 6.15.0-rc6-syzkaller-00025-g627277ba7c23 #0 PREEMPT(full)
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
RIP: 0010:folio_large_mapcount+0xd0/0x110 include/linux/mm.h:1335
Code: 04 38 84 c0 75 29 8b 03 ff c0 5b 41 5e 41 5f e9 96 d2 2b 09 cc e8 d0 cb 99 ff 48 89 df 48 c7 c6 20 de 77 8b e8 a1 dc de ff 90 <0f> 0b 90 eb b6 89 d9 80 e1 07 80 c1 03 38 c1 7c cb 48 89 df e8 87
RSP: 0018:ffffc90000af77e0 EFLAGS: 00010246
RAX: e1fcb38c0ff8ce00 RBX: ffffea00014c8000 RCX: e1fcb38c0ff8ce00
RDX: 0000000000000001 RSI: ffffffff8d9226df RDI: ffff88801e2fbc00
RBP: ffffc90000af7b50 R08: ffff8880b8923e93 R09: 1ffff110171247d2
R10: dffffc0000000000 R11: ffffed10171247d3 R12: 1ffffd4000299000
R13: dffffc0000000000 R14: 0000000000000000 R15: dffffc0000000000
FS: 0000000000000000(0000) GS:ffff8881261fb000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffe58f12dc0 CR3: 0000000030e04000 CR4: 00000000003526f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
folio_mapcount include/linux/mm.h:1369 [inline]
is_refcount_suitable+0x350/0x430 mm/khugepaged.c:553
hpage_collapse_scan_file+0x6d4/0x4200 mm/khugepaged.c:2323
khugepaged_scan_mm_slot mm/khugepaged.c:2447 [inline]
khugepaged_do_scan mm/khugepaged.c:2548 [inline]
khugepaged+0xa2a/0x1690 mm/khugepaged.c:2604
kthread+0x70e/0x8a0 kernel/kthread.c:464
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:153
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
</TASK>
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
* Re: [syzbot] [mm?] WARNING in folio_large_mapcount
2025-05-17 8:21 [syzbot] [mm?] WARNING in folio_large_mapcount syzbot
@ 2025-05-19 13:26 ` David Hildenbrand
2025-05-20 5:45 ` Shivank Garg
0 siblings, 1 reply; 7+ messages in thread
From: David Hildenbrand @ 2025-05-19 13:26 UTC
To: syzbot, Liam.Howlett, akpm, baolin.wang, dev.jain, linux-kernel,
linux-mm, lorenzo.stoakes, npache, ryan.roberts, syzkaller-bugs,
ziy, Matthew Wilcox
On 17.05.25 10:21, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 627277ba7c23 Merge tag 'arm64_cbpf_mitigation_2025_05_08' ..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1150f670580000
> kernel config: https://syzkaller.appspot.com/x/.config?x=5929ac65be9baf3c
> dashboard link: https://syzkaller.appspot.com/bug?extid=2b99589e33edbe9475ca
> compiler: Debian clang version 20.1.2 (++20250402124445+58df0ef89dd6-1~exp1~20250402004600.97), Debian LLD 20.1.2
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/0a42ae72fe0e/disk-627277ba.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/0be88297bb66/vmlinux-627277ba.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/31808a4b1210/bzImage-627277ba.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+2b99589e33edbe9475ca@syzkaller.appspotmail.com
>
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 38 at ./include/linux/mm.h:1335 folio_large_mapcount+0xd0/0x110 include/linux/mm.h:1335
This should be
VM_WARN_ON_FOLIO(!folio_test_large(folio), folio);
> Modules linked in:
> CPU: 1 UID: 0 PID: 38 Comm: khugepaged Not tainted 6.15.0-rc6-syzkaller-00025-g627277ba7c23 #0 PREEMPT(full)
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/07/2025
> RIP: 0010:folio_large_mapcount+0xd0/0x110 include/linux/mm.h:1335
> Code: 04 38 84 c0 75 29 8b 03 ff c0 5b 41 5e 41 5f e9 96 d2 2b 09 cc e8 d0 cb 99 ff 48 89 df 48 c7 c6 20 de 77 8b e8 a1 dc de ff 90 <0f> 0b 90 eb b6 89 d9 80 e1 07 80 c1 03 38 c1 7c cb 48 89 df e8 87
> RSP: 0018:ffffc90000af77e0 EFLAGS: 00010246
> RAX: e1fcb38c0ff8ce00 RBX: ffffea00014c8000 RCX: e1fcb38c0ff8ce00
> RDX: 0000000000000001 RSI: ffffffff8d9226df RDI: ffff88801e2fbc00
> RBP: ffffc90000af7b50 R08: ffff8880b8923e93 R09: 1ffff110171247d2
> R10: dffffc0000000000 R11: ffffed10171247d3 R12: 1ffffd4000299000
> R13: dffffc0000000000 R14: 0000000000000000 R15: dffffc0000000000
> FS: 0000000000000000(0000) GS:ffff8881261fb000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007ffe58f12dc0 CR3: 0000000030e04000 CR4: 00000000003526f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
> <TASK>
> folio_mapcount include/linux/mm.h:1369 [inline]
And here we come through
if (likely(!folio_test_large(folio))) {
...
}
return folio_large_mapcount(folio);
So the folio was split concurrently. And I think there is nothing
stopping it from getting freed, either.
We do a xas_for_each() under RCU only, without holding a folio
reference, so yes, this is racy.
In collapse_file(), we re-validate everything.
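To make the window concrete, here is a minimal userspace sketch, not kernel code. Everything in it (struct fake_folio, fake_mapcount(), fake_large_mapcount(), split_folio() and the between_checks hook) is a hypothetical stand-in; the hook simply models a concurrent split landing between the two folio_test_large() reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio; only the fields the model needs. */
struct fake_folio {
	bool large;		/* models folio_test_large() */
	int small_mapcount;
	int large_mapcount;
};

static int warnings;		/* how often the modelled warning fired */

/* Models folio_large_mapcount(): only meaningful for large folios. */
static int fake_large_mapcount(const struct fake_folio *f)
{
	if (!f->large)		/* models VM_WARN_ON_FOLIO(!folio_test_large(folio), folio) */
		warnings++;
	return f->large_mapcount;
}

/*
 * Models folio_mapcount(). The hook runs after the first check and before
 * the large-folio helper, which is where a concurrent split can land.
 */
static int fake_mapcount(struct fake_folio *f,
			 void (*between_checks)(struct fake_folio *))
{
	assert(f != NULL);
	if (!f->large)
		return f->small_mapcount;
	if (between_checks)
		between_checks(f);
	return fake_large_mapcount(f);
}

/* Simulated concurrent split: the folio stops being large. */
static void split_folio(struct fake_folio *f)
{
	f->large = false;
}
```

Calling fake_mapcount(&f, NULL) on a large folio leaves the warning counter untouched, while fake_mapcount(&f, split_folio) bumps it once, which is the shape of the report above.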
We could
(A) Take proper pagecache locks
(B) Try grabbing a temporary folio reference
(C) Try snapshotting the folio
In this code, (B) is probably the cleanest option for now: handle it
just like other code in mm/filemap.c does.
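As a rough illustration of option (B), the following userspace sketch models the "speculative load, try-get, reload" sequence with C11 atomics. It is only a model under stated assumptions: struct obj, slot, obj_try_get(), obj_put() and lookup_get() are hypothetical names, the single slot stands in for one page cache entry, and the objects are assumed to stay valid memory while they are being inspected (in the kernel, that part is what RCU provides).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for a folio. */
struct obj {
	atomic_int refcount;	/* 0 models an object that is already being freed */
};

/* Hypothetical single slot standing in for one page cache entry. */
static _Atomic(struct obj *) slot;

/* Models folio_try_get(): take a reference unless the count is already 0. */
static bool obj_try_get(struct obj *o)
{
	int old = atomic_load(&o->refcount);

	while (old != 0) {
		/* On failure, old is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	return false;
}

static void obj_put(struct obj *o)
{
	int old = atomic_fetch_sub(&o->refcount, 1);

	assert(old > 0);	/* dropping a reference we never held is a bug */
	(void)old;
}

/*
 * Lockless lookup: load the slot, try to take a reference, then re-read
 * the slot. Returns a referenced object, or NULL if the caller should
 * skip or retry this entry.
 */
static struct obj *lookup_get(void)
{
	struct obj *o = atomic_load(&slot);

	if (!o || !obj_try_get(o))
		return NULL;
	if (o != atomic_load(&slot)) {	/* models folio != xas_reload(&xas) */
		obj_put(o);
		return NULL;
	}
	return o;
}
```

The point of the second load is that a successful try-get only proves the object was alive, not that it is still the object installed in the slot; once both checks pass, the reference held by the caller keeps it from being freed underneath the subsequent checks.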
> is_refcount_suitable+0x350/0x430 mm/khugepaged.c:553
> hpage_collapse_scan_file+0x6d4/0x4200 mm/khugepaged.c:2323
> khugepaged_scan_mm_slot mm/khugepaged.c:2447 [inline]
> khugepaged_do_scan mm/khugepaged.c:2548 [inline]
> khugepaged+0xa2a/0x1690 mm/khugepaged.c:2604
> kthread+0x70e/0x8a0 kernel/kthread.c:464
> ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:153
> ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> </TASK>
--
Cheers,
David / dhildenb
* Re: [syzbot] [mm?] WARNING in folio_large_mapcount
2025-05-19 13:26 ` David Hildenbrand
@ 2025-05-20 5:45 ` Shivank Garg
2025-05-20 5:46 ` syzbot
2025-05-20 14:05 ` David Hildenbrand
0 siblings, 2 replies; 7+ messages in thread
From: Shivank Garg @ 2025-05-20 5:45 UTC
To: David Hildenbrand, syzbot, Liam.Howlett, akpm, baolin.wang,
dev.jain, linux-kernel, linux-mm, lorenzo.stoakes, npache,
ryan.roberts, syzkaller-bugs, ziy, Matthew Wilcox
[-- Attachment #1: Type: text/plain, Size: 3638 bytes --]
On 5/19/2025 6:56 PM, David Hildenbrand wrote:
> On 17.05.25 10:21, syzbot wrote:
>> folio_mapcount include/linux/mm.h:1369 [inline]
>
> And here we come through
>
> if (likely(!folio_test_large(folio))) {
> ...
> }
> return folio_large_mapcount(folio);
>
>
> So the folio is split concurrently. And I think there is nothing stopping it from getting freed.
>
> We do a xas_for_each() under RCU. So yes, this is racy.
>
> In collapse_file(), we re-validate everything.
>
> We could
>
> (A) Take proper pagecache locks
>
> (B) Try grabbing a temporary folio reference
>
> (C) Try snapshotting the folio
>
> Probably, in this code, (B) might be cleanest for now? Handling it just like other code in mm/filemap.c.
>
Hi,
I've implemented your suggestion (B) using folio_try_get().
Could you please review if my patch looks correct?
Tested it using existing selftests: sudo make -C tools/testing/selftests/mm run_tests
The other two instances of is_refcount_suitable() use folio locking. Should we
maintain consistency with those?
Thanks,
Shivank
#syz test
[-- Attachment #2: 0001-mm-khugepaged-Fix-race-with-folio-splitting-in-hpage.patch --]
[-- Type: text/plain, Size: 2498 bytes --]
From d1c3427e80215fea992428c8b5caf5291725dd65 Mon Sep 17 00:00:00 2001
From: Shivank Garg <shivankg@amd.com>
Date: Mon, 19 May 2025 20:19:32 +0000
Subject: [PATCH] mm/khugepaged: Fix race with folio splitting in
hpage_collapse_scan_file()
folio_mapcount() checks folio_test_large() before proceeding to
folio_large_mapcount(), but there is a race window in which the folio
can be split between these two checks. This triggered the
VM_WARN_ON_FOLIO(!folio_test_large(folio), folio) in
folio_large_mapcount().
Take a temporary folio reference in hpage_collapse_scan_file() to prevent
races with concurrent folio splitting/freeing. This prevents potential
incorrect large folio detection.
Reported-by: syzbot+2b99589e33edbe9475ca@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6828470d.a70a0220.38f255.000c.GAE@google.com
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Shivank Garg <shivankg@amd.com>
---
mm/khugepaged.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index cc945c6ab3bd..ef4f95409723 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2295,6 +2295,19 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
continue;
}
+ /* Take a reference to prevent any concurrent split or free. */
+ if (!folio_try_get(folio)) {
+ xas_reset(&xas);
+ continue;
+ }
+
+ /* Has the folio been freed or split? */
+ if (unlikely(folio != xas_reload(&xas))) {
+ folio_put(folio);
+ xas_reset(&xas);
+ continue;
+ }
+
if (folio_order(folio) == HPAGE_PMD_ORDER &&
folio->index == start) {
/* Maybe PMD-mapped */
@@ -2305,23 +2318,27 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
* it's safe to skip LRU and refcount checks before
* returning.
*/
+ folio_put(folio);
break;
}
node = folio_nid(folio);
if (hpage_collapse_scan_abort(node, cc)) {
result = SCAN_SCAN_ABORT;
+ folio_put(folio);
break;
}
cc->node_load[node]++;
if (!folio_test_lru(folio)) {
result = SCAN_PAGE_LRU;
+ folio_put(folio);
break;
}
if (!is_refcount_suitable(folio)) {
result = SCAN_PAGE_COUNT;
+ folio_put(folio);
break;
}
@@ -2333,6 +2350,7 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
*/
present += folio_nr_pages(folio);
+ folio_put(folio);
if (need_resched()) {
xas_pause(&xas);
--
2.34.1
* Re: [syzbot] [mm?] WARNING in folio_large_mapcount
2025-05-20 5:45 ` Shivank Garg
@ 2025-05-20 5:46 ` syzbot
2025-05-20 14:05 ` David Hildenbrand
1 sibling, 0 replies; 7+ messages in thread
From: syzbot @ 2025-05-20 5:46 UTC
To: shivankg
Cc: akpm, baolin.wang, david, dev.jain, liam.howlett, linux-kernel,
linux-mm, lorenzo.stoakes, npache, ryan.roberts, shivankg,
syzkaller-bugs, willy, ziy
> #syz test
This crash does not have a reproducer. I cannot test it.
* Re: [syzbot] [mm?] WARNING in folio_large_mapcount
2025-05-20 5:45 ` Shivank Garg
2025-05-20 5:46 ` syzbot
@ 2025-05-20 14:05 ` David Hildenbrand
2025-05-22 4:57 ` Shivank Garg
1 sibling, 1 reply; 7+ messages in thread
From: David Hildenbrand @ 2025-05-20 14:05 UTC
To: Shivank Garg, syzbot, Liam.Howlett, akpm, baolin.wang, dev.jain,
linux-kernel, linux-mm, lorenzo.stoakes, npache, ryan.roberts,
syzkaller-bugs, ziy, Matthew Wilcox
On 20.05.25 07:45, Shivank Garg wrote:
> Hi,
Hi,
>
> I've implemented your suggestion (B) using folio_try_get().
> Could you please review if my patch looks correct?
You should probably drop both comments; the code merely mimics what
filemap.c does.
Apart from that, nothing jumped out at me.
--
Cheers,
David / dhildenb
* Re: [syzbot] [mm?] WARNING in folio_large_mapcount
2025-05-20 14:05 ` David Hildenbrand
@ 2025-05-22 4:57 ` Shivank Garg
2025-05-22 7:11 ` David Hildenbrand
0 siblings, 1 reply; 7+ messages in thread
From: Shivank Garg @ 2025-05-22 4:57 UTC
To: David Hildenbrand, syzbot, Liam.Howlett, akpm, baolin.wang,
dev.jain, linux-kernel, linux-mm, lorenzo.stoakes, npache,
ryan.roberts, syzkaller-bugs, ziy, Matthew Wilcox
[-- Attachment #1: Type: text/plain, Size: 3871 bytes --]
On 5/20/2025 7:35 PM, David Hildenbrand wrote:
> On 20.05.25 07:45, Shivank Garg wrote:
>> Hi,
>
> Hi,
>
>>
>> I've implemented your suggestion (B) using folio_try_get().
>> Could you please review if my patch looks correct?
>
> You should probably drop both comments; the code merely mimics what filemap.c does.
>
> Apart from that, nothing jumped out at me.
>
Thank you. I have attached a revised patch.
Best Regards,
Shivank
[-- Attachment #2: 0001-mm-khugepaged-Fix-race-with-folio-splitting-in-hpage.patch --]
[-- Type: text/plain, Size: 2385 bytes --]
From e77693b67a8c032d636f6c0bc3f179c4e9cc1133 Mon Sep 17 00:00:00 2001
From: Shivank Garg <shivankg@amd.com>
Date: Mon, 19 May 2025 20:19:32 +0000
Subject: [PATCH] mm/khugepaged: Fix race with folio splitting in
hpage_collapse_scan_file()
folio_mapcount() checks folio_test_large() before proceeding to
folio_large_mapcount(), but there is a race window in which the folio
can be split between these two checks. This triggered the
VM_WARN_ON_FOLIO(!folio_test_large(folio), folio) in
folio_large_mapcount().
Take a temporary folio reference in hpage_collapse_scan_file() to prevent
races with concurrent folio splitting/freeing. This prevents potential
incorrect large folio detection.
Reported-by: syzbot+2b99589e33edbe9475ca@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/6828470d.a70a0220.38f255.000c.GAE@google.com
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Shivank Garg <shivankg@amd.com>
---
mm/khugepaged.c | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index cc945c6ab3bd..6e8902f9d88c 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2295,6 +2295,17 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
continue;
}
+ if (!folio_try_get(folio)) {
+ xas_reset(&xas);
+ continue;
+ }
+
+ if (unlikely(folio != xas_reload(&xas))) {
+ folio_put(folio);
+ xas_reset(&xas);
+ continue;
+ }
+
if (folio_order(folio) == HPAGE_PMD_ORDER &&
folio->index == start) {
/* Maybe PMD-mapped */
@@ -2305,23 +2316,27 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
* it's safe to skip LRU and refcount checks before
* returning.
*/
+ folio_put(folio);
break;
}
node = folio_nid(folio);
if (hpage_collapse_scan_abort(node, cc)) {
result = SCAN_SCAN_ABORT;
+ folio_put(folio);
break;
}
cc->node_load[node]++;
if (!folio_test_lru(folio)) {
result = SCAN_PAGE_LRU;
+ folio_put(folio);
break;
}
if (!is_refcount_suitable(folio)) {
result = SCAN_PAGE_COUNT;
+ folio_put(folio);
break;
}
@@ -2333,6 +2348,7 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
*/
present += folio_nr_pages(folio);
+ folio_put(folio);
if (need_resched()) {
xas_pause(&xas);
--
2.34.1
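The patch above pairs the new folio_try_get() with a folio_put() on every way out of the loop body. A minimal userspace sketch of that discipline follows; it is a single-threaded model with hypothetical names throughout (struct model_folio, model_try_get(), model_put(), model_scan(), and the MODEL_* result codes, which only loosely mirror SCAN_PAGE_LRU and SCAN_PAGE_COUNT).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical result codes for the model. */
enum model_result { MODEL_OK, MODEL_NOT_LRU, MODEL_BAD_COUNT };

/* Hypothetical stand-in for a folio found while walking the page cache. */
struct model_folio {
	int refcount;	/* 0 models a folio that is already being freed */
	bool on_lru;
	bool count_ok;	/* models the outcome of the refcount check */
	int nr_pages;
};

static bool model_try_get(struct model_folio *f)
{
	if (f->refcount == 0)
		return false;
	f->refcount++;
	return true;
}

static void model_put(struct model_folio *f)
{
	assert(f->refcount > 0);
	f->refcount--;
}

/* Every path that leaves the loop body after a successful get does a put. */
static enum model_result model_scan(struct model_folio *folios, size_t n,
				    int *present)
{
	enum model_result result = MODEL_OK;

	*present = 0;
	for (size_t i = 0; i < n; i++) {
		struct model_folio *f = &folios[i];

		if (!model_try_get(f))
			continue;	/* no reference taken, nothing to put */

		if (!f->on_lru) {
			result = MODEL_NOT_LRU;
			model_put(f);
			break;
		}
		if (!f->count_ok) {
			result = MODEL_BAD_COUNT;
			model_put(f);
			break;
		}

		*present += f->nr_pages;
		model_put(f);
	}
	return result;
}
```

After model_scan() returns, every refcount is back at its starting value regardless of which exit was taken. One thing that may be worth double-checking in the real code: if is_refcount_suitable() compares folio_ref_count() against an expected value, the temporary reference held by the scanner itself would presumably have to be accounted for in that comparison. That is an assumption about code not shown in this thread.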
* Re: [syzbot] [mm?] WARNING in folio_large_mapcount
2025-05-22 4:57 ` Shivank Garg
@ 2025-05-22 7:11 ` David Hildenbrand
0 siblings, 0 replies; 7+ messages in thread
From: David Hildenbrand @ 2025-05-22 7:11 UTC
To: Shivank Garg, syzbot, Liam.Howlett, akpm, baolin.wang, dev.jain,
linux-kernel, linux-mm, lorenzo.stoakes, npache, ryan.roberts,
syzkaller-bugs, ziy, Matthew Wilcox
On 22.05.25 06:57, Shivank Garg wrote:
> On 5/20/2025 7:35 PM, David Hildenbrand wrote:
>> On 20.05.25 07:45, Shivank Garg wrote:
>>> Hi,
>>
>> Hi,
>>
>>>
>>> I've implemented your suggestion (B) using folio_try_get().
>>> Could you please review if my patch looks correct?
>>
>> You should probably drop both comments; the code merely mimics what filemap.c does.
>>
>> Apart from that, nothing jumped out at me.
>>
>
> Thank you. I have attached a revised patch.
LGTM, please send it as a proper patch.
I'm not sure which Fixes: tag to use.
I added that VM_WARN_ON_FOLIO in 05c5323b2a34 ("mm: track mapcount of
large folios in single value"), but the real problem is rather the
raciness of the code that probably dates back a bit further.
Maybe that code was always assumed to be okay when racing? Possibly.
Maybe let's just use 05c5323b2a34 unless we know that the raciness
resulted in some other problems earlier.
--
Cheers,
David / dhildenb