linux-mm.kvack.org archive mirror
* [PATCH RFC 0/1] fix for large folio split race in page cache
@ 2026-03-05 18:34 Chris J Arges
  2026-03-05 18:34 ` [PATCH RFC 1/1] mm/filemap: handle large folio split race in page cache lookups Chris J Arges
  0 siblings, 1 reply; 3+ messages in thread
From: Chris J Arges @ 2026-03-05 18:34 UTC
  To: willy, akpm, william.kucharski
  Cc: linux-fsdevel, linux-mm, linux-kernel, kernel-team, Chris J Arges

In production we've seen crashes on 6.18.7+ with the signature
below. The affected machines are under high memory pressure and use
XFS filesystems, and journalctl was generally the comm at the time of
the oops.

After some crash-dump analysis we determined that this was a race
condition. We tried to create a more self-contained reproducer for this
issue, but unfortunately were unable to do so. This patch will be
applied internally as a mitigation for the issue, but will take time
to validate fully (ensuring we don't see crashes over a longer time). We
are looking for feedback to see if this could be a valid fix or if there
are other approaches that we should look into.

An earlier email I posted with some analysis is here:
https://lore.kernel.org/lkml/aYN3JC_Kdgw5G2Ik@861G6M3/T/#u

Thanks,
--chris

Call Trace:
```
aops:xfs_address_space_operations ino:5000126 dentry name(?):"system@d737aaecce5449038a638f9e18bbf5f5-0000000004e06fa7-00064"
flags: 0xeffff8000001ad(locked|waiters|referenced|uptodate|lru|active|node=3|zone=2|lastcpupid=0x1ffff)
raw: 00effff8000001ad ffaa3c6b85b73ec8 ffaa3c6b85b73e08 ff4e378b0e95dea8
raw: 000000000000737a 0000000000000000 00000002ffffffff ff4e379527691b00
page dumped because: VM_BUG_ON_FOLIO(!folio_contains(folio, index))
------------[ cut here ]------------
kernel BUG at mm/filemap.c:3519!
Oops: invalid opcode: 0000 [#1] SMP NOPTI
CPU: 0 UID: 0 PID: 49159 Comm: journalctl Kdump: loaded Tainted: G        W  O        6.18.7-cloudflare-2026.1.15 #1 PREEMPT(voluntary)
Tainted: [W]=WARN, [O]=OOT_MODULE
Hardware name: MiTAC TC55-B8051-G12/S8051GM, BIOS V1.08 09/16/2025
RIP: 0010:filemap_fault+0xa61/0x1410
Code: 48 8b 4c 24 10 4c 8b 44 24 08 48 85 c9 0f 84 82 fa ff ff 49 89 cd e9 bc f9 ff ff 48 c7 c6 20 44 d0 86 4c 89 c7 e8 3f 1c 04 00 <0f> 0b 48 8d 7b 18 4c 89 44 24 08 4c 89 1c 24 e8 0b 97 e3 ff 4c 8b
RSP: 0000:ff6fd043bed0fcb0 EFLAGS: 00010246
RAX: 0000000000000043 RBX: ff4e378b0e95dea8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ff4e375cef81c4c0
RBP: 000000000000737b R08: 0000000000000000 R09: ff6fd043bed0fb48
R10: ff4e37b4ecc3ffa8 R11: 0000000000000003 R12: 0000000000000000
R13: ff4e375c4fa17680 R14: ff4e378b0e95dd38 R15: ff6fd043bed0fde8
FS:  00007f6c5b8b4980(0000) GS:ff4e375d67864000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6c48b7b050 CR3: 0000005065d34006 CR4: 0000000000771ef0
PKRU: 55555554
Call Trace:
 <TASK>
 ? mod_memcg_state+0x80/0x1c0
 __do_fault+0x31/0xd0
 do_fault+0x2e6/0x710
 __handle_mm_fault+0x7b3/0xe50
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? anon_pipe_write+0x27e/0x670
 handle_mm_fault+0xaa/0x2a0
 do_user_addr_fault+0x208/0x660
 exc_page_fault+0x77/0x170
 asm_exc_page_fault+0x26/0x30
RIP: 0033:0x7f6c5b67c3dc
Code: e2 ff 66 66 2e 0f 1f 84 00 00 00 00 00 90 41 55 41 54 55 53 48 83 ec 18 48 85 ff 0f 84 bd 01 00 00 48 85 f6 0f 84 d4 01 00 00 <48> 8b 5e 08 48 89 cd 48 85 db 74 60 48 83 fb 0f 0f 86 86 00 00 00
RSP: 002b:00007ffe78c072e0 EFLAGS: 00010206
RAX: 0000000000000000 RBX: 000000000737b048 RCX: 000000000737b048
RDX: 0000000000000003 RSI: 00007f6c48b7b048 RDI: 000055bc3b28dee0
RBP: 000055bc3b28dee0 R08: 0000000000000010 R09: 000055bc3b28df18
R10: 0000000000000001 R11: 00007f6c5b679fa0 R12: 0000000000000003
R13: 00007ffe78c07450 R14: 00007ffe78c07450 R15: 00007f6c48b7b048
 </TASK>
```
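
The dump above can be read against the failing assertion with a small userspace model. This is only a sketch under stated assumptions, not the kernel code: `struct model_folio`, `offset_to_index()` and `model_folio_contains()` are hypothetical names, 4 KiB pages are assumed, and the values are one reading of the dump (the fifth raw word, 0x737a, taken as the folio index; the user-side RBX value 0x737b048 taken as the file offset being accessed, which matches RBP = 0x737b). The order-0 folio is also an assumption, suggested by the flags line listing no "head" flag.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages on this x86-64 machine */

/* Hypothetical stand-in for the two folio properties the check reads. */
struct model_folio {
	uint64_t index;     /* page cache index of the folio's first page */
	unsigned int order; /* the folio spans 1 << order pages */
};

/* Page cache index that covers a byte offset into the file. */
static uint64_t offset_to_index(uint64_t file_offset)
{
	return file_offset >> PAGE_SHIFT;
}

/*
 * Models the containment test: true when index lies in
 * [folio->index, folio->index + nr_pages). Unsigned wraparound makes
 * an index below folio->index fail as well.
 */
static bool model_folio_contains(const struct model_folio *folio,
				 uint64_t index)
{
	return index - folio->index < (UINT64_C(1) << folio->order);
}
```

With these assumed values, `offset_to_index(0x737b048)` gives 0x737b, and a folio with `index = 0x737a, order = 0` does not contain it, which is the condition the VM_BUG_ON_FOLIO reports. A folio of order 1 at the same index would contain it, which is consistent with a lookup that found a large folio and then saw it split into smaller folios before the check ran.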

Chris J Arges (1):
  mm/filemap: handle large folio split race in page cache lookups

 mm/filemap.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

--
2.43.0




end of thread, other threads:[~2026-03-05 19:24 UTC | newest]

Thread overview: 3+ messages
2026-03-05 18:34 [PATCH RFC 0/1] fix for large folio split race in page cache Chris J Arges
2026-03-05 18:34 ` [PATCH RFC 1/1] mm/filemap: handle large folio split race in page cache lookups Chris J Arges
2026-03-05 19:24   ` Matthew Wilcox
