From: Chris J Arges <carges@cloudflare.com>
To: willy@infradead.org, akpm@linux-foundation.org,
	william.kucharski@oracle.com
Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, kernel-team@cloudflare.com,
	Chris J Arges <carges@cloudflare.com>
Subject: [PATCH RFC 0/1] fix for large folio split race in page cache
Date: Thu,  5 Mar 2026 12:34:32 -0600
Message-ID: <20260305183438.1062312-1-carges@cloudflare.com>

In production we've seen crashes on 6.18.7+ with the signature below.
The affected machines were under high memory pressure and were using
XFS filesystems, and journalctl was usually the comm at the time of the
oops.

After some crash-dump analysis we determined that this is a race
condition. We tried to create a self-contained reproducer for the
issue, but were unable to do so. We will apply this patch internally as
a mitigation, but validating it fully will take time, since we need to
confirm that the crashes do not recur over a longer period. We are
looking for feedback on whether this could be a valid fix or whether
there are other approaches we should look into.

An earlier email of mine with some analysis is here:
https://lore.kernel.org/lkml/aYN3JC_Kdgw5G2Ik@861G6M3/T/#u

Thanks,
--chris

Call Trace:
```
aops:xfs_address_space_operations ino:5000126 dentry name(?):"system@d737aaecce5449038a638f9e18bbf5f5-0000000004e06fa7-00064"
flags: 0xeffff8000001ad(locked|waiters|referenced|uptodate|lru|active|node=3|zone=2|lastcpupid=0x1ffff)
raw: 00effff8000001ad ffaa3c6b85b73ec8 ffaa3c6b85b73e08 ff4e378b0e95dea8
raw: 000000000000737a 0000000000000000 00000002ffffffff ff4e379527691b00
page dumped because: VM_BUG_ON_FOLIO(!folio_contains(folio, index))
------------[ cut here ]------------
kernel BUG at mm/filemap.c:3519!
Oops: invalid opcode: 0000 [#1] SMP NOPTI
CPU: 0 UID: 0 PID: 49159 Comm: journalctl Kdump: loaded Tainted: G        W  O        6.18.7-cloudflare-2026.1.15 #1 PREEMPT(voluntary)
Tainted: [W]=WARN, [O]=OOT_MODULE
Hardware name: MiTAC TC55-B8051-G12/S8051GM, BIOS V1.08 09/16/2025
RIP: 0010:filemap_fault+0xa61/0x1410
Code: 48 8b 4c 24 10 4c 8b 44 24 08 48 85 c9 0f 84 82 fa ff ff 49 89 cd e9 bc f9 ff ff 48 c7 c6 20 44 d0 86 4c 89 c7 e8 3f 1c 04 00 <0f> 0b 48 8d 7b 18 4c 89 44 24 08 4c 89 1c 24 e8 0b 97 e3 ff 4c 8b
RSP: 0000:ff6fd043bed0fcb0 EFLAGS: 00010246
RAX: 0000000000000043 RBX: ff4e378b0e95dea8 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000001 RDI: ff4e375cef81c4c0
RBP: 000000000000737b R08: 0000000000000000 R09: ff6fd043bed0fb48
R10: ff4e37b4ecc3ffa8 R11: 0000000000000003 R12: 0000000000000000
R13: ff4e375c4fa17680 R14: ff4e378b0e95dd38 R15: ff6fd043bed0fde8
FS:  00007f6c5b8b4980(0000) GS:ff4e375d67864000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6c48b7b050 CR3: 0000005065d34006 CR4: 0000000000771ef0
PKRU: 55555554
Call Trace:
 <TASK>
 ? mod_memcg_state+0x80/0x1c0
 __do_fault+0x31/0xd0
 do_fault+0x2e6/0x710
 __handle_mm_fault+0x7b3/0xe50
 ? srso_alias_return_thunk+0x5/0xfbef5
 ? anon_pipe_write+0x27e/0x670
 handle_mm_fault+0xaa/0x2a0
 do_user_addr_fault+0x208/0x660
 exc_page_fault+0x77/0x170
 asm_exc_page_fault+0x26/0x30
RIP: 0033:0x7f6c5b67c3dc
Code: e2 ff 66 66 2e 0f 1f 84 00 00 00 00 00 90 41 55 41 54 55 53 48 83 ec 18 48 85 ff 0f 84 bd 01 00 00 48 85 f6 0f 84 d4 01 00 00 <48> 8b 5e 08 48 89 cd 48 85 db 74 60 48 83 fb 0f 0f 86 86 00 00 00
RSP: 002b:00007ffe78c072e0 EFLAGS: 00010206
RAX: 0000000000000000 RBX: 000000000737b048 RCX: 000000000737b048
RDX: 0000000000000003 RSI: 00007f6c48b7b048 RDI: 000055bc3b28dee0
RBP: 000055bc3b28dee0 R08: 0000000000000010 R09: 000055bc3b28df18
R10: 0000000000000001 R11: 00007f6c5b679fa0 R12: 0000000000000003
R13: 00007ffe78c07450 R14: 00007ffe78c07450 R15: 00007f6c48b7b048
 </TASK>
```
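
For readers following the dump, here is a minimal userspace sketch of the
range check behind VM_BUG_ON_FOLIO(!folio_contains(folio, index)). It is a
model, not the kernel code: struct model_folio, model_folio_contains() and
model_offset_to_index() are hypothetical stand-ins, and 4 KiB pages are
assumed. The numbers are my reading of the dump, not something confirmed
elsewhere in this thread: the first word of the second raw: line
(0x737a) is taken to be the folio's index, the user-space RBX value
0x737b048 is taken to be the file offset being read, and the folio is
assumed to be a single page because the flags line lists no head flag.

```c
#include <stdbool.h>

/* Assumption: 4 KiB pages, as on x86-64. */
#define MODEL_PAGE_SHIFT 12

/* Hypothetical stand-in for the two values the check depends on. */
struct model_folio {
	unsigned long index;    /* page-cache index of the folio's first page */
	unsigned long nr_pages; /* 1 for an order-0 folio, 2^order otherwise */
};

/* Convert a byte offset within the file to a page-cache index. */
static unsigned long model_offset_to_index(unsigned long offset)
{
	return offset >> MODEL_PAGE_SHIFT;
}

/*
 * Same arithmetic as the kernel's range test: the subtraction is
 * unsigned, so an index below folio->index wraps to a huge value and
 * fails the comparison just like an index past the end does.
 */
static bool model_folio_contains(const struct model_folio *folio,
				 unsigned long index)
{
	return index - folio->index < folio->nr_pages;
}
```

With these assumptions, offset 0x737b048 maps to index 0x737b, and a
one-page folio at index 0x737a does not contain it, which is the state the
assertion reports. A four-page folio starting at 0x7378 would contain
0x737b, so the dump is at least consistent with a lookup that found a
large folio which was then split before the check ran.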

Chris J Arges (1):
  mm/filemap: handle large folio split race in page cache lookups

 mm/filemap.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

--
2.43.0


