From: Daniel Gomez <da.gomez@kernel.org>
To: Luis Chamberlain <mcgrof@kernel.org>,
Matthew Wilcox <willy@infradead.org>
Cc: Daniel Gomez <da.gomez@samsung.com>,
Tamir Duberstein <tamird@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
kdevops@lists.linux.dev
Subject: Re: xarray regression: XArray: Add extra debugging check to xas_lock and friends
Date: Fri, 30 May 2025 08:58:35 -0700 [thread overview]
Message-ID: <7df55910-13b4-4ac5-b13b-22a44366e193@kernel.org> (raw)
In-Reply-To: <aAG_Sz_a2j3ummY2@bombadil.infradead.org>
On 17/04/2025 19.56, Luis Chamberlain wrote:
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: BUG at xa_alloc_index:57
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: CPU: 1 UID: 0 PID: 874 Comm: modprobe Tainted: G W 6.15.0-rc2-next-20250417 #5 PREEMPT(full)
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: Tainted: [W]=WARN
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 2024.11-5 01/28/2025
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: Call Trace:
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: <TASK>
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: dump_stack_lvl (lib/dump_stack.c:122)
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: xa_alloc_index.constprop.0.cold (lib/test_xarray.c:602) test_xarray
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: check_xa_alloc_1 (lib/test_xarray.c:940) test_xarray
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: ? __pfx_xarray_checks (lib/test_xarray.c:2233) test_xarray
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: check_xa_alloc (lib/test_xarray.c:1106) test_xarray
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: xarray_checks (lib/test_xarray.c:2250) test_xarray
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: do_one_initcall (init/main.c:1271)
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: do_init_module (kernel/module/main.c:2930)
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: init_module_from_file (kernel/module/main.c:3587)
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: idempotent_init_module (./include/linux/spinlock.h:351 kernel/module/main.c:3528 kernel/module/main.c:3600)
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: __x64_sys_finit_module (./include/linux/file.h:62 (discriminator 1) ./include/linux/file.h:83 (discriminator 1) kernel/module/main.c:3622 (discriminator 1) kernel/module/main.c:3609 (discriminator 1) kernel/module/main.c:3609 (discriminator 1))
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: do_syscall_64 (arch/x86/entry/syscall_64.c:63 (discriminator 1) arch/x86/entry/syscall_64.c:94 (discriminator 1))
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: RIP: 0033:0x7f0a99f18779
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 4f 86 0d 00 f7 d8 64 89 01 48
> All code
> ========
> 0: ff c3 inc %ebx
> 2: 66 2e 0f 1f 84 00 00 cs nopw 0x0(%rax,%rax,1)
> 9: 00 00 00
> c: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1)
> 11: 48 89 f8 mov %rdi,%rax
> 14: 48 89 f7 mov %rsi,%rdi
> 17: 48 89 d6 mov %rdx,%rsi
> 1a: 48 89 ca mov %rcx,%rdx
> 1d: 4d 89 c2 mov %r8,%r10
> 20: 4d 89 c8 mov %r9,%r8
> 23: 4c 8b 4c 24 08 mov 0x8(%rsp),%r9
> 28: 0f 05 syscall
> 2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction
> 30: 73 01 jae 0x33
> 32: c3 ret
> 33: 48 8b 0d 4f 86 0d 00 mov 0xd864f(%rip),%rcx # 0xd8689
> 3a: f7 d8 neg %eax
> 3c: 64 89 01 mov %eax,%fs:(%rcx)
> 3f: 48 rex.W
>
> Code starting with the faulting instruction
> ===========================================
> 0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax
> 6: 73 01 jae 0x9
> 8: c3 ret
> 9: 48 8b 0d 4f 86 0d 00 mov 0xd864f(%rip),%rcx # 0xd865f
> 10: f7 d8 neg %eax
> 12: 64 89 01 mov %eax,%fs:(%rcx)
> 15: 48 rex.W
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: RSP: 002b:00007fffcb2588c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000139
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: RAX: ffffffffffffffda RBX: 000055e8f735a970 RCX: 00007f0a99f18779
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: RDX: 0000000000000000 RSI: 000055e8e9dd2328 RDI: 0000000000000003
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 000055e8f735c410
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: R10: 0000000000000000 R11: 0000000000000246 R12: 000055e8e9dd2328
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: R13: 0000000000040000 R14: 000055e8f735aa80 R15: 0000000000000000
> Apr 18 02:30:42 e00aeb44aaa1-xarray kernel: </TASK>
I've been looking into this issue and noticed that setting the XArray operation
state to a valid state was removed in patch [1]. Can you elaborate more why this
was removed?
[1] 6684aba0780da "XArray: Add extra debugging check to xas_lock and friends"
Reverting behaviour makes it work again:
diff --git a/lib/xarray.c b/lib/xarray.c
index ee826f1a21fe..00b15287c292 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -2386,6 +2386,8 @@ void xa_destroy(struct xarray *xa)
xas_lock_irqsave(&xas, flags);
entry = xa_head_locked(xa);
RCU_INIT_POINTER(xa->xa_head, NULL);
+ if (xas_top(xas.xa_node))
+ xas.xa_node = NULL;
xas_init_marks(&xas);
if (xa_zero_busy(xa))
xa_mark_clear(xa, XA_FREE_MARK);
However, based on [2], this might not be the right fix.
[2]
https://lore.kernel.org/all/Z98oChgU7Z9wyTw1@casper.infradead.org/
The problem shows up in a test from check_xa_alloc_1(). The xa_state
is initialized during __xa_alloc() (via xa_alloc_index() → xa_alloc() →
__xa_alloc()), but the internal xa_node stays in an invalid state. So when we
try to allocate again at that same index after xa_destroy(), xas is still
invalid, leading to the BUG above. This has nothing to do with the debug check
added in commit [1].
Also, since xas stays invalid, xa_destroy() skips setting XA_FREE_MARK (via
xas_init_marks()), which I think is needed because we declared the array with
XA_FLAGS_TRACK_FREE flag in DEFINE_XARRAY_ALLOC(xa0).
Can you clarify why we cannot set the XArray state to valid and what would be
the expected behaviour? As Luis reported, this is failing in kdevops CI which is
now testing XArray tests among others in a daily basis.
next prev parent reply other threads:[~2025-05-30 15:58 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-18 2:56 Luis Chamberlain
2025-05-30 15:58 ` Daniel Gomez [this message]
2025-06-02 9:22 ` Daniel Gomez
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=7df55910-13b4-4ac5-b13b-22a44366e193@kernel.org \
--to=da.gomez@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=da.gomez@samsung.com \
--cc=kdevops@lists.linux.dev \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mcgrof@kernel.org \
--cc=tamird@gmail.com \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox