* [PATCH v4] lib: xarray: free unused spare node in xas_create_range()
@ 2025-12-04 14:26 Shardul Bankar
From: Shardul Bankar @ 2025-12-04 14:26 UTC
To: willy, akpm, linux-mm
Cc: linux-fsdevel, linux-kernel, dev.jain, david, shardulsb08, janak,
Shardul Bankar
xas_create_range() is typically called in a retry loop that uses
xas_nomem() to handle -ENOMEM errors. xas_nomem() may allocate a spare
xa_node and store it in xas->xa_alloc for use in the retry.

If the lock is dropped after xas_nomem(), another thread can expand the
xarray tree in the meantime. On the next retry, xas_create_range() can
then succeed without consuming the spare node stored in xas->xa_alloc.
If the function returns without freeing this spare node, it leaks.

xas_create_range() calls xas_create() multiple times in a loop for
different index ranges. A spare node that isn't needed for one range
iteration might be needed for the next, so we cannot free it after each
xas_create() call. We can only safely free it after xas_create_range()
completes.

Fix this by calling xas_destroy() at the end of xas_create_range() to
free any unused spare node. This makes the API safer by default and
prevents callers from needing to remember cleanup.

This fixes a memory leak in mm/khugepaged.c and potentially other
callers that use xas_nomem() with xas_create_range().
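To make the leak concrete, here is a minimal user-space sketch of the
retry loop described above. It is not kernel code: toy_state,
toy_nomem(), toy_destroy() and toy_create_range() are hypothetical
stand-ins for struct xa_state, xas_nomem(), xas_destroy() and
xas_create_range(), and the model assumes that allocation under the
lock always fails, so a node can only come from the preallocated spare.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct xa_state; only the spare-node slot is modelled. */
struct toy_state {
	bool has_spare;	/* plays the role of xas->xa_alloc != NULL */
	int err;	/* nonzero plays the role of -ENOMEM */
};

/* Spares allocated but neither installed in the tree nor freed. */
static int spare_live;

/* Modelled on xas_destroy(): free the spare if there is one. */
static void toy_destroy(struct toy_state *s)
{
	if (s->has_spare) {
		s->has_spare = false;
		spare_live--;
	}
}

/* Modelled on xas_nomem(): after an error, preallocate a spare and retry. */
static bool toy_nomem(struct toy_state *s)
{
	if (!s->err)
		return false;
	s->err = 0;
	if (!s->has_spare) {
		s->has_spare = true;
		spare_live++;
	}
	return true;
}

/*
 * Modelled on xas_create_range(): *missing is the number of nodes the
 * tree still lacks. Assumption: a node can only come from the spare.
 */
static void toy_create_range(struct toy_state *s, int *missing, bool cleanup)
{
	while (*missing > 0) {
		if (!s->has_spare) {
			s->err = -1;
			break;
		}
		s->has_spare = false;	/* spare is installed in the tree */
		spare_live--;
		(*missing)--;
	}
	if (cleanup)
		toy_destroy(s);		/* the fix: drop any unused spare */
}

/* Run the caller's retry loop and report how many spares leaked. */
int toy_leaked_spares(int missing, bool racer_fills_tree, bool cleanup)
{
	struct toy_state s = { false, 0 };

	spare_live = 0;
	for (;;) {
		toy_create_range(&s, &missing, cleanup);
		if (!s.err)
			break;
		/* "Lock dropped": another thread may expand the tree here. */
		if (racer_fills_tree)
			missing = 0;
		if (!toy_nomem(&s))
			break;
	}
	return spare_live;
}
```

In this model, toy_leaked_spares(1, true, false) reports one leaked
spare, while the same scenario with cleanup enabled reports none.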
Link: https://syzkaller.appspot.com/bug?id=a274d65fc733448ed518ad15481ed575669dd98c
Link: https://lore.kernel.org/all/20251201074540.3576327-1-shardul.b@mpiricsoftware.com/ ("v3")
Fixes: cae106dd67b9 ("mm/khugepaged: refactor collapse_file control flow")
Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>
---
v4:
- Drop redundant `if (xa_alloc)` around xas_destroy(), as xas_destroy()
already checks xa_alloc internally.
v3:
- Move fix from collapse_file() to xas_create_range() as suggested by Matthew Wilcox
- Fix in library function makes API safer by default, preventing callers from needing
to remember cleanup
- Use shared cleanup label that both restore: and success: paths jump to
- Clean up unused spare node on both success and error exit paths
v2:
- Call xas_destroy() on both success and failure
- Explained retry semantics and xa_alloc / concurrency risk
- Dropped cleanup_empty_nodes from previous proposal
lib/xarray.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/lib/xarray.c b/lib/xarray.c
index 9a8b4916540c..f49ccfa5f57d 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -744,11 +744,16 @@ void xas_create_range(struct xa_state *xas)
xas->xa_shift = shift;
xas->xa_sibs = sibs;
xas->xa_index = index;
- return;
+ goto cleanup;
+
success:
xas->xa_index = index;
if (xas->xa_node)
xas_set_offset(xas);
+
+cleanup:
+ /* Free any unused spare node from xas_nomem() */
+ xas_destroy(xas);
}
EXPORT_SYMBOL_GPL(xas_create_range);
--
2.34.1
* Re: [PATCH v4] lib: xarray: free unused spare node in xas_create_range()
From: David Hildenbrand (Red Hat) @ 2025-12-05 7:22 UTC
To: Shardul Bankar, willy, akpm, linux-mm
Cc: linux-fsdevel, linux-kernel, dev.jain, shardulsb08, janak
On 12/4/25 15:26, Shardul Bankar wrote:
> xas_create_range() is typically called in a retry loop that uses
> xas_nomem() to handle -ENOMEM errors. xas_nomem() may allocate a spare
> xa_node and store it in xas->xa_alloc for use in the retry.
>
> If the lock is dropped after xas_nomem(), another thread can expand the
> xarray tree in the meantime. On the next retry, xas_create_range() can
> then succeed without consuming the spare node stored in xas->xa_alloc.
> If the function returns without freeing this spare node, it leaks.
>
> xas_create_range() calls xas_create() multiple times in a loop for
> different index ranges. A spare node that isn't needed for one range
> iteration might be needed for the next, so we cannot free it after each
> xas_create() call. We can only safely free it after xas_create_range()
> completes.
>
> Fix this by calling xas_destroy() at the end of xas_create_range() to
> free any unused spare node. This makes the API safer by default and
> prevents callers from needing to remember cleanup.
>
> This fixes a memory leak in mm/khugepaged.c and potentially other
> callers that use xas_nomem() with xas_create_range().
>
> Link: https://syzkaller.appspot.com/bug?id=a274d65fc733448ed518ad15481ed575669dd98c
> Link: https://lore.kernel.org/all/20251201074540.3576327-1-shardul.b@mpiricsoftware.com/ ("v3")
> Fixes: cae106dd67b9 ("mm/khugepaged: refactor collapse_file control flow")
> Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>
> ---
> v4:
> - Drop redundant `if (xa_alloc)` around xas_destroy(), as xas_destroy()
> already checks xa_alloc internally.
> v3:
> - Move fix from collapse_file() to xas_create_range() as suggested by Matthew Wilcox
> - Fix in library function makes API safer by default, preventing callers from needing
> to remember cleanup
> - Use shared cleanup label that both restore: and success: paths jump to
> - Clean up unused spare node on both success and error exit paths
> v2:
> - Call xas_destroy() on both success and failure
> - Explained retry semantics and xa_alloc / concurrency risk
> - Dropped cleanup_empty_nodes from previous proposal
>
> lib/xarray.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/lib/xarray.c b/lib/xarray.c
> index 9a8b4916540c..f49ccfa5f57d 100644
> --- a/lib/xarray.c
> +++ b/lib/xarray.c
> @@ -744,11 +744,16 @@ void xas_create_range(struct xa_state *xas)
> xas->xa_shift = shift;
> xas->xa_sibs = sibs;
> xas->xa_index = index;
> - return;
> + goto cleanup;
> +
> success:
> xas->xa_index = index;
> if (xas->xa_node)
> xas_set_offset(xas);
> +
> +cleanup:
> + /* Free any unused spare node from xas_nomem() */
> + xas_destroy(xas);
> }
> EXPORT_SYMBOL_GPL(xas_create_range);
>
Nothing jumped at me, except that the label situation is a bit
suboptimal.
Hoping Willy will take another look as well.
Reviewed-by: David Hildenbrand (Red Hat) <david@kernel.org>
BTW, do we have a way to test this in a test case?
A follow-up cleanup that avoids labels could be something like (untested):
diff --git a/lib/xarray.c b/lib/xarray.c
index 9a8b4916540cf..325f264530fb2 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -714,6 +714,7 @@ void xas_create_range(struct xa_state *xas)
unsigned long index = xas->xa_index;
unsigned char shift = xas->xa_shift;
unsigned char sibs = xas->xa_sibs;
+ bool success = false;
xas->xa_index |= ((sibs + 1UL) << shift) - 1;
if (xas_is_node(xas) && xas->xa_node->shift == xas->xa_shift)
@@ -724,9 +725,11 @@ void xas_create_range(struct xa_state *xas)
for (;;) {
xas_create(xas, true);
if (xas_error(xas))
- goto restore;
- if (xas->xa_index <= (index | XA_CHUNK_MASK))
- goto success;
+ break;
+ if (xas->xa_index <= (index | XA_CHUNK_MASK)) {
+ success = true;
+ break;
+ }
xas->xa_index -= XA_CHUNK_SIZE;
for (;;) {
@@ -740,15 +743,17 @@ void xas_create_range(struct xa_state *xas)
}
}
-restore:
- xas->xa_shift = shift;
- xas->xa_sibs = sibs;
- xas->xa_index = index;
- return;
-success:
- xas->xa_index = index;
- if (xas->xa_node)
- xas_set_offset(xas);
+ if (success) {
+ xas->xa_index = index;
+ if (xas->xa_node)
+ xas_set_offset(xas);
+ } else {
+ xas->xa_shift = shift;
+ xas->xa_sibs = sibs;
+ xas->xa_index = index;
+ }
+ /* Free any unused spare node from xas_nomem() */
+ xas_destroy(xas);
}
EXPORT_SYMBOL_GPL(xas_create_range);
--
Cheers
David
* Re: [PATCH v4] lib: xarray: free unused spare node in xas_create_range()
From: Shardul Bankar @ 2025-12-05 10:51 UTC
To: David Hildenbrand (Red Hat), willy, akpm, linux-mm
Cc: linux-fsdevel, linux-kernel, dev.jain, janak, shardulsb08
On Fri, 2025-12-05 at 08:22 +0100, David Hildenbrand (Red Hat) wrote:
> > Link: https://syzkaller.appspot.com/bug?id=a274d65fc733448ed518ad15481ed575669dd98c
> ...
> Reviewed-by: David Hildenbrand (Red Hat) <david@kernel.org>
>
>
> BTW, do we have a way to test this in a test case?
Hi David,
Thanks for the review and the Reviewed-by.
Regarding a test case: I don’t have a focused selftest or fault-
injection setup yet that reliably hits this xas_nomem() +
xas_create_range() corner case.
I noticed this spare-node leak while analyzing the Syzbot report I
referenced in the Link: tag, but the reproducer I see there doesn’t
isolate this path and reports other kmemleaks.
For now I’d prefer to treat this as a small correctness fix in xarray
itself. If I manage to come up with a robust way to exercise this path
in a selftest (e.g. via targeted fault injection in lib/test_xarray.c),
I can follow up with a separate patch, but I don’t have anything solid
to propose today.
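For reference, the general shape of such a fault-injection check can be
sketched in user space. This only illustrates the technique: fi_alloc(),
fi_free(), fi_sweep() and the two sample operations are hypothetical
names, none of them exist in lib/test_xarray.c, and the sketch does not
reproduce the concurrent tree expansion that the real corner case needs.

```c
#include <assert.h>
#include <stdlib.h>

static int fail_countdown = -1;	/* -1: never fail, 0: fail the next allocation */
static int live_allocs;		/* successful allocations not yet freed */

static void *fi_alloc(size_t size)
{
	void *p;

	if (fail_countdown == 0) {
		fail_countdown = -1;	/* inject exactly one failure */
		return NULL;
	}
	if (fail_countdown > 0)
		fail_countdown--;
	p = malloc(size);
	if (p)
		live_allocs++;
	return p;
}

static void fi_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

/* Needs two buffers and releases both on every path. */
int op_clean(void)
{
	void *a = fi_alloc(32);
	void *b;

	if (!a)
		return -1;
	b = fi_alloc(32);
	if (!b) {
		fi_free(a);
		return -1;
	}
	fi_free(b);
	fi_free(a);
	return 0;
}

/* Same operation, but forgets to free 'a' when the second allocation fails. */
int op_leaky(void)
{
	void *a = fi_alloc(32);
	void *b;

	if (!a)
		return -1;
	b = fi_alloc(32);
	if (!b)
		return -1;	/* bug: 'a' is leaked */
	fi_free(b);
	fi_free(a);
	return 0;
}

/* Fail allocation n for n = 0..max_allocs-1; return the total leaked. */
int fi_sweep(int (*op)(void), int max_allocs)
{
	int n, leaks = 0;

	for (n = 0; n < max_allocs; n++) {
		live_allocs = 0;
		fail_countdown = n;
		op();
		fail_countdown = -1;
		leaks += live_allocs;
	}
	return leaks;
}
```

Sweeping the failure point across every allocation index is what makes
the check systematic: a leak on any single error path shows up as a
nonzero return from fi_sweep().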
>
>
> A follow-up cleanup that avoids labels could be something like
> (untested):
>
>
> diff --git a/lib/xarray.c b/lib/xarray.c
> index 9a8b4916540cf..325f264530fb2 100644
> --- a/lib/xarray.c
> +++ b/lib/xarray.c
> @@ -714,6 +714,7 @@ void xas_create_range(struct xa_state *xas)
> unsigned long index = xas->xa_index;
> unsigned char shift = xas->xa_shift;
> unsigned char sibs = xas->xa_sibs;
> + bool success = false;
>
> xas->xa_index |= ((sibs + 1UL) << shift) - 1;
> if (xas_is_node(xas) && xas->xa_node->shift == xas->xa_shift)
> @@ -724,9 +725,11 @@ void xas_create_range(struct xa_state *xas)
> for (;;) {
> xas_create(xas, true);
> if (xas_error(xas))
> - goto restore;
> - if (xas->xa_index <= (index | XA_CHUNK_MASK))
> - goto success;
> + break;
> + if (xas->xa_index <= (index | XA_CHUNK_MASK)) {
> + success = true;
> + break;
> + }
> xas->xa_index -= XA_CHUNK_SIZE;
>
> for (;;) {
> @@ -740,15 +743,17 @@ void xas_create_range(struct xa_state *xas)
> }
> }
>
> -restore:
> - xas->xa_shift = shift;
> - xas->xa_sibs = sibs;
> - xas->xa_index = index;
> - return;
> -success:
> - xas->xa_index = index;
> - if (xas->xa_node)
> - xas_set_offset(xas);
> + if (success) {
> + xas->xa_index = index;
> + if (xas->xa_node)
> + xas_set_offset(xas);
> + } else {
> + xas->xa_shift = shift;
> + xas->xa_sibs = sibs;
> + xas->xa_index = index;
> + }
> + /* Free any unused spare node from xas_nomem() */
> + xas_destroy(xas);
> }
> EXPORT_SYMBOL_GPL(xas_create_range);
>
>
Your bool-based version reads nicer; I’m happy to follow up with a
small cleanup patch on top that switches xas_create_range() over to
that style (with a Suggested-by tag).
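As a rough illustration of that style, outside the kernel: the sketch
below uses a success flag and a single exit, so the shared cleanup runs
on both outcomes without labels. struct toy_cursor and toy_walk() are
hypothetical names that only mimic the save/restore shape of
xas_create_range(); they are not the proposed patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical cursor; it only mimics the save/restore shape. */
struct toy_cursor {
	unsigned long index;
	int err;
	bool positioned;	/* set only on the success path */
	int cleanups;		/* how many times the shared cleanup ran */
};

/* Advance 'steps' times; a negative fail_at means no step fails. */
void toy_walk(struct toy_cursor *c, unsigned long steps, long fail_at)
{
	unsigned long saved = c->index;
	unsigned long i;
	bool success = false;

	for (i = 0; ; i++) {
		if (fail_at >= 0 && i == (unsigned long)fail_at) {
			c->err = -1;
			break;
		}
		if (i == steps) {
			success = true;
			break;
		}
		c->index++;
	}

	if (success)
		c->positioned = true;
	else
		c->index = saved;	/* undo partial progress */

	c->cleanups++;	/* single exit: cleanup runs on both outcomes */
}
```

The trade-off is one extra local variable in exchange for dropping both
labels; the cleanup placement is the same as with the shared label.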
Thanks and Regards,
Shardul