* [PATCH v4] lib: xarray: free unused spare node in xas_create_range()
@ 2025-12-04 14:26 Shardul Bankar
2025-12-05 7:22 ` David Hildenbrand (Red Hat)
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Shardul Bankar @ 2025-12-04 14:26 UTC (permalink / raw)
To: willy, akpm, linux-mm
Cc: linux-fsdevel, linux-kernel, dev.jain, david, shardulsb08, janak,
Shardul Bankar
xas_create_range() is typically called in a retry loop that uses
xas_nomem() to handle -ENOMEM errors. xas_nomem() may allocate a spare
xa_node and store it in xas->xa_alloc for use in the retry.
If the lock is dropped after xas_nomem(), another thread can expand the
xarray tree in the meantime. On the next retry, xas_create_range() can
then succeed without consuming the spare node stored in xas->xa_alloc.
If the function returns without freeing this spare node, it leaks.
xas_create_range() calls xas_create() multiple times in a loop for
different index ranges. A spare node that isn't needed for one range
iteration might be needed for the next, so we cannot free it after each
xas_create() call. We can only safely free it after xas_create_range()
completes.
Fix this by calling xas_destroy() at the end of xas_create_range() to
free any unused spare node. This makes the API safer by default and
prevents callers from needing to remember cleanup.
This fixes a memory leak in mm/khugepaged.c and potentially other
callers that use xas_nomem() with xas_create_range().
Link: https://syzkaller.appspot.com/bug?id=a274d65fc733448ed518ad15481ed575669dd98c
Link: https://lore.kernel.org/all/20251201074540.3576327-1-shardul.b@mpiricsoftware.com/ ("v3")
Fixes: cae106dd67b9 ("mm/khugepaged: refactor collapse_file control flow")
Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>
---
v4:
- Drop redundant `if (xa_alloc)` around xas_destroy(), as xas_destroy()
already checks xa_alloc internally.
v3:
- Move fix from collapse_file() to xas_create_range() as suggested by Matthew Wilcox
- Fix in library function makes API safer by default, preventing callers from needing
to remember cleanup
- Use shared cleanup label that both restore: and success: paths jump to
- Clean up unused spare node on both success and error exit paths
v2:
- Call xas_destroy() on both success and failure
- Explained retry semantics and xa_alloc / concurrency risk
- Dropped cleanup_empty_nodes from previous proposal
lib/xarray.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/lib/xarray.c b/lib/xarray.c
index 9a8b4916540c..f49ccfa5f57d 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -744,11 +744,16 @@ void xas_create_range(struct xa_state *xas)
xas->xa_shift = shift;
xas->xa_sibs = sibs;
xas->xa_index = index;
- return;
+ goto cleanup;
+
success:
xas->xa_index = index;
if (xas->xa_node)
xas_set_offset(xas);
+
+cleanup:
+ /* Free any unused spare node from xas_nomem() */
+ xas_destroy(xas);
}
EXPORT_SYMBOL_GPL(xas_create_range);
--
2.34.1
^ permalink raw reply [flat|nested] 12+ messages in thread* Re: [PATCH v4] lib: xarray: free unused spare node in xas_create_range() 2025-12-04 14:26 [PATCH v4] lib: xarray: free unused spare node in xas_create_range() Shardul Bankar @ 2025-12-05 7:22 ` David Hildenbrand (Red Hat) 2025-12-05 10:51 ` Shardul Bankar 2025-12-08 8:37 ` Dev Jain ` (2 subsequent siblings) 3 siblings, 1 reply; 12+ messages in thread From: David Hildenbrand (Red Hat) @ 2025-12-05 7:22 UTC (permalink / raw) To: Shardul Bankar, willy, akpm, linux-mm Cc: linux-fsdevel, linux-kernel, dev.jain, shardulsb08, janak On 12/4/25 15:26, Shardul Bankar wrote: > xas_create_range() is typically called in a retry loop that uses > xas_nomem() to handle -ENOMEM errors. xas_nomem() may allocate a spare > xa_node and store it in xas->xa_alloc for use in the retry. > > If the lock is dropped after xas_nomem(), another thread can expand the > xarray tree in the meantime. On the next retry, xas_create_range() can > then succeed without consuming the spare node stored in xas->xa_alloc. > If the function returns without freeing this spare node, it leaks. > > xas_create_range() calls xas_create() multiple times in a loop for > different index ranges. A spare node that isn't needed for one range > iteration might be needed for the next, so we cannot free it after each > xas_create() call. We can only safely free it after xas_create_range() > completes. > > Fix this by calling xas_destroy() at the end of xas_create_range() to > free any unused spare node. This makes the API safer by default and > prevents callers from needing to remember cleanup. > > This fixes a memory leak in mm/khugepaged.c and potentially other > callers that use xas_nomem() with xas_create_range(). > > Link: https://syzkaller.appspot.com/bug?id=a274d65fc733448ed518ad15481ed575669dd98c > Link: https://lore.kernel.org/all/20251201074540.3576327-1-shardul.b@mpiricsoftware.com/ ("v3") > Fixes: cae106dd67b9 ("mm/khugepaged: refactor collapse_file control flow") > Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com> > --- > v4: > - Drop redundant `if (xa_alloc)` around xas_destroy(), as xas_destroy() > already checks xa_alloc internally. > v3: > - Move fix from collapse_file() to xas_create_range() as suggested by Matthew Wilcox > - Fix in library function makes API safer by default, preventing callers from needing > to remember cleanup > - Use shared cleanup label that both restore: and success: paths jump to > - Clean up unused spare node on both success and error exit paths > v2: > - Call xas_destroy() on both success and failure > - Explained retry semantics and xa_alloc / concurrency risk > - Dropped cleanup_empty_nodes from previous proposal > > lib/xarray.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/lib/xarray.c b/lib/xarray.c > index 9a8b4916540c..f49ccfa5f57d 100644 > --- a/lib/xarray.c > +++ b/lib/xarray.c > @@ -744,11 +744,16 @@ void xas_create_range(struct xa_state *xas) > xas->xa_shift = shift; > xas->xa_sibs = sibs; > xas->xa_index = index; > - return; > + goto cleanup; > + > success: > xas->xa_index = index; > if (xas->xa_node) > xas_set_offset(xas); > + > +cleanup: > + /* Free any unused spare node from xas_nomem() */ > + xas_destroy(xas); > } > EXPORT_SYMBOL_GPL(xas_create_range); > Nothing jumped at me, except that the label situation is a bit suboptimal. Hoping Willy will take another look as well. Reviewed-by: David Hildenbrand (Red Hat) <david@kernel.org> BTW, do we have a way to test this in a test case? A follow-up cleanup that avoids labels could be something like (untested): diff --git a/lib/xarray.c b/lib/xarray.c index 9a8b4916540cf..325f264530fb2 100644 --- a/lib/xarray.c +++ b/lib/xarray.c @@ -714,6 +714,7 @@ void xas_create_range(struct xa_state *xas) unsigned long index = xas->xa_index; unsigned char shift = xas->xa_shift; unsigned char sibs = xas->xa_sibs; + bool success = false; xas->xa_index |= ((sibs + 1UL) << shift) - 1; if (xas_is_node(xas) && xas->xa_node->shift == xas->xa_shift) @@ -724,9 +725,11 @@ void xas_create_range(struct xa_state *xas) for (;;) { xas_create(xas, true); if (xas_error(xas)) - goto restore; - if (xas->xa_index <= (index | XA_CHUNK_MASK)) - goto success; + break + if (xas->xa_index <= (index | XA_CHUNK_MASK)) { + succeess = true; + break; + } xas->xa_index -= XA_CHUNK_SIZE; for (;;) { @@ -740,15 +743,17 @@ void xas_create_range(struct xa_state *xas) } } -restore: - xas->xa_shift = shift; - xas->xa_sibs = sibs; - xas->xa_index = index; - return; -success: - xas->xa_index = index; - if (xas->xa_node) - xas_set_offset(xas); + if (success) { + xas->xa_index = index; + if (xas->xa_node) + xas_set_offset(xas); + } else { + xas->xa_shift = shift; + xas->xa_sibs = sibs; + xas->xa_index = index; + } + /* Free any unused spare node from xas_nomem() */ + xas_destroy(xas); } EXPORT_SYMBOL_GPL(xas_create_range); -- Cheers David ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] lib: xarray: free unused spare node in xas_create_range() 2025-12-05 7:22 ` David Hildenbrand (Red Hat) @ 2025-12-05 10:51 ` Shardul Bankar 2025-12-08 11:36 ` David Hildenbrand (Red Hat) 0 siblings, 1 reply; 12+ messages in thread From: Shardul Bankar @ 2025-12-05 10:51 UTC (permalink / raw) To: David Hildenbrand (Red Hat), willy, akpm, linux-mm Cc: linux-fsdevel, linux-kernel, dev.jain, janak, shardulsb08 On Fri, 2025-12-05 at 08:22 +0100, David Hildenbrand (Red Hat) wrote: > > Link: > > https://syzkaller.appspot.com/bug?id=a274d65fc733448ed518ad15481ed575669dd98c > ... > Reviewed-by: David Hildenbrand (Red Hat) <david@kernel.org> > > > BTW, do we have a way to test this in a test case? Hi David, Thanks for the review and the Reviewed-by. Regarding a test case: I don’t have a focused selftest or fault- injection setup yet that reliably hits this xas_nomem() + xas_create_range() corner case. I noticed this spare-node leak while analyzing the Syzbot report I referenced in the Link: tag, but the reproducer I see there doesn’t isolate this path and reports other kmemleaks. For now I’d prefer to treat this as a small correctness fix in xarray itself. If I manage to come up with a robust way to exercise this path in a selftest (e.g. via targeted fault injection in lib/test_xarray.c), I can follow up with a separate patch, but I don’t have anything solid to propose today. > > > A follow-up cleanup that avoids labels could be something like > (untested): > > > diff --git a/lib/xarray.c b/lib/xarray.c > index 9a8b4916540cf..325f264530fb2 100644 > --- a/lib/xarray.c > +++ b/lib/xarray.c > @@ -714,6 +714,7 @@ void xas_create_range(struct xa_state *xas) > unsigned long index = xas->xa_index; > unsigned char shift = xas->xa_shift; > unsigned char sibs = xas->xa_sibs; > + bool success = false; > > xas->xa_index |= ((sibs + 1UL) << shift) - 1; > if (xas_is_node(xas) && xas->xa_node->shift == xas- > >xa_shift) > @@ -724,9 +725,11 @@ void xas_create_range(struct xa_state *xas) > for (;;) { > xas_create(xas, true); > if (xas_error(xas)) > - goto restore; > - if (xas->xa_index <= (index | XA_CHUNK_MASK)) > - goto success; > + break > + if (xas->xa_index <= (index | XA_CHUNK_MASK)) { > + succeess = true; > + break; > + } > xas->xa_index -= XA_CHUNK_SIZE; > > for (;;) { > @@ -740,15 +743,17 @@ void xas_create_range(struct xa_state *xas) > } > } > > -restore: > - xas->xa_shift = shift; > - xas->xa_sibs = sibs; > - xas->xa_index = index; > - return; > -success: > - xas->xa_index = index; > - if (xas->xa_node) > - xas_set_offset(xas); > + if (success) { > + xas->xa_index = index; > + if (xas->xa_node) > + xas_set_offset(xas); > + } else { > + xas->xa_shift = shift; > + xas->xa_sibs = sibs; > + xas->xa_index = index; > + } > + /* Free any unused spare node from xas_nomem() */ > + xas_destroy(xas); > } > EXPORT_SYMBOL_GPL(xas_create_range); > > Your bool-based version reads nicer; I’m happy to follow up with a small cleanup patch on top that switches xas_create_range() over to that style (with a Suggested-by tag). Thanks and Regards, Shardul ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] lib: xarray: free unused spare node in xas_create_range() 2025-12-05 10:51 ` Shardul Bankar @ 2025-12-08 11:36 ` David Hildenbrand (Red Hat) 0 siblings, 0 replies; 12+ messages in thread From: David Hildenbrand (Red Hat) @ 2025-12-08 11:36 UTC (permalink / raw) To: Shardul Bankar, willy, akpm, linux-mm Cc: linux-fsdevel, linux-kernel, dev.jain, janak, shardulsb08 >> -restore: >> - xas->xa_shift = shift; >> - xas->xa_sibs = sibs; >> - xas->xa_index = index; >> - return; >> -success: >> - xas->xa_index = index; >> - if (xas->xa_node) >> - xas_set_offset(xas); >> + if (success) { >> + xas->xa_index = index; >> + if (xas->xa_node) >> + xas_set_offset(xas); >> + } else { >> + xas->xa_shift = shift; >> + xas->xa_sibs = sibs; >> + xas->xa_index = index; >> + } >> + /* Free any unused spare node from xas_nomem() */ >> + xas_destroy(xas); >> } >> EXPORT_SYMBOL_GPL(xas_create_range); >> >> > Your bool-based version reads nicer; I’m happy to follow up with a > small cleanup patch on top that switches xas_create_range() over to > that style (with a Suggested-by tag). Yeah, feel free to send a cleanup out that removes some of these labels (doesn't necessarily have to be what I proposed). -- Cheers David ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] lib: xarray: free unused spare node in xas_create_range() 2025-12-04 14:26 [PATCH v4] lib: xarray: free unused spare node in xas_create_range() Shardul Bankar 2025-12-05 7:22 ` David Hildenbrand (Red Hat) @ 2025-12-08 8:37 ` Dev Jain 2025-12-15 2:19 ` Jinjiang Tu 2025-12-31 6:29 ` Shardul Bankar 3 siblings, 0 replies; 12+ messages in thread From: Dev Jain @ 2025-12-08 8:37 UTC (permalink / raw) To: Shardul Bankar, willy, akpm, linux-mm Cc: linux-fsdevel, linux-kernel, david, shardulsb08, janak On 04/12/25 7:56 pm, Shardul Bankar wrote: > xas_create_range() is typically called in a retry loop that uses > xas_nomem() to handle -ENOMEM errors. xas_nomem() may allocate a spare > xa_node and store it in xas->xa_alloc for use in the retry. > > If the lock is dropped after xas_nomem(), another thread can expand the Nit: "When the lock is dropped after xas_create_range()". > xarray tree in the meantime. On the next retry, xas_create_range() can > then succeed without consuming the spare node stored in xas->xa_alloc. > If the function returns without freeing this spare node, it leaks. > > xas_create_range() calls xas_create() multiple times in a loop for > different index ranges. A spare node that isn't needed for one range > iteration might be needed for the next, so we cannot free it after each > xas_create() call. We can only safely free it after xas_create_range() > completes. > > Fix this by calling xas_destroy() at the end of xas_create_range() to > free any unused spare node. This makes the API safer by default and > prevents callers from needing to remember cleanup. > > This fixes a memory leak in mm/khugepaged.c and potentially other > callers that use xas_nomem() with xas_create_range(). > > Link: https://syzkaller.appspot.com/bug?id=a274d65fc733448ed518ad15481ed575669dd98c > Link: https://lore.kernel.org/all/20251201074540.3576327-1-shardul.b@mpiricsoftware.com/ ("v3") > Fixes: cae106dd67b9 ("mm/khugepaged: refactor collapse_file control flow") > Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com> > --- > v4: > - Drop redundant `if (xa_alloc)` around xas_destroy(), as xas_destroy() > already checks xa_alloc internally. > v3: > - Move fix from collapse_file() to xas_create_range() as suggested by Matthew Wilcox > - Fix in library function makes API safer by default, preventing callers from needing > to remember cleanup > - Use shared cleanup label that both restore: and success: paths jump to > - Clean up unused spare node on both success and error exit paths > v2: > - Call xas_destroy() on both success and failure > - Explained retry semantics and xa_alloc / concurrency risk > - Dropped cleanup_empty_nodes from previous proposal > > lib/xarray.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/lib/xarray.c b/lib/xarray.c > index 9a8b4916540c..f49ccfa5f57d 100644 > --- a/lib/xarray.c > +++ b/lib/xarray.c > @@ -744,11 +744,16 @@ void xas_create_range(struct xa_state *xas) > xas->xa_shift = shift; > xas->xa_sibs = sibs; > xas->xa_index = index; > - return; > + goto cleanup; > + > success: > xas->xa_index = index; > if (xas->xa_node) > xas_set_offset(xas); > + > +cleanup: > + /* Free any unused spare node from xas_nomem() */ > + xas_destroy(xas); > } > EXPORT_SYMBOL_GPL(xas_create_range); Since there are other code paths doing this fashion of xas_create/xas_store followed by calling xas_nomem till the former succeeds, I believe we should handle this in the caller - for example, I believe we need to fix xa_store_range() as well? It does xas_create, then drops the lock in the same fashion, and I believe we have no choice here but to handle it in the caller - we should not put xas_destroy inside xas_create. Although for this particular case, we have only one caller of xas_create_range, so this is fine really. > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] lib: xarray: free unused spare node in xas_create_range() 2025-12-04 14:26 [PATCH v4] lib: xarray: free unused spare node in xas_create_range() Shardul Bankar 2025-12-05 7:22 ` David Hildenbrand (Red Hat) 2025-12-08 8:37 ` Dev Jain @ 2025-12-15 2:19 ` Jinjiang Tu 2025-12-15 3:42 ` Jinjiang Tu 2025-12-31 6:29 ` Shardul Bankar 3 siblings, 1 reply; 12+ messages in thread From: Jinjiang Tu @ 2025-12-15 2:19 UTC (permalink / raw) To: Shardul Bankar, willy, akpm, linux-mm Cc: linux-fsdevel, linux-kernel, dev.jain, david, shardulsb08, janak [-- Attachment #1: Type: text/plain, Size: 3765 bytes --] 在 2025/12/4 22:26, Shardul Bankar 写道: > xas_create_range() is typically called in a retry loop that uses > xas_nomem() to handle -ENOMEM errors. xas_nomem() may allocate a spare > xa_node and store it in xas->xa_alloc for use in the retry. > > If the lock is dropped after xas_nomem(), another thread can expand the > xarray tree in the meantime. On the next retry, xas_create_range() can > then succeed without consuming the spare node stored in xas->xa_alloc. > If the function returns without freeing this spare node, it leaks. > > xas_create_range() calls xas_create() multiple times in a loop for > different index ranges. A spare node that isn't needed for one range > iteration might be needed for the next, so we cannot free it after each > xas_create() call. We can only safely free it after xas_create_range() > completes. > > Fix this by calling xas_destroy() at the end of xas_create_range() to > free any unused spare node. This makes the API safer by default and > prevents callers from needing to remember cleanup. > > This fixes a memory leak in mm/khugepaged.c and potentially other > callers that use xas_nomem() with xas_create_range(). I encountered another memory leak issue in xas_create_range(). collapse_file() calls xas_create_range() to pre-create all slots needed. If collapse_file() finally fails, these pre-created slots are empty nodes. When the file is deleted, shmem_evict_inode()->shmem_truncate_range()->shmem_undo_range() calls xas_store(&xas, NULL) for each entries to delete nodes, but leaving those pre-created empty nodes leaked. I can reproduce it with following steps. 1) create file /tmp/test_madvise_collapse and ftruncate to 4MB size, and then mmap the file 2) memset for the first 2MB 3) madvise(MADV_COLLAPSE) for the second 2MB. 4) unlink the file in 3), collapse_file() calls xas_create_range() to expand xarray depth, and fails to collapse due to the whole 2M region is empty, leading to the new created empty nodes leaked. To fix it, maybe we should add a new function xas_delete_range() to revert what xas_create_range() does when xas_create_range() runs into rollback path? > > Link:https://syzkaller.appspot.com/bug?id=a274d65fc733448ed518ad15481ed575669dd98c > Link:https://lore.kernel.org/all/20251201074540.3576327-1-shardul.b@mpiricsoftware.com/ ("v3") > Fixes: cae106dd67b9 ("mm/khugepaged: refactor collapse_file control flow") > Signed-off-by: Shardul Bankar<shardul.b@mpiricsoftware.com> > --- > v4: > - Drop redundant `if (xa_alloc)` around xas_destroy(), as xas_destroy() > already checks xa_alloc internally. > v3: > - Move fix from collapse_file() to xas_create_range() as suggested by Matthew Wilcox > - Fix in library function makes API safer by default, preventing callers from needing > to remember cleanup > - Use shared cleanup label that both restore: and success: paths jump to > - Clean up unused spare node on both success and error exit paths > v2: > - Call xas_destroy() on both success and failure > - Explained retry semantics and xa_alloc / concurrency risk > - Dropped cleanup_empty_nodes from previous proposal > > lib/xarray.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/lib/xarray.c b/lib/xarray.c > index 9a8b4916540c..f49ccfa5f57d 100644 > --- a/lib/xarray.c > +++ b/lib/xarray.c > @@ -744,11 +744,16 @@ void xas_create_range(struct xa_state *xas) > xas->xa_shift = shift; > xas->xa_sibs = sibs; > xas->xa_index = index; > - return; > + goto cleanup; > + > success: > xas->xa_index = index; > if (xas->xa_node) > xas_set_offset(xas); > + > +cleanup: > + /* Free any unused spare node from xas_nomem() */ > + xas_destroy(xas); > } > EXPORT_SYMBOL_GPL(xas_create_range); > [-- Attachment #2: Type: text/html, Size: 4539 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] lib: xarray: free unused spare node in xas_create_range() 2025-12-15 2:19 ` Jinjiang Tu @ 2025-12-15 3:42 ` Jinjiang Tu 0 siblings, 0 replies; 12+ messages in thread From: Jinjiang Tu @ 2025-12-15 3:42 UTC (permalink / raw) To: Shardul Bankar, willy, akpm, linux-mm Cc: linux-fsdevel, linux-kernel, dev.jain, david, shardulsb08, janak, Kefeng Wang 在 2025/12/15 10:19, Jinjiang Tu 写道: > > > 在 2025/12/4 22:26, Shardul Bankar 写道: >> xas_create_range() is typically called in a retry loop that uses >> xas_nomem() to handle -ENOMEM errors. xas_nomem() may allocate a spare >> xa_node and store it in xas->xa_alloc for use in the retry. >> >> If the lock is dropped after xas_nomem(), another thread can expand the >> xarray tree in the meantime. On the next retry, xas_create_range() can >> then succeed without consuming the spare node stored in xas->xa_alloc. >> If the function returns without freeing this spare node, it leaks. >> >> xas_create_range() calls xas_create() multiple times in a loop for >> different index ranges. A spare node that isn't needed for one range >> iteration might be needed for the next, so we cannot free it after each >> xas_create() call. We can only safely free it after xas_create_range() >> completes. >> >> Fix this by calling xas_destroy() at the end of xas_create_range() to >> free any unused spare node. This makes the API safer by default and >> prevents callers from needing to remember cleanup. >> >> This fixes a memory leak in mm/khugepaged.c and potentially other >> callers that use xas_nomem() with xas_create_range(). > I encountered another memory leak issue in xas_create_range(). > > collapse_file() calls xas_create_range() to pre-create all slots needed. > If collapse_file() finally fails, these pre-created slots are empty nodes. > When the file is deleted, shmem_evict_inode()->shmem_truncate_range()->shmem_undo_range() > calls xas_store(&xas, NULL) for each entries to delete nodes, but leaving those pre-created > empty nodes leaked. > > I can reproduce it with following steps. > 1) create file /tmp/test_madvise_collapse and ftruncate to 4MB size, and then mmap the file > 2) memset for the first 2MB > 3) madvise(MADV_COLLAPSE) for the second 2MB. > 4) unlink the file > > in 3), collapse_file() calls xas_create_range() to expand xarray depth, and fails to collapse > due to the whole 2M region is empty, leading to the new created empty nodes leaked. > > To fix it, maybe we should add a new function xas_delete_range() to revert what xas_create_range() > does when xas_create_range() runs into rollback path? How about the following diff? I tried it, and the memory leak disappears. I'm new in xarray, so I don't if this fix works properly. diff --git a/include/linux/xarray.h b/include/linux/xarray.h index be850174e802..972df5ceeb84 100644 --- a/include/linux/xarray.h +++ b/include/linux/xarray.h @@ -1555,6 +1555,7 @@ void xas_destroy(struct xa_state *); void xas_pause(struct xa_state *); void xas_create_range(struct xa_state *); +void xas_destroy_range(struct xa_state *xas, unsigned long start, unsigned long end); #ifdef CONFIG_XARRAY_MULTI int xa_get_order(struct xarray *, unsigned long index); diff --git a/lib/xarray.c b/lib/xarray.c index 9a8b4916540c..ab15dc939962 100644 --- a/lib/xarray.c +++ b/lib/xarray.c @@ -752,6 +752,21 @@ void xas_create_range(struct xa_state *xas) } EXPORT_SYMBOL_GPL(xas_create_range); +void xas_destroy_range(struct xa_state *xas, unsigned long start, unsigned long end) +{ + unsigned long index; + void *entry; + + for (index = start; index < end; ++index) { + xas_set(xas, index); + entry = xas_load(xas); + if (entry) + continue; + else if (xas->xa_node && !xas->xa_node->count) + xas_delete_node(xas); + } +} + static void update_node(struct xa_state *xas, struct xa_node *node, int count, int values) { diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 97d1b2824386..dd9d3f202c4b 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -2247,7 +2247,10 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, goto out; rollback: - /* Something went wrong: roll back page cache changes */ + /* Something went wrong: roll back empty xa_node created by + * xas_create_range() and page cache changes + */ + xas_destroy_range(&xas, start, end); + if (nr_none) { xas_lock_irq(&xas); mapping->nrpages -= nr_none; >> Link:https://syzkaller.appspot.com/bug?id=a274d65fc733448ed518ad15481ed575669dd98c >> Link:https://lore.kernel.org/all/20251201074540.3576327-1-shardul.b@mpiricsoftware.com/ ("v3") >> Fixes: cae106dd67b9 ("mm/khugepaged: refactor collapse_file control flow") >> Signed-off-by: Shardul Bankar<shardul.b@mpiricsoftware.com> >> --- >> v4: >> - Drop redundant `if (xa_alloc)` around xas_destroy(), as xas_destroy() >> already checks xa_alloc internally. >> v3: >> - Move fix from collapse_file() to xas_create_range() as suggested by Matthew Wilcox >> - Fix in library function makes API safer by default, preventing callers from needing >> to remember cleanup >> - Use shared cleanup label that both restore: and success: paths jump to >> - Clean up unused spare node on both success and error exit paths >> v2: >> - Call xas_destroy() on both success and failure >> - Explained retry semantics and xa_alloc / concurrency risk >> - Dropped cleanup_empty_nodes from previous proposal >> >> lib/xarray.c | 7 ++++++- >> 1 file changed, 6 insertions(+), 1 deletion(-) >> >> diff --git a/lib/xarray.c b/lib/xarray.c >> index 9a8b4916540c..f49ccfa5f57d 100644 >> --- a/lib/xarray.c >> +++ b/lib/xarray.c >> @@ -744,11 +744,16 @@ void xas_create_range(struct xa_state *xas) >> xas->xa_shift = shift; >> xas->xa_sibs = sibs; >> xas->xa_index = index; >> - return; >> + goto cleanup; >> + >> success: >> xas->xa_index = index; >> if (xas->xa_node) >> xas_set_offset(xas); >> + >> +cleanup: >> + /* Free any unused spare node from xas_nomem() */ >> + xas_destroy(xas); >> } >> EXPORT_SYMBOL_GPL(xas_create_range); >> ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] lib: xarray: free unused spare node in xas_create_range() 2025-12-04 14:26 [PATCH v4] lib: xarray: free unused spare node in xas_create_range() Shardul Bankar ` (2 preceding siblings ...) 2025-12-15 2:19 ` Jinjiang Tu @ 2025-12-31 6:29 ` Shardul Bankar 2026-01-26 5:46 ` Shardul B 2026-01-27 3:48 ` Matthew Wilcox 3 siblings, 2 replies; 12+ messages in thread From: Shardul Bankar @ 2025-12-31 6:29 UTC (permalink / raw) To: willy, akpm, linux-mm Cc: linux-fsdevel, linux-kernel, dev.jain, david, janak, shardulsb08, tujinjiang Hi Matthew, Andrew, Just a gentle ping on this one. v4 has a Reviewed-by from David, and Dev and Jinjiang both followed up with additional observations and ideas for related cleanups. As far as I can see, there are no outstanding objections to the current xas_nomem() / xas_create_range() spare-node fix. If this looks good to you, could it be queued for inclusion via whichever tree you think is appropriate? The separate question that Jinjiang raised about empty xa_nodes installed by xas_create_range() but never populated is being discussed in its own bug-report thread here: https://lore.kernel.org/all/86834731-02ba-43ea-9def-8b8ca156ec4a@huawei.com/ Once this patch is accepted/taken, I plan to follow up with a small cleanup patch that simplifies the label usage in xas_create_range() along the lines David suggested. Thanks, Shardul ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] lib: xarray: free unused spare node in xas_create_range() 2025-12-31 6:29 ` Shardul Bankar @ 2026-01-26 5:46 ` Shardul B 2026-01-26 20:04 ` Andrew Morton 2026-01-27 3:48 ` Matthew Wilcox 1 sibling, 1 reply; 12+ messages in thread From: Shardul B @ 2026-01-26 5:46 UTC (permalink / raw) To: willy, akpm, linux-mm Cc: linux-fsdevel, linux-kernel, dev.jain, david, janak, tujinjiang, shardulsb08 [-- Attachment #1: Type: text/plain, Size: 437 bytes --] Hi Matthew, Andrew, Gently pinging this patch. I’ve checked the latest linux-next (next-20260123) and the akpm-unstable branches, and it doesn't appear to have been picked up yet. The patch has a Reviewed-by from David Hildenbrand and fixes a memory leak in the XArray library identified by syzbot. If there are no further concerns, could you please let me know if this is queued for the next cycle? Thanks, Shardul [-- Attachment #2: Type: text/html, Size: 1606 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] lib: xarray: free unused spare node in xas_create_range() 2026-01-26 5:46 ` Shardul B @ 2026-01-26 20:04 ` Andrew Morton 2026-01-27 1:12 ` Jinjiang Tu 0 siblings, 1 reply; 12+ messages in thread From: Andrew Morton @ 2026-01-26 20:04 UTC (permalink / raw) To: Shardul B Cc: willy, linux-mm, linux-fsdevel, linux-kernel, dev.jain, david, janak, tujinjiang, shardulsb08 On Mon, 26 Jan 2026 11:16:08 +0530 Shardul B <shardul.b@mpiricsoftware.com> wrote: > Hi Matthew, Andrew, > > > Gently pinging this patch. I’ve checked the latest linux-next (next-20260123) and the akpm-unstable branches, and it doesn't appear to have been picked up yet. > > > The patch has a Reviewed-by from David Hildenbrand and fixes a memory leak in the XArray library identified by syzbot. > > > If there are no further concerns, could you please let me know if this is queued for the next cycle? There are comments from Jinjiang Tu and Dev Jain which remain unaddressed, please. The leak is a rare and this is a minor problem, I believe? I'll queue the patch for some exposure and testing but it appears that some additional consideration is needed before it should be progressed further. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] lib: xarray: free unused spare node in xas_create_range() 2026-01-26 20:04 ` Andrew Morton @ 2026-01-27 1:12 ` Jinjiang Tu 0 siblings, 0 replies; 12+ messages in thread From: Jinjiang Tu @ 2026-01-27 1:12 UTC (permalink / raw) To: Andrew Morton, Shardul B Cc: willy, linux-mm, linux-fsdevel, linux-kernel, dev.jain, david, janak, shardulsb08 在 2026/1/27 4:04, Andrew Morton 写道: > On Mon, 26 Jan 2026 11:16:08 +0530 Shardul B <shardul.b@mpiricsoftware.com> wrote: > >> Hi Matthew, Andrew, >> >> >> Gently pinging this patch. I’ve checked the latest linux-next (next-20260123) and the akpm-unstable branches, and it doesn't appear to have been picked up yet. >> >> >> The patch has a Reviewed-by from David Hildenbrand and fixes a memory leak in the XArray library identified by syzbot. >> >> >> If there are no further concerns, could you please let me know if this is queued for the next cycle? > There are comments from Jinjiang Tu and Dev Jain which remain > unaddressed, please. Hi, Andrew The issue is another problem, independent ofShardul's patch. I have posted a patch to solve it. https://lore.kernel.org/linux-mm/20260121062243.1893129-1-tujinjiang@huawei.com > The leak is a rare and this is a minor problem, I believe? > > I'll queue the patch for some exposure and testing but it appears that > some additional consideration is needed before it should be progressed > further. > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v4] lib: xarray: free unused spare node in xas_create_range() 2025-12-31 6:29 ` Shardul Bankar 2026-01-26 5:46 ` Shardul B @ 2026-01-27 3:48 ` Matthew Wilcox 1 sibling, 0 replies; 12+ messages in thread From: Matthew Wilcox @ 2026-01-27 3:48 UTC (permalink / raw) To: Shardul Bankar Cc: akpm, linux-mm, linux-fsdevel, linux-kernel, dev.jain, david, janak, shardulsb08, tujinjiang On Wed, Dec 31, 2025 at 11:59:42AM +0530, Shardul Bankar wrote: > Hi Matthew, Andrew, pinging on december 31st is an entirely ineffective thing to do. for one, oracle is in shutdown. ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2026-01-27 3:48 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2025-12-04 14:26 [PATCH v4] lib: xarray: free unused spare node in xas_create_range() Shardul Bankar 2025-12-05 7:22 ` David Hildenbrand (Red Hat) 2025-12-05 10:51 ` Shardul Bankar 2025-12-08 11:36 ` David Hildenbrand (Red Hat) 2025-12-08 8:37 ` Dev Jain 2025-12-15 2:19 ` Jinjiang Tu 2025-12-15 3:42 ` Jinjiang Tu 2025-12-31 6:29 ` Shardul Bankar 2026-01-26 5:46 ` Shardul B 2026-01-26 20:04 ` Andrew Morton 2026-01-27 1:12 ` Jinjiang Tu 2026-01-27 3:48 ` Matthew Wilcox
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox