* [bug report] memory leak of xa_node in collapse_file() when rollbacks
@ 2025-12-18 11:45 Jinjiang Tu
2025-12-18 11:51 ` David Hildenbrand (Red Hat)
0 siblings, 1 reply; 10+ messages in thread
From: Jinjiang Tu @ 2025-12-18 11:45 UTC (permalink / raw)
To: Andrew Morton, Matthew Wilcox, ziy, david, lorenzo.stoakes,
baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain,
baohua, lance.yang, linux-mm, linux-fsdevel
Cc: Kefeng Wang, tujinjiang
I encountered a memory leak issue caused by xas_create_range().
collapse_file() calls xas_create_range() to pre-create all slots needed.
If collapse_file() finally fails, these pre-created slots are empty nodes
and aren't destroyed.
I can reproduce it with following steps.
1) create file /tmp/test_madvise_collapse and ftruncate to 4MB size, and then mmap the file
2) memset for the first 2MB
3) madvise(MADV_COLLAPSE) for the second 2MB
4) unlink the file
in 3), collapse_file() calls xas_create_range() to expand xarray depth, and fails to collapse
due to the whole 2M region is empty, the code is as following:
collapse_file()
for (index = start; index < end;) {
xas_set(&xas, index);
folio = xas_load(&xas);
VM_BUG_ON(index != xas.xa_index);
if (is_shmem) {
if (!folio) {
/*
* Stop if extent has been truncated or
* hole-punched, and is now completely
* empty.
*/
if (index == start) {
if (!xas_next_entry(&xas, end - 1)) {
result = SCAN_TRUNCATED;
goto xa_locked;
}
}
...
}
collapse_file() rollback path doesn't destroy the pre-created empty nodes.
When the file is deleted, shmem_evict_inode()->shmem_truncate_range() traverses
all entries and calls xas_store(xas, NULL) to delete, if the leaf xa_node that
stores deleted entry becomes emtry, xas_store() will automatically delete the empty
node and delete it's parent is empty too, until parent node isn't empty. shmem_evict_inode()
won't traverse the empty nodes created by xas_create_range() due to these nodes doesn't store
any entries. As a result, these empty nodes are leaked.
At first, I tried to destory the empty nodes when collapse_file() goes to rollback path. However,
collapse_file() only holds xarray lock and may release the lock, so we couldn't prevent concurrent
call of collapse_file(), so the deleted empty nodes may be needed by other collapse_file() calls.
IIUC, xas_create_range() is used to guarantee the xas_store(&xas, new_folio); succeeds. Could we
remove xas_create_range() call and just rollback when we fail to xas_store?
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks
2025-12-18 11:45 [bug report] memory leak of xa_node in collapse_file() when rollbacks Jinjiang Tu
@ 2025-12-18 11:51 ` David Hildenbrand (Red Hat)
2025-12-18 12:18 ` Jinjiang Tu
2025-12-18 12:35 ` Jinjiang Tu
0 siblings, 2 replies; 10+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-12-18 11:51 UTC (permalink / raw)
To: Jinjiang Tu, Andrew Morton, Matthew Wilcox, ziy, lorenzo.stoakes,
baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain,
baohua, lance.yang, linux-mm, linux-fsdevel
Cc: Kefeng Wang
On 12/18/25 12:45, Jinjiang Tu wrote:
> I encountered a memory leak issue caused by xas_create_range().
>
> collapse_file() calls xas_create_range() to pre-create all slots needed.
> If collapse_file() finally fails, these pre-created slots are empty nodes
> and aren't destroyed.
>
> I can reproduce it with following steps.
> 1) create file /tmp/test_madvise_collapse and ftruncate to 4MB size, and then mmap the file
> 2) memset for the first 2MB
> 3) madvise(MADV_COLLAPSE) for the second 2MB
> 4) unlink the file
>
> in 3), collapse_file() calls xas_create_range() to expand xarray depth, and fails to collapse
> due to the whole 2M region is empty, the code is as following:
>
> collapse_file()
> for (index = start; index < end;) {
> xas_set(&xas, index);
> folio = xas_load(&xas);
>
> VM_BUG_ON(index != xas.xa_index);
> if (is_shmem) {
> if (!folio) {
> /*
> * Stop if extent has been truncated or
> * hole-punched, and is now completely
> * empty.
> */
> if (index == start) {
> if (!xas_next_entry(&xas, end - 1)) {
> result = SCAN_TRUNCATED;
> goto xa_locked;
> }
> }
> ...
> }
>
>
> collapse_file() rollback path doesn't destroy the pre-created empty nodes.
>
> When the file is deleted, shmem_evict_inode()->shmem_truncate_range() traverses
> all entries and calls xas_store(xas, NULL) to delete, if the leaf xa_node that
> stores deleted entry becomes emtry, xas_store() will automatically delete the empty
> node and delete it's parent is empty too, until parent node isn't empty. shmem_evict_inode()
> won't traverse the empty nodes created by xas_create_range() due to these nodes doesn't store
> any entries. As a result, these empty nodes are leaked.
>
> At first, I tried to destory the empty nodes when collapse_file() goes to rollback path. However,
> collapse_file() only holds xarray lock and may release the lock, so we couldn't prevent concurrent
> call of collapse_file(), so the deleted empty nodes may be needed by other collapse_file() calls.
>
> IIUC, xas_create_range() is used to guarantee the xas_store(&xas, new_folio); succeeds. Could we
> remove xas_create_range() call and just rollback when we fail to xas_store?
Hi,
thanks for the report.
Is that what [1] is fixing?
[1]
https://lore.kernel.org/linux-mm/20251204142625.1763372-1-shardul.b@mpiricsoftware.com/
--
Cheers
David
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks
2025-12-18 11:51 ` David Hildenbrand (Red Hat)
@ 2025-12-18 12:18 ` Jinjiang Tu
2025-12-18 12:49 ` David Hildenbrand (Red Hat)
2025-12-18 12:35 ` Jinjiang Tu
1 sibling, 1 reply; 10+ messages in thread
From: Jinjiang Tu @ 2025-12-18 12:18 UTC (permalink / raw)
To: David Hildenbrand (Red Hat),
Andrew Morton, Matthew Wilcox, ziy, lorenzo.stoakes, baolin.wang,
Liam.Howlett, npache, ryan.roberts, dev.jain, baohua, lance.yang,
linux-mm, linux-fsdevel
Cc: Kefeng Wang
[-- Attachment #1: Type: text/plain, Size: 3253 bytes --]
在 2025/12/18 19:51, David Hildenbrand (Red Hat) 写道:
> On 12/18/25 12:45, Jinjiang Tu wrote:
>> I encountered a memory leak issue caused by xas_create_range().
>>
>> collapse_file() calls xas_create_range() to pre-create all slots needed.
>> If collapse_file() finally fails, these pre-created slots are empty
>> nodes
>> and aren't destroyed.
>>
>> I can reproduce it with following steps.
>> 1) create file /tmp/test_madvise_collapse and ftruncate to 4MB size,
>> and then mmap the file
>> 2) memset for the first 2MB
>> 3) madvise(MADV_COLLAPSE) for the second 2MB
>> 4) unlink the file
>>
>> in 3), collapse_file() calls xas_create_range() to expand xarray
>> depth, and fails to collapse
>> due to the whole 2M region is empty, the code is as following:
>>
>> collapse_file()
>> for (index = start; index < end;) {
>> xas_set(&xas, index);
>> folio = xas_load(&xas);
>>
>> VM_BUG_ON(index != xas.xa_index);
>> if (is_shmem) {
>> if (!folio) {
>> /*
>> * Stop if extent has been truncated or
>> * hole-punched, and is now completely
>> * empty.
>> */
>> if (index == start) {
>> if (!xas_next_entry(&xas, end - 1)) {
>> result = SCAN_TRUNCATED;
>> goto xa_locked;
>> }
>> }
>> ...
>> }
>>
>>
>> collapse_file() rollback path doesn't destroy the pre-created empty
>> nodes.
>>
>> When the file is deleted, shmem_evict_inode()->shmem_truncate_range()
>> traverses
>> all entries and calls xas_store(xas, NULL) to delete, if the leaf
>> xa_node that
>> stores deleted entry becomes emtry, xas_store() will automatically
>> delete the empty
>> node and delete it's parent is empty too, until parent node isn't
>> empty. shmem_evict_inode()
>> won't traverse the empty nodes created by xas_create_range() due to
>> these nodes doesn't store
>> any entries. As a result, these empty nodes are leaked.
>>
>> At first, I tried to destory the empty nodes when collapse_file()
>> goes to rollback path. However,
>> collapse_file() only holds xarray lock and may release the lock, so
>> we couldn't prevent concurrent
>> call of collapse_file(), so the deleted empty nodes may be needed by
>> other collapse_file() calls.
>>
>> IIUC, xas_create_range() is used to guarantee the xas_store(&xas,
>> new_folio); succeeds. Could we
>> remove xas_create_range() call and just rollback when we fail to
>> xas_store?
>
> Hi,
>
> thanks for the report.
>
> Is that what [1] is fixing?
>
> [1]
> https://lore.kernel.org/linux-mm/20251204142625.1763372-1-shardul.b@mpiricsoftware.com/
>
>
No, this patch fixes memory leak caused by xas->xa_alloc allocated by xas_nomem() and the xa_node
isn't installed into xarray.
In my case, the leaked xa_nodes have been installed into xarray by xas_create_range().
[-- Attachment #2: Type: text/html, Size: 4850 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks
2025-12-18 11:51 ` David Hildenbrand (Red Hat)
2025-12-18 12:18 ` Jinjiang Tu
@ 2025-12-18 12:35 ` Jinjiang Tu
1 sibling, 0 replies; 10+ messages in thread
From: Jinjiang Tu @ 2025-12-18 12:35 UTC (permalink / raw)
To: David Hildenbrand (Red Hat),
Andrew Morton, Matthew Wilcox, ziy, lorenzo.stoakes, baolin.wang,
Liam.Howlett, npache, ryan.roberts, dev.jain, baohua, lance.yang,
linux-mm, linux-fsdevel
Cc: Kefeng Wang
在 2025/12/18 19:51, David Hildenbrand (Red Hat) 写道:
> On 12/18/25 12:45, Jinjiang Tu wrote:
>> I encountered a memory leak issue caused by xas_create_range().
>>
>> collapse_file() calls xas_create_range() to pre-create all slots needed.
>> If collapse_file() finally fails, these pre-created slots are empty
>> nodes
>> and aren't destroyed.
>>
>> I can reproduce it with following steps.
>> 1) create file /tmp/test_madvise_collapse and ftruncate to 4MB size,
>> and then mmap the file
>> 2) memset for the first 2MB
>> 3) madvise(MADV_COLLAPSE) for the second 2MB
>> 4) unlink the file
>>
>> in 3), collapse_file() calls xas_create_range() to expand xarray
>> depth, and fails to collapse
>> due to the whole 2M region is empty, the code is as following:
>>
>> collapse_file()
>> for (index = start; index < end;) {
>> xas_set(&xas, index);
>> folio = xas_load(&xas);
>>
>> VM_BUG_ON(index != xas.xa_index);
>> if (is_shmem) {
>> if (!folio) {
>> /*
>> * Stop if extent has been truncated or
>> * hole-punched, and is now completely
>> * empty.
>> */
>> if (index == start) {
>> if (!xas_next_entry(&xas, end - 1)) {
>> result = SCAN_TRUNCATED;
>> goto xa_locked;
>> }
>> }
>> ...
>> }
>>
>>
>> collapse_file() rollback path doesn't destroy the pre-created empty
>> nodes.
>>
>> When the file is deleted, shmem_evict_inode()->shmem_truncate_range()
>> traverses
>> all entries and calls xas_store(xas, NULL) to delete, if the leaf
>> xa_node that
>> stores deleted entry becomes emtry, xas_store() will automatically
>> delete the empty
>> node and delete it's parent is empty too, until parent node isn't
>> empty. shmem_evict_inode()
>> won't traverse the empty nodes created by xas_create_range() due to
>> these nodes doesn't store
>> any entries. As a result, these empty nodes are leaked.
>>
>> At first, I tried to destory the empty nodes when collapse_file()
>> goes to rollback path. However,
>> collapse_file() only holds xarray lock and may release the lock, so
>> we couldn't prevent concurrent
>> call of collapse_file(), so the deleted empty nodes may be needed by
>> other collapse_file() calls.
>>
>> IIUC, xas_create_range() is used to guarantee the xas_store(&xas,
>> new_folio); succeeds. Could we
>> remove xas_create_range() call and just rollback when we fail to
>> xas_store?
>
> Hi,
>
> thanks for the report.
>
> Is that what [1] is fixing?
>
> [1]
> https://lore.kernel.org/linux-mm/20251204142625.1763372-1-shardul.b@mpiricsoftware.com/
>
>
the allocation stack of the leaked xa_node is as follows:
unreferenced object 0xffff0000d4d9fd98 (size 576):
comm "test_filemap", pid 8220, jiffies 4294957272 (age 659.036s)
hex dump (first 32 bytes):
00 08 00 00 00 00 00 00 88 54 75 dc 00 00 ff ff .........Tu.....
a0 7d db d6 00 00 ff ff b0 fd d9 d4 00 00 ff ff .}..............
backtrace:
kmemleak_alloc+0xb8/0xd8
kmem_cache_alloc_lru+0x308/0x558
xas_alloc+0xb4/0xd0
xas_create+0xf4/0x1f8
xas_create_range+0xd0/0x158
collapse_file+0x13c/0x11e0
hpage_collapse_scan_file+0x1e0/0x488
madvise_collapse+0x174/0x650
madvise_vma_behavior+0x334/0x520
do_madvise+0x1bc/0x388
__arm64_sys_madvise+0x2c/0x48
invoke_syscall+0x50/0x128
el0_svc_common.constprop.0+0xc8/0xf0
do_el0_svc+0x24/0x38
el0_svc+0x44/0x200
el0t_64_sync_handler+0x100/0x130
it is allocated by xas_create_range(), not xas_nomem(). So it isn't the same issue.
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks
2025-12-18 12:18 ` Jinjiang Tu
@ 2025-12-18 12:49 ` David Hildenbrand (Red Hat)
2025-12-18 13:11 ` Jinjiang Tu
0 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-12-18 12:49 UTC (permalink / raw)
To: Jinjiang Tu, Andrew Morton, Matthew Wilcox, ziy, lorenzo.stoakes,
baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain,
baohua, lance.yang, linux-mm, linux-fsdevel
Cc: Kefeng Wang, Shardul Bankar
On 12/18/25 13:18, Jinjiang Tu wrote:
>
> 在 2025/12/18 19:51, David Hildenbrand (Red Hat) 写道:
>> On 12/18/25 12:45, Jinjiang Tu wrote:
>>> I encountered a memory leak issue caused by xas_create_range().
>>>
>>> collapse_file() calls xas_create_range() to pre-create all slots needed.
>>> If collapse_file() finally fails, these pre-created slots are empty
>>> nodes
>>> and aren't destroyed.
>>>
>>> I can reproduce it with following steps.
>>> 1) create file /tmp/test_madvise_collapse and ftruncate to 4MB size,
>>> and then mmap the file
>>> 2) memset for the first 2MB
>>> 3) madvise(MADV_COLLAPSE) for the second 2MB
>>> 4) unlink the file
>>>
>>> in 3), collapse_file() calls xas_create_range() to expand xarray
>>> depth, and fails to collapse
>>> due to the whole 2M region is empty, the code is as following:
>>>
>>> collapse_file()
>>> for (index = start; index < end;) {
>>> xas_set(&xas, index);
>>> folio = xas_load(&xas);
>>>
>>> VM_BUG_ON(index != xas.xa_index);
>>> if (is_shmem) {
>>> if (!folio) {
>>> /*
>>> * Stop if extent has been truncated or
>>> * hole-punched, and is now completely
>>> * empty.
>>> */
>>> if (index == start) {
>>> if (!xas_next_entry(&xas, end - 1)) {
>>> result = SCAN_TRUNCATED;
>>> goto xa_locked;
>>> }
>>> }
>>> ...
>>> }
>>>
>>>
>>> collapse_file() rollback path doesn't destroy the pre-created empty
>>> nodes.
>>>
>>> When the file is deleted, shmem_evict_inode()->shmem_truncate_range()
>>> traverses
>>> all entries and calls xas_store(xas, NULL) to delete, if the leaf
>>> xa_node that
>>> stores deleted entry becomes emtry, xas_store() will automatically
>>> delete the empty
>>> node and delete it's parent is empty too, until parent node isn't
>>> empty. shmem_evict_inode()
>>> won't traverse the empty nodes created by xas_create_range() due to
>>> these nodes doesn't store
>>> any entries. As a result, these empty nodes are leaked.
>>>
>>> At first, I tried to destory the empty nodes when collapse_file()
>>> goes to rollback path. However,
>>> collapse_file() only holds xarray lock and may release the lock, so
>>> we couldn't prevent concurrent
>>> call of collapse_file(), so the deleted empty nodes may be needed by
>>> other collapse_file() calls.
>>>
>>> IIUC, xas_create_range() is used to guarantee the xas_store(&xas,
>>> new_folio); succeeds. Could we
>>> remove xas_create_range() call and just rollback when we fail to
>>> xas_store?
>>
>> Hi,
>>
>> thanks for the report.
>>
>> Is that what [1] is fixing?
>>
>> [1] https://lore.kernel.org/linux-mm/20251204142625.1763372-1-
>> shardul.b@mpiricsoftware.com/
>>
> No, this patch fixes memory leak caused by xas->xa_alloc allocated by xas_nomem() and the xa_node
> isn't installed into xarray.
>
> In my case, the leaked xa_nodes have been installed into xarray by xas_create_range().
Thanks for checking. I thought that was also discussed as part of the
other fix.
See [2] where we have
"Note: This fixes the leak of pre-allocated nodes. A separate fix will
be needed to clean up empty nodes that were inserted into the tree by
xas_create_range() but never populated."
Is that the issue you are describing? (sounds like it, but I only
skimmed over the details).
CCing Shardul.
[2]
https://lore.kernel.org/linux-mm/20251123132727.3262731-1-shardul.b@mpiricsoftware.com/
--
Cheers
David
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks
2025-12-18 12:49 ` David Hildenbrand (Red Hat)
@ 2025-12-18 13:11 ` Jinjiang Tu
2025-12-25 4:15 ` Shardul Bankar
0 siblings, 1 reply; 10+ messages in thread
From: Jinjiang Tu @ 2025-12-18 13:11 UTC (permalink / raw)
To: David Hildenbrand (Red Hat),
Andrew Morton, Matthew Wilcox, ziy, lorenzo.stoakes, baolin.wang,
Liam.Howlett, npache, ryan.roberts, dev.jain, baohua, lance.yang,
linux-mm, linux-fsdevel
Cc: Kefeng Wang, Shardul Bankar
[-- Attachment #1: Type: text/plain, Size: 7593 bytes --]
在 2025/12/18 20:49, David Hildenbrand (Red Hat) 写道:
> On 12/18/25 13:18, Jinjiang Tu wrote:
>>
>> 在 2025/12/18 19:51, David Hildenbrand (Red Hat) 写道:
>>> On 12/18/25 12:45, Jinjiang Tu wrote:
>>>> I encountered a memory leak issue caused by xas_create_range().
>>>>
>>>> collapse_file() calls xas_create_range() to pre-create all slots
>>>> needed.
>>>> If collapse_file() finally fails, these pre-created slots are empty
>>>> nodes
>>>> and aren't destroyed.
>>>>
>>>> I can reproduce it with following steps.
>>>> 1) create file /tmp/test_madvise_collapse and ftruncate to 4MB
>>>> size, and then mmap the file
>>>> 2) memset for the first 2MB
>>>> 3) madvise(MADV_COLLAPSE) for the second 2MB
>>>> 4) unlink the file
>>>>
>>>> in 3), collapse_file() calls xas_create_range() to expand xarray
>>>> depth, and fails to collapse
>>>> due to the whole 2M region is empty, the code is as following:
>>>>
>>>> collapse_file()
>>>> for (index = start; index < end;) {
>>>> xas_set(&xas, index);
>>>> folio = xas_load(&xas);
>>>>
>>>> VM_BUG_ON(index != xas.xa_index);
>>>> if (is_shmem) {
>>>> if (!folio) {
>>>> /*
>>>> * Stop if extent has been truncated or
>>>> * hole-punched, and is now completely
>>>> * empty.
>>>> */
>>>> if (index == start) {
>>>> if (!xas_next_entry(&xas, end - 1)) {
>>>> result = SCAN_TRUNCATED;
>>>> goto xa_locked;
>>>> }
>>>> }
>>>> ...
>>>> }
>>>>
>>>>
>>>> collapse_file() rollback path doesn't destroy the pre-created empty
>>>> nodes.
>>>>
>>>> When the file is deleted,
>>>> shmem_evict_inode()->shmem_truncate_range() traverses
>>>> all entries and calls xas_store(xas, NULL) to delete, if the leaf
>>>> xa_node that
>>>> stores deleted entry becomes emtry, xas_store() will automatically
>>>> delete the empty
>>>> node and delete it's parent is empty too, until parent node isn't
>>>> empty. shmem_evict_inode()
>>>> won't traverse the empty nodes created by xas_create_range() due to
>>>> these nodes doesn't store
>>>> any entries. As a result, these empty nodes are leaked.
>>>>
>>>> At first, I tried to destory the empty nodes when collapse_file()
>>>> goes to rollback path. However,
>>>> collapse_file() only holds xarray lock and may release the lock, so
>>>> we couldn't prevent concurrent
>>>> call of collapse_file(), so the deleted empty nodes may be needed
>>>> by other collapse_file() calls.
>>>>
>>>> IIUC, xas_create_range() is used to guarantee the xas_store(&xas,
>>>> new_folio); succeeds. Could we
>>>> remove xas_create_range() call and just rollback when we fail to
>>>> xas_store?
>>>
>>> Hi,
>>>
>>> thanks for the report.
>>>
>>> Is that what [1] is fixing?
>>>
>>> [1] https://lore.kernel.org/linux-mm/20251204142625.1763372-1-
>>> shardul.b@mpiricsoftware.com/
>>>
>> No, this patch fixes memory leak caused by xas->xa_alloc allocated by
>> xas_nomem() and the xa_node
>> isn't installed into xarray.
>>
>> In my case, the leaked xa_nodes have been installed into xarray by
>> xas_create_range().
>
> Thanks for checking. I thought that was also discussed as part of the
> other fix.
>
> See [2] where we have
>
> "Note: This fixes the leak of pre-allocated nodes. A separate fix will
> be needed to clean up empty nodes that were inserted into the tree by
> xas_create_range() but never populated."
>
> Is that the issue you are describing? (sounds like it, but I only
> skimmed over the details).
>
> CCing Shardul.
Yes, the same issue. As I descirbed in the first email:
"
At first, I tried to destory the empty nodes when collapse_file() goes to rollback path. However,
collapse_file() only holds xarray lock and may release the lock, so we couldn't prevent concurrent
call of collapse_file(), so the deleted empty nodes may be needed by other collapse_file() calls.
"
We couldn't bindly cleanup empty nodes in the rollback path.
I'm trying to move the xas_create_range() before xas_store() and always under xarray lock to make
rollback easier, the diff likes (applied on 6.6, haven't tested yet)
diff --git a/include/linux/xarray.h b/include/linux/xarray.h
index c3f54d2eaf36..5ef393011a61 100644
--- a/include/linux/xarray.h
+++ b/include/linux/xarray.h
@@ -1548,6 +1548,7 @@ void xas_destroy(struct xa_state *);
void xas_pause(struct xa_state *);
void xas_create_range(struct xa_state *);
+void xas_destroy_range(struct xa_state *xas, unsigned long start, unsigned long end);
#ifdef CONFIG_XARRAY_MULTI
int xa_get_order(struct xarray *, unsigned long index);
diff --git a/lib/xarray.c b/lib/xarray.c
index 32d4bac8c94c..724a7f35a26f 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -745,6 +745,25 @@ void xas_create_range(struct xa_state *xas)
}
EXPORT_SYMBOL_GPL(xas_create_range);
+void xas_destroy_range(struct xa_state *xas, unsigned long start, unsigned long end)
+{
+ unsigned long index;
+ void *entry;
+
+ for (index = start; index < end; ++index) {
+ xas_set(xas, index);
+ entry = xas_load(xas);
+ if (entry)
+ continue;
+
+ if (!xas->xa_node || xas_invalid(xas))
+ continue;
+
+ if (!xas->xa_node->count)
+ xas_delete_node(xas);
+ }
+}
+
static void update_node(struct xa_state *xas, struct xa_node *node,
int count, int values)
{
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index e8cb826c1994..36f0600ef7b1 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1832,7 +1832,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
struct folio *folio, *tmp, *new_folio;
pgoff_t index = 0, end = start + HPAGE_PMD_NR;
LIST_HEAD(pagelist);
- XA_STATE_ORDER(xas, &mapping->i_pages, start, HPAGE_PMD_ORDER);
+ XA_STATE(xas, &mapping->i_pages, 0);
int nr_none = 0, result = SCAN_SUCCEED;
bool is_shmem = shmem_file(file);
@@ -1851,22 +1851,6 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
new_folio->index = start;
new_folio->mapping = mapping;
- /*
- * Ensure we have slots for all the pages in the range. This is
- * almost certainly a no-op because most of the pages must be present
- */
- do {
- xas_lock_irq(&xas);
- xas_create_range(&xas);
- if (!xas_error(&xas))
- break;
- xas_unlock_irq(&xas);
- if (!xas_nomem(&xas, GFP_KERNEL)) {
- result = SCAN_FAIL;
- goto rollback;
- }
- } while (1);
-
for (index = start; index < end;) {
xas_set(&xas, index);
folio = xas_load(&xas);
@@ -2163,6 +2147,23 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr,
xas_lock_irq(&xas);
}
+ xas_set_order(&xas, start, HPAGE_PMD_ORDER);
+ xas_create_range(&xas);
+ if (xas_error(&xas)) {
+ if (nr_none) {
+ xas_set(&xas, start);
+ for (index = start; index < end; index++) {
+ if (xas_next(&xas) == XA_RETRY_ENTRY)
+ xas_store(&xas, NULL);
+ }
+ }
+ xas_destroy_range(&xas, start, end);
+ xas_unlock_irq(&xas);
+ result = SCAN_FAIL;
+
+ goto rollback;
+ }
+
if (is_shmem)
__lruvec_stat_mod_folio(new_folio, NR_SHMEM_THPS, HPAGE_PMD_NR);
else
--
2.43.0
>
>
> [2]
> https://lore.kernel.org/linux-mm/20251123132727.3262731-1-shardul.b@mpiricsoftware.com/
>
[-- Attachment #2: Type: text/html, Size: 10351 bytes --]
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks
2025-12-18 13:11 ` Jinjiang Tu
@ 2025-12-25 4:15 ` Shardul Bankar
2025-12-27 1:24 ` Jinjiang Tu
0 siblings, 1 reply; 10+ messages in thread
From: Shardul Bankar @ 2025-12-25 4:15 UTC (permalink / raw)
To: Jinjiang Tu, David Hildenbrand (Red Hat),
Andrew Morton, Matthew Wilcox, ziy, lorenzo.stoakes, baolin.wang,
Liam.Howlett, npache, ryan.roberts, dev.jain, baohua, lance.yang,
linux-mm, linux-fsdevel
Cc: Kefeng Wang, shardulsb08
On Thu, 2025-12-18 at 21:11 +0800, Jinjiang Tu wrote:
>
> 在 2025/12/18 20:49, David Hildenbrand (Red Hat) 写道:
>
> > >
> >
> > Thanks for checking. I thought that was also discussed as part of
> > the other fix.
> >
> > See [2] where we have
> >
> > "Note: This fixes the leak of pre-allocated nodes. A separate fix
> > will
> > be needed to clean up empty nodes that were inserted into the tree
> > by
> > xas_create_range() but never populated."
> >
> > Is that the issue you are describing? (sounds like it, but I only
> > skimmed over the details).
> >
> > CCing Shardul.
> Yes, the same issue. As I descirbed in the first email:
> "
> At first, I tried to destory the empty nodes when collapse_file()
> goes to rollback path. However,
> collapse_file() only holds xarray lock and may release the lock, so
> we couldn't prevent concurrent
> call of collapse_file(), so the deleted empty nodes may be needed by
> other collapse_file() calls.
> "
Hi David, Jinjiang,
As Jinjiang mentioned, this appears to address what I had originally
referred to in the "Note:" in [1].
Just to clarify the context of the "Note:", that was based on my
assumption at the time that such empty nodes would be considered leaks.
After Dev’s feedback in [2]:
"No "fix" is needed in this case, the empty nodes are there in the tree
and there is no leak."
and looking at the older discussion in [3]:
"There's nothing to free; if a node is allocated, then it's stored in
the tree where it can later be found and reused. "
my updated understanding is that there is no leak in this case- the
nodes remain valid and reusable, and therefore do not require a
separate fix.
David could you correct me if I am mistaken?
[1]
https://lore.kernel.org/linux-mm/20251123132727.3262731-1-shardul.b@mpiricsoftware.com/
[2]
https://lore.kernel.org/linux-mm/57cbf887-d181-418b-a6c7-9f3eff5d632a@arm.com/
[3]
https://lore.kernel.org/all/Ys1r06szkVi3QEai@casper.infradead.org/
Thanks,
Shardul
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks
2025-12-25 4:15 ` Shardul Bankar
@ 2025-12-27 1:24 ` Jinjiang Tu
2025-12-30 21:03 ` David Hildenbrand (Red Hat)
0 siblings, 1 reply; 10+ messages in thread
From: Jinjiang Tu @ 2025-12-27 1:24 UTC (permalink / raw)
To: Shardul Bankar, David Hildenbrand (Red Hat),
Andrew Morton, Matthew Wilcox, ziy, lorenzo.stoakes, baolin.wang,
Liam.Howlett, npache, ryan.roberts, dev.jain, baohua, lance.yang,
linux-mm, linux-fsdevel
Cc: Kefeng Wang, shardulsb08
在 2025/12/25 12:15, Shardul Bankar 写道:
> On Thu, 2025-12-18 at 21:11 +0800, Jinjiang Tu wrote:
>> 在 2025/12/18 20:49, David Hildenbrand (Red Hat) 写道:
>>
>>> Thanks for checking. I thought that was also discussed as part of
>>> the other fix.
>>>
>>> See [2] where we have
>>>
>>> "Note: This fixes the leak of pre-allocated nodes. A separate fix
>>> will
>>> be needed to clean up empty nodes that were inserted into the tree
>>> by
>>> xas_create_range() but never populated."
>>>
>>> Is that the issue you are describing? (sounds like it, but I only
>>> skimmed over the details).
>>>
>>> CCing Shardul.
>> Yes, the same issue. As I descirbed in the first email:
>> "
>> At first, I tried to destory the empty nodes when collapse_file()
>> goes to rollback path. However,
>> collapse_file() only holds xarray lock and may release the lock, so
>> we couldn't prevent concurrent
>> call of collapse_file(), so the deleted empty nodes may be needed by
>> other collapse_file() calls.
>> "
> Hi David, Jinjiang,
>
> As Jinjiang mentioned, this appears to address what I had originally
> referred to in the "Note:" in [1].
>
> Just to clarify the context of the "Note:", that was based on my
> assumption at the time that such empty nodes would be considered leaks.
> After Dev’s feedback in [2]:
> "No "fix" is needed in this case, the empty nodes are there in the tree
> and there is no leak."
>
> and looking at the older discussion in [3]:
> "There's nothing to free; if a node is allocated, then it's stored in
> the tree where it can later be found and reused. "
However, if the empty nodes aren't reused, When the file is deleted,
shmem_evict_inode()->shmem_truncate_range() traverses all entries and
calls xas_store(xas, NULL) to delete, if the leaf xa_node that stores
deleted entry becomes empty, xas_store() will automatically delete the
empty node and delete it's parent is empty too, until parent node isn't
empty. shmem_evict_inode() won't traverse the empty nodes created by
xas_create_range() due to these nodes doesn't store any entries.
>
> my updated understanding is that there is no leak in this case- the
> nodes remain valid and reusable, and therefore do not require a
> separate fix.
>
> David could you correct me if I am mistaken?
>
> [1]
> https://lore.kernel.org/linux-mm/20251123132727.3262731-1-shardul.b@mpiricsoftware.com/
>
> [2]
> https://lore.kernel.org/linux-mm/57cbf887-d181-418b-a6c7-9f3eff5d632a@arm.com/
>
> [3]
> https://lore.kernel.org/all/Ys1r06szkVi3QEai@casper.infradead.org/
>
> Thanks,
> Shardul
>
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks
2025-12-27 1:24 ` Jinjiang Tu
@ 2025-12-30 21:03 ` David Hildenbrand (Red Hat)
2025-12-31 6:29 ` Jinjiang Tu
0 siblings, 1 reply; 10+ messages in thread
From: David Hildenbrand (Red Hat) @ 2025-12-30 21:03 UTC (permalink / raw)
To: Jinjiang Tu, Shardul Bankar, Andrew Morton, Matthew Wilcox, ziy,
lorenzo.stoakes, baolin.wang, Liam.Howlett, npache, ryan.roberts,
dev.jain, baohua, lance.yang, linux-mm, linux-fsdevel
Cc: Kefeng Wang, shardulsb08
On 12/27/25 02:24, Jinjiang Tu wrote:
>
> 在 2025/12/25 12:15, Shardul Bankar 写道:
>> On Thu, 2025-12-18 at 21:11 +0800, Jinjiang Tu wrote:
>>> 在 2025/12/18 20:49, David Hildenbrand (Red Hat) 写道:
>>>
>>>> Thanks for checking. I thought that was also discussed as part of
>>>> the other fix.
>>>>
>>>> See [2] where we have
>>>>
>>>> "Note: This fixes the leak of pre-allocated nodes. A separate fix
>>>> will
>>>> be needed to clean up empty nodes that were inserted into the tree
>>>> by
>>>> xas_create_range() but never populated."
>>>>
>>>> Is that the issue you are describing? (sounds like it, but I only
>>>> skimmed over the details).
>>>>
>>>> CCing Shardul.
>>> Yes, the same issue. As I descirbed in the first email:
>>> "
>>> At first, I tried to destory the empty nodes when collapse_file()
>>> goes to rollback path. However,
>>> collapse_file() only holds xarray lock and may release the lock, so
>>> we couldn't prevent concurrent
>>> call of collapse_file(), so the deleted empty nodes may be needed by
>>> other collapse_file() calls.
>>> "
>> Hi David, Jinjiang,
>>
>> As Jinjiang mentioned, this appears to address what I had originally
>> referred to in the "Note:" in [1].
>>
>> Just to clarify the context of the "Note:", that was based on my
>> assumption at the time that such empty nodes would be considered leaks.
>> After Dev’s feedback in [2]:
>> "No "fix" is needed in this case, the empty nodes are there in the tree
>> and there is no leak."
>>
>> and looking at the older discussion in [3]:
>> "There's nothing to free; if a node is allocated, then it's stored in
>> the tree where it can later be found and reused. "
>
> However, if the empty nodes aren't reused, When the file is deleted,
> shmem_evict_inode()->shmem_truncate_range() traverses all entries and
> calls xas_store(xas, NULL) to delete, if the leaf xa_node that stores
> deleted entry becomes empty, xas_store() will automatically delete the
> empty node and delete it's parent is empty too, until parent node isn't
> empty. shmem_evict_inode() won't traverse the empty nodes created by
> xas_create_range() due to these nodes doesn't store any entries.
So you're saying that nothing/nobody would clean up these xarray entries
and we'd be leaking them?
"struct xarray" documents "If all of the entries in the array are NULL,
@xa_head is a NULL pointer.". So we depend on all entries being set to
NULL in order to properly cleanup/free the xarray automatically.
--
Cheers
David
^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks
2025-12-30 21:03 ` David Hildenbrand (Red Hat)
@ 2025-12-31 6:29 ` Jinjiang Tu
0 siblings, 0 replies; 10+ messages in thread
From: Jinjiang Tu @ 2025-12-31 6:29 UTC (permalink / raw)
To: David Hildenbrand (Red Hat),
Shardul Bankar, Andrew Morton, Matthew Wilcox, ziy,
lorenzo.stoakes, baolin.wang, Liam.Howlett, npache, ryan.roberts,
dev.jain, baohua, lance.yang, linux-mm, linux-fsdevel
Cc: Kefeng Wang, shardulsb08
在 2025/12/31 5:03, David Hildenbrand (Red Hat) 写道:
> On 12/27/25 02:24, Jinjiang Tu wrote:
>>
>> 在 2025/12/25 12:15, Shardul Bankar 写道:
>>> On Thu, 2025-12-18 at 21:11 +0800, Jinjiang Tu wrote:
>>>> 在 2025/12/18 20:49, David Hildenbrand (Red Hat) 写道:
>>>>> Thanks for checking. I thought that was also discussed as part of
>>>>> the other fix.
>>>>> See [2] where we have
>>>>> "Note: This fixes the leak of pre-allocated nodes. A
>>>>> separate fix
>>>>> will
>>>>> be needed to clean up empty nodes that were inserted into the tree
>>>>> by
>>>>> xas_create_range() but never populated."
>>>>> Is that the issue you are describing? (sounds like it, but I
>>>>> only
>>>>> skimmed over the details).
>>>>> CCing Shardul.
>>>> Yes, the same issue. As I descirbed in the first email:
>>>> "
>>>> At first, I tried to destory the empty nodes when collapse_file()
>>>> goes to rollback path. However,
>>>> collapse_file() only holds xarray lock and may release the lock, so
>>>> we couldn't prevent concurrent
>>>> call of collapse_file(), so the deleted empty nodes may be needed by
>>>> other collapse_file() calls.
>>>> "
>>> Hi David, Jinjiang,
>>>
>>> As Jinjiang mentioned, this appears to address what I had originally
>>> referred to in the "Note:" in [1].
>>>
>>> Just to clarify the context of the "Note:", that was based on my
>>> assumption at the time that such empty nodes would be considered leaks.
>>> After Dev’s feedback in [2]:
>>> "No "fix" is needed in this case, the empty nodes are there in the tree
>>> and there is no leak."
>>>
>>> and looking at the older discussion in [3]:
>>> "There's nothing to free; if a node is allocated, then it's stored in
>>> the tree where it can later be found and reused. "
>>
>> However, if the empty nodes aren't reused, When the file is deleted,
>> shmem_evict_inode()->shmem_truncate_range() traverses all entries and
>> calls xas_store(xas, NULL) to delete, if the leaf xa_node that stores
>> deleted entry becomes empty, xas_store() will automatically delete the
>> empty node and delete it's parent is empty too, until parent node isn't
>> empty. shmem_evict_inode() won't traverse the empty nodes created by
>> xas_create_range() due to these nodes doesn't store any entries.
>
> So you're saying that nothing/nobody would clean up these xarray
> entries and we'd be leaking them?
Yes.
>
> "struct xarray" documents "If all of the entries in the array are
> NULL, @xa_head is a NULL pointer.". So we depend on all entries being
> set to NULL in order to properly cleanup/free the xarray automatically.
>
Yes
^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-12-31 6:29 UTC | newest]
Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-12-18 11:45 [bug report] memory leak of xa_node in collapse_file() when rollbacks Jinjiang Tu
2025-12-18 11:51 ` David Hildenbrand (Red Hat)
2025-12-18 12:18 ` Jinjiang Tu
2025-12-18 12:49 ` David Hildenbrand (Red Hat)
2025-12-18 13:11 ` Jinjiang Tu
2025-12-25 4:15 ` Shardul Bankar
2025-12-27 1:24 ` Jinjiang Tu
2025-12-30 21:03 ` David Hildenbrand (Red Hat)
2025-12-31 6:29 ` Jinjiang Tu
2025-12-18 12:35 ` Jinjiang Tu
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox