* [bug report] memory leak of xa_node in collapse_file() when rollbacks
@ 2025-12-18 11:45 Jinjiang Tu
2025-12-18 11:51 ` David Hildenbrand (Red Hat)
0 siblings, 1 reply; 10+ messages in thread
From: Jinjiang Tu @ 2025-12-18 11:45 UTC (permalink / raw)
To: Andrew Morton, Matthew Wilcox, ziy, david, lorenzo.stoakes,
baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain,
baohua, lance.yang, linux-mm, linux-fsdevel
Cc: Kefeng Wang, tujinjiang
I encountered a memory leak issue caused by xas_create_range().
collapse_file() calls xas_create_range() to pre-create all slots needed.
If collapse_file() finally fails, these pre-created slots are empty nodes
and aren't destroyed.
I can reproduce it with following steps.
1) create file /tmp/test_madvise_collapse and ftruncate to 4MB size, and then mmap the file
2) memset for the first 2MB
3) madvise(MADV_COLLAPSE) for the second 2MB
4) unlink the file
in 3), collapse_file() calls xas_create_range() to expand xarray depth, and fails to collapse
due to the whole 2M region is empty, the code is as following:
collapse_file()
for (index = start; index < end;) {
xas_set(&xas, index);
folio = xas_load(&xas);
VM_BUG_ON(index != xas.xa_index);
if (is_shmem) {
if (!folio) {
/*
* Stop if extent has been truncated or
* hole-punched, and is now completely
* empty.
*/
if (index == start) {
if (!xas_next_entry(&xas, end - 1)) {
result = SCAN_TRUNCATED;
goto xa_locked;
}
}
...
}
collapse_file() rollback path doesn't destroy the pre-created empty nodes.
When the file is deleted, shmem_evict_inode()->shmem_truncate_range() traverses
all entries and calls xas_store(xas, NULL) to delete, if the leaf xa_node that
stores deleted entry becomes emtry, xas_store() will automatically delete the empty
node and delete it's parent is empty too, until parent node isn't empty. shmem_evict_inode()
won't traverse the empty nodes created by xas_create_range() due to these nodes doesn't store
any entries. As a result, these empty nodes are leaked.
At first, I tried to destory the empty nodes when collapse_file() goes to rollback path. However,
collapse_file() only holds xarray lock and may release the lock, so we couldn't prevent concurrent
call of collapse_file(), so the deleted empty nodes may be needed by other collapse_file() calls.
IIUC, xas_create_range() is used to guarantee the xas_store(&xas, new_folio); succeeds. Could we
remove xas_create_range() call and just rollback when we fail to xas_store?
^ permalink raw reply [flat|nested] 10+ messages in thread* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks 2025-12-18 11:45 [bug report] memory leak of xa_node in collapse_file() when rollbacks Jinjiang Tu @ 2025-12-18 11:51 ` David Hildenbrand (Red Hat) 2025-12-18 12:18 ` Jinjiang Tu 2025-12-18 12:35 ` Jinjiang Tu 0 siblings, 2 replies; 10+ messages in thread From: David Hildenbrand (Red Hat) @ 2025-12-18 11:51 UTC (permalink / raw) To: Jinjiang Tu, Andrew Morton, Matthew Wilcox, ziy, lorenzo.stoakes, baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain, baohua, lance.yang, linux-mm, linux-fsdevel Cc: Kefeng Wang On 12/18/25 12:45, Jinjiang Tu wrote: > I encountered a memory leak issue caused by xas_create_range(). > > collapse_file() calls xas_create_range() to pre-create all slots needed. > If collapse_file() finally fails, these pre-created slots are empty nodes > and aren't destroyed. > > I can reproduce it with following steps. > 1) create file /tmp/test_madvise_collapse and ftruncate to 4MB size, and then mmap the file > 2) memset for the first 2MB > 3) madvise(MADV_COLLAPSE) for the second 2MB > 4) unlink the file > > in 3), collapse_file() calls xas_create_range() to expand xarray depth, and fails to collapse > due to the whole 2M region is empty, the code is as following: > > collapse_file() > for (index = start; index < end;) { > xas_set(&xas, index); > folio = xas_load(&xas); > > VM_BUG_ON(index != xas.xa_index); > if (is_shmem) { > if (!folio) { > /* > * Stop if extent has been truncated or > * hole-punched, and is now completely > * empty. > */ > if (index == start) { > if (!xas_next_entry(&xas, end - 1)) { > result = SCAN_TRUNCATED; > goto xa_locked; > } > } > ... > } > > > collapse_file() rollback path doesn't destroy the pre-created empty nodes. > > When the file is deleted, shmem_evict_inode()->shmem_truncate_range() traverses > all entries and calls xas_store(xas, NULL) to delete, if the leaf xa_node that > stores deleted entry becomes emtry, xas_store() will automatically delete the empty > node and delete it's parent is empty too, until parent node isn't empty. shmem_evict_inode() > won't traverse the empty nodes created by xas_create_range() due to these nodes doesn't store > any entries. As a result, these empty nodes are leaked. > > At first, I tried to destory the empty nodes when collapse_file() goes to rollback path. However, > collapse_file() only holds xarray lock and may release the lock, so we couldn't prevent concurrent > call of collapse_file(), so the deleted empty nodes may be needed by other collapse_file() calls. > > IIUC, xas_create_range() is used to guarantee the xas_store(&xas, new_folio); succeeds. Could we > remove xas_create_range() call and just rollback when we fail to xas_store? Hi, thanks for the report. Is that what [1] is fixing? [1] https://lore.kernel.org/linux-mm/20251204142625.1763372-1-shardul.b@mpiricsoftware.com/ -- Cheers David ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks 2025-12-18 11:51 ` David Hildenbrand (Red Hat) @ 2025-12-18 12:18 ` Jinjiang Tu 2025-12-18 12:49 ` David Hildenbrand (Red Hat) 2025-12-18 12:35 ` Jinjiang Tu 1 sibling, 1 reply; 10+ messages in thread From: Jinjiang Tu @ 2025-12-18 12:18 UTC (permalink / raw) To: David Hildenbrand (Red Hat), Andrew Morton, Matthew Wilcox, ziy, lorenzo.stoakes, baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain, baohua, lance.yang, linux-mm, linux-fsdevel Cc: Kefeng Wang [-- Attachment #1: Type: text/plain, Size: 3253 bytes --] 在 2025/12/18 19:51, David Hildenbrand (Red Hat) 写道: > On 12/18/25 12:45, Jinjiang Tu wrote: >> I encountered a memory leak issue caused by xas_create_range(). >> >> collapse_file() calls xas_create_range() to pre-create all slots needed. >> If collapse_file() finally fails, these pre-created slots are empty >> nodes >> and aren't destroyed. >> >> I can reproduce it with following steps. >> 1) create file /tmp/test_madvise_collapse and ftruncate to 4MB size, >> and then mmap the file >> 2) memset for the first 2MB >> 3) madvise(MADV_COLLAPSE) for the second 2MB >> 4) unlink the file >> >> in 3), collapse_file() calls xas_create_range() to expand xarray >> depth, and fails to collapse >> due to the whole 2M region is empty, the code is as following: >> >> collapse_file() >> for (index = start; index < end;) { >> xas_set(&xas, index); >> folio = xas_load(&xas); >> >> VM_BUG_ON(index != xas.xa_index); >> if (is_shmem) { >> if (!folio) { >> /* >> * Stop if extent has been truncated or >> * hole-punched, and is now completely >> * empty. >> */ >> if (index == start) { >> if (!xas_next_entry(&xas, end - 1)) { >> result = SCAN_TRUNCATED; >> goto xa_locked; >> } >> } >> ... >> } >> >> >> collapse_file() rollback path doesn't destroy the pre-created empty >> nodes. >> >> When the file is deleted, shmem_evict_inode()->shmem_truncate_range() >> traverses >> all entries and calls xas_store(xas, NULL) to delete, if the leaf >> xa_node that >> stores deleted entry becomes emtry, xas_store() will automatically >> delete the empty >> node and delete it's parent is empty too, until parent node isn't >> empty. shmem_evict_inode() >> won't traverse the empty nodes created by xas_create_range() due to >> these nodes doesn't store >> any entries. As a result, these empty nodes are leaked. >> >> At first, I tried to destory the empty nodes when collapse_file() >> goes to rollback path. However, >> collapse_file() only holds xarray lock and may release the lock, so >> we couldn't prevent concurrent >> call of collapse_file(), so the deleted empty nodes may be needed by >> other collapse_file() calls. >> >> IIUC, xas_create_range() is used to guarantee the xas_store(&xas, >> new_folio); succeeds. Could we >> remove xas_create_range() call and just rollback when we fail to >> xas_store? > > Hi, > > thanks for the report. > > Is that what [1] is fixing? > > [1] > https://lore.kernel.org/linux-mm/20251204142625.1763372-1-shardul.b@mpiricsoftware.com/ > > No, this patch fixes memory leak caused by xas->xa_alloc allocated by xas_nomem() and the xa_node isn't installed into xarray. In my case, the leaked xa_nodes have been installed into xarray by xas_create_range(). [-- Attachment #2: Type: text/html, Size: 4850 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks 2025-12-18 12:18 ` Jinjiang Tu @ 2025-12-18 12:49 ` David Hildenbrand (Red Hat) 2025-12-18 13:11 ` Jinjiang Tu 0 siblings, 1 reply; 10+ messages in thread From: David Hildenbrand (Red Hat) @ 2025-12-18 12:49 UTC (permalink / raw) To: Jinjiang Tu, Andrew Morton, Matthew Wilcox, ziy, lorenzo.stoakes, baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain, baohua, lance.yang, linux-mm, linux-fsdevel Cc: Kefeng Wang, Shardul Bankar On 12/18/25 13:18, Jinjiang Tu wrote: > > 在 2025/12/18 19:51, David Hildenbrand (Red Hat) 写道: >> On 12/18/25 12:45, Jinjiang Tu wrote: >>> I encountered a memory leak issue caused by xas_create_range(). >>> >>> collapse_file() calls xas_create_range() to pre-create all slots needed. >>> If collapse_file() finally fails, these pre-created slots are empty >>> nodes >>> and aren't destroyed. >>> >>> I can reproduce it with following steps. >>> 1) create file /tmp/test_madvise_collapse and ftruncate to 4MB size, >>> and then mmap the file >>> 2) memset for the first 2MB >>> 3) madvise(MADV_COLLAPSE) for the second 2MB >>> 4) unlink the file >>> >>> in 3), collapse_file() calls xas_create_range() to expand xarray >>> depth, and fails to collapse >>> due to the whole 2M region is empty, the code is as following: >>> >>> collapse_file() >>> for (index = start; index < end;) { >>> xas_set(&xas, index); >>> folio = xas_load(&xas); >>> >>> VM_BUG_ON(index != xas.xa_index); >>> if (is_shmem) { >>> if (!folio) { >>> /* >>> * Stop if extent has been truncated or >>> * hole-punched, and is now completely >>> * empty. >>> */ >>> if (index == start) { >>> if (!xas_next_entry(&xas, end - 1)) { >>> result = SCAN_TRUNCATED; >>> goto xa_locked; >>> } >>> } >>> ... >>> } >>> >>> >>> collapse_file() rollback path doesn't destroy the pre-created empty >>> nodes. >>> >>> When the file is deleted, shmem_evict_inode()->shmem_truncate_range() >>> traverses >>> all entries and calls xas_store(xas, NULL) to delete, if the leaf >>> xa_node that >>> stores deleted entry becomes emtry, xas_store() will automatically >>> delete the empty >>> node and delete it's parent is empty too, until parent node isn't >>> empty. shmem_evict_inode() >>> won't traverse the empty nodes created by xas_create_range() due to >>> these nodes doesn't store >>> any entries. As a result, these empty nodes are leaked. >>> >>> At first, I tried to destory the empty nodes when collapse_file() >>> goes to rollback path. However, >>> collapse_file() only holds xarray lock and may release the lock, so >>> we couldn't prevent concurrent >>> call of collapse_file(), so the deleted empty nodes may be needed by >>> other collapse_file() calls. >>> >>> IIUC, xas_create_range() is used to guarantee the xas_store(&xas, >>> new_folio); succeeds. Could we >>> remove xas_create_range() call and just rollback when we fail to >>> xas_store? >> >> Hi, >> >> thanks for the report. >> >> Is that what [1] is fixing? >> >> [1] https://lore.kernel.org/linux-mm/20251204142625.1763372-1- >> shardul.b@mpiricsoftware.com/ >> > No, this patch fixes memory leak caused by xas->xa_alloc allocated by xas_nomem() and the xa_node > isn't installed into xarray. > > In my case, the leaked xa_nodes have been installed into xarray by xas_create_range(). Thanks for checking. I thought that was also discussed as part of the other fix. See [2] where we have "Note: This fixes the leak of pre-allocated nodes. A separate fix will be needed to clean up empty nodes that were inserted into the tree by xas_create_range() but never populated." Is that the issue you are describing? (sounds like it, but I only skimmed over the details). CCing Shardul. [2] https://lore.kernel.org/linux-mm/20251123132727.3262731-1-shardul.b@mpiricsoftware.com/ -- Cheers David ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks 2025-12-18 12:49 ` David Hildenbrand (Red Hat) @ 2025-12-18 13:11 ` Jinjiang Tu 2025-12-25 4:15 ` Shardul Bankar 0 siblings, 1 reply; 10+ messages in thread From: Jinjiang Tu @ 2025-12-18 13:11 UTC (permalink / raw) To: David Hildenbrand (Red Hat), Andrew Morton, Matthew Wilcox, ziy, lorenzo.stoakes, baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain, baohua, lance.yang, linux-mm, linux-fsdevel Cc: Kefeng Wang, Shardul Bankar [-- Attachment #1: Type: text/plain, Size: 7593 bytes --] 在 2025/12/18 20:49, David Hildenbrand (Red Hat) 写道: > On 12/18/25 13:18, Jinjiang Tu wrote: >> >> 在 2025/12/18 19:51, David Hildenbrand (Red Hat) 写道: >>> On 12/18/25 12:45, Jinjiang Tu wrote: >>>> I encountered a memory leak issue caused by xas_create_range(). >>>> >>>> collapse_file() calls xas_create_range() to pre-create all slots >>>> needed. >>>> If collapse_file() finally fails, these pre-created slots are empty >>>> nodes >>>> and aren't destroyed. >>>> >>>> I can reproduce it with following steps. >>>> 1) create file /tmp/test_madvise_collapse and ftruncate to 4MB >>>> size, and then mmap the file >>>> 2) memset for the first 2MB >>>> 3) madvise(MADV_COLLAPSE) for the second 2MB >>>> 4) unlink the file >>>> >>>> in 3), collapse_file() calls xas_create_range() to expand xarray >>>> depth, and fails to collapse >>>> due to the whole 2M region is empty, the code is as following: >>>> >>>> collapse_file() >>>> for (index = start; index < end;) { >>>> xas_set(&xas, index); >>>> folio = xas_load(&xas); >>>> >>>> VM_BUG_ON(index != xas.xa_index); >>>> if (is_shmem) { >>>> if (!folio) { >>>> /* >>>> * Stop if extent has been truncated or >>>> * hole-punched, and is now completely >>>> * empty. >>>> */ >>>> if (index == start) { >>>> if (!xas_next_entry(&xas, end - 1)) { >>>> result = SCAN_TRUNCATED; >>>> goto xa_locked; >>>> } >>>> } >>>> ... >>>> } >>>> >>>> >>>> collapse_file() rollback path doesn't destroy the pre-created empty >>>> nodes. >>>> >>>> When the file is deleted, >>>> shmem_evict_inode()->shmem_truncate_range() traverses >>>> all entries and calls xas_store(xas, NULL) to delete, if the leaf >>>> xa_node that >>>> stores deleted entry becomes emtry, xas_store() will automatically >>>> delete the empty >>>> node and delete it's parent is empty too, until parent node isn't >>>> empty. shmem_evict_inode() >>>> won't traverse the empty nodes created by xas_create_range() due to >>>> these nodes doesn't store >>>> any entries. As a result, these empty nodes are leaked. >>>> >>>> At first, I tried to destory the empty nodes when collapse_file() >>>> goes to rollback path. However, >>>> collapse_file() only holds xarray lock and may release the lock, so >>>> we couldn't prevent concurrent >>>> call of collapse_file(), so the deleted empty nodes may be needed >>>> by other collapse_file() calls. >>>> >>>> IIUC, xas_create_range() is used to guarantee the xas_store(&xas, >>>> new_folio); succeeds. Could we >>>> remove xas_create_range() call and just rollback when we fail to >>>> xas_store? >>> >>> Hi, >>> >>> thanks for the report. >>> >>> Is that what [1] is fixing? >>> >>> [1] https://lore.kernel.org/linux-mm/20251204142625.1763372-1- >>> shardul.b@mpiricsoftware.com/ >>> >> No, this patch fixes memory leak caused by xas->xa_alloc allocated by >> xas_nomem() and the xa_node >> isn't installed into xarray. >> >> In my case, the leaked xa_nodes have been installed into xarray by >> xas_create_range(). > > Thanks for checking. I thought that was also discussed as part of the > other fix. > > See [2] where we have > > "Note: This fixes the leak of pre-allocated nodes. A separate fix will > be needed to clean up empty nodes that were inserted into the tree by > xas_create_range() but never populated." > > Is that the issue you are describing? (sounds like it, but I only > skimmed over the details). > > CCing Shardul. Yes, the same issue. As I descirbed in the first email: " At first, I tried to destory the empty nodes when collapse_file() goes to rollback path. However, collapse_file() only holds xarray lock and may release the lock, so we couldn't prevent concurrent call of collapse_file(), so the deleted empty nodes may be needed by other collapse_file() calls. " We couldn't bindly cleanup empty nodes in the rollback path. I'm trying to move the xas_create_range() before xas_store() and always under xarray lock to make rollback easier, the diff likes (applied on 6.6, haven't tested yet) diff --git a/include/linux/xarray.h b/include/linux/xarray.h index c3f54d2eaf36..5ef393011a61 100644 --- a/include/linux/xarray.h +++ b/include/linux/xarray.h @@ -1548,6 +1548,7 @@ void xas_destroy(struct xa_state *); void xas_pause(struct xa_state *); void xas_create_range(struct xa_state *); +void xas_destroy_range(struct xa_state *xas, unsigned long start, unsigned long end); #ifdef CONFIG_XARRAY_MULTI int xa_get_order(struct xarray *, unsigned long index); diff --git a/lib/xarray.c b/lib/xarray.c index 32d4bac8c94c..724a7f35a26f 100644 --- a/lib/xarray.c +++ b/lib/xarray.c @@ -745,6 +745,25 @@ void xas_create_range(struct xa_state *xas) } EXPORT_SYMBOL_GPL(xas_create_range); +void xas_destroy_range(struct xa_state *xas, unsigned long start, unsigned long end) +{ + unsigned long index; + void *entry; + + for (index = start; index < end; ++index) { + xas_set(xas, index); + entry = xas_load(xas); + if (entry) + continue; + + if (!xas->xa_node || xas_invalid(xas)) + continue; + + if (!xas->xa_node->count) + xas_delete_node(xas); + } +} + static void update_node(struct xa_state *xas, struct xa_node *node, int count, int values) { diff --git a/mm/khugepaged.c b/mm/khugepaged.c index e8cb826c1994..36f0600ef7b1 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1832,7 +1832,7 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, struct folio *folio, *tmp, *new_folio; pgoff_t index = 0, end = start + HPAGE_PMD_NR; LIST_HEAD(pagelist); - XA_STATE_ORDER(xas, &mapping->i_pages, start, HPAGE_PMD_ORDER); + XA_STATE(xas, &mapping->i_pages, 0); int nr_none = 0, result = SCAN_SUCCEED; bool is_shmem = shmem_file(file); @@ -1851,22 +1851,6 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, new_folio->index = start; new_folio->mapping = mapping; - /* - * Ensure we have slots for all the pages in the range. This is - * almost certainly a no-op because most of the pages must be present - */ - do { - xas_lock_irq(&xas); - xas_create_range(&xas); - if (!xas_error(&xas)) - break; - xas_unlock_irq(&xas); - if (!xas_nomem(&xas, GFP_KERNEL)) { - result = SCAN_FAIL; - goto rollback; - } - } while (1); - for (index = start; index < end;) { xas_set(&xas, index); folio = xas_load(&xas); @@ -2163,6 +2147,23 @@ static int collapse_file(struct mm_struct *mm, unsigned long addr, xas_lock_irq(&xas); } + xas_set_order(&xas, start, HPAGE_PMD_ORDER); + xas_create_range(&xas); + if (xas_error(&xas)) { + if (nr_none) { + xas_set(&xas, start); + for (index = start; index < end; index++) { + if (xas_next(&xas) == XA_RETRY_ENTRY) + xas_store(&xas, NULL); + } + } + xas_destroy_range(&xas, start, end); + xas_unlock_irq(&xas); + result = SCAN_FAIL; + + goto rollback; + } + if (is_shmem) __lruvec_stat_mod_folio(new_folio, NR_SHMEM_THPS, HPAGE_PMD_NR); else -- 2.43.0 > > > [2] > https://lore.kernel.org/linux-mm/20251123132727.3262731-1-shardul.b@mpiricsoftware.com/ > [-- Attachment #2: Type: text/html, Size: 10351 bytes --] ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks 2025-12-18 13:11 ` Jinjiang Tu @ 2025-12-25 4:15 ` Shardul Bankar 2025-12-27 1:24 ` Jinjiang Tu 0 siblings, 1 reply; 10+ messages in thread From: Shardul Bankar @ 2025-12-25 4:15 UTC (permalink / raw) To: Jinjiang Tu, David Hildenbrand (Red Hat), Andrew Morton, Matthew Wilcox, ziy, lorenzo.stoakes, baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain, baohua, lance.yang, linux-mm, linux-fsdevel Cc: Kefeng Wang, shardulsb08 On Thu, 2025-12-18 at 21:11 +0800, Jinjiang Tu wrote: > > 在 2025/12/18 20:49, David Hildenbrand (Red Hat) 写道: > > > > > > > > Thanks for checking. I thought that was also discussed as part of > > the other fix. > > > > See [2] where we have > > > > "Note: This fixes the leak of pre-allocated nodes. A separate fix > > will > > be needed to clean up empty nodes that were inserted into the tree > > by > > xas_create_range() but never populated." > > > > Is that the issue you are describing? (sounds like it, but I only > > skimmed over the details). > > > > CCing Shardul. > Yes, the same issue. As I descirbed in the first email: > " > At first, I tried to destory the empty nodes when collapse_file() > goes to rollback path. However, > collapse_file() only holds xarray lock and may release the lock, so > we couldn't prevent concurrent > call of collapse_file(), so the deleted empty nodes may be needed by > other collapse_file() calls. > " Hi David, Jinjiang, As Jinjiang mentioned, this appears to address what I had originally referred to in the "Note:" in [1]. Just to clarify the context of the "Note:", that was based on my assumption at the time that such empty nodes would be considered leaks. After Dev’s feedback in [2]: "No "fix" is needed in this case, the empty nodes are there in the tree and there is no leak." and looking at the older discussion in [3]: "There's nothing to free; if a node is allocated, then it's stored in the tree where it can later be found and reused. " my updated understanding is that there is no leak in this case- the nodes remain valid and reusable, and therefore do not require a separate fix. David could you correct me if I am mistaken? [1] https://lore.kernel.org/linux-mm/20251123132727.3262731-1-shardul.b@mpiricsoftware.com/ [2] https://lore.kernel.org/linux-mm/57cbf887-d181-418b-a6c7-9f3eff5d632a@arm.com/ [3] https://lore.kernel.org/all/Ys1r06szkVi3QEai@casper.infradead.org/ Thanks, Shardul ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks 2025-12-25 4:15 ` Shardul Bankar @ 2025-12-27 1:24 ` Jinjiang Tu 2025-12-30 21:03 ` David Hildenbrand (Red Hat) 0 siblings, 1 reply; 10+ messages in thread From: Jinjiang Tu @ 2025-12-27 1:24 UTC (permalink / raw) To: Shardul Bankar, David Hildenbrand (Red Hat), Andrew Morton, Matthew Wilcox, ziy, lorenzo.stoakes, baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain, baohua, lance.yang, linux-mm, linux-fsdevel Cc: Kefeng Wang, shardulsb08 在 2025/12/25 12:15, Shardul Bankar 写道: > On Thu, 2025-12-18 at 21:11 +0800, Jinjiang Tu wrote: >> 在 2025/12/18 20:49, David Hildenbrand (Red Hat) 写道: >> >>> Thanks for checking. I thought that was also discussed as part of >>> the other fix. >>> >>> See [2] where we have >>> >>> "Note: This fixes the leak of pre-allocated nodes. A separate fix >>> will >>> be needed to clean up empty nodes that were inserted into the tree >>> by >>> xas_create_range() but never populated." >>> >>> Is that the issue you are describing? (sounds like it, but I only >>> skimmed over the details). >>> >>> CCing Shardul. >> Yes, the same issue. As I descirbed in the first email: >> " >> At first, I tried to destory the empty nodes when collapse_file() >> goes to rollback path. However, >> collapse_file() only holds xarray lock and may release the lock, so >> we couldn't prevent concurrent >> call of collapse_file(), so the deleted empty nodes may be needed by >> other collapse_file() calls. >> " > Hi David, Jinjiang, > > As Jinjiang mentioned, this appears to address what I had originally > referred to in the "Note:" in [1]. > > Just to clarify the context of the "Note:", that was based on my > assumption at the time that such empty nodes would be considered leaks. > After Dev’s feedback in [2]: > "No "fix" is needed in this case, the empty nodes are there in the tree > and there is no leak." > > and looking at the older discussion in [3]: > "There's nothing to free; if a node is allocated, then it's stored in > the tree where it can later be found and reused. " However, if the empty nodes aren't reused, When the file is deleted, shmem_evict_inode()->shmem_truncate_range() traverses all entries and calls xas_store(xas, NULL) to delete, if the leaf xa_node that stores deleted entry becomes empty, xas_store() will automatically delete the empty node and delete it's parent is empty too, until parent node isn't empty. shmem_evict_inode() won't traverse the empty nodes created by xas_create_range() due to these nodes doesn't store any entries. > > my updated understanding is that there is no leak in this case- the > nodes remain valid and reusable, and therefore do not require a > separate fix. > > David could you correct me if I am mistaken? > > [1] > https://lore.kernel.org/linux-mm/20251123132727.3262731-1-shardul.b@mpiricsoftware.com/ > > [2] > https://lore.kernel.org/linux-mm/57cbf887-d181-418b-a6c7-9f3eff5d632a@arm.com/ > > [3] > https://lore.kernel.org/all/Ys1r06szkVi3QEai@casper.infradead.org/ > > Thanks, > Shardul > ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks 2025-12-27 1:24 ` Jinjiang Tu @ 2025-12-30 21:03 ` David Hildenbrand (Red Hat) 2025-12-31 6:29 ` Jinjiang Tu 0 siblings, 1 reply; 10+ messages in thread From: David Hildenbrand (Red Hat) @ 2025-12-30 21:03 UTC (permalink / raw) To: Jinjiang Tu, Shardul Bankar, Andrew Morton, Matthew Wilcox, ziy, lorenzo.stoakes, baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain, baohua, lance.yang, linux-mm, linux-fsdevel Cc: Kefeng Wang, shardulsb08 On 12/27/25 02:24, Jinjiang Tu wrote: > > 在 2025/12/25 12:15, Shardul Bankar 写道: >> On Thu, 2025-12-18 at 21:11 +0800, Jinjiang Tu wrote: >>> 在 2025/12/18 20:49, David Hildenbrand (Red Hat) 写道: >>> >>>> Thanks for checking. I thought that was also discussed as part of >>>> the other fix. >>>> >>>> See [2] where we have >>>> >>>> "Note: This fixes the leak of pre-allocated nodes. A separate fix >>>> will >>>> be needed to clean up empty nodes that were inserted into the tree >>>> by >>>> xas_create_range() but never populated." >>>> >>>> Is that the issue you are describing? (sounds like it, but I only >>>> skimmed over the details). >>>> >>>> CCing Shardul. >>> Yes, the same issue. As I descirbed in the first email: >>> " >>> At first, I tried to destory the empty nodes when collapse_file() >>> goes to rollback path. However, >>> collapse_file() only holds xarray lock and may release the lock, so >>> we couldn't prevent concurrent >>> call of collapse_file(), so the deleted empty nodes may be needed by >>> other collapse_file() calls. >>> " >> Hi David, Jinjiang, >> >> As Jinjiang mentioned, this appears to address what I had originally >> referred to in the "Note:" in [1]. >> >> Just to clarify the context of the "Note:", that was based on my >> assumption at the time that such empty nodes would be considered leaks. >> After Dev’s feedback in [2]: >> "No "fix" is needed in this case, the empty nodes are there in the tree >> and there is no leak." >> >> and looking at the older discussion in [3]: >> "There's nothing to free; if a node is allocated, then it's stored in >> the tree where it can later be found and reused. " > > However, if the empty nodes aren't reused, When the file is deleted, > shmem_evict_inode()->shmem_truncate_range() traverses all entries and > calls xas_store(xas, NULL) to delete, if the leaf xa_node that stores > deleted entry becomes empty, xas_store() will automatically delete the > empty node and delete it's parent is empty too, until parent node isn't > empty. shmem_evict_inode() won't traverse the empty nodes created by > xas_create_range() due to these nodes doesn't store any entries. So you're saying that nothing/nobody would clean up these xarray entries and we'd be leaking them? "struct xarray" documents "If all of the entries in the array are NULL, @xa_head is a NULL pointer.". So we depend on all entries being set to NULL in order to properly cleanup/free the xarray automatically. -- Cheers David ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks 2025-12-30 21:03 ` David Hildenbrand (Red Hat) @ 2025-12-31 6:29 ` Jinjiang Tu 0 siblings, 0 replies; 10+ messages in thread From: Jinjiang Tu @ 2025-12-31 6:29 UTC (permalink / raw) To: David Hildenbrand (Red Hat), Shardul Bankar, Andrew Morton, Matthew Wilcox, ziy, lorenzo.stoakes, baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain, baohua, lance.yang, linux-mm, linux-fsdevel Cc: Kefeng Wang, shardulsb08 在 2025/12/31 5:03, David Hildenbrand (Red Hat) 写道: > On 12/27/25 02:24, Jinjiang Tu wrote: >> >> 在 2025/12/25 12:15, Shardul Bankar 写道: >>> On Thu, 2025-12-18 at 21:11 +0800, Jinjiang Tu wrote: >>>> 在 2025/12/18 20:49, David Hildenbrand (Red Hat) 写道: >>>>> Thanks for checking. I thought that was also discussed as part of >>>>> the other fix. >>>>> See [2] where we have >>>>> "Note: This fixes the leak of pre-allocated nodes. A >>>>> separate fix >>>>> will >>>>> be needed to clean up empty nodes that were inserted into the tree >>>>> by >>>>> xas_create_range() but never populated." >>>>> Is that the issue you are describing? (sounds like it, but I >>>>> only >>>>> skimmed over the details). >>>>> CCing Shardul. >>>> Yes, the same issue. As I descirbed in the first email: >>>> " >>>> At first, I tried to destory the empty nodes when collapse_file() >>>> goes to rollback path. However, >>>> collapse_file() only holds xarray lock and may release the lock, so >>>> we couldn't prevent concurrent >>>> call of collapse_file(), so the deleted empty nodes may be needed by >>>> other collapse_file() calls. >>>> " >>> Hi David, Jinjiang, >>> >>> As Jinjiang mentioned, this appears to address what I had originally >>> referred to in the "Note:" in [1]. >>> >>> Just to clarify the context of the "Note:", that was based on my >>> assumption at the time that such empty nodes would be considered leaks. >>> After Dev’s feedback in [2]: >>> "No "fix" is needed in this case, the empty nodes are there in the tree >>> and there is no leak." >>> >>> and looking at the older discussion in [3]: >>> "There's nothing to free; if a node is allocated, then it's stored in >>> the tree where it can later be found and reused. " >> >> However, if the empty nodes aren't reused, When the file is deleted, >> shmem_evict_inode()->shmem_truncate_range() traverses all entries and >> calls xas_store(xas, NULL) to delete, if the leaf xa_node that stores >> deleted entry becomes empty, xas_store() will automatically delete the >> empty node and delete it's parent is empty too, until parent node isn't >> empty. shmem_evict_inode() won't traverse the empty nodes created by >> xas_create_range() due to these nodes doesn't store any entries. > > So you're saying that nothing/nobody would clean up these xarray > entries and we'd be leaking them? Yes. > > "struct xarray" documents "If all of the entries in the array are > NULL, @xa_head is a NULL pointer.". So we depend on all entries being > set to NULL in order to properly cleanup/free the xarray automatically. > Yes ^ permalink raw reply [flat|nested] 10+ messages in thread
* Re: [bug report] memory leak of xa_node in collapse_file() when rollbacks 2025-12-18 11:51 ` David Hildenbrand (Red Hat) 2025-12-18 12:18 ` Jinjiang Tu @ 2025-12-18 12:35 ` Jinjiang Tu 1 sibling, 0 replies; 10+ messages in thread From: Jinjiang Tu @ 2025-12-18 12:35 UTC (permalink / raw) To: David Hildenbrand (Red Hat), Andrew Morton, Matthew Wilcox, ziy, lorenzo.stoakes, baolin.wang, Liam.Howlett, npache, ryan.roberts, dev.jain, baohua, lance.yang, linux-mm, linux-fsdevel Cc: Kefeng Wang 在 2025/12/18 19:51, David Hildenbrand (Red Hat) 写道: > On 12/18/25 12:45, Jinjiang Tu wrote: >> I encountered a memory leak issue caused by xas_create_range(). >> >> collapse_file() calls xas_create_range() to pre-create all slots needed. >> If collapse_file() finally fails, these pre-created slots are empty >> nodes >> and aren't destroyed. >> >> I can reproduce it with following steps. >> 1) create file /tmp/test_madvise_collapse and ftruncate to 4MB size, >> and then mmap the file >> 2) memset for the first 2MB >> 3) madvise(MADV_COLLAPSE) for the second 2MB >> 4) unlink the file >> >> in 3), collapse_file() calls xas_create_range() to expand xarray >> depth, and fails to collapse >> due to the whole 2M region is empty, the code is as following: >> >> collapse_file() >> for (index = start; index < end;) { >> xas_set(&xas, index); >> folio = xas_load(&xas); >> >> VM_BUG_ON(index != xas.xa_index); >> if (is_shmem) { >> if (!folio) { >> /* >> * Stop if extent has been truncated or >> * hole-punched, and is now completely >> * empty. >> */ >> if (index == start) { >> if (!xas_next_entry(&xas, end - 1)) { >> result = SCAN_TRUNCATED; >> goto xa_locked; >> } >> } >> ... >> } >> >> >> collapse_file() rollback path doesn't destroy the pre-created empty >> nodes. >> >> When the file is deleted, shmem_evict_inode()->shmem_truncate_range() >> traverses >> all entries and calls xas_store(xas, NULL) to delete, if the leaf >> xa_node that >> stores deleted entry becomes emtry, xas_store() will automatically >> delete the empty >> node and delete it's parent is empty too, until parent node isn't >> empty. shmem_evict_inode() >> won't traverse the empty nodes created by xas_create_range() due to >> these nodes doesn't store >> any entries. As a result, these empty nodes are leaked. >> >> At first, I tried to destory the empty nodes when collapse_file() >> goes to rollback path. However, >> collapse_file() only holds xarray lock and may release the lock, so >> we couldn't prevent concurrent >> call of collapse_file(), so the deleted empty nodes may be needed by >> other collapse_file() calls. >> >> IIUC, xas_create_range() is used to guarantee the xas_store(&xas, >> new_folio); succeeds. Could we >> remove xas_create_range() call and just rollback when we fail to >> xas_store? > > Hi, > > thanks for the report. > > Is that what [1] is fixing? > > [1] > https://lore.kernel.org/linux-mm/20251204142625.1763372-1-shardul.b@mpiricsoftware.com/ > > the allocation stack of the leaked xa_node is as follows: unreferenced object 0xffff0000d4d9fd98 (size 576): comm "test_filemap", pid 8220, jiffies 4294957272 (age 659.036s) hex dump (first 32 bytes): 00 08 00 00 00 00 00 00 88 54 75 dc 00 00 ff ff .........Tu..... a0 7d db d6 00 00 ff ff b0 fd d9 d4 00 00 ff ff .}.............. backtrace: kmemleak_alloc+0xb8/0xd8 kmem_cache_alloc_lru+0x308/0x558 xas_alloc+0xb4/0xd0 xas_create+0xf4/0x1f8 xas_create_range+0xd0/0x158 collapse_file+0x13c/0x11e0 hpage_collapse_scan_file+0x1e0/0x488 madvise_collapse+0x174/0x650 madvise_vma_behavior+0x334/0x520 do_madvise+0x1bc/0x388 __arm64_sys_madvise+0x2c/0x48 invoke_syscall+0x50/0x128 el0_svc_common.constprop.0+0xc8/0xf0 do_el0_svc+0x24/0x38 el0_svc+0x44/0x200 el0t_64_sync_handler+0x100/0x130 it is allocated by xas_create_range(), not xas_nomem(). So it isn't the same issue. ^ permalink raw reply [flat|nested] 10+ messages in thread
end of thread, other threads:[~2025-12-31 6:29 UTC | newest] Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2025-12-18 11:45 [bug report] memory leak of xa_node in collapse_file() when rollbacks Jinjiang Tu 2025-12-18 11:51 ` David Hildenbrand (Red Hat) 2025-12-18 12:18 ` Jinjiang Tu 2025-12-18 12:49 ` David Hildenbrand (Red Hat) 2025-12-18 13:11 ` Jinjiang Tu 2025-12-25 4:15 ` Shardul Bankar 2025-12-27 1:24 ` Jinjiang Tu 2025-12-30 21:03 ` David Hildenbrand (Red Hat) 2025-12-31 6:29 ` Jinjiang Tu 2025-12-18 12:35 ` Jinjiang Tu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox