linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Shardul Bankar <shardul.b@mpiricsoftware.com>,
	willy@infradead.org, akpm@linux-foundation.org,
	linux-mm@kvack.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	dev.jain@arm.com, shardulsb08@gmail.com,
	janak@mpiricsoftware.com
Subject: Re: [PATCH v4] lib: xarray: free unused spare node in xas_create_range()
Date: Fri, 5 Dec 2025 08:22:15 +0100	[thread overview]
Message-ID: <d651e943-99f5-431e-a67d-e4e6784e720e@kernel.org> (raw)
In-Reply-To: <20251204142625.1763372-1-shardul.b@mpiricsoftware.com>

On 12/4/25 15:26, Shardul Bankar wrote:
> xas_create_range() is typically called in a retry loop that uses
> xas_nomem() to handle -ENOMEM errors. xas_nomem() may allocate a spare
> xa_node and store it in xas->xa_alloc for use in the retry.
> 
> If the lock is dropped after xas_nomem(), another thread can expand the
> xarray tree in the meantime. On the next retry, xas_create_range() can
> then succeed without consuming the spare node stored in xas->xa_alloc.
> If the function returns without freeing this spare node, it leaks.
> 
> xas_create_range() calls xas_create() multiple times in a loop for
> different index ranges. A spare node that isn't needed for one range
> iteration might be needed for the next, so we cannot free it after each
> xas_create() call. We can only safely free it after xas_create_range()
> completes.
> 
> Fix this by calling xas_destroy() at the end of xas_create_range() to
> free any unused spare node. This makes the API safer by default and
> prevents callers from needing to remember cleanup.
> 
> This fixes a memory leak in mm/khugepaged.c and potentially other
> callers that use xas_nomem() with xas_create_range().
> 
> Link: https://syzkaller.appspot.com/bug?id=a274d65fc733448ed518ad15481ed575669dd98c
> Link: https://lore.kernel.org/all/20251201074540.3576327-1-shardul.b@mpiricsoftware.com/ ("v3")
> Fixes: cae106dd67b9 ("mm/khugepaged: refactor collapse_file control flow")
> Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>
> ---
>   v4:
>   - Drop redundant `if (xa_alloc)` around xas_destroy(), as xas_destroy()
>     already checks xa_alloc internally.
>   v3:
>   - Move fix from collapse_file() to xas_create_range() as suggested by Matthew Wilcox
>   - Fix in library function makes API safer by default, preventing callers from needing
>     to remember cleanup
>   - Use shared cleanup label that both restore: and success: paths jump to
>   - Clean up unused spare node on both success and error exit paths
>   v2:
>   - Call xas_destroy() on both success and failure
>   - Explained retry semantics and xa_alloc / concurrency risk
>   - Dropped cleanup_empty_nodes from previous proposal
> 
>   lib/xarray.c | 7 ++++++-
>   1 file changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/xarray.c b/lib/xarray.c
> index 9a8b4916540c..f49ccfa5f57d 100644
> --- a/lib/xarray.c
> +++ b/lib/xarray.c
> @@ -744,11 +744,16 @@ void xas_create_range(struct xa_state *xas)
>   	xas->xa_shift = shift;
>   	xas->xa_sibs = sibs;
>   	xas->xa_index = index;
> -	return;
> +	goto cleanup;
> +
>   success:
>   	xas->xa_index = index;
>   	if (xas->xa_node)
>   		xas_set_offset(xas);
> +
> +cleanup:
> +	/* Free any unused spare node from xas_nomem() */
> +	xas_destroy(xas);
>   }
>   EXPORT_SYMBOL_GPL(xas_create_range);
>   

Nothing jumped at me, except that the label situation is a bit
suboptimal.

Hoping Willy will take another look as well.

Reviewed-by: David Hildenbrand (Red Hat) <david@kernel.org>


BTW, do we have a way to test this in a test case?


A follow-up cleanup that avoids labels could be something like (untested):


diff --git a/lib/xarray.c b/lib/xarray.c
index 9a8b4916540cf..325f264530fb2 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -714,6 +714,7 @@ void xas_create_range(struct xa_state *xas)
         unsigned long index = xas->xa_index;
         unsigned char shift = xas->xa_shift;
         unsigned char sibs = xas->xa_sibs;
+       bool success = false;
  
         xas->xa_index |= ((sibs + 1UL) << shift) - 1;
         if (xas_is_node(xas) && xas->xa_node->shift == xas->xa_shift)
@@ -724,9 +725,11 @@ void xas_create_range(struct xa_state *xas)
         for (;;) {
                 xas_create(xas, true);
                 if (xas_error(xas))
-                       goto restore;
-               if (xas->xa_index <= (index | XA_CHUNK_MASK))
-                       goto success;
+                       break
+               if (xas->xa_index <= (index | XA_CHUNK_MASK)) {
+                       succeess = true;
+                       break;
+               }
                 xas->xa_index -= XA_CHUNK_SIZE;
  
                 for (;;) {
@@ -740,15 +743,17 @@ void xas_create_range(struct xa_state *xas)
                 }
         }
  
-restore:
-       xas->xa_shift = shift;
-       xas->xa_sibs = sibs;
-       xas->xa_index = index;
-       return;
-success:
-       xas->xa_index = index;
-       if (xas->xa_node)
-               xas_set_offset(xas);
+       if (success) {
+               xas->xa_index = index;
+               if (xas->xa_node)
+                       xas_set_offset(xas);
+       } else {
+               xas->xa_shift = shift;
+               xas->xa_sibs = sibs;
+               xas->xa_index = index;
+       }
+       /* Free any unused spare node from xas_nomem() */
+       xas_destroy(xas);
  }
  EXPORT_SYMBOL_GPL(xas_create_range);


-- 
Cheers

David


  reply	other threads:[~2025-12-05  7:22 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-04 14:26 Shardul Bankar
2025-12-05  7:22 ` David Hildenbrand (Red Hat) [this message]
2025-12-05 10:51   ` Shardul Bankar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d651e943-99f5-431e-a67d-e4e6784e720e@kernel.org \
    --to=david@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=dev.jain@arm.com \
    --cc=janak@mpiricsoftware.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=shardul.b@mpiricsoftware.com \
    --cc=shardulsb08@gmail.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox