linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Zi Yan <ziy@nvidia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@kernel.org>,
	Hugh Dickins <hughd@google.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	"Liam R. Howlett" <Liam.Howlett@oracle.com>,
	Nico Pache <npache@redhat.com>,
	Ryan Roberts <ryan.roberts@arm.com>, Dev Jain <dev.jain@arm.com>,
	Barry Song <baohua@kernel.org>, Lance Yang <lance.yang@linux.dev>,
	Matthew Wilcox <willy@infradead.org>,
	Bas van Dijk <bas@dfinity.org>,
	Eero Kelly <eero.kelly@dfinity.org>,
	Andrew Battat <andrew.battat@dfinity.org>,
	Adam Bratschi-Kaye <adam.bratschikaye@dfinity.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH] mm/huge_memory: fix a folio_split() race condition with folio_try_get()
Date: Mon, 2 Mar 2026 13:30:25 +0000	[thread overview]
Message-ID: <54a4d554-d4cd-47d2-bdc1-8796c5d7d947@lucifer.local> (raw)
In-Reply-To: <20260228010614.2536430-1-ziy@nvidia.com>

On Fri, Feb 27, 2026 at 08:06:14PM -0500, Zi Yan wrote:
> During a pagecache folio split, the values in the related xarray should not
> be changed from the original folio at xarray split time until all
> after-split folios are well formed and stored in the xarray. Current use
> of xas_try_split() in __split_unmapped_folio() lets some after-split folios
> show up at wrong indices in the xarray. When these misplaced after-split
> folios are unfrozen, before correct folios are stored via __xa_store(), and
> grabbed by folio_try_get(), they are returned to userspace at wrong file
> indices, causing data corruption.
>
> Fix it by using the original folio in xas_try_split() calls, so that
> folio_try_get() can get the right after-split folios after the original
> folio is unfrozen.
>
> Uniform split, split_huge_page*(), is not affected, since it uses
> xas_split_alloc() and xas_split() only once and stores the original folio
> in the xarray.
>
> Fixes below points to the commit introduces the code, but folio_split() is
> used in a later commit 7460b470a131f ("mm/truncate: use folio_split() in
> truncate operation").
>
> Fixes: 00527733d0dc8 ("mm/huge_memory: add two new (not yet used) functions for folio_split()")
> Reported-by: Bas van Dijk <bas@dfinity.org>
> Closes: https://lore.kernel.org/all/CAKNNEtw5_kZomhkugedKMPOG-sxs5Q5OLumWJdiWXv+C9Yct0w@mail.gmail.com/
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> Cc: <stable@vger.kernel.org>
> ---
>  mm/huge_memory.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 56db54fa48181..e4ed0404e8b55 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3647,6 +3647,7 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
>  	const bool is_anon = folio_test_anon(folio);
>  	int old_order = folio_order(folio);
>  	int start_order = split_type == SPLIT_TYPE_UNIFORM ? new_order : old_order - 1;
> +	struct folio *origin_folio = folio;

NIT: 'origin' folio is a bit ambigious, maybe old_folio, since it is of order old_order?

>  	int split_order;
>
>  	/*
> @@ -3672,7 +3673,13 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
>  				xas_split(xas, folio, old_order);

Aside, but this 'if (foo) bar(); else { ... }' pattern is horrible, think it's
justifiable to put both in {}... :)

>  			else {
>  				xas_set_order(xas, folio->index, split_order);
> -				xas_try_split(xas, folio, old_order);
> +				/*
> +				 * use the original folio, so that a parallel
> +				 * folio_try_get() waits on it until xarray is
> +				 * updated with after-split folios and
> +				 * the original one is unfrozen.
> +				 */
> +				xas_try_split(xas, origin_folio, old_order);

Hmm, but won't we have already split the original folio by now? So is
origin_folio/old_folio a pointer to what was the original folio but now is
that but with weird tail page setup? :) like:

|------------------------|
|           f            |
|------------------------|
^old_folio  ^ split_at

|-----------|------------|
|     f     |     f2     |
|-----------|------------|
^old_folio

|-----------|-----|------|
|     f     |  f3 |  f4  |
|-----------|-----|------|
^old_folio

etc.

So the xarray would contain:

|-----------|-----|------|
|    f      |  f  |   f  |
|-----------|-----|------|

Wouldn't it after this?

Oh I guess before it'd contain:

|-----------|-----|------|
|     f     |  f4 |  f4  |
|-----------|-----|------|

Right?


You saying you'll later put the correct xas entries in post-split. Where does
that happen?

And why was it a problem when these new folios were unfrozen?

(Since the folio is a pointer to an offset in the vmemmap)

I guess if you update that later in the xas, it's ok, and everything waits on
the right thing so this is probably fine, and the f4 f4 above is probably not
fine...

I'm guessing the original folio is kept frozen during the operation?

Anyway please help my confusion not so familiar with this code :)


>  				if (xas_error(xas))
>  					return xas_error(xas);
>  			}
> --
> 2.51.0
>

Thanks, Lorenzo


  parent reply	other threads:[~2026-03-02 13:30 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-02-28  1:06 Zi Yan
2026-02-28  3:10 ` Lance Yang
2026-03-02 14:28   ` David Hildenbrand (Arm)
2026-03-02 15:11     ` Lance Yang
2026-03-02 16:36       ` Zi Yan
2026-03-02 13:30 ` Lorenzo Stoakes [this message]
2026-03-02 16:30   ` Zi Yan
2026-03-02 14:40 ` David Hildenbrand (Arm)
2026-03-02 16:34   ` Zi Yan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=54a4d554-d4cd-47d2-bdc1-8796c5d7d947@lucifer.local \
    --to=lorenzo.stoakes@oracle.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=adam.bratschikaye@dfinity.org \
    --cc=akpm@linux-foundation.org \
    --cc=andrew.battat@dfinity.org \
    --cc=baohua@kernel.org \
    --cc=baolin.wang@linux.alibaba.com \
    --cc=bas@dfinity.org \
    --cc=david@kernel.org \
    --cc=dev.jain@arm.com \
    --cc=eero.kelly@dfinity.org \
    --cc=hughd@google.com \
    --cc=lance.yang@linux.dev \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=npache@redhat.com \
    --cc=ryan.roberts@arm.com \
    --cc=stable@vger.kernel.org \
    --cc=willy@infradead.org \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox