From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: Zi Yan <ziy@nvidia.com>
Cc: David Hildenbrand <david@redhat.com>,
Wei Yang <richard.weiyang@gmail.com>,
akpm@linux-foundation.org, baolin.wang@linux.alibaba.com,
Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com,
dev.jain@arm.com, baohua@kernel.org, lance.yang@linux.dev,
linux-mm@kvack.org
Subject: Re: [Patch v2 2/2] mm/huge_memory: Optimize and simplify __split_unmapped_folio() logic
Date: Mon, 20 Oct 2025 15:03:06 +0100 [thread overview]
Message-ID: <8dadd256-175a-4706-84d0-180eda04ce89@lucifer.local> (raw)
In-Reply-To: <A67D1276-D59B-41E9-8A88-E54E05A0F479@nvidia.com>
On Fri, Oct 17, 2025 at 01:24:37PM -0400, Zi Yan wrote:
> Let me explain the changes in detail below, so that we might find a
> good way of splitting it up for easy review:
Thanks, much appreciated!
Looking through it seems to me there should be 3 patches:
- Getting rid of the trailing loop
- Stats fixups
- Remaining trivial cleanups
Do you agree?
Each can have a commit message that describes in detail what's happening
and why it's ok.
This is also useful for bisection purposes.
>
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 4b2d5a7e5c8e..68e851f5fcb2 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -3528,15 +3528,9 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
> > struct address_space *mapping, bool uniform_split)
> > {
> > bool is_anon = folio_test_anon(folio);
> > - int order = folio_order(folio);
> > - int start_order = uniform_split ? new_order : order - 1;
> > - bool stop_split = false;
> > - struct folio *next;
> > + int old_order = folio_order(folio);
> > + int start_order = uniform_split ? new_order : old_order - 1;
> > int split_order;
> > - int ret = 0;
>
> These remove unused variables and do an order->old_order rename.
>
> > -
> > - if (is_anon)
> > - mod_mthp_stat(order, MTHP_STAT_NR_ANON, -1);
>
> In the original __split_unmapped_folio() implementation, I probably
> moved the stats from split_huge_page_to_list_to_order() (now __folio_split())[1]
> into __split_unmapped_folio() and needed to handle xas_try_split() failures,
> so I left mod_mthp_stat(order, ..., -1) here. It should have been put
> together with mod_mthp_stat(new_order, ..., 1 << (order - new_order)).
>
> [1] https://elixir.bootlin.com/linux/v6.13/source/mm/huge_memory.c#L3622
>
> >
> > folio_clear_has_hwpoisoned(folio);
> >
> > @@ -3545,17 +3539,13 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
> > * folio is split to new_order directly.
> > */
> > for (split_order = start_order;
> > - split_order >= new_order && !stop_split;
> > + split_order >= new_order;
>
> stop_split was used to handle xas_try_split() failure: because of the stats
> separation I had in the initial code, I needed to add the -1 stats back. Now
> with Wei's change, subtracting the old folio and incrementing the new folios
> are done together, so stop_split is no longer needed.
>
> > split_order--) {
> > - struct folio *end_folio = folio_next(folio);
> > - int old_order = folio_order(folio);
> > - struct folio *new_folio;
> > + int nr_new_folios = 1UL << (old_order - split_order);
>
> This counts the new after-split folios for the stats update below.
>
> >
> > /* order-1 anonymous folio is not supported */
> > if (is_anon && split_order == 1)
> > continue;
> > - if (uniform_split && split_order != new_order)
> > - continue;
>
> split_order is always new_order when uniform_split is true, so this check is
> dead code.
>
> >
> > if (mapping) {
> > /*
> > @@ -3568,49 +3558,31 @@ static int __split_unmapped_folio(struct folio *folio, int new_order,
> > else {
> > xas_set_order(xas, folio->index, split_order);
> > xas_try_split(xas, folio, old_order);
> > - if (xas_error(xas)) {
> > - ret = xas_error(xas);
> > - stop_split = true;
> > - }
> > + if (xas_error(xas))
> > + return xas_error(xas);
>
> xas_error() was not returned immediately because the -1 for the old folio
> needed to be fixed up later. Now that the stats are not changed upfront, we
> can simply return.
>
>
> > }
> > }
> >
> > - if (!stop_split) {
> > - folio_split_memcg_refs(folio, old_order, split_order);
> > - split_page_owner(&folio->page, old_order, split_order);
> > - pgalloc_tag_split(folio, old_order, split_order);
> > + folio_split_memcg_refs(folio, old_order, split_order);
> > + split_page_owner(&folio->page, old_order, split_order);
> > + pgalloc_tag_split(folio, old_order, split_order);
> > + __split_folio_to_order(folio, old_order, split_order);
> >
> > - __split_folio_to_order(folio, old_order, split_order);
>
> No more stop_split, so all these are unconditional.
>
> > + if (is_anon) {
> > + mod_mthp_stat(old_order, MTHP_STAT_NR_ANON, -1);
> > + mod_mthp_stat(split_order, MTHP_STAT_NR_ANON, nr_new_folios);
> > }
>
> Updating the stats when the actual split is done is much better than the
> original approach, which assumed the split would happen and recovered
> when it did not.
>
> >
> > /*
> > - * Iterate through after-split folios and update folio stats.
> > - * But in buddy allocator like split, the folio
> > - * containing the specified page is skipped until its order
> > - * is new_order, since the folio will be worked on in next
> > - * iteration.
> > + * For uniform split, we have finished the job.
> > +	 * For non-uniform split, we assign folio to the one
> > + * containing @split_at and assign @old_order to @split_order.
> > */
> > - for (new_folio = folio; new_folio != end_folio; new_folio = next) {
> > - next = folio_next(new_folio);
> > - /*
> > - * for buddy allocator like split, new_folio containing
> > - * @split_at page could be split again, thus do not
> > - * change stats yet. Wait until new_folio's order is
> > - * @new_order or stop_split is set to true by the above
> > - * xas_split() failure.
> > - */
> > - if (new_folio == page_folio(split_at)) {
> > - folio = new_folio;
> > - if (split_order != new_order && !stop_split)
> > - continue;
> > - }
> > - if (is_anon)
> > - mod_mthp_stat(folio_order(new_folio),
> > - MTHP_STAT_NR_ANON, 1);
> > - }
>
> This loop was needed to update the after-split folios in the xarray and to
> unfreeze them, but since the last refactor, neither is done here. And the
> stats update can be done in one shot instead of the original one-by-one
> method. So this hunk can be removed completely.
Which refactor in particular? This is the bit that really needs separating out,
as it's the most confusing and raises the question 'why can we avoid doing this?'
>
> > + folio = page_folio(split_at);
> > + old_order = split_order;
>
> These two are for the non-uniform split, where the code moves on to split the
> next folio after it splits the current one in half. And old_order needs to be
> updated, since the folio has been split.
>
> For a uniform split, the for loop runs only once, so these two lines
> effectively do nothing, since the loop then ends.
>
> > }
> >
> > - return ret;
> > + return 0;
>
> xas_try_split() failure is handled above, so the code should always succeed
> at this point.
>
> > }
> >
> > bool non_uniform_split_supported(struct folio *folio, unsigned int new_order,
>
> --
> Best Regards,
> Yan, Zi
>