From: Alistair Popple <apopple@nvidia.com>
To: John Hubbard <jhubbard@nvidia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	 David Hildenbrand <david@kernel.org>,
	Lorenzo Stoakes <ljs@kernel.org>,
	 "Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@kernel.org>,
	 Mike Rapoport <rppt@kernel.org>,
	Suren Baghdasaryan <surenb@google.com>,
	 Michal Hocko <mhocko@suse.com>, Zi Yan <ziy@nvidia.com>,
	Matthew Brost <matthew.brost@intel.com>,
	 Joshua Hahn <joshua.hahnjy@gmail.com>,
	Rakie Kim <rakie.kim@sk.com>, Byungchul Park <byungchul@sk.com>,
	 Gregory Price <gourry@gourry.net>,
	Ying Huang <ying.huang@linux.alibaba.com>,
	 Axel Rasmussen <axelrasmussen@google.com>,
	Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
	 Chris Li <chrisl@kernel.org>, Kairui Song <kasong@tencent.com>,
	 Kemeng Shi <shikemeng@huaweicloud.com>,
	Nhat Pham <nphamcs@gmail.com>, Baoquan He <bhe@redhat.com>,
	 Barry Song <baohua@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org
Subject: Re: [RFC PATCH 2/2] mm/migrate: wait for folio refcount during longterm pin migration
Date: Tue, 21 Apr 2026 15:57:17 +1000
Message-ID: <aecQkNxjoMReR6Ss@nvdebian.thelocal>
In-Reply-To: <20260410032333.400406-3-jhubbard@nvidia.com>

On 2026-04-10 at 13:23 +1000, John Hubbard <jhubbard@nvidia.com> wrote...
> When migrating pages for FOLL_LONGTERM pinning (MR_LONGTERM_PIN), the
> migration can fail with -EAGAIN if the folio has unexpected references.
> These references are often transient (e.g., from GPU operations like
> cuMemset that will complete shortly).

Is there a reason this logic should only apply to FOLL_LONGTERM pinning?
Or could it also apply more generally to any ZONE_MOVABLE page, for which
migration should eventually succeed? Currently that path has similar logic: it
retries up to NR_MAX_MIGRATE_PAGES_RETRY times and then gives up.

We have similar retry problems in mm/migrate_device.c:migrate_vma_*(), so I
could see something like this being useful there too.

 - Alistair

> Previously, the migration code would retry up to 10 times
> (NR_MAX_MIGRATE_PAGES_RETRY), but this busy-retry approach failed when
> the transient reference holder needed more time than the retry loop
> provides.
> 
> Fix this by waiting up to one second for the folio's refcount to drop
> to the expected value before retrying migration. The wait uses
> wait_var_event_timeout() paired with the wake_up_var() calls added to
> folio_put() in the previous commit. If the timeout expires, the
> existing retry loop continues as before. The folio_put_wakeup_key
> static key is enabled for the duration of migrate_pages() so that
> folio_put() only wakes waiters when migration is active.
> 
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  mm/migrate.c | 30 ++++++++++++++++++++++++++++++
>  1 file changed, 30 insertions(+)
> 
> diff --git a/mm/migrate.c b/mm/migrate.c
> index 2c3d489ecf51..a5d9f85aa376 100644
> --- a/mm/migrate.c
> +++ b/mm/migrate.c
> @@ -47,6 +47,8 @@
>  #include <asm/tlbflush.h>
>  
>  #include <trace/events/migrate.h>
> +#include <linux/jump_label.h>
> +#include <linux/wait_bit.h>
>  
>  #include "internal.h"
>  #include "swap.h"
> @@ -1732,6 +1734,17 @@ static void migrate_folios_move(struct list_head *src_folios,
>  			*retry += 1;
>  			*thp_retry += is_thp;
>  			*nr_retry_pages += nr_pages;
> +			/*
> +			 * For longterm pinning, wait for references
> +			 * to be released before retrying.
> +			 */
> +			if (reason == MR_LONGTERM_PIN) {
> +				int expected = folio_expected_ref_count(folio) + 1;
> +
> +				wait_var_event_timeout(&folio->_refcount,
> +					folio_ref_count(folio) <= expected,
> +					HZ);
> +			}
>  			break;
>  		case 0:
>  			stats->nr_succeeded += nr_pages;
> @@ -1941,6 +1954,17 @@ static int migrate_pages_batch(struct list_head *from,
>  				retry++;
>  				thp_retry += is_thp;
>  				nr_retry_pages += nr_pages;
> +				/*
> +				 * For longterm pinning, wait for references
> +				 * to be released.
> +				 */
> +				if (reason == MR_LONGTERM_PIN) {
> +					int expected = folio_expected_ref_count(folio) + 1;
> +
> +					wait_var_event_timeout(&folio->_refcount,
> +							folio_ref_count(folio) <= expected,
> +							HZ);
> +				}
>  				break;
>  			case 0:
>  				list_move_tail(&folio->lru, &unmap_folios);
> @@ -2085,6 +2109,9 @@ int migrate_pages(struct list_head *from, new_folio_t get_new_folio,
>  
>  	memset(&stats, 0, sizeof(stats));
>  
> +	if (reason == MR_LONGTERM_PIN)
> +		static_branch_inc(&folio_put_wakeup_key);
> +
>  	rc_gather = migrate_hugetlbs(from, get_new_folio, put_new_folio, private,
>  				     mode, reason, &stats, &ret_folios);
>  	if (rc_gather < 0)
> @@ -2137,6 +2164,9 @@ int migrate_pages(struct list_head *from, new_folio_t get_new_folio,
>  	if (!list_empty(from))
>  		goto again;
>  out:
> +	if (reason == MR_LONGTERM_PIN)
> +		static_branch_dec(&folio_put_wakeup_key);
> +
>  	/*
>  	 * Put the permanent failure folio back to migration list, they
>  	 * will be put back to the right list by the caller.
> -- 
> 2.53.0
> 


Thread overview: 7+ messages
2026-04-10  3:23 [RFC PATCH 0/2] " John Hubbard
2026-04-10  3:23 ` [RFC PATCH 1/2] mm: wake up folio refcount waiters on folio_put() John Hubbard
2026-04-10  3:23 ` [RFC PATCH 2/2] mm/migrate: wait for folio refcount during longterm pin migration John Hubbard
2026-04-21  5:57   ` Alistair Popple [this message]
2026-04-21  9:21   ` Huang, Ying
2026-04-21  5:52 ` [RFC PATCH 0/2] " Alistair Popple
2026-04-21  9:19 ` Huang, Ying
