linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Huang, Ying" <ying.huang@linux.alibaba.com>
To: John Hubbard <jhubbard@nvidia.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	 David Hildenbrand <david@kernel.org>,
	 Lorenzo Stoakes <ljs@kernel.org>,
	 "Liam R . Howlett" <Liam.Howlett@oracle.com>,
	 Vlastimil Babka <vbabka@kernel.org>,
	 Mike Rapoport <rppt@kernel.org>,
	 Suren Baghdasaryan <surenb@google.com>,
	Michal Hocko <mhocko@suse.com>,  Zi Yan <ziy@nvidia.com>,
	 Matthew Brost <matthew.brost@intel.com>,
	 Joshua Hahn <joshua.hahnjy@gmail.com>,
	 Rakie Kim <rakie.kim@sk.com>,  Byungchul Park <byungchul@sk.com>,
	 Gregory Price <gourry@gourry.net>,
	 Alistair Popple <apopple@nvidia.com>,
	 Axel Rasmussen <axelrasmussen@google.com>,
	 Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
	 Chris Li <chrisl@kernel.org>,  Kairui Song <kasong@tencent.com>,
	 Kemeng Shi <shikemeng@huaweicloud.com>,
	 Nhat Pham <nphamcs@gmail.com>,  Baoquan He <bhe@redhat.com>,
	 Barry Song <baohua@kernel.org>,
	 LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org
Subject: Re: [RFC PATCH 0/2] mm/migrate: wait for folio refcount during longterm pin migration
Date: Tue, 21 Apr 2026 17:19:36 +0800	[thread overview]
Message-ID: <87h5p4isbb.fsf@DESKTOP-5N7EMDA> (raw)
In-Reply-To: <20260410032333.400406-1-jhubbard@nvidia.com> (John Hubbard's message of "Thu, 9 Apr 2026 20:23:31 -0700")

Hi, John,

John Hubbard <jhubbard@nvidia.com> writes:

> Hi,
>
> This adds a bounded sleep to migration so that FOLL_LONGTERM pinning can
> wait for transient folio references to drain, instead of failing after a
> fixed number of retries. The wait uses a one-second timeout. An

Is the one-second timeout appropriate for all users?  Do some users
prefer fail-fast behavior instead?  If so, should we add another FOLL
flag to support a timed wait?

> alternative approach would be to call wait_var_event_killable() with no
> timeout, but that doesn't match as well with migration's "this will
> probably work" API. In other words, a short sleeping wait is more
> appropriate here.
>
> When migrating pages for FOLL_LONGTERM pinning, migration can fail with
> -EAGAIN if a folio has unexpected references. These references are often
> transient, but the current retry loop gives up too quickly. This series
> adds wait_var_event_timeout() at the retry points, paired with
> wake_up_var() in folio_put() to wake the sleeper as soon as the refcount
> drops.
>
> The wake_up_var() calls in folio_put() are gated behind a static key,
> disabled by default, so non-migration workloads pay zero cost.
> migrate_pages() enables the key on entry when the reason is
> MR_LONGTERM_PIN, and disables it on exit.
>
> Toggling the key is not free. folio_put() is static inline, so every
> compilation unit that calls it gets its own patch site (roughly 500 in
> vmlinux, plus modules). On x86, jump label patching is batched (256
> sites per batch, 3 IPI rounds per batch), so enabling the key costs
> 6-9 IPI broadcasts, a few hundred microseconds on a large machine.
> That cost is paid twice per migrate_pages() call. Migration itself
> spends several milliseconds per batch on LRU isolation, TLB flushes,
> and page copies. Concurrent longterm-pin migrations after the first
> just do an atomic_inc (no patching).
>
> Matthew Brost offered to performance-test this series [1], as Intel has
> tests that stress migration and good metrics to catch regressions.
>
> [1] https://lore.kernel.org/all/aX+oUorOWPt1xbgw@lstrano-desk.jf.intel.com/
>
> John Hubbard (2):
>   mm: wake up folio refcount waiters on folio_put()
>   mm/migrate: wait for folio refcount during longterm pin migration
>
>  include/linux/mm.h |  8 ++++++++
>  mm/migrate.c       | 30 ++++++++++++++++++++++++++++++
>  mm/swap.c          | 10 +++++++++-
>  3 files changed, 47 insertions(+), 1 deletion(-)
>
>
> base-commit: 9a9c8ce300cd3859cc87b408ef552cd697cc2ab7

---
Best Regards,
Huang, Ying


      parent reply	other threads:[~2026-04-21  9:20 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-10  3:23 John Hubbard
2026-04-10  3:23 ` [RFC PATCH 1/2] mm: wake up folio refcount waiters on folio_put() John Hubbard
2026-04-10  3:23 ` [RFC PATCH 2/2] mm/migrate: wait for folio refcount during longterm pin migration John Hubbard
2026-04-21  5:57   ` Alistair Popple
2026-04-21  9:21   ` Huang, Ying
2026-04-21  5:52 ` [RFC PATCH 0/2] " Alistair Popple
2026-04-21  9:19 ` Huang, Ying [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=87h5p4isbb.fsf@DESKTOP-5N7EMDA \
    --to=ying.huang@linux.alibaba.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=apopple@nvidia.com \
    --cc=axelrasmussen@google.com \
    --cc=baohua@kernel.org \
    --cc=bhe@redhat.com \
    --cc=byungchul@sk.com \
    --cc=chrisl@kernel.org \
    --cc=david@kernel.org \
    --cc=gourry@gourry.net \
    --cc=jhubbard@nvidia.com \
    --cc=joshua.hahnjy@gmail.com \
    --cc=kasong@tencent.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=ljs@kernel.org \
    --cc=matthew.brost@intel.com \
    --cc=mhocko@suse.com \
    --cc=nphamcs@gmail.com \
    --cc=rakie.kim@sk.com \
    --cc=rppt@kernel.org \
    --cc=shikemeng@huaweicloud.com \
    --cc=surenb@google.com \
    --cc=vbabka@kernel.org \
    --cc=weixugc@google.com \
    --cc=yuanchu@google.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox