From: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>
To: John Hubbard <jhubbard@nvidia.com>,
Matthew Brost <matthew.brost@intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
intel-xe@lists.freedesktop.org,
Ralph Campbell <rcampbell@nvidia.com>,
Christoph Hellwig <hch@lst.de>,
Jason Gunthorpe <jgg@mellanox.com>,
Jason Gunthorpe <jgg@ziepe.ca>,
Leon Romanovsky <leon@kernel.org>,
linux-mm@kvack.org, stable@vger.kernel.org,
dri-devel@lists.freedesktop.org
Subject: Re: [PATCH] mm/hmm: Fix a hmm_range_fault() livelock / starvation problem
Date: Tue, 03 Feb 2026 10:31:04 +0100
Message-ID: <9a9853a320a30802ff35803a574aab037aa2fd92.camel@linux.intel.com>
In-Reply-To: <a5b71dbc-9e3a-4098-8821-21a9a02ec235@nvidia.com>
On Mon, 2026-02-02 at 14:28 -0800, John Hubbard wrote:
> On 2/2/26 1:13 AM, Thomas Hellström wrote:
> > On Sat, 2026-01-31 at 13:42 -0800, John Hubbard wrote:
> > > On 1/31/26 11:00 AM, Matthew Brost wrote:
> > > > On Sat, Jan 31, 2026 at 01:57:21PM +0100, Thomas Hellström
> > > > wrote:
> > > > > On Fri, 2026-01-30 at 19:01 -0800, John Hubbard wrote:
> > > > > > On 1/30/26 10:00 AM, Andrew Morton wrote:
> > > > > > > On Fri, 30 Jan 2026 15:45:29 +0100 Thomas Hellström
> > > > > > > <thomas.hellstrom@linux.intel.com> wrote:
> > > > > > ...
> > >
> > > >
> > > > > > I'm also not sure a folio refcount should block migration after
> > > > > > the introduction of pinned (like in pin_user_pages) pages.
> > > > > > Rather perhaps a folio pin-count should block migration and in
> > > > > > that case do_swap_page() can definitely do a sleeping folio lock
> > > > > > and the problem is gone.
> > >
> > > > A problem for that specific point is that pincount and refcount both
> > > > mean, "the page is pinned" (which in turn literally means "not
> > > > allowed to migrate/move").
> >
> > Yeah this is what I actually want to challenge since this is what
> > blocks us from doing a clean robust solution here. From brief reading
> > of the docs around the pin-count implementation, I understand it as
> > "If you want to access the struct page metadata, get a refcount. If
> > you want to access the actual memory of a page, take a pin-count"
> >
> > I guess that might still not be true for all old instances in the
> > kernel using get_user_pages() instead of pin_user_pages() for things
> > like DMA, but perhaps we can set that in stone and document it at
> > least for device-private pages for now which would be sufficient for
> > the do_swap_page() refcount not to block migration.
> >
>
> It's an interesting direction to go...
>
> >
> > >
> > > (In fact, pincount is implemented in terms of refcount, in most
> > > configurations still.)
> >
> > Yes but that's only a space optimization never intended to conflict,
> > right? Meaning a pin-count will imply a refcount but a refcount will
> > never imply a pin-count?
> >
> Unfortunately, they are more tightly linked than that today, at least
> until someday when specialized folios are everywhere (at which point
> pincount gets its own field).
>
> Until then, it's not just a "space optimization", it's "overload
> refcount to also do pincounting". And "let core mm continue to treat
> refcounts as meaning that the page is pinned".
So this is what I had in mind:
I think certainly this would work regardless of whether pincount is
implemented by means of refcount with a bias or not, and AFAICT it's
also consistent with
https://docs.kernel.org/core-api/pin_user_pages.html
But it would not work if some part of core mm grabs a page refcount and
*expects* that to pin a page in the sense that it should not be
migrated. But you're suggesting that's actually the case?
Thanks,
Thomas
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index a101a187e6da..c07a79995128 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -534,33 +534,15 @@ static void migrate_vma_collect(struct migrate_vma *migrate)
  * migrate_vma_check_page() - check if page is pinned or not
  * @page: struct page to check
  *
- * Pinned pages cannot be migrated. This is the same test as in
- * folio_migrate_mapping(), except that here we allow migration of a
- * ZONE_DEVICE page.
+ * Pinned pages cannot be migrated.
  */
 static bool migrate_vma_check_page(struct page *page, struct page *fault_page)
 {
 	struct folio *folio = page_folio(page);
-	/*
-	 * One extra ref because caller holds an extra reference, either from
-	 * folio_isolate_lru() for a regular folio, or migrate_vma_collect() for
-	 * a device folio.
-	 */
-	int extra = 1 + (page == fault_page);
-
-	/* Page from ZONE_DEVICE have one extra reference */
-	if (folio_is_zone_device(folio))
-		extra++;
-
-	/* For file back page */
-	if (folio_mapping(folio))
-		extra += 1 + folio_has_private(folio);
-
-	if ((folio_ref_count(folio) - extra) > folio_mapcount(folio))
-		return false;
+	VM_WARN_ON_FOLIO(folio_test_lru(folio) || folio_mapped(folio), folio);
-	return true;
+	return !folio_maybe_dma_pinned(folio);
 }
>
>
> thanks,