From: Matthew Wilcox <willy@infradead.org>
To: David Hildenbrand <david@redhat.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
"lsf-pc@lists.linux-foundation.org"
<lsf-pc@lists.linux-foundation.org>
Subject: Re: [LSF/MM/BPF TOPIC] Non-lru page migration in a memdesc world
Date: Tue, 7 Jan 2025 16:49:45 +0000 [thread overview]
Message-ID: <Z31bKQ27pluywo7A@casper.infradead.org> (raw)
In-Reply-To: <2612ac8a-d0a9-452b-a53d-75ffc6166224@redhat.com>
On Tue, Jan 07, 2025 at 05:11:02PM +0100, David Hildenbrand wrote:
> one item on my todo list is making PageOffline pages stop using "struct
> page" members except page->type and 1/2 flags, to prepare them for the
> memdesc future, to avoid unnecessary atomics, and to resolve some (so-far)
> theoretical issues with temporary speculative references.
Well, thank goodness someone's working on this! Because I'm stumped.
> For that, we use the "non-lru page migration" framework and in that process
> we make use of ... way too many members of "struct page"/"struct folio" and
> rely on the refcount not being 0. For example, we certainly don't want to
> allocate memdescs for PageOffline pages just so some of them can be
> migrated.
I mean, let's start with how we migrate pages.
int migrate_pages(struct list_head *from, new_folio_t get_new_folio,
		free_folio_t put_new_folio, unsigned long private,
		enum migrate_mode mode, int reason, unsigned int *ret_succeeded)
	...
	list_for_each_entry_safe(folio, folio2, from, lru) {
We identify every folio to be migrated and put them on a list. But once
non-folio things need to be migrated, this code is wrong.
We could rename this to migrate_folios() and have a different function
for migrating non-folio memory. But then the compaction code starts to
look distressingly complex [1]. So we need a way to pass in a list/array
of memory to be migrated that doesn't involve a list_head or require
magically deducing what each piece of memory is.
I'm actually wondering about a bitmap. Generally when we migrate memory
it's to create physical contiguity so perhaps passing in a base_pfn
and a bitmap that contains, say, PMD_ORDER bits; then it's the job of
the migration code to figure out what to do for each pfn indicated by
base_pfn and the set bits in the bitmap?
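To make the shape of that interface concrete, here is a userspace sketch of
what walking a base_pfn plus a PMD_ORDER-sized bitmap might look like. All
names here (migrate_pfn_range, the callback type) are illustrative, not
existing kernel API; the real code would use for_each_set_bit() and dispatch
on the memdesc type it finds for each pfn.

```c
#include <stdint.h>

/* 512 pfns per PMD on x86-64; illustrative constant, not a kernel header. */
#define PMD_ORDER 9
#define PMD_NR_PFNS (1UL << PMD_ORDER)

typedef void (*migrate_one_t)(unsigned long pfn, void *private);

/*
 * Hypothetical interface: the caller hands us a base pfn and a bitmap
 * of PMD_NR_PFNS bits; we invoke the migration callback for each pfn
 * whose bit is set.  The real migration core would look up the memdesc
 * type for each pfn here and dispatch to the right handler.
 */
static void migrate_pfn_range(unsigned long base_pfn,
			      const uint64_t *bitmap,
			      migrate_one_t migrate_one, void *private)
{
	for (unsigned long bit = 0; bit < PMD_NR_PFNS; bit++) {
		if (bitmap[bit / 64] & (1ULL << (bit % 64)))
			migrate_one(base_pfn + bit, private);
	}
}
```

The point of the shape is that the caller never builds a list threaded
through the memory being migrated, so nothing about the interface assumes a
struct page/folio with an lru field.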
Although now that I write this down, I guess NUMA migration doesn't
behave that way. So perhaps compaction-migration and numa-migration end
up using different interfaces? I think NUMA migration always migrates
folios, so it can keep using get_new_folio() and put_new_folio(), while
compaction-migration might need a different pair of callbacks to
allocate/free memory of many different memdesc types.
[1] OK, it is already distressingly complex. But we're making it even
more complex.