From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Vlastimil Babka <vbabka@suse.cz>,
"Liam R . Howlett" <Liam.Howlett@oracle.com>,
Suren Baghdasaryan <surenb@google.com>,
Matthew Wilcox <willy@infradead.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 1/7] mm/mremap: introduce more mergeable mremap via MREMAP_RELOCATE_ANON
Date: Sat, 22 Mar 2025 07:21:49 +0000 [thread overview]
Message-ID: <9f81bfe4-4cc7-4754-b92f-db3a4e549f86@lucifer.local> (raw)
In-Reply-To: <21f89b73-aaae-4674-aea2-aefc7a4847d9@redhat.com>
On Sat, Mar 22, 2025 at 07:17:05AM +0100, David Hildenbrand wrote:
> On 22.03.25 06:33, David Hildenbrand wrote:
> > On 22.03.25 01:14, Jann Horn wrote:
> > > On Fri, Mar 21, 2025 at 10:54 PM Lorenzo Stoakes
> > > <lorenzo.stoakes@oracle.com> wrote:
> > > > diff --git a/mm/mremap.c b/mm/mremap.c
> > > > index 0865387531ed..bb67562a0114 100644
> > > > --- a/mm/mremap.c
> > > > +++ b/mm/mremap.c
> > > [...]
> > > > +/*
> > > > + * If the folio mapped at the specified pte entry can have its index and mapping
> > > > + * relocated, then do so.
> > > > + *
> > > > + * Returns the number of pages we have traversed, or 0 if the operation failed.
> > > > + */
> > > > +static unsigned long relocate_anon(struct pagetable_move_control *pmc,
> > > > + unsigned long old_addr, unsigned long new_addr, pte_t pte,
> > > > + bool undo)
> > > > +{
> > > > + struct page *page;
> > > > + struct folio *folio;
> > > > + struct vm_area_struct *old, *new;
> > > > + pgoff_t new_index;
> > > > + unsigned long ret = 1;
> > > > +
> > > > + old = pmc->old;
> > > > + new = pmc->new;
> > > > +
> > > > + /* Ensure we have truly got an anon folio. */
> > > > + page = vm_normal_page(old, old_addr, pte);
> > > > + if (!page)
> > > > + return ret;
> > > > + folio = page_folio(page);
> > > > + folio_lock(folio);
> > > > +
> > > > + /* no-op. */
> > > > + if (!folio_test_anon(folio) || folio_test_ksm(folio))
> > > > + goto out;
> > > > +
> > > > + /*
> > > > + * This should not happen as we explicitly disallow this, but check
> > > > + * anyway.
> > > > + */
> > > > + if (folio_test_large(folio)) {
> > > > + ret = 0;
> > > > + goto out;
> > > > + }
> > >
> > > Do I understand correctly that you assume here that the page is
> > > exclusively mapped? Maybe we could at least
> > > WARN_ON(folio_mapcount(folio) != 1) or something like that?
> > >
> > > (I was also wondering if the PageAnonExclusive bit is somehow
> > > relevant, but we should probably not look at or touch that here,
> > > unless we want to think about cases where we _used to_ have a child
> > > from which the page may have been GUP'd...)
> >
> > UFFDIO_MOVE implements something similar. Right now we keep it simple:
> >
> > if (folio_test_large(src_folio) ||
> > folio_maybe_dma_pinned(src_folio) ||
> > !PageAnonExclusive(&src_folio->page)) {
> > err = -EBUSY;
> > goto out;
> > }
> >
> > Whereby we
> >
> > a) Make sure we cover all PTEs (-> small folio, single PTE). Large
> > PTE-mapped folios are split.
> >
> > b) Make sure there are no GUP pins (maybe not required here?)
> >
> > c) The folio is exclusive to this process
>
> On additional note as my memory comes back: if PAE is set, there cannot be
> other (inactive) mappings from the swapcache. So whenever we use folio lock
> + mapcount data, the possibility of the swapcache (having inactive mappings
> from other processes etc.) must be considered.
Ack, do you have a feel for how such a check would work?
>
> --
> Cheers,
>
> David / dhildenb
>
An aside in general:
I realise all of this is very fiddly, this (albeit early) RFC is several
revisions deep and fully expect considerably more fiddly cases to come up :)
I plan to do some deeper testing on real iron running very heavy workloads to
encourage reclaim and races btw.
One test I've done in past is to just hack in forced-on MREMAP_RELOCATE_ANON so
_all_ mremap()'s that move do it, which has been a good way of finding issues,
but also in tests I add in series I intentionally trigger reclaim via
MADV_PAGEOUT.
So in general - I'm going to proactively try to really eek out all and any weird
edge case stuff where possible before more seriously pushing this series
forward.
Thanks again for early review, it's much appreciated! :) and see you at the
topic I'm giving on this and the pub after... ;)
next prev parent reply other threads:[~2025-03-22 7:22 UTC|newest]
Thread overview: 20+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-21 21:54 [RFC PATCH 0/7] " Lorenzo Stoakes
2025-03-21 21:54 ` [RFC PATCH 1/7] " Lorenzo Stoakes
2025-03-22 0:14 ` Jann Horn
2025-03-22 5:33 ` David Hildenbrand
2025-03-22 6:17 ` David Hildenbrand
2025-03-22 7:21 ` Lorenzo Stoakes [this message]
2025-03-23 12:53 ` David Hildenbrand
2025-03-31 14:19 ` Lorenzo Stoakes
2025-03-22 7:17 ` Lorenzo Stoakes
2025-03-23 12:49 ` David Hildenbrand
2025-03-31 14:50 ` Lorenzo Stoakes
2025-04-01 19:33 ` David Hildenbrand
2025-04-21 13:12 ` Lorenzo Stoakes
2025-03-22 7:07 ` Lorenzo Stoakes
2025-03-21 21:54 ` [RFC PATCH 2/7] mm/mremap: add MREMAP_MUST_RELOCATE_ANON Lorenzo Stoakes
2025-03-21 21:54 ` [RFC PATCH 3/7] mm/mremap: add MREMAP[_MUST]_RELOCATE_ANON support for THP folios Lorenzo Stoakes
2025-03-21 21:54 ` [RFC PATCH 4/7] tools UAPI: Update copy of linux/mman.h from the kernel sources Lorenzo Stoakes
2025-03-21 21:54 ` [RFC PATCH 5/7] tools/testing/selftests: add mremap() cases that merge normally Lorenzo Stoakes
2025-03-21 21:54 ` [RFC PATCH 6/7] tools/testing/selftests: add MREMAP_RELOCATE_ANON merge test cases Lorenzo Stoakes
2025-03-21 21:54 ` [RFC PATCH 7/7] tools/testing/selftests: expand mremap() tests for MREMAP_RELOCATE_ANON Lorenzo Stoakes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=9f81bfe4-4cc7-4754-b92f-db3a4e549f86@lucifer.local \
--to=lorenzo.stoakes@oracle.com \
--cc=Liam.Howlett@oracle.com \
--cc=akpm@linux-foundation.org \
--cc=david@redhat.com \
--cc=jannh@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=surenb@google.com \
--cc=vbabka@suse.cz \
--cc=willy@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox