From: Lokesh Gidra <lokeshgidra@google.com>
To: David Hildenbrand <david@redhat.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
akpm@linux-foundation.org, linux-mm@kvack.org,
kaleshsingh@google.com, ngeoffray@google.com,
Harry Yoo <harry.yoo@oracle.com>, Peter Xu <peterx@redhat.com>,
Suren Baghdasaryan <surenb@google.com>,
Barry Song <baohua@kernel.org>, SeongJae Park <sj@kernel.org>
Subject: Re: [RFC PATCH 1/2] mm: always call rmap_walk() on locked folios
Date: Fri, 12 Sep 2025 21:27:03 -0700
Message-ID: <CA+EESO5D6CBezDCnN6=7zujzczDZp1t-kajXQCEVsgYTYciT4g@mail.gmail.com>
In-Reply-To: <3f37af16-abf2-4ed4-9894-0028a9f02f76@redhat.com>
On Fri, Sep 12, 2025 at 2:03 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 11.09.25 21:39, Lorenzo Stoakes wrote:
> > Please please please use a cover letter if there's more than 1 patch :)
> >
> > I really dislike the 2/2 replying to the 1/2.
>
> +1000
>
> >
> > On Sun, Sep 07, 2025 at 09:49:49PM -0700, Lokesh Gidra wrote:
> >> Prior discussion about this can be found at [1].
> >>
> >> rmap_walk() requires all folios, except non-KSM anon, to be locked. This
> >> implies that when threads update folio->mapping to an anon_vma with
> >> different root (currently only done by UFFDIO_MOVE), they have to
> >
> > I said this on the discussion thread, but can we please stop dancing around
> > and acting as if this isn't an entirely uffd-specific patch please :)
> >
> > Let's very explicitly say that's why we're doing this.
> >
> >> serialize against rmap_walk() with write-lock on the anon_vma, hurting
> >> scalability. Furthermore, this necessitates rechecking anon_vma when
> >> pinning/locking an anon_vma (like in folio_lock_anon_vma_read()).
> >
> > This is really quite confusing, you're compressing far too much information
> > into a single sentence.
> >
> > Let's reword this to make it clearer like:
> >
> > Userfaultfd has a scaling issue with its UFFDIO_MOVE operation, an
> >      operation that is heavily used in Android [insert reason why].
> >
> > The issue arises because UFFDIO_MOVE updates folio->mapping to an
> > anon_vma with a different root. It acquires the folio lock to do
> > so, but this is insufficient, because rmap_walk() has a mode in
> > which a folio lock need not be acquired, exclusive to non-KSM
> > anonymous folios.
> >
> > This means that UFFDIO_MOVE has to acquire the anon_vma write lock
> > of the root anon_vma belonging to the folio it wishes to move.
> >
> > This has resulted in scalability issues due to contention between
> > [insert contention information]. We have observed:
> >
> > [ insert some data to back this up ]
> >
> > This patch resolves the issue by removing this exception. This is
> > less problematic than it might seem, as the only caller which
> > utilises this mode is shrink_active_list().
> >
> > Something like this is _a lot_ clearer I think.
>
> Yes, fully agreed.
>
> >
> >>
> >> This can be simplified quite a bit by ensuring that rmap_walk() is
> >> always called on locked folios. Among the few callers of rmap_walk() on
> >> unlocked anon folios, shrink_active_list()->folio_referenced() is the
> >> only performance critical one.
> >
> > Let's please not call this a simplification, I mean yes we simplify the
> > code per se, but we're fundamentally changing the locking logic.
> >
> > Let's explicitly say that.
> >
> > Also I find it odd that you say shrink_active_list()->folio_referenced() is
> > 'performance critical', I mean if so, surely this series is broken then?
> >
> > I'd delete that, the entire basis of this being ok is that it's _not_
> > performance critical to make this change.
>
> I think we can mention that as a side-effect of this performance
> optimization for uffd, folio_get_anon_vma() gets simpler and we no
> longer handle locking of anon folios different to locking of other
> (pagecache, ksm) folios.
>
Thank you both for the valuable feedback. I'll upload the next version
within a few days, addressing all the comments.
>
> --
> Cheers
>
> David / dhildenb
>