From: Lokesh Gidra <lokeshgidra@google.com>
To: akpm@linux-foundation.org
Cc: linux-mm@kvack.org, kaleshsingh@google.com, ngeoffray@google.com,
jannh@google.com, Lokesh Gidra <lokeshgidra@google.com>,
David Hildenbrand <david@redhat.com>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
Harry Yoo <harry.yoo@oracle.com>, Peter Xu <peterx@redhat.com>,
Suren Baghdasaryan <surenb@google.com>,
Barry Song <baohua@kernel.org>, SeongJae Park <sj@kernel.org>
Subject: [PATCH v2 0/2] Improve UFFDIO_MOVE scalability by removing anon_vma lock
Date: Tue, 23 Sep 2025 00:10:17 -0700
Message-ID: <20250923071019.775806-1-lokeshgidra@google.com>
Userfaultfd has a scalability issue in its UFFDIO_MOVE ioctl, which is
heavily used in Android because its Java garbage collector relies on it
for concurrent heap compaction.
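For readers unfamiliar with the ioctl, the following is a minimal userspace sketch of how a caller might issue UFFDIO_MOVE. The helper name uffd_move_range() is hypothetical and not part of this series. It assumes that uffd is an already configured userfaultfd covering the destination range, and that the installed uapi headers define UFFDIO_MOVE (otherwise the helper simply fails with ENOSYS).

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/userfaultfd.h>

/*
 * Hypothetical helper (not from the patch series): ask the kernel to move
 * len bytes of anonymous memory from src to dst via UFFDIO_MOVE.
 * Returns the number of bytes moved, or -1 with errno set.
 */
static long uffd_move_range(int uffd, void *dst, void *src, size_t len)
{
#ifdef UFFDIO_MOVE
	struct uffdio_move mv = {
		.dst  = (uint64_t)(uintptr_t)dst,
		.src  = (uint64_t)(uintptr_t)src,
		.len  = len,
		.mode = 0,	/* no UFFDIO_MOVE_MODE_* flags in this sketch */
	};

	if (ioctl(uffd, UFFDIO_MOVE, &mv) == -1)
		return -1;

	/* On success the kernel reports the number of bytes moved here. */
	return (long)mv.move;
#else
	/* Headers predate UFFDIO_MOVE; report that it is unavailable. */
	(void)uffd; (void)dst; (void)src; (void)len;
	errno = ENOSYS;
	return -1;
#endif
}
```

A compacting collector would typically call such a helper from several threads at once, each on a different page range of the same source mapping, which is the pattern the rest of this letter is concerned with.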
The issue arises because UFFDIO_MOVE updates folio->mapping to an
anon_vma with a different root, in order to move the folio from a src
VMA to dst VMA. It performs the operation with the folio locked, but
this is insufficient, because rmap_walk() can be performed on non-KSM
anonymous folios without folio lock.
This means that UFFDIO_MOVE has to acquire the anon_vma write lock
of the root anon_vma belonging to the folio it wishes to move.
This causes a scalability bottleneck when multiple threads perform
UFFDIO_MOVE simultaneously on distinct pages of the same src VMA. In
field traces of arm64 Android devices, we have observed janky user
interactions due to long (sometimes over 50ms) uninterruptible
sleeps on the main UI thread caused by anon_vma lock contention in
UFFDIO_MOVE. This is particularly severe at the beginning of the
GC's compaction phase, when multiple threads are likely to be
involved.
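The contention pattern can be illustrated with a small userspace analogy. This is only a toy model under stated assumptions, not kernel code: struct page_model, struct vma_model, and run_movers() are hypothetical names, a pthread mutex stands in for the root anon_vma write lock, and another stands in for the folio lock. With take_root_lock set, threads working on disjoint pages still serialize on the shared lock. Without it, they only take the per-page lock and never block one another.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES   64
#define NTHREADS 4	/* NPAGES must be divisible by NTHREADS */

/* Toy stand-ins; none of these are kernel types. */
struct page_model {
	pthread_mutex_t lock;	/* plays the role of the folio lock */
	int owner;		/* plays the role of folio->mapping */
};

struct vma_model {
	pthread_mutex_t root_lock;	/* plays the role of the root anon_vma lock */
	struct page_model pages[NPAGES];
};

struct mover_arg {
	struct vma_model *vma;
	size_t first, count;	/* disjoint slice of pages for this thread */
	int new_owner;
	bool take_root_lock;
};

static void vma_model_init(struct vma_model *vma)
{
	pthread_mutex_init(&vma->root_lock, NULL);
	for (size_t i = 0; i < NPAGES; i++) {
		pthread_mutex_init(&vma->pages[i].lock, NULL);
		vma->pages[i].owner = 0;
	}
}

static void *mover(void *p)
{
	struct mover_arg *a = p;

	for (size_t i = a->first; i < a->first + a->count; i++) {
		struct page_model *pg = &a->vma->pages[i];

		/* Shared lock: every thread queues here, even for distinct pages. */
		if (a->take_root_lock)
			pthread_mutex_lock(&a->vma->root_lock);
		pthread_mutex_lock(&pg->lock);
		pg->owner = a->new_owner;
		pthread_mutex_unlock(&pg->lock);
		if (a->take_root_lock)
			pthread_mutex_unlock(&a->vma->root_lock);
	}
	return NULL;
}

/* Returns how many pages ended up with new_owner, or -1 if a thread failed to start. */
static int run_movers(struct vma_model *vma, int new_owner, bool take_root_lock)
{
	pthread_t tid[NTHREADS];
	struct mover_arg args[NTHREADS];
	size_t per = NPAGES / NTHREADS;
	int started = 0, moved = 0;

	for (; started < NTHREADS; started++) {
		args[started] = (struct mover_arg){
			vma, (size_t)started * per, per, new_owner, take_root_lock
		};
		if (pthread_create(&tid[started], NULL, mover, &args[started]) != 0)
			break;
	}
	/* Always join what was started so args[] stays valid while in use. */
	for (int t = 0; t < started; t++)
		pthread_join(tid[t], NULL);
	if (started < NTHREADS)
		return -1;

	for (size_t i = 0; i < NPAGES; i++)
		moved += (vma->pages[i].owner == new_owner);
	return moved;
}
```

Both modes produce the same final state. The difference is only in how much the threads wait on each other, which is what shows up as long uninterruptible sleeps in the traces described above.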
This series resolves the issue by removing the exception in rmap_walk()
for non-KSM anon folios, so that all folios are locked during rmap
walk. This is less problematic than it might seem, as the only
major caller which utilises this mode is shrink_active_list(), which is
covered in detail in the first patch of this series.
As a result of changing our approach to locking, we can remove all
the code that took steps to acquire an anon_vma write lock instead
of a folio lock. This results in a significant simplification and
scalability improvement of the code (currently only in UFFDIO_MOVE).
Furthermore, as a side-effect, folio_lock_anon_vma_read() gets simpler
as we don't need to worry that folio->mapping may have changed under us.
Prior discussions on this can be found at [1, 2].
Changes since v1 [3]:
1. Move relevant parts of cover letter description to first patch, per
David Hildenbrand.
2. Enumerate all callers of rmap_walk(), folio_lock_anon_vma_read(), and
folio_get_anon_vma(), per Lorenzo Stoakes.
3. Make other corrections/improvements to commit message, per Lorenzo
Stoakes.
[1] https://lore.kernel.org/all/CA+EESO4Z6wtX7ZMdDHQRe5jAAS_bQ-POq5+4aDx5jh2DvY6UHg@mail.gmail.com/
[2] https://lore.kernel.org/all/20250908044950.311548-1-lokeshgidra@google.com/
[3] https://lore.kernel.org/all/20250918055135.2881413-1-lokeshgidra@google.com/
Lokesh Gidra (2):
  mm: always call rmap_walk() on locked folios
  mm/userfaultfd: don't lock anon_vma when performing UFFDIO_MOVE
CC: David Hildenbrand <david@redhat.com>
CC: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
CC: Harry Yoo <harry.yoo@oracle.com>
CC: Peter Xu <peterx@redhat.com>
CC: Suren Baghdasaryan <surenb@google.com>
CC: Barry Song <baohua@kernel.org>
CC: SeongJae Park <sj@kernel.org>
---
mm/damon/ops-common.c | 16 +++--------
mm/huge_memory.c | 22 +--------------
mm/memory-failure.c | 3 +++
mm/page_idle.c | 8 ++----
mm/rmap.c | 42 +++++++++--------------------
mm/userfaultfd.c | 62 ++++++++-----------------------------------
6 files changed, 33 insertions(+), 120 deletions(-)
--
2.51.0.534.gc79095c0ca-goog