Re: [RFC] Unconditionally lock folios when calling rmap_walk()

From: Zi Yan <ziy@nvidia.com>
To: Lokesh Gidra <lokeshgidra@google.com>, Barry Song <21cnbao@gmail.com>
Cc: "open list:MEMORY MANAGEMENT" <linux-mm@kvack.org>,
	Peter Xu <peterx@redhat.com>,
	David Hildenbrand <david@redhat.com>,
	Suren Baghdasaryan <surenb@google.com>,
	Kalesh Singh <kaleshsingh@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	android-mm <android-mm@google.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Jann Horn <jannh@google.com>
Subject: Re: [RFC] Unconditionally lock folios when calling rmap_walk()
Date: Thu, 21 Aug 2025 12:13:52 -0400
Message-ID: <3133F0B4-4684-4EC7-81FC-BC12A430E4C2@nvidia.com>
In-Reply-To: <CAGsJ_4xccre0rz5zgRTA=NbFzF4FLS-ZUohgLFnfTGY9Jdequg@mail.gmail.com>

On 21 Aug 2025, at 8:01, Barry Song wrote:

> On Thu, Aug 21, 2025 at 12:29 PM Lokesh Gidra <lokeshgidra@google.com> wrote:
>>
>> Adding linux-mm mailing list. Mistakenly used the wrong email address.
>>
>> On Wed, Aug 20, 2025 at 9:23 PM Lokesh Gidra <lokeshgidra@google.com> wrote:
>>>
>>> Hi all,
>>>
>>> Currently, some callers of rmap_walk() conditionally avoid try-locking
>>> non-ksm anon folios. This necessitates serialization through anon_vma
>>> write-lock when folio->mapping and/or folio->index (fields involved in
>>> rmap_walk()) are to be updated. This hurts scalability due to coarse
>>> granularity of the lock. For instance, when multiple threads invoke
>>> userfaultfd’s MOVE ioctl simultaneously to move distinct pages from
>>> the same src VMA, they all contend for the corresponding anon_vma’s
>>> lock. Field traces for arm64 android devices reveal over 30ms of
>>> uninterruptible sleep in the main UI thread, leading to janky user
>>> interactions.
>>>
>>> Among all rmap_walk() callers that don’t lock anon folios,
>>> folio_referenced() is the most critical (others are
>>> page_idle_clear_pte_refs(), damon_folio_young(), and
>>> damon_folio_mkold()). The relevant code in folio_referenced() is:
>>>
>>> if (!is_locked && (!folio_test_anon(folio) || folio_test_ksm(folio))) {
>>>         we_locked = folio_trylock(folio);
>>>         if (!we_locked)
>>>                 return 1;
>>> }

This seems to be legacy code from commit 5ad6468801d2 ("ksm: let shared
pages be swappable"). According to the commit log, the lock protects the
KSM stable tree from concurrent modification.

>>>
>>> It’s unclear why locking anon_vma (when updating folio->mapping) is
>>> beneficial over locking the folio here. It’s in the reclaim path, so
>>> should not be a critical path that necessitates some special
>>> treatment, unless I’m missing something.

Based on git history, the decision predates the first git commit,
1da177e4c3f4. Maybe it is time to revisit and improve it.


>>>
>>> Therefore, I propose simplifying the locking mechanism by
>>> unconditionally try-locking the folio in such cases. This helps avoid
>>> locking anon_vma when updating folio->mapping, which, for instance,
>>> will help eliminate the uninterruptible sleep observed in the field
>>> traces mentioned earlier. Furthermore, it enables us to simplify the
>>> code in folio_lock_anon_vma_read() by removing the re-check to ensure
>>> that the field hasn’t changed under us.
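
To make the difference concrete, here is a minimal userspace sketch of
the two lock decisions. It is only a model, not kernel code: struct
model_folio and every model_*() helper are hypothetical stand-ins for
struct folio, folio_trylock(), folio_test_anon() and folio_test_ksm(),
and the rmap walk itself is reduced to a comment.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct folio (not the kernel type). */
struct model_folio {
	atomic_bool locked;	/* models the folio lock bit */
	bool anon;		/* models folio_test_anon() */
	bool ksm;		/* models folio_test_ksm() */
};

/* Returns true if the lock was free and is now held by the caller. */
static bool model_folio_trylock(struct model_folio *f)
{
	return !atomic_exchange(&f->locked, true);
}

static void model_folio_unlock(struct model_folio *f)
{
	atomic_store(&f->locked, false);
}

/* Current condition: non-KSM anon folios skip the trylock. */
static bool current_needs_trylock(const struct model_folio *f, bool is_locked)
{
	return !is_locked && (!f->anon || f->ksm);
}

/* Proposed condition: trylock whenever the caller does not hold the lock. */
static bool proposed_needs_trylock(const struct model_folio *f, bool is_locked)
{
	(void)f;
	return !is_locked;
}

/*
 * Models the lock handling around the walk. Returns 1 when the trylock
 * fails (mirroring the "return 1" in the quoted snippet), 0 otherwise.
 */
static int model_referenced(struct model_folio *f, bool is_locked, bool proposed)
{
	bool we_locked = false;
	bool need;

	assert(f);
	need = proposed ? proposed_needs_trylock(f, is_locked)
			: current_needs_trylock(f, is_locked);
	if (need) {
		we_locked = model_folio_trylock(f);
		if (!we_locked)
			return 1;
	}

	/* The rmap walk would run here. */

	if (we_locked)
		model_folio_unlock(f);
	return 0;
}
```

In this model, a non-KSM anon folio whose lock is held elsewhere is still
walked under the current condition, while the proposed condition bails
out with 1. That is the property that would let an updater of
folio->mapping rely on the folio lock instead of the anon_vma write-lock.
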
>
> Thanks, I’m personally quite interested in this topic and will take a
> closer look as well. Beyond this one userfaultfd move, we’ve observed
> severe anon_vma lock contention between fork, unmap (process exit), and
> memory reclamation. This has caused noticeable UI stutters, especially
> when many VMAs share the same anon_vma root.
>
> Thanks
> Barry


--
Best Regards,
Yan, Zi
