From: Dmitry Ilvokhin <d@ilvokhin.com>
To: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Kemeng Shi <shikemeng@huaweicloud.com>,
Kairui Song <kasong@tencent.com>, Nhat Pham <nphamcs@gmail.com>,
Baoquan He <bhe@redhat.com>, Barry Song <baohua@kernel.org>,
Chris Li <chrisl@kernel.org>,
Axel Rasmussen <axelrasmussen@google.com>,
Yuanchu Xie <yuanchu@google.com>, Wei Xu <weixugc@google.com>,
Kiryl Shutsemau <kas@kernel.org>,
Usama Arif <usamaarif642@gmail.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
kernel-team@meta.com, hughd@google.com, yangge1116@126.com,
david@redhat.com
Subject: Re: [PATCH v2] mm: skip folio_activate() for mlocked folios
Date: Wed, 8 Oct 2025 18:06:07 +0000
Message-ID: <aOaoD0HQk7YPeLkE@shell.ilvokhin.com>
In-Reply-To: <ltvv3v4vibvlglpch6urayotenavpzxc7klbcyowjb4wrv3e7z@pzovtvtbmnsp>
On Wed, Oct 08, 2025 at 09:17:49AM -0700, Shakeel Butt wrote:
> [Somehow I messed up the subject, so resending]
>
> Cc Hugh, yangge, David
>
> On Mon, Oct 06, 2025 at 01:25:26PM +0000, Dmitry Ilvokhin wrote:
> > __mlock_folio() does not move folio to unevictable LRU, when
> > folio_activate() removes folio from LRU.
> >
> > To prevent this case also check for folio_test_mlocked() in
> > folio_mark_accessed(). If folio is not yet marked as unevictable, but
> > already marked as mlocked, then skip folio_activate() call to allow
> > __mlock_folio() to make all necessary updates. It should be safe to skip
> > folio_activate() here, because mlocked folio should end up in
> > unevictable LRU eventually anyway.
> >
> > To observe the problem, mmap() and mlock() a big file and check the
> > Unevictable and Mlocked values from /proc/meminfo. On a freshly booted
> > system without any other mlocked memory we expect them to match or be
> > quite close.
> >
> > See below for more detailed reproduction steps. Source code of stat.c is
> > available at [1].
> >
> > $ head -c 8G < /dev/urandom > /tmp/random.bin
> >
> > $ cc -pedantic -Wall -std=c99 stat.c -O3 -o /tmp/stat
> > $ /tmp/stat
> > Unevictable: 8389668 kB
> > Mlocked: 8389700 kB
> >
> > The binary needs to be run twice. The problem does not reproduce on the
> > first run, but always reproduces on the second run.
> >
> > $ /tmp/stat
> > Unevictable: 5374676 kB
> > Mlocked: 8389332 kB
> >
> > [1]: https://gist.github.com/ilvokhin/e50c3d2ff5d9f70dcbb378c6695386dd
> >
> > Co-developed-by: Kiryl Shutsemau <kas@kernel.org>
> > Signed-off-by: Kiryl Shutsemau <kas@kernel.org>
> > Signed-off-by: Dmitry Ilvokhin <d@ilvokhin.com>
> > Acked-by: Usama Arif <usamaarif642@gmail.com>
> > ---
> > Changes in v2:
> > - Rephrase commit message: frame it in terms of unevictable LRU, not stat
> > accounting.
> >
> > mm/swap.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/mm/swap.c b/mm/swap.c
> > index 2260dcd2775e..f682f070160b 100644
> > --- a/mm/swap.c
> > +++ b/mm/swap.c
> > @@ -469,6 +469,16 @@ void folio_mark_accessed(struct folio *folio)
> > * this list is never rotated or maintained, so marking an
> > * unevictable page accessed has no effect.
> > */
> > + } else if (folio_test_mlocked(folio)) {
> > + /*
> > + * Pages that are mlocked, but not yet on unevictable LRU.
> > + * They might be still in mlock_fbatch waiting to be processed
> > + * and activating it here might interfere with
> > + * mlock_folio_batch(). __mlock_folio() will fail
> > + * folio_test_clear_lru() check and give up. It happens because
> > + * __folio_batch_add_and_move() clears LRU flag, when adding
> > + * folio to activate batch.
> > + */
>
> This makes sense as activating an mlocked folio should be a noop but I
> am wondering why we are seeing this now. By this, I mean mlock()ed
> memory being delayed to get to unevictable LRU. Also I remember Hugh
> recently [1] removed the difference between mlock percpu cache and other
> percpu caches of clearing LRU bit on entry. Does your repro work even
> with Hugh's changes or without it?
>
Thanks, Shakeel, for mentioning Hugh's patch; I was not aware of it.
Indeed, I can no longer reproduce the problem on top of Hugh's patch,
which makes sense, because folio_test_clear_lru() is gone from
__folio_batch_add_and_move().
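
To make the interaction concrete, here is a small user-space toy model of
the flag handling described in the patch comment above. It is only a sketch
under stated assumptions: struct toy_folio and every toy_* function are
hypothetical stand-ins and not kernel code, batching and draining are not
modeled at all, and clear_on_add just switches between "LRU flag cleared when
adding to the activate batch" (the behavior the patch comment describes) and
"LRU flag left alone" (what I see on top of Hugh's patch).

```c
#include <stdbool.h>

/* Toy stand-in for the three folio flags discussed in this thread. */
struct toy_folio {
	bool lru;		/* models the LRU flag */
	bool mlocked;		/* models the mlocked flag */
	bool unevictable;	/* models the unevictable flag */
};

/* Models folio_test_clear_lru(): clear the flag, return its old value. */
static bool toy_test_clear_lru(struct toy_folio *f)
{
	bool was_set = f->lru;

	f->lru = false;
	return was_set;
}

/* Models adding a folio to the activate batch (assumption: no draining). */
static void toy_queue_activate(struct toy_folio *f, bool clear_on_add)
{
	if (clear_on_add)
		f->lru = false;
}

/* Models __mlock_folio(): gives up when the LRU flag is already clear. */
static bool toy_mlock_folio(struct toy_folio *f)
{
	if (!toy_test_clear_lru(f))
		return false;
	f->unevictable = true;
	f->lru = true;
	return true;
}

/* Models folio_mark_accessed() with the proposed check made optional. */
static void toy_mark_accessed(struct toy_folio *f, bool skip_mlocked,
			      bool clear_on_add)
{
	if (f->unevictable)
		return;
	if (skip_mlocked && f->mlocked)
		return;
	toy_queue_activate(f, clear_on_add);
}

/* Returns true if an mlocked folio ends up marked unevictable. */
static bool toy_scenario(bool skip_mlocked, bool clear_on_add)
{
	struct toy_folio f = { .lru = true, .mlocked = true };

	toy_mark_accessed(&f, skip_mlocked, clear_on_add);
	return toy_mlock_folio(&f);
}
```

In this model toy_scenario(false, true) is the only combination where the
folio is left off the unevictable list; either the mlocked check or leaving
the LRU flag alone on batch add avoids it.
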
Now I wonder whether the folio_test_mlocked() check still makes sense in
the current codebase.
> [1] https://lore.kernel.org/all/05905d7b-ed14-68b1-79d8-bdec30367eba@google.com/
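
For readers who want to try the check from the commit message without
opening the gist, the steps can be sketched roughly as below. This is not the
stat.c from the gist: the helper names (meminfo_line_kb, meminfo_kb,
mlock_file_and_report) are made up for illustration, and PROT_READ with
MAP_PRIVATE is an assumption about how the file is mapped. It needs a
sufficient RLIMIT_MEMLOCK (or root) for mlock() to succeed on a large file.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Parse one /proc/meminfo-style line; return the kB value or -1. */
static long meminfo_line_kb(const char *line, const char *key)
{
	size_t n = strlen(key);
	long kb;

	if (strncmp(line, key, n) != 0 || line[n] != ':')
		return -1;
	if (sscanf(line + n + 1, "%ld", &kb) != 1)
		return -1;
	return kb;
}

/* Look up one counter in /proc/meminfo; return -1 if it is not found. */
static long meminfo_kb(const char *key)
{
	char line[256];
	long kb = -1;
	FILE *fp = fopen("/proc/meminfo", "r");

	if (!fp)
		return -1;
	while (kb < 0 && fgets(line, sizeof(line), fp))
		kb = meminfo_line_kb(line, key);
	fclose(fp);
	return kb;
}

/* mmap() and mlock() the whole file, then print the two counters. */
static int mlock_file_and_report(const char *path)
{
	struct stat st;
	void *p;
	int fd = open(path, O_RDONLY);

	if (fd < 0)
		return -1;
	if (fstat(fd, &st) < 0 || st.st_size == 0) {
		close(fd);
		return -1;
	}
	p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
	close(fd);
	if (p == MAP_FAILED)
		return -1;
	if (mlock(p, (size_t)st.st_size) < 0) {
		munmap(p, (size_t)st.st_size);
		return -1;
	}
	printf("Unevictable: %ld kB\n", meminfo_kb("Unevictable"));
	printf("Mlocked: %ld kB\n", meminfo_kb("Mlocked"));
	munmap(p, (size_t)st.st_size);
	return 0;
}
```

Calling mlock_file_and_report("/tmp/random.bin") from main() and running the
resulting binary twice would correspond to the two /tmp/stat runs quoted
above, assuming the gist does something similar.
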