From: Zhang Peng <zippermonkey@icloud.com>
To: pfalcato@suse.de
Cc: Liam.Howlett@oracle.com, akpm@linux-foundation.org,
axelrasmussen@google.com, bruzzhang@tencent.com,
david@kernel.org, hannes@cmpxchg.org, kasong@tencent.com,
linux-kernel@vger.kernel.org, linux-mm@kvack.org, ljs@kernel.org,
mhocko@kernel.org, mhocko@suse.com, rppt@kernel.org,
shakeel.butt@linux.dev, surenb@google.com, vbabka@kernel.org,
weixugc@google.com, yuanchu@google.com,
zhengqi.arch@bytedance.com, zippermonkey@icloud.com
Subject: Re: [PATCH v2 5/5] mm/vmscan: flush TLB for every 31 folios evictions
Date: Mon, 30 Mar 2026 11:19:08 +0800 [thread overview]
Message-ID: <74129cfb-a7f3-4f28-8242-813166616205@icloud.com> (raw)
In-Reply-To: <tq5bx3gz2mzqyf3mylay22bcdo2sxt5urvpbblqej4v4sbk3q6@hwjqs4y3ghxf>
On Thu, Mar 26, 2026 at 12:40:50PM +0000, Pedro Falcato wrote:
> On Thu, Mar 26, 2026 at 04:36:21PM +0800, Zhang Peng wrote:
> > +	folio_batch_reinit(fbatch);
> > +	for (i = 0; i < count; ++i) {
> > +		folio = fbatch->folios[i];
> > +		if (!folio_trylock(folio)) {
> > +			list_add(&folio->lru, ret_folios);
> > +			continue;
> > +		}
> > +
> > +		if (folio_test_writeback(folio) || folio_test_lru(folio) ||
>
> If PG_lru is set here, we're in a world of trouble as we're actively using
> folio->lru. I don't think it's possible for it to be set, as isolating
> folios clears lru, and refcount bump means the folio cannot be reused or
> reinserted back on the LRU. So perhaps:
> VM_WARN_ON_FOLIO(folio_test_lru(folio), folio);
Agreed. The folio was isolated from LRU in shrink_folio_list()'s caller
with PG_lru cleared, and we hold a reference throughout. There's no path
that can re-set PG_lru while we hold the folio. Will replace the
folio_test_lru() check with VM_WARN_ON_FOLIO.
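Concretely, a sketch of what the revised check would look like (against the
quoted hunk, not the final patch):

```
	VM_WARN_ON_FOLIO(folio_test_lru(folio), folio);
	if (folio_test_writeback(folio) || folio_mapped(folio) ||
	    folio_maybe_dma_pinned(folio)) {
		folio_unlock(folio);
		list_add(&folio->lru, ret_folios);
		continue;
	}
```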
> > +		    folio_mapped(folio) || folio_maybe_dma_pinned(folio)) {
> > +			folio_unlock(folio);
> > +			list_add(&folio->lru, ret_folios);
> > +			continue;
> > +		}
> > +
> > +		folio_batch_add(fbatch, folio);
> > +	}
> > +
> > +	i = 0;
> > +	count = folio_batch_count(fbatch);
> > +	if (!count)
> > +		return;
> > +	/* One TLB flush for the batch */
> > +	try_to_unmap_flush_dirty();
> > +	for (i = 0; i < count; ++i) {
> > +		folio = fbatch->folios[i];
> > +		pageout_one(folio, ret_folios, free_folios, sc, stat, plug,
> > +			    folio_list);
>
> Would be lovely if we could pass the batch down to the swap layer.
Agreed, that would be the logical next step. For now I kept the scope
small to just batch the TLB flush, but submitting swap IO in batches
could further reduce per-folio overhead. Will look into that as a
follow-up.
> > +	}
> > +	folio_batch_reinit(fbatch);
>
> The way you keep reinitializing fbatch is a bit confusing.
> Probably worth a comment or two (or kdocs for pageout_batch documenting
> that the folio batch is reset, etc).
Will add kdocs. The first reinit saves the original folios into local
variables and resets the batch for reuse as a "still-valid" set. The
second reinit cleans up after pageout_one() has consumed them. Will make
this two-phase usage explicit in the documentation.
> > +	folio_unlock(folio);
>
> Why is the folio unlocked? I don't see the need to take the lock trip twice.
> Is there something I'm missing?
We should not hold the folio lock longer than necessary. The folio sits
in flush_folios while the main loop continues scanning the remaining
folios; accumulating a full batch can mean processing up to 31 more
folios through trylock, unmap, swap allocation, etc.
During that window the folio is still in swap cache and findable by
other CPUs. For example, do_swap_page() can look it up via
swap_cache_get_folio() and then block at folio_lock_or_retry() waiting
for us to finish accumulating. That is a direct stall on a process
trying to fault in its own memory.
Unlocking here and relocking in pageout_batch() is the same pattern used
by the demote path a few lines above.
Thread overview: 8+ messages
2026-03-26 8:36 [PATCH v2 0/5] mm: batch TLB flushing for dirty folios in vmscan Zhang Peng
2026-03-26 8:36 ` [PATCH v2 1/5] mm/vmscan: track reclaimed pages in reclaim_stat Zhang Peng
2026-03-26 8:36 ` [PATCH v2 2/5] mm/vmscan: extract folio activation into folio_active_bounce() Zhang Peng
2026-03-26 8:36 ` [PATCH v2 3/5] mm/vmscan: extract folio_free() and pageout_one() Zhang Peng
2026-03-26 8:36 ` [PATCH v2 4/5] mm/vmscan: extract folio unmap logic into folio_try_unmap() Zhang Peng
2026-03-26 8:36 ` [PATCH v2 5/5] mm/vmscan: flush TLB for every 31 folios evictions Zhang Peng
2026-03-26 12:40 ` Pedro Falcato
2026-03-30 3:19 ` Zhang Peng [this message]