linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Johannes Weiner <hannes@cmpxchg.org>
To: Kairui Song <ryncsn@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Yosry Ahmed <yosryahmed@google.com>,
	Nhat Pham <nphamcs@gmail.com>,
	Chengming Zhou <chengming.zhou@linux.dev>,
	Chris Li <chrisl@kernel.org>, Barry Song <v-songbaohua@oppo.com>,
	"Huang, Ying" <ying.huang@intel.com>,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm, zswap: don't touch the XArray lock if there is no entry to free
Date: Fri, 18 Oct 2024 18:38:55 -0400	[thread overview]
Message-ID: <20241018223855.GC81612@cmpxchg.org> (raw)
In-Reply-To: <CAMgjq7AjBMJAE-rj2MmB53FrQKcsARK5tZ3sKB4+uhWhkQ=EGA@mail.gmail.com>

On Sat, Oct 19, 2024 at 04:01:18AM +0800, Kairui Song wrote:
> On Sat, Oct 19, 2024 at 3:46 AM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Sat, Oct 19, 2024 at 03:25:25AM +0800, Kairui Song wrote:
> > >       if (xa_empty(tree))
> > >               return;
> > >
> > > -     entry = xa_erase(tree, offset);
> > > -     if (entry)
> > > +     rcu_read_lock();
> > > +     entry = xas_load(&xas);
> > > +     if (entry) {
> >
> > You should call xas_reset() here.  And I'm not sure it's a great idea to
> > spin waiting for the xa lock while holding the RCU read lock?  Probably
> > not awful but I could easily be wrong.

Spinlocks already implicitly acquire an RCU read-side lock before
beginning to spin, so we shouldn't be worse for wear by doing this.

> Thanks for the review. I thought about it, that could cancel this optimization.
> 
> Oh, and there is a thing I forgot to mention (maybe I should add some
> comments about it?). If xas_load found an entry, that entry must be
> pinned by HAS_CACHE or swap slot count right now, and one entry can
> only be freed once.
> So it should be safe here?
> 
> This might be a little fragile though, maybe this optimization can
> better be done after some zswap invalidation path cleanup.

This seems fine too, exlusivity during invalidation is a fundamental
property of swap. If a load were possible, we'd be freeing an entry
with ptes pointing to it (or readahead a slot whose backing space has
been discarded). If a store were possible, we could write new data
into a dead slot and lose it. Even the swapcache bypass path in
do_swap_page() must at least acquire HAS_CACHE due to this.

So from a swap POV, if we find an entry here it's guaranteed to remain
in the tree by the calling context. The xa lock is for protection the
tree structure against concurrent changes (e.g. from adjacent entries).

With that said, is there still a way for the tree to change internally
before we acquire the lock? Such that tree + index might end up
pointing to the same contents in a different memory location?

AFAIK there are two possible ways:

- xas_split() - this shouldn't be possible because we don't do large
  entries inside the zswap trees.

- xas_shrink() - this could move the entry from a node to xa->head,
  iff it's the last entry in the tree and its index is 0. Swap offset
  0 is never a valid swap entry (swap header), but unfortunately we
  have split trees so it could happen to any offset that is a multiple
  of SWAP_ADDRESS_SPACE_PAGES. AFAICS xas_store() doesn't detect such
  a transition. And making it do that honestly sounds a bit hairy...

So this doesn't look safe to me without a reload :(


      parent reply	other threads:[~2024-10-18 22:39 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-10-18 19:25 Kairui Song
2024-10-18 19:46 ` Matthew Wilcox
2024-10-18 20:01   ` Kairui Song
2024-10-18 20:40     ` Yosry Ahmed
2024-10-18 20:55       ` Matthew Wilcox
2024-10-18 21:00         ` Yosry Ahmed
2024-10-18 21:38           ` Matthew Wilcox
2024-10-18 22:28             ` Yosry Ahmed
2024-10-18 22:38     ` Johannes Weiner [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20241018223855.GC81612@cmpxchg.org \
    --to=hannes@cmpxchg.org \
    --cc=akpm@linux-foundation.org \
    --cc=chengming.zhou@linux.dev \
    --cc=chrisl@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=nphamcs@gmail.com \
    --cc=ryncsn@gmail.com \
    --cc=v-songbaohua@oppo.com \
    --cc=willy@infradead.org \
    --cc=ying.huang@intel.com \
    --cc=yosryahmed@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox