From: Kairui Song <ryncsn@gmail.com>
To: Johannes Weiner <hannes@cmpxchg.org>
Cc: Yosry Ahmed <yosry.ahmed@linux.dev>,
lsf-pc@lists.linux-foundation.org, linux-mm <linux-mm@kvack.org>,
Andrew Morton <akpm@linux-foundation.org>,
Chris Li <chrisl@kernel.org>,
Chengming Zhou <chengming.zhou@linux.dev>,
Shakeel Butt <shakeel.butt@linux.dev>,
Hugh Dickins <hughd@google.com>,
Matthew Wilcox <willy@infradead.org>,
Barry Song <21cnbao@gmail.com>, Nhat Pham <nphamcs@gmail.com>,
Usama Arif <usamaarif642@gmail.com>,
Ryan Roberts <ryan.roberts@arm.com>,
"Huang, Ying" <ying.huang@linux.alibaba.com>
Subject: Re: [LSF/MM/BPF TOPIC] Integrate Swap Cache, Swap Maps with Swap Allocator
Date: Wed, 5 Feb 2025 03:25:20 +0800
Message-ID: <CAMgjq7CL1C_26K1UnK+CidQ2=G+mVY6HTROXTDv1zJWJPcMYYw@mail.gmail.com>
In-Reply-To: <20250204190904.GC705532@cmpxchg.org>
On Wed, Feb 5, 2025 at 3:09 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Wed, Feb 05, 2025 at 02:38:39AM +0800, Kairui Song wrote:
> > On Wed, Feb 5, 2025 at 2:11 AM Yosry Ahmed <yosry.ahmed@linux.dev> wrote:
> > > However, what we should *not* do is have these clusters be tied to the
> > > disk swap space with the ability to redirect some entries to use
> > > something like zswap. This does not fix the problem Johannes is
> > > describing.
> >
> > Yes, a virtual swap file can have its own swap space, which is indexed
> > by the cache / table, and reuse all the logic. As long as we don't
> > dramatically change the kernel swapout path, adding a folio to
> > swapcache seems a very reasonable way to avoid redundant IO, and
> > synchronize it upon swapin/swapout, and reusing a lot of
> > infrastructure, even if that's a virtual file. For example a current
> > busy loop issue can be just fixed by leveraging the folio lock:
> > https://lore.kernel.org/lkml/CAMgjq7D5qoFEK9Omvd5_Zqs6M+TEoG03+2i_mhuP5CQPSOPrmQ@mail.gmail.com/
> >
> > The virtual file/space can be decoupled from the lower device. But the
> > virtual file/space's table entry can point to an underlying physical
> > SWAP device or some meta struct.
>
> It's a bit unclear to me still which level will use the struct
> swap_cluster_info in the layered scenario.
>
> Would it be the virtual address space, where ->table has tagged
> pointers to resolve to swapcache/zeromap/zswap/swapfile?
>
> Or would it be the swapfile space, where ->table resolves to disk
> slots?
>
> Or are you proposing to use the same struct on both levels, with
> ->table catering to different needs?
I was thinking about the first case: in the virtual address space,
->table[n] resolves to an offset in a lower (physical) layer or to some
other meta structure. We would still reuse the same struct for both
layers. On the lower layer, the table could be in dense mode for used
clusters (3 bytes (memcg + count), or even 1 bit per entry, depending
on how we want to store info like memcg_id).

This also brings a nice side effect (feature): we can have multiple
swap files/devices, and if the upper one (virtual or not) is full, it
can fall back to the lower one just fine.
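
A minimal userspace sketch of the first case, where each ->table[n]
slot in the virtual layer is a tagged word that resolves either to an
offset in a lower (physical) layer or to a pointer to some other meta
structure. Everything here is an assumption made for illustration: the
vswap_* names, the 2-bit tag layout, and the encoding are hypothetical
and are not taken from the kernel or from any posted patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical tags stored in the low 2 bits of a table entry. */
enum vswap_tag {
	VSWAP_FREE = 0,	/* slot is unused */
	VSWAP_PHYS = 1,	/* payload is an offset in a lower (physical) layer */
	VSWAP_META = 2,	/* payload is a pointer to some other meta structure */
};

#define VSWAP_TAG_BITS	2
#define VSWAP_TAG_MASK	((uintptr_t)((1u << VSWAP_TAG_BITS) - 1))

/* One slot of the (hypothetical) virtual-layer table. */
typedef uintptr_t vswap_entry_t;

/* Encode a physical offset; the top VSWAP_TAG_BITS of offset are lost. */
static vswap_entry_t vswap_mk_phys(uintptr_t offset)
{
	return (offset << VSWAP_TAG_BITS) | VSWAP_PHYS;
}

/* Encode a meta pointer; it must be aligned so the low tag bits are zero. */
static vswap_entry_t vswap_mk_meta(void *meta)
{
	assert(((uintptr_t)meta & VSWAP_TAG_MASK) == 0);
	return (uintptr_t)meta | VSWAP_META;
}

/* Return which kind of payload the entry carries. */
static enum vswap_tag vswap_entry_tag(vswap_entry_t e)
{
	return (enum vswap_tag)(e & VSWAP_TAG_MASK);
}

/* Decode the lower-layer offset; only valid for VSWAP_PHYS entries. */
static uintptr_t vswap_phys_offset(vswap_entry_t e)
{
	return e >> VSWAP_TAG_BITS;
}

/* Decode the meta pointer; only valid for VSWAP_META entries. */
static void *vswap_meta(vswap_entry_t e)
{
	return (void *)(e & ~VSWAP_TAG_MASK);
}
```

With an encoding like this, a lookup in the upper layer checks the tag
first and then either follows the offset into the lower layer or
dereferences the meta structure.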
>
> Keep in mind, in the virtualized case, it's the top layer that would
> have to keep track of the page table count, the swapcache pointer and
> likely the memcg linkage. That also means the physical layer could
> likely be reduced to a single bit per entry - used or free.
>
> I suppose void *table could also point to such a bitmap? But not sure
> about the other members that would become redundant/unused.
That's very doable. I also wanted shmem to have a dense table, which
may also reduce each entry to a single bit.
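
A rough userspace sketch of what "a single bit per entry" could look
like for a physical-layer cluster: a bitmap where a set bit means the
slot is used. The names (phys_cluster_bitmap, phys_slot_*) and the
512-slot cluster size are assumptions for illustration only, and the
linear scan stands in for whatever allocation policy the real
allocator would use.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed cluster size for this demo; not taken from the thread. */
#define DEMO_CLUSTER_SLOTS	512
#define DEMO_BITS_PER_WORD	(sizeof(unsigned long) * CHAR_BIT)
#define DEMO_CLUSTER_WORDS \
	((DEMO_CLUSTER_SLOTS + DEMO_BITS_PER_WORD - 1) / DEMO_BITS_PER_WORD)

/* Hypothetical dense table: one bit per slot, 1 = used, 0 = free. */
struct phys_cluster_bitmap {
	unsigned long used[DEMO_CLUSTER_WORDS];
};

/* Test whether a slot is currently marked used. */
static bool phys_slot_used(const struct phys_cluster_bitmap *c, size_t slot)
{
	return (c->used[slot / DEMO_BITS_PER_WORD] >>
		(slot % DEMO_BITS_PER_WORD)) & 1UL;
}

/* Mark the first free slot used and return it, or -1 if the cluster is full. */
static long phys_slot_alloc(struct phys_cluster_bitmap *c)
{
	for (size_t slot = 0; slot < DEMO_CLUSTER_SLOTS; slot++) {
		if (!phys_slot_used(c, slot)) {
			c->used[slot / DEMO_BITS_PER_WORD] |=
				1UL << (slot % DEMO_BITS_PER_WORD);
			return (long)slot;
		}
	}
	return -1;
}

/* Mark a slot free again. */
static void phys_slot_free(struct phys_cluster_bitmap *c, size_t slot)
{
	c->used[slot / DEMO_BITS_PER_WORD] &=
		~(1UL << (slot % DEMO_BITS_PER_WORD));
}
```

In this layout the page table count, swapcache pointer and memcg
linkage would all live in the upper layer, as described in the quoted
text; the bitmap only answers "used or free".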