From: Kairui Song <ryncsn@gmail.com>
To: YoungJun Park <youngjun.park@lge.com>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Kemeng Shi <shikemeng@huaweicloud.com>,
Nhat Pham <nphamcs@gmail.com>, Baoquan He <bhe@redhat.com>,
Barry Song <baohua@kernel.org>,
Johannes Weiner <hannes@cmpxchg.org>,
David Hildenbrand <david@kernel.org>,
Lorenzo Stoakes <lorenzo.stoakes@oracle.com>,
linux-kernel@vger.kernel.org, Chris Li <chrisl@kernel.org>
Subject: Re: [PATCH v2 09/12] mm, swap: use the swap table to track the swap count
Date: Mon, 2 Feb 2026 11:27:23 +0800
Message-ID: <CAMgjq7BYp8XwzzRiFDs39QDs3ZnK5p=ZZ+4i5Pne846OSfbqRw@mail.gmail.com>
In-Reply-To: <aXsaNsUFCiHYrECk@yjaykim-PowerEdge-T330>
On Thu, Jan 29, 2026 at 4:28 PM YoungJun Park <youngjun.park@lge.com> wrote:
>
> On Wed, Jan 28, 2026 at 05:28:33PM +0800, Kairui Song wrote:
> > From: Kairui Song <kasong@tencent.com>
>
> > index bfafa637c458..751430e2d2a5 100644
> > --- a/mm/swap.h
> > +++ b/mm/swap.h
> > @@ -37,6 +37,7 @@ struct swap_cluster_info {
> > u8 flags;
> > u8 order;
> > atomic_long_t __rcu *table; /* Swap table entries, see mm/swap_table.h */
> > + unsigned long *extend_table; /* For large swap count, protected by ci->lock */
>
> I assume using 'int *' is to save memory on 64-bit architectures (8 bytes ->
> 4 bytes per entry), which aligns with swp_tb_get_count() returning an int.
Right, I used long because I wasn't sure whether we would ever need a
counter larger than int, but a folio's reference count is already an
int, so int * should be enough here. Thanks for the suggestion!
>
> Regarding the extended reference table.
> While I agree that a simple array is better for speed, readability and so on, the
> 2KB overhead (assuming SWAPFILE_CLUSTER=256) might be significant in
> constrained environments when only a few entries overflow SWP_TB_COUNT_MAX.
Indeed, but note that before this change we also had a 4K overhead if
only one or a few entries overflowed CONT_MAX in a given range. That 4K
covered a larger range, though. And entries with very large counts seem
to be very rare in practice.
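For reference, the arithmetic behind the 2KB figure can be checked with a small userspace sketch (not kernel code). SWAPFILE_CLUSTER = 256 is the value assumed in the quoted text, and extend_table_bytes() is a hypothetical helper used only for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value, taken from the discussion above; not read from a real config. */
#define SWAPFILE_CLUSTER 256

/* Hypothetical helper: bytes needed for one per-cluster extended count table. */
static size_t extend_table_bytes(size_t entry_size)
{
	assert(entry_size > 0);
	return (size_t)SWAPFILE_CLUSTER * entry_size;
}
```

With 8-byte entries (unsigned long on 64-bit) this gives 2048 bytes per cluster that needs the table, and with 4-byte entries (int) it gives 1024 bytes.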
>
> Have you considered using a resizable hash table (or something similar)
> instead? I am curious whether this approach could be applied
> as a future optimization after the current code is merged.
Yeah, I do have several ideas about how to optimize it :)
For now, using a single plain extended table simplifies things a lot.
I tested some common workloads with SWP_TB_COUNT_MAX == 2, and the
memory consumption and performance overhead both look good.
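To illustrate the general idea of a plain extended table, here is a minimal userspace sketch. It is not the code from the patch: struct cluster_sketch, the sketch_*() helpers, CLUSTER_SLOTS and INLINE_COUNT_MAX are all hypothetical names, the way the count is split between the inline field and the table is an assumption, and locking (ci->lock in the real code) is left out:

```c
#include <assert.h>
#include <stdlib.h>

#define CLUSTER_SLOTS    256	/* assumed SWAPFILE_CLUSTER */
#define INLINE_COUNT_MAX 2	/* stand-in for SWP_TB_COUNT_MAX as used in the test */

/* Hypothetical stand-in for the per-cluster state. */
struct cluster_sketch {
	unsigned char inline_count[CLUSTER_SLOTS];	/* small count kept per slot */
	int *extend_table;				/* allocated on first overflow */
};

/* Add one reference; spill into the extended table once the inline count is full. */
static int sketch_inc(struct cluster_sketch *ci, unsigned int off)
{
	if (ci->inline_count[off] < INLINE_COUNT_MAX) {
		ci->inline_count[off]++;
		return 0;
	}
	if (!ci->extend_table) {
		ci->extend_table = calloc(CLUSTER_SLOTS, sizeof(*ci->extend_table));
		if (!ci->extend_table)
			return -1;
	}
	ci->extend_table[off]++;
	return 0;
}

/* Drop one reference; drain the extended part before touching the inline count. */
static void sketch_dec(struct cluster_sketch *ci, unsigned int off)
{
	if (ci->extend_table && ci->extend_table[off] > 0)
		ci->extend_table[off]--;
	else if (ci->inline_count[off] > 0)
		ci->inline_count[off]--;
}

/* Total count is the inline part plus whatever spilled into the table. */
static int sketch_count(const struct cluster_sketch *ci, unsigned int off)
{
	int extra = ci->extend_table ? ci->extend_table[off] : 0;

	return ci->inline_count[off] + extra;
}
```

In this sketch a cluster pays for the table only after some slot in it overflows the inline count, which is the memory trade-off discussed above. Freeing the table once no slot needs it is omitted for brevity.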
A later idea is that we might be able to move the swap count into the
folio struct for cached folios, and remove the anon shadow completely so
that we only store the count for swapped-out entries. That way we would
always have zero overhead even if the swap count is very large. That
requires some tweaks on the LRU side.
Or, if that's not doable, we can use other ideas like the one you suggested.