From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: Yosry Ahmed <yosryahmed@google.com>
Cc: Nhat Pham <nphamcs@gmail.com>,
akpm@linux-foundation.org, hannes@cmpxchg.org, hughd@google.com,
shakeel.butt@linux.dev, ryan.roberts@arm.com,
ying.huang@intel.com, chrisl@kernel.org, david@redhat.com,
kasong@tencent.com, willy@infradead.org, viro@zeniv.linux.org.uk,
baohua@kernel.org, chengming.zhou@linux.dev, linux-mm@kvack.org,
kernel-team@meta.com, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/2] remove SWAP_MAP_SHMEM
Date: Tue, 24 Sep 2024 11:25:08 +0800
Message-ID: <9a110f20-42ad-468b-96c6-683e162452a9@linux.alibaba.com>
In-Reply-To: <CAJD7tkamKcaqHR5V+4+9ixmFc3dC2NnGcu7YzdXqxqNEe8FqqA@mail.gmail.com>
On 2024/9/24 10:15, Yosry Ahmed wrote:
> On Mon, Sep 23, 2024 at 6:55 PM Baolin Wang
> <baolin.wang@linux.alibaba.com> wrote:
>>
>>
>>
>> On 2024/9/24 07:11, Nhat Pham wrote:
>>> The SWAP_MAP_SHMEM state was originally introduced in the commit
>>> aaa468653b4a ("swap_info: note SWAP_MAP_SHMEM"), to quickly determine if a
>>> swap entry belongs to shmem during swapoff.
>>>
>>> However, swapoff has since been rewritten drastically in the commit
>>> b56a2d8af914 ("mm: rid swapoff of quadratic complexity"). Now
>>> having swap count == SWAP_MAP_SHMEM value is basically the same as having
>>> swap count == 1, and swap_shmem_alloc() behaves analogously to
>>> swap_duplicate().
>>>
>>> This RFC proposes the removal of this state and the associated helper to
>>> simplify the state machine (both mentally and code-wise). We will also
>>> have an extra state/special value that can be repurposed (for swap entries
>>> that never get re-duplicated).
>>>
>>> Another motivation (albeit a bit premature at the moment) is the new swap
>>> abstraction I am currently working on, that would allow for swap/zswap
>>> decoupling, swapoff optimization, etc. The fewer states and swap API
>>> functions there are, the simpler the conversion will be.
>>>
>>> I am sending this series first as an RFC, just in case I missed something
>>> or misunderstood this state, or if someone has a swap optimization in mind
>>> for shmem that would require this special state.
>>
>> The idea makes sense to me. I did a quick test with shmem mTHP, and
>> encountered the following warning which is triggered by
>> 'VM_WARN_ON(usage == 1 && nr > 1)' in __swap_duplicate().
>
> Apparently __swap_duplicate() does not currently handle increasing the
> swap count for multiple swap entries by 1 (i.e. usage == 1) because it
> does not handle rolling back count increases when
> swap_count_continued() fails.
>
> I guess this voids my Reviewed-by until we sort this out. Technically
> swap_count_continued() won't ever be called for shmem because we only
> ever increment the count by 1, but there is no way to know this in
> __swap_duplicate() without SWAP_MAP_SHMEM.

Agreed. An easy solution might be to add a new boolean parameter to
__swap_duplicate() that indicates whether the count is being increased
for shmem swap entries, for example:

diff --git a/mm/swapfile.c b/mm/swapfile.c
index cebc244ee60f..21f1eec2c30a 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3607,7 +3607,7 @@ void si_swapinfo(struct sysinfo *val)
  * - swap-cache reference is requested but the entry is not used. -> ENOENT
  * - swap-mapped reference requested but needs continued swap count. -> ENOMEM
  */
-static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
+static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr, bool shmem)
 {
 	struct swap_info_struct *si;
 	struct swap_cluster_info *ci;
@@ -3620,7 +3620,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)

 	offset = swp_offset(entry);
 	VM_WARN_ON(nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER);
-	VM_WARN_ON(usage == 1 && nr > 1);
+	VM_WARN_ON(usage == 1 && nr > 1 && !shmem);
 	ci = lock_cluster_or_swap_info(si, offset);

 	err = 0;
@@ -3661,7 +3661,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
 			has_cache = SWAP_HAS_CACHE;
 		else if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX)
 			count += usage;
-		else if (swap_count_continued(si, offset + i, count))
+		else if (!shmem && swap_count_continued(si, offset + i, count))
 			count = COUNT_CONTINUED;
 		else {
 			/*
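
To make the rollback concern concrete, here is a minimal userspace sketch. It is not kernel code: toy_duplicate_nr(), TOY_MAP_MAX and TOY_ENOMEM are hypothetical names invented for this illustration, and a saturated count is simply treated as the case where swap_count_continued() would fail. It shows that a batched increment by 1 which fails partway leaves the earlier entries incremented unless the caller rolls back, and that the problem never arises when no count in the range is near the maximum (which is the shmem situation described above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy limit; it only plays the role of SWAP_MAP_MAX here. */
#define TOY_MAP_MAX 0x3e

enum { TOY_OK = 0, TOY_ENOMEM = -12 };

/*
 * Increment nr consecutive toy counts by one, starting at map[offset].
 *
 * A count that has already reached TOY_MAP_MAX models the case where a
 * continuation would be needed and cannot be set up, so the call fails.
 * If rollback is false, entries before the failing one stay incremented,
 * which is the inconsistency the VM_WARN_ON() guards against.
 */
static int toy_duplicate_nr(unsigned char *map, size_t offset, int nr,
			    bool rollback)
{
	int i;

	assert(nr > 0);

	for (i = 0; i < nr; i++) {
		if (map[offset + i] >= TOY_MAP_MAX) {
			if (rollback) {
				/* Undo the increments done for 0..i-1. */
				while (i-- > 0)
					map[offset + i]--;
			}
			return TOY_ENOMEM;
		}
		map[offset + i]++;
	}
	return TOY_OK;
}
```

With counts {1, 1, TOY_MAP_MAX, 1} and rollback disabled, the call fails but the first two entries end up at 2. With rollback enabled they return to 1. With counts {1, 1, 1, 1} the call always succeeds, so neither rollback nor a continuation is needed.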