linux-mm.kvack.org archive mirror
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: Yosry Ahmed <yosryahmed@google.com>
Cc: Nhat Pham <nphamcs@gmail.com>,
	akpm@linux-foundation.org, hannes@cmpxchg.org, hughd@google.com,
	shakeel.butt@linux.dev, ryan.roberts@arm.com,
	ying.huang@intel.com, chrisl@kernel.org, david@redhat.com,
	kasong@tencent.com, willy@infradead.org, viro@zeniv.linux.org.uk,
	baohua@kernel.org, chengming.zhou@linux.dev, linux-mm@kvack.org,
	kernel-team@meta.com, linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH 0/2] remove SWAP_MAP_SHMEM
Date: Tue, 24 Sep 2024 11:25:08 +0800
Message-ID: <9a110f20-42ad-468b-96c6-683e162452a9@linux.alibaba.com>
In-Reply-To: <CAJD7tkamKcaqHR5V+4+9ixmFc3dC2NnGcu7YzdXqxqNEe8FqqA@mail.gmail.com>



On 2024/9/24 10:15, Yosry Ahmed wrote:
> On Mon, Sep 23, 2024 at 6:55 PM Baolin Wang
> <baolin.wang@linux.alibaba.com> wrote:
>>
>>
>>
>> On 2024/9/24 07:11, Nhat Pham wrote:
>>> The SWAP_MAP_SHMEM state was originally introduced in the commit
>>> aaa468653b4a ("swap_info: note SWAP_MAP_SHMEM"), to quickly determine if a
>>> swap entry belongs to shmem during swapoff.
>>>
>>> However, swapoff has since been rewritten drastically in the commit
>>> b56a2d8af914 ("mm: rid swapoff of quadratic complexity"). Now
>>> having swap count == SWAP_MAP_SHMEM value is basically the same as having
>>> swap count == 1, and swap_shmem_alloc() behaves analogously to
>>> swap_duplicate().
>>>
>>> This RFC proposes the removal of this state and the associated helper to
>>> simplify the state machine (both mentally and code-wise). We will also
>>> have an extra state/special value that can be repurposed (for swap entries
>>> that never get re-duplicated).
>>>
>>> Another motivation (albeit a bit premature at the moment) is the new swap
>>> abstraction I am currently working on, that would allow for swap/zswap
>>> decoupling, swapoff optimization, etc. The fewer states and swap API
>>> functions there are, the simpler the conversion will be.
>>>
>>> I am sending this series first as an RFC, just in case I missed something
>>> or misunderstood this state, or if someone has a swap optimization in mind
>>> for shmem that would require this special state.
>>
>> The idea makes sense to me. I did a quick test with shmem mTHP, and
>> encountered the following warning which is triggered by
>> 'VM_WARN_ON(usage == 1 && nr > 1)' in __swap_duplicate().
> 
> Apparently __swap_duplicate() does not currently handle increasing the
> swap count for multiple swap entries by 1 (i.e. usage == 1) because it
> does not handle rolling back count increases when
> swap_count_continued() fails.
> 
> I guess this voids my Reviewed-by until we sort this out. Technically
> swap_count_continued() won't ever be called for shmem because we only
> ever increment the count by 1, but there is no way to know this in
> __swap_duplicate() without SWAP_MAP_SHMEM.

Agreed. An easy solution might be to add a new boolean parameter that
tells __swap_duplicate() the count is being increased for shmem swap
entries:

diff --git a/mm/swapfile.c b/mm/swapfile.c
index cebc244ee60f..21f1eec2c30a 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3607,7 +3607,7 @@ void si_swapinfo(struct sysinfo *val)
   * - swap-cache reference is requested but the entry is not used. -> ENOENT
   * - swap-mapped reference requested but needs continued swap count. -> ENOMEM
   */
-static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
+static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr, bool shmem)
  {
         struct swap_info_struct *si;
         struct swap_cluster_info *ci;
@@ -3620,7 +3620,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)

         offset = swp_offset(entry);
         VM_WARN_ON(nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER);
-       VM_WARN_ON(usage == 1 && nr > 1);
+       VM_WARN_ON(usage == 1 && nr > 1 && !shmem);
         ci = lock_cluster_or_swap_info(si, offset);

         err = 0;
@@ -3661,7 +3661,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr)
                         has_cache = SWAP_HAS_CACHE;
                 else if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX)
                         count += usage;
-               else if (swap_count_continued(si, offset + i, count))
+               else if (!shmem && swap_count_continued(si, offset + i, count))
                         count = COUNT_CONTINUED;
                 else {
                         /*
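
To make the rollback concern concrete, here is a minimal user-space sketch of the update loop, not kernel code. Everything prefixed with toy_ or TOY_ is hypothetical: the constants are illustrative stand-ins for SWAP_MAP_MAX and COUNT_CONTINUED, toy_count_continued() only simulates whether the continuation allocation succeeds, and locking, SWAP_HAS_CACHE handling and the validation pass are omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Illustrative stand-ins for SWAP_MAP_MAX and COUNT_CONTINUED. */
#define TOY_MAP_MAX		0x3e
#define TOY_COUNT_CONTINUED	0x80

/* Hypothetical stand-in for swap_count_continued(): succeeds only if told to. */
static bool toy_count_continued(bool alloc_ok)
{
	return alloc_ok;
}

/*
 * Toy model of the second loop in __swap_duplicate() with usage == 1.
 * Returns 0 on success or -ENOMEM. On failure, entries before the
 * failing index stay incremented: nothing is rolled back.
 */
static int toy_duplicate(unsigned char *map, int nr, bool shmem, bool alloc_ok)
{
	assert(nr > 0);

	for (int i = 0; i < nr; i++) {
		unsigned char count = map[i];

		if ((count & ~TOY_COUNT_CONTINUED) < TOY_MAP_MAX)
			count += 1;
		else if (!shmem && toy_count_continued(alloc_ok))
			count = TOY_COUNT_CONTINUED;
		else
			return -ENOMEM;	/* map[0..i-1] already updated */

		map[i] = count;
	}
	return 0;
}
```

In this model a batched shmem call on fresh entries never reaches the continuation branch, while a non-shmem batch that hits the maximum on a later entry and fails the allocation returns -ENOMEM with the earlier entries already bumped, which is the case the VM_WARN_ON guards against.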

