From: Chengming Zhou <zhouchengming@bytedance.com>
To: Nhat Pham <nphamcs@gmail.com>,
Yosry Ahmed <yosryahmed@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>
Cc: linux-mm@kvack.org, Nhat Pham <nphamcs@gmail.com>,
Chengming Zhou <zhouchengming@bytedance.com>,
linux-kernel@vger.kernel.org, Yosry Ahmed <yosryahmed@google.com>,
Johannes Weiner <hannes@cmpxchg.org>
Subject: [PATCH v2 2/6] mm/zswap: invalidate zswap entry when swap entry free
Date: Sun, 04 Feb 2024 03:06:00 +0000
Message-ID: <20240201-b4-zswap-invalidate-entry-v2-2-99d4084260a0@bytedance.com>
In-Reply-To: <20240201-b4-zswap-invalidate-entry-v2-0-99d4084260a0@bytedance.com>
During testing I found that zswap_writeback_entry() sometimes returns
-ENOMEM, which is not what we expected:

bpftrace -e 'kr:zswap_writeback_entry {@[(int32)retval]=count()}'
@[-12]: 1563
@[0]: 277221
The reason is that __read_swap_cache_async() returns NULL because
swapcache_prepare() fails: we don't invalidate the zswap entry when
the swap entry is freed to the per-cpu pool, so these zswap entries
remain on the zswap tree and LRU list.

This patch moves the invalidation ahead to the point where the swap
entry is freed to the per-cpu pool, since there is no benefit to
leaving stale zswap entries on the tree and LRU list.
With this patch:
bpftrace -e 'kr:zswap_writeback_entry {@[(int32)retval]=count()}'
@[0]: 259744
Note: a large folio can't have a zswap entry for now, so we don't
bother to add zswap entry invalidation in the large folio swap free
path.
Reviewed-by: Nhat Pham <nphamcs@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
---
include/linux/zswap.h | 4 ++--
mm/swap_slots.c | 3 +++
mm/swapfile.c | 1 -
mm/zswap.c | 5 +++--
4 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/include/linux/zswap.h b/include/linux/zswap.h
index 91895ce1fdbc..341aea490070 100644
--- a/include/linux/zswap.h
+++ b/include/linux/zswap.h
@@ -29,7 +29,7 @@ struct zswap_lruvec_state {
bool zswap_store(struct folio *folio);
bool zswap_load(struct folio *folio);
-void zswap_invalidate(int type, pgoff_t offset);
+void zswap_invalidate(swp_entry_t swp);
int zswap_swapon(int type, unsigned long nr_pages);
void zswap_swapoff(int type);
void zswap_memcg_offline_cleanup(struct mem_cgroup *memcg);
@@ -50,7 +50,7 @@ static inline bool zswap_load(struct folio *folio)
return false;
}
-static inline void zswap_invalidate(int type, pgoff_t offset) {}
+static inline void zswap_invalidate(swp_entry_t swp) {}
static inline int zswap_swapon(int type, unsigned long nr_pages)
{
return 0;
diff --git a/mm/swap_slots.c b/mm/swap_slots.c
index 0bec1f705f8e..90973ce7881d 100644
--- a/mm/swap_slots.c
+++ b/mm/swap_slots.c
@@ -273,6 +273,9 @@ void free_swap_slot(swp_entry_t entry)
{
struct swap_slots_cache *cache;
+ /* Large folio swap slot is not covered. */
+ zswap_invalidate(entry);
+
cache = raw_cpu_ptr(&swp_slots);
if (likely(use_swap_slot_cache && cache->slots_ret)) {
spin_lock_irq(&cache->free_lock);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 0580bb3e34d7..65b49db89b36 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -744,7 +744,6 @@ static void swap_range_free(struct swap_info_struct *si, unsigned long offset,
swap_slot_free_notify = NULL;
while (offset <= end) {
arch_swap_invalidate_page(si->type, offset);
- zswap_invalidate(si->type, offset);
if (swap_slot_free_notify)
swap_slot_free_notify(si->bdev, offset);
offset++;
diff --git a/mm/zswap.c b/mm/zswap.c
index 735f1a6ef336..d8bb0e06e2b0 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1738,9 +1738,10 @@ bool zswap_load(struct folio *folio)
return true;
}
-void zswap_invalidate(int type, pgoff_t offset)
+void zswap_invalidate(swp_entry_t swp)
{
- struct zswap_tree *tree = swap_zswap_tree(swp_entry(type, offset));
+ pgoff_t offset = swp_offset(swp);
+ struct zswap_tree *tree = swap_zswap_tree(swp);
struct zswap_entry *entry;
spin_lock(&tree->lock);
--
b4 0.10.1
2024-02-04 3:05 [PATCH v2 0/6] mm/zswap: optimize zswap lru list Chengming Zhou
2024-02-04 3:05 ` [PATCH v2 1/6] mm/zswap: add more comments in shrink_memcg_cb() Chengming Zhou
2024-02-04 3:06 ` Chengming Zhou [this message]
2024-02-05 21:20 ` [PATCH v2 2/6] mm/zswap: invalidate zswap entry when swap entry free Yosry Ahmed
2024-02-04 3:06 ` [PATCH v2 3/6] mm/zswap: stop lru list shrinking when encounter warm region Chengming Zhou
2024-02-04 3:06 ` [PATCH v2 4/6] mm/zswap: remove duplicate_entry debug value Chengming Zhou
2024-02-04 3:06 ` [PATCH v2 5/6] mm/zswap: only support zswap_exclusive_loads_enabled Chengming Zhou
2024-02-04 3:06 ` [PATCH v2 6/6] mm/zswap: zswap entry doesn't need refcount anymore Chengming Zhou