From: Liu Shixin <liushixin2@huawei.com>
To: Yu Zhao <yuzhao@google.com>,
Andrew Morton <akpm@linux-foundation.org>,
Yosry Ahmed <yosryahmed@google.com>,
Huang Ying <ying.huang@intel.com>,
Sachin Sant <sachinp@linux.ibm.com>,
Michal Hocko <mhocko@suse.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: <linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>,
Liu Shixin <liushixin2@huawei.com>
Subject: [PATCH v10] mm: vmscan: try to reclaim swapcache pages if no swap space
Date: Tue, 21 Nov 2023 17:06:24 +0800
Message-ID: <20231121090624.1814733-1-liushixin2@huawei.com>
When swap space is exhausted, only file pages can be reclaimed, even
though some swapcache pages may still be on the anon LRU list. This can
lead to a premature out-of-memory (OOM) condition.
The problem was found with the following steps:
First, set up a 9MB disk swap space, then create a cgroup with a 10MB
memory limit, then run a program that allocates about 15MB of memory.
The problem occurs only occasionally and may take about 100 runs to
trigger [1].
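For illustration only, the allocating program in the steps above could
look like the sketch below. The actual test program is not part of this
patch; touch_anon_memory() is a hypothetical helper name, and the sketch
assumes a Linux system where MAP_ANONYMOUS is available.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS with strict -std= modes */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `bytes` of anonymous memory and dirty every
 * page so that each one is really allocated. Returns the number of
 * pages touched, or 0 on failure.
 */
size_t touch_anon_memory(size_t bytes)
{
	long page = sysconf(_SC_PAGESIZE);
	volatile unsigned char *buf;
	void *map;
	size_t off, touched = 0;

	if (page <= 0 || bytes == 0)
		return 0;

	map = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return 0;

	buf = map;
	/* One write per page is enough to fault the page in. */
	for (off = 0; off < bytes; off += (size_t)page) {
		buf[off] = 1;
		touched++;
	}

	munmap(map, bytes);
	return touched;
}
```

Calling touch_anon_memory(15UL << 20) from main() inside the 10MB cgroup
would generate roughly the 15MB of anonymous memory described above.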
Fix it by checking the number of swapcache pages in
can_reclaim_anon_pages(). If the number is not zero, return true and set
swapcache_only to 1. When scanning the anon LRU list in swapcache_only
mode, non-swapcache pages are skipped instead of being isolated, which
makes reclaim more efficient.
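The resulting decision order can be modelled in user space as below.
This is only a sketch: struct reclaim_inputs, struct scan_control_model
and can_reclaim_anon_model() are hypothetical names, and the
free_swap_pages check is an assumption standing in for the existing
swap-space checks in can_reclaim_anon_pages() that this patch leaves
unchanged.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the one scan_control bit this patch adds. */
struct scan_control_model {
	unsigned int swapcache_only:1;
};

/* Hypothetical inputs; in the kernel these come from mm state. */
struct reclaim_inputs {
	long free_swap_pages;		/* assumed: remaining swap space */
	bool can_demote;		/* result of can_demote() */
	unsigned long nr_swapcache;	/* NR_SWAPCACHE for node or lruvec */
};

bool can_reclaim_anon_model(const struct reclaim_inputs *in,
			    struct scan_control_model *sc)
{
	/* Always start from a clean state, as the patch does. */
	if (sc)
		sc->swapcache_only = 0;

	/* Swap space left: anon pages can be swapped out normally. */
	if (in->free_swap_pages > 0)
		return true;

	/* No swap, but demotion to a lower tier is possible. */
	if (in->can_demote)
		return true;

	/* Last resort: only swapcache pages can still be reclaimed. */
	if (in->nr_swapcache > 0) {
		if (sc)
			sc->swapcache_only = 1;
		return true;
	}

	return false;
}
```

The flag is set only on the last path, so swapcache_only is never left
at 1 once swap space or a demotion target is available again.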
However, in swapcache_only mode the scan count is still increased when a
non-swapcache page is skipped. In this mode the anon LRU list typically
holds a large number of non-swapcache pages and only a few swapcache
pages, so if the skipped pages were not counted, the scan in
isolate_lru_folios() could run long enough to cause a hung task, as
Sachin reported [2].
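Why the skipped pages must still be counted can be seen in a small
user-space model of the isolation loop. The names folio_model,
isolate_result and isolate_model() are hypothetical, and every folio is
assumed to be a single page.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio on the anon LRU list. */
struct folio_model {
	bool swapcache;
};

struct isolate_result {
	size_t scanned;	/* folios charged against nr_to_scan */
	size_t taken;	/* folios actually isolated */
};

struct isolate_result isolate_model(const struct folio_model *lru,
				    size_t nr_folios, size_t nr_to_scan,
				    bool swapcache_only)
{
	struct isolate_result res = { 0, 0 };
	size_t i;

	for (i = 0; i < nr_folios && res.scanned < nr_to_scan; i++) {
		/* Charge the folio first, so skipped folios count too. */
		res.scanned++;

		/* In swapcache_only mode, skip non-swapcache folios. */
		if (swapcache_only && !lru[i].swapcache)
			continue;

		res.taken++;
	}

	return res;
}
```

Because scanned is incremented before the skip, the loop ends after at
most nr_to_scan folios even when the list holds no swapcache pages at
all. If the increment came after the skip, the loop would walk the
entire list in that case.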
By the way, since memory reclaim is attempted many times before OOM,
there is no need to isolate too many swapcache pages at one time.
[1]. https://lore.kernel.org/lkml/CAJD7tkZAfgncV+KbKr36=eDzMnT=9dZOT0dpMWcurHLr6Do+GA@mail.gmail.com/
[2]. https://lore.kernel.org/linux-mm/CAJD7tkafz_2XAuqE8tGLPEcpLngewhUo=5US14PAtSM9tLBUQg@mail.gmail.com/
Signed-off-by: Liu Shixin <liushixin2@huawei.com>
Tested-by: Yosry Ahmed <yosryahmed@google.com>
Reviewed-by: "Huang, Ying" <ying.huang@intel.com>
Reviewed-by: Yosry Ahmed <yosryahmed@google.com>
---
v9->v10: Use per-node swapcache suggested by Yu Zhao.
v8->v9: Move the swapcache check after can_demote() and refactor
can_reclaim_anon_pages() a bit.
v7->v8: Reset swapcache_only at the beginning of can_reclaim_anon_pages().
v6->v7: Reset swapcache_only to zero once swap space is available again.
v5->v6: Fix the NULL pointer dereference and hung task problems reported by Sachin.
mm/vmscan.c | 50 +++++++++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 49 insertions(+), 1 deletion(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 506f8220c5fe..1fcc94717370 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -136,6 +136,9 @@ struct scan_control {
/* Always discard instead of demoting to lower tier memory */
unsigned int no_demotion:1;
+ /* Swap space is exhausted, only reclaim swapcache for anon LRU */
+ unsigned int swapcache_only:1;
+
/* Allocation order */
s8 order;
@@ -308,10 +311,36 @@ static bool can_demote(int nid, struct scan_control *sc)
return true;
}
+#ifdef CONFIG_SWAP
+static bool can_reclaim_swapcache(struct mem_cgroup *memcg, int nid)
+{
+ struct pglist_data *pgdat = NODE_DATA(nid);
+ unsigned long nr_swapcache;
+
+ if (!memcg) {
+ nr_swapcache = node_page_state(pgdat, NR_SWAPCACHE);
+ } else {
+ struct lruvec *lruvec = mem_cgroup_lruvec(memcg, pgdat);
+
+ nr_swapcache = lruvec_page_state_local(lruvec, NR_SWAPCACHE);
+ }
+
+ return nr_swapcache > 0;
+}
+#else
+static bool can_reclaim_swapcache(struct mem_cgroup *memcg, int nid)
+{
+ return false;
+}
+#endif
+
static inline bool can_reclaim_anon_pages(struct mem_cgroup *memcg,
int nid,
struct scan_control *sc)
{
+ if (sc)
+ sc->swapcache_only = 0;
+
if (memcg == NULL) {
/*
* For non-memcg reclaim, is there
@@ -330,7 +359,17 @@ static inline bool can_reclaim_anon_pages(struct mem_cgroup *memcg,
*
* Can it be reclaimed from this node via demotion?
*/
- return can_demote(nid, sc);
+ if (can_demote(nid, sc))
+ return true;
+
+ /* Are there any swapcache pages to reclaim on this node? */
+ if (can_reclaim_swapcache(memcg, nid)) {
+ if (sc)
+ sc->swapcache_only = 1;
+ return true;
+ }
+
+ return false;
}
/*
@@ -1642,6 +1681,15 @@ static unsigned long isolate_lru_folios(unsigned long nr_to_scan,
*/
scan += nr_pages;
+ /*
+ * Count non-swapcache folios too: swapcache pages may be
+ * rare, and the scan takes too long here if the skipped
+ * non-swapcache folios are not counted.
+ */
+ if (unlikely(sc->swapcache_only && !is_file_lru(lru) &&
+ !folio_test_swapcache(folio)))
+ goto move;
+
if (!folio_test_lru(folio))
goto move;
if (!sc->may_unmap && folio_mapped(folio))
--
2.25.1