[RFC] mm/swap.c: Enable promotion of unmapped MGLRU page cache pages

From: Donet Tom <donettom@linux.ibm.com>
To: Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Gregory Price <gourry@gourry.net>
Cc: Yu Zhao <yuzhao@google.com>,
	Ritesh Harjani <ritesh.list@gmail.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	David Hildenbrand <david@redhat.com>,
	Huang Ying <ying.huang@linux.alibaba.com>,
	Johannes Weiner <hannes@cmpxchg.org>
Subject: [RFC] mm/swap.c: Enable promotion of unmapped MGLRU page cache pages
Date: Wed, 15 Jan 2025 06:06:25 -0600
Message-ID: <20250115120625.3785-1-donettom@linux.ibm.com>

This patch is based on patch [1], which introduced support for
promoting unmapped normal LRU page cache pages. Here, we extend
the functionality to support promotion of MGLRU page cache pages.

An MGLRU page cache page is eligible for promotion when:

1. Memory tiering and pagecache_promotion_enabled are both enabled.
2. It resides in a lower memory tier.
3. It is referenced.
4. It is part of the working set.
5. It belongs to the active list. For MGLRU, the youngest generation
   and the second-youngest generation (youngest - 1) are treated as
   the active list.
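
As a rough illustration of condition 5, the following userspace model
shows how "youngest or youngest - 1" can be tested. It mirrors my
understanding of the kernel helper lru_gen_is_active(), but the names
gen_from_seq() and gen_is_active() are hypothetical, and the value of
MAX_NR_GENS is an assumption rather than something this patch defines.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption: four generation slots, indexed 0..MAX_NR_GENS-1. */
#define MAX_NR_GENS 4

/* Map a monotonically increasing sequence number to a generation slot. */
static int gen_from_seq(unsigned long seq)
{
	return (int)(seq % MAX_NR_GENS);
}

/*
 * A generation counts as "active" when it is the youngest one (max_seq)
 * or the second youngest (max_seq - 1).
 */
static bool gen_is_active(unsigned long max_seq, int gen)
{
	assert(gen >= 0 && gen < MAX_NR_GENS);
	return gen == gen_from_seq(max_seq) ||
	       gen == gen_from_seq(max_seq - 1);
}
```

With max_seq = 5, slots 1 and 0 are active and slots 2 and 3 are not.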

When a page is accessed through a file descriptor, lru_gen_inc_refs()
is invoked. In this function, we check whether the page is referenced,
is part of the working set, and belongs to the active list. If all
these conditions are met, the page is added to the promotion list.
The per-process task work task_numa_promotion_work() takes the pages
from the promotion list and promotes them to a higher memory tier.
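
The decision described above can be sketched as a single predicate.
This is a simplified userspace model, not the kernel code: struct
folio_model and should_queue_for_promotion() are hypothetical names,
and the lower-tier test is assumed to be performed somewhere on the
promotion path (for example inside promotion_candidate() from [1]).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the folio state that the check consults. */
struct folio_model {
	bool in_lower_tier;	/* resides on a lower memory tier */
	bool referenced;	/* reference counter is saturated */
	bool workingset;	/* workingset flag is already set */
	bool gen_active;	/* youngest or second-youngest generation */
	bool isolated;		/* already isolated from the LRU */
};

/* Return true when the modelled folio should go on the promotion list. */
static bool should_queue_for_promotion(const struct folio_model *f,
				       bool tiering_enabled,
				       bool pagecache_promotion_enabled)
{
	/* Both knobs must be on before any folio is considered. */
	if (!tiering_enabled || !pagecache_promotion_enabled)
		return false;

	/* Skip folios that are isolated or already on a higher tier. */
	if (f->isolated || !f->in_lower_tier)
		return false;

	return f->referenced && f->workingset && f->gen_active;
}
```

Keeping the cheap global checks first means the per-folio state is
only examined when promotion is enabled at all.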

Test process:
We measured the read time in the following scenarios for both LRU and
MGLRU.
Scenario 1: Pages are on the lower tier + promotion off
Scenario 2: Pages are on the lower tier + promotion on
Scenario 3: Pages are on the higher tier

Test results, MGLRU
------------------------------------------------------------------
Pages on higher | Pages on lower tier, | Pages on lower tier,    |
tier            | promotion off        | promotion on            |
------------------------------------------------------------------
1.5s            | 3.2s                 | During promotion: 6.6s  |
                |                      | After promotion:  1.6s  |
------------------------------------------------------------------

Test results, LRU
------------------------------------------------------------------
Pages on higher | Pages on lower tier, | Pages on lower tier,    |
tier            | promotion off        | promotion on            |
------------------------------------------------------------------
1.5s            | 3.2s                 | During promotion: 4.2s  |
                |                      |                   3.4s  |
                |                      |                   5.6s  |
                |                      | After promotion:  1.6s  |
------------------------------------------------------------------

In LRU, pages are initially added to the inactive list. When a page
is referenced, it is moved to the active list, and promotion to a
higher memory tier occurs from the active list. This process often
requires multiple reads to trigger promotion. In contrast, MGLRU adds
new pages to the youngest generation immediately. As a result, we
observe that MGLRU promotes pages on the first read itself, unlike
LRU which takes multiple reads to trigger promotion.

This difference also impacts read latency:

For MGLRU, the first read shows higher latency due to the combined
overhead of accessing a lower tier and performing promotion.

For LRU, the first 3–4 reads typically exhibit lower latency since
promotion does not occur immediately.

Once promotion completes, MGLRU and LRU show a similar performance
benefit: reads take 1.6s, down from 3.2s with promotion off.
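
To put the one-time promotion cost in perspective, here is a small
break-even calculation. It assumes each figure in the tables is the
time for one full read pass and that every read after promotion costs
the "after promotion" time; breakeven_reads() is a hypothetical helper
written only for this arithmetic.

```c
/*
 * Number of post-promotion reads needed before the time saved per read
 * repays the extra time spent on the read that triggered promotion.
 */
static double breakeven_reads(double promo_read_s, double lower_tier_s,
			      double promoted_s)
{
	double extra = promo_read_s - lower_tier_s;	/* one-time cost */
	double saving = lower_tier_s - promoted_s;	/* gain per later read */

	return extra / saving;
}
```

With the MGLRU figures, breakeven_reads(6.6, 3.2, 1.6) is about 2.1,
so under these assumptions the promotion pays for itself by the third
read after it completes.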

[1] https://lore.kernel.org/all/20250107000346.1338481-1-gourry@gourry.net/

Signed-off-by: Donet Tom <donettom@linux.ibm.com>
---
 mm/swap.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/mm/swap.c b/mm/swap.c
index b2341bc18452..121de1d7e938 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -386,6 +386,9 @@ static void __lru_cache_activate_folio(struct folio *folio)

 static void lru_gen_inc_refs(struct folio *folio)
 {
+	struct mem_cgroup *memcg;
+	struct lruvec *lruvec;
+	int gen;
 	unsigned long new_flags, old_flags = READ_ONCE(folio->flags);

 	if (folio_test_unevictable(folio))
@@ -399,13 +402,29 @@ static void lru_gen_inc_refs(struct folio *folio)

 	do {
 		if ((old_flags & LRU_REFS_MASK) == LRU_REFS_MASK) {
-			if (!folio_test_workingset(folio))
+			if (!folio_test_workingset(folio)) {
 				folio_set_workingset(folio);
-			return;
+				return;
+			}
+			goto promo_candid;
 		}

 		new_flags = old_flags + BIT(LRU_REFS_PGOFF);
 	} while (!try_cmpxchg(&folio->flags, &old_flags, new_flags));
+
+promo_candid:
+	if (!folio_test_isolated(folio) &&
+		(sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING) &&
+		numa_pagecache_promotion_enabled) {
+		memcg = folio_memcg(folio);
+		if (memcg) {
+			lruvec = mem_cgroup_lruvec(memcg, folio_pgdat(folio));
+			gen = folio_lru_gen(folio);
+
+			if ((gen < MAX_NR_GENS) && lru_gen_is_active(lruvec, gen))
+				promotion_candidate(folio);
+		}
+	}
 }

 static bool lru_gen_clear_refs(struct folio *folio)
-- 
2.43.5

Thread overview: 6+ messages
2025-01-15 12:06 Donet Tom [this message]
2025-01-15 13:25 ` Matthew Wilcox
2025-01-16 11:10   ` Donet Tom
2025-01-16 18:46     ` Matthew Wilcox
2025-01-15 16:09 ` Gregory Price
2025-01-16 11:16   ` Donet Tom
