linux-mm.kvack.org archive mirror
* [PATCH v3 0/6] mm/list_lru: Split list_lru lock into per-cgroup scope
@ 2024-11-04 17:52 Kairui Song
  2024-11-04 17:52 ` [PATCH v3 1/6] mm/list_lru: don't pass unnecessary key parameters Kairui Song
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Kairui Song @ 2024-11-04 17:52 UTC
  To: linux-mm
  Cc: Andrew Morton, Matthew Wilcox, Johannes Weiner, Roman Gushchin,
	Waiman Long, Shakeel Butt, Michal Hocko, Chengming Zhou,
	Qi Zheng, Muchun Song, Kairui Song

From: Kairui Song <kasong@tencent.com>

Currently, every list_lru has a per-node lock that protects adding,
deletion, isolation, and reparenting of all list_lru_one instances
belonging to this list_lru on this node. Contention on this lock is
heavy when multiple cgroups modify the same list_lru.

This can be alleviated by splitting the lock into per-cgroup scope.

To achieve this, this series reworks and optimizes the reparenting
process step by step, making it possible to have a stable list_lru_one
and to pin it. It then splits the lock into per-cgroup scope.

The result is a ~15% performance gain in a simple multi-cgroup tar test
of small files, and fewer lines of code. See PATCH 5/6 for test details.

V2: https://lore.kernel.org/linux-mm/20240925171020.32142-1-ryncsn@gmail.com/
Updates from V2:
- Collect Acked-by.
- Fix a WARN_ON triggered by a potential compiler optimization issue.
  [Dan Carpenter, Naresh Kamboju]
  https://lore.kernel.org/linux-mm/62a65418-2393-40ec-b462-151605a5efcf@stanley.mountain/
  [Applied to "mm/list_lru: split the lock to per-cgroup scope"]
- Fix a BUG_ON: V2 forgot to cover the user of LRU_STOP. [Usama Arif]
  https://lore.kernel.org/linux-mm/CAMgjq7D_OA=vYf5SnNnKXjppPFhDqsbYF--6=cOayKiadxuwrQ@mail.gmail.com/

V1: https://lore.kernel.org/linux-mm/20240624175313.47329-1-ryncsn@gmail.com/
Updates from V1:
- Collect Reviewed-by.
- Fix an initialization race that may lead to a memory leak [Muchun
  Song]
- Drop an unrelated and incorrect fix [Shakeel Butt]
- Use VM_WARN_ON instead of WARN_ON for several sanity checks.

Kairui Song (6):
  mm/list_lru: don't pass unnecessary key parameters
  mm/list_lru: don't export list_lru_add
  mm/list_lru: code clean up for reparenting
  mm/list_lru: simplify reparenting and initial allocation
  mm/list_lru: split the lock to per-cgroup scope
  mm/list_lru: simplify the list_lru walk callback function

 drivers/android/binder_alloc.c |   8 +-
 drivers/android/binder_alloc.h |   2 +-
 fs/dcache.c                    |   4 +-
 fs/gfs2/quota.c                |   2 +-
 fs/inode.c                     |   5 +-
 fs/nfs/nfs42xattr.c            |   4 +-
 fs/nfsd/filecache.c            |   5 +-
 fs/xfs/xfs_buf.c               |   2 -
 fs/xfs/xfs_qm.c                |   6 +-
 include/linux/list_lru.h       |  26 ++-
 mm/list_lru.c                  | 383 ++++++++++++++++-----------------
 mm/memcontrol.c                |  10 +-
 mm/workingset.c                |  20 +-
 mm/zswap.c                     |  12 +-
 14 files changed, 241 insertions(+), 248 deletions(-)

-- 
2.47.0




* [PATCH v3 1/6] mm/list_lru: don't pass unnecessary key parameters
  2024-11-04 17:52 [PATCH v3 0/6] mm/list_lru: Split list_lru lock into per-cgroup scope Kairui Song
@ 2024-11-04 17:52 ` Kairui Song
  2024-11-04 17:52 ` [PATCH v3 2/6] mm/list_lru: don't export list_lru_add Kairui Song
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Kairui Song @ 2024-11-04 17:52 UTC
  To: linux-mm
  Cc: Andrew Morton, Matthew Wilcox, Johannes Weiner, Roman Gushchin,
	Waiman Long, Shakeel Butt, Michal Hocko, Chengming Zhou,
	Qi Zheng, Muchun Song, Kairui Song

From: Kairui Song <kasong@tencent.com>

When LOCKDEP is not enabled, lock_class_key is an empty struct that
is never used. But the list_lru initialization function still takes
a placeholder pointer as a parameter, and the compiler cannot optimize
it away because the function is exported rather than static.

Remove this parameter and move the key inside the list_lru struct,
where it is only used when LOCKDEP is enabled. Kernel builds with
LOCKDEP will be slightly larger, while !LOCKDEP builds (the common
case) will be slightly smaller.
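
The pattern described above can be sketched in plain userspace C. This
is only an illustration under stated assumptions: DEMO_LOCKDEP stands
in for CONFIG_LOCKDEP, and every demo_* name is hypothetical rather
than part of the kernel API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct lock_class_key. */
struct demo_lock_key {
	int unused;
};

struct demo_lru {
	int nr_nodes;
#ifdef DEMO_LOCKDEP
	/* Only present when the debug option is compiled in. */
	struct demo_lock_key *key;
#endif
};

/* The core init function no longer takes a key parameter. */
static int demo_lru_init(struct demo_lru *lru, int nr_nodes)
{
	if (nr_nodes <= 0)
		return -1;
	lru->nr_nodes = nr_nodes;
	return 0;
}

/*
 * Inline wrapper for the rare caller that has a key. Without
 * DEMO_LOCKDEP the key argument is discarded at compile time.
 */
static inline int demo_lru_init_key(struct demo_lru *lru, int nr_nodes,
				    struct demo_lock_key *key)
{
#ifdef DEMO_LOCKDEP
	lru->key = key;
#else
	(void)key;
#endif
	return demo_lru_init(lru, nr_nodes);
}
```

Because the wrapper is static inline, callers in a build without the
debug option pay nothing for the unused argument.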

Signed-off-by: Kairui Song <kasong@tencent.com>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
---
 include/linux/list_lru.h | 18 +++++++++++++++---
 mm/list_lru.c            |  9 +++++----
 mm/workingset.c          |  4 ++--
 3 files changed, 22 insertions(+), 9 deletions(-)

diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
index 5099a8ccd5f4..eba93f6511f3 100644
--- a/include/linux/list_lru.h
+++ b/include/linux/list_lru.h
@@ -56,16 +56,28 @@ struct list_lru {
 	bool			memcg_aware;
 	struct xarray		xa;
 #endif
+#ifdef CONFIG_LOCKDEP
+	struct lock_class_key	*key;
+#endif
 };
 
 void list_lru_destroy(struct list_lru *lru);
 int __list_lru_init(struct list_lru *lru, bool memcg_aware,
-		    struct lock_class_key *key, struct shrinker *shrinker);
+		    struct shrinker *shrinker);
 
 #define list_lru_init(lru)				\
-	__list_lru_init((lru), false, NULL, NULL)
+	__list_lru_init((lru), false, NULL)
 #define list_lru_init_memcg(lru, shrinker)		\
-	__list_lru_init((lru), true, NULL, shrinker)
+	__list_lru_init((lru), true, shrinker)
+
+static inline int list_lru_init_memcg_key(struct list_lru *lru, struct shrinker *shrinker,
+					  struct lock_class_key *key)
+{
+#ifdef CONFIG_LOCKDEP
+	lru->key = key;
+#endif
+	return list_lru_init_memcg(lru, shrinker);
+}
 
 int memcg_list_lru_alloc(struct mem_cgroup *memcg, struct list_lru *lru,
 			 gfp_t gfp);
diff --git a/mm/list_lru.c b/mm/list_lru.c
index 9b7ff06e9d32..ea7dc9fa4d05 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -562,8 +562,7 @@ static void memcg_destroy_list_lru(struct list_lru *lru)
 }
 #endif /* CONFIG_MEMCG */
 
-int __list_lru_init(struct list_lru *lru, bool memcg_aware,
-		    struct lock_class_key *key, struct shrinker *shrinker)
+int __list_lru_init(struct list_lru *lru, bool memcg_aware, struct shrinker *shrinker)
 {
 	int i;
 
@@ -583,8 +582,10 @@ int __list_lru_init(struct list_lru *lru, bool memcg_aware,
 
 	for_each_node(i) {
 		spin_lock_init(&lru->node[i].lock);
-		if (key)
-			lockdep_set_class(&lru->node[i].lock, key);
+#ifdef CONFIG_LOCKDEP
+		if (lru->key)
+			lockdep_set_class(&lru->node[i].lock, lru->key);
+#endif
 		init_one_lru(&lru->node[i].lru);
 	}
 
diff --git a/mm/workingset.c b/mm/workingset.c
index a2b28e356e68..df3937c5eedc 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -823,8 +823,8 @@ static int __init workingset_init(void)
 	if (!workingset_shadow_shrinker)
 		goto err;
 
-	ret = __list_lru_init(&shadow_nodes, true, &shadow_nodes_key,
-			      workingset_shadow_shrinker);
+	ret = list_lru_init_memcg_key(&shadow_nodes, workingset_shadow_shrinker,
+				      &shadow_nodes_key);
 	if (ret)
 		goto err_list_lru;
 
-- 
2.47.0




* [PATCH v3 2/6] mm/list_lru: don't export list_lru_add
  2024-11-04 17:52 [PATCH v3 0/6] mm/list_lru: Split list_lru lock into per-cgroup scope Kairui Song
  2024-11-04 17:52 ` [PATCH v3 1/6] mm/list_lru: don't pass unnecessary key parameters Kairui Song
@ 2024-11-04 17:52 ` Kairui Song
  2024-11-04 17:52 ` [PATCH v3 3/6] mm/list_lru: code clean up for reparenting Kairui Song
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Kairui Song @ 2024-11-04 17:52 UTC
  To: linux-mm
  Cc: Andrew Morton, Matthew Wilcox, Johannes Weiner, Roman Gushchin,
	Waiman Long, Shakeel Butt, Michal Hocko, Chengming Zhou,
	Qi Zheng, Muchun Song, Kairui Song

From: Kairui Song <kasong@tencent.com>

list_lru_add() is no longer used by any module, so remove the export.

Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
---
 mm/list_lru.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/list_lru.c b/mm/list_lru.c
index ea7dc9fa4d05..a798e7624f69 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -106,7 +106,6 @@ bool list_lru_add(struct list_lru *lru, struct list_head *item, int nid,
 	spin_unlock(&nlru->lock);
 	return false;
 }
-EXPORT_SYMBOL_GPL(list_lru_add);
 
 bool list_lru_add_obj(struct list_lru *lru, struct list_head *item)
 {
-- 
2.47.0




* [PATCH v3 3/6] mm/list_lru: code clean up for reparenting
  2024-11-04 17:52 [PATCH v3 0/6] mm/list_lru: Split list_lru lock into per-cgroup scope Kairui Song
  2024-11-04 17:52 ` [PATCH v3 1/6] mm/list_lru: don't pass unnecessary key parameters Kairui Song
  2024-11-04 17:52 ` [PATCH v3 2/6] mm/list_lru: don't export list_lru_add Kairui Song
@ 2024-11-04 17:52 ` Kairui Song
  2024-11-04 17:52 ` [PATCH v3 4/6] mm/list_lru: simplify reparenting and initial allocation Kairui Song
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Kairui Song @ 2024-11-04 17:52 UTC
  To: linux-mm
  Cc: Andrew Morton, Matthew Wilcox, Johannes Weiner, Roman Gushchin,
	Waiman Long, Shakeel Butt, Michal Hocko, Chengming Zhou,
	Qi Zheng, Muchun Song, Kairui Song

From: Kairui Song <kasong@tencent.com>

No functional change, just a restructuring of the code and a comment
fix.

The list lrus are not empty until all memcg_reparent_list_lru_node()
calls are done, so the comments in memcg_offline_kmem were slightly
inaccurate.

Signed-off-by: Kairui Song <kasong@tencent.com>
Reviewed-by: Muchun Song <muchun.song@linux.dev>
Acked-by: Shakeel Butt <shakeel.butt@linux.dev>
---
 mm/list_lru.c   | 39 +++++++++++++++++----------------------
 mm/memcontrol.c |  7 -------
 2 files changed, 17 insertions(+), 29 deletions(-)

diff --git a/mm/list_lru.c b/mm/list_lru.c
index a798e7624f69..b54f092d4d65 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -421,35 +421,16 @@ static void memcg_reparent_list_lru_node(struct list_lru *lru, int nid,
 	spin_unlock_irq(&nlru->lock);
 }
 
-static void memcg_reparent_list_lru(struct list_lru *lru,
-				    int src_idx, struct mem_cgroup *dst_memcg)
-{
-	int i;
-
-	for_each_node(i)
-		memcg_reparent_list_lru_node(lru, i, src_idx, dst_memcg);
-
-	memcg_list_lru_free(lru, src_idx);
-}
-
 void memcg_reparent_list_lrus(struct mem_cgroup *memcg, struct mem_cgroup *parent)
 {
 	struct cgroup_subsys_state *css;
 	struct list_lru *lru;
-	int src_idx = memcg->kmemcg_id;
+	int src_idx = memcg->kmemcg_id, i;
 
 	/*
 	 * Change kmemcg_id of this cgroup and all its descendants to the
 	 * parent's id, and then move all entries from this cgroup's list_lrus
 	 * to ones of the parent.
-	 *
-	 * After we have finished, all list_lrus corresponding to this cgroup
-	 * are guaranteed to remain empty. So we can safely free this cgroup's
-	 * list lrus in memcg_list_lru_free().
-	 *
-	 * Changing ->kmemcg_id to the parent can prevent memcg_list_lru_alloc()
-	 * from allocating list lrus for this cgroup after memcg_list_lru_free()
-	 * call.
 	 */
 	rcu_read_lock();
 	css_for_each_descendant_pre(css, &memcg->css) {
@@ -460,9 +441,23 @@ void memcg_reparent_list_lrus(struct mem_cgroup *memcg, struct mem_cgroup *paren
 	}
 	rcu_read_unlock();
 
+	/*
+	 * With kmemcg_id set to parent, holding the lock of each list_lru_node
+	 * below can prevent list_lru_{add,del,isolate} from touching the lru,
+	 * safe to reparent.
+	 */
 	mutex_lock(&list_lrus_mutex);
-	list_for_each_entry(lru, &memcg_list_lrus, list)
-		memcg_reparent_list_lru(lru, src_idx, parent);
+	list_for_each_entry(lru, &memcg_list_lrus, list) {
+		for_each_node(i)
+			memcg_reparent_list_lru_node(lru, i, src_idx, parent);
+
+		/*
+		 * Here all list_lrus corresponding to the cgroup are guaranteed
+		 * to remain empty, we can safely free this lru, any further
+		 * memcg_list_lru_alloc() call will simply bail out.
+		 */
+		memcg_list_lru_free(lru, src_idx);
+	}
 	mutex_unlock(&list_lrus_mutex);
 }
 
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7845c64a2c57..8e90aa026c47 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3099,13 +3099,6 @@ static void memcg_offline_kmem(struct mem_cgroup *memcg)
 		parent = root_mem_cgroup;
 
 	memcg_reparent_objcgs(memcg, parent);
-
-	/*
-	 * After we have finished memcg_reparent_objcgs(), all list_lrus
-	 * corresponding to this cgroup are guaranteed to remain empty.
-	 * The ordering is imposed by list_lru_node->lock taken by
-	 * memcg_reparent_list_lrus().
-	 */
 	memcg_reparent_list_lrus(memcg, parent);
 }
 
-- 
2.47.0




* [PATCH v3 4/6] mm/list_lru: simplify reparenting and initial allocation
  2024-11-04 17:52 [PATCH v3 0/6] mm/list_lru: Split list_lru lock into per-cgroup scope Kairui Song
                   ` (2 preceding siblings ...)
  2024-11-04 17:52 ` [PATCH v3 3/6] mm/list_lru: code clean up for reparenting Kairui Song
@ 2024-11-04 17:52 ` Kairui Song
  2024-11-04 17:52 ` [PATCH v3 5/6] mm/list_lru: split the lock to per-cgroup scope Kairui Song
  2024-11-04 17:52 ` [PATCH v3 6/6] mm/list_lru: Simplify the list_lru walk callback function Kairui Song
  5 siblings, 0 replies; 7+ messages in thread
From: Kairui Song @ 2024-11-04 17:52 UTC
  To: linux-mm
  Cc: Andrew Morton, Matthew Wilcox, Johannes Weiner, Roman Gushchin,
	Waiman Long, Shakeel Butt, Michal Hocko, Chengming Zhou,
	Qi Zheng, Muchun Song, Kairui Song

From: Kairui Song <kasong@tencent.com>

Currently, there is a lot of code for detecting reparenting races
using kmemcg_id as the synchronization flag, and an intermediate
table is required to record and compare the kmemcg_id.

We can simplify this by just checking the cgroup css status and
skipping the cgroup if it is being offlined. On the reparenting side,
ensure no allocation is ongoing and no further allocation will occur
by using the XArray lock as a barrier.

Combined with an O(n^2) top-down walk for the allocation, we get rid
of the intermediate table allocation completely. Despite being O(n^2),
it should actually be faster because it's not practical to have a very
deep cgroup hierarchy, and in most cases the parent cgroup should have
been allocated already.
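
The top-down walk can be sketched in userspace C as follows. This is a
simplified illustration, not the kernel code: all demo_* names are
hypothetical, a bool array stands in for the XArray, and the
css_is_dying() checks and locking are left out. Here n is the depth of
the cgroup in the hierarchy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_IDS 16

struct demo_cgroup {
	int id;				/* stable index, like kmemcg_id */
	struct demo_cgroup *parent;	/* NULL for the root */
};

struct demo_lru {
	bool populated[DEMO_MAX_IDS];	/* stand-in for the XArray */
	int order[DEMO_MAX_IDS];	/* population order, for illustration */
	int nr_populated;
};

/* The root always counts as populated. */
static bool demo_allocated(const struct demo_lru *lru,
			   const struct demo_cgroup *cg)
{
	return !cg->parent || lru->populated[cg->id];
}

static void demo_alloc(struct demo_lru *lru, struct demo_cgroup *cg)
{
	struct demo_cgroup *pos;

	if (demo_allocated(lru, cg))
		return;

	do {
		/* Find the farthest ancestor that is not populated yet. */
		pos = cg;
		while (!demo_allocated(lru, pos->parent))
			pos = pos->parent;

		/* Populate it, then restart the walk from cg. */
		lru->populated[pos->id] = true;
		lru->order[lru->nr_populated++] = pos->id;
	} while (pos != cg);
}
```

Each pass walks at most the full depth and populates one level, hence
the quadratic bound. When the ancestors are already populated, a
single pass is enough and no temporary table is needed.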

This also avoids changing kmemcg_id before reparenting, giving
cgroups a stable index for list_lru_memcg. After this change
it's possible that a dying cgroup will see a NULL value in the XArray
corresponding to its kmemcg_id, because the kmemcg_id will point
to an empty slot. In that case, just fall back to using its parent.
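
A minimal userspace sketch of this fallback, assuming a plain pointer
array in place of the XArray. All demo_* names are hypothetical, and
the root cgroup is modeled as the one with no parent.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_IDS 8

struct demo_cgroup {
	int id;				/* stable index, like kmemcg_id */
	struct demo_cgroup *parent;	/* NULL for the root */
};

struct demo_list {
	long nr_items;
};

struct demo_lru {
	struct demo_list root;			/* always present */
	struct demo_list *slots[DEMO_MAX_IDS];	/* NULL once reparented */
};

/*
 * Look up the list for cg. An empty slot means the cgroup has been
 * reparented, so walk up until a populated slot or the root is found.
 */
static struct demo_list *demo_lookup(struct demo_lru *lru,
				     struct demo_cgroup *cg)
{
	while (cg && cg->parent) {
		struct demo_list *l = lru->slots[cg->id];

		if (l)
			return l;
		cg = cg->parent;
	}
	return &lru->root;
}
```

Since the index of a cgroup never changes, the lookup needs no
retry on a changed id; it only follows parent pointers.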

As a result the code is simpler, and the following test also showed a
very slight performance gain (12 test runs):

prepare() {
        mkdir /tmp/test-fs
        modprobe brd rd_nr=1 rd_size=16777216
        mkfs.xfs -f /dev/ram0
        mount -t xfs /dev/ram0 /tmp/test-fs
        for i in $(seq 10000); do
                seq 8000 > "/tmp/test-fs/$i"
        done
        mkdir -p /sys/fs/cgroup/system.slice/bench/test/1
        echo +memory > /sys/fs/cgroup/system.slice/bench/cgroup.subtree_control
        echo +memory > /sys/fs/cgroup/system.slice/bench/test/cgroup.subtree_control
        echo +memory > /sys/fs/cgroup/system.slice/bench/test/1/cgroup.subtree_control
        echo 768M > /sys/fs/cgroup/system.slice/bench/memory.max
}

do_test() {
        read_worker() {
                mkdir -p "/sys/fs/cgroup/system.slice/bench/test/1/$1"
                echo $BASHPID > "/sys/fs/cgroup/system.slice/bench/test/1/$1/cgroup.procs"
                read -r __TMP < "/tmp/test-fs/$1";
        }
        read_in_all() {
                for i in $(seq 10000); do
                        read_worker "$i" &
                done; wait
        }
        echo 3 > /proc/sys/vm/drop_caches
        time read_in_all
        for i in $(seq 1 10000); do
                rmdir "/sys/fs/cgroup/system.slice/bench/test/1/$i" &>/dev/null
        done
}

Before:
real    0m3.498s   user    0m11.037s  sys     0m35.872s
real    1m33.860s  user    0m11.593s  sys     3m1.169s
real    1m31.883s  user    0m11.265s  sys     2m59.198s
real    1m32.394s  user    0m11.294s  sys     3m1.616s
real    1m31.017s  user    0m11.379s  sys     3m1.349s
real    1m31.931s  user    0m11.295s  sys     2m59.863s
real    1m32.758s  user    0m11.254s  sys     2m59.538s
real    1m35.198s  user    0m11.145s  sys     3m1.123s
real    1m30.531s  user    0m11.393s  sys     2m58.089s
real    1m31.142s  user    0m11.333s  sys     3m0.549s

After:
real    0m3.489s   user    0m10.943s  sys     0m36.036s
real    1m10.893s  user    0m11.495s  sys     2m38.545s
real    1m29.129s  user    0m11.382s  sys     3m1.601s
real    1m29.944s  user    0m11.494s  sys     3m1.575s
real    1m31.208s  user    0m11.451s  sys     2m59.693s
real    1m25.944s  user    0m11.327s  sys     2m56.394s
real    1m28.599s  user    0m11.312s  sys     3m0.162s
real    1m26.746s  user    0m11.538s  sys     2m55.462s
real    1m30.668s  user    0m11.475s  sys     3m2.075s
real    1m29.258s  user    0m11.292s  sys     3m0.780s

The result is slightly faster in real time.

Signed-off-by: Kairui Song <kasong@tencent.com>
---
 mm/list_lru.c | 178 +++++++++++++++++++++-----------------------------
 mm/zswap.c    |   7 +-
 2 files changed, 77 insertions(+), 108 deletions(-)

diff --git a/mm/list_lru.c b/mm/list_lru.c
index b54f092d4d65..172b16146e15 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -59,6 +59,20 @@ list_lru_from_memcg_idx(struct list_lru *lru, int nid, int idx)
 	}
 	return &lru->node[nid].lru;
 }
+
+static inline struct list_lru_one *
+list_lru_from_memcg(struct list_lru *lru, int nid, struct mem_cgroup *memcg)
+{
+	struct list_lru_one *l;
+again:
+	l = list_lru_from_memcg_idx(lru, nid, memcg_kmem_id(memcg));
+	if (likely(l))
+		return l;
+
+	memcg = parent_mem_cgroup(memcg);
+	VM_WARN_ON(!css_is_dying(&memcg->css));
+	goto again;
+}
 #else
 static void list_lru_register(struct list_lru *lru)
 {
@@ -83,6 +97,12 @@ list_lru_from_memcg_idx(struct list_lru *lru, int nid, int idx)
 {
 	return &lru->node[nid].lru;
 }
+
+static inline struct list_lru_one *
+list_lru_from_memcg(struct list_lru *lru, int nid, int idx)
+{
+	return &lru->node[nid].lru;
+}
 #endif /* CONFIG_MEMCG */
 
 /* The caller must ensure the memcg lifetime. */
@@ -94,7 +114,7 @@ bool list_lru_add(struct list_lru *lru, struct list_head *item, int nid,
 
 	spin_lock(&nlru->lock);
 	if (list_empty(item)) {
-		l = list_lru_from_memcg_idx(lru, nid, memcg_kmem_id(memcg));
+		l = list_lru_from_memcg(lru, nid, memcg);
 		list_add_tail(item, &l->list);
 		/* Set shrinker bit if the first element was added */
 		if (!l->nr_items++)
@@ -133,7 +153,7 @@ bool list_lru_del(struct list_lru *lru, struct list_head *item, int nid,
 
 	spin_lock(&nlru->lock);
 	if (!list_empty(item)) {
-		l = list_lru_from_memcg_idx(lru, nid, memcg_kmem_id(memcg));
+		l = list_lru_from_memcg(lru, nid, memcg);
 		list_del_init(item);
 		l->nr_items--;
 		nlru->nr_items--;
@@ -355,20 +375,6 @@ static struct list_lru_memcg *memcg_init_list_lru_one(gfp_t gfp)
 	return mlru;
 }
 
-static void memcg_list_lru_free(struct list_lru *lru, int src_idx)
-{
-	struct list_lru_memcg *mlru = xa_erase_irq(&lru->xa, src_idx);
-
-	/*
-	 * The __list_lru_walk_one() can walk the list of this node.
-	 * We need kvfree_rcu() here. And the walking of the list
-	 * is under lru->node[nid]->lock, which can serve as a RCU
-	 * read-side critical section.
-	 */
-	if (mlru)
-		kvfree_rcu(mlru, rcu);
-}
-
 static inline void memcg_init_list_lru(struct list_lru *lru, bool memcg_aware)
 {
 	if (memcg_aware)
@@ -393,22 +399,18 @@ static void memcg_destroy_list_lru(struct list_lru *lru)
 }
 
 static void memcg_reparent_list_lru_node(struct list_lru *lru, int nid,
-					 int src_idx, struct mem_cgroup *dst_memcg)
+					 struct list_lru_one *src,
+					 struct mem_cgroup *dst_memcg)
 {
 	struct list_lru_node *nlru = &lru->node[nid];
-	int dst_idx = dst_memcg->kmemcg_id;
-	struct list_lru_one *src, *dst;
+	struct list_lru_one *dst;
 
 	/*
 	 * Since list_lru_{add,del} may be called under an IRQ-safe lock,
 	 * we have to use IRQ-safe primitives here to avoid deadlock.
 	 */
 	spin_lock_irq(&nlru->lock);
-
-	src = list_lru_from_memcg_idx(lru, nid, src_idx);
-	if (!src)
-		goto out;
-	dst = list_lru_from_memcg_idx(lru, nid, dst_idx);
+	dst = list_lru_from_memcg_idx(lru, nid, memcg_kmem_id(dst_memcg));
 
 	list_splice_init(&src->list, &dst->list);
 
@@ -417,46 +419,43 @@ static void memcg_reparent_list_lru_node(struct list_lru *lru, int nid,
 		set_shrinker_bit(dst_memcg, nid, lru_shrinker_id(lru));
 		src->nr_items = 0;
 	}
-out:
 	spin_unlock_irq(&nlru->lock);
 }
 
 void memcg_reparent_list_lrus(struct mem_cgroup *memcg, struct mem_cgroup *parent)
 {
-	struct cgroup_subsys_state *css;
 	struct list_lru *lru;
-	int src_idx = memcg->kmemcg_id, i;
-
-	/*
-	 * Change kmemcg_id of this cgroup and all its descendants to the
-	 * parent's id, and then move all entries from this cgroup's list_lrus
-	 * to ones of the parent.
-	 */
-	rcu_read_lock();
-	css_for_each_descendant_pre(css, &memcg->css) {
-		struct mem_cgroup *child;
-
-		child = mem_cgroup_from_css(css);
-		WRITE_ONCE(child->kmemcg_id, parent->kmemcg_id);
-	}
-	rcu_read_unlock();
+	int i;
 
-	/*
-	 * With kmemcg_id set to parent, holding the lock of each list_lru_node
-	 * below can prevent list_lru_{add,del,isolate} from touching the lru,
-	 * safe to reparent.
-	 */
 	mutex_lock(&list_lrus_mutex);
 	list_for_each_entry(lru, &memcg_list_lrus, list) {
+		struct list_lru_memcg *mlru;
+		XA_STATE(xas, &lru->xa, memcg->kmemcg_id);
+
+		/*
+		 * Lock the Xarray to ensure no on going list_lru_memcg
+		 * allocation and further allocation will see css_is_dying().
+		 */
+		xas_lock_irq(&xas);
+		mlru = xas_store(&xas, NULL);
+		xas_unlock_irq(&xas);
+		if (!mlru)
+			continue;
+
+		/*
+		 * With Xarray value set to NULL, holding the lru lock below
+		 * prevents list_lru_{add,del,isolate} from touching the lru,
+		 * safe to reparent.
+		 */
 		for_each_node(i)
-			memcg_reparent_list_lru_node(lru, i, src_idx, parent);
+			memcg_reparent_list_lru_node(lru, i, &mlru->node[i], parent);
 
 		/*
 		 * Here all list_lrus corresponding to the cgroup are guaranteed
 		 * to remain empty, we can safely free this lru, any further
 		 * memcg_list_lru_alloc() call will simply bail out.
 		 */
-		memcg_list_lru_free(lru, src_idx);
+		kvfree_rcu(mlru, rcu);
 	}
 	mutex_unlock(&list_lrus_mutex);
 }
@@ -472,77 +471,48 @@ static inline bool memcg_list_lru_allocated(struct mem_cgroup *memcg,
 int memcg_list_lru_alloc(struct mem_cgroup *memcg, struct list_lru *lru,
 			 gfp_t gfp)
 {
-	int i;
 	unsigned long flags;
-	struct list_lru_memcg_table {
-		struct list_lru_memcg *mlru;
-		struct mem_cgroup *memcg;
-	} *table;
+	struct list_lru_memcg *mlru;
+	struct mem_cgroup *pos, *parent;
 	XA_STATE(xas, &lru->xa, 0);
 
 	if (!list_lru_memcg_aware(lru) || memcg_list_lru_allocated(memcg, lru))
 		return 0;
 
 	gfp &= GFP_RECLAIM_MASK;
-	table = kmalloc_array(memcg->css.cgroup->level, sizeof(*table), gfp);
-	if (!table)
-		return -ENOMEM;
-
 	/*
 	 * Because the list_lru can be reparented to the parent cgroup's
 	 * list_lru, we should make sure that this cgroup and all its
 	 * ancestors have allocated list_lru_memcg.
 	 */
-	for (i = 0; memcg; memcg = parent_mem_cgroup(memcg), i++) {
-		if (memcg_list_lru_allocated(memcg, lru))
-			break;
-
-		table[i].memcg = memcg;
-		table[i].mlru = memcg_init_list_lru_one(gfp);
-		if (!table[i].mlru) {
-			while (i--)
-				kfree(table[i].mlru);
-			kfree(table);
-			return -ENOMEM;
+	do {
+		/*
+		 * Keep finding the farest parent that wasn't populated
+		 * until found memcg itself.
+		 */
+		pos = memcg;
+		parent = parent_mem_cgroup(pos);
+		while (!memcg_list_lru_allocated(parent, lru)) {
+			pos = parent;
+			parent = parent_mem_cgroup(pos);
 		}
-	}
-
-	xas_lock_irqsave(&xas, flags);
-	while (i--) {
-		int index = READ_ONCE(table[i].memcg->kmemcg_id);
-		struct list_lru_memcg *mlru = table[i].mlru;
 
-		xas_set(&xas, index);
-retry:
-		if (unlikely(index < 0 || xas_error(&xas) || xas_load(&xas))) {
-			kfree(mlru);
-		} else {
-			xas_store(&xas, mlru);
-			if (xas_error(&xas) == -ENOMEM) {
-				xas_unlock_irqrestore(&xas, flags);
-				if (xas_nomem(&xas, gfp))
-					xas_set_err(&xas, 0);
-				xas_lock_irqsave(&xas, flags);
-				/*
-				 * The xas lock has been released, this memcg
-				 * can be reparented before us. So reload
-				 * memcg id. More details see the comments
-				 * in memcg_reparent_list_lrus().
-				 */
-				index = READ_ONCE(table[i].memcg->kmemcg_id);
-				if (index < 0)
-					xas_set_err(&xas, 0);
-				else if (!xas_error(&xas) && index != xas.xa_index)
-					xas_set(&xas, index);
-				goto retry;
+		mlru = memcg_init_list_lru_one(gfp);
+		if (!mlru)
+			return -ENOMEM;
+		xas_set(&xas, pos->kmemcg_id);
+		do {
+			xas_lock_irqsave(&xas, flags);
+			if (!xas_load(&xas) && !css_is_dying(&pos->css)) {
+				xas_store(&xas, mlru);
+				if (!xas_error(&xas))
+					mlru = NULL;
 			}
-		}
-	}
-	/* xas_nomem() is used to free memory instead of memory allocation. */
-	if (xas.xa_alloc)
-		xas_nomem(&xas, gfp);
-	xas_unlock_irqrestore(&xas, flags);
-	kfree(table);
+			xas_unlock_irqrestore(&xas, flags);
+		} while (xas_nomem(&xas, gfp));
+		if (mlru)
+			kfree(mlru);
+	} while (pos != memcg && !css_is_dying(&pos->css));
 
 	return xas_error(&xas);
 }
diff --git a/mm/zswap.c b/mm/zswap.c
index 162013952074..6910c37cb8ec 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -703,12 +703,11 @@ static void zswap_lru_add(struct list_lru *list_lru, struct zswap_entry *entry)
 
 	/*
 	 * Note that it is safe to use rcu_read_lock() here, even in the face of
-	 * concurrent memcg offlining. Thanks to the memcg->kmemcg_id indirection
-	 * used in list_lru lookup, only two scenarios are possible:
+	 * concurrent memcg offlining:
 	 *
-	 * 1. list_lru_add() is called before memcg->kmemcg_id is updated. The
+	 * 1. list_lru_add() is called before list_lru_memcg is erased. The
 	 *    new entry will be reparented to memcg's parent's list_lru.
-	 * 2. list_lru_add() is called after memcg->kmemcg_id is updated. The
+	 * 2. list_lru_add() is called after list_lru_memcg is erased. The
 	 *    new entry will be added directly to memcg's parent's list_lru.
 	 *
 	 * Similar reasoning holds for list_lru_del().
-- 
2.47.0




* [PATCH v3 5/6] mm/list_lru: split the lock to per-cgroup scope
  2024-11-04 17:52 [PATCH v3 0/6] mm/list_lru: Split list_lru lock into per-cgroup scope Kairui Song
                   ` (3 preceding siblings ...)
  2024-11-04 17:52 ` [PATCH v3 4/6] mm/list_lru: simplify reparenting and initial allocation Kairui Song
@ 2024-11-04 17:52 ` Kairui Song
  2024-11-04 17:52 ` [PATCH v3 6/6] mm/list_lru: Simplify the list_lru walk callback function Kairui Song
  5 siblings, 0 replies; 7+ messages in thread
From: Kairui Song @ 2024-11-04 17:52 UTC
  To: linux-mm
  Cc: Andrew Morton, Matthew Wilcox, Johannes Weiner, Roman Gushchin,
	Waiman Long, Shakeel Butt, Michal Hocko, Chengming Zhou,
	Qi Zheng, Muchun Song, Kairui Song

From: Kairui Song <kasong@tencent.com>

Currently, every list_lru has a per-node lock that protects adding,
deletion, isolation, and reparenting of all list_lru_one instances
belonging to this list_lru on this node. Contention on this lock is
heavy when multiple cgroups modify the same list_lru.

This lock can be split into per-cgroup scope to reduce contention.

To achieve this, we need a stable list_lru_one for every cgroup.
This commit adds a lock to each list_lru_one and introduces a
helper function, lock_list_lru_of_memcg, making it possible to pin
the list_lru of a memcg. It then reworks the reparenting process.

Reparenting will switch the list_lru_one instances one by one.
By locking each instance and marking it dead using the nr_items
counter, reparenting ensures that all items in the corresponding
cgroup (on-list or not, because items have a stable cgroup, see below)
will see the list_lru_one switch synchronously.

Objcg reparent is also moved after list_lru reparent so items will have
a stable mem cgroup until all list_lru_one instances are drained.
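
The lock-then-check idea can be sketched in userspace C with pthread
mutexes. This is an illustration under stated assumptions: the demo_*
names are hypothetical, each list carries a direct parent pointer
instead of being looked up through the memcg, and RCU and IRQ handling
are left out.

```c
#include <assert.h>
#include <limits.h>
#include <pthread.h>
#include <stddef.h>

struct demo_list {
	pthread_mutex_t lock;
	long nr_items;			/* LONG_MIN marks a dead list */
	struct demo_list *parent;	/* NULL for the root list */
};

/*
 * Return the first live list on the path to the root, with its lock
 * held. The root list is never marked dead.
 */
static struct demo_list *demo_lock_live(struct demo_list *l)
{
	for (;;) {
		pthread_mutex_lock(&l->lock);
		if (l->nr_items != LONG_MIN || !l->parent)
			return l;
		/* Raced with reparenting: retry on the parent. */
		pthread_mutex_unlock(&l->lock);
		l = l->parent;
	}
}

/* Move everything to the parent and mark the source dead. */
static void demo_reparent(struct demo_list *src)
{
	struct demo_list *dst = src->parent;

	pthread_mutex_lock(&src->lock);
	pthread_mutex_lock(&dst->lock);	/* always child, then parent */
	dst->nr_items += src->nr_items;
	src->nr_items = LONG_MIN;
	pthread_mutex_unlock(&dst->lock);
	pthread_mutex_unlock(&src->lock);
}
```

A caller that obtains a list from demo_lock_live() holds its lock, so
the list cannot be marked dead until the caller unlocks it.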

The only callers that don't use the *_obj interfaces are direct
calls to list_lru_{add,del}. But these are only used by zswap, which
is also based on objcg, so it's fine.

This also changes the behaviour of the isolation function when
LRU_RETRY or LRU_REMOVED_RETRY is returned. Because releasing the
lock could now unblock reparenting and free the list_lru_one, the
isolation function has to return without re-locking the lru.
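
The new callback contract can be sketched in userspace C as below.
This is a hypothetical model, not the kernel walker: the demo_* names
are invented for illustration, and a real walker would redo the list
lookup after a retry instead of reusing the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

enum demo_status { DEMO_REMOVED, DEMO_RETRY };

struct demo_item {
	bool busy;	/* needs slow work that cannot run under the lock */
	bool on_list;
};

/* Called with the lock held. */
static enum demo_status demo_isolate(struct demo_item *item,
				     pthread_mutex_t *lock)
{
	if (item->busy) {
		pthread_mutex_unlock(lock);
		item->busy = false;	/* stand-in for the slow work */
		/* The lock is intentionally NOT retaken here. */
		return DEMO_RETRY;
	}
	item->on_list = false;
	return DEMO_REMOVED;		/* lock is still held */
}

/* Returns how many times the callback was invoked. */
static int demo_walk_one(struct demo_item *item, pthread_mutex_t *lock)
{
	int calls = 0;

	for (;;) {
		pthread_mutex_lock(lock);
		calls++;
		if (demo_isolate(item, lock) == DEMO_RETRY)
			continue;	/* callback already dropped the lock */
		pthread_mutex_unlock(lock);
		return calls;
	}
}
```

Leaving the re-lock to the walker lets it look the list up again
before locking, which a callback holding only a lock pointer cannot
do.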

prepare() {
    mkdir /tmp/test-fs
    modprobe brd rd_nr=1 rd_size=33554432
    mkfs.xfs -f /dev/ram0
    mount -t xfs /dev/ram0 /tmp/test-fs
    for i in $(seq 1 512); do
        mkdir "/tmp/test-fs/$i"
        for j in $(seq 1 10240); do
            echo TEST-CONTENT > "/tmp/test-fs/$i/$j"
        done &
    done; wait
}

do_test() {
    read_worker() {
        sleep 1
        tar -cv "$1" &>/dev/null
    }
    read_in_all() {
        cd "/tmp/test-fs" && ls
        for i in $(seq 1 512); do
            (exec sh -c 'echo "$PPID"') > "/sys/fs/cgroup/benchmark/$i/cgroup.procs"
            read_worker "$i" &
        done; wait
    }
    for i in $(seq 1 512); do
        mkdir -p "/sys/fs/cgroup/benchmark/$i"
    done
    echo +memory > /sys/fs/cgroup/benchmark/cgroup.subtree_control
    echo 512M > /sys/fs/cgroup/benchmark/memory.max
    echo 3 > /proc/sys/vm/drop_caches
    time read_in_all
}

The above script simulates compression of small files in multiple
cgroups under memory pressure. Run prepare() and then do_test() 6 times:

Before:
real      0m7.762s user      0m11.340s sys       3m11.224s
real      0m8.123s user      0m11.548s sys       3m2.549s
real      0m7.736s user      0m11.515s sys       3m11.171s
real      0m8.539s user      0m11.508s sys       3m7.618s
real      0m7.928s user      0m11.349s sys       3m13.063s
real      0m8.105s user      0m11.128s sys       3m14.313s

After this commit (about 15% faster):
real      0m6.953s user      0m11.327s sys       2m42.912s
real      0m7.453s user      0m11.343s sys       2m51.942s
real      0m6.916s user      0m11.269s sys       2m43.957s
real      0m6.894s user      0m11.528s sys       2m45.346s
real      0m6.911s user      0m11.095s sys       2m43.168s
real      0m6.773s user      0m11.518s sys       2m40.774s

Signed-off-by: Kairui Song <kasong@tencent.com>
---
 drivers/android/binder_alloc.c |   1 -
 fs/inode.c                     |   1 -
 fs/xfs/xfs_qm.c                |   1 -
 include/linux/list_lru.h       |   6 +-
 mm/list_lru.c                  | 216 +++++++++++++++++++--------------
 mm/memcontrol.c                |   7 +-
 mm/workingset.c                |   1 -
 mm/zswap.c                     |   5 +-
 8 files changed, 135 insertions(+), 103 deletions(-)

diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index b3acbc4174fb..86bbe40f4bcd 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -1106,7 +1106,6 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
 	mmput_async(mm);
 	__free_page(page_to_free);
 
-	spin_lock(lock);
 	return LRU_REMOVED_RETRY;
 
 err_invalid_vma:
diff --git a/fs/inode.c b/fs/inode.c
index 8dabb224f941..442cb4fc09b2 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -934,7 +934,6 @@ static enum lru_status inode_lru_isolate(struct list_head *item,
 			mm_account_reclaimed_pages(reap);
 		}
 		inode_unpin_lru_isolating(inode);
-		spin_lock(lru_lock);
 		return LRU_RETRY;
 	}
 
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 7e2307921deb..665d26990b78 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -496,7 +496,6 @@ xfs_qm_dquot_isolate(
 	trace_xfs_dqreclaim_busy(dqp);
 	XFS_STATS_INC(dqp->q_mount, xs_qm_dqreclaim_misses);
 	xfs_dqunlock(dqp);
-	spin_lock(lru_lock);
 	return LRU_RETRY;
 }
 
diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
index eba93f6511f3..10ba9a54d42c 100644
--- a/include/linux/list_lru.h
+++ b/include/linux/list_lru.h
@@ -32,6 +32,8 @@ struct list_lru_one {
 	struct list_head	list;
 	/* may become negative during memcg reparenting */
 	long			nr_items;
+	/* protects all fields above */
+	spinlock_t		lock;
 };
 
 struct list_lru_memcg {
@@ -41,11 +43,9 @@ struct list_lru_memcg {
 };
 
 struct list_lru_node {
-	/* protects all lists on the node, including per cgroup */
-	spinlock_t		lock;
 	/* global list, used for the root cgroup in cgroup aware lrus */
 	struct list_lru_one	lru;
-	long			nr_items;
+	atomic_long_t		nr_items;
 } ____cacheline_aligned_in_smp;
 
 struct list_lru {
diff --git a/mm/list_lru.c b/mm/list_lru.c
index 172b16146e15..c139202e27f7 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -61,18 +61,51 @@ list_lru_from_memcg_idx(struct list_lru *lru, int nid, int idx)
 }
 
 static inline struct list_lru_one *
-list_lru_from_memcg(struct list_lru *lru, int nid, struct mem_cgroup *memcg)
+lock_list_lru_of_memcg(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
+		       bool irq, bool skip_empty)
 {
 	struct list_lru_one *l;
+	long nr_items;
+
+	rcu_read_lock();
 again:
 	l = list_lru_from_memcg_idx(lru, nid, memcg_kmem_id(memcg));
-	if (likely(l))
-		return l;
-
-	memcg = parent_mem_cgroup(memcg);
+	if (likely(l)) {
+		if (irq)
+			spin_lock_irq(&l->lock);
+		else
+			spin_lock(&l->lock);
+		nr_items = READ_ONCE(l->nr_items);
+		if (likely(nr_items != LONG_MIN)) {
+			WARN_ON(nr_items < 0);
+			rcu_read_unlock();
+			return l;
+		}
+		if (irq)
+			spin_unlock_irq(&l->lock);
+		else
+			spin_unlock(&l->lock);
+	}
+	/*
+	 * Caller may simply bail out if raced with reparenting or
+	 * may iterate through the list_lru and expect empty slots.
+	 */
+	if (skip_empty) {
+		rcu_read_unlock();
+		return NULL;
+	}
 	VM_WARN_ON(!css_is_dying(&memcg->css));
+	memcg = parent_mem_cgroup(memcg);
 	goto again;
 }
+
+static inline void unlock_list_lru(struct list_lru_one *l, bool irq_off)
+{
+	if (irq_off)
+		spin_unlock_irq(&l->lock);
+	else
+		spin_unlock(&l->lock);
+}
 #else
 static void list_lru_register(struct list_lru *lru)
 {
@@ -99,31 +132,48 @@ list_lru_from_memcg_idx(struct list_lru *lru, int nid, int idx)
 }
 
 static inline struct list_lru_one *
-list_lru_from_memcg(struct list_lru *lru, int nid, int idx)
+lock_list_lru_of_memcg(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
+		       bool irq, bool skip_empty)
 {
-	return &lru->node[nid].lru;
+	struct list_lru_one *l = &lru->node[nid].lru;
+
+	if (irq)
+		spin_lock_irq(&l->lock);
+	else
+		spin_lock(&l->lock);
+
+	return l;
+}
+
+static inline void unlock_list_lru(struct list_lru_one *l, bool irq_off)
+{
+	if (irq_off)
+		spin_unlock_irq(&l->lock);
+	else
+		spin_unlock(&l->lock);
 }
 #endif /* CONFIG_MEMCG */
 
 /* The caller must ensure the memcg lifetime. */
 bool list_lru_add(struct list_lru *lru, struct list_head *item, int nid,
-		    struct mem_cgroup *memcg)
+		  struct mem_cgroup *memcg)
 {
 	struct list_lru_node *nlru = &lru->node[nid];
 	struct list_lru_one *l;
 
-	spin_lock(&nlru->lock);
+	l = lock_list_lru_of_memcg(lru, nid, memcg, false, false);
+	if (!l)
+		return false;
 	if (list_empty(item)) {
-		l = list_lru_from_memcg(lru, nid, memcg);
 		list_add_tail(item, &l->list);
 		/* Set shrinker bit if the first element was added */
 		if (!l->nr_items++)
 			set_shrinker_bit(memcg, nid, lru_shrinker_id(lru));
-		nlru->nr_items++;
-		spin_unlock(&nlru->lock);
+		unlock_list_lru(l, false);
+		atomic_long_inc(&nlru->nr_items);
 		return true;
 	}
-	spin_unlock(&nlru->lock);
+	unlock_list_lru(l, false);
 	return false;
 }
 
@@ -146,24 +196,23 @@ EXPORT_SYMBOL_GPL(list_lru_add_obj);
 
 /* The caller must ensure the memcg lifetime. */
 bool list_lru_del(struct list_lru *lru, struct list_head *item, int nid,
-		    struct mem_cgroup *memcg)
+		  struct mem_cgroup *memcg)
 {
 	struct list_lru_node *nlru = &lru->node[nid];
 	struct list_lru_one *l;
-
-	spin_lock(&nlru->lock);
+	l = lock_list_lru_of_memcg(lru, nid, memcg, false, false);
+	if (!l)
+		return false;
 	if (!list_empty(item)) {
-		l = list_lru_from_memcg(lru, nid, memcg);
 		list_del_init(item);
 		l->nr_items--;
-		nlru->nr_items--;
-		spin_unlock(&nlru->lock);
+		unlock_list_lru(l, false);
+		atomic_long_dec(&nlru->nr_items);
 		return true;
 	}
-	spin_unlock(&nlru->lock);
+	unlock_list_lru(l, false);
 	return false;
 }
-EXPORT_SYMBOL_GPL(list_lru_del);
 
 bool list_lru_del_obj(struct list_lru *lru, struct list_head *item)
 {
@@ -220,25 +269,24 @@ unsigned long list_lru_count_node(struct list_lru *lru, int nid)
 	struct list_lru_node *nlru;
 
 	nlru = &lru->node[nid];
-	return nlru->nr_items;
+	return atomic_long_read(&nlru->nr_items);
 }
 EXPORT_SYMBOL_GPL(list_lru_count_node);
 
 static unsigned long
-__list_lru_walk_one(struct list_lru *lru, int nid, int memcg_idx,
+__list_lru_walk_one(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
 		    list_lru_walk_cb isolate, void *cb_arg,
-		    unsigned long *nr_to_walk)
+		    unsigned long *nr_to_walk, bool irq_off)
 {
 	struct list_lru_node *nlru = &lru->node[nid];
-	struct list_lru_one *l;
+	struct list_lru_one *l = NULL;
 	struct list_head *item, *n;
 	unsigned long isolated = 0;
 
 restart:
-	l = list_lru_from_memcg_idx(lru, nid, memcg_idx);
+	l = lock_list_lru_of_memcg(lru, nid, memcg, irq_off, true);
 	if (!l)
-		goto out;
-
+		return isolated;
 	list_for_each_safe(item, n, &l->list) {
 		enum lru_status ret;
 
@@ -250,19 +298,19 @@ __list_lru_walk_one(struct list_lru *lru, int nid, int memcg_idx,
 			break;
 		--*nr_to_walk;
 
-		ret = isolate(item, l, &nlru->lock, cb_arg);
+		ret = isolate(item, l, &l->lock, cb_arg);
 		switch (ret) {
+		/*
+		 * LRU_RETRY, LRU_REMOVED_RETRY and LRU_STOP will drop the lru
+		 * lock. List traversal will have to restart from scratch.
+		 */
+		case LRU_RETRY:
+			goto restart;
 		case LRU_REMOVED_RETRY:
-			assert_spin_locked(&nlru->lock);
 			fallthrough;
 		case LRU_REMOVED:
 			isolated++;
-			nlru->nr_items--;
-			/*
-			 * If the lru lock has been dropped, our list
-			 * traversal is now invalid and so we have to
-			 * restart from scratch.
-			 */
+			atomic_long_dec(&nlru->nr_items);
 			if (ret == LRU_REMOVED_RETRY)
 				goto restart;
 			break;
@@ -271,20 +319,13 @@ __list_lru_walk_one(struct list_lru *lru, int nid, int memcg_idx,
 			break;
 		case LRU_SKIP:
 			break;
-		case LRU_RETRY:
-			/*
-			 * The lru lock has been dropped, our list traversal is
-			 * now invalid and so we have to restart from scratch.
-			 */
-			assert_spin_locked(&nlru->lock);
-			goto restart;
 		case LRU_STOP:
-			assert_spin_locked(&nlru->lock);
 			goto out;
 		default:
 			BUG();
 		}
 	}
+	unlock_list_lru(l, irq_off);
 out:
 	return isolated;
 }
@@ -294,14 +335,8 @@ list_lru_walk_one(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
 		  list_lru_walk_cb isolate, void *cb_arg,
 		  unsigned long *nr_to_walk)
 {
-	struct list_lru_node *nlru = &lru->node[nid];
-	unsigned long ret;
-
-	spin_lock(&nlru->lock);
-	ret = __list_lru_walk_one(lru, nid, memcg_kmem_id(memcg), isolate,
-				  cb_arg, nr_to_walk);
-	spin_unlock(&nlru->lock);
-	return ret;
+	return __list_lru_walk_one(lru, nid, memcg, isolate,
+				   cb_arg, nr_to_walk, false);
 }
 EXPORT_SYMBOL_GPL(list_lru_walk_one);
 
@@ -310,14 +345,8 @@ list_lru_walk_one_irq(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
 		      list_lru_walk_cb isolate, void *cb_arg,
 		      unsigned long *nr_to_walk)
 {
-	struct list_lru_node *nlru = &lru->node[nid];
-	unsigned long ret;
-
-	spin_lock_irq(&nlru->lock);
-	ret = __list_lru_walk_one(lru, nid, memcg_kmem_id(memcg), isolate,
-				  cb_arg, nr_to_walk);
-	spin_unlock_irq(&nlru->lock);
-	return ret;
+	return __list_lru_walk_one(lru, nid, memcg, isolate,
+				   cb_arg, nr_to_walk, true);
 }
 
 unsigned long list_lru_walk_node(struct list_lru *lru, int nid,
@@ -332,16 +361,21 @@ unsigned long list_lru_walk_node(struct list_lru *lru, int nid,
 #ifdef CONFIG_MEMCG
 	if (*nr_to_walk > 0 && list_lru_memcg_aware(lru)) {
 		struct list_lru_memcg *mlru;
+		struct mem_cgroup *memcg;
 		unsigned long index;
 
 		xa_for_each(&lru->xa, index, mlru) {
-			struct list_lru_node *nlru = &lru->node[nid];
-
-			spin_lock(&nlru->lock);
-			isolated += __list_lru_walk_one(lru, nid, index,
+			rcu_read_lock();
+			memcg = mem_cgroup_from_id(index);
+			if (!mem_cgroup_tryget(memcg)) {
+				rcu_read_unlock();
+				continue;
+			}
+			rcu_read_unlock();
+			isolated += __list_lru_walk_one(lru, nid, memcg,
 							isolate, cb_arg,
-							nr_to_walk);
-			spin_unlock(&nlru->lock);
+							nr_to_walk, false);
+			mem_cgroup_put(memcg);
 
 			if (*nr_to_walk <= 0)
 				break;
@@ -353,14 +387,19 @@ unsigned long list_lru_walk_node(struct list_lru *lru, int nid,
 }
 EXPORT_SYMBOL_GPL(list_lru_walk_node);
 
-static void init_one_lru(struct list_lru_one *l)
+static void init_one_lru(struct list_lru *lru, struct list_lru_one *l)
 {
 	INIT_LIST_HEAD(&l->list);
+	spin_lock_init(&l->lock);
 	l->nr_items = 0;
+#ifdef CONFIG_LOCKDEP
+	if (lru->key)
+		lockdep_set_class(&l->lock, lru->key);
+#endif
 }
 
 #ifdef CONFIG_MEMCG
-static struct list_lru_memcg *memcg_init_list_lru_one(gfp_t gfp)
+static struct list_lru_memcg *memcg_init_list_lru_one(struct list_lru *lru, gfp_t gfp)
 {
 	int nid;
 	struct list_lru_memcg *mlru;
@@ -370,7 +409,7 @@ static struct list_lru_memcg *memcg_init_list_lru_one(gfp_t gfp)
 		return NULL;
 
 	for_each_node(nid)
-		init_one_lru(&mlru->node[nid]);
+		init_one_lru(lru, &mlru->node[nid]);
 
 	return mlru;
 }
@@ -398,28 +437,27 @@ static void memcg_destroy_list_lru(struct list_lru *lru)
 	xas_unlock_irq(&xas);
 }
 
-static void memcg_reparent_list_lru_node(struct list_lru *lru, int nid,
-					 struct list_lru_one *src,
-					 struct mem_cgroup *dst_memcg)
+static void memcg_reparent_list_lru_one(struct list_lru *lru, int nid,
+					struct list_lru_one *src,
+					struct mem_cgroup *dst_memcg)
 {
-	struct list_lru_node *nlru = &lru->node[nid];
+	int dst_idx = dst_memcg->kmemcg_id;
 	struct list_lru_one *dst;
 
-	/*
-	 * Since list_lru_{add,del} may be called under an IRQ-safe lock,
-	 * we have to use IRQ-safe primitives here to avoid deadlock.
-	 */
-	spin_lock_irq(&nlru->lock);
-	dst = list_lru_from_memcg_idx(lru, nid, memcg_kmem_id(dst_memcg));
+	spin_lock_irq(&src->lock);
+	dst = list_lru_from_memcg_idx(lru, nid, dst_idx);
+	spin_lock_nested(&dst->lock, SINGLE_DEPTH_NESTING);
 
 	list_splice_init(&src->list, &dst->list);
-
 	if (src->nr_items) {
 		dst->nr_items += src->nr_items;
 		set_shrinker_bit(dst_memcg, nid, lru_shrinker_id(lru));
-		src->nr_items = 0;
 	}
-	spin_unlock_irq(&nlru->lock);
+	/* Mark the list_lru_one dead */
+	src->nr_items = LONG_MIN;
+
+	spin_unlock(&dst->lock);
+	spin_unlock_irq(&src->lock);
 }
 
 void memcg_reparent_list_lrus(struct mem_cgroup *memcg, struct mem_cgroup *parent)
@@ -448,7 +486,7 @@ void memcg_reparent_list_lrus(struct mem_cgroup *memcg, struct mem_cgroup *paren
 		 * safe to reparent.
 		 */
 		for_each_node(i)
-			memcg_reparent_list_lru_node(lru, i, &mlru->node[i], parent);
+			memcg_reparent_list_lru_one(lru, i, &mlru->node[i], parent);
 
 		/*
 		 * Here all list_lrus corresponding to the cgroup are guaranteed
@@ -497,7 +535,7 @@ int memcg_list_lru_alloc(struct mem_cgroup *memcg, struct list_lru *lru,
 			parent = parent_mem_cgroup(pos);
 		}
 
-		mlru = memcg_init_list_lru_one(gfp);
+		mlru = memcg_init_list_lru_one(lru, gfp);
 		if (!mlru)
 			return -ENOMEM;
 		xas_set(&xas, pos->kmemcg_id);
@@ -544,14 +582,8 @@ int __list_lru_init(struct list_lru *lru, bool memcg_aware, struct shrinker *shr
 	if (!lru->node)
 		return -ENOMEM;
 
-	for_each_node(i) {
-		spin_lock_init(&lru->node[i].lock);
-#ifdef CONFIG_LOCKDEP
-		if (lru->key)
-			lockdep_set_class(&lru->node[i].lock, lru->key);
-#endif
-		init_one_lru(&lru->node[i].lru);
-	}
+	for_each_node(i)
+		init_one_lru(lru, &lru->node[i].lru);
 
 	memcg_init_list_lru(lru, memcg_aware);
 	list_lru_register(lru);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 8e90aa026c47..f421dfcfe8a1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3098,8 +3098,13 @@ static void memcg_offline_kmem(struct mem_cgroup *memcg)
 	if (!parent)
 		parent = root_mem_cgroup;
 
-	memcg_reparent_objcgs(memcg, parent);
 	memcg_reparent_list_lrus(memcg, parent);
+
+	/*
+	 * Objcg's reparenting must be after list_lru's, make sure list_lru
+	 * helpers won't use parent's list_lru until child is drained.
+	 */
+	memcg_reparent_objcgs(memcg, parent);
 }
 
 #ifdef CONFIG_CGROUP_WRITEBACK
diff --git a/mm/workingset.c b/mm/workingset.c
index df3937c5eedc..8c4b6738dcad 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -777,7 +777,6 @@ static enum lru_status shadow_lru_isolate(struct list_head *item,
 	ret = LRU_REMOVED_RETRY;
 out:
 	cond_resched();
-	spin_lock_irq(lru_lock);
 	return ret;
 }
 
diff --git a/mm/zswap.c b/mm/zswap.c
index 6910c37cb8ec..63bcd94dc2cb 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -705,9 +705,9 @@ static void zswap_lru_add(struct list_lru *list_lru, struct zswap_entry *entry)
 	 * Note that it is safe to use rcu_read_lock() here, even in the face of
 	 * concurrent memcg offlining:
 	 *
-	 * 1. list_lru_add() is called before list_lru_memcg is erased. The
+	 * 1. list_lru_add() is called before list_lru_one is dead. The
 	 *    new entry will be reparented to memcg's parent's list_lru.
-	 * 2. list_lru_add() is called after list_lru_memcg is erased. The
+	 * 2. list_lru_add() is called after list_lru_one is dead. The
 	 *    new entry will be added directly to memcg's parent's list_lru.
 	 *
 	 * Similar reasoning holds for list_lru_del().
@@ -1172,7 +1172,6 @@ static enum lru_status shrink_memcg_cb(struct list_head *item, struct list_lru_o
 		zswap_written_back_pages++;
 	}
 
-	spin_lock(lock);
 	return ret;
 }
 
-- 
2.47.0




* [PATCH v3 6/6] mm/list_lru: Simplify the list_lru walk callback function
  2024-11-04 17:52 [PATCH v3 0/6] mm/list_lru: Split list_lru lock into per-cgroup scope Kairui Song
                   ` (4 preceding siblings ...)
  2024-11-04 17:52 ` [PATCH v3 5/6] mm/list_lru: split the lock to per-cgroup scope Kairui Song
@ 2024-11-04 17:52 ` Kairui Song
  5 siblings, 0 replies; 7+ messages in thread
From: Kairui Song @ 2024-11-04 17:52 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, Matthew Wilcox, Johannes Weiner, Roman Gushchin,
	Waiman Long, Shakeel Butt, Michal Hocko, Chengming Zhou,
	Qi Zheng, Muchun Song, Kairui Song

From: Kairui Song <kasong@tencent.com>

Isolation no longer takes the global per-node list_lru lock; it uses only
the per-cgroup lock. Since that lock is embedded in the list_lru_one being
walked, there is no need to pass it to the callback explicitly.

Signed-off-by: Kairui Song <kasong@tencent.com>
---
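Note for readers: the callback contract after this change can be
modelled in userspace roughly as below. This is only a sketch under
stated assumptions. struct lru_one, struct item, walk(), lru_isolate(),
lru_push(), lru_init() and reclaim_cb() are hypothetical names, a
pthread mutex stands in for the spinlock, and a singly linked list
stands in for list_head. The point it illustrates is that the callback
reaches the lock through the list it is handed, and that a callback
returning the retry status has dropped that lock, so the walker retakes
it and restarts.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum walk_status { WALK_REMOVED, WALK_SKIP, WALK_RETRY };

struct item {
	struct item *next;
	int busy;		/* hypothetical "cannot reclaim right now" flag */
};

struct lru_one {
	pthread_mutex_t lock;	/* embedded lock, protects head and nr_items */
	struct item *head;
	long nr_items;
};

/* No separate lock argument: the lock is reachable as &l->lock. */
typedef enum walk_status (*walk_cb)(struct item *it, struct lru_one *l,
				    void *arg);

static void lru_init(struct lru_one *l)
{
	pthread_mutex_init(&l->lock, NULL);
	l->head = NULL;
	l->nr_items = 0;
}

static void lru_push(struct lru_one *l, struct item *it)
{
	pthread_mutex_lock(&l->lock);
	it->next = l->head;
	l->head = it;
	l->nr_items++;
	pthread_mutex_unlock(&l->lock);
}

/* Unlink it from l; the caller must hold l->lock. */
static void lru_isolate(struct lru_one *l, struct item *it)
{
	struct item **pp = &l->head;

	while (*pp && *pp != it)
		pp = &(*pp)->next;
	if (*pp) {
		*pp = it->next;
		it->next = NULL;
		l->nr_items--;
	}
}

static long walk(struct lru_one *l, walk_cb cb, void *arg, long nr_to_walk)
{
	struct item *it, *next;
	long isolated = 0;

restart:
	pthread_mutex_lock(&l->lock);
	for (it = l->head; it; it = next) {
		next = it->next;
		if (nr_to_walk-- <= 0)
			break;
		switch (cb(it, l, arg)) {
		case WALK_REMOVED:
			isolated++;
			break;
		case WALK_SKIP:
			break;
		case WALK_RETRY:
			/* The callback dropped l->lock; start over. */
			goto restart;
		}
	}
	pthread_mutex_unlock(&l->lock);
	return isolated;
}

/* Example callback: busy items force one unlock-and-retry round. */
static enum walk_status reclaim_cb(struct item *it, struct lru_one *l,
				   void *arg)
{
	long *retries = arg;

	if (it->busy) {
		it->busy = 0;
		(*retries)++;
		pthread_mutex_unlock(&l->lock);
		return WALK_RETRY;
	}
	lru_isolate(l, it);
	return WALK_REMOVED;
}
```

A caller would fill the list with lru_push() and then invoke
walk(&l, reclaim_cb, &retries, budget). Unlike the kernel helper, this
model takes the walk budget by value.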
 drivers/android/binder_alloc.c |  7 +++----
 drivers/android/binder_alloc.h |  2 +-
 fs/dcache.c                    |  4 ++--
 fs/gfs2/quota.c                |  2 +-
 fs/inode.c                     |  4 ++--
 fs/nfs/nfs42xattr.c            |  4 ++--
 fs/nfsd/filecache.c            |  5 +----
 fs/xfs/xfs_buf.c               |  2 --
 fs/xfs/xfs_qm.c                |  5 ++---
 include/linux/list_lru.h       |  2 +-
 mm/list_lru.c                  |  2 +-
 mm/workingset.c                | 15 +++++++--------
 mm/zswap.c                     |  4 ++--
 13 files changed, 25 insertions(+), 33 deletions(-)

diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 86bbe40f4bcd..a738e7745865 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -1047,7 +1047,7 @@ void binder_alloc_vma_close(struct binder_alloc *alloc)
 /**
  * binder_alloc_free_page() - shrinker callback to free pages
  * @item:   item to free
- * @lock:   lock protecting the item
+ * @lru:    list_lru instance of the item
  * @cb_arg: callback argument
  *
  * Called from list_lru_walk() in binder_shrink_scan() to free
@@ -1055,9 +1055,8 @@ void binder_alloc_vma_close(struct binder_alloc *alloc)
  */
 enum lru_status binder_alloc_free_page(struct list_head *item,
 				       struct list_lru_one *lru,
-				       spinlock_t *lock,
 				       void *cb_arg)
-	__must_hold(lock)
+	__must_hold(&lru->lock)
 {
 	struct binder_lru_page *page = container_of(item, typeof(*page), lru);
 	struct binder_alloc *alloc = page->alloc;
@@ -1092,7 +1091,7 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
 
 	list_lru_isolate(lru, item);
 	spin_unlock(&alloc->lock);
-	spin_unlock(lock);
+	spin_unlock(&lru->lock);
 
 	if (vma) {
 		trace_binder_unmap_user_start(alloc, index);
diff --git a/drivers/android/binder_alloc.h b/drivers/android/binder_alloc.h
index 70387234477e..c02c8ebcb466 100644
--- a/drivers/android/binder_alloc.h
+++ b/drivers/android/binder_alloc.h
@@ -118,7 +118,7 @@ static inline void binder_selftest_alloc(struct binder_alloc *alloc) {}
 #endif
 enum lru_status binder_alloc_free_page(struct list_head *item,
 				       struct list_lru_one *lru,
-				       spinlock_t *lock, void *cb_arg);
+				       void *cb_arg);
 struct binder_buffer *binder_alloc_new_buf(struct binder_alloc *alloc,
 					   size_t data_size,
 					   size_t offsets_size,
diff --git a/fs/dcache.c b/fs/dcache.c
index 0f6b16ba30d0..d7f6866f5f52 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1089,7 +1089,7 @@ void shrink_dentry_list(struct list_head *list)
 }
 
 static enum lru_status dentry_lru_isolate(struct list_head *item,
-		struct list_lru_one *lru, spinlock_t *lru_lock, void *arg)
+		struct list_lru_one *lru, void *arg)
 {
 	struct list_head *freeable = arg;
 	struct dentry	*dentry = container_of(item, struct dentry, d_lru);
@@ -1170,7 +1170,7 @@ long prune_dcache_sb(struct super_block *sb, struct shrink_control *sc)
 }
 
 static enum lru_status dentry_lru_isolate_shrink(struct list_head *item,
-		struct list_lru_one *lru, spinlock_t *lru_lock, void *arg)
+		struct list_lru_one *lru, void *arg)
 {
 	struct list_head *freeable = arg;
 	struct dentry	*dentry = container_of(item, struct dentry, d_lru);
diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c
index 2e6bc77f4f81..72b48f6f5561 100644
--- a/fs/gfs2/quota.c
+++ b/fs/gfs2/quota.c
@@ -149,7 +149,7 @@ static void gfs2_qd_list_dispose(struct list_head *list)
 
 
 static enum lru_status gfs2_qd_isolate(struct list_head *item,
-		struct list_lru_one *lru, spinlock_t *lru_lock, void *arg)
+		struct list_lru_one *lru, void *arg)
 {
 	struct list_head *dispose = arg;
 	struct gfs2_quota_data *qd =
diff --git a/fs/inode.c b/fs/inode.c
index 442cb4fc09b2..46fbd5b23482 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -881,7 +881,7 @@ void invalidate_inodes(struct super_block *sb)
  * with this flag set because they are the inodes that are out of order.
  */
 static enum lru_status inode_lru_isolate(struct list_head *item,
-		struct list_lru_one *lru, spinlock_t *lru_lock, void *arg)
+		struct list_lru_one *lru, void *arg)
 {
 	struct list_head *freeable = arg;
 	struct inode	*inode = container_of(item, struct inode, i_lru);
@@ -923,7 +923,7 @@ static enum lru_status inode_lru_isolate(struct list_head *item,
 	if (inode_has_buffers(inode) || !mapping_empty(&inode->i_data)) {
 		inode_pin_lru_isolating(inode);
 		spin_unlock(&inode->i_lock);
-		spin_unlock(lru_lock);
+		spin_unlock(&lru->lock);
 		if (remove_inode_buffers(inode)) {
 			unsigned long reap;
 			reap = invalidate_mapping_pages(&inode->i_data, 0, -1);
diff --git a/fs/nfs/nfs42xattr.c b/fs/nfs/nfs42xattr.c
index b6e3d8f77b91..37d79400e5f4 100644
--- a/fs/nfs/nfs42xattr.c
+++ b/fs/nfs/nfs42xattr.c
@@ -802,7 +802,7 @@ static struct shrinker *nfs4_xattr_large_entry_shrinker;
 
 static enum lru_status
 cache_lru_isolate(struct list_head *item,
-	struct list_lru_one *lru, spinlock_t *lru_lock, void *arg)
+	struct list_lru_one *lru, void *arg)
 {
 	struct list_head *dispose = arg;
 	struct inode *inode;
@@ -867,7 +867,7 @@ nfs4_xattr_cache_count(struct shrinker *shrink, struct shrink_control *sc)
 
 static enum lru_status
 entry_lru_isolate(struct list_head *item,
-	struct list_lru_one *lru, spinlock_t *lru_lock, void *arg)
+	struct list_lru_one *lru, void *arg)
 {
 	struct list_head *dispose = arg;
 	struct nfs4_xattr_bucket *bucket;
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c
index 2e6783f63712..09c444eb944f 100644
--- a/fs/nfsd/filecache.c
+++ b/fs/nfsd/filecache.c
@@ -487,7 +487,6 @@ void nfsd_file_net_dispose(struct nfsd_net *nn)
  * nfsd_file_lru_cb - Examine an entry on the LRU list
  * @item: LRU entry to examine
  * @lru: controlling LRU
- * @lock: LRU list lock (unused)
  * @arg: dispose list
  *
  * Return values:
@@ -497,9 +496,7 @@ void nfsd_file_net_dispose(struct nfsd_net *nn)
  */
 static enum lru_status
 nfsd_file_lru_cb(struct list_head *item, struct list_lru_one *lru,
-		 spinlock_t *lock, void *arg)
-	__releases(lock)
-	__acquires(lock)
+		 void *arg)
 {
 	struct list_head *head = arg;
 	struct nfsd_file *nf = list_entry(item, struct nfsd_file, nf_lru);
diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index aa4dbda7b536..43b914c1f621 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1857,7 +1857,6 @@ static enum lru_status
 xfs_buftarg_drain_rele(
 	struct list_head	*item,
 	struct list_lru_one	*lru,
-	spinlock_t		*lru_lock,
 	void			*arg)
 
 {
@@ -1956,7 +1955,6 @@ static enum lru_status
 xfs_buftarg_isolate(
 	struct list_head	*item,
 	struct list_lru_one	*lru,
-	spinlock_t		*lru_lock,
 	void			*arg)
 {
 	struct xfs_buf		*bp = container_of(item, struct xfs_buf, b_lru);
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 665d26990b78..8413ac368042 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -412,9 +412,8 @@ static enum lru_status
 xfs_qm_dquot_isolate(
 	struct list_head	*item,
 	struct list_lru_one	*lru,
-	spinlock_t		*lru_lock,
 	void			*arg)
-		__releases(lru_lock) __acquires(lru_lock)
+		__releases(&lru->lock) __acquires(&lru->lock)
 {
 	struct xfs_dquot	*dqp = container_of(item,
 						struct xfs_dquot, q_lru);
@@ -460,7 +459,7 @@ xfs_qm_dquot_isolate(
 		trace_xfs_dqreclaim_dirty(dqp);
 
 		/* we have to drop the LRU lock to flush the dquot */
-		spin_unlock(lru_lock);
+		spin_unlock(&lru->lock);
 
 		error = xfs_qm_dqflush(dqp, &bp);
 		if (error)
diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
index 10ba9a54d42c..05c166811f6b 100644
--- a/include/linux/list_lru.h
+++ b/include/linux/list_lru.h
@@ -184,7 +184,7 @@ void list_lru_isolate_move(struct list_lru_one *list, struct list_head *item,
 			   struct list_head *head);
 
 typedef enum lru_status (*list_lru_walk_cb)(struct list_head *item,
-		struct list_lru_one *list, spinlock_t *lock, void *cb_arg);
+		struct list_lru_one *list, void *cb_arg);
 
 /**
  * list_lru_walk_one: walk a @lru, isolating and disposing freeable items.
diff --git a/mm/list_lru.c b/mm/list_lru.c
index c139202e27f7..f93ada6a207b 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -298,7 +298,7 @@ __list_lru_walk_one(struct list_lru *lru, int nid, struct mem_cgroup *memcg,
 			break;
 		--*nr_to_walk;
 
-		ret = isolate(item, l, &l->lock, cb_arg);
+		ret = isolate(item, l, cb_arg);
 		switch (ret) {
 		/*
 		 * LRU_RETRY, LRU_REMOVED_RETRY and LRU_STOP will drop the lru
diff --git a/mm/workingset.c b/mm/workingset.c
index 8c4b6738dcad..4b58ef535a17 100644
--- a/mm/workingset.c
+++ b/mm/workingset.c
@@ -712,8 +712,7 @@ static unsigned long count_shadow_nodes(struct shrinker *shrinker,
 
 static enum lru_status shadow_lru_isolate(struct list_head *item,
 					  struct list_lru_one *lru,
-					  spinlock_t *lru_lock,
-					  void *arg) __must_hold(lru_lock)
+					  void *arg) __must_hold(lru->lock)
 {
 	struct xa_node *node = container_of(item, struct xa_node, private_list);
 	struct address_space *mapping;
@@ -722,20 +721,20 @@ static enum lru_status shadow_lru_isolate(struct list_head *item,
 	/*
 	 * Page cache insertions and deletions synchronously maintain
 	 * the shadow node LRU under the i_pages lock and the
-	 * lru_lock.  Because the page cache tree is emptied before
-	 * the inode can be destroyed, holding the lru_lock pins any
+	 * &lru->lock. Because the page cache tree is emptied before
+	 * the inode can be destroyed, holding the &lru->lock pins any
 	 * address_space that has nodes on the LRU.
 	 *
 	 * We can then safely transition to the i_pages lock to
 	 * pin only the address_space of the particular node we want
-	 * to reclaim, take the node off-LRU, and drop the lru_lock.
+	 * to reclaim, take the node off-LRU, and drop the &lru->lock.
 	 */
 
 	mapping = container_of(node->array, struct address_space, i_pages);
 
 	/* Coming from the list, invert the lock order */
 	if (!xa_trylock(&mapping->i_pages)) {
-		spin_unlock_irq(lru_lock);
+		spin_unlock_irq(&lru->lock);
 		ret = LRU_RETRY;
 		goto out;
 	}
@@ -744,7 +743,7 @@ static enum lru_status shadow_lru_isolate(struct list_head *item,
 	if (mapping->host != NULL) {
 		if (!spin_trylock(&mapping->host->i_lock)) {
 			xa_unlock(&mapping->i_pages);
-			spin_unlock_irq(lru_lock);
+			spin_unlock_irq(&lru->lock);
 			ret = LRU_RETRY;
 			goto out;
 		}
@@ -753,7 +752,7 @@ static enum lru_status shadow_lru_isolate(struct list_head *item,
 	list_lru_isolate(lru, item);
 	__dec_node_page_state(virt_to_page(node), WORKINGSET_NODES);
 
-	spin_unlock(lru_lock);
+	spin_unlock(&lru->lock);
 
 	/*
 	 * The nodes should only contain one or more shadow entries,
diff --git a/mm/zswap.c b/mm/zswap.c
index 63bcd94dc2cb..0e29a0b0db71 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1095,7 +1095,7 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
  *    for reclaim by this ratio.
  */
 static enum lru_status shrink_memcg_cb(struct list_head *item, struct list_lru_one *l,
-				       spinlock_t *lock, void *arg)
+				       void *arg)
 {
 	struct zswap_entry *entry = container_of(item, struct zswap_entry, lru);
 	bool *encountered_page_in_swapcache = (bool *)arg;
@@ -1151,7 +1151,7 @@ static enum lru_status shrink_memcg_cb(struct list_head *item, struct list_lru_o
 	 * It's safe to drop the lock here because we return either
 	 * LRU_REMOVED_RETRY or LRU_RETRY.
 	 */
-	spin_unlock(lock);
+	spin_unlock(&l->lock);
 
 	writeback_result = zswap_writeback_entry(entry, swpentry);
 
-- 
2.47.0




Thread overview: 7+ messages
2024-11-04 17:52 [PATCH v3 0/6] mm/list_lru: Split list_lru lock into per-cgroup scope Kairui Song
2024-11-04 17:52 ` [PATCH v3 1/6] mm/list_lru: don't pass unnecessary key parameters Kairui Song
2024-11-04 17:52 ` [PATCH v3 2/6] mm/list_lru: don't export list_lru_add Kairui Song
2024-11-04 17:52 ` [PATCH v3 3/6] mm/list_lru: code clean up for reparenting Kairui Song
2024-11-04 17:52 ` [PATCH v3 4/6] mm/list_lru: simplify reparenting and initial allocation Kairui Song
2024-11-04 17:52 ` [PATCH v3 5/6] mm/list_lru: split the lock to per-cgroup scope Kairui Song
2024-11-04 17:52 ` [PATCH v3 6/6] mm/list_lru: Simplify the list_lru walk callback function Kairui Song
