* [PATCH mm-unstable v4 0/5] Improve mem_cgroup_iter()
@ 2024-09-05 0:30 Kinsey Ho
2024-09-05 0:30 ` [PATCH mm-unstable v4 1/5] cgroup: clarify css sibling linkage is protected by cgroup_mutex or RCU Kinsey Ho
` (4 more replies)
0 siblings, 5 replies; 6+ messages in thread
From: Kinsey Ho @ 2024-09-05 0:30 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, cgroups, Yosry Ahmed, Roman Gushchin,
Johannes Weiner, Michal Hocko, Shakeel Butt, Muchun Song,
Tejun Heo, Zefan Li, mkoutny, T . J . Mercier, Hugh Dickins,
Kinsey Ho
Incremental cgroup iteration is being used again [1]. This patchset
improves the reliability of mem_cgroup_iter(). It also improves
simplicity and code readability.
[1] https://lore.kernel.org/20240514202641.2821494-1-hannes@cmpxchg.org/
---
v4: Fixed memcg restart bug reported by hughd in v3, patch 4/5
v3: https://lore.kernel.org/20240827230753.2073580-1-kinseyho@google.com/
Removed __rcu tag from patch 2/5 which removes the need for
rcu_dereference(). This helps readability.
v2: https://lore.kernel.org/20240813204716.842811-1-kinseyho@google.com/
Add patch to clarify css sibling linkage is RCU protected. The
kernel build bot RCU sparse error from v1 has been ignored.
v1: https://lore.kernel.org/20240724190214.1108049-1-kinseyho@google.com/
Kinsey Ho (5):
cgroup: clarify css sibling linkage is protected by cgroup_mutex or
RCU
mm: don't hold css->refcnt during traversal
mm: increment gen # before restarting traversal
mm: restart if multiple traversals raced
mm: clean up mem_cgroup_iter()
include/linux/cgroup-defs.h | 6 ++-
include/linux/memcontrol.h | 4 +-
kernel/cgroup/cgroup.c | 16 +++----
mm/memcontrol.c | 84 ++++++++++++++++---------------------
4 files changed, 51 insertions(+), 59 deletions(-)
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH mm-unstable v4 1/5] cgroup: clarify css sibling linkage is protected by cgroup_mutex or RCU
2024-09-05 0:30 [PATCH mm-unstable v4 0/5] Improve mem_cgroup_iter() Kinsey Ho
@ 2024-09-05 0:30 ` Kinsey Ho
2024-09-05 0:30 ` [PATCH mm-unstable v4 2/5] mm: don't hold css->refcnt during traversal Kinsey Ho
` (3 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Kinsey Ho @ 2024-09-05 0:30 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, cgroups, Yosry Ahmed, Roman Gushchin,
Johannes Weiner, Michal Hocko, Shakeel Butt, Muchun Song,
Tejun Heo, Zefan Li, mkoutny, T . J . Mercier, Hugh Dickins,
Kinsey Ho
Explicitly document that css sibling/descendant linkage is protected by
cgroup_mutex or RCU. Also, document in css_next_descendant_pre() and
similar functions that it isn't necessary to hold a ref on @pos.
The following changes in this patchset rely on this clarification
for simplification in memcg iteration code.
Suggested-by: Yosry Ahmed <yosryahmed@google.com>
Reviewed-by: Michal Koutný <mkoutny@suse.com>
Signed-off-by: Kinsey Ho <kinseyho@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Zefan Li <lizefan.x@bytedance.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: T.J. Mercier <tjmercier@google.com>
---
include/linux/cgroup-defs.h | 6 +++++-
kernel/cgroup/cgroup.c | 16 +++++++++-------
2 files changed, 14 insertions(+), 8 deletions(-)
diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index 7fc2d0195f56..ca7e912b8355 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -172,7 +172,11 @@ struct cgroup_subsys_state {
/* reference count - access via css_[try]get() and css_put() */
struct percpu_ref refcnt;
- /* siblings list anchored at the parent's ->children */
+ /*
+ * siblings list anchored at the parent's ->children
+ *
+ * linkage is protected by cgroup_mutex or RCU
+ */
struct list_head sibling;
struct list_head children;
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 0a97cb2ef124..ece2316e2bca 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -4602,8 +4602,9 @@ struct cgroup_subsys_state *css_next_child(struct cgroup_subsys_state *pos,
*
* While this function requires cgroup_mutex or RCU read locking, it
* doesn't require the whole traversal to be contained in a single critical
- * section. This function will return the correct next descendant as long
- * as both @pos and @root are accessible and @pos is a descendant of @root.
+ * section. Additionally, it isn't necessary to hold onto a reference to @pos.
+ * This function will return the correct next descendant as long as both @pos
+ * and @root are accessible and @pos is a descendant of @root.
*
* If a subsystem synchronizes ->css_online() and the start of iteration, a
* css which finished ->css_online() is guaranteed to be visible in the
@@ -4651,8 +4652,9 @@ EXPORT_SYMBOL_GPL(css_next_descendant_pre);
*
* While this function requires cgroup_mutex or RCU read locking, it
* doesn't require the whole traversal to be contained in a single critical
- * section. This function will return the correct rightmost descendant as
- * long as @pos is accessible.
+ * section. Additionally, it isn't necessary to hold onto a reference to @pos.
+ * This function will return the correct rightmost descendant as long as @pos
+ * is accessible.
*/
struct cgroup_subsys_state *
css_rightmost_descendant(struct cgroup_subsys_state *pos)
@@ -4696,9 +4698,9 @@ css_leftmost_descendant(struct cgroup_subsys_state *pos)
*
* While this function requires cgroup_mutex or RCU read locking, it
* doesn't require the whole traversal to be contained in a single critical
- * section. This function will return the correct next descendant as long
- * as both @pos and @cgroup are accessible and @pos is a descendant of
- * @cgroup.
+ * section. Additionally, it isn't necessary to hold onto a reference to @pos.
+ * This function will return the correct next descendant as long as both @pos
+ * and @cgroup are accessible and @pos is a descendant of @cgroup.
*
* If a subsystem synchronizes ->css_online() and the start of iteration, a
* css which finished ->css_online() is guaranteed to be visible in the
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH mm-unstable v4 2/5] mm: don't hold css->refcnt during traversal
2024-09-05 0:30 [PATCH mm-unstable v4 0/5] Improve mem_cgroup_iter() Kinsey Ho
2024-09-05 0:30 ` [PATCH mm-unstable v4 1/5] cgroup: clarify css sibling linkage is protected by cgroup_mutex or RCU Kinsey Ho
@ 2024-09-05 0:30 ` Kinsey Ho
2024-09-05 0:30 ` [PATCH mm-unstable v4 3/5] mm: increment gen # before restarting traversal Kinsey Ho
` (2 subsequent siblings)
4 siblings, 0 replies; 6+ messages in thread
From: Kinsey Ho @ 2024-09-05 0:30 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, cgroups, Yosry Ahmed, Roman Gushchin,
Johannes Weiner, Michal Hocko, Shakeel Butt, Muchun Song,
Tejun Heo, Zefan Li, mkoutny, T . J . Mercier, Hugh Dickins,
Kinsey Ho
To obtain the pointer to the next memcg position, mem_cgroup_iter()
currently holds css->refcnt during memcg traversal only to put
css->refcnt at the end of the routine. This isn't necessary as an
rcu_read_lock is already held throughout the function. The use of
the RCU read lock with css_next_descendant_pre() guarantees that
sibling linkage is safe without holding a ref on the passed-in @css.
Remove css->refcnt usage during traversal by leveraging RCU.
Signed-off-by: Kinsey Ho <kinseyho@google.com>
Reviewed-by: T.J. Mercier <tjmercier@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Cc: Hugh Dickins <hughd@google.com>
---
mm/memcontrol.c | 18 +-----------------
1 file changed, 1 insertion(+), 17 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index dbc1d43a5c4c..cdd324812b55 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1015,20 +1015,7 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
else if (reclaim->generation != iter->generation)
goto out_unlock;
- while (1) {
- pos = READ_ONCE(iter->position);
- if (!pos || css_tryget(&pos->css))
- break;
- /*
- * css reference reached zero, so iter->position will
- * be cleared by ->css_released. However, we should not
- * rely on this happening soon, because ->css_released
- * is called from a work queue, and by busy-waiting we
- * might block it. So we clear iter->position right
- * away.
- */
- (void)cmpxchg(&iter->position, pos, NULL);
- }
+ pos = READ_ONCE(iter->position);
} else if (prev) {
pos = prev;
}
@@ -1069,9 +1056,6 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
*/
(void)cmpxchg(&iter->position, pos, memcg);
- if (pos)
- css_put(&pos->css);
-
if (!memcg)
iter->generation++;
}
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH mm-unstable v4 3/5] mm: increment gen # before restarting traversal
2024-09-05 0:30 [PATCH mm-unstable v4 0/5] Improve mem_cgroup_iter() Kinsey Ho
2024-09-05 0:30 ` [PATCH mm-unstable v4 1/5] cgroup: clarify css sibling linkage is protected by cgroup_mutex or RCU Kinsey Ho
2024-09-05 0:30 ` [PATCH mm-unstable v4 2/5] mm: don't hold css->refcnt during traversal Kinsey Ho
@ 2024-09-05 0:30 ` Kinsey Ho
2024-09-05 0:30 ` [PATCH mm-unstable v4 4/5] mm: restart if multiple traversals raced Kinsey Ho
2024-09-05 0:30 ` [PATCH mm-unstable v4 5/5] mm: clean up mem_cgroup_iter() Kinsey Ho
4 siblings, 0 replies; 6+ messages in thread
From: Kinsey Ho @ 2024-09-05 0:30 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, cgroups, Yosry Ahmed, Roman Gushchin,
Johannes Weiner, Michal Hocko, Shakeel Butt, Muchun Song,
Tejun Heo, Zefan Li, mkoutny, T . J . Mercier, Hugh Dickins,
Kinsey Ho
The generation number in struct mem_cgroup_reclaim_iter should be
incremented on every round-trip. Currently, it is possible for a
concurrent reclaimer to jump in at the end of the hierarchy, causing a
traversal restart (resetting the iteration position) without
incrementing the generation number.
By resetting the position without incrementing the generation, it's
possible for another ongoing mem_cgroup_iter() thread to walk the tree
twice.
Move the traversal restart such that the generation number is
incremented before the restart.
Signed-off-by: Kinsey Ho <kinseyho@google.com>
Reviewed-by: T.J. Mercier <tjmercier@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Cc: Hugh Dickins <hughd@google.com>
---
mm/memcontrol.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index cdd324812b55..c24ef6a106e0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -999,7 +999,7 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
root = root_mem_cgroup;
rcu_read_lock();
-
+restart:
if (reclaim) {
struct mem_cgroup_per_node *mz;
@@ -1026,14 +1026,6 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
for (;;) {
css = css_next_descendant_pre(css, &root->css);
if (!css) {
- /*
- * Reclaimers share the hierarchy walk, and a
- * new one might jump in right at the end of
- * the hierarchy - make sure they see at least
- * one group and restart from the beginning.
- */
- if (!prev)
- continue;
break;
}
@@ -1056,8 +1048,18 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
*/
(void)cmpxchg(&iter->position, pos, memcg);
- if (!memcg)
+ if (!memcg) {
iter->generation++;
+
+ /*
+ * Reclaimers share the hierarchy walk, and a
+ * new one might jump in right at the end of
+ * the hierarchy - make sure they see at least
+ * one group and restart from the beginning.
+ */
+ if (!prev)
+ goto restart;
+ }
}
out_unlock:
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH mm-unstable v4 4/5] mm: restart if multiple traversals raced
2024-09-05 0:30 [PATCH mm-unstable v4 0/5] Improve mem_cgroup_iter() Kinsey Ho
` (2 preceding siblings ...)
2024-09-05 0:30 ` [PATCH mm-unstable v4 3/5] mm: increment gen # before restarting traversal Kinsey Ho
@ 2024-09-05 0:30 ` Kinsey Ho
2024-09-05 0:30 ` [PATCH mm-unstable v4 5/5] mm: clean up mem_cgroup_iter() Kinsey Ho
4 siblings, 0 replies; 6+ messages in thread
From: Kinsey Ho @ 2024-09-05 0:30 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, cgroups, Yosry Ahmed, Roman Gushchin,
Johannes Weiner, Michal Hocko, Shakeel Butt, Muchun Song,
Tejun Heo, Zefan Li, mkoutny, T . J . Mercier, Hugh Dickins,
Kinsey Ho, syzbot+e099d407346c45275ce9
Currently, if multiple reclaimers raced on the same position, the
reclaimers which detect the race will still reclaim from the same memcg.
Instead, the reclaimers which detect the race should move on to the next
memcg in the hierarchy.
So, in the case where multiple traversals race, jump back to the start
of the mem_cgroup_iter() function to find the next memcg in the
hierarchy to reclaim from.
Reported-by: syzbot+e099d407346c45275ce9@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/000000000000817cf10620e20d33@google.com/
Signed-off-by: Kinsey Ho <kinseyho@google.com>
Reviewed-by: T.J. Mercier <tjmercier@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Cc: Hugh Dickins <hughd@google.com>
---
include/linux/memcontrol.h | 4 ++--
mm/memcontrol.c | 26 +++++++++++++++++---------
2 files changed, 19 insertions(+), 11 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index fe05fdb92779..2ef94c74847d 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -57,7 +57,7 @@ enum memcg_memory_event {
struct mem_cgroup_reclaim_cookie {
pg_data_t *pgdat;
- unsigned int generation;
+ int generation;
};
#ifdef CONFIG_MEMCG
@@ -78,7 +78,7 @@ struct lruvec_stats;
struct mem_cgroup_reclaim_iter {
struct mem_cgroup *position;
/* scan generation, increased every round-trip */
- unsigned int generation;
+ atomic_t generation;
};
/*
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c24ef6a106e0..ce5d066393c9 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -988,8 +988,8 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
struct mem_cgroup_reclaim_cookie *reclaim)
{
struct mem_cgroup_reclaim_iter *iter;
- struct cgroup_subsys_state *css = NULL;
- struct mem_cgroup *memcg = NULL;
+ struct cgroup_subsys_state *css;
+ struct mem_cgroup *memcg;
struct mem_cgroup *pos = NULL;
if (mem_cgroup_disabled())
@@ -1000,19 +1000,23 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
rcu_read_lock();
restart:
+ memcg = NULL;
+
if (reclaim) {
+ int gen;
struct mem_cgroup_per_node *mz;
mz = root->nodeinfo[reclaim->pgdat->node_id];
iter = &mz->iter;
+ gen = atomic_read(&iter->generation);
/*
* On start, join the current reclaim iteration cycle.
* Exit when a concurrent walker completes it.
*/
if (!prev)
- reclaim->generation = iter->generation;
- else if (reclaim->generation != iter->generation)
+ reclaim->generation = gen;
+ else if (reclaim->generation != gen)
goto out_unlock;
pos = READ_ONCE(iter->position);
@@ -1020,8 +1024,7 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
pos = prev;
}
- if (pos)
- css = &pos->css;
+ css = pos ? &pos->css : NULL;
for (;;) {
css = css_next_descendant_pre(css, &root->css);
@@ -1035,21 +1038,26 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
* and kicking, and don't take an extra reference.
*/
if (css == &root->css || css_tryget(css)) {
- memcg = mem_cgroup_from_css(css);
break;
}
}
+ memcg = mem_cgroup_from_css(css);
+
if (reclaim) {
/*
* The position could have already been updated by a competing
* thread, so check that the value hasn't changed since we read
* it to avoid reclaiming from the same cgroup twice.
*/
- (void)cmpxchg(&iter->position, pos, memcg);
+ if (cmpxchg(&iter->position, pos, memcg) != pos) {
+ if (css && css != &root->css)
+ css_put(css);
+ goto restart;
+ }
if (!memcg) {
- iter->generation++;
+ atomic_inc(&iter->generation);
/*
* Reclaimers share the hierarchy walk, and a
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply [flat|nested] 6+ messages in thread
* [PATCH mm-unstable v4 5/5] mm: clean up mem_cgroup_iter()
2024-09-05 0:30 [PATCH mm-unstable v4 0/5] Improve mem_cgroup_iter() Kinsey Ho
` (3 preceding siblings ...)
2024-09-05 0:30 ` [PATCH mm-unstable v4 4/5] mm: restart if multiple traversals raced Kinsey Ho
@ 2024-09-05 0:30 ` Kinsey Ho
4 siblings, 0 replies; 6+ messages in thread
From: Kinsey Ho @ 2024-09-05 0:30 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, cgroups, Yosry Ahmed, Roman Gushchin,
Johannes Weiner, Michal Hocko, Shakeel Butt, Muchun Song,
Tejun Heo, Zefan Li, mkoutny, T . J . Mercier, Hugh Dickins,
Kinsey Ho
A clean up to make variable names more clear and to improve code
readability.
No functional change.
Signed-off-by: Kinsey Ho <kinseyho@google.com>
Reviewed-by: T.J. Mercier <tjmercier@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Michal Koutný <mkoutny@suse.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Roman Gushchin <roman.gushchin@linux.dev>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Tejun Heo <tj@kernel.org>
Cc: Yosry Ahmed <yosryahmed@google.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Cc: Hugh Dickins <hughd@google.com>
---
mm/memcontrol.c | 32 ++++++++++++--------------------
1 file changed, 12 insertions(+), 20 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index ce5d066393c9..31db8467f19d 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -989,8 +989,8 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
{
struct mem_cgroup_reclaim_iter *iter;
struct cgroup_subsys_state *css;
- struct mem_cgroup *memcg;
- struct mem_cgroup *pos = NULL;
+ struct mem_cgroup *pos;
+ struct mem_cgroup *next;
if (mem_cgroup_disabled())
return NULL;
@@ -1000,14 +1000,13 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
rcu_read_lock();
restart:
- memcg = NULL;
+ next = NULL;
if (reclaim) {
int gen;
- struct mem_cgroup_per_node *mz;
+ int nid = reclaim->pgdat->node_id;
- mz = root->nodeinfo[reclaim->pgdat->node_id];
- iter = &mz->iter;
+ iter = &root->nodeinfo[nid]->iter;
gen = atomic_read(&iter->generation);
/*
@@ -1020,29 +1019,22 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
goto out_unlock;
pos = READ_ONCE(iter->position);
- } else if (prev) {
+ } else
pos = prev;
- }
css = pos ? &pos->css : NULL;
- for (;;) {
- css = css_next_descendant_pre(css, &root->css);
- if (!css) {
- break;
- }
-
+ while ((css = css_next_descendant_pre(css, &root->css))) {
/*
* Verify the css and acquire a reference. The root
* is provided by the caller, so we know it's alive
* and kicking, and don't take an extra reference.
*/
- if (css == &root->css || css_tryget(css)) {
+ if (css == &root->css || css_tryget(css))
break;
- }
}
- memcg = mem_cgroup_from_css(css);
+ next = mem_cgroup_from_css(css);
if (reclaim) {
/*
@@ -1050,13 +1042,13 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
* thread, so check that the value hasn't changed since we read
* it to avoid reclaiming from the same cgroup twice.
*/
- if (cmpxchg(&iter->position, pos, memcg) != pos) {
+ if (cmpxchg(&iter->position, pos, next) != pos) {
if (css && css != &root->css)
css_put(css);
goto restart;
}
- if (!memcg) {
+ if (!next) {
atomic_inc(&iter->generation);
/*
@@ -1075,7 +1067,7 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
if (prev && prev != root)
css_put(&prev->css);
- return memcg;
+ return next;
}
/**
--
2.46.0.469.g59c65b2a67-goog
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2024-09-05 0:32 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-09-05 0:30 [PATCH mm-unstable v4 0/5] Improve mem_cgroup_iter() Kinsey Ho
2024-09-05 0:30 ` [PATCH mm-unstable v4 1/5] cgroup: clarify css sibling linkage is protected by cgroup_mutex or RCU Kinsey Ho
2024-09-05 0:30 ` [PATCH mm-unstable v4 2/5] mm: don't hold css->refcnt during traversal Kinsey Ho
2024-09-05 0:30 ` [PATCH mm-unstable v4 3/5] mm: increment gen # before restarting traversal Kinsey Ho
2024-09-05 0:30 ` [PATCH mm-unstable v4 4/5] mm: restart if multiple traversals raced Kinsey Ho
2024-09-05 0:30 ` [PATCH mm-unstable v4 5/5] mm: clean up mem_cgroup_iter() Kinsey Ho
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox