From: Waiman Long <longman@redhat.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Muchun Song <songmuchun@bytedance.com>,
Roman Gushchin <roman.gushchin@linux.dev>,
Waiman Long <longman@redhat.com>
Subject: [PATCH v2] mm/list_lru: Fix possible race in memcg_reparent_list_lru_node()
Date: Wed, 30 Mar 2022 13:26:46 -0400 [thread overview]
Message-ID: <20220330172646.2687555-1-longman@redhat.com> (raw)
Muchun Song found out there could be a race between list_lru_add()
and memcg_reparent_list_lru_node() causing the later function to miss
reparenting of a lru entry as shown below:
CPU0: CPU1:
list_lru_add()
spin_lock(&nlru->lock)
l = list_lru_from_kmem(memcg)
memcg_reparent_objcgs(memcg)
memcg_reparent_list_lrus(memcg)
memcg_reparent_list_lru()
memcg_reparent_list_lru_node()
if (!READ_ONCE(nlru->nr_items))
// Miss reparenting
return
// Assume 0->1
l->nr_items++
// Assume 0->1
nlru->nr_items++
Though it is not likely that a list_lru_node that has 0 item suddenly
has a newly added lru entry at the end of its life. The race is still
theoretically possible.
With the lock/unlock pair used within the percpu_ref_kill() which is
the last function call of memcg_reparent_objcgs(), any read issued
in memcg_reparent_list_lru_node() will not be reordered before the
reparenting of objcgs.
Adding a !spin_is_locked()/smp_rmb()/!READ_ONCE(nlru->nr_items) check
to ensure that either the reading of nr_items is valid or the racing
list_lru_add() will see the reparented objcg.
Fixes: 405cc51fc104 ("mm/list_lru: optimize memcg_reparent_list_lru_node()")
Reported-by: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Waiman Long <longman@redhat.com>
---
mm/list_lru.c | 31 +++++++++++++++++++++++++++----
1 file changed, 27 insertions(+), 4 deletions(-)
diff --git a/mm/list_lru.c b/mm/list_lru.c
index c669d87001a6..08ff54ffabd6 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -395,10 +395,33 @@ static void memcg_reparent_list_lru_node(struct list_lru *lru, int nid,
struct list_lru_one *src, *dst;
/*
- * If there is no lru entry in this nlru, we can skip it immediately.
+ * With the lock/unlock pair used within the percpu_ref_kill()
+ * which is the last function call of memcg_reparent_objcgs(), any
+ * read issued here will not be reordered before the reparenting
+ * of objcgs.
+ *
+ * Assuming a racing list_lru_add():
+ * list_lru_add()
+ * <- memcg_reparent_list_lru_node()
+ * spin_lock(&nlru->lock)
+ * l = list_lru_from_kmem(memcg)
+ * nlru->nr_items++
+ * spin_unlock(&nlru->lock)
+ * <- memcg_reparent_list_lru_node()
+ *
+ * The !spin_is_locked(&nlru->lock) check is true means it is
+ * either before the spin_lock() or after the spin_unlock(). In the
+ * former case, list_lru_add() will see the reparented objcg and so
+ * won't touch the lru to be reparented. In the later case, it will
+ * see the updated nr_items. So we can use the optimization that if
+ * there is no lru entry in this nlru, skip it immediately.
*/
- if (!READ_ONCE(nlru->nr_items))
- return;
+ if (!spin_is_locked(&nlru->lock)) {
+ /* nr_items read must be ordered after nlru->lock */
+ smp_rmb();
+ if (!READ_ONCE(nlru->nr_items))
+ return;
+ }
/*
* Since list_lru_{add,del} may be called under an IRQ-safe lock,
@@ -407,7 +430,7 @@ static void memcg_reparent_list_lru_node(struct list_lru *lru, int nid,
spin_lock_irq(&nlru->lock);
src = list_lru_from_memcg_idx(lru, nid, src_idx);
- if (!src)
+ if (!src || !src->nr_items)
goto out;
dst = list_lru_from_memcg_idx(lru, nid, dst_idx);
--
2.27.0
next reply other threads:[~2022-03-30 17:27 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-30 17:26 Waiman Long [this message]
2022-03-30 19:46 ` Roman Gushchin
2022-03-31 2:14 ` Andrew Morton
2022-03-31 2:48 ` Roman Gushchin
2022-03-31 6:39 ` Shakeel Butt
2022-03-31 7:46 ` Michal Hocko
2022-04-01 1:11 ` Andrew Morton
2022-04-01 4:23 ` Shakeel Butt
2022-04-01 4:30 ` Muchun Song
2022-04-01 11:28 ` Michal Hocko
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20220330172646.2687555-1-longman@redhat.com \
--to=longman@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=roman.gushchin@linux.dev \
--cc=songmuchun@bytedance.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox