linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Qi Zheng <qi.zheng@linux.dev>
To: Harry Yoo <harry.yoo@oracle.com>
Cc: hannes@cmpxchg.org, hughd@google.com, mhocko@suse.com,
	roman.gushchin@linux.dev, shakeel.butt@linux.dev,
	muchun.song@linux.dev, david@kernel.org,
	lorenzo.stoakes@oracle.com, ziy@nvidia.com,
	yosry.ahmed@linux.dev, imran.f.khan@oracle.com,
	kamalesh.babulal@oracle.com, axelrasmussen@google.com,
	yuanchu@google.com, weixugc@google.com,
	chenridong@huaweicloud.com, mkoutny@suse.com,
	akpm@linux-foundation.org, hamzamahfooz@linux.microsoft.com,
	apais@linux.microsoft.com, lance.yang@linux.dev,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	cgroups@vger.kernel.org, Muchun Song <songmuchun@bytedance.com>,
	Qi Zheng <zhengqi.arch@bytedance.com>
Subject: Re: [PATCH v3 24/30] mm: memcontrol: prepare for reparenting LRU pages for lruvec lock
Date: Tue, 20 Jan 2026 19:51:29 +0800	[thread overview]
Message-ID: <88d90d30-8f54-43f5-98d6-1769aa05a10a@linux.dev> (raw)
In-Reply-To: <aW86_5SOdtQQnVr7@hyeyoo>



On 1/20/26 4:21 PM, Harry Yoo wrote:
> On Wed, Jan 14, 2026 at 07:32:51PM +0800, Qi Zheng wrote:
>> From: Muchun Song <songmuchun@bytedance.com>
>>
>> The following diagram illustrates how to ensure the safety of the folio
>> lruvec lock when LRU folios undergo reparenting.
>>
>> In the folio_lruvec_lock(folio) function:
>> ```
>>      rcu_read_lock();
>> retry:
>>      lruvec = folio_lruvec(folio);
>>      /* There is a possibility of folio reparenting at this point. */
>>      spin_lock(&lruvec->lru_lock);
>>      if (unlikely(lruvec_memcg(lruvec) != folio_memcg(folio))) {
>>          /*
>>           * The wrong lruvec lock was acquired, and a retry is required.
>>           * This is because the folio resides on the parent memcg lruvec
>>           * list.
>>           */
>>          spin_unlock(&lruvec->lru_lock);
>>          goto retry;
>>      }
>>
>>      /* Reaching here indicates that folio_memcg() is stable. */
>> ```
>>
>> In the memcg_reparent_objcgs(memcg) function:
>> ```
>>      spin_lock(&lruvec->lru_lock);
>>      spin_lock(&lruvec_parent->lru_lock);
>>      /* Transfer folios from the lruvec list to the parent's. */
>>      spin_unlock(&lruvec_parent->lru_lock);
>>      spin_unlock(&lruvec->lru_lock);
>> ```
>>
>> After acquiring the lruvec lock, it is necessary to verify whether
>> the folio has been reparented. If reparenting has occurred, the new
>> lruvec lock must be reacquired. During the LRU folio reparenting
>> process, the lruvec lock will also be acquired (this will be
>> implemented in a subsequent patch). Therefore, folio_memcg() remains
>> unchanged while the lruvec lock is held.
>>
>> Given that lruvec_memcg(lruvec) is always equal to folio_memcg(folio)
>> after the lruvec lock is acquired, the lruvec_memcg_debug() check is
>> redundant. Hence, it is removed.
>>
>> This patch serves as a preparation for the reparenting of LRU folios.
>>
>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>> Acked-by: Johannes Weiner <hannes@cmpxchg.org>
>> ---
>>   include/linux/memcontrol.h | 45 +++++++++++++++++++----------
>>   include/linux/swap.h       |  1 +
>>   mm/compaction.c            | 29 +++++++++++++++----
>>   mm/memcontrol.c            | 59 +++++++++++++++++++++-----------------
>>   mm/swap.c                  |  4 +++
>>   5 files changed, 91 insertions(+), 47 deletions(-)
>>
>> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
>> index 4b6f20dc694ba..26c3c0e375f58 100644
>> --- a/include/linux/memcontrol.h
>> +++ b/include/linux/memcontrol.h
>> @@ -742,7 +742,15 @@ static inline struct lruvec *mem_cgroup_lruvec(struct mem_cgroup *memcg,
>>    * folio_lruvec - return lruvec for isolating/putting an LRU folio
>>    * @folio: Pointer to the folio.
>>    *
>> - * This function relies on folio->mem_cgroup being stable.
>> + * Call with rcu_read_lock() held to ensure the lifetime of the returned lruvec.
>> + * Note that this alone will NOT guarantee the stability of the folio->lruvec
>> + * association; the folio can be reparented to an ancestor if this races with
>> + * cgroup deletion.
>> + *
>> + * Use folio_lruvec_lock() to ensure both lifetime and stability of the binding.
>> + * Once a lruvec is locked, folio_lruvec() can be called on other folios, and
>> + * their binding is stable if the returned lruvec matches the one the caller has
>> + * locked. Useful for lock batching.
>>    */
>>   static inline struct lruvec *folio_lruvec(struct folio *folio)
>>   {
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 548e67dbf2386..a1573600d4188 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> diff --git a/mm/swap.c b/mm/swap.c
>> index cb1148a92d8ec..7e53479ca1732 100644
>> --- a/mm/swap.c
>> +++ b/mm/swap.c
>> @@ -284,9 +286,11 @@ void lru_note_cost_unlock_irq(struct lruvec *lruvec, bool file,
>>   		}
>>   
>>   		spin_unlock_irq(&lruvec->lru_lock);
>> +		rcu_read_unlock();
>>   		lruvec = parent_lruvec(lruvec);
> 
> It looks bit weird to call parent_lruvec(lruvec) outside RCU read lock
> because the reason why it holds RCU read lock is to prevent release of
> memory cgroup and its lruvec.
> 
> I guess this isn't broken (for now) because all callers of
> lru_note_cost_unlock_irq() are holding a reference to the memcg?

I checked all the callers again, and they do indeed hold the refcnt
for the memcg, so it's safe for now. But it seems rather fragile,
perhaps we should also include parent_lruvec() within the RCU lock.

> 
>>   		if (!lruvec)
>>   			break;
>> +		rcu_read_lock();
>>   		spin_lock_irq(&lruvec->lru_lock);
>>   	}
>>   }
> 



  reply	other threads:[~2026-01-20 11:51 UTC|newest]

Thread overview: 101+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-01-14 11:26 [PATCH v3 00/30] Eliminate Dying Memory Cgroup Qi Zheng
2026-01-14 11:26 ` [PATCH v3 01/30] mm: memcontrol: remove dead code of checking parent memory cgroup Qi Zheng
2026-01-14 11:26 ` [PATCH v3 02/30] mm: workingset: use folio_lruvec() in workingset_refault() Qi Zheng
2026-01-14 11:26 ` [PATCH v3 03/30] mm: rename unlock_page_lruvec_irq and its variants Qi Zheng
2026-01-14 11:26 ` [PATCH v3 04/30] mm: vmscan: prepare for the refactoring the move_folios_to_lru() Qi Zheng
2026-01-16  9:10   ` Harry Yoo
2026-01-16  9:14   ` Muchun Song
2026-01-14 11:26 ` [PATCH v3 05/30] mm: vmscan: refactor move_folios_to_lru() Qi Zheng
2026-01-16 11:31   ` Harry Yoo
2026-01-14 11:26 ` [PATCH v3 06/30] mm: memcontrol: allocate object cgroup for non-kmem case Qi Zheng
2026-01-14 11:32 ` [PATCH v3 07/30] mm: memcontrol: return root object cgroup for root memory cgroup Qi Zheng
2026-01-16 12:53   ` Harry Yoo
2026-01-14 11:32 ` [PATCH v3 08/30] mm: memcontrol: prevent memory cgroup release in get_mem_cgroup_from_folio() Qi Zheng
2026-01-17 20:00   ` Shakeel Butt
2026-01-18  0:31   ` Shakeel Butt
2026-01-19  3:20     ` Qi Zheng
2026-01-19  8:53     ` Harry Yoo
2026-01-14 11:32 ` [PATCH v3 09/30] buffer: prevent memory cgroup release in folio_alloc_buffers() Qi Zheng
2026-01-14 11:32 ` [PATCH v3 10/30] writeback: prevent memory cgroup release in writeback module Qi Zheng
2026-01-14 11:32 ` [PATCH v3 11/30] mm: memcontrol: prevent memory cgroup release in count_memcg_folio_events() Qi Zheng
2026-01-14 11:32 ` [PATCH v3 12/30] mm: page_io: prevent memory cgroup release in page_io module Qi Zheng
2026-01-14 11:32 ` [PATCH v3 13/30] mm: migrate: prevent memory cgroup release in folio_migrate_mapping() Qi Zheng
2026-01-14 11:32 ` [PATCH v3 14/30] mm: mglru: prevent memory cgroup release in mglru Qi Zheng
2026-01-17 22:46   ` Shakeel Butt
2026-01-19  9:25   ` Harry Yoo
2026-01-14 11:32 ` [PATCH v3 15/30] mm: memcontrol: prevent memory cgroup release in mem_cgroup_swap_full() Qi Zheng
2026-01-14 11:32 ` [PATCH v3 16/30] mm: workingset: prevent memory cgroup release in lru_gen_eviction() Qi Zheng
2026-01-14 11:32 ` [PATCH v3 17/30] mm: thp: prevent memory cgroup release in folio_split_queue_lock{_irqsave}() Qi Zheng
2026-01-16  9:15   ` Muchun Song
2026-01-14 11:32 ` [PATCH v3 18/30] mm: zswap: prevent memory cgroup release in zswap_compress() Qi Zheng
2026-01-16  9:18   ` Muchun Song
2026-01-20  7:47   ` Harry Yoo
2026-01-14 11:32 ` [PATCH v3 19/30] mm: workingset: prevent lruvec release in workingset_refault() Qi Zheng
2026-01-17 23:02   ` Shakeel Butt
2026-01-14 11:32 ` [PATCH v3 20/30] mm: zswap: prevent lruvec release in zswap_folio_swapin() Qi Zheng
2026-01-14 11:32 ` [PATCH v3 21/30] mm: swap: prevent lruvec release in lru_gen_clear_refs() Qi Zheng
2026-01-14 11:32 ` [PATCH v3 22/30] mm: workingset: prevent lruvec release in workingset_activation() Qi Zheng
2026-01-14 11:32 ` [PATCH v3 23/30] mm: do not open-code lruvec lock Qi Zheng
2026-01-15  9:26   ` Baoquan He
2026-01-15  9:31     ` Qi Zheng
2026-01-16  9:20   ` Muchun Song
2026-01-17 23:08   ` Shakeel Butt
2026-01-20  7:58   ` Harry Yoo
2026-01-14 11:32 ` [PATCH v3 24/30] mm: memcontrol: prepare for reparenting LRU pages for " Qi Zheng
2026-01-16  9:43   ` Muchun Song
2026-01-16  9:50     ` Qi Zheng
2026-01-18  0:44       ` Shakeel Butt
2026-01-19  3:44         ` Qi Zheng
2026-01-20 15:54           ` Shakeel Butt
2026-01-18  0:46   ` Shakeel Butt
2026-01-20  8:21   ` Harry Yoo
2026-01-20 11:51     ` Qi Zheng [this message]
2026-01-20 12:50       ` Harry Yoo
2026-01-14 11:32 ` [PATCH v3 25/30] mm: vmscan: prepare for reparenting traditional LRU folios Qi Zheng
2026-01-16  9:49   ` Muchun Song
2026-01-18  1:11   ` Shakeel Butt
2026-01-19  3:24     ` Qi Zheng
2026-01-14 11:32 ` [PATCH v3 26/30] mm: vmscan: prepare for reparenting MGLRU folios Qi Zheng
2026-01-15 10:44   ` [PATCH v3 26/30 fix] mm: mglru: do not call update_lru_size() during reparenting Qi Zheng
2026-01-15 17:46     ` Andrew Morton
2026-01-21  3:53     ` Harry Yoo
2026-01-21  4:19       ` Harry Yoo
2026-01-21 11:21         ` Qi Zheng
2026-01-18  3:25   ` [PATCH v3 26/30] mm: vmscan: prepare for reparenting MGLRU folios Shakeel Butt
2026-01-18  3:29   ` Shakeel Butt
2026-01-19  3:39     ` Qi Zheng
2026-01-14 11:32 ` [PATCH v3 27/30] mm: memcontrol: refactor memcg_reparent_objcgs() Qi Zheng
2026-01-18  2:31   ` Shakeel Butt
2026-01-22  9:04   ` Harry Yoo
2026-01-22  9:13   ` Muchun Song
2026-01-14 11:32 ` [PATCH v3 28/30] mm: memcontrol: prepare for reparenting state_local Qi Zheng
2026-01-15 10:41   ` [PATCH v3 28/30 fix 1/2] mm: memcontrol: fix lruvec_stats->state_local reparenting Qi Zheng
2026-01-15 10:41     ` [PATCH v3 28/30 fix 2/2] mm: memcontrol: change state_locals to atomic_long_t type Qi Zheng
2026-01-15 17:47     ` [PATCH v3 28/30 fix 1/2] mm: memcontrol: fix lruvec_stats->state_local reparenting Andrew Morton
2026-01-16  3:27       ` Qi Zheng
2026-01-18  3:22     ` Shakeel Butt
2026-01-19  3:36       ` Qi Zheng
2026-01-20  7:19         ` Muchun Song
2026-01-20 18:47           ` Shakeel Butt
2026-01-21  3:43             ` Qi Zheng
2026-01-21  8:20               ` Shakeel Butt
2026-01-21 11:25                 ` Qi Zheng
2026-01-18  3:20   ` [PATCH v3 28/30] mm: memcontrol: prepare for reparenting state_local Shakeel Butt
2026-01-19  3:34     ` Qi Zheng
2026-01-29  2:10       ` Harry Yoo
2026-01-29  8:50         ` Qi Zheng
2026-01-29 12:23           ` Harry Yoo
2026-01-30  7:22             ` Qi Zheng
2026-02-02  3:15               ` Harry Yoo
2026-01-14 11:32 ` [PATCH v3 29/30] mm: memcontrol: eliminate the problem of dying memory cgroup for LRU folios Qi Zheng
2026-01-14 11:32 ` [PATCH v3 30/30] mm: lru: add VM_WARN_ON_ONCE_FOLIO to lru maintenance helpers Qi Zheng
2026-01-14 17:07 ` [syzbot ci] Re: Eliminate Dying Memory Cgroup syzbot ci
2026-01-15  3:47   ` Qi Zheng
2026-01-14 17:58 ` [PATCH v3 00/30] " Andrew Morton
2026-01-15  3:52   ` Qi Zheng
2026-01-15  5:59     ` Andrew Morton
2026-01-15  6:05       ` Qi Zheng
2026-01-15 12:40   ` Lorenzo Stoakes
2026-01-16  0:43     ` Andrew Morton
2026-01-16  8:33       ` Lorenzo Stoakes
2026-01-16 12:25         ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=88d90d30-8f54-43f5-98d6-1769aa05a10a@linux.dev \
    --to=qi.zheng@linux.dev \
    --cc=akpm@linux-foundation.org \
    --cc=apais@linux.microsoft.com \
    --cc=axelrasmussen@google.com \
    --cc=cgroups@vger.kernel.org \
    --cc=chenridong@huaweicloud.com \
    --cc=david@kernel.org \
    --cc=hamzamahfooz@linux.microsoft.com \
    --cc=hannes@cmpxchg.org \
    --cc=harry.yoo@oracle.com \
    --cc=hughd@google.com \
    --cc=imran.f.khan@oracle.com \
    --cc=kamalesh.babulal@oracle.com \
    --cc=lance.yang@linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=mhocko@suse.com \
    --cc=mkoutny@suse.com \
    --cc=muchun.song@linux.dev \
    --cc=roman.gushchin@linux.dev \
    --cc=shakeel.butt@linux.dev \
    --cc=songmuchun@bytedance.com \
    --cc=weixugc@google.com \
    --cc=yosry.ahmed@linux.dev \
    --cc=yuanchu@google.com \
    --cc=zhengqi.arch@bytedance.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox