From: "Vlastimil Babka (SUSE)" <vbabka@kernel.org>
To: Hao Li <hao.li@linux.dev>, Johannes Weiner <hannes@cmpxchg.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Michal Hocko <mhocko@kernel.org>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	Vlastimil Babka <vbabka@suse.cz>,
	Harry Yoo <harry.yoo@oracle.com>,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 5/5] mm: memcg: separate slab stat accounting from objcg charge cache
Date: Tue, 3 Mar 2026 11:42:31 +0100
Message-ID: <541a6661-7bfe-4517-a32c-5839002c61e5@kernel.org>
In-Reply-To: <ji2jjt4vtmt2ox7wzytpivttc4z7j3u6cwmv23r6xit5322gns@te4t4djl5nlk>

On 3/3/26 09:54, Hao Li wrote:
> On Mon, Mar 02, 2026 at 02:50:18PM -0500, Johannes Weiner wrote:
>> Cgroup slab metrics are cached per-cpu the same way as the sub-page
>> charge cache. However, the intertwined code to manage those dependent
>> caches right now is quite difficult to follow.
>> 
>> Specifically, cached slab stat updates occur in consume() if there was
>> enough charge cache to satisfy the new object. If that fails, whole
>> pages are reserved, and slab stats are updated when the remainder of
>> those pages, after subtracting the size of the new slab object, is
>> put into the charge cache. This already juggles a delicate mix of the
>> object size, the page charge size, and the remainder to put into the
>> byte cache. Doing slab accounting in this path as well is fragile, and
>> has recently caused a bug where the input parameters between the two
>> caches were mixed up.
>> 
>> Refactor the consume() and refill() paths into unlocked and locked
>> variants that only do charge caching. Then let the slab path manage
>> its own lock section and open-code charging and accounting.
>> 
>> This makes the slab stat cache subordinate to the charge cache:
>> __refill_obj_stock() is called first to prepare it;
>> __account_obj_stock() follows to hitch a ride.
>> 
>> This results in a minor behavioral change: previously, a mismatching
>> percpu stock would always be drained for the purpose of setting up
>> slab account caching, even if there was no byte remainder to put into
>> the charge cache. Now, the stock is left alone, and slab accounting
>> takes the uncached path if there is a mismatch. This is exceedingly
>> rare, and it was probably never worth draining the whole stock just to
>> cache the slab stat update.
>> 
>> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
>> ---
>>  mm/memcontrol.c | 100 +++++++++++++++++++++++++++++-------------------
>>  1 file changed, 61 insertions(+), 39 deletions(-)
>> 
>> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
>> index 4f12b75743d4..9c6f9849b717 100644
>> --- a/mm/memcontrol.c
>> +++ b/mm/memcontrol.c
>> @@ -3218,16 +3218,18 @@ static struct obj_stock_pcp *trylock_stock(void)
>>  
> 
> [...]
> 
>> @@ -3376,17 +3383,14 @@ static bool obj_stock_flush_required(struct obj_stock_pcp *stock,
>>  	return flush;
>>  }
>>  
>> -static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
>> -		bool allow_uncharge, int nr_acct, struct pglist_data *pgdat,
>> -		enum node_stat_item idx)
>> +static void __refill_obj_stock(struct obj_cgroup *objcg,
>> +			       struct obj_stock_pcp *stock,
>> +			       unsigned int nr_bytes,
>> +			       bool allow_uncharge)
>>  {
>> -	struct obj_stock_pcp *stock;
>>  	unsigned int nr_pages = 0;
>>  
>> -	stock = trylock_stock();
>>  	if (!stock) {
>> -		if (pgdat)
>> -			__account_obj_stock(objcg, NULL, nr_acct, pgdat, idx);
>>  		nr_pages = nr_bytes >> PAGE_SHIFT;
>>  		nr_bytes = nr_bytes & (PAGE_SIZE - 1);
>>  		atomic_add(nr_bytes, &objcg->nr_charged_bytes);
>> @@ -3404,20 +3408,25 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
>>  	}
>>  	stock->nr_bytes += nr_bytes;
>>  
>> -	if (pgdat)
>> -		__account_obj_stock(objcg, stock, nr_acct, pgdat, idx);
>> -
>>  	if (allow_uncharge && (stock->nr_bytes > PAGE_SIZE)) {
>>  		nr_pages = stock->nr_bytes >> PAGE_SHIFT;
>>  		stock->nr_bytes &= (PAGE_SIZE - 1);
>>  	}
>>  
>> -	unlock_stock(stock);
>>  out:
>>  	if (nr_pages)
>>  		obj_cgroup_uncharge_pages(objcg, nr_pages);
>>  }
>>  
>> +static void refill_obj_stock(struct obj_cgroup *objcg,
>> +			     unsigned int nr_bytes,
>> +			     bool allow_uncharge)
>> +{
>> +	struct obj_stock_pcp *stock = trylock_stock();
>> +	__refill_obj_stock(objcg, stock, nr_bytes, allow_uncharge);
>> +	unlock_stock(stock);
> 
> Hi Johannes,
> 
> I noticed that after this patch, obj_cgroup_uncharge_pages() is now inside
> the obj_stock.lock critical section. Since obj_cgroup_uncharge_pages() calls
> refill_stock(), which seems non-trivial, this might increase the lock hold time.
> In particular, could that lead to more failed trylocks for IRQ handlers on
> non-RT kernels (or for tasks that preempt others on RT kernels)?

Yes, and it also seems a bit self-defeating, at least in theory:

refill_obj_stock()
  trylock_stock()
  __refill_obj_stock()
    obj_cgroup_uncharge_pages()
      refill_stock()
        local_trylock() -> nested, will fail
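
The chain above can be modelled in userspace. This is only a minimal
sketch, not kernel code: struct toy_stock and all toy_*() helpers are
hypothetical stand-ins, and it assumes that both levels contend on the
same per-CPU lock, as the chain suggests. If the two stocks turn out to
use separate locks, the nested attempt would not fail.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a per-CPU stock guarded by a trylock. */
struct toy_stock {
	bool locked;		/* models the per-CPU trylock state */
	unsigned int nr_pages;	/* pages cached in the stock */
};

/* Non-blocking acquire: fails if the lock is already held. */
static bool toy_trylock(struct toy_stock *s)
{
	if (s->locked)
		return false;
	s->locked = true;
	return true;
}

static void toy_unlock(struct toy_stock *s)
{
	assert(s->locked);	/* unlocking an unheld lock is a bug */
	s->locked = false;
}

/* Models refill_stock(): caches pages only if it can take the lock. */
static bool toy_refill_stock(struct toy_stock *s, unsigned int nr_pages)
{
	if (!toy_trylock(s))
		return false;	/* a real caller would take the uncached path */
	s->nr_pages += nr_pages;
	toy_unlock(s);
	return true;
}

/* Uncharge while the lock is still held: the nested trylock fails. */
static bool toy_uncharge_under_lock(struct toy_stock *s, unsigned int nr_pages)
{
	bool cached;

	if (!toy_trylock(s))
		return false;
	cached = toy_refill_stock(s, nr_pages);	/* nested attempt */
	toy_unlock(s);
	return cached;
}

/* Uncharge after dropping the lock: the inner trylock can succeed. */
static bool toy_uncharge_after_unlock(struct toy_stock *s, unsigned int nr_pages)
{
	if (!toy_trylock(s))
		return false;
	toy_unlock(s);
	return toy_refill_stock(s, nr_pages);
}
```

Starting from a zero-initialized toy_stock, toy_uncharge_under_lock()
returns false and caches nothing, while toy_uncharge_after_unlock()
returns true and adds the pages to the stock.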


