Message-ID: <9e1e3877-55ae-4546-a5c1-08ea730ea638@suse.cz>
Date: Fri, 14 Mar 2025 11:54:34 +0100
Subject: Re: [RFC PATCH 10/10] memcg: no more irq disabling for stock locks
From: Vlastimil Babka
To: Shakeel Butt, Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
 Sebastian Andrzej Siewior, linux-mm@kvack.org, cgroups@vger.kernel.org,
 linux-kernel@vger.kernel.org, Meta kernel team
References: <20250314061511.1308152-1-shakeel.butt@linux.dev>
 <20250314061511.1308152-11-shakeel.butt@linux.dev>
In-Reply-To: <20250314061511.1308152-11-shakeel.butt@linux.dev>
On 3/14/25 07:15, Shakeel Butt wrote:
> Let's switch all memcg_stock lock acquire and release sites to not
> disable and enable irqs. There are still two functions (i.e.
> mod_objcg_state() and drain_obj_stock()) which need to disable irqs to
> update the stats on non-RT kernels. For now add a simple wrapper for
> that functionality.

BTW, which part of __mod_objcg_mlstate() really needs irqs disabled and
not just preemption? I see it does rcu_read_lock() anyway, which disables
preemption. Then in __mod_memcg_lruvec_state() we do some __this_cpu_add()
updates. I think these are also fine with just preemption disabled, as
they are atomic vs irqs (but don't need a LOCK prefix to be atomic vs
updates from other cpus). Is it just memcg_rstat_updated() which does
READ_ONCE/WRITE_ONCE? Could we perhaps just change it to operations for
which disabled preemption is enough?

> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> ---
>  mm/memcontrol.c | 83 ++++++++++++++++++++++++++++++++-----------------
>  1 file changed, 54 insertions(+), 29 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index ba5d004049d3..fa28efa298f4 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -1796,22 +1796,17 @@ static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
>   *
>   * returns true if successful, false otherwise.
>   */
> -static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages,
> -			  gfp_t gfp_mask)
> +static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
>  {
>  	struct memcg_stock_pcp *stock;
>  	unsigned int stock_pages;
> -	unsigned long flags;
>  	bool ret = false;
> 
>  	if (nr_pages > MEMCG_CHARGE_BATCH)
>  		return ret;
> 
> -	if (!localtry_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
> -		if (!gfpflags_allow_spinning(gfp_mask))
> +	if (!localtry_trylock(&memcg_stock.stock_lock))
>  		return ret;
> -		localtry_lock_irqsave(&memcg_stock.stock_lock, flags);
> -	}
> 
>  	stock = this_cpu_ptr(&memcg_stock);
>  	stock_pages = READ_ONCE(stock->nr_pages);
> @@ -1820,7 +1815,7 @@ static bool consume_stock(struct mem_cgroup *memcg, unsigned int nr_pages,
>  		ret = true;
>  	}
> 
> -	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
> +	localtry_unlock(&memcg_stock.stock_lock);
> 
>  	return ret;
>  }
> @@ -1855,7 +1850,6 @@ static void drain_stock(struct memcg_stock_pcp *stock)
>  static void drain_local_stock(struct work_struct *dummy)
>  {
>  	struct memcg_stock_pcp *stock;
> -	unsigned long flags;
> 
>  	lockdep_assert_once(in_task());
> 
> @@ -1864,14 +1858,14 @@ static void drain_local_stock(struct work_struct *dummy)
>  	 * drain_stock races is that we always operate on local CPU stock
>  	 * here with IRQ disabled
>  	 */
> -	localtry_lock_irqsave(&memcg_stock.stock_lock, flags);
> +	localtry_lock(&memcg_stock.stock_lock);
> 
>  	stock = this_cpu_ptr(&memcg_stock);
>  	drain_obj_stock(stock);
>  	drain_stock(stock);
>  	clear_bit(FLUSHING_CACHED_CHARGE, &stock->flags);
> 
> -	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
> +	localtry_unlock(&memcg_stock.stock_lock);
>  }
> 
>  /* Should never be called with root_mem_cgroup. */
> @@ -1879,9 +1873,8 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
>  {
>  	struct memcg_stock_pcp *stock;
>  	unsigned int stock_pages;
> -	unsigned long flags;
> 
> -	if (!localtry_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
> +	if (!localtry_trylock(&memcg_stock.stock_lock)) {
>  		/*
>  		 * In case of unlikely failure to lock percpu stock_lock
>  		 * uncharge memcg directly.
> @@ -1902,7 +1895,7 @@ static void refill_stock(struct mem_cgroup *memcg, unsigned int nr_pages)
>  	if (stock_pages > MEMCG_CHARGE_BATCH)
>  		drain_stock(stock);
> 
> -	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
> +	localtry_unlock(&memcg_stock.stock_lock);
>  }
> 
>  /*
> @@ -1953,17 +1946,12 @@ void drain_all_stock(struct mem_cgroup *root_memcg)
>  static int memcg_hotplug_cpu_dead(unsigned int cpu)
>  {
>  	struct memcg_stock_pcp *stock;
> -	unsigned long flags;
> 
>  	lockdep_assert_once(in_task());
> 
>  	stock = &per_cpu(memcg_stock, cpu);
> 
> -	/* drain_obj_stock requires stock_lock */
> -	localtry_lock_irqsave(&memcg_stock.stock_lock, flags);
>  	drain_obj_stock(stock);
> -	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
> -
>  	drain_stock(stock);
> 
>  	return 0;
> @@ -2254,7 +2242,7 @@ static int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask,
>  	unsigned long pflags;
> 
>  retry:
> -	if (consume_stock(memcg, nr_pages, gfp_mask))
> +	if (consume_stock(memcg, nr_pages))
>  		return 0;
> 
>  	if (!gfpflags_allow_spinning(gfp_mask))
> @@ -2757,6 +2745,28 @@ static void replace_stock_objcg(struct memcg_stock_pcp *stock,
>  	WRITE_ONCE(stock->cached_objcg, objcg);
>  }
> 
> +static unsigned long rt_lock(void)
> +{
> +#ifdef CONFIG_PREEMPT_RT
> +	migrate_disable();
> +	return 0;
> +#else
> +	unsigned long flags = 0;
> +
> +	local_irq_save(flags);
> +	return flags;
> +#endif
> +}
> +
> +static void rt_unlock(unsigned long flags)
> +{
> +#ifdef CONFIG_PREEMPT_RT
> +	migrate_enable();
> +#else
> +	local_irq_restore(flags);
> +#endif
> +}
> +
>  static void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
>  			    enum node_stat_item idx, int nr)
>  {
> @@ -2764,7 +2774,8 @@ static void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
>  	unsigned long flags;
>  	int *bytes;
> 
> -	if (!localtry_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
> +	if (!localtry_trylock(&memcg_stock.stock_lock)) {
> +		/* Do we need rt_[un]lock here too? */
>  		__mod_objcg_mlstate(objcg, pgdat, idx, nr);
>  		return;
>  	}
> @@ -2783,6 +2794,8 @@ static void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
>  		/* Flush the existing cached vmstat data */
>  		struct pglist_data *oldpg = stock->cached_pgdat;
> 
> +		flags = rt_lock();
> +
>  		if (stock->nr_slab_reclaimable_b) {
>  			__mod_objcg_mlstate(objcg, oldpg, NR_SLAB_RECLAIMABLE_B,
>  					    stock->nr_slab_reclaimable_b);
> @@ -2793,6 +2806,8 @@ static void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
>  					    stock->nr_slab_unreclaimable_b);
>  			stock->nr_slab_unreclaimable_b = 0;
>  		}
> +
> +		rt_unlock(flags);
>  		stock->cached_pgdat = pgdat;
>  	}
> 
> @@ -2814,19 +2829,21 @@ static void mod_objcg_state(struct obj_cgroup *objcg, struct pglist_data *pgdat,
>  			nr = 0;
>  		}
>  	}
> -	if (nr)
> +	if (nr) {
> +		flags = rt_lock();
>  		__mod_objcg_mlstate(objcg, pgdat, idx, nr);
> +		rt_unlock(flags);
> +	}
> 
> -	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
> +	localtry_unlock(&memcg_stock.stock_lock);
>  }
> 
>  static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
>  {
>  	struct memcg_stock_pcp *stock;
> -	unsigned long flags;
>  	bool ret = false;
> 
> -	if (!localtry_trylock_irqsave(&memcg_stock.stock_lock, flags))
> +	if (!localtry_trylock(&memcg_stock.stock_lock))
>  		return ret;
> 
>  	stock = this_cpu_ptr(&memcg_stock);
> @@ -2835,7 +2852,7 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
>  		ret = true;
>  	}
> 
> -	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
> +	localtry_unlock(&memcg_stock.stock_lock);
> 
>  	return ret;
>  }
> @@ -2843,10 +2860,16 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes)
>  static void drain_obj_stock(struct memcg_stock_pcp *stock)
>  {
>  	struct obj_cgroup *old = READ_ONCE(stock->cached_objcg);
> +	unsigned long flags;
> +	bool locked = stock->nr_bytes || stock->nr_slab_reclaimable_b ||
> +		      stock->nr_slab_unreclaimable_b;
> 
>  	if (!old)
>  		return;
> 
> +	if (locked)
> +		flags = rt_lock();
> +
>  	if (stock->nr_bytes) {
>  		unsigned int nr_pages = stock->nr_bytes >> PAGE_SHIFT;
>  		unsigned int nr_bytes = stock->nr_bytes & (PAGE_SIZE - 1);
> @@ -2897,6 +2920,9 @@ static void drain_obj_stock(struct memcg_stock_pcp *stock)
>  		stock->cached_pgdat = NULL;
>  	}
> 
> +	if (locked)
> +		rt_unlock(flags);
> +
>  	WRITE_ONCE(stock->cached_objcg, NULL);
>  	obj_cgroup_put(old);
>  }
> @@ -2920,10 +2946,9 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
>  				  bool allow_uncharge)
>  {
>  	struct memcg_stock_pcp *stock;
> -	unsigned long flags;
>  	unsigned int nr_pages = 0;
> 
> -	if (!localtry_trylock_irqsave(&memcg_stock.stock_lock, flags)) {
> +	if (!localtry_trylock(&memcg_stock.stock_lock)) {
>  		atomic_add(nr_bytes, &objcg->nr_charged_bytes);
>  		return;
>  	}
> @@ -2940,7 +2965,7 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
>  		stock->nr_bytes &= (PAGE_SIZE - 1);
>  	}
> 
> -	localtry_unlock_irqrestore(&memcg_stock.stock_lock, flags);
> +	localtry_unlock(&memcg_stock.stock_lock);
> 
>  	if (nr_pages)
>  		obj_cgroup_uncharge_pages(objcg, nr_pages);