From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E11CCC3ABC9 for ; Tue, 13 May 2025 10:22:34 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 3E3136B0085; Tue, 13 May 2025 06:22:33 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 3679C6B00D0; Tue, 13 May 2025 06:22:33 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1B93B6B00CD; Tue, 13 May 2025 06:22:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id EC4B36B0085 for ; Tue, 13 May 2025 06:22:32 -0400 (EDT) Received: from smtpin05.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id BCA6216054C for ; Tue, 13 May 2025 10:22:33 +0000 (UTC) X-FDA: 83437495386.05.82B0304 Received: from smtp-out1.suse.de (smtp-out1.suse.de [195.135.223.130]) by imf28.hostedemail.com (Postfix) with ESMTP id 33F0DC000F for ; Tue, 13 May 2025 10:22:30 +0000 (UTC) Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=ChNCLmYI; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b="GXZC/sjq"; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=ChNCLmYI; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b="GXZC/sjq"; spf=pass (imf28.hostedemail.com: domain of vbabka@suse.cz designates 195.135.223.130 as permitted sender) smtp.mailfrom=vbabka@suse.cz; dmarc=none ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1747131751; a=rsa-sha256; cv=none; b=u0P1Jaj4ZrO8Lfwf3p3JdUWjRi8t0Iteh6ltXU1nOOT5q3IRXvSk4NZxmrFImkESrPDsii amxq+mO+DCGNda/C7I5yhTg9hAe0AMcgXXADL4ulw2mKdlcpIp8Cx6PwULC6xYWIGEPQQ4 /pigzVVcgeeVEjFObfnD9s1eoSGhVn4= ARC-Authentication-Results: i=1; imf28.hostedemail.com; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=ChNCLmYI; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b="GXZC/sjq"; dkim=pass header.d=suse.cz header.s=susede2_rsa header.b=ChNCLmYI; dkim=pass header.d=suse.cz header.s=susede2_ed25519 header.b="GXZC/sjq"; spf=pass (imf28.hostedemail.com: domain of vbabka@suse.cz designates 195.135.223.130 as permitted sender) smtp.mailfrom=vbabka@suse.cz; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1747131751; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=mrZHYTr94L6UiUBzTPPsFJdqRFgV3hVQF43r7tnRnVM=; b=LWAOja/bxSoam+Bwi9o8NtZV/lgcIEsv1sQqNt2FrdkaSpTkPSDlHPZAk4aB5CMC5/ZKsM PFrznULf/FwXBozOqFtbwEltpHUXnp8Kl9XItFhThioz020oJNiTTGoDl8vppN7WmTLz6g 7rpLhDUZfsFTPt1kh7h7qtuR807WUAs= Received: from imap1.dmz-prg2.suse.org (imap1.dmz-prg2.suse.org [IPv6:2a07:de40:b281:104:10:150:64:97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by smtp-out1.suse.de (Postfix) with ESMTPS id 4E8AE211D2; Tue, 13 May 2025 10:22:29 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1747131749; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=mrZHYTr94L6UiUBzTPPsFJdqRFgV3hVQF43r7tnRnVM=; b=ChNCLmYISGXYX57UNr2nhroIDOi+2pNZhFuO5IVV8uaTn8ZrMF83DJG6bCLUwVuQveMrw2 mOodNvkGB3R2zdSdYGaub2D9F+fyF9Xgt1dXY6FBZ+jS7rgTprqwfUvr/cqWRZRtLJfhO7 OmrdBjya7s8VCJWuNqx0qWU3RzyMC/U= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1747131749; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=mrZHYTr94L6UiUBzTPPsFJdqRFgV3hVQF43r7tnRnVM=; b=GXZC/sjqv9BstPD8ulzua7sBguJNvVGKA4T/kHsmUupPwVlRSxCvhFjXmKJKPO8yv79fti u1OEC5HxJytRfEAw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_rsa; t=1747131749; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=mrZHYTr94L6UiUBzTPPsFJdqRFgV3hVQF43r7tnRnVM=; b=ChNCLmYISGXYX57UNr2nhroIDOi+2pNZhFuO5IVV8uaTn8ZrMF83DJG6bCLUwVuQveMrw2 mOodNvkGB3R2zdSdYGaub2D9F+fyF9Xgt1dXY6FBZ+jS7rgTprqwfUvr/cqWRZRtLJfhO7 OmrdBjya7s8VCJWuNqx0qWU3RzyMC/U= DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=suse.cz; s=susede2_ed25519; t=1747131749; h=from:from:reply-to:date:date:message-id:message-id:to:to:cc:cc: mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=mrZHYTr94L6UiUBzTPPsFJdqRFgV3hVQF43r7tnRnVM=; b=GXZC/sjqv9BstPD8ulzua7sBguJNvVGKA4T/kHsmUupPwVlRSxCvhFjXmKJKPO8yv79fti u1OEC5HxJytRfEAw== Received: from imap1.dmz-prg2.suse.org (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by imap1.dmz-prg2.suse.org (Postfix) with ESMTPS id 251D6137E8; Tue, 13 May 2025 10:22:29 +0000 (UTC) Received: from dovecot-director2.suse.de ([2a07:de40:b281:106:10:150:64:167]) by imap1.dmz-prg2.suse.org with ESMTPSA id 8EG0CGUdI2hdGwAAD6G6ig (envelope-from ); Tue, 13 May 2025 10:22:29 +0000 Message-ID: Date: Tue, 13 May 2025 12:22:28 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [RFC PATCH 1/7] memcg: memcg_rstat_updated re-entrant safe against irqs Content-Language: en-US To: Shakeel Butt , Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Alexei Starovoitov , Sebastian Andrzej Siewior , Harry Yoo , Yosry Ahmed , bpf@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team , Mathieu Desnoyers References: <20250513031316.2147548-1-shakeel.butt@linux.dev> <20250513031316.2147548-2-shakeel.butt@linux.dev> From: Vlastimil Babka In-Reply-To: <20250513031316.2147548-2-shakeel.butt@linux.dev> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Rspamd-Action: no action X-Rspam-User: X-Rspamd-Server: rspam01 X-Rspamd-Queue-Id: 33F0DC000F X-Stat-Signature: czcfrdhkksz4sawmjwqay1huot3h1dpx X-HE-Tag: 1747131750-143438 X-HE-Meta: U2FsdGVkX18lEFnQ0K6fobhMrFXCedC8NW1KcePvoETFweqze4U6PBmyEYE91/Guukdoj7JZwgdn+hv3vWndhZfz8P25MpBmM8T3hrrpOFSRbJf2gKw9ubJ4DiIwpVb4gFhat4R/BbBDZXYPN5l98qNdCSdvC8xzclfGrUeFLoROQfvXUhjBbDbafKFtI5uW24O48PA35HOZ1Oks7Y/Zu2Oic2LDtb02Z0eEdPxRLP6btjVkiKoaIYV3vCbXd8JnuUDvHBXg4jDNAwsx9eJNwuTi5nO/0b6y0kY836GAvL/YRMR42qPETcFpeBR9woWwg3+31ICvp7KhFV6xNBd/luWmS0gr5vX5VfVHFyRLVTrsJhfZ9e2CIfRq7ViQqnZf6fyKxrHU/Bm3PbTXHMwEaQwupMN6oGoD3/nPd2OENLALWC208UfC9LmA4lo/aYx7ZQp1GbbB7433171lR2bZAzIxqIF2/zs59aTmRvNGDtxjsZ+IHNQe9h2ja7ZjcTkosb9tKKJ8PffFAklIClFKHbrgokzY6VJj6vtXDpqy2269LSAHnD9Ts7BUhOW7kUMP2xZmb9s38rICtxlHgdo0TNdpvhiajrEH43J9o29vJaSoyCMXQ26xCFk6Jlg7MZjMCdcCpCraSLx1vhxpQ6WOF8rMeFJLa4+mZ5xG/PKdipTCJqEXOPY7jsGo7LhwkfiQgfVP6/nsshi+lCub//1is89DdgfKnNoHjf38NqAnjSokQng5Niq2YFbkJGds9FCYepdztMO7W6kows69xpw6wPkj+Y4MohcRovivaMEC/1ikeJvm07/49xEeaJL/uoSJiaECerm7yccHaCBTfsJtrRiJwXdlN/LF6qmB4gnryr1VmGrEos+1mRIvKRQfiQuuH3fU4GiBzXm+5ocCv1Ejztvk3Jb7ygDML7WKlX3zOc2h81IsDaXWWorDFK22sSABI6yy6tMHw5zjEtvV3ap wTFYjmCK uGUs5sNLFFi2lURGJAbRv/IOL0DSSWS3qBDULMbuk1x2Rtyg2WQ9xzavl73L6WqhfFF+GVynLPOyAMO6oNASkPG6IPOXX2fcOApzRChSp6k2BopnMF8OjShwB3vC7zXJOZXaMkFKONG9SQfMTVkPF4YwYMUR8PQgFr0Zz9MWDkacjhntCZXk8ItEoYq/XsJS8z6ThsseM1+punvz4CuWF6HY9Pe5gVbCwIr7ZRjtIPUm9wlT0Jlh0h0jlLG+mMLxBVPHjhFkVjrtIYIq6DPMc8bMdQmQ0oPZ+HFjIlIL2K9QUEiUvkQBCLWt+cA6Mx85svxudA4WEUwfTaaL7j/YSXfRsxCI4bqBGdI4GUUB6Q3nPomYbGPNOtsptYS1uBQW14nwQ X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 5/13/25 05:13, Shakeel Butt wrote: > The function memcg_rstat_updated() is used to track the memcg stats > updates for optimizing the flushes. At the moment, it is not re-entrant > safe and the callers disabled irqs before calling. However to achieve > the goal of updating memcg stats without irqs, memcg_rstat_updated() > needs to be re-entrant safe against irqs. > > This patch makes memcg_rstat_updated() re-entrant safe against irqs. > However it is using atomic_* ops which on x86, adds lock prefix to the > instructions. Since this is per-cpu data, the this_cpu_* ops are > preferred. However the percpu pointer is stored in struct mem_cgroup and > doing the upward traversal through struct mem_cgroup may cause two cache > misses as compared to traversing through struct memcg_vmstats_percpu > pointer. > > NOTE: explore if there is atomic_* ops alternative without lock prefix. local_t might be what you want here https://docs.kernel.org/core-api/local_ops.html Or maybe just add __percpu to parent like this? struct memcg_vmstats_percpu { ... struct memcg_vmstats_percpu __percpu *parent; ... } Yes, it means on each cpu's struct memcg_vmstats_percpu instance there will be actually the same value stored (the percpu offset) instead of the cpu-specific parent pointer, which might seem wasteful. But AFAIK this_cpu_* is optimized enough thanks to the segment register usage, that it doesn't matter? It shouldn't cause any extra cache miss you worry about, IIUC? With that I think you could refactor that code to use e.g. this_cpu_add_return() and this_cpu_xchg() on the stats_updates and obtain the parent "pointer" in a way that's also compatible with these operations. That is unless we want also nmi safety, then we're back to the issue of the previous series... > Signed-off-by: Shakeel Butt > --- > mm/memcontrol.c | 21 +++++++++++++-------- > 1 file changed, 13 insertions(+), 8 deletions(-) > > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index 6cfa3550f300..2c4c095bf26c 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -503,7 +503,7 @@ static inline int memcg_events_index(enum vm_event_item idx) > > struct memcg_vmstats_percpu { > /* Stats updates since the last flush */ > - unsigned int stats_updates; > + atomic_t stats_updates; > > /* Cached pointers for fast iteration in memcg_rstat_updated() */ > struct memcg_vmstats_percpu *parent; > @@ -590,12 +590,15 @@ static bool memcg_vmstats_needs_flush(struct memcg_vmstats *vmstats) > static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val) > { > struct memcg_vmstats_percpu *statc; > - int cpu = smp_processor_id(); > - unsigned int stats_updates; > + int cpu; > + int stats_updates; > > if (!val) > return; > > + /* Don't assume callers have preemption disabled. */ > + cpu = get_cpu(); > + > cgroup_rstat_updated(memcg->css.cgroup, cpu); > statc = this_cpu_ptr(memcg->vmstats_percpu); > for (; statc; statc = statc->parent) { > @@ -607,14 +610,16 @@ static inline void memcg_rstat_updated(struct mem_cgroup *memcg, int val) > if (memcg_vmstats_needs_flush(statc->vmstats)) > break; > > - stats_updates = READ_ONCE(statc->stats_updates) + abs(val); > - WRITE_ONCE(statc->stats_updates, stats_updates); > + stats_updates = atomic_add_return(abs(val), &statc->stats_updates); > if (stats_updates < MEMCG_CHARGE_BATCH) > continue; > > - atomic64_add(stats_updates, &statc->vmstats->stats_updates); > - WRITE_ONCE(statc->stats_updates, 0); > + stats_updates = atomic_xchg(&statc->stats_updates, 0); > + if (stats_updates) > + atomic64_add(stats_updates, > + &statc->vmstats->stats_updates); > } > + put_cpu(); > } > > static void __mem_cgroup_flush_stats(struct mem_cgroup *memcg, bool force) > @@ -4155,7 +4160,7 @@ static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu) > mem_cgroup_stat_aggregate(&ac); > > } > - WRITE_ONCE(statc->stats_updates, 0); > + atomic_set(&statc->stats_updates, 0); > /* We are in a per-cpu loop here, only do the atomic write once */ > if (atomic64_read(&memcg->vmstats->stats_updates)) > atomic64_set(&memcg->vmstats->stats_updates, 0);