From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Mon, 3 Mar 2025 12:09:13 -0800
From: Shakeel Butt
To: Yosry Ahmed
Cc: Michal Koutný, inwardvessel, tj@kernel.org, mhocko@kernel.org,
 hannes@cmpxchg.org, akpm@linux-foundation.org, linux-mm@kvack.org,
 cgroups@vger.kernel.org, kernel-team@meta.com
Subject: Re: [PATCH 3/4 v2] cgroup: separate rstat locks for subsystems
Message-ID: <4u2evk2hcg4ynas4jom6hk45vcwgysjxlodhun3gmgbyld7krd@g66fne6mzt6y>
References: <20250227215543.49928-1-inwardvessel@gmail.com>
 <20250227215543.49928-4-inwardvessel@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To:

On Mon, Mar 03, 2025 at 07:50:20PM +0000, Yosry Ahmed wrote:
> On Mon, Mar 03, 2025 at 10:40:45AM -0800, Shakeel Butt wrote:
> > On Mon, Mar 03, 2025 at 06:29:53PM +0000, Yosry Ahmed wrote:
> > > On Mon, Mar 03, 2025 at 04:22:42PM +0100, Michal Koutný wrote:
> > > > On Thu, Feb 27, 2025 at 01:55:42PM -0800, inwardvessel wrote:
> > > > > From: JP Kobryn
> > > > ...
> > > > > +static inline bool is_base_css(struct cgroup_subsys_state *css)
> > > > > +{
> > > > > +	return css->ss == NULL;
> > > > > +}
> > > >
> > > > A similar predicate is also used in cgroup.c (various cgroup vs subsys
> > > > lifecycle functions, e.g. css_free_rwork_fn()). I think it'd be better
> > > > unified, i.e. open code the predicate here or use the helper in both
> > > > cases (css_is_cgroup() or similar).
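
(For illustration only: a minimal sketch of the unified helper Michal is
suggesting. The css_is_cgroup() name comes from his comment above; its
exact placement and form are assumptions, not part of the posted patch.)

/*
 * The cgroup's own "base" css has no subsystem attached, so a NULL
 * css->ss distinguishes it from the per-subsystem css objects.
 */
static inline bool css_is_cgroup(struct cgroup_subsys_state *css)
{
	return css->ss == NULL;
}
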
> > > >
> > > > >  void __init cgroup_rstat_boot(void)
> > > > >  {
> > > > > -	int cpu;
> > > > > +	struct cgroup_subsys *ss;
> > > > > +	int cpu, ssid;
> > > > >
> > > > > -	for_each_possible_cpu(cpu)
> > > > > -		raw_spin_lock_init(per_cpu_ptr(&cgroup_rstat_cpu_lock, cpu));
> > > > > +	for_each_subsys(ss, ssid) {
> > > > > +		spin_lock_init(&cgroup_rstat_subsys_lock[ssid]);
> > > > > +	}
> > > >
> > > > Hm, with this loop I realize it may be worth putting this lock into
> > > > struct cgroup_subsys_state and initializing them in
> > > > cgroup_init_subsys() to keep all per-subsys data in one pack.
> > >
> > > I thought about this, but this would have unnecessary memory overhead as
> > > we only need one lock per-subsystem. So having a lock in every single
> > > css is wasteful.
> > >
> > > Maybe we can put the lock in struct cgroup_subsys? Then we can still
> > > initialize them in cgroup_init_subsys().
> > >
> >
> > Actually, one of the things I was thinking about is whether we can just
> > not have a per-subsystem lock at all. At the moment, it is protecting
> > the rstat_flush_next field (today it is in cgroup; in JP's series it is
> > in css). What if we make it per-cpu? Then we don't need the
> > per-subsystem lock at all. Let me know if I missed something which is
> > being protected by this lock.
>
> I think it protects more than that. I remember looking into this before,
> and the thing I remember is the stats themselves. Looking at
> mem_cgroup_stat_aggregate(), we aggregate the stats into the per-cgroup
> counters non-atomically. This is only protected by the rstat lock
> (currently global, per-subsystem with the series) AFAICT.
>
> Not sure if only the memory subsystem has this dependency or if others
> do as well. I remember looking into switching these counters to atomics
> to remove the global lock, but it performed worse IIRC.
>
> I also remember looking into partitioning the lock into a per-cgroup (or
> per-css now?) lock, and only holding the locks of the parent and child
> cgroups as we flush each cgroup. I don't remember if I actually
> implemented this, but it introduces complexity.
>
> Perhaps we can defer the locking to the subsystem, if only the memory
> controller requires it. In this case mem_cgroup_css_rstat_flush() (or
> mem_cgroup_stat_aggregate()) can hold a memcg-specific spinlock only
> while aggregating the stats.
>
> There could be other things protected by the lock, but that's what I
> remember.

Thanks for reminding me. It does not seem straightforward or simple.
Lower priority then.
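
(For illustration only: a rough sketch of the "defer the locking to the
subsystem" idea Yosry describes above. memcg_aggregation_lock is a
made-up name and the function body is approximate; this is not an
actual patch.)

static DEFINE_SPINLOCK(memcg_aggregation_lock);	/* hypothetical */

static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css,
				       int cpu)
{
	struct mem_cgroup *memcg = mem_cgroup_from_css(css);

	/*
	 * Take a memcg-private lock only around the non-atomic
	 * aggregation of this CPU's deltas, so the rstat lock no
	 * longer has to serialize these read-modify-write updates.
	 */
	spin_lock(&memcg_aggregation_lock);
	/* ... aggregate per-CPU deltas into memcg->vmstats ... */
	spin_unlock(&memcg_aggregation_lock);
}
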