From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D23D9C282C5 for ; Fri, 28 Feb 2025 20:33:50 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 30FB0280002; Fri, 28 Feb 2025 15:33:50 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 2BEFB280001; Fri, 28 Feb 2025 15:33:50 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1867E280002; Fri, 28 Feb 2025 15:33:50 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id E6211280001 for ; Fri, 28 Feb 2025 15:33:49 -0500 (EST) Received: from smtpin17.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 665B616125E for ; Fri, 28 Feb 2025 20:33:49 +0000 (UTC) X-FDA: 83170504578.17.E3819A9 Received: from out-173.mta0.migadu.com (out-173.mta0.migadu.com [91.218.175.173]) by imf01.hostedemail.com (Postfix) with ESMTP id 7EA4940011 for ; Fri, 28 Feb 2025 20:33:47 +0000 (UTC) Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b="vVRF/+JO"; spf=pass (imf01.hostedemail.com: domain of yosry.ahmed@linux.dev designates 91.218.175.173 as permitted sender) smtp.mailfrom=yosry.ahmed@linux.dev; dmarc=pass (policy=none) header.from=linux.dev ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1740774827; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=EqMg0hL+XFhc4KPH2HmRXMMpzVqSXfOvXWygEKuCZPs=; b=kEkQB27p0VTjjw1pJ8U4eezYkpiTIaNXfyf0JW9LNjWDqa8r/y5rio/7UqXChvoRyiUxs6 V9pgZ+0A+GPcXZ+qiRpz2kYv8DSyCsrRof7S9TEHDIn4mm6oXKdBYGQKcHxKuntm1k9KMo 70gv4aJO8lVwiqhWH0N7FEOSNUtIgU4= ARC-Authentication-Results: i=1; imf01.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b="vVRF/+JO"; spf=pass (imf01.hostedemail.com: domain of yosry.ahmed@linux.dev designates 91.218.175.173 as permitted sender) smtp.mailfrom=yosry.ahmed@linux.dev; dmarc=pass (policy=none) header.from=linux.dev ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1740774827; a=rsa-sha256; cv=none; b=WPI9m5XI55vx1QSMKdZ6tOZTC5wxWF6fpt5Y6Ow0fS4X0OC+irF+esogcXNZKtDk+yqs8S BKUAvgMXdFVxaEkq8Z9q/RsCTnMpTEapuE6SGTuQ/JBIEEG0v0t/rPSIEb8ac3SXeDEtPE Ffxvu9GR9GOvpVIjT9+WHDxXLrltslI= Date: Fri, 28 Feb 2025 20:33:40 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1740774825; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=EqMg0hL+XFhc4KPH2HmRXMMpzVqSXfOvXWygEKuCZPs=; b=vVRF/+JONwu4qvOU4RlxS92UudoXv+J6mW+zKd9lVc+BxJOVlJv11JFNY5NZ9nxJCajIqA 3/cAfgxpsJXTTJFIowEglyiVMTL4p9I2MajupG504g+/knlI4zT7zYfRlGoS5Xb2oQDcaM YxdC9O6A6YA6Ct0ScE1ROK140bRN/mg= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Yosry Ahmed To: inwardvessel Cc: tj@kernel.org, shakeel.butt@linux.dev, mhocko@kernel.org, hannes@cmpxchg.org, akpm@linux-foundation.org, linux-mm@kvack.org, cgroups@vger.kernel.org, kernel-team@meta.com Subject: Re: [PATCH 4/4 v2] cgroup: separate rstat list pointers from base stats Message-ID: References: <20250227215543.49928-1-inwardvessel@gmail.com> <20250227215543.49928-5-inwardvessel@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20250227215543.49928-5-inwardvessel@gmail.com> X-Migadu-Flow: FLOW_OUT X-Stat-Signature: z4xidcgwaukfcisoiufra3jtjmm94r13 X-Rspamd-Server: rspam11 X-Rspamd-Queue-Id: 7EA4940011 X-Rspam-User: X-HE-Tag: 1740774827-553546 X-HE-Meta: U2FsdGVkX1+2rpWazYj64MLOrQvY9BTBS7PzPZJcfeUGZbkNl5F2n3P5tKFw48TcXO67CCkcANOXI4rFb6gbdgsR95iE1A1fZeOSV9wBVYME79Lj7DXvGhx9TTkwWgf4ms1Jp01tsHC04ezZuNfgxV9YSI7A5bmkvXVH49zQuiwasCuOKKy4hOrLqYRyvRroQidSy1LVEF+Ise53ryVdDJ6gtWlUJpgiTRfQb8WN4TqkdFWnVJocQf/1/xNzCQuwucKq5PVk5sbdCdsyvHjjoPEZ2WQAGKvLDv2Q42GnhC570liwcfyU+eMTv8vNnfcwQU/Zvo/vyPkXr1noqDFaLgwSiRoMu7RTIJKwmr0gMhOIbC3EfgGsmCRrU2fRWkkRMB6U5AwfxHgpZz/6VRdNmbRWhAh9B//HxKmLt4Yk7M3QoKgpqOB6vA2rrwzw8Y7rpKeIMrT38cYCi9cbGmSbcZsm4CfVQTSbEcMu6516m1Xy7BzhdjQQqlUs9gPd52InN81XskpE1WStp+H6puR4+7O18hE45aR0xOVCTISjwgClYgG9lcL78P7tCIc3Ao4NvqHYTw3RPybJrdN9zIOXBZtNNyWvSIePDpB8UJ+BlWqTQRrDUiscUp+tnORFuxDz0P6g+JKzJ3Stom4b8cpLiiWcdgJDi06NGuKuM9RQ5eKbZdz5g0rwVbEhK+Mbdm2UGEn+/HurkiFX6InUsx+T3G8vXt3kcIVih6FvKsxH5nUGiy1ROxTdZuXD+TwPeVAIACNZislmpcDbrhAblcVgBpGoI3lrUo8Rkx80uxOwhKZe3ZKZYbHkUbhmRnZ10tpgAISL4nKTgaNeLOw2c+Usz7yVnQuiOTWEZScpY6DwxxD/K2t0zm2wb7T5J2oocTrv5rycULBEH7bHayCTe7sMCvMxgAlV1fBvoRvs1mkwj+7J6Pg5gs5hfArjqTIiVYdEq4a68Wys+Nr5vP6OE/y cyLy2r0t HZ9x6CzCEh+Wn31ZqdSWHLH19dLAzr6MsBXLz4E8zVK+kg0oakHNWBrNEeBCcwdHZIcEAd6yuFjVLYBASMEFQcV7gG6qBWRPV+S1OZRevHMSzCEbN4KziMvPPmHJjqbYhtQwcrdyZTMyy9WbSrIMpcvIjqLLxPgxNaEQZDRWChRDvMIYPwhymTPp2cXKg6WsRE2VPG4DrZQqh/CNwBmUiCvM12Z0T7D0Up42gdBmgSKHR7q4= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Feb 27, 2025 at 01:55:43PM -0800, inwardvessel wrote: > From: JP Kobryn > > A majority of the cgroup_rstat_cpu struct size is made up of the base > stat entities. Since only the "self" subsystem state makes use of these, > move them into a struct of their own. This allows for a new compact > cgroup_rstat_cpu struct that the formal subsystems can make use of. > Where applicable, decide on whether to allocate the compact or full > struct including the base stats. Mentioning the memory savings in this patch's log would be helpful. > > Signed-off-by: JP Kobryn > --- > include/linux/cgroup-defs.h | 37 ++++++++++++++---------- > kernel/cgroup/rstat.c | 57 +++++++++++++++++++++++++------------ > 2 files changed, 61 insertions(+), 33 deletions(-) > > diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h > index 1598e1389615..b0a07c63fd46 100644 > --- a/include/linux/cgroup-defs.h > +++ b/include/linux/cgroup-defs.h > @@ -170,7 +170,10 @@ struct cgroup_subsys_state { > struct percpu_ref refcnt; > > /* per-cpu recursive resource statistics */ > - struct cgroup_rstat_cpu __percpu *rstat_cpu; > + union { > + struct cgroup_rstat_cpu __percpu *rstat_cpu; > + struct cgroup_rstat_base_cpu __percpu *rstat_base_cpu; > + }; > > /* > * siblings list anchored at the parent's ->children > @@ -356,6 +359,24 @@ struct cgroup_base_stat { > * resource statistics on top of it - bsync, bstat and last_bstat. > */ > struct cgroup_rstat_cpu { > + /* > + * Child cgroups with stat updates on this cpu since the last read > + * are linked on the parent's ->updated_children through > + * ->updated_next. > + * > + * In addition to being more compact, singly-linked list pointing > + * to the cgroup makes it unnecessary for each per-cpu struct to > + * point back to the associated cgroup. > + * > + * Protected by per-cpu cgroup_rstat_cpu_lock. I just noticed, the patch that split the lock should have updated this comment. > + */ > + struct cgroup_subsys_state *updated_children; /* terminated by self */ > + struct cgroup_subsys_state *updated_next; /* NULL if not on list */ > +}; > + > +struct cgroup_rstat_base_cpu { > + struct cgroup_rstat_cpu self; Why 'self'? Why not 'rstat_cpu' like it's named in struct cgroup_subsys_state? > + > /* > * ->bsync protects ->bstat. These are the only fields which get > * updated in the hot path. > @@ -382,20 +403,6 @@ struct cgroup_rstat_cpu { > * deltas to propagate to the per-cpu subtree_bstat. > */ > struct cgroup_base_stat last_subtree_bstat; > - > - /* > - * Child cgroups with stat updates on this cpu since the last read > - * are linked on the parent's ->updated_children through > - * ->updated_next. > - * > - * In addition to being more compact, singly-linked list pointing > - * to the cgroup makes it unnecessary for each per-cpu struct to > - * point back to the associated cgroup. > - * > - * Protected by per-cpu cgroup_rstat_cpu_lock. > - */ > - struct cgroup_subsys_state *updated_children; /* terminated by self */ > - struct cgroup_subsys_state *updated_next; /* NULL if not on list */ > }; > > struct cgroup_freezer_state { > diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c > index b3eaefc1fd07..c08ebe2f9568 100644 > --- a/kernel/cgroup/rstat.c > +++ b/kernel/cgroup/rstat.c > @@ -24,6 +24,12 @@ static struct cgroup_rstat_cpu *cgroup_rstat_cpu( > return per_cpu_ptr(css->rstat_cpu, cpu); > } > > +static struct cgroup_rstat_base_cpu *cgroup_rstat_base_cpu( > + struct cgroup_subsys_state *css, int cpu) > +{ > + return per_cpu_ptr(css->rstat_base_cpu, cpu); > +} > + > static inline bool is_base_css(struct cgroup_subsys_state *css) > { > return css->ss == NULL; > @@ -438,17 +444,31 @@ int cgroup_rstat_init(struct cgroup_subsys_state *css) > > /* the root cgrp's self css has rstat_cpu preallocated */ > if (!css->rstat_cpu) { > - css->rstat_cpu = alloc_percpu(struct cgroup_rstat_cpu); > - if (!css->rstat_cpu) > - return -ENOMEM; > - } > + if (is_base_css(css)) { > + css->rstat_base_cpu = alloc_percpu(struct cgroup_rstat_base_cpu); > + if (!css->rstat_base_cpu) > + return -ENOMEM; > > - /* ->updated_children list is self terminated */ > - for_each_possible_cpu(cpu) { > - struct cgroup_rstat_cpu *rstatc = cgroup_rstat_cpu(css, cpu); > + for_each_possible_cpu(cpu) { > + struct cgroup_rstat_base_cpu *rstatc; We should use different variable names for cgroup_rstat_base_cpu and cgroup_rstat_cpu throughout. Maybe 'brstatc' or 'rstatbc' for the latter? > + > + rstatc = cgroup_rstat_base_cpu(css, cpu); > + rstatc->self.updated_children = css; > + u64_stats_init(&rstatc->bsync); > + } > + } else { > + css->rstat_cpu = alloc_percpu(struct cgroup_rstat_cpu); > + if (!css->rstat_cpu) > + return -ENOMEM; > + > + for_each_possible_cpu(cpu) { > + struct cgroup_rstat_cpu *rstatc; > + > + rstatc = cgroup_rstat_cpu(css, cpu); > + rstatc->updated_children = css; > + } > + } > > - rstatc->updated_children = css; > - u64_stats_init(&rstatc->bsync); I think there's too much replication here. We can probably do something like this (untested): diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c index de057c992f824..1750a69887a2e 100644 --- a/kernel/cgroup/rstat.c +++ b/kernel/cgroup/rstat.c @@ -443,34 +443,30 @@ int cgroup_rstat_init(struct cgroup_subsys_state *css) int cpu; /* the root cgrp's self css has rstat_cpu preallocated */ - if (!css->rstat_cpu) { - if (is_base_css(css)) { - css->rstat_base_cpu = alloc_percpu(struct cgroup_rstat_base_cpu); - if (!css->rstat_base_cpu) - return -ENOMEM; - - for_each_possible_cpu(cpu) { - struct cgroup_rstat_base_cpu *rstatc; + if (css->rstat_cpu) + return 0; - rstatc = cgroup_rstat_base_cpu(css, cpu); - rstatc->self.updated_children = css; - u64_stats_init(&rstatc->bsync); - } - } else { - css->rstat_cpu = alloc_percpu(struct cgroup_rstat_cpu); - if (!css->rstat_cpu) - return -ENOMEM; + if (is_base_css(css)) { + css->rstat_base_cpu = alloc_percpu(struct cgroup_rstat_base_cpu); + if (!css->rstat_base_cpu) + return -ENOMEM; + } else { + css->rstat_cpu = alloc_percpu(struct cgroup_rstat_cpu); + if (!css->rstat_cpu) + return -ENOMEM; + } - for_each_possible_cpu(cpu) { - struct cgroup_rstat_cpu *rstatc; + for_each_possible_cpu(cpu) { + struct cgroup_rstat_base_cpu *brstatc = NULL; + struct cgroup_rstat_cpu *rstatc; - rstatc = cgroup_rstat_cpu(css, cpu); - rstatc->updated_children = css; - } + if (is_base_css(css)) { + brstatc = cgroup_rstat_base_cpu(css, cpu); + u64_stats_init(&brstatc->bsync); + rstatc = brstatc->self; } - + rstatc->updated_children = css; } - return 0; } > } > > return 0; > @@ -522,9 +542,10 @@ static void cgroup_base_stat_sub(struct cgroup_base_stat *dst_bstat, > > static void cgroup_base_stat_flush(struct cgroup *cgrp, int cpu) > { > - struct cgroup_rstat_cpu *rstatc = cgroup_rstat_cpu(&cgrp->self, cpu); > + struct cgroup_rstat_base_cpu *rstatc = cgroup_rstat_base_cpu( > + &cgrp->self, cpu); > struct cgroup *parent = cgroup_parent(cgrp); > - struct cgroup_rstat_cpu *prstatc; > + struct cgroup_rstat_base_cpu *prstatc; Same here, we should use different names than rstatc and prstatc. Same applies for the rest of the diff. > struct cgroup_base_stat delta; > unsigned seq; > > @@ -552,25 +573,25 @@ static void cgroup_base_stat_flush(struct cgroup *cgrp, int cpu) > cgroup_base_stat_add(&cgrp->last_bstat, &delta); > > delta = rstatc->subtree_bstat; > - prstatc = cgroup_rstat_cpu(&parent->self, cpu); > + prstatc = cgroup_rstat_base_cpu(&parent->self, cpu); > cgroup_base_stat_sub(&delta, &rstatc->last_subtree_bstat); > cgroup_base_stat_add(&prstatc->subtree_bstat, &delta); > cgroup_base_stat_add(&rstatc->last_subtree_bstat, &delta); > } > } > > -static struct cgroup_rstat_cpu * > +static struct cgroup_rstat_base_cpu * > cgroup_base_stat_cputime_account_begin(struct cgroup *cgrp, unsigned long *flags) > { > - struct cgroup_rstat_cpu *rstatc; > + struct cgroup_rstat_base_cpu *rstatc; > > - rstatc = get_cpu_ptr(cgrp->self.rstat_cpu); > + rstatc = get_cpu_ptr(cgrp->self.rstat_base_cpu); > *flags = u64_stats_update_begin_irqsave(&rstatc->bsync); > return rstatc; > } > > static void cgroup_base_stat_cputime_account_end(struct cgroup *cgrp, > - struct cgroup_rstat_cpu *rstatc, > + struct cgroup_rstat_base_cpu *rstatc, > unsigned long flags) > { > u64_stats_update_end_irqrestore(&rstatc->bsync, flags); > @@ -580,7 +601,7 @@ static void cgroup_base_stat_cputime_account_end(struct cgroup *cgrp, > > void __cgroup_account_cputime(struct cgroup *cgrp, u64 delta_exec) > { > - struct cgroup_rstat_cpu *rstatc; > + struct cgroup_rstat_base_cpu *rstatc; > unsigned long flags; > > rstatc = cgroup_base_stat_cputime_account_begin(cgrp, &flags); > @@ -591,7 +612,7 @@ void __cgroup_account_cputime(struct cgroup *cgrp, u64 delta_exec) > void __cgroup_account_cputime_field(struct cgroup *cgrp, > enum cpu_usage_stat index, u64 delta_exec) > { > - struct cgroup_rstat_cpu *rstatc; > + struct cgroup_rstat_base_cpu *rstatc; > unsigned long flags; > > rstatc = cgroup_base_stat_cputime_account_begin(cgrp, &flags); > -- > 2.43.5 > >