Date: Mon, 29 Apr 2024 08:50:11 -0700
From: Roman Gushchin <roman.gushchin@linux.dev>
To: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Muchun Song,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 2/7] memcg: dynamically allocate lruvec_stats
References: <20240427003733.3898961-1-shakeel.butt@linux.dev>
 <20240427003733.3898961-3-shakeel.butt@linux.dev>
In-Reply-To: <20240427003733.3898961-3-shakeel.butt@linux.dev>

On Fri, Apr 26, 2024 at 05:37:28PM -0700, Shakeel Butt wrote:
> To decouple the dependency of lruvec_stats on NR_VM_NODE_STAT_ITEMS, we
> need to dynamically allocate lruvec_stats in the mem_cgroup_per_node
> structure. Also move the definitions of lruvec_stats_percpu and
> lruvec_stats and related functions to memcontrol.c to facilitate later
> patches. No functional changes in the patch.
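The decoupling makes sense: once mem_cgroup_per_node holds a pointer, a
later patch can size the allocation from a runtime item count instead of
the compile-time NR_VM_NODE_STAT_ITEMS. Roughly like this (a hypothetical
sketch of where the series can go, not part of this patch; all names are
invented):

	/* illustrative only: runtime-sized lruvec stats */
	struct lruvec_stats_rt {
		int nr_items;	/* runtime-selected number of tracked items */
		long state[];	/* sized at allocation time */
	};

	static struct lruvec_stats_rt *lruvec_stats_alloc(int nr_items, int node)
	{
		struct lruvec_stats_rt *stats;

		stats = kzalloc_node(sizeof(*stats) + nr_items * sizeof(long),
				     GFP_KERNEL, node);
		if (stats)
			stats->nr_items = nr_items;
		return stats;
	}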
> 
> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> ---
>  include/linux/memcontrol.h | 62 +++------------------------
>  mm/memcontrol.c            | 87 ++++++++++++++++++++++++++++++++------
>  2 files changed, 81 insertions(+), 68 deletions(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 9aba0d0462ca..ab8a6e884375 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -83,6 +83,8 @@ enum mem_cgroup_events_target {
> 
>  struct memcg_vmstats_percpu;
>  struct memcg_vmstats;
> +struct lruvec_stats_percpu;
> +struct lruvec_stats;
> 
>  struct mem_cgroup_reclaim_iter {
>  	struct mem_cgroup *position;
> @@ -90,25 +92,6 @@ struct mem_cgroup_reclaim_iter {
>  	unsigned int generation;
>  };
> 
> -struct lruvec_stats_percpu {
> -	/* Local (CPU and cgroup) state */
> -	long state[NR_VM_NODE_STAT_ITEMS];
> -
> -	/* Delta calculation for lockless upward propagation */
> -	long state_prev[NR_VM_NODE_STAT_ITEMS];
> -};
> -
> -struct lruvec_stats {
> -	/* Aggregated (CPU and subtree) state */
> -	long state[NR_VM_NODE_STAT_ITEMS];
> -
> -	/* Non-hierarchical (CPU aggregated) state */
> -	long state_local[NR_VM_NODE_STAT_ITEMS];
> -
> -	/* Pending child counts during tree propagation */
> -	long state_pending[NR_VM_NODE_STAT_ITEMS];
> -};
> -
>  /*
>   * per-node information in memory controller.
>   */
> @@ -116,7 +99,7 @@ struct mem_cgroup_per_node {
>  	struct lruvec		lruvec;
> 
>  	struct lruvec_stats_percpu __percpu	*lruvec_stats_percpu;
> -	struct lruvec_stats			lruvec_stats;
> +	struct lruvec_stats			*lruvec_stats;
> 
>  	unsigned long		lru_zone_size[MAX_NR_ZONES][NR_LRU_LISTS];
> 
> @@ -1037,42 +1020,9 @@ static inline void mod_memcg_page_state(struct page *page,
>  }
> 
>  unsigned long memcg_page_state(struct mem_cgroup *memcg, int idx);
> -
> -static inline unsigned long lruvec_page_state(struct lruvec *lruvec,
> -					      enum node_stat_item idx)
> -{
> -	struct mem_cgroup_per_node *pn;
> -	long x;
> -
> -	if (mem_cgroup_disabled())
> -		return node_page_state(lruvec_pgdat(lruvec), idx);
> -
> -	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
> -	x = READ_ONCE(pn->lruvec_stats.state[idx]);
> -#ifdef CONFIG_SMP
> -	if (x < 0)
> -		x = 0;
> -#endif
> -	return x;
> -}
> -
> -static inline unsigned long lruvec_page_state_local(struct lruvec *lruvec,
> -						    enum node_stat_item idx)
> -{
> -	struct mem_cgroup_per_node *pn;
> -	long x = 0;
> -
> -	if (mem_cgroup_disabled())
> -		return node_page_state(lruvec_pgdat(lruvec), idx);
> -
> -	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
> -	x = READ_ONCE(pn->lruvec_stats.state_local[idx]);
> -#ifdef CONFIG_SMP
> -	if (x < 0)
> -		x = 0;
> -#endif
> -	return x;
> -}
> +unsigned long lruvec_page_state(struct lruvec *lruvec, enum node_stat_item idx);
> +unsigned long lruvec_page_state_local(struct lruvec *lruvec,
> +				      enum node_stat_item idx);
> 
>  void mem_cgroup_flush_stats(struct mem_cgroup *memcg);
>  void mem_cgroup_flush_stats_ratelimited(struct mem_cgroup *memcg);
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 53769d06053f..5e337ed6c6bf 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -576,6 +576,60 @@ mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
>  	return mz;
>  }
> 
> +struct lruvec_stats_percpu {
> +	/* Local (CPU and cgroup) state */
> +	long state[NR_VM_NODE_STAT_ITEMS];
> +
> +	/* Delta calculation for lockless upward propagation */
> +	long state_prev[NR_VM_NODE_STAT_ITEMS];
> +};
> +
> +struct lruvec_stats {
> +	/* Aggregated (CPU and subtree) state */
> +	long state[NR_VM_NODE_STAT_ITEMS];
> +
> +	/* Non-hierarchical (CPU aggregated) state */
> +	long state_local[NR_VM_NODE_STAT_ITEMS];
> +
> +	/* Pending child counts during tree propagation */
> +	long state_pending[NR_VM_NODE_STAT_ITEMS];
> +};
> +
> +unsigned long lruvec_page_state(struct lruvec *lruvec, enum node_stat_item idx)
> +{
> +	struct mem_cgroup_per_node *pn;
> +	long x;
> +
> +	if (mem_cgroup_disabled())
> +		return node_page_state(lruvec_pgdat(lruvec), idx);
> +
> +	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
> +	x = READ_ONCE(pn->lruvec_stats->state[idx]);
> +#ifdef CONFIG_SMP
> +	if (x < 0)
> +		x = 0;
> +#endif
> +	return x;
> +}
> +
> +unsigned long lruvec_page_state_local(struct lruvec *lruvec,
> +				      enum node_stat_item idx)
> +{
> +	struct mem_cgroup_per_node *pn;
> +	long x = 0;
> +
> +	if (mem_cgroup_disabled())
> +		return node_page_state(lruvec_pgdat(lruvec), idx);
> +
> +	pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
> +	x = READ_ONCE(pn->lruvec_stats->state_local[idx]);
> +#ifdef CONFIG_SMP
> +	if (x < 0)
> +		x = 0;
> +#endif

Not directly related to your change, but do we still need it? And if yes,
do we really care about the !CONFIG_SMP case enough to justify these
#ifdefs?
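If I remember correctly, the clamp exists because per-cpu deltas are
folded into the aggregate without synchronization, so a concurrent reader
can transiently observe a negative sum on SMP; on UP that shouldn't
happen, hence the #ifdef. If we keep the clamp at all, doing it
unconditionally would read simpler, e.g. (just a sketch, at the cost of a
cheap branch on UP):

	unsigned long lruvec_page_state(struct lruvec *lruvec, enum node_stat_item idx)
	{
		struct mem_cgroup_per_node *pn;
		long x;

		if (mem_cgroup_disabled())
			return node_page_state(lruvec_pgdat(lruvec), idx);

		pn = container_of(lruvec, struct mem_cgroup_per_node, lruvec);
		x = READ_ONCE(pn->lruvec_stats->state[idx]);

		/* a negative value is only a transient artifact of the
		 * unsynchronized per-cpu folding, never a real count */
		return x < 0 ? 0 : x;
	}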
> +	return x;
> +}
> +
>  /* Subset of vm_event_item to report for memcg event stats */
>  static const unsigned int memcg_vm_event_stat[] = {
>  	PGPGIN,
> @@ -5492,18 +5546,25 @@ static int alloc_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
>  	if (!pn)
>  		return 1;
> 
> +	pn->lruvec_stats = kzalloc_node(sizeof(struct lruvec_stats), GFP_KERNEL,
> +					node);

Why not GFP_KERNEL_ACCOUNT?
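The per-cpu counterpart right below is already charged to the cgroup via
GFP_KERNEL_ACCOUNT, so presumably this allocation should match, i.e.
something like:

	pn->lruvec_stats = kzalloc_node(sizeof(struct lruvec_stats),
					GFP_KERNEL_ACCOUNT, node);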
> +	if (!pn->lruvec_stats)
> +		goto fail;
> +
>  	pn->lruvec_stats_percpu = alloc_percpu_gfp(struct lruvec_stats_percpu,
>  						   GFP_KERNEL_ACCOUNT);
> -	if (!pn->lruvec_stats_percpu) {
> -		kfree(pn);
> -		return 1;
> -	}
> +	if (!pn->lruvec_stats_percpu)
> +		goto fail;
> 
>  	lruvec_init(&pn->lruvec);
>  	pn->memcg = memcg;
> 
>  	memcg->nodeinfo[node] = pn;
>  	return 0;
> +fail:
> +	kfree(pn->lruvec_stats);
> +	kfree(pn);
> +	return 1;
>  }
> 
>  static void free_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
> @@ -5514,6 +5575,7 @@ static void free_mem_cgroup_per_node_info(struct mem_cgroup *memcg, int node)
>  		return;
> 
>  	free_percpu(pn->lruvec_stats_percpu);
> +	kfree(pn->lruvec_stats);
>  	kfree(pn);
>  }
> 
> @@ -5866,18 +5928,19 @@ static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
> 
>  	for_each_node_state(nid, N_MEMORY) {
>  		struct mem_cgroup_per_node *pn = memcg->nodeinfo[nid];
> -		struct mem_cgroup_per_node *ppn = NULL;
> +		struct lruvec_stats *lstats = pn->lruvec_stats;
> +		struct lruvec_stats *plstats = NULL;
>  		struct lruvec_stats_percpu *lstatc;
> 
>  		if (parent)
> -			ppn = parent->nodeinfo[nid];
> +			plstats = parent->nodeinfo[nid]->lruvec_stats;
> 
>  		lstatc = per_cpu_ptr(pn->lruvec_stats_percpu, cpu);
> 
>  		for (i = 0; i < NR_VM_NODE_STAT_ITEMS; i++) {
> -			delta = pn->lruvec_stats.state_pending[i];
> +			delta = lstats->state_pending[i];
>  			if (delta)
> -				pn->lruvec_stats.state_pending[i] = 0;
> +				lstats->state_pending[i] = 0;
> 
>  			delta_cpu = 0;
>  			v = READ_ONCE(lstatc->state[i]);
> @@ -5888,12 +5951,12 @@ static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
>  			}
> 
>  			if (delta_cpu)
> -				pn->lruvec_stats.state_local[i] += delta_cpu;
> +				lstats->state_local[i] += delta_cpu;
> 
>  			if (delta) {
> -				pn->lruvec_stats.state[i] += delta;
> -				if (ppn)
> -					ppn->lruvec_stats.state_pending[i] += delta;
> +				lstats->state[i] += delta;
> +				if (plstats)
> +					plstats->state_pending[i] += delta;
>  			}
>  		}
>  	}
> -- 
> 2.43.0
> 
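One more note: the new error path looks correct to me as is. If the
lruvec_stats allocation itself is what failed, pn->lruvec_stats is NULL
at the fail label and kfree(NULL) is a no-op, so both failure points can
safely share the same cleanup:

	fail:
		kfree(pn->lruvec_stats);	/* NULL, and harmless, on the first failure */
		kfree(pn);
		return 1;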