Date: Mon, 29 Apr 2024 09:00:16 -0700
From: Roman Gushchin <roman.gushchin@linux.dev>
To: Shakeel Butt
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Muchun Song,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 3/7] memcg: reduce memory for the lruvec and memcg stats
References: <20240427003733.3898961-1-shakeel.butt@linux.dev>
 <20240427003733.3898961-4-shakeel.butt@linux.dev>
In-Reply-To: <20240427003733.3898961-4-shakeel.butt@linux.dev>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Fri, Apr 26, 2024 at 05:37:29PM -0700, Shakeel Butt wrote:
> At the moment, the amount of memory allocated for stats-related structs
> in the mem_cgroup corresponds to the size of enum node_stat_item.
> However, not all fields in enum node_stat_item have corresponding memcg
> stats. So, let's use an indirection mechanism similar to the one used for
> memcg vmstats management.
>
> For a given x86_64 config, the size of the stats structs with and without
> the patch is:
>
> structs                       size in bytes
>                               w/o     with
> struct lruvec_stats           1128     648
> struct lruvec_stats_percpu     752     432
> struct memcg_vmstats          1832    1352
> struct memcg_vmstats_percpu   1280     960
>
> The memory savings are further compounded by the fact that these structs
> are allocated for each cpu and for each node. To be precise, for each
> memcg the memory saved would be:
>
> Memory saved = ((21 * 3 * NR_NODES) + (21 * 2 * NR_NODES * NR_CPUS) +
>                 (21 * 3) + (21 * 2 * NR_CPUS)) * sizeof(long)
>
> where 21 is the number of fields eliminated.

Nice savings!
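To put a rough number on it (illustrative figures only, not from the patch:
assume NR_NODES = 2, NR_CPUS = 64 and sizeof(long) = 8):

    (21 * 3 * 2) + (21 * 2 * 2 * 64) + (21 * 3) + (21 * 2 * 64)
        = 126 + 5376 + 63 + 2688
        = 8253 longs
        = 66024 bytes, i.e. roughly 64 KiB saved per memcg

which adds up quickly on machines with many cgroups.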
>
> Signed-off-by: Shakeel Butt
> ---
>  mm/memcontrol.c | 138 ++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 115 insertions(+), 23 deletions(-)
>
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 5e337ed6c6bf..c164bc9b8ed6 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -576,35 +576,105 @@ mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
>  	return mz;
>  }
>
> +/* Subset of node_stat_item for memcg stats */
> +static const unsigned int memcg_node_stat_items[] = {
> +	NR_INACTIVE_ANON,
> +	NR_ACTIVE_ANON,
> +	NR_INACTIVE_FILE,
> +	NR_ACTIVE_FILE,
> +	NR_UNEVICTABLE,
> +	NR_SLAB_RECLAIMABLE_B,
> +	NR_SLAB_UNRECLAIMABLE_B,
> +	WORKINGSET_REFAULT_ANON,
> +	WORKINGSET_REFAULT_FILE,
> +	WORKINGSET_ACTIVATE_ANON,
> +	WORKINGSET_ACTIVATE_FILE,
> +	WORKINGSET_RESTORE_ANON,
> +	WORKINGSET_RESTORE_FILE,
> +	WORKINGSET_NODERECLAIM,
> +	NR_ANON_MAPPED,
> +	NR_FILE_MAPPED,
> +	NR_FILE_PAGES,
> +	NR_FILE_DIRTY,
> +	NR_WRITEBACK,
> +	NR_SHMEM,
> +	NR_SHMEM_THPS,
> +	NR_FILE_THPS,
> +	NR_ANON_THPS,
> +	NR_KERNEL_STACK_KB,
> +	NR_PAGETABLE,
> +	NR_SECONDARY_PAGETABLE,
> +#ifdef CONFIG_SWAP
> +	NR_SWAPCACHE,
> +#endif
> +};
> +
> +static const unsigned int memcg_stat_items[] = {
> +	MEMCG_SWAP,
> +	MEMCG_SOCK,
> +	MEMCG_PERCPU_B,
> +	MEMCG_VMALLOC,
> +	MEMCG_KMEM,
> +	MEMCG_ZSWAP_B,
> +	MEMCG_ZSWAPPED,
> +};
> +
> +#define NR_MEMCG_NODE_STAT_ITEMS ARRAY_SIZE(memcg_node_stat_items)
> +#define NR_MEMCG_STATS (NR_MEMCG_NODE_STAT_ITEMS + ARRAY_SIZE(memcg_stat_items))
> +static int8_t mem_cgroup_stats_index[MEMCG_NR_STAT] __read_mostly;
> +
> +static void init_memcg_stats(void)
> +{
> +	int8_t i, j = 0;
> +
> +	/* Switch to short once this failure occurs. */
> +	BUILD_BUG_ON(NR_MEMCG_STATS >= 127 /* INT8_MAX */);
> +
> +	for (i = 0; i < NR_MEMCG_NODE_STAT_ITEMS; ++i)
> +		mem_cgroup_stats_index[memcg_node_stat_items[i]] = ++j;
> +
> +	for (i = 0; i < ARRAY_SIZE(memcg_stat_items); ++i)
> +		mem_cgroup_stats_index[memcg_stat_items[i]] = ++j;
> +}
> +
> +static inline int memcg_stats_index(int idx)
> +{
> +	return mem_cgroup_stats_index[idx] - 1;
> +}

Hm, I'm slightly worried about the performance penalty due to the
increased cache footprint. Can't we have some formula to translate idx
to memcg_idx instead of a translation table?

If it requires a re-arrangement of items, we can add a translation table
on the read side to preserve the visible order in procfs/sysfs.

Or am I overthinking this and the real difference is negligible?

Thanks!
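As a side note, a minimal user-space sketch of the table-based indirection
under discussion may make the trade-off easier to see. Everything below
(item names, helpers) is invented for illustration and is not the kernel
code: a sparse item space is mapped onto a dense array through a small,
read-mostly translation table, which is the extra memory the lookup path
has to touch.

/*
 * Illustrative sketch only: sparse stat items mapped onto a dense array
 * through a translation table, mirroring the shape of the patch above.
 */
#include <stdint.h>
#include <stdio.h>

enum item { ITEM_A, ITEM_B, ITEM_C, ITEM_D, ITEM_E, NR_ITEMS };

/* Only a subset of the items is actually tracked. */
static const unsigned int tracked_items[] = { ITEM_B, ITEM_D, ITEM_E };
#define NR_TRACKED (sizeof(tracked_items) / sizeof(tracked_items[0]))

/* Translation table: item -> dense index + 1; 0 means "not tracked". */
static int8_t item_index[NR_ITEMS];

static void init_index(void)
{
	unsigned int i;

	for (i = 0; i < NR_TRACKED; i++)
		item_index[tracked_items[i]] = (int8_t)(i + 1);
}

static int dense_index(int item)
{
	return item_index[item] - 1;	/* -1 when the item is not tracked */
}

int main(void)
{
	long counts[NR_TRACKED] = { 0 };	/* dense storage: one slot per tracked item */
	int idx;

	init_index();

	idx = dense_index(ITEM_D);
	if (idx >= 0)
		counts[idx] += 42;

	printf("ITEM_A -> %d, ITEM_D -> %d, counts[%d] = %ld\n",
	       dense_index(ITEM_A), dense_index(ITEM_D), idx, counts[idx]);
	return 0;
}

The per-update cost being questioned above is the load from item_index;
a closed-form idx -> dense-index formula would avoid that load, but only
if the tracked items could be arranged so that the mapping is computable.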