From: Shakeel Butt
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
	Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior,
	bpf@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, Meta kernel team
Subject: [PATCH 1/4] memcg: add infra for nmi safe memcg stats
Date: Fri, 9 May 2025 16:28:56 -0700
Message-ID: <20250509232859.657525-2-shakeel.butt@linux.dev>
In-Reply-To: <20250509232859.657525-1-shakeel.butt@linux.dev>
References: <20250509232859.657525-1-shakeel.butt@linux.dev>

BPF programs can trigger memcg charging in nmi context, and at the moment
the memcg charging code path for kernel memory does not support nmi
context. To support kernel memory charging in nmi context, we need to make
objcg charging nmi safe and also make the memcg stats nmi safe. At the
moment, the memcg stats which get updated in the objcg charging path are
MEMCG_KMEM, NR_SLAB_RECLAIMABLE_B & NR_SLAB_UNRECLAIMABLE_B.
Rather than making all memcg stats nmi safe, this patch adds infra to make
just these three stats nmi safe.

Signed-off-by: Shakeel Butt
---
 include/linux/memcontrol.h |  6 ++++++
 mm/memcontrol.c            | 43 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 308c01bf98f5..ed9acb68652a 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -113,6 +113,9 @@ struct mem_cgroup_per_node {
 	CACHELINE_PADDING(_pad2_);
 	unsigned long lru_zone_size[MAX_NR_ZONES][NR_LRU_LISTS];
 	struct mem_cgroup_reclaim_iter iter;
+	/* slab stats for nmi context */
+	atomic64_t slab_reclaimable;
+	atomic64_t slab_unreclaimable;
 };
 
 struct mem_cgroup_threshold {
@@ -236,6 +239,9 @@ struct mem_cgroup {
 	atomic_long_t memory_events[MEMCG_NR_MEMORY_EVENTS];
 	atomic_long_t memory_events_local[MEMCG_NR_MEMORY_EVENTS];
 
+	/* MEMCG_KMEM for nmi context */
+	atomic64_t kmem_stat;
+
 	/*
 	 * Hint of reclaim pressure for socket memroy management. Note
 	 * that this indicator should NOT be used in legacy cgroup mode
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9ea6e5591cab..7200f6930daf 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4023,6 +4023,47 @@ static void mem_cgroup_stat_aggregate(struct aggregate_control *ac)
 	}
 }
 
+static void flush_nmi_stats(struct mem_cgroup *memcg, struct mem_cgroup *parent,
+			    int cpu)
+{
+	int nid;
+
+	if (atomic64_read(&memcg->kmem_stat)) {
+		s64 kmem = atomic64_xchg(&memcg->kmem_stat, 0);
+		int index = memcg_stats_index(MEMCG_KMEM);
+
+		memcg->vmstats->state[index] += kmem;
+		if (parent)
+			parent->vmstats->state_pending[index] += kmem;
+	}
+
+	for_each_node_state(nid, N_MEMORY) {
+		struct mem_cgroup_per_node *pn = memcg->nodeinfo[nid];
+		struct lruvec_stats *lstats = pn->lruvec_stats;
+		struct lruvec_stats *plstats = NULL;
+
+		if (parent)
+			plstats = parent->nodeinfo[nid]->lruvec_stats;
+
+		if (atomic64_read(&pn->slab_reclaimable)) {
+			s64 slab = atomic64_xchg(&pn->slab_reclaimable, 0);
+			int index = memcg_stats_index(NR_SLAB_RECLAIMABLE_B);
+
+			lstats->state[index] += slab;
+			if (plstats)
+				plstats->state_pending[index] += slab;
+		}
+		if (atomic64_read(&pn->slab_unreclaimable)) {
+			s64 slab = atomic64_xchg(&pn->slab_unreclaimable, 0);
+			int index = memcg_stats_index(NR_SLAB_UNRECLAIMABLE_B);
+
+			lstats->state[index] += slab;
+			if (plstats)
+				plstats->state_pending[index] += slab;
+		}
+	}
+}
+
 static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
 {
 	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
@@ -4031,6 +4072,8 @@ static void mem_cgroup_css_rstat_flush(struct cgroup_subsys_state *css, int cpu)
 	struct aggregate_control ac;
 	int nid;
 
+	flush_nmi_stats(memcg, parent, cpu);
+
 	statc = per_cpu_ptr(memcg->vmstats_percpu, cpu);
 
 	ac = (struct aggregate_control) {
-- 
2.47.1
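
For readers who want to try the accumulate-then-flush pattern that
flush_nmi_stats() relies on outside the kernel, here is a minimal userspace
sketch. It assumes C11 atomics as a stand-in for atomic64_t. The names
struct stat_node, stat_add_nmi() and stat_flush_nmi() are hypothetical and
exist only for illustration. The writer side in particular is an
assumption: this patch adds only the flush path, so the real nmi-context
update may look different.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for one memcg stat; not kernel code. */
struct stat_node {
	struct stat_node *parent;
	atomic_int_fast64_t nmi_pending; /* updated from "nmi" context */
	int64_t state;                   /* only touched by the flusher */
	int64_t state_pending;           /* deltas propagated from children */
};

/* Assumed writer side: a single atomic add, no locks, no per-cpu data. */
static void stat_add_nmi(struct stat_node *node, int64_t delta)
{
	atomic_fetch_add(&node->nmi_pending, delta);
}

/* Flusher side: the read, xchg, propagate steps seen in flush_nmi_stats(). */
static void stat_flush_nmi(struct stat_node *node)
{
	int64_t delta;

	assert(node != NULL);

	/* Cheap read first so the common idle case skips the exchange. */
	if (!atomic_load(&node->nmi_pending))
		return;

	/* Take the whole delta at once; later adds land in the next flush. */
	delta = atomic_exchange(&node->nmi_pending, 0);

	node->state += delta;
	if (node->parent)
		node->parent->state_pending += delta;
}
```

Because the exchange resets the counter to zero in one step, an add that
races with the flush is either included in the returned delta or stays in
the counter for the next flush, so no update is lost. In this sketch the
parent's state_pending is only accumulated; folding it into the parent's
own state is left out.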