Date: Wed, 14 May 2025 09:46:12 -0700
From: Shakeel Butt
To: Alexei Starovoitov
Cc: Andrew Morton, Johannes Weiner, Michal Hocko, Roman Gushchin,
    Muchun Song, Vlastimil Babka, Alexei Starovoitov,
    Sebastian Andrzej Siewior, bpf, linux-mm,
    "open list:CONTROL GROUP (CGROUP)", LKML, Meta kernel team
Subject: Re: [PATCH 4/4] memcg: make objcg charging nmi safe
References: <20250509232859.657525-1-shakeel.butt@linux.dev>
    <20250509232859.657525-5-shakeel.butt@linux.dev>

On Tue, May 13, 2025 at 03:25:31PM -0700, Alexei Starovoitov wrote:
> On Fri, May 9, 2025 at 4:29 PM Shakeel Butt wrote:
> >
> > To enable memcg charged kernel memory allocations from nmi context,
> > consume_obj_stock() and refill_obj_stock() need to be nmi safe. With
> > the simple in_nmi() check, take the slow path of the objcg charging
> > which handles the charging and memcg stats updates correctly for the nmi
> > context.
> >
> > Signed-off-by: Shakeel Butt
> > ---
> >  mm/memcontrol.c | 14 +++++++++++++-
> >  1 file changed, 13 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> > index bba549c1f18c..6cfa3550f300 100644
> > --- a/mm/memcontrol.c
> > +++ b/mm/memcontrol.c
> > @@ -2965,6 +2965,9 @@ static bool consume_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
> >  	unsigned long flags;
> >  	bool ret = false;
> >
> > +	if (unlikely(in_nmi()))
> > +		return ret;
> > +
> >  	local_lock_irqsave(&obj_stock.lock, flags);
> >
> >  	stock = this_cpu_ptr(&obj_stock);
> > @@ -3068,6 +3071,15 @@ static void refill_obj_stock(struct obj_cgroup *objcg, unsigned int nr_bytes,
> >  	unsigned long flags;
> >  	unsigned int nr_pages = 0;
> >
> > +	if (unlikely(in_nmi())) {
> > +		if (pgdat)
> > +			__mod_objcg_mlstate(objcg, pgdat, idx, nr_bytes);
> > +		nr_pages = nr_bytes >> PAGE_SHIFT;
> > +		nr_bytes = nr_bytes & (PAGE_SIZE - 1);
> > +		atomic_add(nr_bytes, &objcg->nr_charged_bytes);
> > +		goto out;
> > +	}
> >
>
> Now I see what I did incorrectly in my series and how this patch 4
> combined with patch 3 is doing accounting properly.
>
> The only issue here and in other patches is that in_nmi() is
> an incomplete condition to check for.
> The reentrance is possible through kprobe or tracepoint.
> In PREEMPT_RT we will be fully preemptible, but
> obj_stock.lock will be already taken by the current task.
> To fix it you need to use local_lock_is_locked(&obj_stock.lock)
> instead of in_nmi() or use local_trylock_irqsave(&obj_stock.lock).
>
> local_trylock_irqsave() is cleaner and works today,
> while local_lock_is_locked() hasn't landed yet, but if we go
> is_locked route we can decouple reentrant obj_stock operation vs normal.
> Like the if (!local_lock_is_locked(&obj_stock.lock))
> can be done much higher up the stack from
> __memcg_slab_post_alloc_hook() the way I did in my series,
> and if locked it can do atomic_add()-style charging.
> So refill_obj_stock() and friends won't need to change.

Thanks, Alexei, for taking a look. For now I am going with the trylock
path; later I will check whether your suggested is_locked() approach
makes things better.
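
For illustration only, the trylock fallback discussed in this thread can be
modeled in userspace C. This is a rough sketch under stated assumptions, not
the kernel code and not the follow-up patch: struct stock_model,
struct objcg_model, stock_trylock(), stock_unlock(), consume_stock() and
refill_stock() are hypothetical names, a C11 atomic flag stands in for
local_trylock_irqsave(), and PAGE_SHIFT is assumed to be 12. The point it
shows is that a failed trylock covers NMI, kprobe and tracepoint reentrancy
alike, because it tests whether the lock is held rather than which context
the caller is in.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define PAGE_SHIFT 12                     /* assumed page size: 4 KiB */
#define PAGE_SIZE  (1u << PAGE_SHIFT)

/* Hypothetical stand-in for one CPU's obj_stock. */
struct stock_model {
	atomic_bool locked;               /* models obj_stock.lock */
	unsigned int cached;              /* pre-charged bytes held in the stock */
};

/* Hypothetical stand-in for the objcg->nr_charged_bytes counter. */
struct objcg_model {
	atomic_uint nr_charged_bytes;
};

/* Models a trylock: returns true only if the lock was free. */
static bool stock_trylock(struct stock_model *s)
{
	return !atomic_exchange_explicit(&s->locked, true, memory_order_acquire);
}

static void stock_unlock(struct stock_model *s)
{
	assert(atomic_load(&s->locked));  /* unlocking an unheld lock is a bug */
	atomic_store_explicit(&s->locked, false, memory_order_release);
}

/*
 * Fast path. Returns false when the stock cannot satisfy the request or
 * when the lock is already held, so the caller takes the slow path.
 */
static bool consume_stock(struct stock_model *s, unsigned int nr_bytes)
{
	bool ret = false;

	if (!stock_trylock(s))
		return ret;               /* reentrant caller: never spin here */

	if (s->cached >= nr_bytes) {
		s->cached -= nr_bytes;
		ret = true;
	}
	stock_unlock(s);
	return ret;
}

/*
 * Refill. If the lock is held, bypass the stock: add the sub-page
 * remainder to the atomic counter and return the number of whole pages
 * left for the caller to handle. Otherwise cache the bytes and return 0.
 */
static unsigned int refill_stock(struct stock_model *s,
				 struct objcg_model *objcg,
				 unsigned int nr_bytes)
{
	if (!stock_trylock(s)) {
		unsigned int nr_pages = nr_bytes >> PAGE_SHIFT;

		atomic_fetch_add(&objcg->nr_charged_bytes,
				 nr_bytes & (PAGE_SIZE - 1));
		return nr_pages;
	}

	s->cached += nr_bytes;
	stock_unlock(s);
	return 0;
}
```

In this model, a caller that finds the lock held never touches the cached
bytes, which mirrors the split in the quoted hunk between the per-CPU stock
and the atomic nr_charged_bytes counter. What the kernel does with the whole
pages after the "goto out" is outside the quoted hunk and is not modeled here.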