Date: Mon, 12 May 2025 12:12:48 -0700
From: Shakeel Butt
To: Vlastimil Babka
Cc: Andrew Morton, Tejun Heo, Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song, Alexei Starovoitov, Sebastian Andrzej Siewior, bpf@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Meta kernel team
Subject: Re: [PATCH 0/4] memcg: nmi-safe kmem charging
References: <20250509232859.657525-1-shakeel.butt@linux.dev> <2e2f0568-3687-4574-836d-c23d09614bce@suse.cz>
In-Reply-To: <2e2f0568-3687-4574-836d-c23d09614bce@suse.cz>

I forgot to CC Tejun, so doing it now.

On Mon, May 12, 2025 at 05:56:09PM +0200, Vlastimil Babka wrote:
> On 5/10/25 01:28, Shakeel Butt wrote:
> > BPF programs can trigger memcg charged kernel allocations in nmi
> > context. However memcg charging infra for kernel memory is not equipped
> > to handle nmi context. This series adds support for kernel memory
> > charging for nmi context.
> >
> > The initial prototype tried to make memcg charging infra for kernel
> > memory re-entrant against irq and nmi. However upon realizing that
> > this_cpu_* operations are not safe on all architectures (Tejun), this
>
> I assume it was an off-list discussion?
> Could we avoid this for the architectures where these are safe, which should
> be the major ones I hope?

Yes, it was an off-list discussion. The discussion was more about
this_cpu_* ops vs atomic_* ops, as on x86 this_cpu_* ops do not have the
lock prefix, and how I should prefer this_cpu_* over atomic_* for my
series on objcg charging without disabling irqs. Tejun pointed out that
this_cpu_* ops are not nmi safe on some archs and that it would be better
to handle nmi context separately. So, I am not that worried about
optimizing for nmi context, but your next comment on the
generic_atomic64_* ops is giving me a headache.

>
> > series took a different approach targeting only nmi context. Since the
> > number of stats that are updated in kernel memory charging path are 3,
> > this series added special handling of those stats in nmi context rather
> > than making all >100 memcg stats nmi safe.
>
> Hmm so from patches 2 and 3 I see this relies on atomic64_add().
> But AFAIU lib/atomic64.c has the generic fallback implementation for
> architectures that don't know better, and that would be using the "void
> generic_atomic64_##op" macro, which AFAICS is doing:
>
>         local_irq_save(flags);                          \
>         arch_spin_lock(lock);                           \
>         v->counter c_op a;                              \
>         arch_spin_unlock(lock);                         \
>         local_irq_restore(flags);                       \
>
> so in case of a nmi hitting after the spin_lock this can still deadlock?
>
> Hm or is there some assumption that we only use these paths when already
> in_nmi() and then another nmi can't come in that context?
>
> But even then, flush_nmi_stats() in patch 1 isn't done in_nmi() and uses
> atomic64_xchg() which in generic_atomic64_xchg() implementation also has the
> irq_save+spin_lock. So can't we deadlock there?

I was actually assuming that atomic_* ops are safe against nmis on all
archs. I looked at the atomic_* ops in include/asm-generic/atomic.h; they
use arch_cmpxchg() for CONFIG_SMP, so it seems that archs with cmpxchg
should be fine against nmi. I am not sure why the atomic64_* ops are not
using arch_cmpxchg() instead. I will dig more.

I also have the followup series on objcg charging without irq disabling
almost ready. I will send it out as an RFC soon.

Thanks a lot for the awesome and insightful comments.

Shakeel
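
Illustration of the deadlock concern quoted above: a minimal userspace
sketch in C11, not kernel code. It assumes a lock-protected 64-bit
counter in the style of the generic fallback described for
lib/atomic64.c. The names fake_atomic64, fake_atomic64_try_add,
simulated_nmi and nmi_would_deadlock are hypothetical, and a try-lock
stands in for arch_spin_lock() so that the would-be deadlock is reported
instead of hanging.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a lock-protected 64-bit fallback counter. */
struct fake_atomic64 {
	atomic_flag lock;	/* plays the role of the arch spinlock */
	int64_t counter;
};

/* Hypothetical hook invoked where an NMI could interrupt the update. */
typedef void (*nmi_hook_t)(struct fake_atomic64 *v);

/*
 * Add 'a' to the counter under the lock. Returns false if the lock is
 * already held, which is the point where a real spinlock would spin
 * forever if the holder is the context that was interrupted.
 */
static bool fake_atomic64_try_add(struct fake_atomic64 *v, int64_t a,
				  nmi_hook_t nmi)
{
	assert(v != NULL);

	if (atomic_flag_test_and_set_explicit(&v->lock, memory_order_acquire))
		return false;

	if (nmi)
		nmi(v);		/* simulated NMI inside the critical section */

	v->counter += a;
	atomic_flag_clear_explicit(&v->lock, memory_order_release);
	return true;
}

/* Set when the simulated NMI found the lock already held. */
static bool nmi_would_deadlock;

/* The simulated NMI path attempts the same lock-protected update. */
static void simulated_nmi(struct fake_atomic64 *v)
{
	nmi_would_deadlock = !fake_atomic64_try_add(v, 1, NULL);
}
```

Calling fake_atomic64_try_add(&v, 5, simulated_nmi) on a zero-initialized
counter leaves v.counter at 5 and sets nmi_would_deadlock, because the
nested update finds the lock held. Disabling irqs does not mask an NMI,
so with a real spinlock the nested call would spin forever: the
interrupted lock holder cannot run again until the NMI handler returns.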