From: Shakeel Butt
To: Yosry Ahmed
Cc: Tejun Heo, Andrew Morton, Alexei Starovoitov, Johannes Weiner,
    Michal Hocko, Roman Gushchin, Muchun Song, Michal Koutný,
    Vlastimil Babka, Sebastian Andrzej Siewior, JP Kobryn,
    bpf@vger.kernel.org, linux-mm@kvack.org, cgroups@vger.kernel.org,
    linux-kernel@vger.kernel.org, Meta kernel team
Subject: Re: [RFC PATCH 3/3] cgroup: make css_rstat_updated nmi safe
Date: Tue, 6 May 2025 12:30:18 -0700
References: <20250429061211.1295443-1-shakeel.butt@linux.dev>
    <20250429061211.1295443-4-shakeel.butt@linux.dev>
    <6u7ccequ5ye3e4iqblcdeqsigindo3xjpsvkdb6hyaw7cpjddc@u2ujv7ymlxc6>

On Tue, May 06, 2025 at 09:41:04AM +0000, Yosry Ahmed wrote:
> On Thu, May 01, 2025 at 03:10:20PM -0700, Shakeel Butt wrote:
> > On Wed, Apr 30, 2025 at 06:14:28AM -0700, Yosry Ahmed wrote:
> > [...]
> > > > +
> > > > +	if (!_css_rstat_cpu_trylock(css, cpu, &flags)) {
> > >
> > >
> > > IIUC this trylock will only fail if a BPF program runs in NMI context
> > > and tries to update cgroup stats, interrupting a context that is already
> > > holding the lock (i.e. updating or flushing stats).
> > >
> >
> > Correct (though note that flushing side can be on a different CPU).
> >
> > > How often does this happen in practice tho? Is it worth the complexity?
> >
> > This is about correctness, so even a chance of occurrence needs a
> > solution.
>
> Right, my question was more about the need to special case NMIs, see
> below.
>
> >
> > >
> > > I wonder if it's better if we make css_rstat_updated() inherently
> > > lockless instead.
> > >
> > > What if css_rstat_updated() always just adds to a lockless tree,
> >
> > Here I assume you meant lockless list instead of tree.
>
> Yeah, in a sense. I meant using lockless lists to implement the rstat
> tree instead of normal linked lists.
>
> >
> > > and we
> > > defer constructing the proper tree to the flushing side? This should
> > > make updates generally faster and avoids locking or disabling interrupts
> > > in the fast path. We essentially push more work to the flushing side.
> > >
> > > We may be able to consolidate some of the code too if all the logic
> > > manipulating the tree is on the flushing side.
> > >
> > > WDYT? Am I missing something here?
> > >
> >
> > Yes this can be done but I don't think we need to tie that to current
> > series. I think we can start with lockless in the nmi context and then
> > iteratively make css_rstat_updated() lockless for all contexts.
>
> My question is basically whether it would be simpler to actually make it
> all lockless than special casing NMIs. With this patch we have two
> different paths and a deferred list that we process at a later point. I
> think it may be simpler if we just make it all lockless to begin with.
> Then we would have a single path and no special deferred processing.
>
> WDYT?

So, on the update side, always add to the lockless list (if not already
on it), and on the flush side, build the update tree from the lockless
list and flush it. Hopefully this tree building and flushing can be done
in a more optimized way. Is this what you are suggesting?
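
To make sure we mean the same thing, here is a rough userspace C11 sketch
of that scheme. It is not code from the series: struct node,
struct lockless_list, record_update(), flush_all() and run_demo() are
all hypothetical names, and plain C11 atomics stand in for whatever the
kernel side would use (the in-tree llist helpers would be the obvious
candidate). It only shows "add once on update, detach everything on
flush"; building the actual update tree is left as a comment.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <assert.h>

/* Hypothetical stand-in for a per-CPU rstat node, not the kernel struct. */
struct node {
	struct node *next;   /* link in the lockless list */
	atomic_bool on_list; /* true while queued, so we add at most once */
	atomic_int pending;  /* placeholder for stats awaiting a flush */
};

struct lockless_list {
	_Atomic(struct node *) first;
};

/*
 * Update side: record the stat, then queue the node unless it is already
 * queued. No lock is taken and nothing is disabled, so in principle this
 * is the same path for task, IRQ and NMI context.
 * Returns true if this call queued the node.
 */
static bool record_update(struct lockless_list *list, struct node *n, int delta)
{
	atomic_fetch_add(&n->pending, delta);

	/* Only the caller that flips on_list from false to true may push. */
	if (atomic_exchange(&n->on_list, true))
		return false;

	struct node *head = atomic_load(&list->first);
	do {
		n->next = head;
	} while (!atomic_compare_exchange_weak(&list->first, &head, n));
	return true;
}

/*
 * Flush side: detach the whole list with one atomic exchange, then walk
 * the private copy. Returns the sum of the flushed stats.
 */
static int flush_all(struct lockless_list *list)
{
	struct node *n = atomic_exchange(&list->first, NULL);
	int total = 0;

	while (n) {
		/* Read next first: once on_list is cleared, n may be requeued. */
		struct node *next = n->next;

		atomic_store(&n->on_list, false);
		/* A real flusher would build the update tree here. */
		total += atomic_exchange(&n->pending, 0);
		n = next;
	}
	return total;
}

/* Small usage example: node a is updated twice but queued only once. */
static int run_demo(void)
{
	struct lockless_list list = { .first = NULL };
	struct node a = { 0 }, b = { 0 };

	record_update(&list, &a, 1);
	record_update(&list, &a, 2); /* already queued, only adds to pending */
	record_update(&list, &b, 4);
	return flush_all(&list);     /* 1 + 2 + 4 = 7 */
}
```

Clearing on_list before reading pending is deliberate in this sketch: an
update racing with the flush either gets picked up by this flush or
requeues the node for the next one, so nothing is lost. The cost is that
the flusher may occasionally see a requeued node with nothing pending.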