Date: Thu, 3 Jul 2025 15:29:16 -0700
From: "Paul E. McKenney"
To: Shakeel Butt
Cc: Tejun Heo, Andrew Morton, JP Kobryn, Johannes Weiner, Ying Huang,
	Vlastimil Babka, Alexei Starovoitov, Sebastian Andrzej Siewior,
	Michal Koutný, bpf@vger.kernel.org, linux-mm@kvack.org,
	cgroups@vger.kernel.org, linux-kernel@vger.kernel.org,
	Meta kernel team
Subject: Re: [PATCH 2/2] cgroup: explain the race between updater and flusher
Reply-To: paulmck@kernel.org
References: <20250703200012.3734798-1-shakeel.butt@linux.dev>
	<20250703200012.3734798-2-shakeel.butt@linux.dev>
In-Reply-To: <20250703200012.3734798-2-shakeel.butt@linux.dev>

On Thu, Jul 03, 2025 at 01:00:12PM -0700, Shakeel Butt wrote:
> Currently the rstat updater and the flusher can race and cause a
> scenario where the stats updater skips adding the css to the lockless
> list but the flusher might not see those updates done by the skipped
> updater. This is benign race and the subsequent flusher will flush those
> stats and at the moment there aren't any rstat users which are not fine
> with this kind of race. However some future user might want more
> stricter guarantee, so let's add appropriate comments and data_race()
> tags to ease the job of future users.
>
> Signed-off-by: Shakeel Butt
> ---
>  kernel/cgroup/rstat.c | 32 +++++++++++++++++++++++++++++---
>  1 file changed, 29 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
> index c8a48cf83878..b98c03b1af25 100644
> --- a/kernel/cgroup/rstat.c
> +++ b/kernel/cgroup/rstat.c
> @@ -60,6 +60,12 @@ static inline struct llist_head *ss_lhead_cpu(struct cgroup_subsys *ss, int cpu)
>   * Atomically inserts the css in the ss's llist for the given cpu.
> This is
>   * reentrant safe i.e. safe against softirq, hardirq and nmi. The ss's llist
>   * will be processed at the flush time to create the update tree.
> + *
> + * NOTE: if the user needs the guarantee that the updater either add itself in
> + * the lockless list or the concurrent flusher flushes its updated stats, a
> + * memory barrier is needed before the call to css_rstat_updated() i.e. a
> + * barrier after updating the per-cpu stats and before calling
> + * css_rstat_updated().
>   */
>  __bpf_kfunc void css_rstat_updated(struct cgroup_subsys_state *css, int cpu)
>  {
> @@ -86,8 +92,13 @@ __bpf_kfunc void css_rstat_updated(struct cgroup_subsys_state *css, int cpu)
>  		return;
>
>  	rstatc = css_rstat_cpu(css, cpu);
> -	/* If already on list return. */
> -	if (llist_on_list(&rstatc->lnode))
> +	/*
> +	 * If already on list return. This check is racy and smp_mb() is needed
> +	 * to pair it with the smp_mb() in css_process_update_tree() if the
> +	 * guarantee that the updated stats are visible to concurrent flusher is
> +	 * needed.
> +	 */
> +	if (data_race(llist_on_list(&rstatc->lnode)))

OK, I will bite...  Why is this needed given the READ_ONCE() that the
earlier patch added to llist_on_list()?

>  		return;
>
>  	/*
> @@ -145,9 +156,24 @@ static void css_process_update_tree(struct cgroup_subsys *ss, int cpu)
>  	struct llist_head *lhead = ss_lhead_cpu(ss, cpu);
>  	struct llist_node *lnode;
>
> -	while ((lnode = llist_del_first_init(lhead))) {
> +	while ((lnode = data_race(llist_del_first_init(lhead)))) {

And for this one, why not make init_llist_node(), which is invoked from
llist_del_first_init(), do a WRITE_ONCE()?
							Thanx, Paul

>  		struct css_rstat_cpu *rstatc;
>
> +		/*
> +		 * smp_mb() is needed here (more specifically in between
> +		 * init_llist_node() and per-cpu stats flushing) if the
> +		 * guarantee is required by a rstat user where either the
> +		 * updater should add itself on the lockless list or the
> +		 * flusher flush the stats updated by the updater who have
> +		 * observed that they are already on the list. The
> +		 * corresponding barrier pair for this one should be before
> +		 * css_rstat_updated() by the user.
> +		 *
> +		 * For now, there aren't any such user, so not adding the
> +		 * barrier here but if such a use-case arise, please add
> +		 * smp_mb() here.
> +		 */
> +
>  		rstatc = container_of(lnode, struct css_rstat_cpu, lnode);
>  		__css_process_update_tree(rstatc->owner, cpu);
>  	}
> --
> 2.47.1
>