Date: Thu, 17 Jul 2025 07:44:41 -1000
From: Tejun Heo
To: Shakeel Butt
Cc: "Paul E . McKenney", Andrew Morton, JP Kobryn, Johannes Weiner,
	Ying Huang, Vlastimil Babka, Alexei Starovoitov,
	Sebastian Andrzej Siewior, Michal Koutný, bpf@vger.kernel.org,
	linux-mm@kvack.org, cgroups@vger.kernel.org,
	linux-kernel@vger.kernel.org, Meta kernel team
Subject: Re: [PATCH v3] cgroup: llist: avoid memory tears for llist_node
References: <20250704180804.3598503-1-shakeel.butt@linux.dev>
In-Reply-To: <20250704180804.3598503-1-shakeel.butt@linux.dev>

On Fri, Jul 04, 2025 at 11:08:04AM -0700, Shakeel Butt wrote:
> Before the commit 36df6e3dbd7e ("cgroup: make css_rstat_updated nmi
> safe"), the struct llist_node is expected to be private to the one
> inserting the node to the lockless list or the one removing the node
> from the lockless list. After the mentioned commit, the llist_node in
> the rstat code is per-cpu shared between the stacked contexts i.e.
> process, softirq, hardirq & nmi. It is possible the compiler may tear
> the loads or stores of llist_node. Let's avoid that.
>
> KCSAN reported the following race:
>
> Reported by Kernel Concurrency Sanitizer on:
> CPU: 60 UID: 0 PID: 5425 ... 6.16.0-rc3-next-20250626 #1 NONE
> Tainted: [E]=UNSIGNED_MODULE
> Hardware name: ...
> ==================================================================
> ==================================================================
> BUG: KCSAN: data-race in css_rstat_flush / css_rstat_updated
> write to 0xffffe8fffe1c85f0 of 8 bytes by task 1061 on cpu 1:
>   css_rstat_flush+0x1b8/0xeb0
>   __mem_cgroup_flush_stats+0x184/0x190
>   flush_memcg_stats_dwork+0x22/0x50
>   process_one_work+0x335/0x630
>   worker_thread+0x5f1/0x8a0
>   kthread+0x197/0x340
>   ret_from_fork+0xd3/0x110
>   ret_from_fork_asm+0x11/0x20
> read to 0xffffe8fffe1c85f0 of 8 bytes by task 3551 on cpu 15:
>   css_rstat_updated+0x81/0x180
>   mod_memcg_lruvec_state+0x113/0x2d0
>   __mod_lruvec_state+0x3d/0x50
>   lru_add+0x21e/0x3f0
>   folio_batch_move_lru+0x80/0x1b0
>   __folio_batch_add_and_move+0xd7/0x160
>   folio_add_lru_vma+0x42/0x50
>   do_anonymous_page+0x892/0xe90
>   __handle_mm_fault+0xfaa/0x1520
>   handle_mm_fault+0xdc/0x350
>   do_user_addr_fault+0x1dc/0x650
>   exc_page_fault+0x5c/0x110
>   asm_exc_page_fault+0x22/0x30
> value changed: 0xffffe8fffe18e0d0 -> 0xffffe8fffe1c85f0
>
> $ ./scripts/faddr2line vmlinux css_rstat_flush+0x1b8/0xeb0
> css_rstat_flush+0x1b8/0xeb0:
>   init_llist_node at include/linux/llist.h:86
>   (inlined by) llist_del_first_init at include/linux/llist.h:308
>   (inlined by) css_process_update_tree at kernel/cgroup/rstat.c:148
>   (inlined by) css_rstat_updated_list at kernel/cgroup/rstat.c:258
>   (inlined by) css_rstat_flush at kernel/cgroup/rstat.c:389
>
> $ ./scripts/faddr2line vmlinux css_rstat_updated+0x81/0x180
> css_rstat_updated+0x81/0x180:
>   css_rstat_updated at kernel/cgroup/rstat.c:90 (discriminator 1)
>
> These are expected race and a simple READ_ONCE/WRITE_ONCE resolves these
> reports. However let's add comments to explain the race and the need for
> memory barriers if stronger guarantees are needed.
>
> More specifically the rstat updater and the flusher can race and cause a
> scenario where the stats updater skips adding the css to the lockless
> list but the flusher might not see those updates done by the skipped
> updater. This is benign race and the subsequent flusher will flush those
> stats and at the moment there aren't any rstat users which are not fine
> with this kind of race. However some future user might want more
> stricter guarantee, so let's add appropriate comments to ease the job of
> future users.
>
> Signed-off-by: Shakeel Butt
> Reviewed-by: Paul E. McKenney
> Fixes: 36df6e3dbd7e ("cgroup: make css_rstat_updated nmi safe")

Applied to cgroup/for-6.17. Sorry about the delay. I'm on a vacation and
ended up a lot more offline than I expected to be.

Thanks.

--
tejun
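
For readers of the archive, the READ_ONCE/WRITE_ONCE change described in the
quoted patch can be sketched roughly as below. This is a userspace
illustration, not the applied diff: the two macros are simplified volatile
stand-ins for the kernel's READ_ONCE() and WRITE_ONCE() (and rely on the
GCC/Clang __typeof__ extension). The function bodies are an assumption about
how the marked accesses would look, built on the convention that an off-list
llist_node points at itself.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified userspace stand-ins (assumption, not the kernel definitions):
 * a volatile access asks the compiler for one full-width load or store, so
 * the pointer is not torn into several narrower accesses.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

struct llist_node {
	struct llist_node *next;
};

/* Mark a node as "not on any list" by making it point at itself. */
static inline void init_llist_node(struct llist_node *node)
{
	/* A stacked context (softirq, hardirq, nmi) may read this concurrently. */
	WRITE_ONCE(node->next, node);
}

/* A node counts as on a list once ->next no longer points at itself. */
static inline bool llist_on_list(const struct llist_node *node)
{
	/*
	 * Marked load only: this prevents tearing but gives no ordering.
	 * A caller that needs to order this check against other memory
	 * accesses would have to add its own barriers.
	 */
	return READ_ONCE(node->next) != node;
}
```

The marked accesses only guarantee untorn loads and stores. They do not close
the updater/flusher window described in the patch, which is why the commit
message treats that race as benign and leaves barriers to future users that
need stronger guarantees.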