Date: Fri, 30 Sep 2022 11:26:17 -0700
From: Roman Gushchin
To: Alexander Fedorov
Cc: Johannes Weiner, Michal Hocko, Shakeel Butt, Vladimir Davydov,
    Muchun Song, Sebastian Andrzej Siewior, cgroups@vger.kernel.org,
    linux-mm@kvack.org
Subject: Re: Possible race in obj_stock_flush_required() vs drain_obj_stock()
In-Reply-To: <1664546131660.1777662787.1655319815@gmail.com>

On Fri, Sep 30, 2022 at 02:06:48PM +0000, Alexander Fedorov wrote:
> Hi,
>
> reposting this to the mainline list as requested and with updated patch.
>
> I've encountered a race on kernel version 5.10.131-rt72 when running
> LTP cgroup_fj_stress_memory* tests and need help with understanding
> synchronization in mm/memcontrol.c, it seems really not-trivial...
> Have also checked patches in the latest mainline kernel but do not see
> anything similar to the problem. Realtime patch also does not seem to
> be the culprit: it just changed preemption to migration disabling and
> irq_disable to local_lock.
>
> It goes as follows:
>
> 1) First CPU:
>    css_killed_work_fn() -> mem_cgroup_css_offline() ->
>    drain_all_stock() -> obj_stock_flush_required()
>         if (stock->cached_objcg) {
>
> This check sees a non-NULL pointer for *another* CPU's `memcg_stock`
> instance.
>
> 2) Second CPU:
>    css_free_rwork_fn() -> __mem_cgroup_free() -> free_percpu() ->
>    obj_cgroup_uncharge() -> drain_obj_stock()
> It frees `cached_objcg` pointer in its own `memcg_stock` instance:
>         struct obj_cgroup *old = stock->cached_objcg;
>         < ... >
>         obj_cgroup_put(old);
>         stock->cached_objcg = NULL;
>
> 3) First CPU continues after the 'if' check and re-reads the pointer
> again, now it is NULL and dereferencing it leads to kernel panic:
> static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
>                                      struct mem_cgroup *root_memcg)
> {
> < ... >
>         if (stock->cached_objcg) {
>                 memcg = obj_cgroup_memcg(stock->cached_objcg);

Great catch!

I'm not sure about switching to rcu primitives though.
In all other cases stock->cached_objcg is accessed only from the local CPU,
so using the rcu_* functions is overkill.

How about something like this? (completely untested)

Also, please add
Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")

Thank you!

--

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b69979c9ced5..93e9637108f0 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3245,10 +3245,18 @@ static struct obj_cgroup *drain_obj_stock(struct memcg_stock_pcp *stock)
 static bool obj_stock_flush_required(struct memcg_stock_pcp *stock,
 				     struct mem_cgroup *root_memcg)
 {
+	struct obj_cgroup *objcg;
 	struct mem_cgroup *memcg;
 
-	if (stock->cached_objcg) {
-		memcg = obj_cgroup_memcg(stock->cached_objcg);
+	/*
+	 * stock->cached_objcg can be changed asynchronously, so read
+	 * it using READ_ONCE(). The objcg can't go away though because
+	 * obj_stock_flush_required() is called from within a rcu read
+	 * section.
+	 */
+	objcg = READ_ONCE(stock->cached_objcg);
+	if (objcg) {
+		memcg = obj_cgroup_memcg(objcg);
 		if (memcg && mem_cgroup_is_descendant(memcg, root_memcg))
 			return true;
 	}

Thank you!
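
For readers outside the kernel tree, the single-read pattern in the patch
can be sketched in plain userspace C11. This is only an illustration under
stated assumptions: the struct layouts, is_descendant() and flush_required()
are hypothetical stand-ins rather than the kernel's definitions, and a
relaxed atomic load is used as a rough analogue of READ_ONCE(). The sketch
does not model object lifetime, which the patch comment attributes to the
RCU read section.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct mem_cgroup {
	struct mem_cgroup *parent;
};

struct obj_cgroup {
	struct mem_cgroup *memcg;
};

struct memcg_stock {
	/* May be set to NULL concurrently by the owner's drain path. */
	_Atomic(struct obj_cgroup *) cached_objcg;
};

/* Returns true if memcg is root itself or has root as an ancestor. */
static bool is_descendant(const struct mem_cgroup *memcg,
			  const struct mem_cgroup *root)
{
	for (; memcg; memcg = memcg->parent)
		if (memcg == root)
			return true;
	return false;
}

static bool flush_required(struct memcg_stock *stock,
			   const struct mem_cgroup *root)
{
	struct obj_cgroup *objcg;

	assert(stock != NULL && root != NULL);

	/*
	 * Load the pointer exactly once. The NULL check and the
	 * dereference below then operate on the same snapshot, so a
	 * concurrent store of NULL cannot slip in between them.
	 */
	objcg = atomic_load_explicit(&stock->cached_objcg,
				     memory_order_relaxed);
	if (!objcg)
		return false;

	return objcg->memcg && is_descendant(objcg->memcg, root);
}
```

The buggy variant would test stock->cached_objcg and then read it a second
time for the dereference; the local snapshot removes that second read.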