From: Hillf Danton
To: Tejun Heo
Cc: Tadeusz Struk, Michal Koutny, linux-kernel@vger.kernel.org, linux-mm@kvack.org, syzbot+e42ae441c3b10acf9e9d@syzkaller.appspotmail.com
Subject: Re: [PATCH] cgroup: don't queue css_release_work if one already pending
Date: Thu, 19 May 2022 19:23:19 +0800
Message-Id: <20220519112319.2455-1-hdanton@sina.com>
In-Reply-To: <317701e1-20a7-206f-92cd-cd36d436eee2@linaro.org>
References: <20220412192459.227740-1-tadeusz.struk@linaro.org>
 <20220414164409.GA5404@blackbody.suse.cz>
 <20220422100400.GA29552@blackbody.suse.cz>

On Wed, 18 May 2022 09:48:21 -0700 Tadeusz Struk wrote:
> On 4/22/22 04:05, Michal Koutny wrote:
> > On Thu, Apr 21, 2022 at 02:00:56PM -1000, Tejun Heo wrote:
> >> If this is the case, we need to hold an extra reference to be put by the
> >> css_killed_work_fn(), right?

On the other hand, that put could trigger INIT_WORK() in css_release()
and the warning [1] about init of an active (active state 0) object,
because the same css->destroy_work is used in both the kill and the
release paths.

Hillf

[1] https://lore.kernel.org/lkml/000000000000ff747805debce6c6@google.com/

> >
> > I looked into it a bit more lately and found that there already is such
> > a fuse in kill_css() [1].
> >
> > At the same type syzbots stack trace demonstrates the fuse is
> > ineffective
> >
> >> css_release+0xae/0xc0 kernel/cgroup/cgroup.c:5146 (**)
> >> percpu_ref_put_many include/linux/percpu-refcount.h:322 [inline]
> >> percpu_ref_put include/linux/percpu-refcount.h:338 [inline]
> >> percpu_ref_call_confirm_rcu lib/percpu-refcount.c:162 [inline] (*)
> >> percpu_ref_switch_to_atomic_rcu+0x5a2/0x5b0 lib/percpu-refcount.c:199
> >> rcu_do_batch+0x4f8/0xbc0 kernel/rcu/tree.c:2485
> >> rcu_core+0x59b/0xe30 kernel/rcu/tree.c:2722
> >> rcu_core_si+0x9/0x10 kernel/rcu/tree.c:2735
> >> __do_softirq+0x27e/0x596 kernel/softirq.c:305
> >
> > (*) this calls css_killed_ref_fn confirm_switch
> > (**) zero references after confirmed kill?
> >
> > So, I was also looking at the possible race with css_free_rwork_fn()
> > (from failed css_create()) but that would likely emit a warning from
> > __percpu_ref_exit().
> >
> > So, I still think there's something fishy (so far possible only via
> > artificial ENOMEM injection) that needs an explanation...
>
> I can't reliably reproduce this issue on neither mainline nor v5.10, where
> syzbot originally found it. It still triggers for syzbot though.
>
> --
> Thanks,
> Tadeusz
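
To make the concern about the shared css->destroy_work concrete, here is a
rough userspace model, not kernel code. It assumes only what is said above:
the kill path and the release path each initialise and queue the same work
item. Every name in it (model_work, model_init_work(), model_queue_work(),
model_run_work(), model_kill_then_release()) is hypothetical, and the state
tracking is a simplified stand-in for whatever produces the "init active"
warning in [1].

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified tracking state for one work item. */
enum model_state { MODEL_NONE, MODEL_INIT, MODEL_ACTIVE };

struct model_work {
	enum model_state state;
	void (*fn)(struct model_work *);
};

/*
 * Stand-in for INIT_WORK(): returns false (our "warning") when the item
 * is re-initialised while it is still queued, i.e. still active.
 */
static bool model_init_work(struct model_work *w,
			    void (*fn)(struct model_work *))
{
	bool ok = (w->state != MODEL_ACTIVE);

	w->fn = fn;
	if (ok)
		w->state = MODEL_INIT;
	return ok;
}

/* Stand-in for queue_work(): the item is active until it has run. */
static void model_queue_work(struct model_work *w)
{
	w->state = MODEL_ACTIVE;
}

/* Stand-in for the worker picking the item up and running it. */
static void model_run_work(struct model_work *w)
{
	w->state = MODEL_INIT;
	w->fn(w);
}

static void model_killed_fn(struct model_work *w) { (void)w; }
static void model_release_fn(struct model_work *w) { (void)w; }

/*
 * Kill path followed by release path on one shared work item.
 * Returns how many "init active" warnings the model raised.
 */
static int model_kill_then_release(bool release_before_kill_work_runs)
{
	struct model_work destroy_work = { MODEL_NONE, NULL };
	int warnings = 0;

	/* Kill path: initialise and queue the shared item. */
	if (!model_init_work(&destroy_work, model_killed_fn))
		warnings++;
	model_queue_work(&destroy_work);

	/* Normally the kill work runs before the last reference is put. */
	if (!release_before_kill_work_runs)
		model_run_work(&destroy_work);

	/* Release path: re-initialises and queues the very same item. */
	if (!model_init_work(&destroy_work, model_release_fn))
		warnings++;
	model_queue_work(&destroy_work);
	model_run_work(&destroy_work);

	return warnings;
}
```

In this model, model_kill_then_release(false) raises no warning, while
model_kill_then_release(true), where the final put arrives before the kill
work has run, raises one. Whether the real ordering can happen that way is
exactly the open question in this thread; the sketch only shows why an
early put would be visible as a re-init of an active object.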