From mboxrd@z Thu Jan  1 00:00:00 1970
From: Hillf Danton <hdanton@sina.com>
To: Xiubo Li
Cc: tj@kernel.org, hannes@cmpxchg.org, cgroups@vger.kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: cgroup: deadlock between cpu_hotplug_lock and freezer_mutex
Date: Wed, 15 Feb 2023 15:25:01 +0800
Message-Id: <20230215072501.3764-1-hdanton@sina.com>
In-Reply-To: <768be93b-a401-deab-600c-f946e0bd27fa@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On Wed, 15 Feb 2023 10:07:23 +0800 Xiubo Li wrote:
> Hi
>
> Recently when running some test cases for ceph we hit the following
> deadlock in the cgroup code. Has this been fixed? I have checked the
> latest code and no commit seems to fix it.
>
> The call trace can also be found at
> https://tracker.ceph.com/issues/58564#note-4, which is friendlier to
> read.
>
>  ======================================================
>  WARNING: possible circular locking dependency detected
>  6.1.0-rc5-ceph-gc90f64b588ff #1 Tainted: G S
>  ------------------------------------------------------
>  runc/90769 is trying to acquire lock:
>  ffffffff82664cb0 (cpu_hotplug_lock){++++}-{0:0}, at: static_key_slow_inc+0xe/0x20
>
>  but task is already holding lock:
>  ffffffff8276e468 (freezer_mutex){+.+.}-{3:3}, at: freezer_write+0x89/0x530
>
>  which lock already depends on the new lock.
>
>  the existing dependency chain (in reverse order) is:
>
>  -> #2 (freezer_mutex){+.+.}-{3:3}:
>        __mutex_lock+0x9c/0xf20
>        freezer_attach+0x2c/0xf0
>        cgroup_migrate_execute+0x3f3/0x4c0
>        cgroup_attach_task+0x22e/0x3e0
>        __cgroup1_procs_write.constprop.12+0xfb/0x140
>        cgroup_file_write+0x91/0x230
>        kernfs_fop_write_iter+0x137/0x1d0
>        vfs_write+0x344/0x4d0
>        ksys_write+0x5c/0xd0
>        do_syscall_64+0x34/0x80
>        entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
>  -> #1 (cgroup_threadgroup_rwsem){++++}-{0:0}:
>        percpu_down_write+0x45/0x2c0
>        cgroup_procs_write_start+0x84/0x270
>        __cgroup1_procs_write.constprop.12+0x57/0x140
>        cgroup_file_write+0x91/0x230
>        kernfs_fop_write_iter+0x137/0x1d0
>        vfs_write+0x344/0x4d0
>        ksys_write+0x5c/0xd0
>        do_syscall_64+0x34/0x80
>        entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
>  -> #0 (cpu_hotplug_lock){++++}-{0:0}:
>        __lock_acquire+0x103f/0x1de0
>        lock_acquire+0xd4/0x2f0
>        cpus_read_lock+0x3c/0xd0
>        static_key_slow_inc+0xe/0x20
>        freezer_apply_state+0x98/0xb0
>        freezer_write+0x307/0x530
>        cgroup_file_write+0x91/0x230
>        kernfs_fop_write_iter+0x137/0x1d0
>        vfs_write+0x344/0x4d0
>        ksys_write+0x5c/0xd0
>        do_syscall_64+0x34/0x80
>        entry_SYSCALL_64_after_hwframe+0x63/0xcd
>
>  other info that might help us debug this:
>
>  Chain exists of:
>    cpu_hotplug_lock --> cgroup_threadgroup_rwsem --> freezer_mutex
>
>  Possible unsafe locking scenario:
>
>        CPU0                    CPU1
>        ----                    ----
>   lock(freezer_mutex);
>                               lock(cgroup_threadgroup_rwsem);
>                               lock(freezer_mutex);
>   lock(cpu_hotplug_lock);
>
>   *** DEADLOCK ***

Thanks for your report.

The inversion comes from static_branch_inc() taking cpus_read_lock()
internally while freezer_write() already holds freezer_mutex (see chain
#0 above). If freezer_active cannot be updated in an atomic manner
without that lock, change the locking order instead: take
cpus_read_lock() before freezer_mutex and switch to the _cpuslocked
variants of the static-branch helpers, as below. Only for thoughts.

Hillf

+++ linux-6.1.3/kernel/cgroup/legacy_freezer.c
@@ -350,7 +350,7 @@ static void freezer_apply_state(struct f
 
 	if (freeze) {
 		if (!(freezer->state & CGROUP_FREEZING))
-			static_branch_inc(&freezer_active);
+			static_branch_inc_cpuslocked(&freezer_active);
 		freezer->state |= state;
 		freeze_cgroup(freezer);
 	} else {
@@ -361,7 +361,7 @@ static void freezer_apply_state(struct f
 		if (!(freezer->state & CGROUP_FREEZING)) {
 			freezer->state &= ~CGROUP_FROZEN;
 			if (was_freezing)
-				static_branch_dec(&freezer_active);
+				static_branch_dec_cpuslocked(&freezer_active);
 			unfreeze_cgroup(freezer);
 		}
 	}
@@ -379,6 +379,7 @@ static void freezer_change_state(struct
 {
 	struct cgroup_subsys_state *pos;
 
+	cpus_read_lock();
 	/*
 	 * Update all its descendants in pre-order traversal. Each
 	 * descendant will try to inherit its parent's FREEZING state as
@@ -407,6 +408,7 @@ static void freezer_change_state(struct
 	}
 	rcu_read_unlock();
 	mutex_unlock(&freezer_mutex);
+	cpus_read_unlock();
 }
 
 static ssize_t freezer_write(struct kernfs_open_file *of,
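
For anyone who wants to see the proposed ordering in isolation, a
minimal sketch follows. The names example_key, example_mutex and
example_enable() are hypothetical, not taken from legacy_freezer.c.
static_branch_inc() takes cpus_read_lock() internally, so calling it
while holding a subsystem mutex produces exactly the inverted ordering
lockdep reports above; taking cpus_read_lock() in the caller and using
the _cpuslocked variant keeps the order cpu_hotplug_lock -> subsystem
mutex everywhere.

	#include <linux/cpu.h>
	#include <linux/jump_label.h>
	#include <linux/mutex.h>

	static DEFINE_STATIC_KEY_FALSE(example_key);
	static DEFINE_MUTEX(example_mutex);

	static void example_enable(void)
	{
		/* Establish cpu_hotplug_lock -> example_mutex ordering. */
		cpus_read_lock();
		mutex_lock(&example_mutex);

		/*
		 * Safe here: the _cpuslocked variant asserts that
		 * cpu_hotplug_lock is already held instead of taking
		 * it again under example_mutex.
		 */
		static_branch_inc_cpuslocked(&example_key);

		mutex_unlock(&example_mutex);
		cpus_read_unlock();
	}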