From: Hillf Danton
To: Paul Gortmaker
Cc: John Keeping, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Peter Zijlstra, Thomas Gleixner, Mel Gorman, Theodore Ts'o
Subject: Re: [PATCH v3] sched/core: Always flush pending blk_plug
Date: Wed, 13 Jul 2022 06:38:57 +0800
Message-Id: <20220712223857.1060-1-hdanton@sina.com>
In-Reply-To: <20220712182914.GK1723@windriver.com>
References: <20220708162702.1758865-1-john@metanate.com>
	<20220710010136.2510-1-hdanton@sina.com>

On Tue, 12 Jul 2022 14:29:14 -0400 Paul Gortmaker wrote:
> On 10/07/2022 (Sun 09:01) Hillf Danton wrote:
>
> > Looks like an ABBA deadlock that should be fed to lockdep, given what was
> > blocking flusher (pid 213).
>
> [,,,]
>
> > Make lock_buffer have mutex_lock() semantics in an attempt to catch the
> > deadlock above.
> >
> > Only for thoughts now.
>
> Thanks for the test patch - I'm keeping it as a reference for future
> use.
>
> So, preempt-rt issues an early lockdep splat at partition probe bootup,
> both with and without the original v3 patch from this thread.
>
> Of course then I figured I'd try the debug patch on vanilla v5.19-rc3
> and we get pretty much the same lockdep complaint.
>
>
> sd 1:0:0:0: Attached scsi generic sg1 type 0
> sd 1:0:0:0: [sdb] Preferred minimum I/O size 512 bytes
> scsi 2:0:0:0: CD-ROM HL-DT-ST DVD-ROM DH30N A101 PQ: 0 ANSI: 5
>
> =====================================
> WARNING: bad unlock balance detected!
> 5.19.0-rc3-dirty #2 Not tainted
> -------------------------------------
> swapper/2/0 is trying to release lock (buffer_head_lock) at:
> [] end_buffer_async_read+0x5b/0x180
> but there are no more locks to release!

Lockdep caught the lock being released by a context other than the one
that acquired it.

>
> other info that might help us debug this:
> 1 lock held by swapper/2/0:
> #0: ffff8bee27744080 (&ret->b_uptodate_lock){....}-{2:2}, at: end_buffer_async_read+0x47/0x180
>
> stack backtrace:
> CPU: 2 PID: 0 Comm: swapper/2 Not tainted 5.19.0-rc3-dirty #2
> Hardware name: Dell Inc. Precision WorkStation T5500 /0D883F, BIOS A16 05/28/2013
> Call Trace:
>
> dump_stack_lvl+0x40/0x5c
> lock_release+0x245/0x3f0
> unlock_buffer+0x15/0x30
> end_buffer_async_read+0x5b/0x180
> end_bio_bh_io_sync+0x1e/0x40
> blk_update_request+0x9a/0x470
> scsi_end_request+0x27/0x190
> scsi_io_completion+0x3c/0x580
> blk_complete_reqs+0x39/0x50
> __do_softirq+0x11d/0x344
> irq_exit_rcu+0xa9/0xc0
> common_interrupt+0xa5/0xc0
>
>
> asm_common_interrupt+0x27/0x40
> RIP: 0010:cpuidle_enter_state+0x12d/0x3f0
> Code: 49 43 0f 84 b7 02 00 00 31 ff e8 fe 1c 74 ff 45 84 ff 74 12 9c 58 f6 c4 02 0f 85 8e 02 00 00 31 ff e8 97 95 7a ff fb 45 85 f6 <0f> 88 12 01 00 00 49 63 d6 4c 2b 24 24 48 8d 04 52 48 8d 04 82 49
> RSP: 0000:ffffa2890013fe90 EFLAGS: 00000206
> RAX: 0000000000000000 RBX: 0000000000000003 RCX: 0000000000000002
> RDX: 0000000080000002 RSI: ffffffffbd78d911 RDI: 00000000ffffffff
> RBP: ffff8bee2739a400 R08: 0000000000000000 R09: 00000000001faf40
> R10: ffffa2890013fdc8 R11: 0000000000000000 R12: 00000000a7b149d6
> R13: ffffffffbdbee3a0 R14: 0000000000000003 R15: 0000000000000001
> cpuidle_enter+0x24/0x40
> do_idle+0x1e3/0x230
> cpu_startup_entry+0x14/0x20
> start_secondary+0xe8/0xf0
> secondary_startup_64_no_verify+0xe0/0xeb
>
> sda: sda1 sda2 sda3
> sd 0:0:0:0: [sda] Attached SCSI disk
> sdb: sdb1 sdb2
> sd 1:0:0:0: [sdb] Attached SCSI disk
>
> Not quite sure what to make of that.

That answers the question of why lock_buffer has gone so long without an
annotation to help lockdep: the buffer lock is released from the I/O
completion path (end_buffer_async_read() in softirq context in the trace
above), not by the task that took it, so mutex_lock() semantics make
lockdep report a bad unlock balance. The same probably holds for
lock_page, even though the long-running deadlock in question is not
hidden very deep.

Thanks
Hillf
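
For readers less familiar with owner-tracked locks, the splat above can be
mimicked in userspace. The sketch below is only an analogy, assuming
Python's threading module: it is not kernel code and does not involve
lockdep. The helper release_from_other_thread() and the name
completion_handler are hypothetical, chosen for illustration. It takes a
lock in one thread and releases it from another, once with an ownerless
lock (threading.Lock) and once with an owner-tracked one (threading.RLock).

```python
import threading


def release_from_other_thread(lock):
    """Acquire `lock` in this thread, then try to release it from another.

    Returns None if the release succeeded, or the exception it raised.
    """
    outcome = []

    def completion_handler():
        # Stands in for a completion path running in a different
        # context from the one that took the lock (illustrative only).
        try:
            lock.release()
            outcome.append(None)
        except RuntimeError as exc:
            outcome.append(exc)

    lock.acquire()
    worker = threading.Thread(target=completion_handler)
    worker.start()
    worker.join()
    return outcome[0]


# threading.Lock has no owner: any thread may release it.
plain = release_from_other_thread(threading.Lock())

# threading.RLock tracks its owner: a release from another thread is rejected.
owned = release_from_other_thread(threading.RLock())

print("ownerless lock:", "released" if plain is None else repr(plain))
print("owner-tracked lock:", "released" if owned is None else repr(owned))
```

The ownerless lock tolerates the cross-thread release, while the
owner-tracked lock rejects it. That is roughly the mismatch reported above
once lock_buffer is given mutex_lock() semantics.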